Discussion:
[rancid] mtlogin error: Error TIMEOUT reached (2.3.8 deb)
Lukasz Sokol
2015-03-26 11:15:16 UTC
Permalink
Hi,

How can I debug, what's given to expect as timeout during rancid-run ?

(other than using & analyzing strace, possibly)

TL;DR:

still using the version from debian package on wheezy+backports,

***@george:~$ sudo apt-cache show rancid
Package: rancid
Version: 2.3.8-3

***@george:~$ bin/mtrancid -V
rancid 2.3.8

On one router, while rancid-run, i get in the logs:

myrouter mtlogin error: Error: TIMEOUT reached
myrouter: missed cmd(s): system license print,system routerboard print,export,system package print detail without-paging
myrouter: End of run not found

and predictably no config is collected.

So I split the cmd(s) to separate mtlogin -c invocations, slapped them into a script and ran it under time,

revealing

time ./myscript > /dev/null

real 0m57.423s
user 0m0.148s
sys 0m0.060s


Running each mtlogin -c command separately, reveals that in particular "export" command can take its time,
interrupting its output for up to 40-45 seconds at a time.

I have increased timeo and timeout settings in mtrancid and mtlogin, to 120s each, but the
"TIMEOUT reached" in logs still appears.

Neither command, when ran separately, is hanging forever, i.e. each one completes and exits successfully.

I googled and found one occurrence on this m/l when someone asked, is the router having a lot or routes,
but that, as far as I could follow, wasn't answered; in my case, the answer is YES, and NO I can't put any less on it.

Hope this can still be helped ;)

el es
Tim Warnock
2015-03-26 11:19:07 UTC
Permalink
Post by Lukasz Sokol
I googled and found one occurrence on this m/l when someone asked, is the
router having a lot or routes,
but that, as far as I could follow, wasn't answered; in my case, the answer is
YES, and NO I can't put any less on it.
Hope this can still be helped ;)
el es
Ok simple test.

How long does an /export take if you run it from ssh/winbox?
Lukasz Sokol
2015-03-26 12:58:16 UTC
Permalink
Post by Tim Warnock
Post by Lukasz Sokol
I googled and found one occurrence on this m/l when someone asked, is the
router having a lot or routes,
but that, as far as I could follow, wasn't answered; in my case, the answer is
YES, and NO I can't put any less on it.
Hope this can still be helped ;)
el es
Ok simple test.
How long does an /export take if you run it from ssh/winbox?
***@george:~$ time ~/bin/mtlogin -c "export" myrouter > /dev/null

real 0m48.548s
user 0m0.060s
sys 0m0.024s

and in test with a stopwatch,

(doing ssh -i my_ident ***@myrouter, then issuing export by hand)

give or take 36s lag has been encountered
after first 8s of running. Give or take because there were a few of them
and not exactly sure how to catch...

hope this helps,

el es
Lukasz Sokol
2015-03-26 15:13:22 UTC
Permalink
Post by Lukasz Sokol
Post by Tim Warnock
Post by Lukasz Sokol
I googled and found one occurrence on this m/l when someone asked, is the
router having a lot or routes,
but that, as far as I could follow, wasn't answered; in my case, the answer is
YES, and NO I can't put any less on it.
Hope this can still be helped ;)
el es
Ok simple test.
How long does an /export take if you run it from ssh/winbox?
real 0m48.548s
user 0m0.060s
sys 0m0.024s
and in test with a stopwatch,
give or take 36s lag has been encountered
after first 8s of running. Give or take because there were a few of them
and not exactly sure how to catch...
Looking at run_commands,

it strikes me, that sending the command, it is allowed to timeout, just like that -

but unfortunately :

- trying to catch the timeout at first and making it go exp_continue instead, doesn't work
- trying to insert another expect clause in front of the main one, just to catch the timeout, doesn't work

Looks like this calls for a 'for' loop and checking whether the cause of expect breakout was timeout...
Post by Lukasz Sokol
hope this helps,
el es
heasley
2015-03-26 15:36:21 UTC
Permalink
Post by Lukasz Sokol
Post by Lukasz Sokol
real 0m48.548s
user 0m0.060s
sys 0m0.024s
and in test with a stopwatch,
give or take 36s lag has been encountered
after first 8s of running. Give or take because there were a few of them
and not exactly sure how to catch...
Does that mean that the actualy run-time for the command was 8s when done
manually? if so, please try mtlogin from rancid 3.2.
Post by Lukasz Sokol
Looking at run_commands,
it strikes me, that sending the command, it is allowed to timeout, just like that -
- trying to catch the timeout at first and making it go exp_continue instead, doesn't work
- trying to insert another expect clause in front of the main one, just to catch the timeout, doesn't work
Looks like this calls for a 'for' loop and checking whether the cause of expect breakout was timeout...
I do not understand. If you think confident in the issue you are trying to
explain, please try to explain it in a different manner.

I do not have a mikrotik and the code was a contribution, but mtlogin looks
reasonable, though this match is odd:
-re "^\[^\n\r ]*>>.*$reprompt" { exp_continue }
heasley
2015-03-26 15:37:37 UTC
Permalink
Also note that the default timeout in mtlogin is 45 seconds. it can be
adjusted in cloginrc or on the command-line.
Lukasz Sokol
2015-03-26 15:43:54 UTC
Permalink
I looked deeper into what "myrouter" was...

Seems the name was pointing into wrong IP address,
someone editing nslookup (or DNS, not sure now...)
entered wrong IP there :(

It works now, without mods (just with increased timeout to 60 to accomodate for the 'export' time)

Sorry for the noise.

el es

Alan McKinnon
2015-03-26 11:24:05 UTC
Permalink
Post by Lukasz Sokol
Hi,
How can I debug, what's given to expect as timeout during rancid-run ?
(other than using & analyzing strace, possibly)
still using the version from debian package on wheezy+backports,
Package: rancid
Version: 2.3.8-3
rancid 2.3.8
myrouter mtlogin error: Error: TIMEOUT reached
myrouter: missed cmd(s): system license print,system routerboard print,export,system package print detail without-paging
myrouter: End of run not found
and predictably no config is collected.
So I split the cmd(s) to separate mtlogin -c invocations, slapped them into a script and ran it under time,
revealing
time ./myscript > /dev/null
real 0m57.423s
user 0m0.148s
sys 0m0.060s
Running each mtlogin -c command separately, reveals that in particular "export" command can take its time,
interrupting its output for up to 40-45 seconds at a time.
I have increased timeo and timeout settings in mtrancid and mtlogin, to 120s each, but the
"TIMEOUT reached" in logs still appears.
Neither command, when ran separately, is hanging forever, i.e. each one completes and exits successfully.
I googled and found one occurrence on this m/l when someone asked, is the router having a lot or routes,
but that, as far as I could follow, wasn't answered; in my case, the answer is YES, and NO I can't put any less on it.
Hope this can still be helped ;)
This is a common problem, and there's no easy way to solve it.
Basically, mtrancid is scanning the output from the device line by line,
expecting to find the router's cli prompt followed by a command run. It
then processes the entire output until it finds the next prompt.

This works well until mtrancid gets some input it's not expecting, or
the input stops for whatever reason. rancid itself is in no position to
figure out what the problem is, so the closest you get is the "End of
run not found" message. Which essentially means "something went wrong",
no more and no less.

mtrancid -d <device name or ip>

usually gives enough info to figure out what is really going wrong
--
Alan McKinnon
***@gmail.com
Lukasz Sokol
2015-03-26 13:09:54 UTC
Permalink
Post by Alan McKinnon
Post by Lukasz Sokol
Hi,
How can I debug, what's given to expect as timeout during rancid-run ?
[...]
Post by Alan McKinnon
This is a common problem, and there's no easy way to solve it.
Basically, mtrancid is scanning the output from the device line by line,
expecting to find the router's cli prompt followed by a command run. It
then processes the entire output until it finds the next prompt.
This works well until mtrancid gets some input it's not expecting, or
the input stops for whatever reason. rancid itself is in no position to
figure out what the problem is, so the closest you get is the "End of
run not found" message. Which essentially means "something went wrong",
no more and no less.
mtrancid -d <device name or ip>
usually gives enough info to figure out what is really going wrong
mmm...
(please note I'm on debian and done sudo su - rancid to run this from command line: )

***@george:~$ PATH=$PATH:~/bin; mtrancid -d myrouter

// had to add ~/bin to path otherwise mtrancid would not find mtlogin; but this probably is by design.
// the below is mtrancid -d output:

executing mtlogin -t 120 -c"system package print detail without-paging;system routerboard print;system license print;export" myrouter

PROMPT MATCH: ] >
HIT COMMAND:[***@myrouter] > system package print detail without-paging
In SystemPackagePrintDetail: [***@myrouter] > system package print detail without-paging
HIT COMMAND:[***@myrouter] > system routerboard print
In SystemRouterboardPrint: [***@myrouter] > system routerboard print
HIT COMMAND:[***@myrouter] > system license print
In SystemLicensePrint: [***@myrouter] > system license print
HIT COMMAND:[***@myrouter] > export
In Export: [***@myrouter] > export
[waited here for a while and then ...]
***@george:~$

this just exited, _no_error_.

but also no logs and not a sign of what it was doing...

?!
el es
Lukasz Sokol
2015-03-26 13:13:59 UTC
Permalink
Post by Lukasz Sokol
Post by Alan McKinnon
Post by Lukasz Sokol
Hi,
How can I debug, what's given to expect as timeout during rancid-run ?
[...]
Post by Alan McKinnon
This is a common problem, and there's no easy way to solve it.
Basically, mtrancid is scanning the output from the device line by line,
expecting to find the router's cli prompt followed by a command run. It
then processes the entire output until it finds the next prompt.
This works well until mtrancid gets some input it's not expecting, or
the input stops for whatever reason. rancid itself is in no position to
figure out what the problem is, so the closest you get is the "End of
run not found" message. Which essentially means "something went wrong",
no more and no less.
mtrancid -d <device name or ip>
usually gives enough info to figure out what is really going wrong
mmm...
(please note I'm on debian and done sudo su - rancid to run this from command line: )
// had to add ~/bin to path otherwise mtrancid would not find mtlogin; but this probably is by design.
executing mtlogin -t 120 -c"system package print detail without-paging;system routerboard print;system license print;export" myrouter
PROMPT MATCH: ] >
[waited here for a while and then ...]
this just exited, _no_error_.
but also no logs and not a sign of what it was doing...
Having looked closer, I found the output file... sorry. and it seems to be intact too.
Post by Lukasz Sokol
el es
Lukasz Sokol
2015-03-26 13:23:15 UTC
Permalink
Post by Lukasz Sokol
Post by Alan McKinnon
Post by Lukasz Sokol
Hi,
How can I debug, what's given to expect as timeout during rancid-run ?
[...]
Post by Alan McKinnon
This is a common problem, and there's no easy way to solve it.
Basically, mtrancid is scanning the output from the device line by line,
expecting to find the router's cli prompt followed by a command run. It
then processes the entire output until it finds the next prompt.
This works well until mtrancid gets some input it's not expecting, or
the input stops for whatever reason. rancid itself is in no position to
figure out what the problem is, so the closest you get is the "End of
run not found" message. Which essentially means "something went wrong",
no more and no less.
mtrancid -d <device name or ip>
usually gives enough info to figure out what is really going wrong
mmm...
(please note I'm on debian and done sudo su - rancid to run this from command line: )
and here it is with 'time' added on to run on mtrancid:

***@george:~$ PATH=$PATH:~/bin; time mtrancid -d myrouter
executing mtlogin -t 120 -c"system package print detail without-paging;system routerboard print;system license print;export" myrouter
PROMPT MATCH: ] >
HIT COMMAND:[***@myrouter] > system package print detail without-paging
In SystemPackagePrintDetail: [***@myrouter] > system package print detail without-paging
HIT COMMAND:[***@myrouter] > system routerboard print
In SystemRouterboardPrint: [***@myrouter] > system routerboard print
HIT COMMAND:[***@myrouter] > system license print
In SystemLicensePrint: [***@myrouter] > system license print
HIT COMMAND:[***@myrouter] > export
In Export: [***@myrouter] > export

real 0m45.292s
user 0m0.100s
sys 0m0.024s

HTH
el es

el es
Loading...