Discussion:
[rancid] Rancid and Nortel 8600
AJ Schroeder
2013-10-28 17:00:11 UTC
Hello list,

I know that this subject has come up many times before; I have been searching for answers and keep running into dead ends. I am trying to back up some old 8600 switches with RANCID and, much like the switches themselves, the backups aren't working very well. I have the device type set to "baynet" in the router.db file, and this is what I get in the logs (names changed to x.x.x.x):

Trying to get all of the configs.
x.x.x.x blogin error: Error: TIMEOUT reached
x.x.x.x: missed cmd(s): bcc,exit,show config,show config -all
x.x.x.x: End of run not found
!
=====================================
Getting missed routers: round 1.
x.x.x.x blogin error: Error: TIMEOUT reached
x.x.x.x: missed cmd(s): bcc,exit,show config,show config -all
x.x.x.x: End of run not found
!

I have googled this issue, and apparently custom scripts called "pplogin" and "pprancid" were created for these devices, but those scripts seem to have vanished. If anyone has any tips on how to get past these "End of run not found" errors, I would greatly appreciate them.

Thanks,

AJ Schroeder
Alan McKinnon
2013-10-28 19:47:52 UTC
Post by AJ Schroeder
Hello list,
I know that this subject has come up many times before and I have been
searching for answers on this subject and keep running into dead-ends. I
am trying to backup some old 8600 switches with RANCID and much like the
switches themselves, the backups aren’t working very well. I have the
device set to “baynet” in the router.db file and this is what I get in
Trying to get all of the configs.
x.x.x.x blogin error: Error: TIMEOUT reached
^^^^^^^^^^^^^^^^^^^^^^

This is your problem. The code is trying to log into the device and it
does not succeed. At this point you need to apply regular network
troubleshooting techniques, as the root cause usually has nothing to do
with the rancid code. The "End of run not found" error is not worth
looking at further at this point; it means the same as "something went
wrong".

It's worth repeating that trying various code dumps that show up in
google is unlikely to work well for you; you need to take a more
structured approach.

First, telnet or ssh as appropriate from your rancid host to the device
and establish whether that works. Then, assuming that brancid and blogin
are the correct scripts for your device type, run blogin manually and see
what happens. This can fail in many ways, for example:

no connectivity between rancid host and device
ports 22 and 23 firewalled out
device not listening on ports 22 and 23
blogin trying to use an invalid username
blogin trying to use an incorrect password
and more
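
The first three items can be ruled out quickly from the rancid host. Below is a minimal sketch in Python, assuming the device is expected to answer on TCP 22 (ssh) and 23 (telnet). The names port_open and check_mgmt_ports are illustrative and not part of RANCID.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_mgmt_ports(host, ports=(22, 23)):
    """Map each management port to whether it accepted a TCP connection."""
    return {port: port_open(host, port) for port in ports}


# Example call (hypothetical hostname):
#   check_mgmt_ports("switch.example.com")
```

An open port only shows that something is listening. It says nothing about whether the username or password is right, so running blogin by hand is still needed for the last two items.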

Don't forget to check that the contents of your ~/.cloginrc are valid
and correct.

Until you have done these steps, there is very little anyone can do to
assist you meaningfully.
--
Alan McKinnon
***@gmail.com
AJ Schroeder
2013-10-29 14:53:07 UTC
Alan,

Thanks for the reply. I ran blogin manually against the device and it logs me right in; however, after letting it idle for about 30 seconds I get the "TIMEOUT reached" message:

***@linux-server:~> bin/blogin host.example.com
host.example.com
spawn ssh -c 3des -x -l rwa host.example.com
***@host.example.com's password:

Nortel8600:5#
Error: TIMEOUT reached
***@linux-server:~>

I may have some fairly aggressive idle timeout configured on the switch, but blogin is successful from the server to the switch. There is no firewall in play either.

As for .cloginrc, I only have these three lines uncommented:

add user * user
add password * pass
add method * ssh telnet
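
To see which entry a given host would pick up from lines like these, here is a rough Python sketch. It assumes, as far as I understand .cloginrc lookups, that the host field is a glob and that the first matching entry wins. The parser is deliberately simplistic: it splits on whitespace and does not handle the Tcl braces or quoting that real .cloginrc files may use. parse_cloginrc and lookup are made-up helper names.

```python
from fnmatch import fnmatch


def parse_cloginrc(text):
    """Parse simple 'add <directive> <host-glob> <value...>' lines.

    Simplification: values are split on whitespace; Tcl braces and quoting
    are not handled.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) >= 4 and fields[0] == "add":
            entries.append((fields[1], fields[2], fields[3:]))
    return entries


def lookup(entries, directive, host):
    """Return the values of the first entry whose glob matches host, else None."""
    for name, pattern, values in entries:
        if name == directive and fnmatch(host, pattern):
            return values
    return None


SAMPLE = """\
add user * user
add password * pass
add method * ssh telnet
"""

entries = parse_cloginrc(SAMPLE)
print(lookup(entries, "method", "switch.example.com"))
```

With only wildcard entries, every host gets the same user, password, and method list, so a per-device login name would need a more specific line placed above the wildcard.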

Thanks,

AJ

Alan McKinnon
2013-10-30 06:27:12 UTC
Hi AJ,

That would explain it.

It's most unusual that it happens when brancid is running though, as
it's not idle; it's running commands in rapid succession. Relaxing the
idle timeout setting on the switch will probably solve your issue, as
everything else you mention looks fine.

For troubleshooting things like this, I find

brancid -d <hostname>

very useful. It dumps the entire *login command to the console where you
can copy-paste it and run it repeatedly, plus lots of error output.
--
Alan McKinnon
***@gmail.com
heasley
2013-10-30 22:18:57 UTC
It's not completing the login: it's not matching the prompt that it's looking
for. Otherwise it'd have gone into interactive mode at this point and you'd be
able to type at the prompt.

blogin -d host will help you debug.
AJ Schroeder
2013-11-01 18:41:00 UTC
I'm not a guru, but from what I can tell, rancid or expect is having trouble processing the login sequence and then just sits there. Here is a snippet from the debug output of blogin (for reference, the command I ran was: blogin -t 90 -c "show config;show config -all;exit" switch.example.com):

expect: set expect_out(0,string) "password:"
expect: set expect_out(1,string) "password"
expect: set expect_out(spawn_id) "exp4"
expect: set expect_out(buffer) "***@switch.example.com's password:"
send: sending "rwa\r" to { exp4 }
Gate keeper glob pattern for '[Pp]assword:' is '?assword:'. Activating booster.

expect: does " " (spawn_id exp4) match regular expression "[Pp]assword:"? Gate "?assword:"? gate=no
">"? no


expect: does " \r\n" (spawn_id exp4) match regular expression "[Pp]assword:"? Gate "?assword:"? gate=no
">"? no

ERS-8600:5#
expect: does " \r\n\r\nERS-8600:5# " (spawn_id exp4) match regular expression "[Pp]assword:"? Gate "?assword:"? gate=no
">"? no
expect: timed out

Error: TIMEOUT reached

After the timeout is reached, I am logged out of the switch.
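
The debug lines above can be replayed outside of expect to see why the wait never ends. This is a small Python sketch, not blogin's actual logic: the buffer and the two patterns are copied from the log, while the third pattern is a hypothetical one that also accepts a prompt ending in "#". Python's re and expect's Tcl regular expressions differ in details, but they should agree on patterns this simple.

```python
import re

# Buffer contents as shown in the last "expect: does ..." line of the debug log.
BUFFER = " \r\n\r\nERS-8600:5# "

# The two patterns the log shows being tried after the password is sent.
WAITED_FOR = [r"[Pp]assword:", r">"]

# Hypothetical alternative that also accepts a prompt ending in '#'.
CANDIDATE = r"[>#] ?$"


def matches(pattern, text):
    """Return True if pattern is found anywhere in text."""
    return re.search(pattern, text) is not None


for pattern in WAITED_FOR + [CANDIDATE]:
    print(pattern, matches(pattern, BUFFER))
```

Neither logged pattern can match a prompt that ends in "# ", which fits the "expect: timed out" line. Whatever login script is used for this switch would need a prompt pattern that allows "#".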

Thanks,

AJ

AJ Schroeder
2013-11-04 20:55:01 UTC
I am making some progress. I followed the article at http://www.shrubbery.net/pipermail/rancid-discuss/2004-July/000808.html and created a "passrancid" and a "passlogin" based on the diff outputs. With those two files I can now log in and rancid executes "show config", but the switch disconnects partway through the command. I kept searching and found the thread http://www.gossamer-threads.com/lists/rancid/users/2825, which describes the same issue, but I don't think it was ever resolved. The last comment there was that expect was probably hanging on a pager, so I ran this command:

expect -d /usr/bin/passlogin -c "show config" switch.example.com

This is the output, up to the point where I get cut off:

expect: does "vlan 208 ports remove 1/1-1/15,2/1-2/15,3/1-3/48,4/" (spawn_id exp4) match glob pattern "\n"? no
1-4/48,7/1-7/48,8/1-8/48,9/1-9/48,10/1-10/48 member portmember
vlan 208 ports add 1/16,2/16 member portmember
vlan 214 create byport 1 name "VLAN" color 3
vlan 214 add-mlt 4
vlan 214 ports remove 1/1-1/15,2/1-2/15,3/1-3/48,4/1-4/48,7/1-7/48,8/1-8/48,9/1-9/48,10/1-10/48 member portmember
vlan 214 ports add 1/16,2/16 member portmember
vlan 223 create byport 1 name "VOIP" color 3
vlan 223 add-mlt 4
vlan 223 ports remove 1/1-1/15,2/1-2/15,3/1-3/Connection to switch.example.com closed by remote host.
Connection to switch.example.com closed.

When I use passlogin simply to log in to the switch, I can run "show config" manually and everything works fine.

Any idea on how to fix this?

Thanks,

AJ

heasley
2013-11-05 23:02:47 UTC
Post by AJ Schroeder
When I use passlogin to simply login to the switch I can manually run "show config" and everything works fine.
Any idea on how to fix this?

Buy a Juniper? Look for core dumps from expect, or run the commands your
script uses with the login command's -c option and look for the
disconnect/hang, or use tcpdump. It's probably the device failing.
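
When repeating runs with -c and looking for the disconnect, it can help to record exactly where each captured session was cut off. Here is a minimal Python sketch, assuming the session output has been saved as text and that the ssh client prints the same "closed by remote host" message seen earlier in this thread. find_disconnect is an illustrative helper, not part of RANCID.

```python
import re

# Message printed by the ssh client in the output quoted earlier in the thread.
DISCONNECT = re.compile(r"Connection to \S+ closed by remote host\.")


def find_disconnect(session_text):
    """Return (line_number, text_before_disconnect), or None if not found.

    line_number is 1-based within session_text.
    """
    for number, line in enumerate(session_text.splitlines(), start=1):
        match = DISCONNECT.search(line)
        if match:
            return number, line[:match.start()]
    return None


SAMPLE = """\
vlan 223 create byport 1 name "VOIP" color 3
vlan 223 add-mlt 4
vlan 223 ports remove 1/1-1/15,2/1-2/15,3/1-3/Connection to switch.example.com closed by remote host.
Connection to switch.example.com closed.
"""

print(find_disconnect(SAMPLE))
```

Comparing the cut-off point across several runs may hint at the cause: a disconnect that always lands on the same config line points in a different direction than one that moves around from run to run.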
