Discussion:
[rancid] clogin and rancid good, rancid-run fails
Ken Celenza
2015-10-26 13:18:12 UTC
I have been running rancid for a little over a year, and over the last month about 20 devices out of a few hundred stopped working. These are Cisco devices, most of which have not been upgraded or rebooted in years. I went through the normal debugging procedures that I know about: clogin works, and rancid with debug "HIT"s all of the commands, but when I run it via "rancid-run -r <device>" it does not work.

From the logs:

-------------------------------------------

Trying to get all of the configs.
<device>: missed cmd(s): all commands
<device> clogin error: Error: Connection closed (ssh): <device>
<device>: End of run not found
!
=====================================
Getting missed routers: round 1.
<device>: missed cmd(s): all commands
<device> clogin error: Error: Connection closed (ssh): <device>
<device>: End of run not found
!
=====================================
Getting missed routers: round 2.
<device>: missed cmd(s): all commands
<device> clogin error: Error: Connection closed (ssh): <device>
<device>: End of run not found


-------------------------------------------


Is there any way to get the raw clogin output when running rancid-run sent to the log as well?

To recap, it works for most devices, but a few stopped working about a month ago. Even though they work fine with clogin and rancid, I cannot get them to work with rancid-run.
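
When many devices are involved, it can help to summarise which ones hit the "clogin error" line and how often. The following is a minimal sketch, not part of rancid itself; it assumes the log format shown in the excerpt above, and the device names "router1" and "router2" are made up to stand in for <device>.

```shell
#!/bin/sh
# Count "clogin error" lines per device in rancid-run log text read from stdin.
# The device name is taken to be the first whitespace-separated field,
# which matches the log excerpt above.
failed_devices() {
    awk '/ clogin error: / { count[$1]++ }
         END { for (dev in count) print count[dev], dev }' | sort -k2
}

# Sample input modelled on the excerpt above; "router1" and "router2" are
# hypothetical names standing in for <device>.
sample_log='Trying to get all of the configs.
router1: missed cmd(s): all commands
router1 clogin error: Error: Connection closed (ssh): router1
router1: End of run not found
Getting missed routers: round 1.
router1: missed cmd(s): all commands
router1 clogin error: Error: Connection closed (ssh): router1
router2 clogin error: Error: Connection closed (ssh): router2'

printf '%s\n' "$sample_log" | failed_devices
```

Against a real log, the function would read the file on stdin instead of the sample text.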
Alan McKinnon
2015-10-26 13:42:13 UTC
What version of rancid are you using?

The main point of departure in your results is that one uses router.db,
the other does not. I would start by verifying that router.db entries
for those problem devices are OK.


--
Alan McKinnon
***@gmail.com
Ken Celenza
2015-10-26 16:44:42 UTC
> Sent: Monday, October 26, 2015 at 9:42 AM
> From: "Alan McKinnon" <***@gmail.com>
> To: rancid-***@shrubbery.net
> Subject: Re: [rancid] clogin and rancid good, rancid-run fails
>
> What version of rancid are you using?
>
> The main point of departure in your results is that one uses router.db,
> the other does not. I would start by verifying that router.db entries
> for those problem devices are OK.
>
>
> --
> Alan McKinnon
> ***@gmail.com
>
> _______________________________________________
> Rancid-discuss mailing list
> Rancid-***@shrubbery.net
> http://www.shrubbery.net/mailman/listinfo/rancid-discuss
>


Version "

$Id: rancid.in 2820 2014-04-25 19:03:59Z heas $
rancid 3.1
"

Good call, but I verified the router.db, and it's using ";". I actually think I have more of a hint: it is a SUSE server that was upgraded to SUSE 11 SP4, so I suspect one of the updated packages caused an issue.

Any other suggestions?
heasley
2015-10-26 17:00:32 UTC
Mon, Oct 26, 2015 at 05:44:42PM +0100, Ken Celenza:
> > > Trying to get all of the configs.
> > > <device>: missed cmd(s): all commands
> > > <device> clogin error: Error: Connection closed (ssh): <device>
> > > <device>: End of run not found
> > > !
> > > =====================================
> > > Getting missed routers: round 1.
> > > <device>: missed cmd(s): all commands
> > > <device> clogin error: Error: Connection closed (ssh): <device>
> > > <device>: End of run not found
> > > !
> > > =====================================
> > > Getting missed routers: round 2.
> > > <device>: missed cmd(s): all commands
> > > <device> clogin error: Error: Connection closed (ssh): <device>
> > > <device>: End of run not found
> > >
> > > Is there any way to get the raw clogin output when running rancid-run sent to the log as well?

not easily, but this is a good feature idea.

> > > To recap, it works for most devices, but a few stopped working about a month ago and even though they work fine with clogin and rancid, cannot get it to work with rancid-run.

what is similar about the devices that are failing?
Ken Celenza
2015-10-26 18:25:25 UTC
> Sent: Monday, October 26, 2015 at 1:00 PM
> From: heasley <***@shrubbery.net>
> To: "Ken Celenza" <***@mail.com>
> Cc: "Alan McKinnon" <***@gmail.com>, rancid-***@shrubbery.net
> Subject: Re: [rancid] clogin and rancid good, rancid-run fails
>
> what is similar about the devices that are failing?
>

They are all Cisco routers running 12.4(24)T(X) code.

e.g.
12.4(24)T
12.4(24)T4
12.4(24)T6
12.4(24)T8

routers
7204VXR
7206VXR
3825
3845
1841
Alex DEKKER
2015-10-27 12:35:57 UTC
On 26/10/15 18:25, Ken Celenza wrote:
>
> They are all: 12.4(24)T(X) code, cisco routers
>
> e.g.
> 12.4(24)T
> 12.4(24)T4
> 12.4(24)T6
> 12.4(24)T8
>
> routers
> 7204VXR
> 7206VXR
> 3825
> 3845
> 1841
>

Can you SSH onto them from that box without any special parameters to
SSH? ISTR recent-ish versions of OpenSSH deprecating the algorithms [or
the default key size, perhaps?] used by older IOS, which means you have
to add some -o option to make it work.

alexd
Ken Celenza
2015-10-27 16:23:43 UTC
> Sent: Tuesday, October 27, 2015 at 8:35 AM
> From: "Alex DEKKER" <***@ale.cx>
> To: rancid-***@shrubbery.net
> Subject: Re: [rancid] clogin and rancid good, rancid-run fails
>
> Can you SSH onto them from that box without any special parameters to
> SSH? ISTR recent-ish versions of OpenSSH deprecating the algorithms [or
> the default key size, perhaps?] used by older IOS, which means you have
> to add some -o option to make it work.
>
> alexd

I think this is it. It's still weird that it works fine with ./rancid but not ./rancid-run. That being said, I turned on telnet, it worked fine, and I got a list of the packages that were updated. No changes to perl or expect, but openssh was updated and I found this.

https://www.suse.com/support/kb/doc.php?id=7016904

Trying to get it downgraded.

Thanks for everyone's help, and I'll report back if it did in fact fix the issue.
Lee Rian (CENSUS/TCO FED)
2015-10-27 17:04:38 UTC
> openssh was updated and I found this.
>
> https://www.suse.com/support/kb/doc.php?id=7016904

hrmm.. interesting. I ran into problems after upgrading to openssh 7.something but it was very consistent - things either worked or no. It didn't make any difference using clogin or rancid-run

> Trying to get it downgraded.

Can you try a few things before downgrading?

My .cloginrc - don't use 3DES for ssh:
# add cyphertype * {3des}
add cyphertype * {aes256-cbc}

My ~/.ssh/config - allow sha1
KexAlgorithms +diffie-hellman-group1-sha1

I don't remember if this was required or no, but I did
ssh-keygen -l -f ~/.ssh/known_hosts | sort -rn

and regenerated the ssh keys on anything that had a key length < 1024 bits
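
The key-length check described above can be sketched as a small filter. This is only an illustration: it assumes the "bits fingerprint host (TYPE)" line layout that ssh-keygen -l prints on reasonably recent OpenSSH versions, it looks only at RSA entries (an ED25519 or ECDSA key legitimately reports a small bit count), and the fingerprints and host names in the sample are placeholders.

```shell
#!/bin/sh
# Read "ssh-keygen -l" style lines on stdin and print the RSA entries whose
# bit count (first field) is below a minimum. The minimum defaults to 1024.
weak_keys() {
    awk -v min="${1:-1024}" '$NF == "(RSA)" && $1 + 0 < min + 0 { print }'
}

# Placeholder input; fingerprints and host names are hypothetical.
sample_keys='2048 SHA256:AAAA router-a (RSA)
768 SHA256:BBBB router-b (RSA)
256 SHA256:CCCC server-c (ED25519)
512 SHA256:DDDD router-d (RSA)'

printf '%s\n' "$sample_keys" | weak_keys
```

With real data, the output of "ssh-keygen -l -f ~/.ssh/known_hosts" would be piped into weak_keys in place of the sample text.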

Regards,
Lee


Ken Celenza
2015-10-27 19:27:37 UTC
> Sent: Tuesday, October 27, 2015 at 1:04 PM
> From: "Lee Rian (CENSUS/TCO FED)" <***@census.gov>
> To: "Ken Celenza" <***@mail.com>, "rancid-***@shrubbery.net" <rancid-***@shrubbery.net>
> Subject: Re: [rancid] clogin and rancid good, rancid-run fails
>
> > openssh was updated and I found this.
> >
> > https://www.suse.com/support/kb/doc.php?id=7016904
>
> hrmm.. interesting. I ran into problems after upgrading to openssh 7.something but it was very consistent - things either worked or no. It didn't make any difference using clogin or rancid-run
>
> > Trying to get it downgraded.
>
> Can you try a few things before downgrading?
>
> My .cloginrc - don't use 3DES for ssh:
> # add cyphertype * {3des}
> add cyphertype * {aes256-cbc}
>
> My ~/.ssh/config - allow sha1
> KexAlgorithms +diffie-hellman-group1-sha1
>
> I don't remember if this was required or no, but I did
> ssh-keygen -l -f ~/.ssh/known_hosts | sort -rn
>
> and regenerated the ssh keys on anything that had a key length < 1024 bits
>
> Regards,
> Lee



It did not work with those changes. I did not adjust my known_hosts file, but my known_hosts is always sent to /dev/null, so that should not be an issue.
Jethro R Binks
2015-10-27 18:48:50 UTC
On Tue, 27 Oct 2015, Ken Celenza wrote:

> > Sent: Tuesday, October 27, 2015 at 8:35 AM
> > From: "Alex DEKKER" <***@ale.cx>
> >
> > Can you SSH onto them from that box without any special parameters to
> > SSH? ISTR recent-ish versions of OpenSSH deprecating the algorithms [or
> > the default key size, perhaps?] used by older IOS, which means you have
> > to add some -o option to make it work.
> >
> > alexd
>
> I think this is it. It's still weird that it works fine with ./rancid
> but not ./rancid-run. That being said, I turned on telnet, it worked
> fine, and I got a list of the packages that were updated. No changes to
> perl or expect, but openssh was updated and I found this.

Holy Batman;

I've had a problem with a couple of systems for a while which I've only
half-heartedly looked at, and then when I set them to 'down' forgot about
completely for a while more.

But inspired by the above comments, I tested each of /usr/bin/ssh and
/usr/local/bin/ssh, and the latter works but the former does not. This
explains why, like one of the OPs, rancid-run on the command-line worked,
but not when run from cron - a variant of the usual reason, that the
environment is different (in this case, $PATH).

I changed the order in the PATH in rancid.conf, and now it can connect to
the systems concerned (and I see from the diffs that they started to fail
after an update that changed some SSL/TLS settings).
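
To illustrate why the order of directories matters, here is a minimal sketch of a first-match PATH lookup. It does not touch any real ssh binary: the two temporary directories and the tiny scripts in them are hypothetical stand-ins for /usr/local/bin and /usr/bin.

```shell
#!/bin/sh
# Print the first executable named $1 found in the colon-separated list $2.
# This mimics a PATH search, where the first match wins.
first_in_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $2; do
        # An empty PATH element means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Two throwaway directories, each holding a fake "ssh" (hypothetical stand-ins).
demo=$(mktemp -d)
mkdir -p "$demo/local" "$demo/system"
printf '#!/bin/sh\necho local\n' > "$demo/local/ssh"
printf '#!/bin/sh\necho system\n' > "$demo/system/ssh"
chmod +x "$demo/local/ssh" "$demo/system/ssh"

first_in_path ssh "$demo/system:$demo/local"   # the "system" copy is found first
first_in_path ssh "$demo/local:$demo/system"   # the "local" copy is found first
```

The same function can be pointed at the PATH value from rancid.conf to see which ssh a rancid run would pick up.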

The system /usr/bin/ssh was giving the following error:

no matching cipher found: client aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,arcfour,aes192-cbc,aes256-cbc,rijndael-***@lysator.liu.se server aes128-ctr,aes192-ctr,aes256-ctr
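
The error lists what each side offers, and the connection fails because the two lists share nothing. A minimal sketch of that comparison follows; the lists are copied from the message above, except that the partly obfuscated rijndael entry is left out.

```shell
#!/bin/sh
# Print the ciphers that appear in both comma-separated lists, one per line.
common_ciphers() {
    client_list=$1
    server_list=$2
    old_ifs=$IFS
    IFS=,
    for c in $client_list; do
        case ",$server_list," in
            *",$c,"*) printf '%s\n' "$c" ;;
        esac
    done
    IFS=$old_ifs
}

# Lists taken from the error message above (obfuscated rijndael entry omitted).
client='aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,arcfour,aes192-cbc,aes256-cbc'
server='aes128-ctr,aes192-ctr,aes256-ctr'

overlap=$(common_ciphers "$client" "$server")
if [ -z "$overlap" ]; then
    echo "no cipher in common"
else
    printf 'common ciphers:\n%s\n' "$overlap"
fi
```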

Unfortunately this never made it to a rancid logfile that I could see, so I
was completely in the dark. Is there any way that ssh errors like this
could be caught and logged?

Happy Jethro.

. . . . . . . . . . . . . . . . . . . . . . . . .
Jethro R Binks, Network Manager,
Information Services Directorate, University Of Strathclyde, Glasgow, UK

The University of Strathclyde is a charitable body, registered in
Scotland, number SC015263.
Ken Celenza
2015-10-27 19:32:44 UTC


Brilliant!! So yes, I can confirm that running ssh from /usr/bin fails, while the ssh I have works with no problem. What's still weird is that my $PATH lists /usr/bin second, but when I run it via rancid-run, /usr/bin/ssh is found first and fails; I'm not exactly sure why. I was able to confirm this by monitoring my processes spawning with "strace -feprocess $SHELL"

I saw this:
[pid 6384] execve("/src/rancid/rancid/bin/ssh", ["ssh", "-c", "3des", "-x", "-l", "user", "device", "-o", "UserKnownHostsFile=/dev/null", "-o", "StrictHostKeyChecking=no"], [/* 68 vars */]) = -1 ENOENT (No such file or directory)
[pid 6384] execve("/src/rancid/rancid//ssh", ["ssh", "-c", "3des", "-x", "-l", "user", "device", "-o", "UserKnownHostsFile=/dev/null", "-o", "StrictHostKeyChecking=no"], [/* 68 vars */]) = -1 ENOENT (No such file or directory)
[pid 6384] execve("/usr/bin/ssh", ["ssh", "-c", "3des", "-x", "-l", "user", "device", "-o", "UserKnownHostsFile=/dev/null", "-o", "StrictHostKeyChecking=no"], [/* 68 vars */]) = 0
[pid 6384] arch_prctl(ARCH_SET_FS, 0x7fc8024117c0) = 0
[pid 6384] exit_group(255) = ?
Process 6384 detached
[pid 6383] --- SIGCHLD (Child exited) @ 0 (0) ---
[pid 6383] wait4(6384, [{WIFEXITED(s) && WEXITSTATUS(s) == 255}], 0, NULL) = 6384
[pid 6383] clone(Process 6387 attached
child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f4eddb709d0) = 6387
[pid 6387] --- SIGWINCH (Window changed) @ 0 (0) ---
[pid 6387] clone(Process 6388 attached
child_stack=0, flags=CLONE_PARENT_SETTID|SIGCHLD, parent_tidptr=0x7ffe9f44aeb8) = 6388
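
The execve lines in a trace like this show the directories searched, in order, and the binary that finally ran. Those directories come from the PATH the process inherited, which (as mentioned earlier in the thread) is set in rancid.conf and may differ from an interactive shell's $PATH. A minimal sketch for pulling that out of strace output is below; the sample lines are modelled on the trace above with the argument lists shortened.

```shell
#!/bin/sh
# Reduce strace "execve" lines read from stdin to "<result> <path>" pairs,
# so the search order and the binary that finally ran are easy to see.
execve_attempts() {
    sed -n 's/.*execve("\([^"]*\)".*) = \(-\{0,1\}[0-9][0-9]*\).*/\2 \1/p'
}

# Three lines modelled on the trace above; the argument lists are shortened
# here for readability.
sample_trace='[pid 6384] execve("/src/rancid/rancid/bin/ssh", ["ssh", "-c", "3des"], [/* 68 vars */]) = -1 ENOENT (No such file or directory)
[pid 6384] execve("/src/rancid/rancid//ssh", ["ssh", "-c", "3des"], [/* 68 vars */]) = -1 ENOENT (No such file or directory)
[pid 6384] execve("/usr/bin/ssh", ["ssh", "-c", "3des"], [/* 68 vars */]) = 0'

printf '%s\n' "$sample_trace" | execve_attempts
```

A result of -1 marks a directory where no such file was found; the line with result 0 names the binary that was actually executed.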

In reference to:
""" This explains why, like one of the OPs, rancid-run on the command-line worked, but not when run from cron - a variant of the usual reason, that the environment is different (in this case, $PATH). """

Actually, in my case it didn't work via the command line or cron.
Ken Celenza
2015-10-29 17:07:33 UTC

Just to finalize this, my /usr/bin/ssh was downgraded and now everything is working fine. I'm still perplexed as to why it didn't take my $PATH ordering into account.

Just something to keep in mind if people are having a similar issue in the future.
Alan McKinnon
2015-10-27 05:29:51 UTC
On 26/10/2015 18:44, Ken Celenza wrote:
>
> Version "
>
> $Id: rancid.in 2820 2014-04-25 19:03:59Z heas $
> rancid 3.1
> "

You should probably update to the latest version if possible


>
> Good call, but I verified the router.db, and it's using ";". I actually think I have more of a hint: it is a SUSE server that was upgraded to SUSE 11 SP4, so I suspect one of the updated packages caused an issue.
>
> Any other suggestions?


I've had to debug an issue in this area only once; what I did was the
classic method: edit the rancid-run script and scatter print calls
throughout; and find the call to the actual rancid parser and launch
that with -d.

Then investigate further depending on what you find.
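
As an alternative to scattering print calls by hand, a shell script can be run with tracing switched on. The following is a minimal sketch of that idea, assuming the wrapper being debugged is a plain sh script; "wrapper.sh" is a hypothetical stand-in that only echoes, not the real rancid-run.

```shell
#!/bin/sh
# Create a throwaway script that stands in for the wrapper being debugged.
work=$(mktemp -d)
cat > "$work/wrapper.sh" <<'EOF'
#!/bin/sh
# Hypothetical stand-in for a wrapper script; it only echoes.
device=$1
echo "collecting $device"
EOF

# Run it with -x so every command is written to stderr before it executes,
# and keep the trace separate from the script's normal output.
sh -x "$work/wrapper.sh" router1 > "$work/stdout.log" 2> "$work/trace.log"

cat "$work/trace.log"
```

The trace file then shows each assignment and command with its expanded arguments, which is often enough to spot where the wrapper's behaviour departs from a manual run.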

--
Alan McKinnon
***@gmail.com