Discussion:
[rancid] Problem with some F5 devices
Michael Sloan
2013-12-02 13:49:13 UTC
I'm relatively new to using RANCID, although it has been in use for a couple of years in my (new) workplace. We have been using RANCID with Cisco and Juniper equipment, and I recently added some devices from Aruba and F5 to the list of devices being archived with RANCID.

We have 4 separate F5 chassis doing load-balancing and reverse proxy, and these work flawlessly with RANCID (once I found an F5 script that supports version 11 of the F5 OS, anyway). On these chassis, we have several vCMPs for different clients. The vCMPs each have their own IP, and respond to the same F5 commands that the chassis do.

The files generated in the configs directory for the vCMPs are all zero-length, even though the physical chassis produce 23k-47k files there. I have verified that clogin works: clogin -c "bigpipe version" <F5-vCMP> does in fact produce the correct output. Running "f5rancid <F5-vCMP>" produces a 17k file in a test directory, so I know the process works for the vCMPs (see directory listings below).

I have tried removing the entries for the vCMPs in router.db, running 'rancid-run', then adding the entries back, and RANCID created zero-length files for the vCMPs a second time.

We are using RANCID 2.3.6, on a CentOS 6 system, with Expect 5.43.

Has anyone encountered this problem or have any ideas how to resolve it?

A typical logfile:

Trying to get all of the configs.
10.255.128.146: missed cmd(s): tmsh show /net route static
10.255.128.145: missed cmd(s): tmsh show /net route static
10.255.128.147: missed cmd(s): tmsh show /net route static
10.255.128.148: missed cmd(s): tmsh show /net route static
10.255.128.152: missed cmd(s): tmsh show /net route static
10.255.128.151: missed cmd(s): tmsh show /net route static
10.255.128.153: missed cmd(s): tmsh show /net route static
10.255.128.154: missed cmd(s): tmsh show /net route static,tmsh show /sys hardware
10.255.128.155: missed cmd(s): tmsh show /net route static
10.255.128.157: missed cmd(s): tmsh show /net route static
10.255.128.156: missed cmd(s): tmsh show /net route static
10.255.128.158: missed cmd(s): tmsh show /net route static
10.255.128.159: missed cmd(s): tmsh show /net route static
Getting missed routers: round 4.
10.255.128.148: missed cmd(s): tmsh show /net route static
10.255.128.145: missed cmd(s): tmsh show /net route static
10.255.128.147: missed cmd(s): tmsh show /net route static
10.255.128.146: missed cmd(s): tmsh show /net route static
10.255.128.151: missed cmd(s): tmsh show /net route static
10.255.128.152: missed cmd(s): tmsh show /net route static
10.255.128.153: missed cmd(s): tmsh show /net route static
10.255.128.156: missed cmd(s): tmsh show /net route static
10.255.128.154: missed cmd(s): tmsh show /net route static,tmsh show /sys hardware
10.255.128.155: missed cmd(s): tmsh show /net route static
10.255.128.157: missed cmd(s): tmsh show /net route static
10.255.128.158: missed cmd(s): tmsh show /net route static
10.255.128.159: missed cmd(s): tmsh show /net route static

cvs diff: Diffing .
cvs diff: Diffing configs
cvs commit: Examining .
cvs commit: Examining configs
Checking in configs/10.255.128.143;
/usr/local/rancid/var/CVS/other/configs/10.255.128.143,v <-- 10.255.128.143
new revision: 1.647; previous revision: 1.646
done
Checking in configs/10.255.128.144;
/usr/local/rancid/var/CVS/other/configs/10.255.128.144,v <-- 10.255.128.144
new revision: 1.283; previous revision: 1.282
done


10.255.128.145 and 10.255.128.146 are two of the physical chassis, while the IPs from .147 and above are vCMPs.
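
A logfile like the one above is easier to read once the "missed cmd(s)" lines are grouped per device. The following is only a minimal Python sketch of that grouping, not part of RANCID; the function name tally_missed and the sample text are illustrative assumptions. On the log above it would show the same command reported as missed on .145 and .146 as on the vCMPs.

```python
import re
from collections import defaultdict

# Matches lines such as:
# "10.255.128.154: missed cmd(s): tmsh show /net route static,tmsh show /sys hardware"
MISSED_RE = re.compile(r"^(?P<device>\S+): missed cmd\(s\): (?P<cmds>.+)$")


def tally_missed(log_text):
    """Return {device: set of missed commands} for every 'missed cmd(s)' line."""
    missed = defaultdict(set)
    for line in log_text.splitlines():
        match = MISSED_RE.match(line.strip())
        if match:
            # Several missed commands are comma-separated on one line.
            for cmd in match.group("cmds").split(","):
                missed[match.group("device")].add(cmd.strip())
    return dict(missed)


# A short excerpt in the same shape as the logfile (hypothetical helper input).
sample = """\
10.255.128.146: missed cmd(s): tmsh show /net route static
10.255.128.154: missed cmd(s): tmsh show /net route static,tmsh show /sys hardware
Getting missed routers: round 4.
10.255.128.154: missed cmd(s): tmsh show /net route static,tmsh show /sys hardware
"""

for device, cmds in sorted(tally_missed(sample).items()):
    print(device, sorted(cmds))
```

Repeated rounds collapse into one entry per device, so the result shows which commands fail consistently rather than how often they were retried.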

My router.db file:

10.255.128.143:f5:up
10.255.128.144:f5:up
10.255.128.145:f5:up
10.255.128.146:f5:up
10.254.200.2:f5:up
10.255.128.147:f5:up
10.255.128.148:f5:up
10.255.128.151:f5:up
10.255.128.152:f5:up
10.255.128.153:f5:up
10.255.128.154:f5:up
10.255.128.155:f5:up
10.255.128.156:f5:up
10.255.128.157:f5:up
10.255.128.158:f5:up
10.255.128.159:f5:up

And lastly, the directory listing for the configs directory:

-bash-3.1$ ls -l
total 592
-rw-r----- 1 rancid netadm 470068 Dec 2 08:17 10.254.200.2
-rw-r----- 1 rancid netadm 31335 Dec 2 08:17 10.255.128.143
-rw-r----- 1 rancid netadm 27155 Dec 2 08:17 10.255.128.144
-rw-r----- 1 rancid netadm 28406 Nov 5 09:33 10.255.128.145
-rw-r----- 1 rancid netadm 23159 Nov 5 09:33 10.255.128.146
-rw-r----- 1 rancid netadm 0 Nov 27 11:17 10.255.128.147
-rw-r----- 1 rancid netadm 0 Nov 27 11:17 10.255.128.148
-rw-r----- 1 rancid netadm 0 Nov 27 11:17 10.255.128.151
-rw-r----- 1 rancid netadm 0 Nov 27 11:17 10.255.128.152
-rw-r----- 1 rancid netadm 0 Nov 27 11:17 10.255.128.153
-rw-r----- 1 rancid netadm 0 Nov 27 11:17 10.255.128.154
-rw-r----- 1 rancid netadm 0 Nov 27 11:17 10.255.128.155
-rw-r----- 1 rancid netadm 0 Nov 27 11:17 10.255.128.156
-rw-r----- 1 rancid netadm 0 Nov 27 11:17 10.255.128.157
-rw-r----- 1 rancid netadm 0 Nov 27 11:17 10.255.128.158
-rw-r----- 1 rancid netadm 0 Nov 27 11:17 10.255.128.159
drwxr-x--- 2 rancid netadm 4096 Dec 2 08:21 CVS
-rw-r----- 1 rancid netadm 11256 Dec 2 08:18 wlc.nsrc.private
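
The comparison between router.db and the configs directory can also be done mechanically. This is a rough Python sketch under stated assumptions: router.db lines have the name:type:state form shown above, lines starting with "#" are treated as comments, and the helper names parse_router_db and empty_or_missing are invented for illustration. The demonstration uses a throwaway directory rather than a real RANCID tree.

```python
import os
import tempfile


def parse_router_db(text):
    """Return device names marked 'up' from router.db lines (name:type:state)."""
    devices = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(":")
        if len(fields) >= 3 and fields[2] == "up":
            devices.append(fields[0])
    return devices


def empty_or_missing(devices, configs_dir):
    """Return the devices whose saved config is absent or zero-length."""
    problems = []
    for name in devices:
        path = os.path.join(configs_dir, name)
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            problems.append(name)
    return problems


# Demonstration against a temporary directory with one good and one empty file.
with tempfile.TemporaryDirectory() as configs_dir:
    with open(os.path.join(configs_dir, "10.255.128.143"), "w") as fh:
        fh.write("config text\n")
    open(os.path.join(configs_dir, "10.255.128.147"), "w").close()
    router_db = "10.255.128.143:f5:up\n10.255.128.147:f5:up\n"
    flagged = empty_or_missing(parse_router_db(router_db), configs_dir)
    print(flagged)
```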

And my test from 'f5rancid 10.255.128.147' in a temp directory:

-bash-3.1$ ls -l
total 20
-rw-r--r-- 1 rancid netadm 17700 Dec 2 08:05 10.255.128.147.new



Michael Sloan
Systems Programmer Network Support
Office: (850) 922-5476
Northwood Shared Resource Center
***@nsrc.myflorida.com
Alan McKinnon
2013-12-02 14:43:24 UTC
Your tests are quite sensible, but also incomplete.

We know that clogin works on your F5 with a simple command.
We know that clogin works on a vCMP with a simple command.
We know that f5rancid works on your physical chassis.

What we don't know is whether clogin and f5rancid work correctly on a vCMP
using the full command set. There must be some difference between what
the physical chassis and the vCMPs are sending back, otherwise both would
work. I suspect some part of the vCMP output is upsetting the f5rancid
script, causing it to exit early.

You need the big troubleshooting guns (this process is almost always
what you need to do anyway if adding a device to router.db doesn't work
out):

1. Run this test in a temp directory (not the usual rancid dir) as the
rancid user
2. Pick a vCMP
3. Run "f5rancid -d <vCMP>"
4. This will give lots of screen output plus a new file with the full
text output from the device in the current directory
5. In the screen output will be the full clogin command used. Copy paste
that command and run it manually. Verify that the full command set works
as expected on a vCMP
6. Look inside the raw data file from step 3. Somewhere near the end I
expect to see error messages of some kind. Those errors will tell you
where we look next.

Note that "missed cmd(s)" and "End of run not found" messages are
useless for debugging purposes: they are catch-all output and only
indicate that something went wrong. They give no clue as to why.
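
Steps 1 to 4 above can be wrapped in a small helper that runs a command in a fresh scratch directory and reports which files it left behind. This is a minimal Python sketch, not anything shipped with RANCID; the name run_in_scratch_dir and the "rancid-debug-" prefix are assumptions, and the demo uses a stand-in command so that it runs on any machine.

```python
import os
import subprocess
import sys
import tempfile


def run_in_scratch_dir(argv, timeout=300):
    """Run argv in a new temp directory; return (directory, output, {file: size})."""
    # mkdtemp keeps the directory afterwards so the files can be inspected.
    scratch = tempfile.mkdtemp(prefix="rancid-debug-")
    proc = subprocess.run(
        argv,
        cwd=scratch,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # debug output may arrive on either stream
        text=True,
        timeout=timeout,
    )
    files = {
        name: os.path.getsize(os.path.join(scratch, name))
        for name in sorted(os.listdir(scratch))
    }
    return scratch, proc.stdout, files


# Stand-in command for the demo: writes one 10-byte file and prints a line.
demo = [sys.executable, "-c", "open('demo.new', 'w').write('x' * 10); print('done')"]
scratch, output, files = run_in_scratch_dir(demo)
print(scratch, files)
```

For the case in this thread the argument list would be something like ["f5rancid", "-d", "10.255.128.148"], run as the rancid user. A zero-size or missing .new file in the returned listing is the thing to look for.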
_______________________________________________
Rancid-discuss mailing list
http://www.shrubbery.net/mailman/listinfo/rancid-discuss
--
Alan McKinnon
***@gmail.com
Michael Sloan
2013-12-03 17:44:35 UTC
Thank you for the additional troubleshooting suggestions, although I'm not sure that I'm closer to a solution with this problem. I'll recap what I've learned from troubleshooting, and then show the file/screen output.

The troubleshooting/debugging recap:

Manually executing 'f5rancid <F5 device>' as the rancid user produces a <F5 device>.new file.
Manually executing 'f5rancid <F5 vCMP>' as the rancid user produces a <F5 vCMP>.new file.

The f5rancid script first connects and determines the version of the F5 OS in use, and then initiates a second connection to the F5 or vCMP to issue the commands for the newer version of the F5 OS. If you run this second clogin command as the rancid user, you see all the correct screen output, but no file is created - this is true for both the F5 physical chassis and any vCMP.

As far as I can tell, there aren't any differences in the behavior of the F5 chassis and the F5 vCMP, so I'm at a loss as to why the F5 chassis output files are created and the vCMP files are not.


-----
The troubleshooting/debugging information:

The screen output from "f5rancid -d <vCMP>":

-bash-3.1$ f5rancid -d 10.255.128.148
executing clogin -t 90 -c "bigpipe version 2>&1" 10.255.128.148
The F5 says to use tmsh, using tmsh command table for config collection.
executing clogin -t 90 -c "tmsh show /sys version;tmsh show /sys hardware;tmsh show /sys license;cat /config/ZebOS.conf;lsof -i :179;tmsh show /net route static;tmsh -q list" 10.255.128.148
PROMPT MATCH: \[***@test-prod2:/S1-green-P:Active:In Sync\] config #
HIT COMMAND:[***@test-prod2:/S1-green-P:Active:In Sync] config # tmsh show /sys version
In ShowVersion: [***@test-prod2:/S1-green-P:Active:In Sync] config # tmsh show /sys version
HIT COMMAND:[***@test-prod2:/S1-green-P:Active:In Sync] config # tmsh show /sys hardware
In ShowHardware: [***@test-prod2:/S1-green-P:Active:In Sync] config # tmsh show /sys hardware
HIT COMMAND:[***@test-prod2:/S1-green-P:Active:In Sync] config # tmsh show /sys license
In ShowLicense: [***@test-prod2:/S1-green-P:Active:In Sync] config # tmsh show /sys license
HIT COMMAND:[***@test-prod2:/S1-green-P:Active:In Sync] config # cat /config/ZebOS.conf
In ShowZebOSconf: [***@test-prod2:/S1-green-P:Active:In Sync] config # cat /config/ZebOS.conf
HIT COMMAND:[***@test-prod2:/S1-green-P:Active:In Sync] config # lsof -i :179
In ShowZebOSsockets: [***@test-prod2:/S1-green-P:Active:In Sync] config # lsof -i :179
HIT COMMAND:[***@test-prod2:/S1-green-P:Active:In Sync] config # tmsh show /net route static
In ShowRouteStatic: [***@test-prod2:/S1-green-P:Active:In Sync] config # tmsh show /net route static
HIT COMMAND:[***@test-prod2:/S1-green-P:Active:In Sync] config # tmsh -q list
In WriteTerm: [***@test-prod2:/S1-green-P:Active:In Sync] config # tmsh -q list

And the file 10.255.128.148.new is created (about 17k in size).

If you use clogin <vCMP> to connect to the device and try the commands listed in the second "executing clogin" sequence, several produce no output (for instance, 'tmsh show /net route static', because there are no static routes), and one produces an error message ('cat /config/ZebOS.conf'), because the file doesn't exist anywhere on the vCMP filesystem or on the F5 physical chassis filesystem. The rest produce the expected output.

There are no error messages in the *.new output file, aside from the 'file not found' error message from the above-mentioned 'cat' command. Both 'f5rancid <vCMP>' and 'f5rancid -d <vCMP>' produce <vCMP>.new files. The actual clogin command executed second:

clogin -t 90 -c "tmsh show /sys version;tmsh show /sys hardware;tmsh show /sys license;cat /config/ZebOS.conf;lsof -i :179;tmsh show /net route static;tmsh -q list" 10.255.128.148

produces no file on the RANCID server, even though the screen output displays the correct output. As an additional test, running that same clogin command on one of the physical chassis produces no file, although 'f5rancid <F5 chassis>' does.


Michael Sloan
Systems Programmer Network Support
Office: (850) 922-5476
Northwood Shared Resource Center
***@nsrc.myflorida.com



Alan McKinnon
2013-12-03 20:16:17 UTC
Hi Michael,

All the info you've given here indicates that things are working
correctly. The f5rancid -d output with the "HIT COMMAND" sections
especially shows that data was collected and it's in a useable format -
the parser detected the prompt and then found the expected commands in
the expected order.

This is good news, as you've narrowed down considerably the piece of
code that contains your bug. Briefly, how rancid runs is:

- rancid-run is the script you launch
- rancid-run launches control_rancid for each group of devices in turn
- control_rancid launches par
- par runs PAR_COUNT number of parallel sub-processes, one per device
- Each of those sub-processes starts rancid-fe which uses the device
type from router.db to start the appropriate rancid script (in your case
f5rancid)

- f5rancid runs clogin to fetch all the config info from the device,
usually it goes into a .raw disk file, but there is an option to use
pipes as well
- f5rancid then goes through that saved output line by line making sense
out of it, discarding unwanted text and writing the full desired output
to a .new file

The next bit is where I'm somewhat fuzzy (it's never failed me yet):

- the .new file is diff'ed with the previous fetched config, renamed and
booked into CVS and various mail notifications are generated and sent.
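
The parsing stage described above (walking the saved session text and recognising each command after a prompt) can be illustrated roughly as follows. This is a simplified Python sketch of the idea only; it is not the actual f5rancid logic, which is written in Perl, and the function name split_by_command, the prompt pattern and the session text are all made up for the example.

```python
import re


def split_by_command(raw_text, prompt_regex, commands):
    """Split session text into {command: output lines}; also list missed commands."""
    prompt = re.compile(prompt_regex)
    sections = {}
    current = None
    for line in raw_text.splitlines():
        match = prompt.match(line)
        if match:
            typed = line[match.end():].strip()
            # A prompt followed by a known command opens a new section;
            # any other prompt line closes the current one.
            current = typed if typed in commands else None
            if current is not None:
                sections[current] = []
            continue
        if current is not None:
            sections[current].append(line)
    missed = [cmd for cmd in commands if cmd not in sections]
    return sections, missed


# Hypothetical session text; the prompt and the output lines are invented.
PROMPT = r"\[admin@lb1\] config # ?"
raw = """\
[admin@lb1] config # tmsh show /sys version
version line 1
version line 2
[admin@lb1] config # tmsh show /net route static
[admin@lb1] config # exit
"""
wanted = ["tmsh show /sys version", "tmsh show /net route static", "tmsh -q list"]
sections, missed = split_by_command(raw, PROMPT, wanted)
print(missed)
```

In this sketch a command that was seen but returned nothing still counts as found, and only a command whose prompt line never appears is reported as missed. Whether f5rancid draws the line in the same place is something only its own source can confirm.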



Your setup appears to be working correctly up to the point where a .new
file is generated, and everything else is common code. This doesn't
leave much to examine: basically the last 20 lines of f5rancid after the
main loop labelled TOP.

I can't meaningfully help much further than this. I don't have any F5s,
so I think you need to debug further by reading the code. How's your Perl?
--
Alan McKinnon
***@gmail.com