Discussion:
[rancid] control_rancid slow start
Robert Drake
2014-11-10 06:10:15 UTC
Has anyone looked at the control_rancid script recently? Here are some
timestamps from one of my runs (with a couple of added date stamps to
show where the time goes). I'm only running rancid against one device,
but there are 1400 devices in total in the group.

rancid-run -r <gw1-test-node> <testgroup>

starting: Mon Nov 10 00:33:34 EST 2014
begin control_rancid: Mon Nov 10 00:33:34 EST 2014

Trying to get all of the configs. Mon Nov 10 00:35:33 EST 2014
All routers sucessfully completed.

cvs diff: Diffing .
cvs diff: Diffing configs
cvs commit: Examining .
cvs commit: Examining configs

ending: Mon Nov 10 00:35:37 EST 2014


If I comment the following code out it runs in less than 3 seconds:

# check for 'up' routers missing in RCS. no idea how this happens to some folks
for router in `cut -d\; -f1 ../routers.up` ; do
    if [ $RCSSYS = cvs ] ; then
        cvs status $router | grep -i 'status: unknown' > /dev/null 2>&1
    else
        svn status $router | grep '^?' > /dev/null 2>&1
    fi
    if [ $? -eq 0 ] ; then
        touch $router
        if [ $RCSSYS = cvs ] ; then
            cvs add -ko $router
        else
            svn add $router
        fi
        echo "$RCSSYS added missing router $router"
    fi
done

Possible better option would be this (I think this will work with svn
but I don't have a tree to test it on):

cut -d: -f1 ../routers.up | xargs cvs status | grep -i 'status: unknown'

Example test case:

(echo test ; cut -d: -f1 ../routers.up) | xargs cvs status | grep -i 'status: unknown'
cvs status: nothing known about test
File: no file test Status: Unknown
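
A minimal sketch of how the batched check could feed a later add step, assuming the `cvs status` message formats shown in the test case above (other CVS versions may word or quote them differently). The function name `unknown_in_cvs` is hypothetical and not part of control_rancid. It only parses text on stdin, so the real pipeline appears in a comment:

```shell
#!/bin/sh
# unknown_in_cvs (hypothetical helper): read `cvs status` output, with
# stdout and stderr merged, and print each file name reported as unknown.
# Each name is printed once, even though CVS mentions a missing file twice.
unknown_in_cvs() {
    awk '
        # File absent from both disk and CVS (format as quoted in the thread).
        /^cvs status: nothing known about / { name = $6 }
        # File present on disk but not in CVS: "File: NAME  Status: Unknown".
        /^File: / && $NF == "Unknown" && !/^File: no file / { name = $2 }
        name != "" {
            if (!(name in seen)) { seen[name] = 1; print name }
            name = ""
        }
    '
}

# Assumed usage, run from the configs directory as in the loop above:
#   cut -d\; -f1 ../routers.up | xargs cvs status 2>&1 | unknown_in_cvs
```

This keeps the single `cvs status` call and leaves the decision about what to do with the names (touch, add) to a separate step.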


Another option might be to have a CLI argument that says "skip
rebuilding router.db.* and checking CVS stuff because we're reasonably
certain that is fine right now". Finally, I would recommend abstracting
most of the router.db.* rebuild into another script and rewriting it in
perl because it's almost unreadable now.

I can submit patches if this is too much for informal email.

Thanks,
Robert
heasley
2014-11-13 01:23:00 UTC
Mon, Nov 10, 2014 at 01:10:15AM -0500, Robert Drake:
> Has anyone looked at the control_rancid script recently? Here are some
> timestamps from one of my runs (with a couple of added date stamps to
> show where the time goes). I'm only running rancid against one device,
> but there are 1400 devices in total in the group.
>
> rancid-run -r <gw1-test-node> <testgroup>
>
> starting: Mon Nov 10 00:33:34 EST 2014
> begin control_rancid: Mon Nov 10 00:33:34 EST 2014
>
> Trying to get all of the configs. Mon Nov 10 00:35:33 EST 2014

it takes ~3s for 200 devices in svn. i haven't timed cvs yet. is it possible
that you have defined cvswrappers that are slow? or that you have a massive
cvs history file that is making the operation slow?
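
A rough way to check both suspicions, assuming the standard CVS layout in which the administrative files `history` and `cvswrappers` live in the `CVSROOT` directory inside the repository. The function name `cvs_admin_report` is made up for this sketch:

```shell
#!/bin/sh
# cvs_admin_report REPO (hypothetical helper): report the size of the
# history file and the number of active (non-comment, non-blank) lines
# in cvswrappers under REPO/CVSROOT.
cvs_admin_report() {
    admin=$1/CVSROOT
    if [ -f "$admin/history" ]; then
        echo "history: $(wc -c < "$admin/history" | tr -d ' ') bytes"
    else
        echo "history: absent"
    fi
    if [ -f "$admin/cvswrappers" ]; then
        echo "cvswrappers: $(grep -c -v -e '^#' -e '^[[:space:]]*$' "$admin/cvswrappers") active lines"
    else
        echo "cvswrappers: absent"
    fi
}

# Assumed usage, with CVSROOT pointing at the repository:
#   cvs_admin_report "$CVSROOT"
```

A history file of many megabytes or any active wrapper lines would be worth a closer look; what counts as "too large" is not something this sketch can decide.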

> All routers sucessfully completed.
>
> cvs diff: Diffing .
> cvs diff: Diffing configs
> cvs commit: Examining .
> cvs commit: Examining configs
>
> ending: Mon Nov 10 00:35:37 EST 2014
>
>
> If I comment the following code out it runs in less than 3 seconds:
>
> # check for 'up' routers missing in RCS. no idea how this happens to some folks
> for router in `cut -d\; -f1 ../routers.up` ; do
> if [ $RCSSYS = cvs ] ; then
> cvs status $router | grep -i 'status: unknown' > /dev/null 2>&1
> else
> svn status $router | grep '^?' > /dev/null 2>&1
> fi
> if [ $? -eq 0 ] ; then
> touch $router
> if [ $RCSSYS = cvs ] ; then
> cvs add -ko $router
> else
> svn add $router
> fi
> echo "$RCSSYS added missing router $router"
> fi
> done
>
> Possible better option would be this (I think this will work with svn
> but I don't have a tree to test it on):
>
> cut -d: -f1 ../routers.up | xargs cvs status | grep -i 'status: unknown'
>
> Example test case:
>
> (echo test ; cut -d: -f1 ../routers.up) | xargs cvs status | grep -i 'status: unknown'
> cvs status: nothing known about test
> File: no file test Status: Unknown

that doesn't quite work for non-existent files.

> Another option might be to have a CLI argument that says "skip
> rebuilding router.db.* and checking CVS stuff because we're reasonably
> certain that is fine right now". Finally, I would recommend abstracting
> most of the router.db.* rebuild into another script and rewriting it in
> perl because it's almost unreadable now.

i can see treating -r differently, while still maintaining the integrity check.

> I can submit patches if this is too much for informal email.
>
> Thanks,
> Robert
> _______________________________________________
> Rancid-discuss mailing list
> Rancid-***@shrubbery.net
> http://www.shrubbery.net/mailman/listinfo/rancid-discuss
Robert Drake
2016-08-16 06:50:35 UTC
On 11/12/2014 8:23 PM, heasley wrote:
>> If I comment the following code out it runs in less than 3 seconds:
>>
>> # check for 'up' routers missing in RCS. no idea how this happens to some folks
>> for router in `cut -d\; -f1 ../routers.up` ; do
>> if [ $RCSSYS = cvs ] ; then
>> cvs status $router | grep -i 'status: unknown' > /dev/null 2>&1
>> else
>> svn status $router | grep '^?' > /dev/null 2>&1
>> fi
>> if [ $? -eq 0 ] ; then
>> touch $router
>> if [ $RCSSYS = cvs ] ; then
>> cvs add -ko $router
>> else
>> svn add $router
>> fi
>> echo "$RCSSYS added missing router $router"
>> fi
>> done
>>
>> Possible better option would be this (I think this will work with svn
>> but I don't have a tree to test it on):
>>
>> cut -d: -f1 ../routers.up | xargs cvs status | grep -i 'status: unknown'
>>
>> Example test case:
>>
>> (echo test ; cut -d: -f1 ../routers.up) | xargs cvs status | grep -i 'status: unknown'
>> cvs status: nothing known about test
>> File: no file test Status: Unknown
> that doesn't quite work for non-existent files.

The test case shows that it does work for non-existent files: "test" is a
non-existent file, unless I'm mistaken about what you mean by that.
Depending on where the file is missing, you can grep in two different
ways.

File exists on disk in the configs/ dir but it's not in CVS:

cut -d\; -f1 ../routers.up | xargs cvs status 2>&1 | grep 'Status: Unknown' | awk '{print $2}' | xargs cvs add -ko

Router name exists in routers.up, file does not exist in configs/ dir or in CVS:

cut -d\; -f1 ../routers.up | xargs cvs status 2>&1 | grep 'cvs status: nothing' | awk '{print $6}' | xargs cvs add -ko

If you need to do two operations (touch + cvs add), then you've got a
choice. You could put the output of the pipe into a temp file, then run
cat $TEMPFILE | xargs touch && cat $TEMPFILE | xargs cvs add -ko, or you
could take advantage of magic xargs flags to do something like:

cut -d\; -f1 ../routers.up | xargs cvs status 2>&1 | grep 'cvs status: nothing' | awk '{print $6}' | xargs -I % sh -c 'touch %; cvs add -ko %'

Rather than going this route I think I would have two pipelines. The
first part dumps into a tempfile based on what RCS is running, and the
second part (touch/add/commit) is also broken out with a second case
statement that tells it what to do.
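
The two-stage idea could look roughly like this. It is only a sketch: the function names `list_missing` and `add_missing` are invented here, the CVS message format is the one from the test case earlier in the thread, and the svn branch assumes unversioned files show up as lines starting with `?` (as the existing loop's grep does). The commit step is left out.

```shell
#!/bin/sh
# Stage 1: list_missing RCSSYS
# Read status output on stdin and print the names that need to be added.
list_missing() {
    case "$1" in
    cvs) awk '/^cvs status: nothing known about / { print $6 }' ;;
    svn) awk '/^\?/ { print $NF }' ;;
    *)   echo "unsupported RCS: $1" >&2; return 1 ;;
    esac
}

# Stage 2: add_missing RCSSYS LISTFILE
# Create each file named in LISTFILE, then add them all in one RCS call.
add_missing() {
    [ -s "$2" ] || return 0          # empty list: nothing to do
    xargs touch < "$2" || return 1
    case "$1" in
    cvs) xargs cvs add -ko < "$2" ;;
    svn) xargs svn add < "$2" ;;
    *)   echo "unsupported RCS: $1" >&2; return 1 ;;
    esac
}

# Assumed usage inside control_rancid, with TEMPFILE from mktemp:
#   cut -d\; -f1 ../routers.up | xargs $RCSSYS status 2>&1 |
#       list_missing "$RCSSYS" > "$TEMPFILE"
#   add_missing "$RCSSYS" "$TEMPFILE"
```

Checking for an empty list first matters because some xargs implementations run the command once even when there is no input.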

In almost every case in control_rancid, for loops are going to be slower
than a pipeline, because each iteration starts a new process (here, one
cvs invocation per router). In most cases it doesn't matter because the
potential work is not that high, but in the cases where you might need
to run a cvs command 1000 times, it's much better to run it once or
twice on a long list of files.
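
To make the difference concrete without needing a CVS tree, the sketch below counts how many processes each style starts, with `echo` standing in for `cvs status`. The router names are generated and purely hypothetical.

```shell
#!/bin/sh
# Generate N made-up router names, one per line.
gen_names() {
    awk -v n="$1" 'BEGIN { for (i = 1; i <= n; i++) print "router" i }'
}

# Loop style: one external process per name. /bin/echo is used instead of
# the shell builtin so that every iteration really forks and execs.
loop_invocations() {
    for name in $(gen_names "$1"); do
        /bin/echo "$name"
    done | wc -l | tr -d ' '
}

# Batched style: xargs packs as many names as fit onto each command line,
# so the number of output lines equals the number of processes started.
batched_invocations() {
    gen_names "$1" | xargs echo | wc -l | tr -d ' '
}

# Assumed usage:
#   loop_invocations 1000
#   batched_invocations 1000
```

With cvs in place of echo, each of those processes also has to read the CVS metadata in the directory, which is presumably where most of the two minutes goes.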
Charles T. Brooks
2016-08-16 19:41:14 UTC
Pretty much any time you call grep and awk in the same pipeline, there's a better, faster, easier way.

For example, this code:

grep 'cvs status: nothing' | awk '{print $6}' | xargs -I % sh -c 'touch %; cvs add -ko %'

looks like a complicated way to do this:

gawk '/cvs status: nothing/{system("touch " $6 ";cvs add -ko " $6)}'

...but I don't have or want CVS so unfortunately I can't test it.

Note you can do any number of operations in sequence (and conditionally dependent on each other, if you want) in that system() call without any need for temporary files.
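
A slightly more defensive variant of the same idea, as an untested-against-CVS sketch. It uses plain awk rather than gawk (system() is available in both), skips names containing characters the shell could interpret, and runs the add only if touch succeeded. The wrapper name `add_missing_awk` and its optional command argument are hypothetical; the argument exists so the behaviour can be tried with something harmless such as `echo`.

```shell
#!/bin/sh
# add_missing_awk [CMD] (hypothetical helper): read `cvs status` output on
# stdin; for each file CVS knows nothing about, touch it and then run CMD
# on it. CMD defaults to "cvs add -ko".
add_missing_awk() {
    awk -v cmd="${1:-cvs add -ko}" '
        /^cvs status: nothing known about / {
            f = $6
            # Only accept plain names, since f is passed to a shell.
            if (f !~ /^[A-Za-z0-9][A-Za-z0-9._-]*$/) next
            # The && makes the add depend on touch succeeding.
            system("touch " f " && " cmd " " f)
        }
    '
}

# Assumed usage:
#   cut -d\; -f1 ../routers.up | xargs cvs status 2>&1 | add_missing_awk
```

Note that system() still starts one shell per missing file, so this saves the grep and xargs stages rather than the per-file add.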

--Charlie

Arnold Robbins: AWK is a language similar to PERL, only considerably more elegant.
Larry Wall: Hey!
