Discussion:
[rancid] problem with new Aruba/HP 2920
Doug Hughes
2018-01-27 15:57:56 UTC
Got a new HP/Aruba 2920 to replace an old failed 2910al (PoE power
supply failure, very common). Weird thing is, hlogin doesn't work on
it. I get an EOF right after it tries to send enable, while it is trying
to match the many stupid formatting characters (terminal escape
sequences) that the Aruba folks have put into the output.

I have manually set the switch to vt100 terminal type and reloaded, but
still it persists and it's loaded with those characters. I can't say
definitively that they are the issue, but something sure is strange. I
started looking into it and debugging and noticed all that. Also, clogin
seems to work (aside from command incompatibility), but hlogin does not
and gets an EOF prematurely.

Anybody seen this?

I did just upgrade to rancid 3.7 from 3.4.1 to see if that would help,
and it did not. Same behavior.
heasley
2018-01-27 16:23:34 UTC
I haven't any of these. Have one that I can poke remotely? Else, collect
debug info from hlogin -d -c ... devicename
_______________________________________________
Rancid-discuss mailing list
http://www.shrubbery.net/mailman/listinfo/rancid-discuss
Doug Hughes
2018-01-27 16:37:17 UTC
Since my post I have tracked it down to a segfault in hpuifilter.

As soon as I type (or rancid sends) enable, it crashes in a memmove
here:
420                     tlen = 0;
421                     tbuf[0] = '\0';
422                     break;
423                 } else if (bytes > 0) {
424                     tlen -= bytes;
425                     memmove(tbuf, tbuf + bytes, tlen + 1);
426                     if (tlen < 1)
427                         pfds[1].events &= ~POLLOUT;
428                 }
429             }
(gdb)
(gdb) display tbuf
4: tbuf =
"h\000[24;1H\000[24;1H\000[24;1H\000[24;1H\000[24;1H\000\033[24;1H\000[24;1H\000\062\064;1H\000[24;1H\000[24;1H\000;1H\000;17H\000\062\064;17H\000\062\064;17H\000\062\064;17H\000\062\064;17H\000\062\064;17H\000\062\064;17H\000[24;17H\000\062\064;17H\000\064;17H\000\062\064;17H\000\062\064;17H\000\061\067H\000\061\067H\000tandard
commercial license.\r\n\r\n\000ns"...
(gdb) display bytes
5: bytes = -556149
(gdb) display len
No symbol "len" in current context.
(gdb) display tlen
6: tlen = 936748722493063168
(gdb)

Somehow tlen gets really, really big, and that causes a wraparound which
results in a negative value for bytes, which causes memmove to segfault
in hpuifilter.
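
The crash described above could be turned into a clean error by checking the counts before the memmove. The following is only a minimal sketch, not hpuifilter's actual code: consume_front is a hypothetical helper, and it assumes the length is kept as a size_t and that the buffer holds the characters plus a trailing NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of rancid): drop `bytes` characters from
 * the front of buf, which holds *len characters plus a trailing NUL and
 * has room for cap bytes in total.
 * Returns 0 on success, -1 if the counts are inconsistent, in which case
 * nothing is modified.
 */
int
consume_front(char *buf, size_t cap, size_t *len, long bytes)
{
    if (bytes < 0)
        return -1;              /* e.g. a failed or bogus write count */
    if (*len >= cap)
        return -1;              /* length cannot fit in the buffer */
    if ((size_t)bytes > *len)
        return -1;              /* would wrap the unsigned length */

    *len -= (size_t)bytes;
    /* Move the remaining characters and the terminating NUL. */
    memmove(buf, buf + bytes, *len + 1);
    return 0;
}
```

With values like those in the gdb session (a negative bytes, or a tlen far larger than the buffer) the helper returns -1 and leaves the buffer alone, so the caller can log the bad state instead of segfaulting.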

I'm trying to debug further to see what I can find.
Doug Hughes
2018-01-27 17:15:11 UTC
Further oddness:
[***@services bin]$ ls -l hpuifilter
-rwxr-xr-x 1 rancid rancid 26908 Jan 27 10:53 hpuifilter
[***@services bin]$ ls -l ../download/rancid-3.7/bin/hpuifilter
-rwxrwxr-x 1 rancid rancid 27068 Jan 27 10:51
../download/rancid-3.7/bin/hpuifilter

So the one in the bin directory after make install has a more recent
timestamp, which makes sense, but it is also 160 bytes smaller (26908
vs. 27068). If I run gdb on the one in the download directory, it
doesn't core dump and seems to work fine.
Does make install do something funky to the one in the bin directory?

I copied the build version from the build directory into my bin
directory and no more problems with hlogin or hpuifilter.

I wasn't getting very far with gdb. Clearly there's an overwrite in
there somewhere, but I wasn't able to easily set up breakpoints
sufficient to catch it. "Debugging with stdin and terminal response is
hard."
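
One hint that supports the overwrite theory is in the gdb output itself: the reported tlen, 936748722493063168, is exactly 0x0D followed by seven zero bytes, and 0x0D is a carriage return, a character that appears throughout the switch output in tbuf. The sketch below just performs that decoding. The names top_byte and low_bits are illustrative, and it assumes tlen is a 64-bit integer, which the thread does not state.

```c
#include <assert.h>
#include <stdint.h>

/* Value of tlen as printed by gdb in the session above. */
#define TLEN_FROM_GDB UINT64_C(936748722493063168)

/* Most significant byte of a 64-bit value. */
uint8_t
top_byte(uint64_t v)
{
    return (uint8_t)(v >> 56);
}

/* The remaining 56 low-order bits. */
uint64_t
low_bits(uint64_t v)
{
    return v & ((UINT64_C(1) << 56) - 1);
}
```

Here top_byte(TLEN_FROM_GDB) is 0x0D ('\r') and low_bits(TLEN_FROM_GDB) is 0. That pattern looks more like a single stray '\r' written into the variable than the result of arithmetic on a sane length, though it does not prove where the write came from.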
heasley
2018-02-02 18:17:21 UTC
Post by Doug Hughes
-rwxr-xr-x 1 rancid rancid 26908 Jan 27 10:53 hpuifilter
-rwxrwxr-x 1 rancid rancid 27068 Jan 27 10:51
../download/rancid-3.7/bin/hpuifilter
so, the one in the bin directory after make install has a more recent
timestamp.. Makes sense..
but if I run gdb on the one in the download directory, it doesn't core
dump and seems to work fine.
Does make install do something funky to the one in the bin directory?
Sorry for the delay; been very busy.

It might affect the timestamp, but not the size, unless your install is
stripping symbols or using libtool. rancid does not use libtool. You can
test the strip premise by removing -s from the install command in bin/Makefile:

INSTALL_STRIP_PROGRAM = $(install_sh) -c -s

It is possible that your environment has something else that alters the ELF
header, e.g. setting library search paths. 160 bytes is a small difference, so perhaps.