Discussion:
[rancid] Any idea on why ssh would resolve hostnames differently from an interactive shell?
Janet Plato
2007-11-26 02:07:02 UTC
Hello,

I hope this is an appropriate place to ask questions; if not, I
apologize and would appreciate a pointer to the correct location. The
essence of my question is this: when clogin spawns ssh directly, the
ssh process fails to resolve the hostname (strace -f shows it
generating a rather cryptic 17-byte request that does not appear on
the wire, as far as I can tell). When clogin spawns bash -c ssh, it
opens a socket on port 53 and sends a normal volley of requests, one
for each domain in resolv.conf, and ends up getting an answer and
connecting successfully.

I can also have clogin spawn a bash shell and interact, in which case
everything works great.

I am running a config management system more or less based on rancid
using the clogin expect script. I have tweaked the script a bit to
deal with a variety of cisco and non-cisco devices, as well as to
handle running commands that take longer to execute such as archive
download-software. I would be glad to provide copies of the source to
anyone interested, the changes are minimal but possibly of interest to
someone.

I find myself extending the types of service to include ssh v2 and I
am having some trouble. When I have expect "spawn ssh -x ***@device",
it fails to resolve the hostname and returns EOF, which makes clogin
exit. When I "spawn bash" and interact, or "spawn /bin/bash -c 'ssh
device'", it works just fine. I've used strace to determine what is
different, but I am having trouble understanding the output; it appears
my DNS resolution method changes when I spawn ssh versus when I spawn
bash -c ssh.

I assume most folks are familiar with the foreach device loop in
clogin, and contained within it the case statement where for each
device you try to determine the connection method and spawn the
relevant code. I am copying just the bit of case statement with some
comments

# Log into the router.
proc login { router user userpswd passwd enapasswd cmethod cyphertype } {
    global spawn_id in_proc do_command do_script platform sshver
    global prompt u_prompt p_prompt e_prompt
    set in_proc 1
    set uprompt_seen 0

    # debug 1
    # exp_internal 1

    # try each of the connection methods in $cmethod until one is successful
    set progs [llength $cmethod]
    foreach prog [lrange $cmethod 0 end] {
        if [string match "telnet*" $prog] {
            ... code for telnet deleted ...
        } elseif ![string compare $prog "bash-ssh"] {
            # if bash spawns ssh, I can login fine
            if [ catch {spawn /bin/bash -c "ssh $sshver -c $cyphertype -x $user@$router"} reason ] {
                send_user "\nError: ssh failed: $reason\n"
                exit 1
            }
        } elseif ![string compare $prog "ssh"] {
            # this fails, ssh returns EOF when it cannot determine the host name
            # spawn ssh -c 3des -x ***@fa-cssc-b280c-3-ban-pri
            # ssh: : Name or service not known
            # sniffing the wire shows no query, strace shows a cryptic 17 bytes
            # send followed by a DNS failure
            if [ catch {spawn ssh $sshver -c $cyphertype -x $user@$router} reason ] {
                send_user "\nError: ssh failed: $reason\n"
                exit 1
            }


Below is the strace -f from a failed clogin attempt, note the
send which has to be the DNS request, since I just opened a socket on
port 53. I do not understand how it could be though, it does not look
like a valid packet or fragment.

[pid 27642] send(4, "\205\r\1\0\0\1\0\0\0\0\0\0\0\0\1\0\1", 17, 0) = 17

and the following recvfrom

[pid 27642] recvfrom(4,
"\205\r\201\200\0\1\0\0\0\1\0\0\0\0\1\0\1\0\0\6\0\1\0\0" ..., 1024, 0,
{sa_family=AF_INET, sin_port=htons(53),
sin_addr=inet_addr("128.104.254.254")}, [16]) = 92
[pid 27642] close(4) = 0
[pid 27642] write(2, "ssh: : Name or service not known"..., 34) = 34
[pid 27642] exit_group(255)

The strace manpage says the \### stuff is supposed to be in a format
a c programmer would understand, but I do not understand it. Is it a
mix of octal and the \t, \r, \n we all know and love? In some cases I
have seen \Dg which kind of throws the octal and normal escape sequence
theory out the window. Knowing what strace is telling me would be a
fine start for me.

Strace from a failed attempt
------------------------------------------------------------
[pid 27642] socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
[pid 27642] connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("128.104.254.254")}, 28) = 0
[pid 27642] fcntl64(4, F_GETFL) = 0x2 (flags O_RDWR)
[pid 27642] fcntl64(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid 27642] gettimeofday({1195955591, 14548}, NULL) = 0
[pid 27642] poll([{fd=4, events=POLLOUT, revents=POLLOUT}], 1, 0) = 1
[pid 27642] send(4, "\205\r\1\0\0\1\0\0\0\0\0\0\0\0\1\0\1", 17, 0) = 17
[pid 27642] poll([{fd=4, events=POLLIN, revents=POLLIN}], 1, 5000) = 1
[pid 27642] ioctl(4, FIONREAD, [92]) = 0
[pid 27642] recvfrom(4, "\205\r\201\200\0\1\0\0\0\1\0\0\0\0\1\0\1\0\0\6\0\1\0\0"..., 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("128.104.254.254")}, [16]) = 92
[pid 27642] close(4) = 0
[pid 27642] write(2, "ssh: : Name or service not known"..., 34) = 34
[pid 27642] exit_group(255)

Here is the strace -f from a successful attempt
--------------------------------------------------------------
[pid 17019] socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
[pid 17019] connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("128.104.254.254")}, 28) = 0
[pid 17019] fcntl64(4, F_GETFL) = 0x2 (flags O_RDWR)
[pid 17019] fcntl64(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid 17019] gettimeofday({1195953633, 817845}, NULL) = 0
[pid 17019] poll([{fd=4, events=POLLOUT, revents=POLLOUT}], 1, 0) = 1
[pid 17019] send(4, "\233\233\1\0\0\1\0\0\0\0\0\0\10fa-janet\3net\4wisc\3e"..., 39, 0) = 39
[pid 17019] poll([{fd=4, events=POLLIN, revents=POLLIN}], 1, 5000) = 1
[pid 17019] ioctl(4, FIONREAD, [184]) = 0
[pid 17019] recvfrom(4, "\233\233\201\200\0\1\0\1\0\3\0\4\10fa-janet\3net\4wisc"..., 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("128.104.254.254")}, [16]) = 184
[pid 17019] close(4) = 0
[pid 17019] socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 4
[pid 17019] connect(4, {sa_family=AF_INET, sin_port=htons(22), sin_addr=inet_addr("128.104.137.36")}, 16) = 0

Anyway, if anyone can shed light on the following questions I would be
quite grateful.

- why would ssh resolve hostnames differently when spawned by expect
versus when invoked by bash (which was spawned from expect)

- what are the \### in the strace output telling me, especially the 17
byte send that appears to be a DNS request that is doomed to fail.

I hope everyone had a great thanksgiving, gobble, gobble,

Janet Plato


john heasley
2007-11-28 19:45:05 UTC
Post by Janet Plato
I find myself extending the types of service to include ssh v2 and I
[pid 27642] recvfrom(4,
"\205\r\201\200\0\1\0\0\0\1\0\0\0\0\1\0\1\0\0\6\0\1\0\0" ..., 1024, 0,
{sa_family=AF_INET, sin_port=htons(53),
sin_addr=inet_addr("128.104.254.254")}, [16]) = 92
[pid 27642] close(4) = 0
[pid 27642] write(2, "ssh: : Name or service not known"..., 34) = 34
[pid 27642] exit_group(255)
The strace manpage says the \### stuff is supposed to be in a format
a c programmer would understand, but I do not understand it. Is it a
mix of octal and the \t, \r, \n we all know and love? In some cases I
have seen \Dg which kind of throws the octal and normal escape sequence
theory out the window. Knowing what strace is telling me would be a
fine start for me.
my guess would be those are decimal. I'd expect octals to be \0xxx.
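
For reference, strace's default string output uses C string-literal escapes, so the numeric ones (\205, \10, ...) are octal and sit alongside the usual \r, \n, \t. Below is a minimal Python sketch, not part of clogin or strace, that decodes the 17-byte send from the failed trace and reads it as a DNS query. The helper names strace_unescape and parse_query are made up for this illustration, and only the escape forms that appear in this thread (plus the common single-letter ones) are handled.

```python
import struct

# Single-character C escapes that strace may print (illustrative subset).
_SIMPLE = {"n": 0x0A, "t": 0x09, "r": 0x0D, "v": 0x0B, "f": 0x0C,
           "a": 0x07, "b": 0x08, "\\": 0x5C, '"': 0x22}


def strace_unescape(text):
    """Turn a C-style escaped string, as printed by strace, into raw bytes."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] != "\\":
            out.append(ord(text[i]))
            i += 1
            continue
        i += 1  # skip the backslash
        if i >= len(text):
            raise ValueError("dangling backslash")
        if text[i] in "01234567":
            # Octal escape: one to three octal digits.
            j = i
            while j < len(text) and j - i < 3 and text[j] in "01234567":
                j += 1
            out.append(int(text[i:j], 8))
            i = j
        elif text[i] in _SIMPLE:
            out.append(_SIMPLE[text[i]])
            i += 1
        else:
            raise ValueError("unsupported escape: \\" + text[i])
    return bytes(out)


def parse_query(packet):
    """Parse the 12-byte DNS header and the first question of a query."""
    ident, flags, qdcount, ancount, nscount, arcount = struct.unpack(
        "!6H", packet[:12])
    labels = []
    pos = 12
    while packet[pos] != 0:  # a zero length byte ends the name
        length = packet[pos]
        labels.append(packet[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    qtype, qclass = struct.unpack("!2H", packet[pos + 1:pos + 5])
    return {"id": ident, "flags": flags, "qdcount": qdcount,
            "ancount": ancount, "name": ".".join(labels),
            "qtype": qtype, "qclass": qclass}


# The send() from the failed trace, copied from the strace output.
failed_query = strace_unescape(r"\205\r\1\0\0\1\0\0\0\0\0\0\0\0\1\0\1")
question = parse_query(failed_query)

# The first 12 bytes of the matching recvfrom() are the reply header.
reply = strace_unescape(
    r"\205\r\201\200\0\1\0\0\0\1\0\0\0\0\1\0\1\0\0\6\0\1\0\0")
reply_ancount = struct.unpack("!6H", reply[:12])[3]

print(len(failed_query), repr(question["name"]), question["qtype"])
print("answers in reply:", reply_ancount)
```

Read this way, the 17 bytes are a well-formed query: a 12-byte header, a single zero byte for an empty name, then type A and class IN. The reply header carries an answer count of zero, which is consistent with the "Name or service not known" error.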
Janet Plato
2007-11-28 20:14:34 UTC
Hi John,

Thanks for replying. I figured this out yesterday. One of the
variables, $sshver, is occasionally empty. When spawn creates the argv
array, it includes the empty value as an argument of its own; ssh takes
that empty argument to be the hostname, tries to resolve it, and fails.
When ssh exits, clogin sees an EOF.

If spawn instead passes bash -c "ssh $sshver etc", the empty $sshver
simply becomes a blank spot in the string. bash has to build the argv
array itself before exec'ing ssh, the extra whitespace is dropped, and
so it works.
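
The difference between the two invocations can be sketched outside Expect. The Python below is only an analogue, not clogin code: a list stands in for the argv that spawn builds, and shlex.split stands in for the word splitting a POSIX shell performs on the -c string. The function names and the sample user and router values are hypothetical.

```python
import shlex


def direct_argv(sshver, cipher, user, router):
    """Argv as built when each variable becomes its own argument."""
    return ["ssh", sshver, "-c", cipher, "-x", f"{user}@{router}"]


def shell_argv(sshver, cipher, user, router):
    """Argv as a shell would build it from one interpolated string."""
    command = f"ssh {sshver} -c {cipher} -x {user}@{router}"
    return shlex.split(command)


def safe_argv(sshver, cipher, user, router):
    """Direct argv with empty arguments dropped before exec."""
    return [arg for arg in direct_argv(sshver, cipher, user, router)
            if arg != ""]


# Hypothetical values; sshver is empty, as in the failing case.
args = ("", "3des", "admin", "router1.example.net")

print(direct_argv(*args))  # contains an empty string where sshver was
print(shell_argv(*args))   # the blank spot disappears during splitting
print(safe_argv(*args))    # same result without going through a shell
```

With an empty version flag, the direct form hands ssh an empty string as its first non-option argument, which is what ends up treated as the hostname. Filtering empty arguments before the spawn gives the same argv as the shell route.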

I replaced this:
} elseif ![string compare $prog "ssh"] {
    if [ catch {spawn ssh $sshver -c $cyphertype -x -l $user $router} reason ] {

with this:

} elseif ![string compare $prog "ssh"] {
    set mycommand [concat ssh -c $cyphertype $sshver -x $user@$router]
    if [ catch "spawn $mycommand" reason ] {
        send_user "\nError: ssh failed: $reason\n"
        exit 1
    }


and all is well.

Cheers,

Janet
Chris Moody
2007-11-29 09:36:12 UTC
Whom should I contact to (re)submit a patch I contributed to handle
Tacacs "PASSCODE" prompts on C* devices?

I just pulled the latest source and don't see the PASSCODE bits in
clogin anyplace. Am I overlooking that prompt type being handled
another way?

Cheers,
-Chris
