Friday, October 14, 2005

ssh not ending (sometimes) from inside a script, why?

Okay, I have this script file that runs from a cron job (on a unix box
running Solaris 9 with SSH version Sun_SSH_1.0, protocols 1.5/2.0) and
most of the time it works just fine. Except every so often one of the
three ssh commands I have in the script just doesn't know it's done,
and that of course causes the whole thing to hang!

The ssh command has executed. I can tell this because the command is
as follows:

ssh username@hostname "ls /somedirectory" >somedir.file

and a new somedir.file has been created on the machine running the
cron script.

Also, when this happens I can use the "kill sshprocessid" command and
the cron script will continue, fat dumb and happy!

The machine I'm "ssh"ing to is a unix box with OpenSSH_3.7.1p2
protocols 1.5/2.0, OpenSSL 0.9.6 on it.

I have tried running the ssh command in -v mode and looking at the
output. I can't really make much of it, but I can see a difference in
the order of things when the command hangs.

The -v output when ssh ended properly is as follows:

1 - debug1: Entering interactive session.
2 - debug1: client_init id 0 arg 0
3 - debug1: Sending command: ls /Rawdata/Archive2/*VCID4.tlm.gz
4 - debug1: channel 0: open confirm rwindow 0 rmax 32768
5 - debug1: channel 0: read<=0 rfd 6 len 0
6 - debug1: channel 0: read failed
7 - debug1: channel 0: input open->drain
8 - debug1: channel 0: close_read
9 - debug1: channel 0: input: no drain shortcut
10 - debug1: channel 0: ibuf empty
11 - debug1: channel 0: input drain->closed
12 - debug1: channel 0: send eof
13 - debug1: channel 0: rcvd eof
14 - debug1: channel 0: output open->drain
15 - debug1: channel: 0 rcvd request for exit-status
16 - debug1: cb_fn 267a4 cb_event 91
17 - debug1: channel 0: rcvd close
18 - debug1: channel 0: obuf empty
19 - debug1: channel 0: output drain->closed
20 - debug1: channel 0: close_write
21 - debug1: channel 0: send close
22 - debug1: channel 0: full closed2
23 - debug1: channel_free: channel 0: status: The following
connections are open: #0 client-session (t4 r0 i8/0 o128/0 fd -1/-1)

24 - debug1: channel_free: channel 0: dettaching channel user
25 - debug1: Transferred: stdin 0, stdout 0, stderr 0 bytes in 0.6
seconds
26 - debug1: Bytes per second: stdin 0.0, stdout 0.0, stderr 0.0
27 - debug1: Exit status 0

The -v output when ssh hung is as follows:

1 - debug1: Entering interactive session.
2 - debug1: client_init id 0 arg 0
3 - debug1: Sending command: ls /Rawdata/Archive2/*VCID4.tlm.gz
4 - debug1: channel 0: open confirm rwindow 0 rmax 32768
5 - debug1: channel 0: read<=0 rfd 6 len 0
6 - debug1: channel 0: read failed
7 - debug1: channel 0: input open->drain
8 - debug1: channel 0: close_read
9 - debug1: channel 0: input: no drain shortcut
10 - debug1: channel 0: ibuf empty
11 - debug1: channel 0: input drain->closed
12 - debug1: channel 0: send eof
13 - debug1: channel 0: rcvd eof
14 - debug1: channel 0: output open->drain
15 - debug1: channel 0: obuf empty
16 - debug1: channel 0: output drain->closed
17 - debug1: channel 0: close_write
18 - debug1: channel 0: send close
19 - debug1: channel: 0 rcvd request for exit-status
20 - debug1: cb_fn 267a4 cb_event 91
21 - debug1: channel 0: rcvd close
22 - debug1: channel 0: full closed2
23 - debug1: channel_free: channel 0: status: The following
connections are open: #0 client-session (t4 r0 i8/0 o128/0 fd -1/-1)

24 - debug1: channel_free: channel 0: dettaching channel user

Note I put the numbers on the trace.
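
Comparing the two traces by eye, the difference is in the ordering: in the good run, "rcvd request for exit-status", "cb_fn" and "rcvd close" (lines 15-17) arrive before "obuf empty", while in the hung run they arrive only after "send close" (lines 19-21). The hung trace also never reaches the "Transferred" and "Exit status" lines. A rough sketch for making that comparison mechanical is below. It assumes a POSIX-style shell with sed and diff, and the helper name and file names are made up for illustration.

```shell
# strip_trace_numbers FILE
# Hypothetical helper: removes the hand-added "N - " (or "N -") prefix
# from each line so two traces can be compared with diff.
# Lines without such a prefix (wrapped continuation lines) pass through unchanged.
strip_trace_numbers() {
    sed 's/^[0-9][0-9]* *- *//' "$1"
}
```

With the two traces saved as, say, good.trace and hung.trace (assumed names), this would be used as:

    strip_trace_numbers good.trace > good.clean
    strip_trace_numbers hung.trace > hung.clean
    diff good.clean hung.clean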

I figure there is a race condition of some kind going on for the
end-of-command signal, and once in a while it gets missed, but... I
just don't know where else to look.

I'll take any ideas or comments, please!

http://www.snailbook.com/faq/background-jobs.auto.html

--

Subject: Re: ssh not ending (sometimes) from inside a script, why?

Thank you for the link. I did read through this before posting my
question, but I just read through it again and wonder whether the
answer to my question is simply that it is going to hang sometimes and
there is nothing I can do about it.

This is of course not the answer I was hoping for but if it is so I
guess I should move on and just write code to kill the process after a
period of time. I just don't think that this answer is very pretty :(
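
For what it's worth, the "kill it after a period of time" workaround can be kept fairly small. The following is only a sketch, assuming a POSIX-compatible shell (for example ksh rather than the old Bourne /bin/sh). The function name run_with_timeout and the 60-second limit in the usage line are made up for illustration and are not part of ssh.

```shell
# run_with_timeout SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND in the background and sends it SIGTERM
# if it is still running after SECONDS. Returns the command's exit status;
# a status above 128 normally means the command was killed by a signal.
run_with_timeout() {
    limit=$1
    shift

    # Start the real command in the background and remember its PID.
    "$@" &
    cmd_pid=$!

    # Watchdog: sleep for the limit, then kill the command if it is still there.
    # Its output is discarded so it never holds the caller's stdout open.
    (
        sleep "$limit"
        kill "$cmd_pid" 2>/dev/null
    ) >/dev/null 2>&1 &
    watchdog_pid=$!

    # Wait for the command, whether it ends by itself or is killed.
    status=0
    wait "$cmd_pid" || status=$?

    # Stop the watchdog so it cannot fire later. Its sleep child may linger
    # until the limit runs out, but by then nothing is left to kill the command.
    kill "$watchdog_pid" 2>/dev/null || :
    wait "$watchdog_pid" 2>/dev/null || :

    return "$status"
}
```

In the cron script the ssh line would then look something like this:

    run_with_timeout 60 ssh username@hostname "ls /somedirectory" >somedir.file

Since the listing file has already been written by the time ssh hangs, killing the stuck process should not lose any output, which matches what happens when it is killed by hand.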

Again any thoughts or comments are welcome, and thanks.
Kym

End of messages
