Subject: Re: libssh2_channel_wait_closed(...) issue

Re: libssh2_channel_wait_closed(...) issue

From: Chris Harris <chris.harris_at_kitware.com>
Date: Mon, 1 Oct 2012 08:52:49 -0400

I have a little more information on this. I enable debug trace on the sshd
side.

This is the snippet of trace where I get a -21 error code from libssh2:

Sep 20 10:08:21 salix sshd[14975]: debug1: session_input_channel_req:
session 0 req exec
Sep 20 10:08:21 salix sshd[14975]: debug2: fd 3 setting TCP_NODELAY
Sep 20 10:08:21 salix sshd[14975]: debug2: fd 10 setting O_NONBLOCK
Sep 20 10:08:21 salix sshd[14975]: debug2: fd 9 setting O_NONBLOCK
Sep 20 10:08:21 salix sshd[14975]: debug2: fd 12 setting O_NONBLOCK
Sep 20 10:08:21 salix sshd[14975]: debug2: channel 0: read<=0 rfd 10 len 0
Sep 20 10:08:21 salix sshd[14975]: debug2: channel 0: read failed
Sep 20 10:08:21 salix sshd[14975]: debug2: channel 0: close_read
Sep 20 10:08:21 salix sshd[14975]: debug2: channel 0: input open -> drain
Sep 20 10:08:21 salix sshd[14975]: debug2: channel 0: read 0 from efd 12
Sep 20 10:08:21 salix sshd[14975]: debug2: channel 0: closing read-efd 12
Sep 20 10:08:21 salix sshd[14975]: debug2: channel 0: ibuf empty
Sep 20 10:08:21 salix sshd[14975]: debug2: channel 0: send eof
Sep 20 10:08:21 salix sshd[14975]: debug2: channel 0: input drain -> closed
Sep 20 10:08:21 salix sshd[14975]: debug2: notify_done: reading
Sep 20 10:08:21 salix sshd[14975]: debug1: Received SIGCHLD.
Sep 20 10:08:21 salix sshd[14975]: debug1: session_by_pid: pid 14976
Sep 20 10:08:21 salix sshd[14975]: debug1: session_exit_message: session 0
channel 0 pid 14976
Sep 20 10:08:21 salix sshd[14975]: debug2: channel 0: request exit-status
confirm 0
Sep 20 10:08:21 salix sshd[14975]: debug1: session_exit_message: release
channel 0
Sep 20 10:08:21 salix sshd[14975]: debug2: channel 0: write failed
Sep 20 10:08:21 salix sshd[14975]: debug2: channel 0: close_write
Sep 20 10:08:21 salix sshd[14975]: debug2: channel 0: send eow
Sep 20 10:08:21 salix sshd[14975]: debug2: channel 0: output open -> closed
Sep 20 10:08:21 salix sshd[14975]: debug2: channel 0: send close
Sep 20 10:08:21 salix sshd[14975]: debug2: channel 0: rcvd eof
Sep 20 10:08:21 salix sshd[14975]: debug2: channel 0: rcvd close
Sep 20 10:08:21 salix sshd[14975]: debug1: server_input_channel_open: ctype
session rchan 1 win 262144 max 32768
Sep 20 10:08:21 salix sshd[14975]: debug1: input_session_request
Sep 20 10:08:21 salix sshd[14975]: debug1: channel 1: new [server-session]
Sep 20 10:08:21 salix sshd[14975]: debug1: session_open: channel 1
Sep 20 10:08:21 salix sshd[14975]: error: no more sessions

And this is the snippet with the delay add:

Sep 20 10:11:26 salix sshd[16409]: debug1: session_input_channel_req:
session 0 req exec
Sep 20 10:11:26 salix sshd[16409]: debug2: fd 3 setting TCP_NODELAY
Sep 20 10:11:26 salix sshd[16409]: debug2: fd 10 setting O_NONBLOCK
Sep 20 10:11:26 salix sshd[16409]: debug2: fd 9 setting O_NONBLOCK
Sep 20 10:11:26 salix sshd[16409]: debug2: fd 12 setting O_NONBLOCK
Sep 20 10:11:26 salix sshd[16409]: debug2: channel 0: read<=0 rfd 10 len 0
Sep 20 10:11:26 salix sshd[16409]: debug2: channel 0: read failed
Sep 20 10:11:26 salix sshd[16409]: debug2: channel 0: close_read
Sep 20 10:11:26 salix sshd[16409]: debug2: channel 0: input open -> drain
Sep 20 10:11:26 salix sshd[16409]: debug2: channel 0: read 0 from efd 12
Sep 20 10:11:26 salix sshd[16409]: debug2: channel 0: closing read-efd 12
Sep 20 10:11:26 salix sshd[16409]: debug2: channel 0: ibuf empty
Sep 20 10:11:26 salix sshd[16409]: debug2: channel 0: send eof
Sep 20 10:11:26 salix sshd[16409]: debug2: channel 0: input drain -> closed
Sep 20 10:11:26 salix sshd[16409]: debug2: notify_done: reading
Sep 20 10:11:26 salix sshd[16409]: debug1: Received SIGCHLD.
Sep 20 10:11:26 salix sshd[16409]: debug1: session_by_pid: pid 16410
Sep 20 10:11:26 salix sshd[16409]: debug1: session_exit_message: session 0
channel 0 pid 16410
Sep 20 10:11:26 salix sshd[16409]: debug2: channel 0: request exit-status
confirm 0
Sep 20 10:11:26 salix sshd[16409]: debug1: session_exit_message: release
channel 0
Sep 20 10:11:26 salix sshd[16409]: debug2: channel 0: write failed
Sep 20 10:11:26 salix sshd[16409]: debug2: channel 0: close_write
Sep 20 10:11:26 salix sshd[16409]: debug2: channel 0: send eow
Sep 20 10:11:26 salix sshd[16409]: debug2: channel 0: output open -> closed
Sep 20 10:11:26 salix sshd[16409]: debug2: channel 0: send close
Sep 20 10:11:26 salix sshd[16409]: debug2: channel 0: rcvd eof
Sep 20 10:11:26 salix sshd[16409]: debug2: channel 0: rcvd close
Sep 20 10:11:26 salix sshd[16409]: debug2: channel 0: is dead
Sep 20 10:11:26 salix sshd[16409]: debug2: channel 0: gc: notify user
Sep 20 10:11:26 salix sshd[16409]: debug1: session_by_channel: session 0
channel 0
Sep 20 10:11:26 salix sshd[16409]: debug1: session_close_by_channel:
channel 0 child 0
Sep 20 10:11:26 salix sshd[16409]: debug1: session_close: session 0 pid 0
Sep 20 10:11:26 salix sshd[16409]: debug2: channel 0: gc: user detached
Sep 20 10:11:26 salix sshd[16409]: debug2: channel 0: is dead
Sep 20 10:11:26 salix sshd[16409]: debug2: channel 0: garbage collecting
Sep 20 10:11:26 salix sshd[16409]: debug1: channel 0: free: server-session,
nchannels 1
Sep 20 10:11:27 salix sshd[16409]: debug1: server_input_channel_open: ctype
session rchan 1 win 262144 max 32768
Sep 20 10:11:27 salix sshd[16409]: debug1: input_session_request
Sep 20 10:11:27 salix sshd[16409]: debug1: channel 0: new [server-session]

It looks to me like the GC step needs to complete before another channel
can be requested. I am not familiar with the ssh protocol but should
libssh2 be waiting for this step to complete before
libssh2_channel_wait_closed()
returns?

Many Thanks,

Chris

On Thu, Sep 20, 2012 at 10:06 AM, Chris Harris <chris.harris_at_kitware.com>wrote:

> Maxime & Daniel,
>
> Thanks very much for the quick response. I tried the patch, but sadly it
> has no effect in this situation. Let me know if there is more information I
> can provide to help. If I get some time today I will start taking a look at
> the code myself as well.
>
> Regards,
>
> Chris
>
>
> On Thu, Sep 20, 2012 at 4:20 AM, Daniel Stenberg <daniel_at_haxx.se> wrote:
>
>> On Wed, 19 Sep 2012, Maxime Larocque wrote:
>>
>> I had a similar trouble with libssh2 1.4.2, and I submitted a patch to
>>> correct this:
>>>
>>> http://trac.libssh2.org/**ticket/245<http://trac.libssh2.org/ticket/245>
>>>
>>> Under some conditions, libssh2 did not close the channel correctly. I am
>>> not sure if it applies to your case...
>>>
>>
>> Thanks for pointing out this potential fix! Chris, let us know if it
>> changes anything for you. I gave libssh2_channel_wait_closed() a quick
>> glance yesterday and I couldn't spot anything really wrong standing out so
>> it will take some proper digging to debug this more.
>>
>> I figure it goes without saying, but I'll state it anyway:
>>
>> We are only a few people actually involved in this project. Those of us
>> who have push rights to git will HUGELY appreciate your feedback, comments
>> and additional tests of patches and bug reports as provided. That helps us
>> merge things into git and future releases with more confidence and speed.
>>
>> --
>>
>> / daniel.haxx.se
>> ______________________________**_________________
>> libssh2-devel http://cool.haxx.se/cgi-bin/**
>> mailman/listinfo/libssh2-devel<http://cool.haxx.se/cgi-bin/mailman/listinfo/libssh2-devel>
>>
>
>

_______________________________________________
libssh2-devel http://cool.haxx.se/cgi-bin/mailman/listinfo/libssh2-devel
Received on 2012-10-01