Subject: [ libssh2-Bugs-2828139 ] libssh2_poll intermittent hang on Solaris (old version)

From: SourceForge.net <noreply_at_sourceforge.net>
Date: Tue, 28 Jul 2009 00:35:33 +0000

Bugs item #2828139, was opened at 2009-07-27 20:35
Message generated for change (Tracker Item Submitted) made by listmail
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=703942&aid=2828139&group_id=125852

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: API
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: listmail (listmail)
Assigned to: Nobody/Anonymous (nobody)
Summary: libssh2_poll intermittent hang on Solaris (old version)

Initial Comment:
I'm still using libssh2-0.18 because I ran into new issues (basically no output) when testing libssh2-1.1 on Solaris 10 with Net::SSH2. My guess is that this could be related to the recent blocking changes in 1.1, since 1.0 worked with the same code.

Anyway, the only problem I've had is that, at random, libssh2_poll will hang indefinitely. It can take a couple of days or more, or just a matter of minutes, before this happens. I call a Perl/Net::SSH2 app at regular intervals to exec some "health check" commands on some servers. I finally tracked the actual hang down to the C recv() call in libssh2_packet_read. The following code from libssh2_poll calls libssh2_packet_read repeatedly until recv eventually hangs. It's as if more data was expected (since libssh2_packet_read's return value was > 0), but recv ended up blocking while waiting for data that never arrived.

(session.c, libssh2_poll, near #ifdef HAVE_POLL section)
                case LIBSSH2_POLLFD_CHANNEL:
                    if (sockets[i].events & POLLIN) {
                        /* Spin session until no data available */
                        while (libssh2_packet_read(fds[i].fd.channel->session) > 0);
                    }

(transport.c, libssh2_packet_read, I put before/after debug code here to catch the hang)
            /* now read a big chunk from the network into the temp buffer */
            nread =
                recv(session->socket_fd, &p->buf[remainbuf],
                     PACKETBUFSIZE - remainbuf,
                     LIBSSH2_SOCKET_RECV_FLAGS(session));

Simply removing the while loop avoids the hang, and my initial tests still show all expected data being returned. I'm not fluent in C, nor do I understand sockets. Should there be an additional poll before each iteration, since recv could still hang even though the return value indicates more packets? I see that you may be deprecating libssh2_poll, but right now I can't get 1.1 to read any output.
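
To make the question concrete, here is a rough sketch of what "poll before each read" could look like on a plain stream socket. This is not libssh2 code and not a proposed patch: socket_readable_now, drain_without_blocking and demo_drain are hypothetical names made up for illustration, and the sketch assumes POSIX poll() and recv(). The idea is only that a zero-timeout poll() is checked before every recv(), so the loop stops when nothing is pending instead of blocking.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of libssh2): returns 1 if fd can be read
 * right now without blocking, 0 if nothing is pending, -1 on poll() error. */
static int socket_readable_now(int fd)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    do {
        rc = poll(&pfd, 1, 0);          /* timeout 0: never wait */
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return -1;
    /* POLLHUP/POLLERR also mean recv() will return immediately. */
    return (rc > 0 && (pfd.revents & (POLLIN | POLLHUP | POLLERR))) ? 1 : 0;
}

/* Hypothetical drain loop: re-check readiness before every recv() instead
 * of trusting the previous read's return value. Returns the total number
 * of bytes read, or -1 on error. */
static long drain_without_blocking(int fd, char *buf, size_t buflen)
{
    long total = 0;
    ssize_t n;
    int ready;

    assert(buf != NULL && buflen > 0);

    for (;;) {
        ready = socket_readable_now(fd);
        if (ready < 0)
            return -1;
        if (ready == 0)
            break;                      /* nothing pending: stop, do not block */

        n = recv(fd, buf, buflen, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break;                      /* peer closed the connection */
        total += (long)n;
    }
    return total;
}

/* Self-check on a local socket pair: send 5 bytes, then drain the other end.
 * The second loop iteration finds nothing pending and returns instead of
 * hanging in recv(). */
static long demo_drain(void)
{
    int sv[2];
    char buf[64];
    long got;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    if (send(sv[0], "hello", 5, 0) != 5) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    got = drain_without_blocking(sv[1], buf, sizeof buf);
    close(sv[0]);
    close(sv[1]);
    return got;
}
```

Calling demo_drain() exercises the loop on a local socket pair. Whether something like this belongs inside libssh2_poll, or whether the socket should simply be non-blocking at that point, is exactly what I'm unsure about.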

----------------------------------------------------------------------

_______________________________________________
libssh2-devel http://cool.haxx.se/cgi-bin/mailman/listinfo/libssh2-devel
Received on 2009-07-28