More Secure Shell troubles

by joe on December 5, 2007

Well, I still haven’t solved the earlier problems (see other posts in this category), but now I’m having a new one. One of the Windows servers we’ve had OpenSSH running on for quite some time suddenly seems to have issues: it stops accepting connections. The message in sshd.log is always some variation of this:

63 [main] sshd 7632 child_copy: linked dll bss write copy failed, 0x207A000..0x207CAA0, done 0, windows pid 8136, Win32 error 998

Stopping the service in order to restart it didn’t help, because the service would then not start at all. Running cygrunsrv -S sshd yielded the mysterious Win32 error 1062 and the service refused to start, with nothing showing up in the event logs. A complete reinstallation of Cygwin fixed the problem, but it returned within one day. Now I find out that this server is short on memory (it’s used for some heavy-duty data processing), so I suspect that the problem is related to that. If you’re researching the same issue, check your available memory. I’ll report more details here as they develop. In the end, I’ll probably write a comprehensive article for publication on Associated Content.

Update: 12/06/2007: Some of our scripting relied on multiple successive ssh connections to a target server. The idea was to keep as much of the scripting logic as possible on our build server and execute remote commands one at a time, each over its own SSH connection. This may have caused a resource bottleneck. I rewrote some of the scripts to do several things in a single connection. I also added retry logic, in case of the “resource unavailable” error. We’ll see how it goes.
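
For anyone trying the same workaround, here is a minimal sketch of the two ideas in Python: batching several remote commands into one ssh invocation, and retrying when the failure looks transient. It is not the actual build scripts, which aren't shown in this post. The function names, the host name, the retry count, the delay, and the "unavailable" marker text used to spot the error are all assumptions made for illustration.

```python
import subprocess
import time


def build_batched_command(host, commands):
    """Return an ssh argv that runs all commands over a single connection."""
    # "&&" stops at the first failing command, so later steps never run
    # on top of a failed earlier step.
    return ["ssh", host, " && ".join(commands)]


def run_with_retry(argv, attempts=3, delay_seconds=5.0, marker="unavailable",
                   runner=subprocess.run, sleep=time.sleep):
    """Run argv, retrying only when stderr contains the marker text.

    The marker is an assumption: adjust it to whatever your ssh client
    actually prints when the server is out of resources.
    """
    result = None
    for attempt in range(1, attempts + 1):
        result = runner(argv, capture_output=True, text=True)
        if result.returncode == 0:
            return result
        transient = marker in (result.stderr or "").lower()
        if not transient or attempt == attempts:
            break  # a real failure, or out of attempts: give up
        sleep(delay_seconds * attempt)  # wait a little longer each time
    return result
```

A call might look like run_with_retry(build_batched_command("targethost", ["cd /build", "./deploy.sh"])), where the host and the two commands are placeholders. Only failures whose stderr contains the marker are retried, so a bad password or a failing build step is reported right away instead of being hammered three times against a server that is already short on memory.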
