[sf-lug] server reboot/shutdown & ssh disconnection - or not

Michael Paoli Michael.Paoli at cal.berkeley.edu
Sun Jan 1 09:56:48 PST 2017


> From: "Akkana Peck" <akkana at shallowsky.com>
> Subject: Re: [sf-lug] server reboot
> Date: Sun, 1 Jan 2017 09:10:48 -0700

> Lots of interesting answers, but I'm curious about the original problem.
>
> Usually, when I ssh into a machine and reboot it remotely, it
> disconnects and puts me back in my local shell. I have occasionally
> seen the behavior described in the original post, where the remote
> shell just hangs and ssh doesn't disconnect, and I've always
> wondered what the difference is in those cases, and why ssh isn't
> detecting the disconnect.
>
> When that happens, I type ~. which tells ssh to break the connection
> and exit. So far that's always worked for me.

Sometime within the last few months I read something that asked pretty
much exactly that, and it was answered.  But I don't recall where, and
haven't been able to easily find it again.  It may have been for some
specific distribution or release thereof - perhaps even considered a
"bug" in that distribution/release?  But I don't specifically recall.

Essentially it has to do with how things are shut down, and how rapidly.
If sshd gets SIGTERM reasonably in advance (e.g. 1 second or more) of
the network being torn down - or the init system signals sshd and waits
for it to exit before proceeding - then sshd relatively gracefully tears
down those TCP connections first.  Otherwise, the server end of the
connection effectively disappears without the client ever being told,
and the client generally continues to wait - at least until the
connection is otherwise taken down (e.g. ~. done on the client side, or
one of the keepalive options is in use and the client eventually tears
down the connection for lack of response).
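
For the keepalive case, a minimal sketch follows.  ServerAliveInterval
and ServerAliveCountMax are standard OpenSSH client options; the host
name and the particular values are just assumptions for illustration.
It only prints the command and the approximate worst-case wait, rather
than connecting anywhere:

```shell
# Illustrative values, not recommendations.
interval=15    # seconds of silence before ssh sends a keepalive probe
count=3        # unanswered probes tolerated before ssh gives up
host="user@server.example.com"    # hypothetical host

# Print the command rather than running it.
echo "ssh -o ServerAliveInterval=$interval -o ServerAliveCountMax=$count $host"

# The client drops the connection after roughly interval * count seconds
# without any response from the server.
echo "client gives up after about $((interval * count)) seconds"
```

The same two settings can also go in ~/.ssh/config under a Host entry,
so they apply without extra typing.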

And the "fix"?  In the bit I recall reading, it was a minor adjustment
to the init system's configuration of sshd, such that in the normal
shutdown case sshd would receive the SIGTERM and be waited upon in
sequence (or at least get sufficient lead time) before the network
teardown.
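
The init system in question isn't identified above, so purely as an
assumption, take systemd.  There, units are stopped in the reverse of
their start order, so a service ordered After=network.target is stopped
before the network goes away.  The sketch below writes such a drop-in
to a scratch directory rather than /etc.  The unit name ssh.service and
the file name override.conf are assumptions (some distributions name
the unit sshd.service), and many shipped units may already carry this
ordering, so this only illustrates the mechanism:

```shell
# Scratch location so nothing under /etc is touched; on a real system the
# drop-in would live under /etc/systemd/system/ssh.service.d/ instead.
dropin_dir="${TMPDIR:-/tmp}/ssh.service.d"
mkdir -p "$dropin_dir"

# After=network.target: started after the network is up, and therefore
# stopped before the network is torn down.
cat > "$dropin_dir/override.conf" <<'EOF'
[Unit]
After=network.target
EOF

cat "$dropin_dir/override.conf"
```

On a real system, a systemctl daemon-reload would be needed afterwards
for the change to take effect.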

Also, in general, one should use shutdown - not halt, reboot, haltsys,
or the like.  Those should generally be reserved for cases where one
has good reason to take the system down significantly faster than
shutdown would, and an orderly shutdown isn't the primary concern.  E.g.:
flames are shooting out the back of the power supply; or the system has
been nastily cracked and is doing very bad things, and it's more
important to stop it quickly than to figure out what it's doing, what
on the system is doing it, and how; or it's a test system and one has
reason to check how it deals with faster, less graceful shutdowns.
POSIX may not specifically address shutdown, halt, reboot, haltsys,
etc., and most Linux distributions may be designed to make it more
difficult for the system administrator to accidentally do something
stupid.  Still, in the land of Unix-like operating systems generally,
shutdown is the way to do an orderly shutdown (though syntax details do
vary).  The other commands, not so much - and on some systems, very much
not.  So, if an orderly shutdown is intended, best to be in the habit of
using shutdown - lest one day one do something highly ungraceful when
such wasn't the intent.
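
Since syntax does vary, here's a sketch using the form common on Linux
systems (-r for reboot, a +minutes delay, then a message sent to
logged-in users).  The delay and message are made up for illustration,
and the command is only printed, not run:

```shell
delay_minutes=5
message="Rebooting for maintenance"

# Build the command line and print it instead of executing it.
cmd="shutdown -r +$delay_minutes \"$message\""
echo "$cmd"

# Other common Linux forms, shown as comments only:
#   shutdown -h now    halt or power off immediately, but still orderly
#   shutdown -c        cancel a pending timed shutdown
```

Check the local shutdown man page before relying on any of these on a
non-Linux system.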

Also, when run from one's own ssh login shell, this might possibly
suffice to get one's own ssh client connection to close, but not anyone
else's:
# exec shutdown [...]
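
The relevant bit there is exec: it replaces the shell with the named
command, so once that command finishes there's no shell left in the
session, which is presumably why one's own connection gets closed.  A
harmless demonstration of that behavior, with echo standing in for
shutdown:

```shell
# The child shell is replaced by echo, so the second echo never runs.
out=$(sh -c 'exec echo "shell replaced"; echo "never reached"')
echo "$out"
```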



