[sf-lug] failsafe ... and at(1) ...

Michael Paoli Michael.Paoli at cal.berkeley.edu
Mon Nov 8 18:53:18 PST 2021


> From: "Rick Moen" <rick at linuxmafia.com>
> Subject: Re: [sf-lug] correct URL?
> Date: Mon, 8 Nov 2021 17:22:00 -0800

> Them:  "What'd ya do?"

There's also the "failsafe" method ... well, at least for certain
level(s) of failsafe.

E.g. if one has a suitably functional at and atd or the like:
set up an at job to run a bit in the future ... oh, like maybe
10 minutes out.  Have that job check some relevant important
functionality ... oh, like network connectivity.  If the check
fails, the job does what's needed to revert to and activate a
known good configuration.  Quite handy, that; I've used it a fair
number of times when reconfiguring things where solidly breaking
them would not be good ... but having them not function for 10
minutes or so is not a problem.
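
A rough sketch of that kind of failsafe, under stated assumptions:
the address 192.0.2.1 and the script /usr/local/sbin/restore-known-good
are hypothetical placeholders, and failsafe_check and schedule_failsafe
are names made up for illustration.  Substitute whatever check and
revert steps suit the change being made.

```shell
#!/bin/sh
# Sketch only: the address and the restore script path below are
# hypothetical placeholders, not taken from any real setup.

# Run CHECK; if it fails, run REVERT.  Both are shell command strings.
failsafe_check() {
    check=$1
    revert=$2
    if sh -c "$check" >/dev/null 2>&1; then
        echo "check passed; keeping new configuration"
    else
        echo "check failed; reverting"
        sh -c "$revert"
    fi
}

# Queue the failsafe with at(1) before making the risky change.
# Defined but not called here; it needs a working at and atd.
schedule_failsafe() {
    echo 'ping -c 3 192.0.2.1 >/dev/null 2>&1 || /usr/local/sbin/restore-known-good' |
        at now + 10 minutes
}
```

If the change turns out fine, the pending job can be listed with atq
and removed with atrm, or simply left to run, since it only reverts
when the check fails.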

Note, however, the typical atd on Linux can be a bit funky ...
notably it will often not fire off at jobs if it thinks the load
is "too high", and that threshold, at least by default, may be
way too low ... e.g. a load of 2 on your 16 core host isn't
much of anything.  So ... I often bump up the load limit on
atd, relative to, e.g., the number of cores on the host.

Most of the time I want atd to launch the at jobs, unless the load
is already high enough to be a real problem for the host.
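
For illustration, atd takes a -l option that sets its limiting load
factor.  A minimal sketch of scaling that to the core count follows;
the function name suggest_atd_load and the default multiplier of 2
are assumptions chosen for the example, not recommended values, and
how the option gets passed to atd depends on the distribution's
service setup.

```shell
#!/bin/sh
# Sketch only: suggest an atd load limit scaled to the number of cores.
# Usage: suggest_atd_load [CORES] [FACTOR]
# CORES defaults to the online processor count; FACTOR defaults to 2,
# which is an arbitrary assumption for this example.
suggest_atd_load() {
    cores=${1:-$(getconf _NPROCESSORS_ONLN)}
    factor=${2:-2}
    echo $((cores * factor))
}

# Example use (not run here):
#   atd -l "$(suggest_atd_load)"
```

On a 16 core host with the default multiplier, this would give a
limit of 32, so a load of 2 would no longer hold anything back.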



