[sf-lug] Another victory...
jim stockford
jim at well.com
Mon May 15 21:35:10 PDT 2006
holymoly! (gulp!) I forgot about /tmp, /temp, and /var/log,
which should be mount points for their own partitions.
we at sf-lug had nothing to do with the linux machines
at javacat (RIP it seems).
what a bunch of good advice below! more thanks.
On May 15, 2006, at 2:52 PM, Rick Moen wrote:
> [A friend commented off-list about the Linux machines apparently now in
> a bad state at Javacat. Since I'd rather post publicly, I'm omitting
> quoting the friend's private remarks.]
>
> Sounds like we don't know who did what, really. The fellow who
> installed Linux on the new machines might have messed up the
> installation, or someone else might have screwed around with them
> later.
> I hope Jim does a bit of investigation before blowing away and
> re-loading.
>
> Linux machines sitting out in [semi-]public, for general use, are
> inherently under _some_ ongoing threat of people monkeying with their
> boot configuration, or cracking root and damaging things, etc. The key
> to their doing well is to expect this and plan for dealing with it.
> This is ideally iterative, i.e., you deploy based on your best guess
> about what will work, then closely observe non-expert users' problems
> and take corrective measures to fix them.
>
> E.g., I can remember a couple of changes made to the CoffeeNet machines
> after the first couple of weeks:
>
> 1. Richard had anticipated that some people would Ctrl-Alt-F1 and
> Ctrl-Alt-Del in order to trigger pointless reboots, and had accordingly
> edited the "ca:" line in /etc/inittab to make Ctrl-Alt-Del map to
> something harmless and inert, instead of "/sbin/shutdown -t1 -a -r now".
> This turned out to be a mistake: Those people who were determined to
> initiate pointless
> reboots, when foiled in the above fashion, instead just yanked and
> re-plugged the system power cord -- which was much more perilous to
> system health. So, it turned out to be much smarter to let them do
> their dumb, pointless reboots via Ctrl-Alt-Del, which at least ensured
> an orderly shutdown and umount. (This is before the days of journaling
> filesystems.)
>
> 2. We found a few workstations apparently in a hung state, from which
> the customer had walked away (sometimes saying the system had
> "crashed", sometimes not), that turned out upon examination to have
> eight or ten Netscape Navigator instances crammed into memory at once.
> This was a little puzzling, until we observed the syndrome: Impatient
> customer pressed the "N" (Netscape browser) button on the tkGoodStuff
> button bar. Not getting instant browser pop-up, and not bothering to
> notice the
> disk-activity light, he/she pressed it again a second later. Then
> again. Then again. Then again. The script invoked by tkGoodStuff to
> launch the browser wasn't smart enough to detect existing instances
> under that same EUID, so it kept spawning more as requested. Thirty
> seconds later, a few browser windows started opening, with machine
> performance slowing to a crawl as it swapped itself nearly to death.
>
> The cure was, of course, to insert a few lines of logic to check for
> running browser instances and exit without launching another if any
> were found. This is
> probably now an obsolete problem, since I believe that default browser
> wrapper scripts now routinely include such checks, but the moral is:
> Unsophisticated users will find and trigger error modes you hadn't even
> realised were possible. Only observation and corrective action will
> find and defuse those pitfalls.
>
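A minimal sketch of the kind of check described above, written in Python rather than whatever language the original tkGoodStuff launch script used. The binary name "netscape", the function names, and the reliance on Linux's /proc (including /proc/PID/comm, which only newer kernels provide) are all assumptions made for illustration; this is not the CoffeeNet script.

```python
import os
import subprocess


def find_instances(processes, uid, name):
    """Return PIDs of processes called `name` that belong to `uid`.

    `processes` is an iterable of (pid, uid, command_name) tuples.
    """
    return [pid for pid, owner, comm in processes if owner == uid and comm == name]


def list_processes(proc_root="/proc"):
    """Yield (pid, uid, command_name) for each process visible under /proc."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        path = os.path.join(proc_root, entry)
        try:
            owner = os.stat(path).st_uid
            with open(os.path.join(path, "comm")) as handle:
                comm = handle.read().strip()
        except OSError:
            continue  # the process exited while we were looking
        yield int(entry), owner, comm


def launch_once(browser="netscape"):
    """Start the browser unless this user already has one running."""
    if find_instances(list_processes(), os.geteuid(), browser):
        return False  # an instance already exists; do nothing
    subprocess.Popen([browser])
    return True
```

A button-bar entry would call launch_once() instead of the browser itself. Two clicks arriving at nearly the same instant could still both pass the check, so a lock file would be a sturdier variant of the same idea.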
> Richard also had to research and test some fairly arcane, poorly
> documented information about XDM scripts that can be automatically run
> at the beginning of login, and at logout (GiveConsole, TakeConsole),
> for system maintenance purposes. These could be caused to run as (and
> be owned by) root, so that the users couldn't fool with them. For
> example, /tmp had to be cleared out upon logout, otherwise people would
> leave vast amounts of junk there.
>
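The /tmp cleanup at logout might look roughly like the following. This is a sketch under stated assumptions: the function name is hypothetical, the real hooks were presumably shell scripts run by XDM, and how the departing user's UID reaches the script is left open.

```python
import os
import shutil


def clear_user_files(directory, uid):
    """Delete top-level entries in `directory` owned by `uid`.

    Returns the sorted names of the entries that were removed.
    """
    with os.scandir(directory) as scan:
        entries = list(scan)  # snapshot first, then delete
    removed = []
    for entry in entries:
        try:
            if entry.stat(follow_symlinks=False).st_uid != uid:
                continue  # belongs to someone else; leave it alone
            if entry.is_dir(follow_symlinks=False):
                shutil.rmtree(entry.path)
            else:
                os.unlink(entry.path)  # plain files and symlinks
        except OSError:
            continue  # vanished or not removable; skip it
        removed.append(entry.name)
    return sorted(removed)
```

Run as root from a logout hook, something like clear_user_files("/tmp", uid) would remove only what that user left behind, without touching files owned by other accounts or by the system.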
>
> The most important decision Richard made was to locate all the
> most-crucial files (mail spools, user home directories, authentication
> information) on the protected NFS server upstairs in his apartment,
> leaving only (very replaceable) generic distro binaries and libraries
> on the workstation boxes down in the cafe. Anyone cracking root on, or
> otherwise abusing, the workstation boxes would gain nothing useful:
> Because the NFS exports used the "root_squash" flag, gaining root on
> the downstairs machines got you _less_ access on the upstairs server
> than you had before. Any workstation suspected of being fooled with in
> that fashion would, however, be (quickly, easily) reimaged to avert
> mischief.
>
> Other needs for light scripting also arose, but I can't remember
> details. E.g., every user had a specific disk quota imposed on him/her
> programmatically, and it was necessary to run reports on which users
> had hit quota -- because inevitably there were people who, through
> mailing list subscriptions or innumerable other means, had used up all
> their disk space but never figured that out. (Users would come up and
> say "Hey, why is my mail being refused?" and be completely clueless
> about what the "550 User over quota" delivery status notification
> means.)
>
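The quota report could be as simple as the sketch below. The message above says the details are not remembered, so everything here is an assumption: the function name, the per-user dictionaries, and the kilobyte units are illustrative, and gathering real usage figures (for instance from the system's quota tools) is left out.

```python
def over_quota(usage_kb, quota_kb):
    """Return (user, used, limit) rows for users at or above quota.

    Rows are sorted with the largest overrun first. Users with no
    entry in `quota_kb` are treated as having no limit.
    """
    rows = []
    for user, used in usage_kb.items():
        limit = quota_kb.get(user)
        if limit is not None and used >= limit:
            rows.append((user, used, limit))
    return sorted(rows, key=lambda row: row[1] - row[2], reverse=True)
```

Fed with usage and limit tables, over_quota(usage, limits) yields the list of people to contact before they come asking why their mail is being refused.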
>