[sf-lug] Another victory...

Mon May 15 21:35:10 PDT 2006

holymoly! (gulp!) I forgot about /tmp, /temp, and /var/log,
which should be mount points for their own partitions.

we at sf-lug had nothing to do with the linux machines
at javacat (RIP it seems).

what a bunch of good advice below! more thanks.

On May 15, 2006, at 2:52 PM, Rick Moen wrote:

> [A friend commented off-list about the Linux machines apparently now in
> a bad state at Javacat.  Since I'd rather post publicly, I'm omitting
> quoting the friend's private remarks.]
>
> Sounds like we don't know who did what, really.  The fellow who
> installed Linux on the new machines might have messed up the
> installation, or someone else might have screwed around with them
> later.  I hope Jim does a bit of investigation before blowing away and
> re-loading.
>
> Linux machines sitting out in [semi-]public, for general use, are
> inherently under _some_ ongoing threat of people monkeying with their
> boot configuration, or cracking root and damaging things, etc.  The key
> to their doing well is to expect this and plan for dealing with it.
> This is ideally iterative, i.e., you deploy based on your best guess
> about what will work, then closely observe non-expert users' problems
> and take corrective measures to fix them.
>
> E.g., I can remember a couple of changes made to the CoffeeNet machines
> after the first couple of weeks:
>
> 1.  Richard had anticipated that some people would Ctrl-Alt-F1 and
> Ctrl-Alt-Del in order to trigger pointless reboots, and had accordingly
> edited the "ca:" line in /etc/inittab to make Ctrl-Alt-Del map to
> something harmless and inert, instead of "/sbin/shutdown -t1 -a -r now".
> This turned out to be a mistake:  Those people who were determined to
> initiate pointless reboots, when foiled in the above fashion, instead
> just yanked and re-plugged the system power cord -- which was much more
> perilous to system health.  So, it turned out to be much smarter to let
> them do their dumb, pointless reboots via Ctrl-Alt-Del, which at least
> ensured an orderly shutdown and umount.  (This was before the days of
> journaling filesystems.)
>
> 2.  We found a few workstations apparently in a hung state, from which
> the customer had walked away (sometimes saying the system had
> "crashed", sometimes not), that turned out upon examination to have
> eight or ten Netscape Navigator instances crammed into memory at once.
> This was a little puzzling, until we observed the syndrome:  Impatient
> customer pressed the "N" (Netscape browser) button on the tkGoodStuff
> button bar.  Not getting instant browser pop-up, and not bothering to
> notice the disk-activity light, he/she pressed it again a second later.
> Then again.  Then again.  Then again.  The script invoked by
> tkGoodStuff to launch the browser wasn't smart enough to detect
> existing instances under that same EUID, so it kept spawning more as
> requested.  Thirty seconds later, a few browser windows started
> opening, with machine performance slowing to a crawl as it swapped
> itself nearly to death.
>
> The cure was, of course, to insert a few lines of logic to check for
> running browser instances and exit without launching another if any
> were found.  This is probably now an obsolete problem, since I believe
> that default browser wrapper scripts now routinely include such checks,
> but the moral is:  Unsophisticated users will find and trigger error
> modes you hadn't even realised were possible.  Only observation and
> corrective action will find and defuse those pitfalls.
>
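The check Rick describes (look for an existing browser under the same
EUID, and do nothing if one is found) might look roughly like the sketch
below. This is not the CoffeeNet script, which isn't shown anywhere in
this thread; the function names and the /proc-scanning approach are my
own assumptions, and it only works where a Linux-style /proc exists.

```python
import os
import subprocess


def find_instances(name, euid, proc_root="/proc"):
    """Return PIDs of processes called `name` whose effective UID is `euid`."""
    pids = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return pids  # no /proc here (non-Linux): report nothing found
    for entry in entries:
        if not entry.isdigit():
            continue  # only numeric directories are processes
        status_path = os.path.join(proc_root, entry, "status")
        try:
            with open(status_path, errors="replace") as fh:
                fields = dict(line.split(":", 1) for line in fh if ":" in line)
        except OSError:
            continue  # process exited while we were looking
        proc_name = fields.get("Name", "").strip()
        uids = fields.get("Uid", "").split()  # real, effective, saved, fs
        if proc_name == name and len(uids) > 1 and int(uids[1]) == euid:
            pids.append(int(entry))
    return pids


def launch_once(command, name, finder=find_instances):
    """Start `command` unless this user already has a process called `name`."""
    if finder(name, os.geteuid()):
        return None  # already running: ignore the repeated button press
    return subprocess.Popen(command)
```

A button-bar entry would then call something like
launch_once(["netscape"], "netscape"), where "netscape" is only a
placeholder command name. Note that a very fast double-click can still
slip through before the first process shows up; a lock file would close
that gap, but I've left it out to keep this short.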
> Richard also had to research and test some fairly arcane, poorly
> documented information about XDM scripts that can be automatically run
> at the beginning of login, and at logout (GiveConsole, TakeConsole),
> for system maintenance purposes.  These could be caused to run as (and
> be owned by) root, so that the users couldn't fool with them.  For
> example, /tmp had to be cleared out upon logout, otherwise people would
> leave vast amounts of junk there.
>
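For the /tmp cleanup at logout, a root-owned helper called from the
logout-side XDM script could be sketched as below. The function name and
the "keep" parameter are hypothetical, not anything Richard is known to
have used. I added "keep" because blindly emptying /tmp while an X
server is still running would also remove its socket directory
(/tmp/.X11-unix), which you probably don't want.

```python
import os
import shutil


def clear_directory(path, keep=()):
    """Delete everything inside `path` except names listed in `keep`.

    Returns the number of top-level entries actually removed.
    `path` itself is left in place.
    """
    with os.scandir(path) as it:
        entries = list(it)  # snapshot first, then delete
    removed = 0
    for entry in entries:
        if entry.name in keep:
            continue
        try:
            if entry.is_dir(follow_symlinks=False):
                shutil.rmtree(entry.path)  # real directory: remove the tree
            else:
                os.unlink(entry.path)  # files, symlinks, sockets, FIFOs
        except OSError:
            continue  # could not remove it; leave it and move on
        removed += 1
    return removed
```

Symlinks are unlinked rather than followed, so a user can't point a link
in /tmp at some other directory and get it wiped by the root-run script.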
>
> The most important decision Richard made was to locate all the
> most-crucial files (mail spools, user home directories, authentication
> information) on the protected NFS server upstairs in his apartment,
> leaving only (very replaceable) generic distro binaries and libraries
> on the workstation boxes down in the cafe.  Anyone cracking root on, or
> otherwise abusing, the workstation boxes would gain nothing useful:
> Because the NFS exports used the "root_squash" flag, gaining root on
> the downstairs machines got you _less_ access on the upstairs server
> than you had before.  Any workstation suspected of being fooled with in
> that fashion would, however, be (quickly, easily) reimaged to avert
> mischief.
>
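To make the root_squash point concrete, here is a toy model of the UID
mapping only. It is not NFS code and says nothing about how the
CoffeeNet server was actually configured; the function name is made up,
and 65534 is the conventional default anonymous UID described in
exports(5).

```python
ANON_UID = 65534  # conventional "nobody" UID, the usual anonuid default


def squashed_uid(client_uid, root_squash=True, anon_uid=ANON_UID):
    """UID the server acts as for a request arriving with `client_uid`."""
    if root_squash and client_uid == 0:
        return anon_uid  # client-side root is demoted to the anonymous user
    return client_uid  # ordinary users keep their own UID


# An ordinary user (UID 1001 is an arbitrary example) is unchanged,
# while root on the workstation ends up as "nobody" on the server.
ordinary = squashed_uid(1001)
cracked_root = squashed_uid(0)
```

So a request from workstation root arrives at the server as the
anonymous user, which owns none of the files in anybody's home directory
or mail spool.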
> Other needs for light scripting also arose, but I can't remember
> details.  E.g., every user had a specific disk quota imposed on him/her
> programmatically, and it was necessary to run reports on which users
> had hit quota -- because inevitably there were people who, through
> mailing list subscriptions or innumerable other means, had used up all
> their disk space but never figured that out.  (Users would come up and
> say "Hey, why is my mail being refused?" and be completely clueless
> about what the "550 User over quota" delivery status notification
> meant.)
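The "who has hit quota" report could be as simple as the sketch below.
Rick says he can't remember the details of the real scripts, so this is
only my guess at the shape of such a report: the function names, the
KB units, and the sample accounts are all invented, and the usage
figures would in practice come from the system's quota tools rather
than a hand-written dict.

```python
def over_quota(accounts):
    """Return (user, used_kb, limit_kb) for users at or over their limit.

    `accounts` maps user name -> (used_kb, limit_kb).  A limit of 0 is
    treated as "no quota".  Worst offenders come first.
    """
    hits = [
        (user, used, limit)
        for user, (used, limit) in accounts.items()
        if limit > 0 and used >= limit
    ]
    hits.sort(key=lambda h: (-(h[1] - h[2]), h[0]))
    return hits


def format_report(accounts):
    """One line per over-quota user, suitable for mailing to the admin."""
    return "\n".join(
        f"{user}: {used} KB used of {limit} KB"
        for user, used, limit in over_quota(accounts)
    )


# Made-up sample data, purely for illustration.
sample = {"alice": (5000, 5000), "bob": (1200, 5000), "carol": (7300, 5000)}
report = format_report(sample)
```

Run from cron, something like this would let the admin tell users they
were full before they came up asking why their mail was bouncing.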