[sf-lug] sf-lug.com. box questions, documentation, "rules of the road", policies, etc.
jim at well.com
Wed Apr 25 08:46:32 PDT 2007
i've got a very busy day today, hence a brief reply:
* thanks lots! great note.
* nathan, michael, jim are box janitors; one more from
balug would be good. I, jim, am willing to be first
responder to go down there if need be.
* out-of-band could be on circle (sf-lug.org).
* I'll doc the hardware (I bought the box, what the hey).
On Apr 25, 2007, at 7:39 AM, Michael Paoli wrote:
> sf-lug.com. box questions, documentation, "rules of the road",
> policies, etc.
> Still documenting more, but a few questions along the way ...
> appropriate outage notification and out-of-band status page?
> We earlier discussed outages, planned outages, etc.
> Even have a place to document that further
> Two particular items/questions occurred to me regarding that.
> First of all, for planned outages, *who* do we want to notify,
> and *how* do we want to notify them? Might that also depend on
> circumstances, nature of outage (whole box down, or just some
> important service(s)), duration and timing? Would we want to:
> * do a wall on the system
> * edit /etc/motd and/or /etc/issue
> * e-mail the "pagermonkeys" on the box
> * e-mail the sf-lug list
> * and/or other?
> and do we want to come up with "rules" (/guidelines) on what method(s)
> should be used under what circumstances?
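
One way to pin down such guidelines is to write them as a small decision function. The sketch below is only an illustration: the `Outage` fields, the channel names, and the 15 and 60 minute thresholds are all assumptions made up for the example, not anything agreed on for the sf-lug.com. box.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Outage:
    planned: bool          # scheduled in advance?
    whole_box: bool        # whole box down, or just some service(s)?
    duration_minutes: int  # expected duration


def notification_methods(outage: Outage) -> List[str]:
    """Pick notification channels; every threshold here is hypothetical."""
    methods = []
    if outage.planned:
        # Logged-in users get a wall; later logins see /etc/motd.
        methods += ["wall", "motd"]
    # Whole-box or longer outages also go to the admins by e-mail.
    if outage.whole_box or outage.duration_minutes >= 15:
        methods.append("email-pagermonkeys")
    # Long whole-box outages are also announced on the list.
    if outage.whole_box and outage.duration_minutes >= 60:
        methods.append("email-sf-lug-list")
    return methods
```

A table on the wiki would serve equally well; the point is only that "which method under which circumstances" can be stated as a handful of explicit conditions.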
> Also out-of-band status page? It would be potentially very useful to
> have some out-of-band (independent of that box, and preferably also
> independent of that colo) status/notification page. E.g. it can be
> highly useful to have an independent web page (could just be a wiki web
> page somewhere) that indicates some status information (most notably
> if/when any unexpected outage occurs - to indicate status and
> estimate/guesstimate on return to service, but also a place for folks to
> look during scheduled outages - such as if they didn't know in advance
> about the outage). E.g. rather like:
> out-of-band status page:
> for status of:
> Hardware documentation?
> Although it's possible to use standard LINUX/CentOS tools to get some
> information on the hardware (e.g. CPU, disk sizes, some bits of
> chipset information here and there), could someone document the
> hardware details - e.g. make and model of the system, any particular
> details of hardware/options installed, etc. Having such information
> known (and documented!) could come in rather to quite handy in
> troubleshooting any items that may be hardware related, planning
> certain optimizations and potential upgrades, etc. If someone is
> able to at least provide the basic hardware information, we could
> probably get that up on a wiki page, including hunting down relevant
> reference information (e.g. links to more detailed hardware
> specifications for particular make/model of items identified).
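
As a rough illustration of what the standard tools can already provide, the sketch below summarizes text in the format of /proc/cpuinfo. It assumes the usual x86 Linux layout with `processor` and `model name` fields; the function name is made up for the example. It says nothing about the make and model of the system itself, which still has to be written down by someone who knows the hardware.

```python
def summarize_cpuinfo(text: str) -> dict:
    """Summarize /proc/cpuinfo-style text: logical CPU count and model names."""
    models = []
    count = 0
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between CPU stanzas
        key = key.strip()
        if key == "processor":
            count += 1  # one "processor" line per logical CPU
        elif key == "model name":
            models.append(value.strip())
    return {"logical_cpus": count, "models": sorted(set(models))}
```

On the box itself this could be fed the contents of /proc/cpuinfo, and the result pasted onto the wiki page next to the hand-documented details.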
> More documentation/log stuff ...
> I started two log files on the box - feel free to have a look at
> the /home/admin/log* files (most notably /home/admin/log). The
> general idea there is a human readable
> (and fairly searchable by date, or other criteria) log of system
> changes made, issues/bugs noted/corrected, etc. Most notably the idea here is
> to keep lots of lesser details regarding such off of, and from
> piling up ad nauseam on, wiki pages (could eventually get quite long),
> and it's also often much easier to drop information straight into flat
> file or copy/paste from such, and not have to worry about wiki
> formatting goop and how to get something to render as plain text.
> The wiki pages are probably much more suitable for more general
> documentation (e.g. policies, how-to, etc.) - such as things likely to
> be revised over time (as opposed to continually appended to and not as
> likely to be of more general interest). For a bit more of an idea,
> have a look at /home/admin/log - it's already up to 82 lines - and
> that's just covering a bit of usage/syntax, and noting and dealing with
> a few minor issues. As I noted, such can get quite long (e.g. on my
> home systems, the equivalent file I maintain on each has grown to be
> in excess of 10,000 lines long (not that we have to be *that* detailed
> on the sf-lug.com. box - on my home systems, I log, for example, all
> package additions/removals/upgrades - including package version
> information, bugs and hardware issues/problems encountered, hardware
> changes, etc.; capturing/noting at least more noteworthy changes/issues
> for the sf-lug.com. box would probably be a good thing.)).
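
To make the "flat file, searchable by date" idea concrete, here is a minimal sketch of one possible entry layout. The layout is an assumption for illustration only (an unindented header line with an ISO date and a name, followed by indented free text); it is not the actual format of /home/admin/log, and both function names are hypothetical.

```python
from datetime import date
from typing import List


def format_entry(day: date, who: str, text: str) -> str:
    """Format one entry: a dated header line, then indented free text."""
    body = "\n".join("  " + line for line in text.splitlines())
    return f"{day.isoformat()} {who}\n{body}\n"


def entries_on(log_text: str, day: date) -> List[str]:
    """Return the entries whose header line starts with the given date."""
    entries, current = [], None
    for line in log_text.splitlines():
        if line and not line.startswith(" "):
            # An unindented line starts a new entry.
            current = [line]
            entries.append(current)
        elif current is not None:
            current.append(line)
    prefix = day.isoformat() + " "
    return ["\n".join(e) for e in entries if e[0].startswith(prefix)]
```

With a layout like this, plain grep on the date works just as well; the code only shows that an append-only text file stays easy to search even as it grows.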
> Code of ethics? Should we add to the "rules of the road" / policy
> something indicating an appropriate code of ethics? The more
> experienced systems administrators likely think such would be quite
> applicable anyway, but, most notably for those that may be much newer
> to the field, explicitly noting, or at least referencing such, would help
> call attention to such, introduce such to those not already familiar
> with such, and help develop and foster appropriate professionalism.
> E.g. could add something roughly like:
> Users of the system, and most notably systems administrators and any
> other persons with any type of privileged access to the system, should
> exercise appropriate professionalism and follow appropriate code of
> ethics, e.g. the LOPSA/SAGE/USENIX code of ethics:
> Quoting Michael Paoli:
>> Just a bit of a start (I plan to add more), but I put some of the
>> information on the wiki. Feel free to correct anything that's
>> wrong, improve formatting/presentation, etc.