mailing lists, backup/recovery/disasters/failures, free/volunteer, ... [was: Re: Mailing lists using usenet newsgroup]
Michael Paoli
Michael.Paoli at cal.berkeley.edu
Tue Jan 6 20:43:14 PST 2015
Rick/LEdWorldwide!, ...
LEdWorldwide! (and others/if/as/where applicable),
Thanks much on the interest/enthusiasm, much appreciated, and that
generally helps contribute to a fine start. However, I may suggest
channelling that interest/enthusiasm where it might be more/most
useful/beneficial ... both to yourself, and others. Moving or
restarting a very long and well established list (and archive, etc.)
onto something somewhere else, and doing it rather to quite well,
isn't trivial. It's not exactly rocket science, but there are a whole
lot of things that can go wrong, or be done quite a bit less than
optimally. Doesn't mean there aren't quite appropriate
learning opportunities for, e.g. Linux(/Open Source/Unix) systems
administration and email and lists and related, it's just that starting
with a long established list with much history, users/subscribers,
archive, etc., has many more (potential) pitfalls/hazards than starting
something completely new, unrelated, independent, and with a much smaller
set of subscribers. Just one random example I'll mention - BALUG. In
BALUG's considerably earlier history, there was a period of a few years
or less, where BALUG's list went through very nasty rough changes, most
notably having most or all of these characteristics: (very) different
(user facing) software interface, complete loss of archive contents,
loss of and complete restart of most or all of the subscriber list and
user settings/passwords/preferences, significant to major changes in list
URLs, email addresses, archive URLs, etc. Anyway, each such transition
generally left a rather to quite bad taste in users' mouths (though that
may have been partially offset by whatever problems were trying to be
solved by each such move/transition), but especially doing such rough
transitions, not only when they can and ought mostly be avoided to the
extent feasible, but also doing multiple such transitions in relatively
short period of time (like when users still remember and highly didn't
like the last similar set of such changes), well, it can among other
things, cause one to lose a whole lot of members/subscribers - many of
which won't come back or may take a very long time to come back - if
they ever come back. And it generally sets a bad impression on folks,
and tends to also secondarily spread/cause a bad reputation. And
that's mostly not even touching upon many of the also important
technical details. (Reputation/taste ... sort'a like when folks at a
meeting are screaming at each other and throwing chairs at each other -
even if one didn't personally witness it, one tends to hear about such
things - and not necessarily exceedingly accurately, but tends to set a
relatively negative impression regardless). So, ... at least for a
well established list/archives, with rather to quite significant number
of folks and rather to quite well established history, I'd generally
advise it be handled as smoothly and professionally as feasible.
Doesn't mean things will necessarily always be perfect (especially if
one is mostly or entirely relying upon donated
equipment/services/time/resources), but as feasible, generally best to
avoid unnecessarily disruptive rough transitions. E.g., for BALUG, at
least so long as I've had the access to deal with it, the lists have
been run very smoothly and consistently (and Rick Moen has also quite
assisted on that too!) ... notwithstanding a hosting provider that's
seriously screwed up the lists now on multiple occasions within the
last two years time span (and hence we're increasingly motivated to
abandon that provider!). But other than those, uh, "glitches" - and
most notably last couple years, the BALUG lists have been run highly
smoothly since ... well, if I'm not mistaken from my notes/records,
covering the span of or right around 2001-06-15--2013-07-12, without so
much as a significant glitch or hiccup in that time range for those
lists. Anyway, if you want to volunteer and, e.g., start putting your
LPI certification to use, etc., you can certainly not only talk to
myself and Jim Stockford about such (and we can point you at things to
do), but also, one of the many things you can do to practice and hone
such skills, is watch and participate in forums where folks come for
help regarding Linux(/Unix/Open Source) systems administration and more
general questions/issues/problems related to those operating systems
(and those forums can be Usenet, mail lists, IRC, etc. - but a not
archived not publicly accessible email list distribution doesn't count
so much towards that). Anyway, can learn a lot in such forums - read
the issues, read the responses and follow-through - what did/didn't
work and why, and the various advantages/disadvantages, consequences
and considerations, etc. - and see too, how well those did/didn't match
to what you would've done or guessed to be most useful approach. And
as you feel it appropriate, certainly assist in contributing - making
suggestions, providing information, etc. - and well learn from the
feedback on it - how well it did/didn't work, why, what you might've
missed or may have been better or more efficient way to do it, etc.
And most especially to Rick, THANK YOU VERY MUCH! - for all the years -
DECADES! of volunteer work you've done helping Linux and the greater
Open Source community in general! It's all very highly appreciated.
And yes Rick, you very well pointed out before, for the SF-LUG list
you've hosted, how the archives, including in mbox format, were
made publicly accessible to *anyone* to be able to download and save
that data! You probably pointed it out on multiple occasions before,
and I'm sure you likewise well made the points before about subscriber
list and its accessibility and the importance of backing it up, etc.
As matter of fact, you covered it so well and publicly, that even The
Internet Archive picked it up (but alas, not a very recent copy).
(And mostly to everyone ...)
And *nobody* should be knocking something they get or have gotten for
free! Whole helluva lot 'o folks put a lot of time/resource into
providing/assisting, and if/whenever they're not able to, or it's not
high enough on their priorities to get attended to or quickly attended
to, one really ought not complain *at all*! :-) So, ... that free
service you're no longer getting? Uhm, ... what were the terms in the
Service Level Agreement, and how much per
hour/day/month/year/KiB/MiB/GiB/... were you paying for that service?
Uh huh, ... yeah, don't complain about what you get or got for free
that you might not currently be getting like, or just like, you got
before.
And yes, points well taken regarding backups! :-) And I do also
provide some fair amount of assistance on SF-LUG (most notably much of
the typical day-to-day systems administration/maintenance of the VM that
sf-lug.org/sf-lug.com runs upon) - and I even do quite regular
(approximately monthly) remote offsite backup of that host (and the
physical host upon which it runs, and one other VM under that physical
host). But alas, in many ways I'm "only" like secondary or tertiary
systems administrator for SF-LUG - I don't own the hardware, I'm not in
charge of its colo arrangement, I don't own/control or have access to
alter its registrar data or contact information or ownership (the "keys
to the kingdom" for sf-lug.org. and sf-lug.com.), but mostly only handle
some fair bits "downstream" of that where some to much of that access
(and/or assistance on it) has been delegated/extended to me. Anyway,
"fully backing up everything of SF-LUG interest" does and would cover
more than just that VM, and I've mostly not covered that (not to make
excuses, but), I'm more heavily involved in BALUG (SF-LUG meeting
location is not convenient to me or where I do/have worked, hence
relatively rare I make it to an SF-LUG meeting), the list mbox and
subscriber information as Rick has pointed out on multiple occasions,
is there available for *anyone* to back up(!), and though I do have
quite a vested interest in the (relatively) smooth continued operation
of BALUG and its services, and would like to also see such with SF-LUG,
I don't have nearly as large a vested interest in SF-LUG ... just
sayin'. Though too :-) I do also quite back up what's important to
*me* with SF-LUG - most notably there's fair bit "invested" in the (VM)
host configuration/operation - so I quite back that up, as recreating
most or all of that would be quite a pain and time/resource sink -
whereas with it quite fully backed up if/when there might ever be need
or reason to replace or get that going again more-or-less as it was,
it's one helluva lot easier to pick up from the point of having very
good backups of that, as compared to totally starting from scratch ...
so, ... I back it up :-) - well, at least quite cover the VM on that.
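
For the list mbox in particular, the periodic fetch Rick describes in
the quoted message (he suggests 'wget -c' from a cron job) might look
roughly like the following in Python. This is only a minimal sketch: the
URL is the one from Rick's message, but the function name, the
destination path and the "refuse a smaller file" check are assumptions
made for illustration, and unlike 'wget -c' it re-downloads the whole
file rather than resuming.

```python
import shutil
import tempfile
import urllib.request
from pathlib import Path

# URL as given in Rick's message; any destination path is hypothetical.
MBOX_URL = "http://linuxmafia.com/pipermail/sf-lug.mbox/sf-lug.mbox"


def fetch_mbox(url, dest):
    """Download url to dest and return the number of bytes saved.

    The download goes to a temporary file first, so a failed or partial
    transfer never overwrites the existing backup copy.
    """
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url) as response, \
            tempfile.NamedTemporaryFile(dir=dest.parent, delete=False) as tmp:
        shutil.copyfileobj(response, tmp)
    tmp_path = Path(tmp.name)
    new_size = tmp_path.stat().st_size
    # A cumulative mbox should only ever grow; a smaller file suggests a
    # truncated download or a problem upstream, so keep the old copy.
    if dest.exists() and new_size < dest.stat().st_size:
        tmp_path.unlink()
        raise ValueError("new copy is smaller than the existing backup")
    tmp_path.replace(dest)
    return new_size
```

Something like fetch_mbox(MBOX_URL, Path.home() / "backups" / "sf-lug.mbox")
run daily or weekly from cron would cover the archive side; it does not
cover the subscriber roster.
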
And in the case of BALUG, on the "keys to the kingdom", not only is
most everything (but alas, not quite all I'd like to see backed up as
fully and completely as it could or ought) backed up and backed up
offsite on a rather regular (usually about monthly) basis, but there
are also backups on the "keys to the kingdom" - e.g. if for any reason
I weren't able to continue to take care of the balug.org domain and
registrar stuff, there are additional person(s) that have access to
that and all of BALUG's data and such - at least to the extent that
*anyone* has such access (with the caveat of some annoying
impediments/limitations/hurdles around a certain hosting provider).
Anyway, don't know that anyone *would* pick up the pieces on BALUG if
for any reason I weren't able to do it, but there are additional
person(s) that at least *could*, as they have the relevant access to
the data and control of the domain. I don't know that SF-LUG is quite
so fully covered in that regard (though many of the pieces are rather
to quite well covered) - but the situation may be the same with many
other user groups and their domains and the like (got backup? Don't
forget backup *person*(s) on critical person(s)/access issues). This
is also not unique to User Groups. I occasionally remind managers, and
especially for Business Continuity / Disaster Recovery exercises and
planning. I typically tell them to plan for up to about 1/3 of staff
being unavailable for several days or more, and up to 10% of staff
never being available again (and adjust those numbers somewhat if the
group/organization is more, or less, geographically distributed). E.g.
many companies learned an exceedingly hard lesson after 2001-09-11 -
having all or nearly all critical persons to a team in the same place
at the same time can be a rather to quite bad idea - especially also if
it's the case much of the time. (Many high-value organizations have
policies to prevent all key executives from being in the same place at
the same time, many also have such policies that extend to much of
critical IT staff - e.g. must always have at least minimal skeleton
crew not at the same location as all the rest of the IT staff that
covers that same functionality in the organization.) And having your
offsite backups only in the adjacent tower is *not* far enough away
for "regional disasters" (generally recommended minimum distance is >10
miles, and much better for it to be hundred(s) of miles or more away).
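
The "backup *person*(s)" point can be checked rather mechanically if one
writes down who holds which access. A minimal sketch, assuming a simple
mapping of resource to the set of people with access, and of person to
location; the function name, the resource names and the people in the
usage note are all hypothetical, and the thresholds (at least two
people, at least two sites) are just one reasonable choice.

```python
def coverage_gaps(holders, site_of, minimum_people=2):
    """Return {resource: [problems]} for resources that are poorly covered.

    holders maps each resource to the set of people with access to it.
    site_of maps each person to the location where they usually are.
    """
    gaps = {}
    for resource, people in holders.items():
        problems = []
        # Only one person (or nobody) holds the "keys" for this resource.
        if len(people) < minimum_people:
            problems.append("too few people")
        # Everyone with access is normally in the same place.
        if len({site_of[person] for person in people}) < 2:
            problems.append("single site")
        if problems:
            gaps[resource] = problems
    return gaps
```

E.g. with holders = {"registrar": {"alice"}, "dns": {"alice", "bob"}}
and alice and bob both at the same site, both resources get flagged:
the registrar for having a single person, and both for having everyone
in one place.
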
Also, only having one offsite backup/failover location is often not
"good enough". Losing both sites, or losing both temporarily, is a low
probability event, but not an exceedingly low probability one. E.g. I
know of a case some decade(s)+ ago, where a large
distributed company had rather well distributed call centers (separate
states hundreds of miles apart). But for one of their rather/quite
important functions (albeit not super critical), they had "only" two
call centers. Well, instability in Western US power grid happened -
both call centers were simultaneously without power for hour(s) or more
at a time - which left them completely without operational call center
for many of their important functions while that was going on. And
remember too, for offsite backups, not only are multiple distributed
offsite locations preferable, but expect some percentage of offsite
backups to fail. E.g. got a bunch of tapes, or disks, or optical media
or whatever? Expect some moderate percentage of them to fail (can't
read, or get lost or destroyed or damaged). And remember to test -
periodically test the data restoration, procedures, etc. E.g. one of
the scenarios I often like to tell managers to use, is come up with a
fair array of various plausible and reasonably probable
disaster/failure scenarios - and including in that certain percentages
of staff temporarily and also permanently unavailable - then randomly
select one or more of those scenarios (at least one at a time), and test
them - well, run through the exercise. And another point to keep in
mind is bottlenecks and single points of failure - notably those one
might easily overlook. E.g. use one vendor to handle all your offsite
backups? Lots of other companies use them too? And how fast do you
think you'll get your backup media back when transportation is
significantly impaired and a whole lot of that company's
customers/clients all want their backup media delivered to them all
right around the same time? Likewise for vendors that provide the
"sure, we can set up all that hardware for you quickly to be available
for you to use in our data center" - that model may work if the impact
is to a small number of companies, but if it's to a large number, and
the vendor doesn't have all the equipment to simultaneously cover all
those customers/clients at the same time for their equipment,
heating/cooling/power and bandwidth needs and all simultaneously ...
well, would be a severe case of musical chairs, and some
clients/customers would not get the hardware and resources they were
expecting and counting upon.
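
The "randomly select a scenario" exercise described above is easy to
script. A minimal sketch, assuming a list of scenario descriptions and a
list of staff names; the function name and the returned keys are
hypothetical. It uses the upper bounds mentioned above (about 1/3 of
staff temporarily unavailable, 10% permanently) and treats those two
groups as disjoint, which is an assumption; adjust to taste.

```python
import random


def pick_drill(scenarios, staff, rng, temp_fraction=1 / 3, perm_fraction=0.10):
    """Pick one scenario and decide who is unavailable for the exercise."""
    scenario = rng.choice(scenarios)
    n_temp = round(len(staff) * temp_fraction)
    n_perm = round(len(staff) * perm_fraction)
    # Draw everyone who is out in one go, so the two groups never overlap.
    unavailable = rng.sample(staff, n_temp + n_perm)
    return {
        "scenario": scenario,
        "permanently_unavailable": unavailable[:n_perm],
        "temporarily_unavailable": unavailable[n_perm:],
        "available": [person for person in staff if person not in unavailable],
    }
```

Calling pick_drill(scenarios, staff, random.Random()) gives a different
exercise each time; passing a seeded random.Random makes a given drill
reproducible for the write-up afterwards.
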
So ... don't complain about what you get/got for free and may no longer
get - be thankful and appreciative that you get/got it. And don't
forget backups/redundancy - and that's a lot more than just data (oh,
well connected to three major ISPs for redundancy? And what are the
physical routes of those connections? Oh, all of them go over the same
bridge / along the same railroad right-of-way / use the same
satellite?) E.g. random example I remember from past - satellite
failed, that took out several FM broadcast stations (due to uplink feed
dependencies), no problem, page the relevant folks to hop on addressing
that and rerouting as quickly as feasible ... uhm, except those pagers
were dependent upon same satellite ... oops. Some recommended related
reading:
http://www.csl.sri.com/users/neumann/neumann-book.html
http://www.csl.sri.com/users/neumann/illustrative.html
http://catless.ncl.ac.uk/Risks/
(actually, I *highly* recommend at *least* that book, if not the first
two referenced items, not only to all engineers and programmers, but
those who manage them and run such companies/organizations, and also
computer/electronics/electrical/networking technicians, those in most
any other technical field (there are often overlaps and parallels, and
computers are so pervasive in their influence and extent, most all
significantly touched by them that might well appreciate and understand
their impact and reach and also failings and unintended consequences)
and also to those so aspiring to be, probably also ought read these).
> From: "LEdWorldwide!>" <ledworldwide.solutions at gmail.com>
> Subject: Re: Mailing lists using usenet newsgroup
> Date: Tue, 06 Jan 2015 16:21:10 -0800
> All of this talk of mailing lists sounds like too much fun to pass
> up the opportunity to be a part of (and it gives me a chance to put
> my LPI certification to good use).
>
> I'm enthusiastic about taking on the project and can get the data
> from Rick. I also like the idea of using Mailman and I'd like to
> consider Digital Ocean to host a VPS for us ($5-$10 per month).
>
> I'm certainly open to other ideas/suggestions but I do know that as
> a group we can pool our resources and knowledge together and create
> some great projects.
>
>
> Cheers,
>
> -Michael Rojas-
>
>
> Rick Moen <rick at deirdre.net> wrote:
>
>> On Tue, Jan 6, 2015 at 2:40 PM, jim <jim at well.com> wrote:
>>> I'm for MailMan. Either we wait until Rick gets to
>>> it or we set up our own (after we get the backups
>>> from Rick).
>>
>> I'm pretty sure I've sent this information to you-plural in the past,
>> but I'll do it now just in case: This is intended to help your
>> collective memory and improve your process going forward.
>>
>> 1. GNU Mailman can be trivially configured (by the site admin) to
>> permit public download of the cumulative mbox of any hosted Mailman
>> mailing list. That is the set of all past postings to date, from
>> which the archives can then be recreated anywhere desired, with a
>> single command (using /var/lib/mailman/bin/arch). Thus, it is 95% of
>> the important data comprising the mailing list's 'state'. In the case
>> of SF-LUG's mailing list on linuxmafia.com, the relevant URL was (and
>> will again be) http://linuxmafia.com/pipermail/sf-lug.mbox/sf-lug.mbox
>> . Not all Mailman instances are configured to enable that function,
>> but all of the ones that I administer are.
>>
>> If I never brought that matter to Jim's attention (and I'm pretty sure
>> I did), I'm doing so now. Yr. welcome.
>>
>> It is common sense for you to periodically back up that file, and
>> doing so is greatly in your interest. Shortly before I started
>> hosting a mailing list for SF-LUG, you guys completely lost all back
>> traffic to a previous iteration of the SF-LUG mailing list somewhere
>> else -- to something like a failed hard drive or such -- and at the
>> time I wondered why you never bothered to back up the mbox, given your
>> then-recent somewhat ignominious loss. E.g., anyone whatsoever could
>> do that task daily or weekly using a simple 'wget -c' fetch in a cron
>> job.
>>
>> 2. The other indispensable part of a Mailman mailing list's state is
>> the subscriber roster. The mailing list admin can arrange to have
>> that information mailed to interested parties periodically using the
>> /var/lib/mailman/bin/list_members utility. We might call that 4% of
>> the mailing list's 'state'. The other 1% would be things like any
>> unusual mailing list settings, individual subscriber's subscription
>> passwords, and the like, none of which is a huge loss if you happen to
>> lose it.
>>
>> In the case of my server, all mailing list information is present on
>> both my backups and on the live hard drives of the (down) server. If
>> you-all decide you wish to get the current data, you can visit my
>> house and bring a Linux machine able to read an ext3 filesystem from a
>> USB device. I can easily give you the cumulative mbox file during
>> your visit. For the roster, I can give you Mailman's stored copy,
>> which is in Python's 'pickle' stored-data format. You would need to
>> figure out how to extract what you need from that.
>>
>> You have my cellular number.
>>
>> Might I suggest that you guys start showing some basic initiative
>> towards self-preservation? If you had bothered to do that, you would
>> not have ignominiously lost your entire previous hard drive, and you
>> would not be needing to ask me for 'backups from Rick'.
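
Since the cumulative mbox Rick describes is most of the list's state,
and since backups ought to be tested, it is worth checking now and then
that a saved copy still parses. A minimal sketch using Python's standard
mailbox module; the function name is hypothetical, and a message count
plus the last Date header is only a rough sanity check, not a full
restore test (that would mean actually rebuilding the archive from the
mbox).

```python
import mailbox


def summarize_mbox(path):
    """Return (message_count, date_header_of_last_message) for an mbox file."""
    # create=False: fail loudly if the backup file is missing, rather than
    # silently creating an empty mailbox in its place.
    box = mailbox.mbox(path, create=False)
    try:
        count = 0
        last_date = None
        for message in box:
            count += 1
            last_date = message["Date"]
        return count, last_date
    finally:
        box.close()
```

If the count ever drops between runs, or the last date stops advancing
while the list is active, the backup deserves a closer look.
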