[sf-lug] Jim Stockford (and/or others?): Do you have old list emails?

jim jim at well.com
Mon May 27 09:42:25 PDT 2019


what time period?


On 5/27/19 12:19 PM, Michael Paoli wrote:
> Are you an archivist (or chronic hoarder of old emails?  ;-))
> Would love to hear from you.
>
> Most notably, especially for BALUG - and also to lesser extent SF-LUG,
> there are some list postings that have been lost 8-O - in the case of
> BALUG, many years' worth (no thanks to DreamHost, and also some folks
> earlier switching list service/software, and not bothering to save
> the older).  In the case of SF-LUG, I believe it's mostly more like
> moderate handful after some hardware issues on a couple of occasions
> (most of which were restored, but I believe still some were lost).
>
> Anyway, if you've got collection of most or all list emails, and
> especially older ones, I'm quite interested in grabbing the list
> emails from older email collections, so any that may be found there
> that are missing can be restored to the lists/archives.
>
> No, don't need to (human) read all your emails/collections ... I can
> write script/program to extract just the list emails.  But may need to
> have you (or I) initially scan some emails so (and notably for some
> of the different lists and list software or services used at the time)
> I/we can identify unique headers of items sent to the lists.
> Once that's been determined, relatively straight-forward to write
> program that would extract only email messages that were sent out
> by the lists (can also add collecting items you sent to list(s), as,
> depending on software/settings, lists may or may not send posting also
> to poster).  Anyway, that way, can just extract items sent by(/to)
> relevant lists, and don't need a human to be reading other emails in
> email collections.
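
A rough sketch of the header-based extraction described above, using Python's standard mailbox module. It assumes the collection is already in mbox format and that the list software stamped a List-Id header; older list software may have used different identifying headers, which would need to be determined first. The function names and the example list values are hypothetical.

```python
import mailbox


def is_list_message(msg, list_id, list_address=None):
    """Return True if msg looks like it was sent by (or to) the list."""
    # Assumption: the list software added a List-Id header naming the list.
    if list_id.lower() in str(msg.get("List-Id", "")).lower():
        return True
    # Optionally also catch the collector's own postings, since the list
    # may not have sent a copy back to the poster.
    if list_address:
        recipients = " ".join(
            str(value)
            for header in ("To", "Cc")
            for value in msg.get_all(header, [])
        )
        return list_address.lower() in recipients.lower()
    return False


def extract_list_messages(src_path, dest_path, list_id, list_address=None):
    """Copy only list messages from src mbox to dest mbox; return the count."""
    src = mailbox.mbox(src_path, create=False)
    dest = mailbox.mbox(dest_path)
    kept = 0
    try:
        for msg in src:
            if is_list_message(msg, list_id, list_address):
                dest.add(msg)
                kept += 1
    finally:
        dest.close()
        src.close()
    return kept
```

Something like extract_list_messages("all-mail.mbox", "sf-lug-only.mbox", "sf-lug.example.org") would then write only the matching messages, without anyone having to read the rest of the collection.
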
>
> So, Jim Stockford ... let me know how we might arrange this some time.
> I believe you said you save *all* emails, and have 'em going way 
> back.  :-)
> Pile 'o hard drives?  Certainly can be well used - I've got hardware
> which can read most drive (interface) types and the data upon the drives.
> "Of course" mbox format is easiest, but can likely also deal with other
> formats (semi-)easily enough - again, I can write bit 'o code (or find
> such), suitable for reading other formats, convert that to mbox (or
> similar enough), and likewise then use appropriate header match criteria,
> for extracting just the list emails.
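
As one illustration of the "convert that to mbox" step, here is a minimal sketch for the case where the collection happens to be a Maildir, again with Python's standard mailbox module. Other formats would need their own readers; the function name is hypothetical.

```python
import mailbox


def maildir_to_mbox(maildir_path, mbox_path):
    """Append every message in a Maildir to an mbox file; return the count."""
    src = mailbox.Maildir(maildir_path, factory=None, create=False)
    dest = mailbox.mbox(mbox_path)
    count = 0
    try:
        for msg in src:
            # mbox.add() writes the "From " separator line and escapes
            # body lines that start with "From " as ">From ".
            dest.add(msg)
            count += 1
    finally:
        dest.close()
        src.close()
    return count
```

The resulting mbox can then go through the same header matching as any other mbox collection.
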
>
> Likewise for anyone else that does or may have such email 
> collections.  :-)
>
> As for BALUG (I think I also posted similar to some relevant BALUG
> list(s) before ... but it's been quite a while - other than slight
> regular mention ("volunteering to help BALUG" ...
> "archivist/history/retrieval/etc."))
> I can provide more specific details on what we're missing from what
> ranges of time on what lists ... there's much we have; also much we
> don't.  There's also some fair bit between, where we have less than
> ideal format (what we could extract from archive.org, but those have
> web and email mungings that can't be undone), e.g. s/ at /@/g does
> not undo s/@/ at /g; think for example:
> John at example.com, use:
> a=' at '; b='@'; if [ x"$a" != x"$b" ]; then foo; else bar; fi
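
To make the point about the munging concrete, here is a small sketch of why the reverse substitution cannot restore the original text. The munge and unmunge helpers are hypothetical stand-ins for the two sed-style substitutions mentioned above, and the sample sentence is made up.

```python
def munge(text):
    """Stand-in for s/@/ at /g as applied by the web archive."""
    return text.replace("@", " at ")


def unmunge(text):
    """Naive attempt to reverse it: s/ at /@/g."""
    return text.replace(" at ", "@")


original = "Meet John at noon; mail john@example.com"
round_trip = unmunge(munge(original))

# The genuine " at " in "John at noon" is indistinguishable from a munged
# "@", so the reverse substitution rewrites it too.
print(round_trip)  # Meet John@noon; mail john@example.com
print(round_trip == original)  # False
```

Any text that legitimately contained " at " before munging is damaged by the attempted reversal, which is why the archive.org copies are a less than ideal source.
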
>
> In any case, even in the case of SF-LUG, not sure of the much earlier
> list stuff - notably before list being hosted on linuxmafia.org.
> Anyway, if someone has those old emails may be very possible to
> reintroduce them to the archive ... if the earlier list(s) were
> sufficiently different (e.g. different set of lists, or quite
> different naming/purpose) we might want to alternatively preserve
> those in some separate available read-only format for folks (and
> search engines) to be able to peruse and provide useful (and
> historical) information from.
>
> And yes, can reintroduce (or remove if necessary/warranted) items from
> mailman archive.  One of my pre-"go live" tests for moving of
> mailman hosting of BALUG's lists, from DreamHost to the balug VM
> (hosted by yours truly), was testing that I could not only restore
> archive, but also (re)inject emails to list archive, and also remove
> emails from archive.  So I do also have all that info. somewhere in my
> notes too (and looks like Rick's also covered that information on-list
> too.  :-)).
>
>> From: "Rick Moen" <rick at linuxmafia.com>
>> Subject: Re: [sf-lug] Mobile-friendlying the SF-LUG website (was Re: 
>> Status of SF-LUG etc) SF-LUG web site (mis?)information thereof
>> Date: Sun, 26 May 2019 15:29:49 -0700
>
>> Quoting Michael Paoli (Michael.Paoli at cal.berkeley.edu):
>>
>>> Some of BALUG's older archives are also semi-missing - so that also
>>> made it harder for me to check(/correct).
>>
>> If we (for you & BALUG values of 'we') have the older stuff in mbox or
>> can-be-hammered-into-mbox format, it's actually really easy to add them
>> into Pipermail's archive.  I've done so a bunch of times on my host and
>> SVLUG's mail host.
>>
>> 1.  Use 'cat' to slam together multiple mboxes to make one big one, and
>> make that become the new
>> /var/lib/mailman/archives/private/balug-talk.mbox/balug-talk.mbox, which
>> should be 0644 and owned by list:list .
>>
>> 2.  $ su -
>>     # su - list
>>     $ cd /var/lib/mailman/
>>     $ bin/arch --wipe -q balug-talk \
>>         archives/private/balug-talk.mbox/balug-talk.mbox
>>     $ ## wait a long time, maybe 20 minutes
>>     $ exit
>>     # exit
>>
>> That's about the limit unless some of the constituent mbox material had
>> one or more unescaped body text line starting flush-left with 'From ',
>> in which case /var/lib/mailman/bin/arch (the Pipermail archiver
>> program) will make hapless parsing errors, which you fix by finding
>> those lines, escaping each such mbox line with a prefatory '>', and
>> re-running 'arch'.
>>
>> Don't make the common mistake of running the Pipermail 'arch' program as
>> the _root_ user, or the generated archives will be obscured by a 403
>> error because file permissions will be wrong.
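
The "find those lines and escape them" step could be sketched roughly as below. This is only a heuristic, not anything taken from Mailman itself: it assumes genuine mbox separator lines end in an asctime-style timestamp (for example "Sun May 26 15:29:49 2019") and treats any other line starting with "From " as body text needing a prefatory '>'. The function name and the pattern are illustrative assumptions, and the result should be reviewed before re-running 'arch'.

```python
import re

# Assumed shape of a real separator: "From <sender> Sun May 26 15:29:49 2019"
SEPARATOR = re.compile(
    r"^From .+ [A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}$"
)


def escape_stray_from_lines(text):
    """Prefix '>' to 'From ' lines that do not look like mbox separators."""
    fixed = []
    for line in text.split("\n"):
        if line.startswith("From ") and not SEPARATOR.match(line):
            line = ">" + line
        fixed.append(line)
    return "\n".join(fixed)
```

Lines that are already escaped (">From ...") and header lines ("From: ...") are left alone, so running it twice should not change the result further.
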
>
>
> _______________________________________________
> sf-lug mailing list
> sf-lug at linuxmafia.com
> http://linuxmafia.com/mailman/listinfo/sf-lug
> SF-LUG is at http://www.sf-lug.org/ 
