<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<br>
<font face="Courier New, Courier, monospace">what time period? <br>
<br>
</font><br>
<div class="moz-cite-prefix">On 5/27/19 12:19 PM, Michael Paoli
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:20190527091939.15756wtiybmqwq88@webmail.rawbw.com">Are
you an archivist (or chronic hoarder of old emails ;-))?
<br>
Would love to hear from you.
<br>
<br>
Most notably, especially for BALUG - and also to lesser extent
SF-LUG,
<br>
there are some list postings that have been lost 8-O - in the case
of
<br>
BALUG, many years' worth (no thanks to DreamHost, and also some
folks
<br>
earlier switching list service/software, and not bothering to save
<br>
the older). In the case of SF-LUG, I believe it's mostly more
like
<br>
a moderate handful after some hardware issues on a couple of
occasions
<br>
(most of which were restored, but I believe still some were lost).
<br>
<br>
Anyway, if you've got a collection of most or all list emails, and
<br>
especially older ones, I'm quite interested in grabbing the list
<br>
emails from older email collections, so any that may be found
there
<br>
that are missing can be restored to the lists/archives.
<br>
<br>
No, don't need to (human) read all your emails/collections ... I can
write
<br>
a script/program to extract just the list emails. But may need to
<br>
have you (or I) initially scan some emails so (and notably for
some
<br>
of the different lists and list software or services used at the
time)
<br>
I/we can identify unique headers of items sent to the lists.
<br>
Once that's been determined, it's relatively straightforward to write
<br>
a program that would extract only email messages that were sent out
<br>
by the lists (can also add collecting items you sent to list(s),
as,
<br>
depending on software/settings, lists may or may not send the posting
also
<br>
to the poster). Anyway, that way, can just extract items sent by(/to)
<br>
relevant lists, and don't need a human to be reading other emails
in
<br>
email collections.
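<br>
<br>
For what it's worth, a rough Python sketch of that header-match
extraction might look something like the following. The header names
and match strings (List-Id, X-BeenThere, "sf-lug") and the function
names are assumptions for illustration only; the real criteria would
have to come from scanning sample messages for each list and era.
<br>
<pre>
```python
import mailbox

# Hypothetical match criteria: header name -> substring expected in its value.
# Real criteria would come from inspecting sample messages for each list era.
LIST_HEADERS = {
    "List-Id": "sf-lug",
    "X-BeenThere": "sf-lug@",
}


def is_list_message(msg, criteria=LIST_HEADERS):
    """Return True if any header named in criteria contains its substring."""
    for name, needle in criteria.items():
        for value in msg.get_all(name, []):
            if needle.lower() in str(value).lower():
                return True
    return False


def extract_list_messages(src_path, dest_path, criteria=LIST_HEADERS):
    """Copy matching messages from one mbox file to another; return the count."""
    src = mailbox.mbox(src_path, create=False)
    dest = mailbox.mbox(dest_path)
    copied = 0
    try:
        for msg in src:
            if is_list_message(msg, criteria):
                dest.add(msg)
                copied += 1
        dest.flush()
    finally:
        dest.close()
        src.close()
    return copied
```
</pre>
Only the headers get compared, so nobody has to read the other
emails in the collection.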
<br>
<br>
So, Jim Stockford ... let me know how we might arrange this some
time.
<br>
I believe you said you save *all* emails, and have 'em going way
back. :-)
<br>
Pile o' hard drives? Certainly can be well used - I've got
hardware which
<br>
can read most drive (interface) types and the data upon the
drives.
<br>
"Of course" mbox format is easiest, but can likely also deal with
other
<br>
formats (semi-)easily enough - again, I can write bit o' code (or
find
<br>
such), suitable for reading other formats, convert that to mbox
(or
<br>
similar enough), and likewise then use appropriate header match
criteria,
<br>
for extracting just the list emails.
<br>
<br>
Likewise for anyone else that does or may have such email
collections. :-)
<br>
<br>
As for BALUG (I think I also posted similar to some relevant BALUG
<br>
list(s) before ... but it's been quite a while - other than slight
<br>
regular mention ("volunteering to help BALUG" ...
<br>
"archivist/history/retrieval/etc."))
<br>
I can provide more specific details on what we're missing from
what
<br>
ranges of time on what lists ... there's much we have; also much
we
<br>
don't. There's also a fair bit in between, where we have less
than
<br>
ideal format (what we could extract from archive.org, but those
have web and
<br>
email mungings that can't be undone (e.g. s/ at /@/g does not
undo:
<br>
s/@/ at /g
<br>
think for example:
<br>
<a class="moz-txt-link-abbreviated" href="mailto:John@example.com">John@example.com</a>, use:
<br>
a=' at '; b='@'; if [ x"$a" != x"$b" ]; then foo; else bar; fi
<br>
)
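<br>
<br>
To illustrate why that munging can't be undone, a small Python
sketch (the function names and the sample sentence are made up for
illustration): any " at " that was already in the text gets turned
into "@" by the reverse substitution.
<br>
<pre>
```python
def munge(text):
    """Equivalent of s/@/ at /g, the archive-style address obscuring."""
    return text.replace("@", " at ")


def naive_unmunge(text):
    """Equivalent of s/ at /@/g, the tempting but lossy reverse."""
    return text.replace(" at ", "@")


original = "Meet John at noon, or mail John@example.com"
munged = munge(original)
restored = naive_unmunge(munged)

print(munged)    # Meet John at noon, or mail John at example.com
print(restored)  # Meet John@noon, or mail John@example.com
print(restored == original)  # False
```
</pre>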
<br>
<br>
In any case, even in the case of SF-LUG, not sure of the much
earlier
<br>
list stuff - notably before list being hosted on linuxmafia.org.
<br>
Anyway, if someone has those old emails, it may be very possible to
<br>
reintroduce them to the archive ... if the earlier list(s) were
<br>
sufficiently different (e.g. different set of lists, or quite
<br>
different naming/purpose) we might want to alternatively preserve
<br>
those in some separate available read-only format for folks (and
<br>
search engines) to be able to peruse and provide useful (and
<br>
historical) information from.
<br>
<br>
And yes, can reintroduce (or remove if necessary/warranted) items
from
<br>
mailman archive. One of my pre-"go live" tests for moving of
<br>
mailman hosting of BALUG's lists, from DreamHost to the balug VM
<br>
(hosted by yours truly), was testing that I could not only restore
<br>
archive, but also (re)inject emails to list archive, and also
remove
<br>
emails from archive. So I do also have all that info. somewhere
in my
<br>
notes too (and looks like Rick's also covered that information
on-list
<br>
too. :-)).
<br>
<br>
<blockquote type="cite">From: "Rick Moen"
<a class="moz-txt-link-rfc2396E" href="mailto:rick@linuxmafia.com">&lt;rick@linuxmafia.com&gt;</a>
<br>
Subject: Re: [sf-lug] Mobile-friendlying the SF-LUG website (was
Re: Status of SF-LUG etc) SF-LUG web site (mis?)information
thereof
<br>
Date: Sun, 26 May 2019 15:29:49 -0700
<br>
</blockquote>
<br>
<blockquote type="cite">Quoting Michael Paoli
(<a class="moz-txt-link-abbreviated" href="mailto:Michael.Paoli@cal.berkeley.edu">Michael.Paoli@cal.berkeley.edu</a>):
<br>
<br>
<blockquote type="cite">Some of BALUG's older archives are also
semi-missing - so that also
<br>
made it harder for me to check(/correct).
<br>
</blockquote>
<br>
If we (for you &amp; BALUG values of 'we') have the older stuff
in mbox or
<br>
can-be-hammered-into-mbox format, it's actually really easy to
add them
<br>
into Pipermail's archive. I've done so a bunch of times on my
host and
<br>
SVLUG's mail host.
<br>
<br>
1. Use 'cat' to slam together multiple mboxes to make one big
one, and
<br>
make that become the new
<br>
/var/lib/mailman/archives/private/balug-talk.mbox/balug-talk.mbox, which
<br>
should be 0644 and owned by list:list .
<br>
<br>
2. $ su -
<br>
# su - list
<br>
$ cd /var/lib/mailman/
<br>
$ bin/arch --wipe -q balug-talk
archives/private/balug-talk.mbox/balug-talk.mbox
<br>
$ ## wait a long time, maybe 20 minutes
<br>
$ exit
<br>
# exit
<br>
<br>
That's about the limit unless some of the constituent mbox
material had
<br>
one or more unescaped body text line starting flush-left with
'From ',
<br>
in which case /var/lib/mailman/bin/arch (the Pipermail archiver
<br>
program) will make hapless parsing errors, which you fix by
finding
<br>
those lines, escaping each such mbox line with a prefatory
'&gt;', and
<br>
re-running 'arch'.
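<br>
<br>
A rough Python sketch of finding and escaping such lines. The
separator pattern is an assumption about what a genuine mbox 'From '
line looks like (sender, then an asctime-style date), and the
function name is hypothetical; check the result by eye before
re-running 'arch'.
<br>
<pre>
```python
import re

# Assumed shape of a genuine separator: "From sender Day Mon DD HH:MM:SS YYYY".
# Mboxes from other software may differ, so adjust the pattern to match yours.
SEPARATOR = re.compile(
    r"^From \S+ +[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d\d:\d\d:\d\d \d{4}\s*$"
)


def escape_body_from_lines(lines):
    """Prefix '>' to 'From ' lines that do not look like message separators."""
    fixed = []
    previous_blank = True  # the start of the file counts as a boundary
    for line in lines:
        looks_like_separator = previous_blank and SEPARATOR.match(line)
        if line.startswith("From ") and not looks_like_separator:
            line = ">" + line
        fixed.append(line)
        previous_blank = line.strip() == ""
    return fixed
```
</pre>
Feed it the lines of the combined mbox and write the returned list
back out; lines that are already escaped are left alone since they
no longer start with 'From '.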
<br>
<br>
Don't make the common mistake of running the Pipermail 'arch'
program as
<br>
the _root_ user, or the generated archives will be obscured by
a 403
<br>
error because file permissions will be wrong.
<br>
</blockquote>
<br>
<br>
_______________________________________________
<br>
sf-lug mailing list
<br>
<a class="moz-txt-link-abbreviated" href="mailto:sf-lug@linuxmafia.com">sf-lug@linuxmafia.com</a>
<br>
<a class="moz-txt-link-freetext" href="http://linuxmafia.com/mailman/listinfo/sf-lug">http://linuxmafia.com/mailman/listinfo/sf-lug</a>
<br>
SF-LUG is at <a class="moz-txt-link-freetext" href="http://www.sf-lug.org/">http://www.sf-lug.org/</a>
</blockquote>
<br>
</body>
</html>