[sf-lug] repeated pleadings ... SF-LUG ... backups ... periodically fetch and archive ... (was: Notes + constructive suggestions/advice on fstrim/discard)

Michael Paoli Michael.Paoli at cal.berkeley.edu
Fri Feb 22 08:08:08 PST 2019


> From: "Rick Moen" <rick at linuxmafia.com>
> Subject: Re: [sf-lug] Notes + constructive suggestions/advice on  
> fstrim/discard
> Date: Thu, 21 Feb 2019 15:35:37 -0800

> Quoting Alex Kleider (akleider at sonic.net):
>
>> With the existence of the Wayback Machine (Internet Archive),
>> even that may not be possible. Don't know.
>
> ISTR that, after my repeated pleadings that SF-LUG accept and keep
> backups of this mailing list's membership roster and cumulative mbox
> fell 100% on deaf ears for long years, Michael Paoli finally worked with
> me to arrange both.  As a reminder -- and I've said this many times
> before -- the cumulative mbox is completely public, and it would
> brighten my day if more than a single SF-LUG volunteer bothered to
> periodically fetch and archive it.
>
> http://linuxmafia.com/pipermail/sf-lug.mbox/sf-lug.mbox
>
> (Michael fetches it slightly differently, using rsync, in order to fetch
> just the diff, each time.  'rsync linuxmafia.com::sf-lug' will show  
> you that.)

Yes, been covered before, e.g.:
http://linuxmafia.com/pipermail/sf-lug/2015q1/010666.html
http://linuxmafia.com/pipermail/sf-lug/2015q1/010740.html
http://linuxmafia.com/pipermail/sf-lug/2015q1/011128.html
http://linuxmafia.com/pipermail/sf-lug/2015q1/011139.html
http://linuxmafia.com/pipermail/sf-lug/2015q1/011141.html
(Some of the links within those earlier posts may no longer work or
be current.  Notably, I think some message drops, adds, and/or
renumbering caused many or all of the 2007q4 archive links to no
longer point where they once did.  Feel free to let us know what
those earlier links were intended to go to, and update us.  :-) )
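
For anyone inclined to take Rick up on that, here is a minimal sketch
of the fetch-to-a-temporary-file, then rename-into-place approach.
It is not one of the scripts shown further down: update_copy and its
argument convention (the fetch command gets the temporary file name
appended as its last argument) are made up for illustration.

```shell
#!/bin/sh
# Sketch only: update_copy is a hypothetical helper, not part of ~sflug/bin.
# Usage: update_copy DEST FETCH_COMMAND [ARGS...]
# FETCH_COMMAND is run with the temporary file name appended as last argument.
update_copy() {
    dest=$1
    shift
    # temporary file in the same directory as DEST, so mv(1) can rename(2)
    tmp=$(dirname "$dest")/.$(basename "$dest").tmp.$$
    # seed the temporary file from the existing copy, so a delta-aware
    # fetcher such as rsync only needs to transfer the differences
    if [ -f "$dest" ]; then
        cp -p "$dest" "$tmp" || return
    fi
    if "$@" "$tmp"; then
        if [ -f "$dest" ] && cmp -s "$tmp" "$dest"; then
            rm -f "$tmp"            # unchanged: keep the existing file
        else
            mv -f "$tmp" "$dest"    # new or changed: put it in place
        fi
    else
        rc=$?                       # exit status of the failed fetch
        rm -f "$tmp"
        return "$rc"
    fi
}

# Example (needs network access, so left commented out here):
# update_copy "$HOME"/sf-lug.mbox/sf-lug.mbox \
#     rsync --quiet --times rsync://linuxmafia.com/sf-lug/sf-lug.mbox
```

A failed or interrupted fetch then never clobbers the last good copy;
DEST only changes once a complete new file is in hand.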

And, let's see ... currently we have (not all that much has changed):

$ hostname; TZ=GMT0 date -Iseconds; id; pwd -P
balug-sf-lug-v2.balug.org
2019-02-22T15:40:13+00:00
uid=29774(sflug) gid=29774(sflug) groups=29774(sflug)
/home/sflug
$ ip a s | sed -ne '/inet/!d;/scope host/d;/scope link/d;p' | sort
     inet 198.144.194.238/29 brd 198.144.194.239 scope global eth0
     inet6 2001:470:1f04:19e::2/64 scope global
     inet6 2001:470:1f05:19e::2/64 scope global
     inet6 2001:470:1f05:19e::3/64 scope global
$ ls bin
RCS                                sf-lug_roster_fetch+decrypt+version_control
sf-lug.mbox_rsync+version_control  sf-lug_roster_stats
sf-lug_mbox_stats
$ ls -d /etc/*cron*/*sf-lug*
/etc/cron.daily/local_sf-lug-roster  /etc/cron.daily/local_sf-lug_mbox
$ tmpd=$(mktemp -d)
$ (for f in bin/sf-lug_r* bin/sf-lug.m* /etc/*cron*/*sf-lug*; do b=$(basename "$f") && < "$f" expand -t 4 > "$tmpd"/"$b"; done; cd "$tmpd"/ && more * | cat); rm "$tmpd"/* && rmdir "$tmpd" && unset tmpd
::::::::::::::
local_sf-lug-roster
::::::::::::::
#!/bin/sh

su - sflug -c '
     sleep 300 && {
         >>/dev/null 2>&1 sf-lug_roster_fetch+decrypt+version_control

         # Monthly stats reporting:
         case "$(date +%d)" in

             01)
                 mailfrom='\''Michael Paoli <Michael.Paoli at cal.berkeley.edu>'\''
                 mailto='\''SF-LUG <sf-lug at linuxmafia.com>'\''
                 subject='\''sf-lug: List: stats, etc.'\''

                 exec >>/dev/null 2>&1
                 {
                     echo '\''The roster (list of subscribers), number of subscribers, by date:'\''
                     echo '\''$ sf-lug_roster_stats'\''
                     ./bin/sf-lug_roster_stats
                     echo '\''$ '\''
                 } |
                 mailx \
                     -r "$mailfrom" \
                     -s "$subject" \
                     "$mailto"
             ;;

             0[02-9]|[12][0-9]|3[01])
                 :
             ;;

             *)
                 :
             ;;

         esac

     }
     :
'
::::::::::::::
local_sf-lug_mbox
::::::::::::::
#!/bin/sh
su - sflug -c 'sleep 600 && >>/dev/null 2>&1 sf-lug.mbox_rsync+version_control; :'
::::::::::::::
sf-lug.mbox_rsync+version_control
::::::::::::::
#!/bin/sh

set -e # bail if we fail

# vi(1) :se tabstop=4

# update sf-lug.mbox via rsync, save changes using version_control

umask 022 # default to world readable, writable only by owner

# Also available via http, but we use rsync so we can fully match
# even if older/earlier content changes, without need to download
# entire file.
#
# Some earlier background on access via http:
#
# http://linuxmafia.com/pipermail/conspire/2018-June/009229.html
# 2018-06-28
#
# Yes, oft posted/reminded.  The hosting software (Mailman), is among
# other things, configurable to allow the entire raw mbox format file (the
# raw total collection of all postings to the list) to be accessible in
# manner quite similar to the more general archive of list postings ... so
# generally public ... or possibly restricted to just list members?
# ("private").  In any case, the sf-lug list graciously hosted by Rick on
# linuxmafia.com is so accessible, and Rick has often made the point
# that it's there, available to be backed up in its entirety, by anyone
# that would so much as care to bother (my paraphrasing, not necessarily
# an exact quote).  So ... linuxmafia.com lists ...
# http://www.linuxmafia.com/
# mailing lists --> http://linuxmafia.com/mailman/listinfo/
# ... archive ...
# all but two "listed" (I think it's "advertised" in Mailman's
# configuration terminology) lists there are publicly archived (two appear
# to be members only access - password protected).  And of the public ones
# ... the main archive pages don't provide URL for downloading entire raw
# mbox format archive file - but it's rather well known, and has been oft
# mentioned (certainly at least for the SF-LUG list).  So, let's see ...
# $ (for l in conspire dvlug friday-follies sf-lug test
# do echo "$l" $(
# curl -s -I http://linuxmafia.com/pipermail/"$l".mbox/"$l".mbox |
# sed -ne '/^HTTP/{p;/200/!q;};/^Content-Length:/p' | tr -d '\015'); done)
# conspire HTTP/1.1 200 OK Content-Length: 46875787
# dvlug HTTP/1.1 200 OK Content-Length: 3139113
# friday-follies HTTP/1.1 404 Not Found
# sf-lug HTTP/1.1 200 OK Content-Length: 70806214
# test HTTP/1.1 200 OK Content-Length: 22522
# $
# And so, we can see, that for the 5 lists publicly archived there, 4 of
# the 5 have their full raw mbox archive files also publicly available
# (and does include both the conspire and sf-lug lists).

login=sflug
HOME=$(perl -e 'print((getpwnam(q('"$login"')))[7],"\n");')
[ -n "$HOME" ]

# directory where we put the file when we're done
target_dir="$HOME"/sf-lug.mbox

# options we want to use for our rsync operation:
rsync_opts='--quiet --checksum --times --sparse --partial --ignore-times --compress-level=9 --bwlimit=5'
# --compress-level=9 and --bwlimit=5 to go easy on bandwidth for upstream

# my private (not shared by anything else) temporary directory,
# use same filesystem as target_dir (so mv(1) can use rename(2))
my_tmp_dir="$target_dir"/.sf-lug.mbox.tmp

# (numeric) signals we'll trap on
trapsigs='1 2 3 15'

# throughout, we generally trap signals of interest to clean up and
# return appropriate exit value.
gotsig=
for sig in $trapsigs
do
     trap 'gotsig='"$sig" "$sig"
done

# if my target temporary directory already exists, remove it
rm -rf "$my_tmp_dir" || :

trap 'rm -rf "$my_tmp_dir"' 0
for sig in $trapsigs
do
     trap 'trap - 0; rm -rf "$my_tmp_dir"; exit '$(expr "$sig" + 128) "$sig"
done
[ x"$gotsig" = x ] || kill -"$gotsig" "$$"

# make my private temporary directory and cd into it
mkdir "$my_tmp_dir"
cd "$my_tmp_dir"

# this target will be the rsync target we update
cp -p "$target_dir"/sf-lug.mbox "$my_tmp_dir"/sf-lug.mbox

# in case rsync cares, set it writeable for user before rsync update
chmod u=rw,go=r sf-lug.mbox

# update target in temporary location
rsync $rsync_opts rsync://linuxmafia.com/sf-lug/sf-lug.mbox sf-lug.mbox

cd "$target_dir"

# check that existing file matches what was most recently checked in
co -p -q -kb sf-lug.mbox |
>> /dev/null 2>&1 cmp - sf-lug.mbox || \
{
     # not matched, therefore not checked in, we go ahead and check it in
     rcs -l sf-lug.mbox && # lock it before check in
     chmod u=rw,go=r sf-lug.mbox && # make user writable as if did co -l
     # was earlier initialized with rcs -i -kb, so we default to
     # binary for, e.g. our ci -u, ci -l, co, etc.
     ci -d -u -M -m'update' sf-lug.mbox && # check it in
     chmod a=r sf-lug.mbox && # set read only for all
     # should now match, assert it (set -e):
     co -p -q -kb sf-lug.mbox |
     >>/dev/null 2>&1 cmp - sf-lug.mbox
}

# set user writable, in case mv(1) cares:
chmod u=rw,go=r "$my_tmp_dir"/sf-lug.mbox "$target_dir"/sf-lug.mbox
mv -f "$my_tmp_dir"/sf-lug.mbox "$target_dir"/sf-lug.mbox
rmdir "$my_tmp_dir"

# no longer need to clean up my_tmp_dir
trap - 0
for sig in $trapsigs
do
     trap 'trap - 0; exit '$(expr "$sig" + 128) "$sig"
done
# we hope our version control is "smart enough" to deal appropriately
# with any signals it may receive or any action it didn't earlier
# complete

# check if rsynced file matches what was most recently checked in
if co -p -q -kb sf-lug.mbox | >>/dev/null 2>&1 cmp - sf-lug.mbox; then
     chmod a=r sf-lug.mbox # set read only for all
else
     # not matched, therefore not checked in, we go ahead and check it in
     rcs -l sf-lug.mbox && # lock it before check in
     chmod u=rw,go=r sf-lug.mbox && # make user writable as if did co -l
     # was earlier initialized with rcs -i -kb, so we default to
     # binary for, e.g. our ci -u, ci -l, co, etc.
     ci -d -u -M -m'update' sf-lug.mbox && # check it in
     chmod a=r sf-lug.mbox && # set read only for all
     # should now match, assert it (set -e):
     co -p -q -kb sf-lug.mbox |
     >>/dev/null 2>&1 cmp - sf-lug.mbox
fi
::::::::::::::
sf-lug_roster_fetch+decrypt+version_control
::::::::::::::
#!/bin/sh

# :se tabstop=4

# this script,
# sf-lug_roster_fetch+decrypt+version_control
# is to fetch and decrypt our upstream data source,
# and to also save older decrypted versions via version control

set -e # bail if we fail

umask 022

# Configuration stuff:
LC_ALL=C export LC_ALL # treat as ASCII/bytes, avoid "illegal character in"
# URL of encrypted membership roster:
fetch_url='http://linuxmafia.com/pipermail/sf-lug.mbox/sf-lug_roster.asc'
login=sflug
HOME=$(perl -e 'print((getpwnam(q('"$login"')))[7],"\n");')
[ -n "$HOME" ]
target_dir="$HOME"/sf-lug_roster
target_base_file=sf-lug_roster
target_temp_base_file=.sf-lug_roster.tmp
target_www_dir="$target_dir"
target_www_base_file=sf-lug_roster.asc
target_www_temp_base_file=.sf-lug_roster.asc.tmp

# the below bit is some more information on what upstream
# file://linuxmafia.com/etc/cron.daily/sf-lug-roster
# (on linuxmafia.com) prepares for us:

# #!/bin/sh
# #
# # sf-lug-roster:  Cron script to save out current Mailman mailing list
# # sf-lug's roster in a place SF-LUG officers can get it.
# #
# #               Written by Rick Moen (rick at linuxmafia.com)
# #               $Id: cron.weekly,v 1.00 2015-02-02 16:06:00 rick
#
# set -o errexit  #aka "set -e": exit if any line returns non-true value
# set -o nounset  #aka "set -u": exit upon finding an uninitialised variable
#
# /var/lib/mailman/bin/list_members -f sf-lug | \
# /usr/bin/gpg --armor --yes --batch --trust-model always --encrypt --recipient \
# 0x960C4BE648737D4287DC188FE8A55E60878BD8C0 > \
# /var/lib/mailman/archives/private/sf-lug.mbox/sf-lug_roster.asc

# 06:32:04 typical daily local mtime of the above

# (numeric) signals we'll trap on
trapsigs='1 2 3 15'

# throughout, we generally trap signals of interest to clean up and
# return appropriate exit value.
gotsig=
for sig in $trapsigs
do
     trap 'gotsig='"$sig" "$sig"
done

temp_file=$(mktemp)

trap 'rm "$temp_file"' 0
for sig in $trapsigs
do
     trap 'trap - 0; rm "$temp_file"; exit '$(expr "$sig" + 128) "$sig"
done
[ x"$gotsig" = x ] || kill -"$gotsig" "$$"

# fetch file from our URL:
wget -q -O "$temp_file" "$fetch_url"

for sig in $trapsigs
do
     trap 'trap - 0; rm -f "$temp_file" "$target_www_dir/$target_www_temp_base_file"; exit '$(expr "$sig" + 128) "$sig"
done

# upstream is encrypted world readable from The Internet,
# mktemp(1) denies go r perms, despite umask 022, so,
# since no reason not to have our fetched copy also be world readable,
# after successfully fetching, we do:
chmod a+r "$temp_file"

# copy to a second temporary location, on the same filesystem as the
# target: the first (from mktemp(1)) is or may be faster and more
# efficient, and is automatically cleaned up periodically and/or upon
# reboots; the second is on the same filesystem as the target, so mv(1)
# can use rename(2)
cp -p "$temp_file" "$target_www_dir/$target_www_temp_base_file"
rm "$temp_file"

trap 'trap - 0; rm -f "$target_www_dir/$target_www_temp_base_file"; exit 0' 0
for sig in $trapsigs
do
     trap 'trap - 0; rm -f "$target_www_dir/$target_www_temp_base_file"; exit '$(expr "$sig" + 128) "$sig"
done

# all good so far, move to its persistent location
mv -f "$target_www_dir/$target_www_temp_base_file" \
"$target_www_dir/$target_www_base_file"

umask 077 # roster contents confidential (strongly encrypted is public)

trap 'rm "$target_dir/$target_temp_base_file"' 0
for sig in $trapsigs
do
     trap 'trap - 0; rm "$target_dir/$target_temp_base_file"; exit '$(expr "$sig" + 128) "$sig"
done

# decrypt
< "$target_www_dir/$target_www_base_file" \
> "$target_dir/$target_temp_base_file" \
gpg --decrypt

# take mtime from source
touch -m -r "$target_www_dir/$target_www_base_file" \
"$target_dir/$target_temp_base_file"

# is file contents same as before?
if >>/dev/null 2>&1 cmp "$target_dir/$target_temp_base_file" \
     "$target_dir/$target_base_file"; then
     # same contents, we're essentially done
     exit 0 # trap removes temporary file
else
     # updated contents - put in place, update version control
     mv -f "$target_dir/$target_temp_base_file" \
     "$target_dir/$target_base_file"
     trap - 0
     for sig in $trapsigs
     do
         trap 'trap - 0; exit '$(expr "$sig" + 128) "$sig"
     done
     cd "$target_dir"
     rcs -l "$target_base_file"
     ci -d -u -M -m'update' "$target_base_file"
fi
::::::::::::::
sf-lug_roster_stats
::::::::::::::
#!/bin/sh
cd /home/sflug/sf-lug_roster &&
rlog sf-lug_roster |
sed -ne '
     /^revision ./{
         s/^revision //
         h
     }
     /^date: [0-9][0-9][0-9][0-9]\/[0-9][0-9]\/[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9];/{
         s/^date: \([0-9][0-9][0-9][0-9]\)\/\([0-9][0-9]\)\/\([0-9][0-9]\) [0-9][0-9]:[0-9][0-9]:[0-9][0-9];.*$/\1-\2-\3/
         H
         s/.*//
         x
         s/\n/ /
         /^[^    ][^     ]* [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/p
         d
     }
' |
{
     echo 'YYYY-MM-DD'
     while read r yyyymmdd
     do
         echo "$yyyymmdd" \
         $(2>>/dev/null co -p"$r" sf-lug_roster | wc -l)
     done
}
$
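
The sed(1) in sf-lug_roster_stats pairs each revision number from
rlog(1) with its check-in date, reformatted as YYYY-MM-DD.  The same
extraction can be sketched in awk(1), which some may find easier to
follow.  This is only an illustration: rlog_rev_dates is a
hypothetical name, and it assumes rlog's "date: YYYY/MM/DD HH:MM:SS;"
line format that the sed above also expects.

```shell
#!/bin/sh
# Sketch only: rlog_rev_dates is a hypothetical helper, not part of ~sflug/bin.
# Reads rlog(1) output on stdin, writes "REVISION YYYY-MM-DD" lines.
rlog_rev_dates() {
    awk '
        # remember the most recently seen revision number
        /^revision [0-9]/ { rev = $2 }
        # on the date line that follows, emit the pair and reset
        /^date: [0-9][0-9][0-9][0-9]\/[0-9][0-9]\/[0-9][0-9] / && rev != "" {
            d = $2              # e.g. 2019/02/01
            gsub("/", "-", d)   # becomes 2019-02-01
            print rev, d
            rev = ""
        }
    '
}
```

Usage would be along the lines of
"rlog sf-lug_roster | rlog_rev_dates", with the output fed to the
same while read loop as in sf-lug_roster_stats.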



