[sf-lug] List postings per username in given timerange + useful pipe/script
aaronco36
aaronco36 at SDF.ORG
Mon Nov 15 07:45:00 PST 2021
Quoting top of Michael P's 'sf-lug: List: stats, etc.' at [1]:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The roster (list of subscribers), number of subscribers, by date:
$ sf-lug_roster_stats
YYYY-MM-DD
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
It might also interest at least a few readers to see the number of
mailing-list postings per contributor in a given time range: for example,
each person's postings during the current fourth quarter of 2021.
Roughly eyeballing it by sorting the fourth-quarter 2021 archives by
author [2] and then manually sorting by ascending posting frequency, the
results to date are approximately the following:
User 1stname        Number of postings
================    =============================
John                2
Al                  3
Ronald              4
aaronco36 (self)    4 (including this posting)
Michael             13
Bobbie              15
Rick                16
Hmmm.... it seems that I am currently at the median for number of
postings in the given time range.
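As a quick check of that median claim, here is a minimal Python sketch. The
counts are copied from the hand-made table above; the variable names are
purely illustrative.

```python
from statistics import median

# Posting counts copied from the hand-tallied table above.
posts = {
    "John": 2, "Al": 3, "Ronald": 4, "aaronco36": 4,
    "Michael": 13, "Bobbie": 15, "Rick": 16,
}

mid = median(posts.values())
# Everyone whose count equals the median value.
at_median = sorted(name for name, n in posts.items() if n == mid)
print(mid, at_median)
```

With seven posters, the median is the fourth value in sorted order, so it is
shared by the two contributors with four postings each.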
Setting aside the choice of tool (an effective bash pipeline, a bash
script, Michael's preferred "sh will do fine, thankyouverymuch" of [3] ;-D,
a perl script, a python script, or whatever else...), would the following
rough pseudocode draft be a reasonable first approximation for automating
this?
[start script]
..Download the latest gzip'd text file, e.g. at [4], into
  localhost's|otherhost's /<downloadsubfolder>
..Gunzip /<downloadsubfolder>/2021q4.txt.gz
..Loop through each email message's 'From:' and 'Date:' fields in the
  /<downloadsubfolder>/2021q4.txt textfile
....Check whether the contents of each 'From:' for each 'Date:' timestamp
    are validated (by list-admins?) as genuine vs. spammy-seeming
....Check other validation tests on the same or other fields (e.g.,
    blank contents, much-too-long contents, attachments...)?
..Back up/copy the /<downloadsubfolder>/2021q4.txt textfile to
  /<downloadsubfolder>/2021q4<newname>, then remove the full contents of
  certain posts within that /<downloadsubfolder>/2021q4<newname> textfile,
  e.g., obvious duplicates, forged headers, etc.
..Loop through each email message's 'From:' field in the now-validated
  /<downloadsubfolder>/2021q4<newname> textfile
....Increment a tally counter for each post by each previously validated
    'From:' username in /<downloadsubfolder>/2021q4<newname>
..Display each unique 'From:' username along with the total _number_ of
  their posts for the time period covered by the gzip'd text file
  downloaded above
[end script]
-A
=========================================
REFERENCES/EXCERPTS
=========================================
[1]http://linuxmafia.com/pipermail/sf-lug/2021q4/015421.html
[2]http://linuxmafia.com/pipermail/sf-lug/2021q4/author.html
[3]http://linuxmafia.com/pipermail/sf-lug/2021q4/015450.html
[4]http://linuxmafia.com/pipermail/sf-lug/2021q4.txt.gz
=========================================
aaronco36 at sdf.org