[conspire] Mailing list servers and the spam problem

Tue Feb 24 14:45:39 PST 2015

Quoting Scott DuBois (rhcom.linux at gmail.com):

> The reality being I don't have a place that is donating server time so I'm
> paying $5/mo out of my _own_ pocket just to have the VPS at DigitalOcean which
> also serves my uses as a SOCKS5. So outside of personal accomplishment and port
> forwarding anonymity, there isn't a lot of motivation to press on with building
> a mailing list that will only produce a level of "spam recepticle" headache I
> _don't_ need. This isn't to say "no" mailing list, just not on a VPS.

My home server in Menlo Park, the SVLUG mailing list server in the
Via.net colo in Palo Alto, and the BALUG mailing list server at
Dreamhost all _receive_ lots of spam delivery attempts (as shown in the
MTA logs), but of course a massive percentage of that gets refused as
being obviously spammy or not addressed to a valid local user.  So, that
much of it is essentially network noise that you never even notice
unless you go plowing through logfiles.

Of the remaining percentage, some of it's addressed to either a valid
Mailman posting address or to one of Mailman's administrative addresses
(like conspire-owner at linuxmafia.com).  Spam aimed at one of the posting
addresses _always_ gets caught in the admin queue, because spammers do
not go out of their way to provide the forged 'From:' header of a valid
subscriber.  So, every bit of that spam lodges in the queues, and none
goes out to subscribers.
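
As a rough sketch of that rule (this is not Mailman's actual code; the
function name route_post and the sample roster are invented for
illustration), a post is distributed only when its 'From:' address
belongs to a subscriber, and is otherwise held for the listadmin:

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical subscriber roster; a real list keeps this in Mailman's own data store.
SUBSCRIBERS = {"alice@example.org", "bob@example.net"}

def route_post(raw_message, subscribers=SUBSCRIBERS):
    """Return 'deliver' if the From: address is a subscriber, else 'hold'."""
    msg = message_from_string(raw_message)
    # parseaddr splits "Name <addr>" into (name, addr); addr is '' if unparseable.
    _, addr = parseaddr(msg.get("From", ""))
    return "deliver" if addr.lower() in subscribers else "hold"
```

A spammer who doesn't know (or bother to forge) a subscriber's address
always lands in the 'hold' branch, which is the admin queue.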

One of my occasional worries over the years has been the possibility
that spammers might change that -- that they might either start
believably forging subscriber 'From:' headers _or_ create spambots that
navigate the Mailman 'join' process before posting.  This has never
happened.

I offer two theories.  (Pick any n.)  One is that spammers are so
fixated on volume, volume, volume that it's just not worth their coding
time to do anything complicated.  Easier to go for low-hanging fruit and
the greater-sucker theory.  The other is that a script navigating the
three-way handshake of Mailman's 'join' operation makes a spambot much
easier to find, and spammers just don't want to have war declared
against them by an army of boiling-mad, well-informed listadmins
who are out for blood.

The spam addressed to Mailman's administrative addresses
(like conspire-owner at linuxmafia.com) doesn't lodge in admin queues, but
rather goes unimpeded out to those addresses' owners (such as me) unless
detected and rejected by _my_ MTA (which is pretty good at that).

The consequence is that *I* get a certain amount of spam as a result of
my administrative role for many of the Bay Area's technical mailing
lists.  And I'm aware of, but don't have to encounter closely, about a
dozen spams at a time lodged in a typical Mailman admin queue that
haven't yet aged out.  The other people, the subscribers, do not
receive, or otherwise become aware of, _any_ spam.

You as a subscriber will have seen that latter bit for yourself.  So,
that's where I'm trying to set your expectations.  The _listadmins_ 
of a Mailman installation get some spam and awareness of other spam 
aging out of admin queues.  Everyone else, no spam.

The main reason for attempting to do a better-than-GMail job of antispam
on the machine's MTA is to reduce annoyance value to the _listadmins_.
(In my experience, Linux distros' MTA packages default to no real spam
rejection at all, as if it's still 1992.)

If your project with Postfix hasn't gotten very far, maybe you should
consider scrapping that approach and starting over with the following
MTA-related Debian packages:

exim4-daemon-heavy
spamassassin
sa-exim
spf-tools-perl
libmail-spf-perl 

Those are the ones listed in the requirements of J.P. Boggis's page,
http://www.jcdigita.com/eximconfig/ (with the last three being optional
and used for vetting arriving mail's envelope headers against SPF DNS
records).

The advantage of J.P. Boggis's approach is that he provides a canned
tarball of well-thought-out configuration files for Exim4, that
implement many antispam ideas without you needing to do any work.
You just follow the README's instructions, and it Just Works[tm] to
give you not just a Linux MTA but one that does smart antispam right out
of the gate.

_If I recall correctly_ (and, like an idiot, I've failed to keep good
notes), all I've really done aside from that is to adjust some of the
whitelists, rejigger some spamassassin weightings and adjust some of its
tests, occasionally feed spamd's Bayesian classifier with spam and with
'ham' (non-spam), and make that one change I mentioned to Boggis's
SPF-checking rule to _avoid_ attempting to validate the 'From:' header.

Here is what I wrote to Jonathan (J.P.) Boggis about that:

---<snip>---

I found that the best remedy was to disable "spf_from_acl" in
/etc/exim4/eximconfig/config/spf.conf .  This bit:

  #spf_from_acl:
  #
  #    # Check header From:
  #    warn     set acl_m8  = ${address:$h_From:}
  #    deny     !acl        = spf_check
  #    warn     message     = Received-SPF-From: $acl_m8 ($acl_m7)
  #    accept

I've left the envelope-sender check, etc., intact:

  spf_rcpt_acl:

      # Check envelope sender
      warn     set acl_m8  = $sender_address
      deny     !acl        = spf_check
      warn     message     = Received-SPF: $acl_m8 ($acl_m7)
      accept

  spf_check:

      warn     set acl_m2  = ${readsocket{/tmp/spfd}\
                             {ip=$sender_host_address\n\
                           helo=${if def:sender_helo_name\
                           {$sender_helo_name}{NOHELO}}\
                           \nsender=$acl_m8\n\n}{20s}{\n}{socket failure}}

      # Defer on socket error

      defer    condition   = ${if eq{$acl_m2}{socket failure}{yes}{no}}
               message     = Cannot connect to spfd

      # Prepare answer and get results

      warn     set acl_m2  = ${sg{$acl_m2}{\N=(.*)\n\N}{=\"\$1\" }}
               set acl_m8  = ${extract{result}{$acl_m2}{$value}{unknown}}
               set acl_m7  = ${extract{header_comment}{$acl_m2}{$value}{}}

      # Check for fail

      deny     condition   = ${if eq{$acl_m8}{fail}{yes}{no}}
               message     = ${extract{smtp_comment}{$acl_m2}{$value}{}}
               log_message = Not authorized by SPF

      accept

Unless I'm missing something, that really _is_ the only logical
solution, since SPF by its creator's definition and intent _is_ intended
to validate "From " and _not_ "From:".

I really would urge that you consider disabling "spf_from_acl"
in post-2.2 versions.

---<snip>---
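
To illustrate why the 'From:' check misfires, here is a toy sketch.  It
is not a real SPF implementation (real SPF policy lives in DNS TXT
records); the SPF_TABLE contents, addresses, and the spf_result helper
are all invented for the example.  It models one common case, a list
post relayed by a list server: the envelope sender is the list's own
bounce address, while the 'From:' header still names the original
poster's domain.

```python
# Toy stand-in for SPF DNS records: domain -> IPs allowed to send for it.
# These values are made up; a real check would query DNS.
SPF_TABLE = {
    "lists.example.org": {"192.0.2.10"},
    "poster.example.com": {"198.51.100.7"},
}

def spf_result(address, client_ip, table=SPF_TABLE):
    """Return 'pass', 'fail', or 'none' for the domain part of address."""
    domain = address.rpartition("@")[2].lower()
    allowed = table.get(domain)
    if allowed is None:
        return "none"  # domain publishes no policy in this toy table
    return "pass" if client_ip in allowed else "fail"

# A list post as relayed by the (hypothetical) list server at 192.0.2.10:
client_ip = "192.0.2.10"
envelope_sender = "somelist-bounces@lists.example.org"
header_from = "someone@poster.example.com"

envelope_check = spf_result(envelope_sender, client_ip)  # what SPF is meant to test
header_check = spf_result(header_from, client_ip)        # the check I disabled
```

Under those assumptions the envelope-sender check passes and the
'From:' check fails, even though the mail is perfectly legitimate.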

> As I understand it, that $5/mo doesn't include traffic rates which
> will add to the cost should the thing go "live" and start receiving
> spam that would have to be dealt with.

Here's the thing:  Every IP that's live on the Internet has had SMTP
delivery of spam _attempted_ against it.  Every.  IP.  That's because
the spammers are all about volume, volume, volume.  There's very little
intelligence ever applied to anything.  They don't need to be smart:
They have stolen firepower (mostly, virus-compromised Windows desktop
boxes around the world).  They can just blast spam at every findable
public IP, all the time.

It's _possible_ that a bit more spam starts arriving once you have been
operating an MTA on port 25/tcp for a while -- but I frankly doubt that
makes much difference.

The one thing that _would_ make a difference is if you were reckless
enough to operate an open SMTP relay.  Those get discovered and crushed
with traffic, because they're a goldmine for spammers, a good-reputation
IP that can be freely used as a spam reflector and amplifier at someone
else's expense.
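
The rule a non-open relay enforces can be sketched roughly as follows.
This is an illustration only, not any MTA's actual logic; the names
accept_rcpt, LOCAL_DOMAINS, and TRUSTED_NETS and their values are
assumptions made up for the example.

```python
import ipaddress

# Hypothetical settings: domains this server accepts mail for, and
# networks whose hosts are allowed to relay through it to anywhere.
LOCAL_DOMAINS = {"example.org"}
TRUSTED_NETS = [ipaddress.ip_network("127.0.0.0/8")]

def accept_rcpt(client_ip, rcpt_address,
                local_domains=LOCAL_DOMAINS, trusted_nets=TRUSTED_NETS):
    """Accept a recipient only if it is local or the client may relay."""
    domain = rcpt_address.rpartition("@")[2].lower()
    if domain in local_domains:
        return True  # inbound mail for one of our own domains
    ip = ipaddress.ip_address(client_ip)
    # Mail for someone else's domain: relay only for trusted clients.
    return any(ip in net for net in trusted_nets)
```

An open relay is what you get when the final test is effectively
"return True" for everybody.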

Fortunately, it's not 1997 any more, and no Linux MTA comes anywhere
near installing by default as an open relay.  You would have to go way
out of your way to shoot yourself in the foot in that fashion.  (I'm
sure Linux users manage it, even now, but it takes some work.)

The other source of additional cost would be backscatter spam, which
gets generated if you're stupid and attempt to reject spam after SMTP
receipt.  Which, as I've said very clearly, is a case of Don't Do That,
Then.  (Backscatter spam is not only bad for the not-clearly-thinking
MTA admin, but also for the recipients of that spam, who are
overwhelmingly innocent third parties.  Remember, the 'From:' header of
spam is invariably a forgery.  What happens if some yoyo attempts to
reject after receipt spam that falsely claims, say, _you_ in the 'From:'
header?  That's right, _you_ receive new spam in the form of the stupid
MTA operator's reject message.)

That is one of the numerous reasons why rejecting mail after SMTP
receipt is a stupid, evil idea, and is behind why C-R antispam systems
are the Worst Antispam Idea Ever.
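
The difference between the two behaviours can be modelled in a few
lines.  This is a simulation, not MTA code; handle_spam, its mode
strings, and the reply texts are invented for illustration.

```python
def handle_spam(forged_sender, mode):
    """Return (smtp_reply, outbound_messages) for a message judged to be spam.

    mode is 'reject_during_smtp' or 'bounce_after_accept' (made-up labels).
    """
    if mode == "reject_during_smtp":
        # Refuse inside the SMTP session: the connecting host gets a 5xx
        # reply, and this server generates no mail of its own.
        return "550 rejected as spam", []
    if mode == "bounce_after_accept":
        # Accept first, then mail a bounce to the claimed (forged) sender.
        bounce = {"to": forged_sender, "subject": "Undelivered mail"}
        return "250 OK", [bounce]
    raise ValueError("unknown mode: " + mode)
```

In the first mode nothing new leaves the server; in the second, the
innocent party whose address was forged gets the bounce.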

[EBLUG Google Group:]

> Well, I deployed that as it was "quick and dirty" which I figured was
> better than nothing.

No question.  In fact, I think I suggested it, too.