Mailman/Spam fighting

From Wikitech

SpamAssassin

SpamAssassin is installed from the default Debian spamassassin package; see the spamassassin puppet class. The current configuration looks like:

class { 'spamassassin':
    required_score   => '4.0',
    use_bayes        => '0',
    bayes_auto_learn => '0',
    trusted_networks => $trusted_networks,
}

which means that only messages scoring 4.0 or higher will be tagged with the various X-Spam headers.

Fighting spam in mailman

All list mail is sent through spamassassin which adds headers with a spam status and score.

This does not mean, though, that spamassassin automatically discards every message it flags. Only messages with a very high spam score (>= 6) are discarded immediately; below that, messages are just tagged with a header. What action to take based on this information is left to the list admins.
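The two thresholds can be summarised in a few lines of Python. This is only an illustrative sketch: the function name and the returned labels are hypothetical, and it assumes both thresholds are inclusive.

```python
# Thresholds taken from the text: required_score is 4.0, and messages
# scoring 6 or more are discarded. Inclusive comparison is an assumption.
TAG_THRESHOLD = 4.0
DISCARD_THRESHOLD = 6.0


def triage(score):
    """Return a (hypothetical) label for what happens to a message."""
    if score >= DISCARD_THRESHOLD:
        return "discarded"
    if score >= TAG_THRESHOLD:
        return "tagged as spam, left to list admins"
    return "not tagged as spam"


for score in (1.2, 4.0, 5.9, 6.0):
    print(score, "->", triage(score))
```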

Spam from non-members

Most spam comes from generic spammers who harvest potential email addresses from the web and mail them all: such spammers don't bother subscribing to the list. In contrast, it's rather rare for a public list to need to receive emails from non-members.

Multiple list admins report that they greatly reduced the amount of spam received simply by no longer acknowledging such messages, because the spammer then cannot tell that the address is real.

This configuration is in Privacy options > Sender filter > generic_nonmember_action, which you can set to "Discard". You can also reach this configuration at https://lists.wikimedia.org/mailman/admin/YOURLIST/?VARHELP=privacy/sender/hold_these_nonmembers .

Do NOT discard emails from non-members if there is any chance that real people need to write to your list's subscribers without subscribing (typical examples: private committee lists, feedback lists). In that case, just avoid "bounce" and hold the messages for moderation; if handling the moderated messages becomes impossible, try to stop acknowledging the receipt of messages held for moderation.

Spam scores

The Mailman UI supports filtering on spam scores via the configuration variable header_filter_rules, also known as 'Spam Filter Regexp' (description: Filter rules to match against the headers of a message.). See also https://www.gnu.org/software/mailman/mailman-admin/sender-filters.html

This can be found in the administrative interface under Privacy options > Spam filters > Spam Filter Regexp (or visit the URL directly, replacing YOURLIST with your list name: https://lists.wikimedia.org/mailman/admin/YOURLIST/?VARHELP=privacy/spam/header_filter_rules ).

You can filter X-Spam-Score, which on lists.wikimedia.org as of 2017 is indicated with pluses (+) instead of stars (*):

X-Spam-Score:[^+]*[+]{4,}

This rule matches when the X-Spam-Score header is followed by 4 or more consecutive pluses, meaning a score of 4 or more. With the configuration above, SpamAssassin considers messages scoring 4 points or more to be spam. See some examples of real-life SpamAssassin scores (X-Spam-Report).
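To check how the rule behaves before saving it, the same regular expression can be tried with Python's re module. This is a minimal sketch: the sample header lines are made up for illustration, and the case-insensitive search is an assumption about how Mailman applies the rule.

```python
import re

# The rule from the text, compiled as-is. IGNORECASE is an assumption.
SCORE_RULE = re.compile(r"X-Spam-Score:[^+]*[+]{4,}", re.IGNORECASE)


def is_flagged(header_line):
    """Return True if a single header line matches the score rule."""
    return SCORE_RULE.search(header_line) is not None


# Hypothetical header lines, one plus per point of spam score.
samples = (
    "X-Spam-Score: 2.1 (++)",
    "X-Spam-Score: 4.3 (++++)",
    "X-Spam-Score: 7.9 (+++++++)",
)
for line in samples:
    print(line, "->", "match" if is_flagged(line) else "no match")
```

Because a run of more than four pluses also contains four of them, higher scores match too; changing {4,} to {5,} would raise the threshold to 5.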

Additionally, you have to pick an action for messages that match the rule, which can be one of:

  • Defer
  • Hold
  • Reject
  • Discard
  • Accept

If you use "Hold", you will have to check the moderation queue manually every once in a while, but you avoid possible false positives. If you use "Discard", you don't have to do anything, but you _might_ discard some ham without noticing, though that is unlikely with the scores proposed above.

If spam is rampant, you could try adding other, more specific blacklists. For instance, if you find that a good portion of the spam is listed in BRBL or in the MailSpike whitelist, you could add a rule after the score-based one, like:

(RCVD_IN_BRBL_LASTEXT|RCVD_IN_MSPIKE_H2|RCVD_IN_MSPIKE_H1)

and set it to "Hold" messages. Do not discard messages based on hand-made filters like this, because they may match good messages too. Instead, use "Hold" and check the moderation queue over the following days to ensure there are no false positives. Note, however, that "Hold" seems to override other "Discard" filters, so such a filter may prevent the discarding of messages from non-members etc. if you have such discards configured.
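Before adding such a rule, it can help to see which test names it would pick up from a message's headers. The sketch below is illustrative only: the X-Spam-Status line is a made-up example, and whether the test names appear in that header or in another one on a given server is an assumption.

```python
import re

# The alternation from the text, compiled as-is.
BLACKLIST_RULE = re.compile(
    r"(RCVD_IN_BRBL_LASTEXT|RCVD_IN_MSPIKE_H2|RCVD_IN_MSPIKE_H1)",
    re.IGNORECASE,
)


def matched_tests(headers):
    """Return the test names from the rule that occur in the header text."""
    return BLACKLIST_RULE.findall(headers)


# Hypothetical header text; real headers may be laid out differently.
sample = (
    "X-Spam-Status: Yes, score=4.8 required=4.0 "
    "tests=HTML_MESSAGE,RCVD_IN_BRBL_LASTEXT,RCVD_IN_MSPIKE_H2"
)
print(matched_tests(sample))
```

An empty list means the rule would not fire for that message.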

DMARC Compatibility

DMARC is an email authentication mechanism intended to combat phishing and spam. To quote from dmarc.org: "DMARC, which stands for “Domain-based Message Authentication, Reporting & Conformance”, is an email authentication policy, and reporting protocol. It builds on the widely deployed SPF and DKIM protocols, adding linkage to the author (“From:”) domain name, published policies for recipient handling of authentication failures, and reporting from receivers to senders, to improve and monitor protection of the domain from fraudulent email."

Several large mail providers (Yahoo, AOL, etc.) have deployed strict DMARC policies. Unfortunately, these policies interfere with the operation of mailing list software like Mailman. This is because Mailman modifies the email contents (causing DKIM failure) and re-mails messages using From: addresses that belong to remote mail systems (causing SPF failure). This results in lost email from list users whose provider sets a strict DMARC policy.

To work around this issue, beginning August 1, 2017, messages whose original From: domain publishes a DMARC policy of p=reject or p=quarantine will:

  • Have the From: header rewritten ("munged") to the poster's name 'via the list' <list_address>, and
  • Have the poster's address merged into the Reply-To: header.

To reiterate, the From: address is rewritten only when a user's email provider enforces a strict DMARC policy.
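The rewriting described above can be sketched with Python's standard email.utils helpers. This is not Mailman's own code: the function name, the dictionary representation of the headers, the list name and the addresses are all hypothetical, and the DMARC policy is passed in as a plain string instead of being looked up in DNS.

```python
from email.utils import formataddr, parseaddr

# Policies that trigger rewriting, per the text.
STRICT_POLICIES = {"reject", "quarantine"}


def munge_from(headers, list_name, list_address, dmarc_policy):
    """Return a copy of headers with From: and Reply-To: rewritten
    when the poster's domain publishes a strict DMARC policy."""
    munged = dict(headers)
    if dmarc_policy not in STRICT_POLICIES:
        return munged  # other policies: leave the message untouched

    name, address = parseaddr(headers["From"])
    display = name or address.split("@")[0]

    # From: becomes "<poster> via <list>" at the list's own address.
    munged["From"] = formataddr((f"{display} via {list_name}", list_address))

    # Merge the poster's address into any existing Reply-To: value.
    poster = formataddr((name, address))
    existing = headers.get("Reply-To")
    munged["Reply-To"] = f"{existing}, {poster}" if existing else poster
    return munged


original = {"From": "Alice Example <alice@yahoo.example>", "Subject": "Hi"}
print(munge_from(original, "wikitech-l", "wikitech-l@lists.example.org", "reject"))
```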

While rewriting the From: header may not be ideal, it is a necessary default setting to ensure the reliability of the mailing list system. Without it, messages may be lost silently or bounce back to the poster with cryptic DSN error messages.

This is implemented in mailman like so:

# /etc/mailman/mm_cfg.py
# 1 selects the "Munge From" action
DEFAULT_DMARC_MODERATION_ACTION = 1

List administrators may also choose a different DMARC moderation action on a per-list basis by navigating to Privacy options > Sender filters. The possible DMARC moderation actions include:

  • Munge From (default) - Rewrite the From: and Reply-To:
  • Wrap Message - Wrap the message as a message/rfc822 sub-part in a MIME format outer message with From: and Reply-To: as above.
  • Reject - Reject the post 
  • Discard - Discard the post
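To make the "Wrap Message" option more concrete, here is a rough sketch using Python's standard email package. It is not Mailman's implementation: the function name, the multipart/mixed outer container, the list name and the addresses are assumptions made for illustration.

```python
from email import message_from_string
from email.mime.message import MIMEMessage
from email.mime.multipart import MIMEMultipart
from email.utils import formataddr, parseaddr


def wrap_message(original_text, list_name, list_address):
    """Wrap a raw message as a message/rfc822 sub-part of a new outer
    message whose From: is the list and Reply-To: is the poster."""
    original = message_from_string(original_text)
    name, address = parseaddr(original["From"])
    display = name or address.split("@")[0]

    outer = MIMEMultipart("mixed")
    outer["From"] = formataddr((f"{display} via {list_name}", list_address))
    outer["Reply-To"] = formataddr((name, address))
    outer["Subject"] = original.get("Subject", "")

    # The original message travels untouched inside the outer one.
    outer.attach(MIMEMessage(original))
    return outer


raw = "From: Alice Example <alice@yahoo.example>\nSubject: Hello\n\nBody text\n"
wrapped = wrap_message(raw, "wikitech-l", "wikitech-l@lists.example.org")
print(wrapped.as_string())
```

Since the inner message is carried unmodified, its original headers and body stay intact, while receivers evaluate the outer From: address.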