Mail
Revision as of 14:05, 23 May 2014, by Matěj Grabovský (minor: tocright, links, typo, wording). Previous revision: 08:17, 23 June 2012, by Krinkle.
Contents
1 Overview of Wikimedia Mail
1.1 Product mail
1.2 Lists
1.3 Foundation mail
2 HowTo
2.1 Client setup for end-users
2.2 Modify aliases
2.3 IMAP account management
2.3.1 Adding a new account
2.3.2 Change an existing (or forgotten) email password
2.3.3 Removing or disabling an account
2.3.4 Changing mail quotas
2.3.5 Setting up server side filtering
2.3.6 Changing the forward of the user from our server to Google Apps
2.4 Setting up a Vacation autoreply
2.4.1 Removing a Vacation autoreply
2.5 Adding / removing OTRS queues and mail addresses
2.6 Adding / removing mail domains
2.7 Searching the logs
3 Design decisions
3.1 Software
3.2 Formats used
3.3 Mailbox storage and mail delivery
3.4 Authentication
3.5 Layout
4 Configuration details
4.1 Account database
4.1.1 Schema
4.2 Mail relay
4.2.1 Resource limits
4.2.2 Aliases
4.2.3 LDAP accounts and aliases
4.2.4 IMAP mail
4.2.5 RT
4.2.6 OTRS
4.2.7 SpamAssassin
4.2.8 System filter
4.2.9 Mailing lists
4.2.10 Wiki mail
4.2.11 Postmaster
4.2.12 Internal address rewriting
4.3 Secondary mail relay
4.3.1 Relay domains
4.4 IMAP server
4.4.1 TLS support
4.4.2 Local mail submissions
4.4.3 User filters
4.4.4 IMAP delivery
4.4.5 User left
4.4.6 Vacation auto-reply
4.4.7 Smart host
4.4.8 SMTP authentication
4.4.9 Dovecot deliver
4.4.10 User database syncing
4.4.11 Backups
4.4.12 Mail box cleanup
5 See also
6 External documentation
Overview of Wikimedia Mail
This section provides a high-level overview of how mail at Wikimedia is treated. More detail on each of these topics appears in the rest of this page.
Product mail
Mail that is produced by MediaWiki. Please fill in this section.
Lists
All mail to @lists.wikimedia.org is handled by Mailman running on http://lists.wikimedia.org/. Public archived lists, private archived lists, and private unarchived lists are located there. Some lists are also synchronized to gmane.org.
Foundation mail
Mail for employees etc. All mail to @wikimedia.org domains is delivered to mchenry.wikimedia.org (the MX record for wikimedia.org). From there, a few things happen:
* The recipient is checked against aliases in /etc/exim4/aliases/wikimedia.org. These are mostly ops-related (e.g., noc@) or older aliases. New aliases for ops should go here; new aliases for the rest of the organization should be created in Google.
* The recipient is checked against LDAP. LDAP contains a few different address types:
** most employees' mail is forwarded to user accounts at Google;
** newer aliases are forwarded to Google Groups;
** some legacy mailing lists are forwarded to @lists.wikimedia.org;
** some employees are identified as IMAP accounts and their mail is forwarded to sanger.
* There's a SQLite database that contains filters and addresses to forward to sanger.
** Use wmfmailadmin on sanger to modify these filters and such; it's rsynced back to mchenry.
* Mail for Request Tracker is forwarded to streber.
HowTo
This section lists some commonly needed actions and how to perform them.
Client setup for end-users
You can use any mail client supporting the IMAP protocol in its secure form (IMAPS), for example Thunderbird.
Select IMAP(S), enable SSL, and use port 993.
Enter mail.wikimedia.org as your server. Your user name should be formatted as: ?
Modify aliases
Right now mchenry is the mail relay, which also does all ops-related aliasing and some older aliases for the rest of the foundation. All domains use separate alias files. Each domain has its own alias file in /etc/exim4/aliases/.
New ops-oriented aliases (e.g., noc@, etc.) should be created in mchenry's alias file. New requests for general foundation-related aliases should be redirected to OIT and be created as a Google Group.
To add/modify/remove an alias, simply edit the corresponding text file on mchenry, e.g., /etc/exim4/aliases/wikimedia.org. No additional steps are necessary.
To use the same aliases for multiple domains you can use symbolic links; however, be careful because unqualified targets (i.e., mail addresses without a domain, like noc) that are not listed in the same alias file (for example, OTRS queues) may not work, as they do not exist in the symbolically linked domain. Use fully qualified addresses in that case.
IMAP account management
IMAP accounts are managed through the use of wmfmailadmin on sanger. wmfmailadmin modifies the SQLite database /var/vmaildb/user.db, which is rsynced to mchenry every 15 minutes. This means that it can take up to 15 minutes before a change becomes visible on mchenry!
wmfmailadmin has actions and fields. Actions can be create account (-c), update account (-u), delete account (-d), list account(s) (-l) or show field (-s fieldname). Only one such action can be given in a single command. Each action may require one or more fields, either to select an account for modification/removal, or to update that field of a certain account. From the help screen (-h):
Fields:
  -a <...>  --active    Active
  -e <...>  --email     E-mail address
  -f <...>  --filter    Filter file or '-' for stdin, or 'None'
  -i <...>  --id        Id
  -q <...>  --quota     Quota
  -p <...>  --password  Password hash or '-' for prompting
  -r <...>  --realname  Real name
Adding a new account
# wmfmailadmin -c -e localpart@wikimedia.org -r "real name"
for example:
# wmfmailadmin -c -e jdoe@wikimedia.org -r "John Doe"
It will prompt for a password; enter it in plaintext. Or, use dovecotpw -s sha1 to generate a hashed password, and do:
# wmfmailadmin -c -e localpart@wikimedia.org -r "real name" -p 'hash'
That way, the user doesn't need to tell you their actual password.
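For reference, the {SHA1} scheme produced by dovecotpw -s sha1 is simply the base64-encoded SHA-1 digest of the password, prefixed with the scheme name. A minimal Python sketch of generating and checking such a hash (the function names are ours, not part of any Wikimedia tooling):

```python
import base64
import hashlib

def sha1_hash(password):
    """Equivalent of `dovecotpw -s sha1`: the scheme tag {SHA1}
    followed by the base64-encoded SHA-1 digest of the password."""
    digest = hashlib.sha1(password.encode("utf-8")).digest()
    return "{SHA1}" + base64.b64encode(digest).decode("ascii")

def verify(password, stored):
    # Re-hash the candidate password and compare with the stored hash.
    return sha1_hash(password) == stored

h = sha1_hash("hunter2")
print(verify("hunter2", h))  # True
print(verify("wrong", h))    # False
```

Note that this scheme is unsalted; see the SMD5/SSHA discussion under "Authentication" below for the salted variants actually preferred for stored passwords.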
Make sure the user doesn't have a forwarding alias on mchenry (/etc/exim4/aliases/wikimedia.org), otherwise nothing will go to the IMAP account. But wait until the user.db file has synced to mchenry before removing the alias, or the account will be temporarily broken!
You can check whether things are working correctly by running the command
$ exim -bt localpart@wikimedia.org
on either sanger or mchenry. Exim will tell you how it will handle the specified mail address on that system. mchenry should forward it to sanger, where it should be delivered locally.
Change an existing (or forgotten) email password
# wmfmailadmin -u -e localpart@wikimedia.org -p ''
(that's two single quotes, with no space between)
Note: don't pass the actual password to -p; just hit Enter and you will be prompted for it.
Removing or disabling an account
If you want to remove an account entirely, use:
# wmfmailadmin -d -e localpart@wikimedia.org
or:
# wmfmailadmin -d -i account id
When an account is removed entirely, its corresponding mailbox will eventually be moved automatically to a backup location, /var/vmail/.backup/, with a timestamp attached (so the same account can be removed multiple times). This is handled by a daily cron job, /etc/cron.daily/mailbox-cleanup, which uses the script /usr/local/sbin/mbcleanup.
However, if you want to merely disable an account, you should set it inactive:
# wmfmailadmin -u -e localpart@wikimedia.org -a 0
If people mailing this (former) address should receive a "user has left" message, the account should be disabled (not removed!), and a customized autoreply message file should be placed in /etc/exim4/userleft/domain/localpart. Don't create a new file from scratch; use /etc/exim4/userleft/TEMPLATE as an example.
Changing mail quotas
All accounts have a 1 GB quota by default. Once an account crosses this limit, a message will be placed in the INBOX, informing the user that no more deliveries will be done and that cleaning up is necessary. While the account is over quota, messages will be held on the Exim mail queue until they time out (after 4 days). If the user cleans up in time, no mail will be lost.
The quota for an account can be changed using:
# wmfmailadmin -u -e localpart@wikimedia.org -q value
The value should be given in kilobytes; 0 means no quota limit.
Setting up server side filtering
The mail system supports server-side filtering using per-user filters. This is not really meant for end-users at the moment, as there is no convenient interface for them to edit and install the filter files. However, with the help of a system administrator this can be done using the following procedure.
Write an Exim filter to do server-side mail filtering. Make sure that it starts with the line
# Exim filter
on top. Test the filter before installing it, using:
$ exim -bf /path/to/filter/file < /dev/null
or, with an example message and more debugging info:
$ exim -d -bf /path/to/filter/file < some_mail_message
If the filter works as expected and contains no syntax errors, it can be loaded into the account database using:
# wmfmailadmin -u -e localpart@wikimedia.org -f /path/to/filter/file
or
# wmfmailadmin -u -e localpart@wikimedia.org -f -
for loading from stdin.
By default a standard filter is used that delivers messages classified as spam by SpamAssassin to the Junk subfolder. This default filter is located in /etc/exim4/default_user_filter. When you install a custom filter, make sure that the contents of this default filter are included, or Junk mail deliveries will no longer happen.
User filtering can be disabled entirely by installing an empty filter:
# wmfmailadmin -u -e localpart@wikimedia.org -f
The custom user filter can also be removed, so that the default filter will be used again:
# wmfmailadmin -u -e localpart@wikimedia.org -f None
To view the custom filter active for an account, use:
# wmfmailadmin -s filter -e localpart@wikimedia.org
Note that we usually save filters we have added for users in /root on sanger.
Changing the forward of the user from our server to Google Apps
Setting up a Vacation autoreply
Follow the directions for setting up a server-side filter. In the filter file, use the following as a template:
# This is a Vacation auto-responder. Simply replace the Capitalized words with the actual value.
# The username is known in the database as 'localpart'; the rest of the required information can be obtained thusly:
#   Full name = sqlite3 /var/vmaildb/user.db "select realname from account where localpart='USERNAME'"
#   Email     = localpart@DOMAIN (can also be queried).
if personal
   and not error_message
   and not $h_X-Spam-Score matches "\\N\(\\+{3,}\)\\N"
then
    unseen mail to $h_From
        bcc $h_To
        from "REALNAME <EMAIL>"
        subject "Re: $h_subject"
        text "AWAY_MESSAGE"
        return message
        once /tmp/USERNAME.vacation
        once_repeat 7d
endif
# Include below existing filters.
# These can be obtained with the following command:
#   sqlite3 /var/vmaildb/user.db "select filter from account where localpart='USERNAME'"
NOTE: The first line of the filter HAS TO BE "# Exim filter", otherwise the filter will not work and will return errors.
Removing a Vacation autoreply
On sanger, run this if the user has no filters set up beyond the out-of-office:
wmfmailadmin -u -e localpart@wikimedia.org -f None
Otherwise, run:
wmfmailadmin -s filter -e localpart@wikimedia.org > /path/to/localfile
Then remove just the out-of-office bits from the file, keep the other custom filters, and load it back:
wmfmailadmin -u -e localpart@wikimedia.org -f /path/to/localfile
Adding / removing OTRS queues and mail addresses
Just add the queue in OTRS with appropriate mail addresses and be happy. mchenry will automatically see that the queue exists or has disappeared, and no involvement from Wikimedia admins is necessary.
Under some circumstances it's possible that, due to negative caching at the secondary MXes, a new mail address will only start working after up to two hours.
Adding / removing mail domains
Set up DNS MX records with mchenry.wikimedia.org as the primary MX, and lists.wikimedia.org as secondary, and things should already start to work. You'll probably want to add an alias file on the primary mail relay though, or no mail will be accepted.
If you don't want to rely on DNS MX records alone, you can also add the domain to the file /etc/exim4/local_domains on the primary mail relay, and /etc/exim4/relay_domains on the secondary mail relays, but this is not a requirement.
Searching the logs
Exim 4's main log file is /var/log/exim4/mainlog. Using exigrep instead of grep may be helpful, as it combines (scattered) log lines per mail transaction.
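exigrep's main trick is collecting every log line that shares an Exim message ID. A simplified sketch of that idea (the sample log lines and helper function are made up for illustration; real mainlog entries carry more fields):

```python
import re
from collections import defaultdict

# Exim message IDs have the form xxxxxx-yyyyyy-zz.
MSG_ID = re.compile(r"\b\w{6}-\w{6}-\w{2}\b")

def group_transactions(log_lines):
    """Group scattered log lines by Exim message ID, as exigrep does."""
    transactions = defaultdict(list)
    for line in log_lines:
        m = MSG_ID.search(line)
        if m:
            transactions[m.group(0)].append(line)
    return transactions

# Made-up sample in mainlog style:
log = [
    "2014-05-23 14:05:01 1AbCdE-000123-XY <= alice@example.org",
    "2014-05-23 14:05:02 1FgHiJ-000124-ZQ <= bob@example.org",
    "2014-05-23 14:05:03 1AbCdE-000123-XY => noc@wikimedia.org",
    "2014-05-23 14:05:03 1AbCdE-000123-XY Completed",
]
tx = group_transactions(log)
print(len(tx["1AbCdE-000123-XY"]))  # 3
```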
Design decisions
Reliable mail delivery first
Spam filtering and other tricks are needed, but reliable mail delivery for genuine mail should have a higher priority. False positives in mail rejects and incompatibilities should be kept to a minimum.
Black box mail system, no user shell logins
Few users would make good use of shell access anyway. A closed system greatly simplifies network and host security, and allows the use of some (non-critical) non-standardized extensions between software components for greater performance, interoperability and features, because the system doesn't have to support whatever tools shell users might install to access things directly.
IMAP only, no POP3
IMAP has good client support nowadays, and for a large part solves the problem of having multiple clients. Backups can also be done centrally on the server side, and multiple folders with server-side mail filtering can be supported.
Support for mail submission
Through SMTP authentication we can allow our users to submit mail through our mail server, without having to configure a different outgoing mail server for whatever network they are on. Multiple ports/protocols can be supported to work around firewalls.
SSL/TLS access only, no plain-text
Although client support for this is not 100% yet, especially on mobile devices, the risks of using plain-text protocols are too high, especially with users visiting conferences and other locations with insecure wireless networks.
Quota support
Quotas can be set generously, especially for those who need more space, but they should be implemented to protect the system.
Spam and virus filtering
Unfortunately necessary. Whether this should be global or per-user is to be determined.
Multi-domain support
We have many domains, and the mail setup should be able to distinguish between domains where necessary.
Web access
Some form of web-mail would be nice, although not critical at first and can be implemented at later stages.
Backups
At least daily, with snapshots.
Cold failover
Setting up a completely redundant system is probably a bit overkill at this stage, but we should make it easy and quick to set up a new mail system on other hardware in case of major breakage.
Documentation
Although not all aspects of the involved software can be described of course, the specifics of the Wikimedia setup should be properly documented and HOWTOs for commonly needed tasks should be provided.
Software
MTA
Exim : Great flexibility, very configurable, reliable, secure.
IMAP server
Dovecot : Fast, secure, flexible.
Formats used
Maildir
Safe, convenient format, moderately good performance, good software support.
Password and user databases
SQLite : Indexed file format, powerful SQL queries, no full-blown RDBMS needed. Easy maintenance, good software support, replication support. Also easy to change to MySQL/PostgreSQL should that ever be necessary. Supported by both Exim and Dovecot.
Other data lookups
Either flat-file for small lists, or cdb for larger, indexed lookups.
Mailbox storage and mail delivery
Ext3 as file system
ReiserFS may be a bit faster, but Ext3 is more reliable. Make sure directory indexes are enabled.
LVM
For easy resizing, moving of data to other disks, and snapshots for backups.
RAID-1
The new mail servers have hardware RAID controllers; we'll probably use them.
Dovecot's "deliver" as LDA
Though Exim has a good internal Maildir "transport", the use of Dovecot's LDA allows it to use and update the Dovecot specific indexing for greater performance. This actually restricts some Exim flexibility because no appendfile features (quotas) can be used, forcing the use of deliver counterparts. The performance benefits were only marginal anyway, due to Dovecot's use of dnotify, so use Exim's own Maildir delivery.
fcntl() and dot-file locking
Greatest common divisors.
Maildir++ quotas
Standard, reasonably fast.
Authentication
PLAIN authentication
Universally supported for both IMAP and SMTP. Encrypted connections are used exclusively, so no elaborate hashing schemes needed.
SMD5 or SSHA password scheme
Salted hashing.
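The {SSHA} scheme stores base64(sha1(password + salt) + salt), so the salt can be recovered from the stored value for verification. A small sketch, assuming the common 4-byte-salt variant (the function names are illustrative):

```python
import base64
import hashlib
import os

def ssha_hash(password, salt=None):
    """Salted SHA-1 in the common {SSHA} format:
    base64 of sha1(password + salt) followed by the salt itself."""
    if salt is None:
        salt = os.urandom(4)  # 4-byte random salt is the usual choice
    digest = hashlib.sha1(password.encode("utf-8") + salt).digest()
    return "{SSHA}" + base64.b64encode(digest + salt).decode("ascii")

def ssha_verify(password, stored):
    raw = base64.b64decode(stored[len("{SSHA}"):])
    digest, salt = raw[:20], raw[20:]  # SHA-1 digests are 20 bytes
    return hashlib.sha1(password.encode("utf-8") + salt).digest() == digest

h = ssha_hash("hunter2")
print(ssha_verify("hunter2", h))  # True
print(ssha_verify("wrong", h))    # False
```

Because of the random salt, the same password hashes to a different stored value every time, which is the point of salting.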
SMTP authentication through either Exim's Dovecot authenticator, or using direct lookups
Exim 4.64 has support for directly authenticating against Dovecot's authenticator processes, though this version is not in Ubuntu Feisty yet, so needs backporting. If direct lookups from Exim's authenticators are easy enough, use that. Also depends on the security model.
Layout
The mail setup consists of 2 general mail servers, plus a mailing lists server (lily) and an OTRS server. The two general mail servers are mchenry and sanger.
[Diagram: Wikimedia mail setup]
One server (mchenry) acts as relay; it accepts mail connections from outside, checks mail for spam and viruses, applies other policy checks, and then queues and/or forwards it to the appropriate internal mail server. It also accepts mail destined for outside domains from internal servers, including the application servers.
The other server, sanger, is the IMAP server. It accepts mail from mchenry and delivers it to local user mailboxes. Outgoing mail from SMTP-authenticated accounts is also accepted on this server, and forwarded to mchenry, where it's queued and sent out. Web mail and other supportive applications related to user mail accounts and their administration will also run on sanger.
Lily, the mailing lists server, also acts as a secondary MX and forwards non-mailing-list mail to mchenry. In case of downtime of mchenry, it might be able to send partial (IMAP account) mail to sanger directly, depending on the added complexity of the configuration. During major hardware failure of sanger, mchenry (with identical hardware) should be able to be set up as the IMAP server.
Configuration details
Account database
The user & password account database on the IMAP server is stored in a SQLite database. This format is fast and convenient to use, and can easily be moved to MySQL or PostgreSQL should that be necessary later.
Initially, the schema of this database is intentionally kept simple, because simplicity is good. We could extend it with many tables supporting domains, aliases, other data and do a whole lot of joins to make it work, but right now we don't need that. For example, aliases are simply kept in a text file on the primary mail relay, which works well. If we ever need more features, it'll be easy to adapt the schema to the new situation.
Schema
CREATE TABLE account (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    localpart VARCHAR(128) NOT NULL,
    domain    VARCHAR(128) NOT NULL,
    password  VARCHAR(64) NOT NULL,
    quota     INTEGER DEFAULT '0' NOT NULL,
    realname  VARCHAR(64) NULL,
    active    BOOLEAN DEFAULT '1' NOT NULL,
    filter    BLOB NULL,
    UNIQUE (localpart, domain)
);
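To see how this schema behaves, e.g. the column defaults and the UNIQUE (localpart, domain) constraint, one can exercise it against an in-memory SQLite database (the sample account is made up):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
CREATE TABLE account (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    localpart VARCHAR(128) NOT NULL,
    domain    VARCHAR(128) NOT NULL,
    password  VARCHAR(64) NOT NULL,
    quota     INTEGER DEFAULT '0' NOT NULL,
    realname  VARCHAR(64) NULL,
    active    BOOLEAN DEFAULT '1' NOT NULL,
    filter    BLOB NULL,
    UNIQUE (localpart, domain)
)""")

db.execute("INSERT INTO account (localpart, domain, password, realname) "
           "VALUES ('jdoe', 'wikimedia.org', '{SSHA}...', 'John Doe')")

# The kind of existence check the mail relay performs (see "IMAP mail"):
row = db.execute("SELECT quota, active FROM account "
                 "WHERE localpart = ? AND domain = ?",
                 ("jdoe", "wikimedia.org")).fetchone()
print(row)  # (0, 1): quota 0 = no limit, active by default

# The same (localpart, domain) pair cannot be inserted twice:
try:
    db.execute("INSERT INTO account (localpart, domain, password) "
               "VALUES ('jdoe', 'wikimedia.org', 'x')")
except sqlite3.IntegrityError:
    print("duplicate rejected")
```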
Mail relay
The current mail relay is mchenry.
As a mail relay needs to do a lot of DNS lookups, it's a good place for a DNS resolver; therefore mchenry is pmtpa's secondary DNS recursor, although mchenry uses its own resolver as primary.
mchenry uses Exim 4, the standard Ubuntu Feisty exim4 exim4-daemon-heavy package. This package does some stupid things like running under a Debian-exim user, but not enough to warrant running our own modified version. All configuration lives in /etc/exim4, where exim4.conf is Exim's main configuration file.
The following domain and host lists are defined near the top of the configuration file:
# Standard lists
hostlist wikimedia_nets = <; 66.230.200.0/24 ; 145.97.39.128/26 ; 203.212.189.192/26 ; 211.115.107.128/26 ; 2001:610:672::/48
domainlist system_domains = @
domainlist relay_domains =
domainlist legacy_mailman_domains = wikimedia.org : wikipedia.org
domainlist local_domains = +system_domains : +legacy_mailman_domains : lsearch;CONFDIR/local_domains : @mx_primary/ignore=127.0.0.1
system_domains is a list of domains related to the functioning of the local system, e.g. mchenry.wikimedia.org and associated system users. It has little relevance to the rest of the Wikimedia mail setup, but makes sure that mail submitted by local software is handled properly.
relay_domains is a list of domains that are allowed to be relayed through this host.
local_domains is a compound list of all domains that are in some way processed locally. They are not routed using the standard dnslookup router. Besides the domains listed in /etc/exim4/local_domains, mail will also be accepted for any domain which has mchenry (or one of its interface IP addresses) listed in DNS as the primary MX. This could of course be abused by people with control over some arbitrary DNS zone, but since typically no alias file will exist for such a domain, no mail addresses will be accepted in that case anyway.
For content scanning, temporary mbox files are written to /var/spool/exim4/scan, and deleted after scanning. To improve performance somewhat, this directory is mounted as a tmpfs filesystem, using the following line in /etc/fstab:
tmpfs /var/spool/exim4/scan tmpfs defaults 0 0
Resource limits
To behave gracefully under load, some resource limits are applied in the main configuration section:
# Resource control
check_spool_space = 50M
No mail delivery if there's less than 50MB free.
deliver_queue_load_max = 75.0
queue_only_load = 50.0
No mail delivery if system load is > 75, and queue-only (without immediate delivery) when load is > 50.
smtp_accept_max = 100
smtp_accept_max_per_host = ${if match_ip{$sender_host_address}{+wikimedia_nets}{50}{5}}
Accept at most 100 simultaneous SMTP connections, and at most 5 from the same host, unless it's a host from a Wikimedia network, in which case a higher limit (50) applies.
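The ${if match_ip...} expansion above just tests the sender address against the wikimedia_nets list. The same decision, sketched in Python with the networks copied from the hostlist defined earlier (the function name is ours):

```python
import ipaddress

# Same networks as the wikimedia_nets hostlist above.
WIKIMEDIA_NETS = [ipaddress.ip_network(n) for n in (
    "66.230.200.0/24",
    "145.97.39.128/26",
    "203.212.189.192/26",
    "211.115.107.128/26",
    "2001:610:672::/48",
)]

def max_connections_per_host(sender):
    """50 simultaneous connections for Wikimedia hosts, 5 for everyone else."""
    addr = ipaddress.ip_address(sender)
    if any(addr in net for net in WIKIMEDIA_NETS):
        return 50
    return 5

print(max_connections_per_host("66.230.200.10"))  # 50
print(max_connections_per_host("198.51.100.7"))   # 5
```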
smtp_reserve_hosts = <; 127.0.0.1 ; ::1 ; +wikimedia_nets
Reserve SMTP connection slots for our own servers.
smtp_accept_queue_per_connection = 500
If more than 500 mails are sent in one connection, queue them without immediate delivery.
smtp_receive_timeout = 1m
Drop the connection if an SMTP line was not received within a 1 minute timeout.
remote_max_parallel = 25
Invoke at most 25 parallel delivery processes.
smtp_connect_backlog = 32
TCP SYN backlog parameter.
Aliases
Each Wikimedia domain (wikimedia.org, wikipedia.org, wiktionary.org, etc.) is now distinct and has its own aliases file, under /etc/exim4/aliases/. Alias files use the standard format. Unqualified address targets in the alias file (local parts without domain) are qualified to the same domain. Special :fail: and :defer: targets and pipe commands are also supported; see http://www.exim.org/exim-html-4.66/doc/html/spec_html/ch22.html#SECTspecitredli.
The following router takes care of this. It's run for all domains in the +local_domains domain list defined near the top of the Exim configuration file. It checks whether the file /etc/exim4/aliases/$domain exists, and then uses it to do an alias lookup.
# Use alias files /etc/exim4/aliases/$domain for domains like
# wikimedia.org, wikipedia.org, wiktionary.org etc.
aliases:
    driver = redirect
    domains = +local_domains
    require_files = CONFDIR/aliases/$domain
    data = ${lookup{$local_part}lsearch*{CONFDIR/aliases/$domain}}
    qualify_preserve_domain
    allow_fail
    allow_defer
    forbid_file
    include_directory = CONFDIR
    pipe_transport = address_pipe
If the exact address is not found in the alias file, it will do another lookup for the key *, so catchalls can be made for specific domains as well.
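The lsearch* semantics (exact local part first, then the * catchall) can be sketched as follows; the alias data is made up for illustration:

```python
def alias_lookup(aliases, local_part):
    """Mimic Exim's lsearch*: try the exact key first,
    then fall back to the '*' catchall entry, if any."""
    if local_part in aliases:
        return aliases[local_part]
    return aliases.get("*")  # None if there is no catchall either

# Made-up alias file contents for some domain:
aliases = {
    "noc": "ops-team@lists.wikimedia.org",
    "*": "catchall@wikimedia.org",
}
print(alias_lookup(aliases, "noc"))      # ops-team@lists.wikimedia.org
print(alias_lookup(aliases, "unknown"))  # catchall@wikimedia.org
```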
LDAP accounts and aliases
As the office is now putting staff data in LDAP, the mail relay has been configured to use that as the primary mail account database. Two Exim routers have been added for this: one to look up mail accounts (which are forwarded to Google Apps), and one for aliases in LDAP:
# LDAP accounts
ldap_account:
    driver = manualroute
    domains = wikimedia.org
    condition = ${lookup ldap \
        {user="cn=eximagent,ou=other,dc=corp,dc=wikimedia,dc=org" pass=LDAPPASSWORD \
        ldap:///ou=people,dc=corp,dc=wikimedia,dc=org?mail?sub?(&(objectClass=inetOrgPerson)(mail=${quote_ldap:$local_part}@$domain)(x121Address=1))} \
        {true}fail}
    local_part_suffix = +*
    local_part_suffix_optional
    transport = remote_smtp
    route_list = * aspmx.l.google.com
For mail addresses in domain wikimedia.org an LDAP query is done on the default LDAP servers, under the given base DN. Only the mail attribute is returned on successful lookup. The scope is set to sub (so a full subtree search is done), and the filter specifies that only inetOrgPerson objects should match, with the mail attribute matching the exact mail address being tested, and only if the x121Address field exists and is set to 1. If all these conditions are met, the mail is forwarded to Google Apps.
Mail addresses with an optional local part suffix of the form +whatever are accepted just like the corresponding address without the suffix, and are forwarded unmodified.
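The effect of local_part_suffix = +* with local_part_suffix_optional, i.e. looking up the base local part but forwarding the address unmodified, can be sketched as (the account data is made up):

```python
def route_with_suffix(address, accounts):
    """Accept jdoe+whatever if jdoe exists, forwarding the
    address unmodified, as the router above does."""
    local_part, _, domain = address.partition("@")
    base = local_part.split("+", 1)[0]  # strip the optional +suffix
    if (base, domain) in accounts:
        return address                  # forwarded unmodified
    return None                         # router declines

accounts = {("jdoe", "wikimedia.org")}  # made-up account set
print(route_with_suffix("jdoe+lists@wikimedia.org", accounts))
print(route_with_suffix("nobody@wikimedia.org", accounts))
```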
ldap_alias:
    driver = redirect
    domains = wikimedia.org
    data = ${lookup ldap \
        {user="cn=eximagent,ou=other,dc=corp,dc=wikimedia,dc=org" pass=LDAPPASSWORD \
        ldap:///ou=people,dc=corp,dc=wikimedia,dc=org?mail?sub?(&(objectClass=inetOrgPerson)(initials=${quote_ldap:$local_part}@$domain))} \
        {$value}fail}
Aliases are looked up in LDAP as well, using the initials attribute, and are rewritten to their canonical form as returned in the mail attribute.
IMAP mail
Mail destined for IMAP accounts on the IMAP server should be recognized and routed specially by the mail relay. Therefore the mail relay has a local copy of the accounts database, and uses a manualroute to route those mail addresses to the IMAP server:
imap:
  driver = manualroute
  domains = +local_domains
  condition = ${lookup sqlite{USERDB \
    SELECT * FROM account WHERE localpart='${quote_sqlite:$local_part}' AND domain='${quote_sqlite:$domain}'}}
  transport = remote_smtp
  route_list = * sanger.wikimedia.org
=== RT ===
RT is implemented on a separate domain (rt.wikimedia.org) and a separate server (streber). From the domain name, Exim knows to forward to it:
domainlist rt_domains = rt.wikimedia.org
# Send RT mails to the RT server
rt:
  driver = manualroute
  domains = +rt_domains
  route_list = * streber.wikimedia.org byname
  transport = remote_smtp

=== OTRS ===
For OTRS, the mail relay queries the OTRS MySQL servers directly to check the existence of an OTRS mail address. This means that newly created OTRS queues/mail addresses start working immediately, with no involvement from Wikimedia admins needed.
The MySQL servers are specified near the top of the Exim configuration file:
# MySQL lookups (OTRS)
hide mysql_servers = srv7.wikimedia.org/otrs/exim/password : \
                     srv8.wikimedia.org/otrs/exim/password
These servers are queried in turn. If neither of them responds, or they respond with an error, the mail is deferred. A MySQL user account "exim" with (just) SELECT privileges on the system_address table of the otrs database needs to exist, accessible from the mail relay (mchenry.wikimedia.org).
The following router does the actual aliasing of the OTRS address to otrs@ticket.wikimedia.org​, if the OTRS queue address exists in the database:
# Query the OTRS MySQL server(s) for the existence of the queue address
# $local_part@$domain, and alias to otrs@ticket.wikimedia.org if
# successful.
otrs:
  driver = redirect
  domains = +local_domains
  condition = ${lookup mysql{SELECT value0 FROM system_address WHERE value0='${quote_mysql:$local_part@$domain}'}{true}fail}
  data = otrs@ticket.wikimedia.org
In the new OTRS setup, this is not done by rewriting the address, but by delivering the message, with recipients unmodified, directly to the OTRS server with a manualroute router:
otrs:
  driver = manualroute
  domains = +local_domains
  condition = ${lookup mysql{SELECT value0 FROM system_address WHERE value0='${quote_mysql:$local_part@$domain}'}{true}fail}
  route_list = * williams.wikimedia.org byname
  transport = remote_smtp
=== SpamAssassin ===
SpamAssassin is installed using the default Ubuntu spamassassin package. A couple of configuration changes were made.
By default spamd, if enabled, runs as root. To change this, a dedicated spamd system user was created:
# adduser --system --home /var/lock/spamassassin --group --disabled-password --disabled-login spamd
The following settings were modified in /etc/default/spamassassin​:
# Change to one to enable spamd
ENABLED=1
User preferences are disabled, spamd listens on the loopback interface only, and runs as user/group spamd:
OPTIONS="--max-children 5 --nouser-config --listen-ip=127.0.0.1 -u spamd -g spamd"
Run spamd with nice level 10:
# Set nice level of spamd
NICE="--nicelevel 10"
In /etc/spamassassin/local.cf​, the following settings were changed:
trusted_networks 66.230.200.0/24 145.97.39.128/26 203.212.189.192/26 211.115.107.128/26
...so SpamAssassin knows which hosts it can trust.

We also enable the SARE (SpamAssassin Rules Emporium) repository in order to catch more spam (http://saupdates.openprotect.com/).
gpg --keyserver pgp.mit.edu --recv-keys BDE9DC10
gpg --armor -o pub.gpg --export BDE9DC10
sa-update --import pub.gpg
sa-update --allowplugins --gpgkey D1C035168C1EBC08464946DA258CDB3ABDE9DC10 --channel saupdates.openprotect.com
In /etc/cron.daily/spamassassin, the following was changed so that the rules are updated daily:
#sa-update || exit 0
sa-update --allowplugins --gpgkey D1C035168C1EBC08464946DA258CDB3ABDE9DC10 \
  --channel saupdates.openprotect.com --channel updates.spamassassin.org || exit 0
We also want to speed up the SpamAssassin process as much as we can. To that end, it helps to compile all the rules. This is taken care of in the cron.daily file, but re2c is missing, so it needs to be installed:
apt-get install re2c sa-compile
One of the more useful features in spam-fighting is the Bayesian filter. It allows SpamAssassin to detect spam regardless of its rules. However, it needs to be enabled: in /etc/spamassassin/local.cf, the following settings were changed:
use_bayes 1
bayes_auto_learn 1
# This is important: in a virtual environment, omitting this line renders
# the Bayesian filter useless.
bayes_path /etc/spamassassin/bayes/bayes
bayes_ignore_header X-Bogosity
bayes_ignore_header X-Spam-Flag
bayes_ignore_header X-Spam-Status
...so Bayes is able to learn, stores its database in a central location, and benefits everyone.
To improve Bayes' usefulness, we want it to learn from the users what is spam and what is ham. To that end, we are adding two folders to each user's INBOX (on Sanger). '''Note:''' you need to use maildirmake.dovecot to create these directories, and chown them to vmail:vmail.
.INBOX.Bayes_Ham/
.INBOX.Bayes_Spam/
We also need to encourage everybody to use these folders, so we add these two lines at the end of everybody's subscriptions file (/var/vmail/wikimedia.org/USERNAME/subscriptions):
INBOX.Bayes_Ham
INBOX.Bayes_Spam
Of course, once people start filling these new mailboxes, we need to process them. The more interesting part is that the mailboxes are on Sanger while the spam filtering happens on McHenry. We therefore need to move the messages in these folders over to McHenry so that sa-learn can be run against them. Here is a small shell script, meant to be run as user vmail on Sanger and located in /usr/local/bin/GatherBayesData.sh:
#!/bin/bash
cd /var/vmail/wikimedia.org

echo "Archiving users' Bayes mailboxes..."
archive=/tmp/BayesLearning_`date +%Y-%m-%d--%H`.tgz
tar zcvf $archive `find ./ -name '*Bayes_[HS]*' -type d` >/dev/null 2>&1

echo "Transmitting Spam/Ham to McHenry..."
scp $archive vmail@mchenry:. && rm $archive >/dev/null 2>&1

for user in `ls -d */ | sed s/.$//g` ; do
    echo ""
    echo "=== WORKING WITH $user's MAILBOX ==="
    SpamFolder=`find ./$user/ -type d -name '*Bayes_Spam*' -exec basename {} \;`
    HamFolder=`find ./$user/ -type d -name '*Bayes_Ham*' -exec basename {} \;`
    if [ -d "$user/$SpamFolder/" ]; then
        echo "Found Bayes mailbox(es): SPAM: $SpamFolder."
        echo "Purging Spam..."
        # Purge the SPAM folder:
        find $user/$SpamFolder/{new,cur}/ -type f -exec rm -f {} \;
    else
        echo "No SPAM Bayes mailbox(es) found."
        continue
    fi
    if [ -d "$user/$HamFolder/" ]; then
        echo "Found Bayes mailbox(es): HAM: $HamFolder."
        echo "Moving Ham messages back to their original place..."
        find $user/$HamFolder/cur/ -type f -exec mv {} $user/cur/ \;
        find $user/$HamFolder/new/ -type f -exec mv {} $user/new/ \;
    else
        echo "No HAM Bayes mailbox(es) found."
        continue
    fi
done
This script is pretty self-explanatory:
  1. Create an archive of everybody's HAM and SPAM folders.
  2. Send said archive to McHenry for further processing.
  3. Move the HAM messages back into the user's INBOX (while respecting the read/unread status of each message).
  4. Permanently delete the SPAM.
On the receiving end (McHenry) we also have a little bash script that passes messages to sa-learn: /var/vmail/process_bayes.sh
#!/bin/bash
cd /var/vmail
[ "$(ls -A *.tgz 2>/dev/null)" ] || { echo "Nothing to process..." ; exit 0 ; }
for file in `ls *.tgz` ; do
    echo "Processing $file..."
    echo "Creating temp dir: ./tmp_bayes"
    mkdir ./tmp_bayes
    echo "Extracting archive..."
    tar -C ./tmp_bayes -zxf $file
    echo "Analyzing HAM / SPAM for each user..."
    cd tmp_bayes
    for user in `ls -d */ | sed s/.$//g` ; do
        echo ""
        echo "=== WORKING ON $user's MAILBOX ==="
        SpamFolder=`find $user -type d -name '*Bayes_Spam*' -exec basename {} \;`
        HamFolder=`find $user -type d -name '*Bayes_Ham*' -exec basename {} \;`
        echo "Found the following Bayes mailboxes: SPAM: $user/$SpamFolder | HAM: $user/$HamFolder."
        echo "Learning Ham from $user:"
        sa-learn --ham $user/$HamFolder/{cur,new}/*
        echo "Learning Spam from $user:"
        sa-learn --spam $user/$SpamFolder/{cur,new}/*
    done
    echo "Completed analysis of $file."
    cd ..
    rm -rf ./tmp_bayes
    rm -f $file
done
Once again the script is pretty self-explanatory:
  1. Look for an archive in /var/vmail (where Sanger sends the daily archive).
  2. Untar said file in a temp directory.
  3. Crawl every user's directory for Spam and Ham messages.
  4. Send those messages to sa-learn.
  5. Clean up.
Both these scripts are in cron so that they run every night at 10:30 / 11:00PM (PST):
Sanger: /etc/cron.d/Bayes-collector​:
MAILTO=fvassard@wikimedia.org
SHELL=/bin/bash
30 5 * * * vmail /usr/local/bin/GatherBayesData.sh
McHenry: Root's crontab
MAILTO=fvassard@wikimedia.org
0 6 * * * /var/vmail/process_bayes.sh

The default X-Spam-Report headers are very long because they contain a "content preview", which is rather useless in our setup. This can be modified:
# Do not include the useless content preview
clear_report_template
report Spam detection software, running on the system "_HOSTNAME_", has
report identified this incoming email as possible spam.  If you have any
report questions, see _CONTACTADDRESS_ for details.
report
report Content analysis details:   (_SCORE_ points, _REQD_ required)
report
report " pts rule name              description"
report  ---- ---------------------- --------------------------------------------------
report _SUMMARY_
In Exim, SpamAssassin is called from the DATA ACL for domains in domain list spamassassin_domains​. exim4.conf​:
domainlist spamassassin_domains = *
acl_smtp_data = acl_check_data
acl_check_data:
  # Let's trust local senders to not send out spam
  accept hosts = +wikimedia_nets
         set acl_m0 = trusted relay

  # Run through spamassassin
  accept endpass
         acl = spamassassin

spamassassin:
  # Only run through SpamAssassin if requested for this domain and
  # the message is not too large
  accept condition = ${if >{$message_size}{400K}}

  # Add spam headers if score >= 1
  warn spam = nonexistent:true
       condition = ${if >{$spam_score_int}{10}{1}{0}}
       set acl_m0 = $spam_score ($spam_bar)
       set acl_m1 = $spam_report

  # Reject spam at high scores (> 12)
  deny message = This message scored $spam_score spam points.
       spam = nonexistent/defer_ok
       condition = ${if >{$spam_score_int}{120}{1}{0}}

  accept
First, mail for domains not listed in spamassassin_domains is accepted without scanning, as are mails bigger than 400 KB. Then a spam check is done using the local spamd daemons. If that results in a score of at least 1, two ACL variables are set, used for adding X-Spam-Score: and X-Spam-Report: headers later, in the system filter. If the spam score is 12 or higher, the mail is rejected outright.
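The comparison values 10 and 120 follow from how Exim computes $spam_score_int: the SpamAssassin score multiplied by ten. A quick sanity check of that conversion (the score below is an invented example):

```shell
# $spam_score_int is the SpamAssassin score times ten, truncated to an
# integer, so score 1.0 -> 10 and score 12.0 -> 120.
spam_score="4.2"    # invented example score
spam_score_int=$(awk -v s="$spam_score" 'BEGIN { printf "%d", s * 10 }')
echo "$spam_score_int"
```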
=== System filter ===
All mail is run through a so-called "system filter" that can do certain checks on the mail and determine actions. A system filter is run once per message, and applies to all recipients.
The system filter is set using the main configuration option:
system_filter = CONFDIR/system_filter
In our setup the system filter is used to remove any untrusted spam checker headers, and to add our spam headers to the message. The file /etc/exim4/system_filter has the following content:
# Exim filter
if first_delivery then
  if $acl_m0 is not "trusted relay" then
    # Remove any SpamAssassin headers and add local ones
    headers remove X-Spam-Score:X-Spam-Report:X-Spam-Checker-Version:X-Spam-Status:X-Spam-Level
  endif
  if $acl_m0 is not "" and $acl_m0 is not "trusted relay" then
    headers add "X-Spam-Score: $acl_m0"
    headers add "X-Spam-Report: $acl_m1"
  endif
endif
=== Mailing lists ===
Mailing lists now live on a dedicated mailing list server (lily) on a dedicated mail domain, lists.wikimedia.org. However, mail for old addresses such as info-en@wikipedia.org still comes in; it should be rewritten to the new addresses and then forwarded to the mailing list server.
Near the top of the Exim configuration file a domain list is defined, which contains mail domains that can contain these old addresses:
domainlist legacy_mailman_domains = wikimedia.org : wikipedia.org : mail.wikimedia.org : mail.wikipedia.org
The following router, near the end of the routers section, checks if a given local part exists in the file /etc/exim4/legacy_mailing_lists​, and rewrites it to the new address if it does, to be routed via the normal DNS MX/SMTP routers/transports. Since Mailman does not distinguish between domains, only a single local parts file for all legacy Mailman domains exists. This file only needs to contain the mailing list names; all suffixes are handled by the router.
# Alias old mailing list addresses to @lists.wikimedia.org on lily
legacy_mailing_lists:
  driver = redirect
  domains = +legacy_mailman_domains
  data = $local_part$local_part_suffix@lists.wikimedia.org
  local_parts = lsearch;CONFDIR/legacy_mailing_lists
  local_part_suffix = -bounces : -bounces+* : \
                      -confirm+* : -join : -leave : \
                      -owner : -request : -admin : \
                      -subscribe : -unsubscribe
  local_part_suffix_optional
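The effect of the router on an old-style address can be sketched as follows (the list name is just an example): the local part, including any suffix, is preserved and only the domain is swapped:

```shell
# Example rewrite done by the legacy_mailing_lists router: keep the local
# part (including suffixes like -request), swap in the new list domain.
old='wikien-l-request@wikipedia.org'    # example legacy list address
new="${old%@*}@lists.wikimedia.org"
echo "$new"
```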
=== Wiki mail ===
The application servers send out mail for wiki password reminders/changes, and e-mail notification on changes if enabled. These automated mass mailings are also accepted by the mail relay, mchenry, but are treated somewhat separately. To minimize the chance of external mail servers blocking mchenry's regular mail because of mass emails, these "wiki mails" are sent out using a separate IP.
Near the top of the configuration a macro is defined for the IP address to accept incoming wiki mail, and to use for sending it out to the world:
WIKI_INTERFACE=66.230.200.216
A hostlist is defined for the IP ranges that are allowed to relay from:
hostlist relay_from_hosts = <; @[] ; 66.230.200.0/24 ; 10.0.0.0/16
The rest of the configuration file uses the incoming interface address to distinguish wiki mail from regular mail. Therefore care must be taken that external hosts cannot connect using this interface address. An SMTP connect ACL takes care of this:
# Policy control
acl_smtp_connect = acl_check_connect
acl_check_connect:
  # Deny external connections to the internal bulk mail submission
  # interface
  deny condition = ${if match_ip{$interface_address}{WIKI_INTERFACE}{true}{false}}
       ! hosts = +wikimedia_nets

  accept
Wiki mail gets picked up by the first router, selecting on incoming interface address and a specific header inserted by MediaWiki:
# Route mail generated by MediaWiki differently
wiki_mail:
  driver = dnslookup
  domains = ! +local_domains
  condition = ${if and{{match_ip{$interface_address}{WIKI_INTERFACE}}{eqi{$header_X-Mailer:}{MediaWiki mailer}}}}
  errors_to = wiki@wikimedia.org
  transport = bulk_smtp
  ignore_target_hosts = <; 0.0.0.0 ; 127.0.0.0/8 ; 0::0/0 ; 10/8 ; 172.16/12 ; 192.168/16
  no_verify
The router directs mail to a separate SMTP transport, bulk_smtp. no_verify is set because mails from the application servers are not verified anyway, to be as liberal as possible with incoming mails and keep the queues on the application servers small; queue handling should be done on the mail relay. For other mail this router is not applicable, so it is not needed for verification either.
The envelope sender is forced to wiki@wikimedia.org, as it may have been set to something else by sSMTP.
The bulk_smtp transport sets a different outgoing interface IP address, and a separate HELO string:
# Transport for sending out automated bulk (wiki) mail
bulk_smtp:
  driver = smtp
  hosts_avoid_tls = <; 0.0.0.0/0 ; 0::0/0
  interface = WIKI_INTERFACE
  helo_data = wiki-mail.wikimedia.org
Wiki mail also has a shorter retry/bounce time than regular mail; only 8 hours:
begin retry
*   *   senders=wiki@wikimedia.org   F,1h,15m; G,8h,1h,1.5
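The rule reads as: fixed 15-minute retries for the first hour, then geometrically growing intervals (starting at 1h, multiplied by 1.5) until the 8-hour cutoff, after which the mail bounces. A rough sketch of the resulting schedule, using a simplified model of Exim's retry algorithm:

```shell
# Simplified model of the retry rule "F,1h,15m; G,8h,1h,1.5":
# fixed 15m intervals up to 1h, then intervals growing by a factor of 1.5
# (starting at 1h) until the 8h (480 minute) bounce cutoff.
schedule=$(awk 'BEGIN {
  t = 0; out = ""
  while (t + 15 <= 60)  { t += 15; out = out t " " }            # F,1h,15m
  i = 60
  while (t + i <= 480)  { t += i; out = out t " "; i *= 1.5 }   # G,8h,1h,1.5
  sub(/ $/, "", out); print out
}')
echo "retry at minutes: $schedule (bounce after 480)"
```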
=== Postmaster ===
For any local domain, postmaster@ should be accepted even if it's forgotten in alias files. A special redirect router takes care of this:
# Redirect postmaster@$domain if it hasn't been accepted before
postmaster:
  driver = redirect
  domains = +local_domains
  local_parts = postmaster
  data = postmaster@$primary_hostname
  cannot_route_message = Address $local_part@$domain does not exist
=== Internal address rewriting ===
Internal servers in the .pmtpa.wmnet domain sometimes send out mail, which gets rejected by mail servers in the outside world: sender domain address verification cannot resolve the domain .pmtpa.wmnet, so the mail is refused. To solve this, mchenry rewrites the envelope From to root@wikimedia.org for any mail that has a .pmtpa.wmnet sender address:
#################
# Rewrite rules #
#################

begin rewrite

# Rewrite the envelope From for mails from internal servers in *.pmtpa.wmnet,
# as they are usually rejected by sender domain address verification.
*@*.pmtpa.wmnet    root@wikimedia.org    F
=== Secondary mail relay ===
lily is Wikimedia's secondary mail relay. It should do the same policy checks on incoming mail as the primary mail relay, so make sure its ACLs are equivalent for the relevant domains.
Lily does not have a copy/cache of the local parts which are accepted by the primary relay, as that is a dynamic process. Instead, it uses recipient address verification callouts, i.e. it asks the primary mail relay whether a recipient address would be accepted or not. In case the primary mail relay is unreachable, or does not respond within 5-30s, the address is assumed to exist and the mail is accepted - it is, after all, a backup MX. Callouts are cached, so resources are saved for frequently appearing destination addresses.
=== Relay domains ===
Secondary mail relays will relay for any domain for which the following holds:
  1. The domain is listed in a static text file of domains, /etc/exim4/relay_domains; or
  2. the secondary mail relay is listed as a secondary MX in DNS for the domain, and
  3. the higher-priority MXes are in a configured list of allowed primaries.
The latter is to prevent abuse; we don't really want people with control over a DNS zone abusing our mail servers as backup MXes.
Near the top of the configuration file, two domain lists are defined for domains to relay for:
domainlist relay_domains = lsearch;CONFDIR/relay_domains
domainlist secondary_domains = @mx_secondary/ignore=127.0.0.1
relay_domains contains domains explicitly listed in the text file /etc/exim4/relay_domains​, and secondary_domains queries DNS whether the local host is listed as a secondary MX. Note: the two lists will usually overlap.
A host list is defined with accepted primary mail relays. This list should only contain IPs; these are the only IP addresses that mail for @mx_secondary domains will be relayed to. For domains explicitly configured in relay_domains, it doesn't matter what the primary MX is.
@mx_secondary domains use a separate dnslookup router, to check the higher priority MX records:
# Relay @mx_secondary domains only to these hosts
hostlist primary_mx = 66.230.200.240
# Route relay domains only if the higher prio MXes are in the allowed list
secondary:
  driver = dnslookup
  domains = ! +relay_domains : +secondary_domains
  transport = remote_smtp
  ignore_target_hosts = ! +primary_mx
  cannot_route_message = Primary MX(s) for $domain not in the allowed list
  no_more
All relevant (= higher priority) MX records not in hostlist primary_mx are removed from the list for consideration by Exim. In case there are no higher priority MX records which coincide with the primary_mx list, the MX list will be empty and the router will decline. As this router is run during address verification in the SMTP session as well, the RCPT command will be rejected.
Exim's dnslookup router has a precondition check check_secondary_mx​. However, the secondary_domains domainlist serves the same purpose, and using both at the same time in fact doesn't work, as by the time the check_secondary_mx check is run, Exim will already have removed the local host from the MX list (due to ignore_target_hosts​), and the router will decline to run.
Note: this router should not be run for domains in domainlist relay_domains, as for those domains the MX rules need not be as stringent. They can be handled by the regular dnslookup router:
# Route non-local domains (including +relay_domains) via DNS MX and A records
dnslookup:
  driver = dnslookup
  domains = ! +local_domains
  transport = remote_smtp
  ignore_target_hosts = <; 0.0.0.0 ; 127.0.0.0/8 ; 10/8 ; 172.16/12 ; 192.168/16
  cannot_route_message = Cannot route to remote domain $domain
  no_more
=== IMAP server ===
The IMAP server is sanger. It only receives e-mail destined for its IMAP accounts; other mail is handled by mchenry. Outgoing mail is not sent directly, but routed via the mail relays, so the IMAP server should never build up a large mail queue itself.
Mail storage uses a single system user account vmail, which has been created with the command
# adduser --system --home /var/vmail --no-create-home --group --disabled-password --disabled-login vmail
Mail is stored under the directory /var/vmail​, which should be created with the correct permissions:
# mkdir /var/vmail
# chown root:vmail /var/vmail
# chmod g+s /var/vmail
User Debian-exim needs to be part of the vmail group to access the mail directories:
# gpasswd -a Debian-exim vmail
=== TLS support ===
For SMTP mail submissions we require authentication over TLS/SSL. To make Exim support server-side TLS connections, an SSL certificate and private key need to be installed. In the main configuration file, set the following two options:
tls_certificate = /etc/ssl/certs/wikimedia.org.pem
tls_privatekey = /etc/ssl/private/wikimedia.org.key
The private key file should have file permissions set as restricted as possible, but Exim (running as user Debian-exim​) should be able to read it. Therefore Debian-exim has been added to the ssl-cert group.
To advertise TLS to all connecting hosts, use:
tls_advertise_hosts = *
To start TLS by default on the SMTPS port, set:
tls_on_connect_ports = 465
There can be a problem with draining the random entropy pool on not very busy servers. Exim in Debian/Ubuntu is linked against GnuTLS instead of OpenSSL, and uses /dev/random. When it tries to regenerate the gnutls-params Diffie-Hellman parameters file, it can block waiting for random entropy, thereby delaying all mails until more entropy is available. To avoid this, make sure the Exim cron job can regenerate the parameters file outside Exim, using the certtool command:
# apt-get install gnutls-bin
=== Local mail submissions ===
There is a problem with mail submitted through the IMAP server with destinations that are local. All aliasing happens on the mail relay, so a mail to a mail address that exists as a local IMAP account would just be delivered locally, and never go to any of the aliases that might exist for the same mail address on the mail relay. Therefore we force all local mail submissions (recognizable by $received_protocol matching /e?smtpsa$/​) to go via the mail relay(s). All routers that might handle such an address locally get an extra condition:
condition = ${if !match{$received_protocol}{\Nsmtpsa$\N}}
Because this condition is used identically on multiple routers, it's been defined as a macro NOT_LOCALLY_SUBMITTED at the top of the configuration file:
NOT_LOCALLY_SUBMITTED=${if !match{$received_protocol}{\Nsmtpsa$\N}}
Routers can thus use:
condition = NOT_LOCALLY_SUBMITTED
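The regex only matches authenticated submission protocols: esmtpsa and smtpsa match, while plain esmtp does not. The same check expressed with grep:

```shell
# The \Nsmtpsa$\N pattern matches $received_protocol values ending in
# "smtpsa" (authenticated submissions), e.g. "esmtpsa", but not "esmtp".
for proto in esmtpsa smtpsa esmtp; do
  if printf '%s' "$proto" | grep -Eq 'smtpsa$'; then
    echo "$proto: local submission, force via mail relay"
  else
    echo "$proto: not a local submission"
  fi
done
```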
=== User filters ===
The second router, after system_aliases​, applies only to IMAP accounts. It checks whether an IMAP account exists with the specified mail address, and whether that account has a custom user filter. A user filter can be an Exim filter, or a Sieve filter, and is meant to provide more or less the same functionality as procmail filters, i.e. sorting out mail into subfolders, rejecting based on certain criteria and the like.
User filters are loaded as text BLOBs into the account database, and can be changed using wmfmailadmin​. If an account's filter field is set to NULL, the Exim setup will revert to a default filter, loaded from the file /etc/exim4/default_user_filter​:
# Exim filter
if $h_X-Spam-Score matches "\\N\(\\+{5,}\)\\N" then
  save .Junk/
endif
This filter simply sorts mails classified by SpamAssassin as spam (score 5.0 or higher), into the Junk folder.
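The filter's regex matches five or more + characters in the spam bar that the DATA ACL put into the X-Spam-Score header. The equivalent test with grep, on an invented header value:

```shell
# The default user filter regex \(\+{5,}\) matches a spam bar of five or
# more "+" characters, i.e. a SpamAssassin score of 5.0 or higher.
header="5.1 (+++++)"    # invented header value: "$spam_score ($spam_bar)"
if printf '%s' "$header" | grep -Eq '\(\+{5,}\)'; then
  verdict="Junk"
else
  verdict="INBOX"
fi
echo "$verdict"
```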
The router has some extra filter options set to deny the usage of certain functionality in filters that might compromise system security.
# Run a custom user filter, e.g. to sort mail into subfolders
# By default Exim filter CONFDIR/default_user_filter is run,
# which sorts mail classified spam into the Junk folder
user_filter:
  driver = redirect
  domains = +local_domains
  condition = NOT_LOCALLY_SUBMITTED
  router_home_directory = VMAIL/$domain/$local_part
  address_data = ${lookup sqlite{USERDB \
    SELECT id, filter NOTNULL AS hasfilter \
    FROM account \
    WHERE localpart='${quote_sqlite:$local_part}' \
    AND domain='${quote_sqlite:$domain}' \
    AND active='1'}{$value}fail}
  data = ${if eq{${extract{hasfilter}{$address_data}}}{1}{ \
    ${lookup sqlite{USERDB \
      SELECT filter \
      FROM account \
      WHERE id='${quote_sqlite:${extract{id}{$address_data}}}'}}} \
    {${readfile{CONFDIR/default_user_filter}}}}
  allow_filter
  forbid_filter_dlfunc
  forbid_filter_existstest
  forbid_filter_logwrite
  forbid_filter_lookup
  forbid_filter_perl
  forbid_filter_readfile
  forbid_filter_readsocket
  forbid_filter_run
  forbid_include
  forbid_pipe
  user = vmail
  group = vmail
  directory_transport = maildir_delivery
  reply_transport = reply_transport    # added for autoreply support
  no_verify
The address_data query checks whether a matching account exists and is active. If it is, the id of the account will be stored in $address_data, along with a boolean value that represents the existence of a custom filter in the account. If the query fails because no matching account is found, the string expansion is forced to fail and the user_filter router is skipped. In the data query, the data previously looked up and stored in $address_data is used. If a custom filter exists for the account, it's looked up in an SQL query. Otherwise the default user filter file is read.
If the filter chooses to decline handling the mail, e.g. because no special action is required (it's not spam), then control is passed to the next router which will handle a normal INBOX Maildir delivery.
=== IMAP delivery ===
The next router handles delivery to local mail boxes. If a given mail address exists in the SQLite database, it's handed to the dovecot_delivery transport:
# Delivery to a Maildir mail box.
local_user:
  driver = accept
  domains = +local_domains
  condition = NOT_LOCALLY_SUBMITTED
  local_part_suffix = +*
  local_part_suffix_optional
  address_data = ${lookup sqlite{USERDB \
    SELECT id, quota \
    FROM account \
    WHERE localpart='${quote_sqlite:$local_part}' \
    AND domain='${quote_sqlite:$domain}' \
    AND active='1'}{$value}fail}
  transport = maildir_delivery
  transport_home_directory = VMAIL/$domain/$local_part
  transport_current_directory = VMAIL
The local_part_suffix options accept an optional suffix to the local part, e.g. mark+something@.
This router is accompanied by the maildir_delivery appendfile transport, which delivers a message to a Maildir mail box:
# Exim appendfile transport for Maildir delivery
maildir_delivery:
  driver = appendfile
  maildir_format
  directory = ${if def:address_file{$address_file}{$home}}
  create_directory
  create_file = belowhome
  delivery_date_add
  envelope_to_add
  return_path_add
  user = vmail
  group = vmail
  ...
The transport only delivers to Maildir directories (​maildir_format​), determined by the directory parameter: if $address_file is defined, because it's been set by the user_filter router, then the path in that variable is used. Otherwise it uses the (transport) home directory as set by the local_user router.
If a (sub folder or top level) Maildir directory does not exist yet, it's created by Exim given that it's in or below the specified home directory (​create_directory​, create_file​). The headers Delivery-date​, Envelope-to and Return-path are added to the message before delivery. The delivery process runs as uid/gid vmail.
The second part of the transport implements quota support:
  ...
  # Quota support
  quota = ${if !eq{$received_protocol}{local}{${extract{quota}{$address_data}{${value}K}{0}}}}
  quota_is_inclusive = false
  quota_warn_threshold = 100%
  quota_warn_message = ${expand:${readfile{CONFDIR/quota_warn_message}}}
  maildir_use_size_file
  maildir_quota_directory_regex = ^(?:cur|new|\.(?!Trash).*)$
  maildir_tag = ,S=$message_size
The quota limit is stored in the quota column of the SQLite database, as kilobytes (this is enforced by the Dovecot plugins). The quota limit, if any, is stored as a keyed field in the $address_data variable by the earlier routers, and thus can be extracted by the transport. This is only done for messages that do not have protocol local. System warnings such as those generated by Exim use protocol local, and therefore get a quota limit of 0 and are allowed through regardless.
The quota enforced is not inclusive (quota_is_inclusive), which means that the quota limit is only enforced after it has been exceeded. Otherwise a confusing situation could arise where big messages cannot be delivered because they would exceed the total mailbox quota, while smaller messages would still be let through. This behaviour is a little more consistent with what the user expects.
Once the user fully exceeds the quota limit (​quota_warn_threshold​), a warning message as specified in the file /etc/exim4/quota_warn_message is sent to tell the user to clean up (​quota_warn_message​).
Some Maildir++ extensions are used: Exim uses a maildirsize file in the Maildir to more efficiently keep track of the total size of the mail box, rather than doing a stat() on all files (​maildir_use_size_file​). Also, a suffix is appended to all Maildir filenames with the size of the message, so a stat() can again be avoided by both Exim and Dovecot, a readdir() is enough. Because this is a black box mail system, this poses no security problems.
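For illustration, the size tag can be read back from a (made-up) Maildir filename with a one-liner:

```shell
# Extract the ,S=<bytes> tag that maildir_tag appends to Maildir filenames,
# so the message size can be read without a stat().
fname='1234567890.M42P7.sanger,S=4096:2,S'    # made-up Maildir filename
size=$(printf '%s' "$fname" | sed -n 's/.*,S=\([0-9][0-9]*\).*/\1/p')
echo "$size"
```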
The Trash folder is exempted from quota calculation, as this may cause problems when the user actually wants to clean up the mail box.
A small problem exists with the exemption of protocol local messages: in that case the quota is set to 0, which also makes Exim write this quota to the maildirsize file, until the next non-​local message is delivered. However, Dovecot doesn't read the quota from this file but also retrieves it directly from the database, so this is not likely to cause any problems.
=== User left ===
In some circumstances we want to provide an automatic reply (bounce) to mails for accounts of users that have left the organization. This is implemented using an accept router and an autoreply transport.
The router simply accepts the message if it is for a local domain and was not submitted locally, and hands it over to the left_message transport:
# Bounce/auto-reply messages for users that have left
user_left:
  driver = accept
  domains = +local_domains
  condition = NOT_LOCALLY_SUBMITTED
  require_files = CONFDIR/userleft/$domain/$local_part
  transport = left_message
The transport takes the message, and wraps it in a new bounce-style message, using the expanded template file /etc/exim4/userleft/$domain/$local_part​.
# Autoreply bounce transport for users that have left the organization
left_message:
  driver = autoreply
  file = CONFDIR/userleft/$domain/$local_part
  file_expand
  return_message
  from = Wikimedia Foundation <postmaster@wikimedia.org>
  to = $sender_address
  reply_to = office@wikimedia.org
  subject = User ${quote_local_part:$local_part}@$domain has left the organization: returning message to sender
So, for each user that leaves the organization, the corresponding account must be set to inactive (not deleted!), and a file /etc/exim4/userleft/$domain/$local_part must be created. An example template file is available in /etc/exim4/userleft/TEMPLATE​.
=== Vacation auto-reply ===
In order to enable the auto-reply feature, a transport needs to be defined:
reply_transport:
  driver = autoreply
=== Smart host ===
The last Exim router in the configuration file handles (outgoing) mail not destined for the local server; it sends mail for all domains to mchenry.wikimedia.org​, or lists.wikimedia.org if the former is down.
 # Send all mail not destined for the local machine via a set of
 # mail relays ("smart hosts")
 smart_route:
   driver = manualroute
   transport = remote_smtp
   route_list = * mchenry.wikimedia.org:lists.wikimedia.org
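In route_list, fallback hosts are separated by colons and tried in order. The failover behavior can be sketched in Python (the helper and its is_up callback are hypothetical, not part of the actual configuration):

```python
def pick_smart_host(hosts, is_up):
    """Return the first reachable relay from a route_list-style
    ordered host list, or None if all are down (illustrative only)."""
    for host in hosts:
        if is_up(host):
            return host
    return None

# With the route_list above, mchenry is preferred and lists is the fallback:
relays = ["mchenry.wikimedia.org", "lists.wikimedia.org"]
```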
=== SMTP authentication ===
We want our IMAP account users to be able to send mail through our mail servers from wherever they are, regardless of the network they are on. Therefore we use SMTP authentication, which is supported by most modern mail clients. TLS is used and enforced to encrypt the connection, so the password cannot be sniffed on the wire.
In Exim, SMTP authentication is controlled through the authenticators in the identically named section of the configuration file. The plaintext driver can handle both the PLAIN and the LOGIN authentication mechanisms.
 # PLAIN authenticator
 # Expects the password field to contain a "LDAP format" hash. Only
 # (unsalted) {md5}, {sha1}, {crypt} and {crypt16} are supported.
 plain:
   driver = plaintext
   public_name = PLAIN
   server_prompts = :
   server_condition = ${lookup sqlite{USERDB \
       SELECT password \
       FROM account \
       WHERE localpart||'@'||domain='${quote_sqlite:$auth2}' \
       AND active='1'} \
     {${if crypteq{$auth3}{$value}}}{false}}
   server_set_id = $auth2
   server_advertise_condition = ${if def:tls_cipher}
With the PLAIN mechanism, three parameters ($auth1, $auth2, $auth3) are expected from the client. The first one should be empty, the second one should contain the username, the third one the plaintext password.
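The wire format behind these three parameters is the SASL PLAIN initial response: the three fields joined with NUL bytes and base64-encoded. A short Python sketch (a hypothetical decoder, for illustration, not part of the Exim setup):

```python
import base64

def decode_plain(b64: str):
    """Decode a SASL PLAIN initial response:
    authorization-id NUL authentication-id NUL password.
    These map to Exim's $auth1, $auth2 and $auth3 respectively."""
    authzid, user, password = base64.b64decode(b64).decode().split("\0")
    return authzid, user, password
```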
server_condition is a string expansion that should return either true or false, depending on whether the username and password can be verified to match with those in the user database. It does a SQL lookup in the SQLite database. If the lookup fails, false is returned. If the lookup succeeds, the password is matched using Exim's crypteq function, which supports the crypt, crypt16, md5 and sha1 hashes. The type of hash is expected to be prepended to the hash in curly brackets, e.g. "{SHA1}" - a format which Dovecot also uses.
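For the unsalted schemes, the stored value is the scheme name in curly brackets followed by the base64-encoded raw digest (the RFC 2307-style format Dovecot also understands). A minimal Python sketch of generating and checking such a "{SHA1}" hash (illustrative helper names; the real check happens inside Exim's crypteq):

```python
import base64
import hashlib

def make_ldap_sha1(password: str) -> str:
    # Unsalted {SHA1}: base64 of the raw SHA-1 digest, prefixed
    # with the scheme name in curly brackets.
    digest = hashlib.sha1(password.encode()).digest()
    return "{SHA1}" + base64.b64encode(digest).decode()

def check_password(candidate: str, stored: str) -> bool:
    # Compare a candidate password against a stored "{SHA1}..." value.
    if not stored.upper().startswith("{SHA1}"):
        raise ValueError("unsupported scheme")
    return make_ldap_sha1(candidate)[len("{SHA1}"):] == stored[len("{SHA1}"):]
```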
Unfortunately, none of the salted password hash schemes could be used: for every commonly used format, either Exim or Dovecot lacked support for it. This may be remedied in the future, either by using the Dovecot authenticator in Exim 4.64, or by adding a base64 decoder to Exim's string expansion functions.
The server_set_id is set to the given username, and is the id used by Exim to identify this authenticated connection (for example, in log lines).
server_advertise_condition controls when the SMTP AUTH feature is advertised to connecting hosts in the EHLO reply. This is only done when a TLS encrypted connection has already been established, and thus $tls_cipher is non-empty. Exim automatically refuses AUTH commands if the AUTH feature was not advertised.
=== Dovecot deliver ===
Dovecot deliver is no longer used; instead, Exim's own Maildir delivery transport is used, as it allows more flexibility with quotas and subfolder filtering.
The Dovecot configuration file path is /etc/dovecot/dovecot.conf​. The Dovecot LDA needs to be able to read it while running under uid vmail, so the default file permissions are changed:
 # chgrp vmail /etc/dovecot/dovecot.conf
 # chmod g+r /etc/dovecot/dovecot.conf
If deliver is given a -d username argument, it will attempt an auth DB lookup, which is unnecessary as Exim can provide it with all relevant information. Therefore this argument should not be used.
The postmaster_address option needs to be set for deliver to work:
 protocol lda {
   # Address to use when sending rejection mails.
   postmaster_address = postmaster@wikimedia.org
   ...
deliver needs to know where, and in what format, to store mail. As it only has the home directory to work with, use that:
 # Deliver doesn't have username / address info but receives the home
 # directory from Exim in $HOME
 mail_location = maildir:%h
As the LDA is run under the restricted uid/gid vmail, it can't log to Dovecot's default log files without root permissions, so a separate log file is used:
   ...
   log_path = /var/log/dovecot-deliver.log
   info_log_path = /var/log/dovecot-deliver.log
 }
=== User database syncing ===
In order to know which accounts exist on the IMAP server, the primary mail relay mchenry must have a (partial) copy of the accounts database. The SQLite database on sanger is rsynced to mchenry every 15 minutes by the cron job /etc/cron.d/rsync-userdb:
 */15 * * * *    root    rsync -a /var/vmaildb/ mchenry-rsync:/var/vmaildb
The relevant ssh keys are in /root/.ssh/rsync, and are set up in /root/.ssh/config.
=== Backups ===
sanger is backed up to mchenry using rdiff-backup. The daily cron job is /etc/cron.daily/zz_backup; it makes use of a dedicated ssh key, /root/.ssh/rdiff-backup, through /root/.ssh/config. The list of included/excluded directories is in /etc/rdiff-backup/includes. The data is backed up to /var/backup/mchenry/.
=== Mailbox cleanup ===
Mailboxes are automatically moved out of the way (daily) once an account ceases to exist completely in the account database. To handle this, a small script, mbcleanup.py, has been written; it is available in SVN in the wmfmailadmin directory and is run daily from /etc/cron.daily/mailbox-cleanup. It takes three arguments: the account DB path, the mailboxes root path, and the backup root path, respectively. From the account database it pulls a list of all existing accounts and compares this with the set of mailboxes it finds in the two-level directory structure under the mailbox root path (ignoring .dot-directories and permission-denied errors). Superfluous mailboxes are then moved to the backup directory with a timestamp appended.
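The comparison the script performs can be sketched in Python as follows. This is a simplified, hypothetical reimplementation of the logic described above, not the actual mbcleanup.py; table and column names are assumed from the SQLite query shown in the SMTP authentication section:

```python
import os
import shutil
import sqlite3
import time

def cleanup(userdb, mailroot, backuproot):
    """Move mailboxes with no matching account (active or not) out of
    the two-level domain/localpart tree into the backup directory."""
    con = sqlite3.connect(userdb)
    # Every account in the database keeps its mailbox, even inactive ones;
    # only mailboxes that have ceased to exist completely are moved.
    accounts = {f"{domain}/{localpart}" for localpart, domain in
                con.execute("SELECT localpart, domain FROM account")}
    con.close()
    for domain in os.listdir(mailroot):
        dpath = os.path.join(mailroot, domain)
        if domain.startswith(".") or not os.path.isdir(dpath):
            continue  # skip .dot-directories
        for localpart in os.listdir(dpath):
            if localpart.startswith("."):
                continue
            if f"{domain}/{localpart}" not in accounts:
                # Superfluous mailbox: move aside with a timestamp appended.
                stamp = time.strftime("%Y%m%d%H%M%S")
                dst = os.path.join(backuproot, f"{domain}_{localpart}.{stamp}")
                shutil.move(os.path.join(dpath, localpart), dst)
```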
== See also ==
== External documentation ==
This page was last edited on 23 May 2014, at 14:05.