How not to design an MTA - part 4 - spool file format

It's traditional for most MTAs to use something close to the host operating system's standard text file format for storing messages that are queued for delivery. For example, on Unix that means using bare LF for newlines instead of the Internet standard CRLF. Exim, Sendmail, and qmail do this. On other systems the translation may be more extensive, e.g. if the native charset is EBCDIC, or if text files are record-oriented rather than stream-oriented. Postfix is a bit of a counter-example, since it runs on Unix but turns the lines of a message into records, as part of its general IPC scheme that I mentioned in part 2.

This makes sense if you are working in an environment where local email is more common than remote email: messages stay in the system's local format, so they are less likely to be munged. However, nowadays MTAs (especially on Unix) generally act as relays to and from SMTP (and variants like message submission or LMTP). The mismatch between local text formats and the Internet standard format leads to all sorts of transparency bugs. For example, SMTP was originally specified to be 7 bit clean, with no restrictions on bare CR or LF. However, these characters will get mangled when a message is transferred via an MTA that translates newlines. There are descriptions of many similar problems in RFC 2049.
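To make the newline problem concrete, here is a small Python sketch. The two helper functions are hypothetical stand-ins for what a newline-translating MTA does on receipt and on transmission; they are not taken from any real MTA.

```python
def to_unix(wire: bytes) -> bytes:
    """Store a message Unix-style: every CRLF becomes a bare LF."""
    return wire.replace(b"\r\n", b"\n")


def to_wire(stored: bytes) -> bytes:
    """Convert a stored message back for SMTP: every LF becomes CRLF."""
    return stored.replace(b"\n", b"\r\n")


# A message whose second line contains a bare LF that is part of the data,
# not a line ending.
original = b"line one\r\nbare\nLF inside\r\nline three\r\n"

round_tripped = to_wire(to_unix(original))

# The bare LF can no longer be told apart from a real line ending,
# so the relay has silently rewritten it as CRLF.
print(round_tripped == original)
print(round_tripped)
```

Once the message is in local format there is no record of which LFs were originally bare, so the damage cannot be undone on the way out.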

The other problem with all of this translation and re-translation is that it's horribly inefficient: it requires lots of copying back-and-forth, meaning you use several times more memory bus bandwidth than is needed to shift bits between the network and the disk. One of the neat things that the Cyrus IMAP server does is store its data in wire format, so that it can be transferred from disk to network with minimal copying. (UW IMAP, Courier, and Dovecot all use Unix native Berkeley mbox or Maildir folders.) Why not do the same in an SMTP server?

The main difficulty with this idea is that SMTP does not have a single wire format. What's worse is that you don't find out a message's outgoing wire format until you have connected to the destination host and are about to transmit it. Even worse, there might not even be a single outgoing wire format if the message has multiple recipients at different hosts with different capabilities.

You can make this problem much more tractable if you store the messages in the format that they are received in, and prepare for any recoding that may be necessary on transmission. In most cases, the incoming and outgoing software will have the same capabilities, so you will be able to use efficient APIs like sendfile(). But what do I mean by "prepare"?

You need to parse the MIME structure of the message, and note at least the content-transfer-encoding of each part. This will allow you to downgrade each part in the appropriate way when that is necessary.
If you are really keen, then you can also scan the data in each part to (say) choose between quoted-printable or base64 depending on which would be most efficient.
You need to note dot-stuffing points: in traditional SMTP, where dots have been inserted, and with chunked SMTP, where they need to be inserted for onward traditional transmission.
If you support UTF8SMTP, you have to scan the RFC 822 and MIME headers for top-bit-set characters that may need downgrading to RFC 2047 or RFC 2231 encodings.
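The first two kinds of note-taking above could be sketched roughly as follows. This is an illustration only: it uses Python's standard email package on a whole message in memory, whereas a real MTA would want to parse incrementally as the data arrives. The names survey_parts and estimate_encoded_sizes are made up for this example, and the size estimate ignores line-length limits and soft line breaks.

```python
import email


def estimate_encoded_sizes(payload: bytes) -> dict:
    """Rough encoded sizes, ignoring line wrapping in both encodings."""
    # Bytes that quoted-printable must write as a three-character =XX escape:
    # '=', anything above '~', and control characters other than TAB, LF, CR.
    escaped = sum(
        1
        for b in payload
        if b == 0x3D or b > 0x7E or (b < 0x20 and b not in (0x09, 0x0A, 0x0D))
    )
    return {
        "quoted-printable": len(payload) + 2 * escaped,
        "base64": 4 * ((len(payload) + 2) // 3),
    }


def survey_parts(raw: bytes) -> list:
    """Note the content-transfer-encoding of each leaf MIME part and, for
    parts that might need downgrading, which encoding looks cheaper."""
    notes = []
    for part in email.message_from_bytes(raw).walk():
        if part.is_multipart():
            continue  # only leaf parts carry encoded data
        cte = part.get("Content-Transfer-Encoding", "7bit").strip().lower()
        note = {"type": part.get_content_type(), "cte": cte, "downgrade": None}
        if cte in ("8bit", "binary"):
            sizes = estimate_encoded_sizes(part.get_payload(decode=True))
            note["downgrade"] = min(sizes, key=sizes.get)
        notes.append(note)
    return notes


sample = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/mixed; boundary="b"\r\n'
    b"\r\n"
    b"--b\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    + "The caf\u00e9 is open today.\r\n".encode("utf-8") * 20
    + b"--b\r\n"
    b"Content-Type: application/octet-stream\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    b"AAEC\r\n"
    b"--b--\r\n"
)

for note in survey_parts(sample):
    print(note)
```

The notes would be stored alongside the spooled message, so that the sending side never has to re-parse it.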

This can be done as the message is received, which only requires the message data to go over the memory bus once. (Data goes over the bus twice in a copy - from the old buffer and to the new buffer - so a parse is cheaper than a copy.) When you come to send the message onwards, you can directly sendfile() those parts that do not need to be downgraded and only recode where absolutely necessary.

It's worth noting that translation between chunked and traditional SMTP can be done efficiently (assuming no MIME recoding is necessary). Dot-stuffing is only rarely necessary (e.g. it never happens in base64 attachments) so there will be reasonably large blocks of the message between dots which can sensibly be passed to sendfile(). When removing dots you will be sending blocks with a one byte gap between the file offset of the end of one block and the start of the next, and when adding dots you can use the header or trailer feature of sendfile() (on systems that provide it) to add the dot. This is good because it means it is cheap to use the more pipelineable BDAT command when it is available. It's also way more efficient than the usual scan-every-byte-of-the-message implementation of dot-stuffing. And I think it is quite cute :-)
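Here is a sketch of how the block boundaries might be worked out, assuming the dot offsets were noted when the message was received. All the names are invented for this example. The render() function only simulates what would appear on the wire; in a real MTA each (offset, count) range would be handed to sendfile(), with the prefix passed as a header where the platform supports that.

```python
from typing import List, Tuple

# (bytes to send first, file offset, byte count)
Segment = Tuple[bytes, int, int]


def dot_offsets(data: bytes) -> List[int]:
    """Offsets of every '.' that begins a line (lines end with CRLF).
    In practice these would be recorded once, at receipt."""
    offsets = [0] if data.startswith(b".") else []
    pos = data.find(b"\r\n.")
    while pos != -1:
        offsets.append(pos + 2)
        pos = data.find(b"\r\n.", pos + 2)
    return offsets


def segments_removing_dots(length: int, dots: List[int]) -> List[Segment]:
    """Spool holds stuffed DATA; skip the one-byte dot at each offset."""
    segments, start = [], 0
    for d in dots:
        if d > start:
            segments.append((b"", start, d - start))
        start = d + 1  # the one-byte gap between blocks
    if start < length:
        segments.append((b"", start, length - start))
    return segments


def segments_adding_dots(length: int, dots: List[int]) -> List[Segment]:
    """Spool holds unstuffed BDAT data; send an extra dot before each
    block that starts at a recorded offset."""
    dotset = set(dots)
    bounds = [0] + [d for d in dots if d > 0] + [length]
    segments = []
    for start, end in zip(bounds, bounds[1:]):
        if end > start:
            prefix = b"." if start in dotset else b""
            segments.append((prefix, start, end - start))
    return segments


def render(data: bytes, segments: List[Segment]) -> bytes:
    """Simulate the bytes that the planned sendfile() calls would emit."""
    return b"".join(prefix + data[off:off + n] for prefix, off, n in segments)


chunked = b"hello\r\n.hidden\r\n..two\r\nbye\r\n"
stuffed = render(chunked, segments_adding_dots(len(chunked), dot_offsets(chunked)))
restored = render(stuffed, segments_removing_dots(len(stuffed), dot_offsets(stuffed)))
print(stuffed)
print(restored == chunked)
```

The number of sendfile() calls is one more than the number of dots at most, which for typical messages is very small.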

Existing MTAs that implement 8BITMIME downgrading (such as Sendmail) generally do the MIME parse and recode when sending the message, and just use base64 because it would be too expensive to do two passes to work out if quoted-printable would be better. Another situation when two passes are necessary is adding a DomainKeys signature, because it appears at the top of the message. In many cases (if you aren't going to alter the message) you can make DK more efficient by calculating the signature as the message arrives so that it's easy to add when the message is transmitted.
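The calculate-as-it-arrives idea relies on nothing more than an incremental hash. The sketch below shows only that part, using SHA-1 from Python's hashlib. It is heavily simplified: a real DomainKeys signer also has to canonicalize the message, choose which headers to cover, and sign the digest with the domain's private key, none of which is shown. ArrivalDigest is a made-up name.

```python
import hashlib


class ArrivalDigest:
    """Accumulate a digest over the message as each network read arrives,
    so no second pass over the spool file is needed at send time."""

    def __init__(self) -> None:
        self._hash = hashlib.sha1()

    def feed(self, chunk: bytes) -> None:
        # Called once per chunk, at the same time as the chunk is spooled.
        self._hash.update(chunk)

    def hexdigest(self) -> str:
        return self._hash.hexdigest()


# Chunks as they might come off the socket, split at arbitrary points.
chunks = [b"Subject: hi\r\n", b"\r\nbo", b"dy\r\n"]

digest = ArrivalDigest()
for chunk in chunks:
    digest.feed(chunk)

# Same result as hashing the complete message in a second pass.
print(digest.hexdigest() == hashlib.sha1(b"".join(chunks)).hexdigest())
```

The digest would be stored with the message's other notes and is only valid if the message is sent on unaltered.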

Is it worth trying to make this more efficient? For example, if you have a cache of destination host capabilities, you could recode the message as it comes in to save a trip over the memory bus. But you will still have to support late recoding if the cache lookup misses, so the extra complexity probably isn't worth it.

Previously: part 3 - local delivery; part 2 - partitioning for security; part 1 - the sendmail command; message identification.