SMTP extensions T. Finch Internet-Draft University of Cambridge Obsoletes: 1845 (if approved) January 23, 2007 Intended status: Experimental Expires: July 27, 2007 SMTP service extensions for transaction checkpointing draft-fanf-smtp-rfc1845bis-00 Status of this Memo By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. This Internet-Draft will expire on July 27, 2007. Copyright Notice Copyright (C) The IETF Trust (2007). Finch Expires July 27, 2007 [Page 1] Internet-Draft SMTP transaction checkpointing January 2007 Abstract This memo describes the SMTP service extension for checkpoint/resume, which allows a client to recover from a lost connection to the server without having to repeat all of the commands and message content sent prior to the interruption, and with less risk of duplicated messages. Checkpoint/resume is based on, and fixes a number of problems with, the SMTP service extension for checkpoint/restart [RFC1845]. Finch Expires July 27, 2007 [Page 2] Internet-Draft SMTP transaction checkpointing January 2007 Document revision $Cambridge: hermes/doc/qsmtp/draft-fanf-smtp-rfc1845bis.xml,v 1.54 2007/01/23 20:43:48 fanf2 Exp $ Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.1. Overview . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.2. Terminology . . . . . . . . . . . . . . . . . . . . . . . 5 2. SMTP service extension for checkpoint/resume . . . . . . . . . 6 2.1. Framework . . . . . . . . . . . . . . . . . . . . . . . . 6 2.2. Overview . . . . . . . . . . . . . . . . . . . . . . . . . 7 2.3. Transaction IDs . . . . . . . . . . . . . . . . . . . . . 7 2.4. Octet offsets . . . . . . . . . . . . . . . . . . . . . . 8 2.5. Server-side resume state . . . . . . . . . . . . . . . . . 9 2.6. Transaction resets . . . . . . . . . . . . . . . . . . . . 10 2.7. Retry versus resume . . . . . . . . . . . . . . . . . . . 10 2.8. Re-establishing a connection . . . . . . . . . . . . . . . 10 2.9. The RESUME command . . . . . . . . . . . . . . . . . . . . 11 2.10. The TRANSID and TRANSOFF parameters . . . . . . . . . . . 11 2.11. The QUIT command . . . . . . . . . . . . . . . . . . . . . 12 3. SMTP service extension for checkpoint/restart . . . . . . . . 13 3.1. Framework . . . . . . . . . . . . . . . . . . . . . . . . 13 3.2. Description . . . . . . . . . . . . . . . . . . . . . . . 13 3.3. Changes from RFC 1845 . . . . . . . . . . . . . . . . . . 15 3.4. Prefer checkpoint/resume to checkpoint/restart . . . . . . 16 4. Security considerations . . . . . . . . . . . . . . . . . . . 17 4.1. Server storage . . . . . . . . . . . . . . . . . . . . . . 17 4.2. Transaction IDs . . . . . . . . . . . . . . . . . . . . . 18 4.3. Unnecessary bounces . . . . . . . . . . . . . . . . . . . 18 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 20 6. References . . . . . . . . . . . . . . . . . . . . . . . . . . 21 6.1. Normative references . . . . . . . . . . . . . . . . . . . 21 6.2. Informative references . . . . . . . . . . . . . . . . . . 21 Appendix A. Draft discussion venue . . . . . . . . . . . . . . . 23 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 24 Intellectual Property and Copyright Statements . . . . . . . . . . 25 Finch Expires July 27, 2007 [Page 3] Internet-Draft SMTP transaction checkpointing January 2007 1. Introduction Although SMTP is widely and robustly deployed, it does not handle the loss of its underlying connection particularly gracefully. There are two problems, and they are becoming more of a concern because of the way the Internet is changing. Firstly, if the connection is lost partway through the transmission of a message, the client must retry the transmission starting from the beginning. When dealing with very large messages over less reliable connections it is possible for substantial resources to be consumed by repeated unsuccessful attempts to transmit the message in its entirety. Messages are getting larger on average. Wireless connections, which are much more likely to be lost in normal use, are becoming more common. Secondly, a connection that is lost after the client has transmitted the message but before it receives the server's response can result in a duplicated message. This problem is described in [RFC1047], which recommends that servers should minimize the time between receiving the last of the message data and sending their response. However it is often preferable to analyse the message data for spam and viruses at this point so that the server can avoid taking responsibility for an undesirable message, and this can be time consuming. Furthermore, traditional SMTP cannot make efficient use of a connection with a bandwidth*delay product larger than the average message size, because of the pipeline stall after the DATA command [RFC2920]. A client can work around this problem by using multiple concurrent SMTP connections, or by using the SMTP service extension for large messages [RFC3030]. The latter eliminates pipeline stalls so the client can stream multiple messages to the server while waiting for responses. However in both cases one loss of network connectivity means multiple messages will need retransmission and may be duplicated. 1.1. Overview This memo provides a facility by which a client can uniquely identify a particular SMTP transaction. The server stores this identifying information along with all the information it receives as the transaction proceeds. If the connection is lost during the transaction the SMTP client may establish a new SMTP session and ask the server to resume the transaction. The server tells the client how much message data it saved from the interrupted transaction. The client can then perform an abbreviated transaction, repeating only those commands to which it did not receive responses, and Finch Expires July 27, 2007 [Page 4] Internet-Draft SMTP transaction checkpointing January 2007 transmitting only message data that the server does not yet have. These problems were originally tackled by the SMTP service extension for checkpoint/restart [RFC1845]. However it has a number of limitations (Section 3.3) so this memo describes the similar but improved SMTP service extension for checkpoint/resume (Section 2). We address backwards compatibility in Section 3 by re-defining checkpoint/restart as a small modification to checkpoint/resume. Finally, Section 4 gives proper consideration to the security implications of these extensions. 1.2. Terminology The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119. The "message envelope" comprises the SMTP MAIL and RCPT commands. The "message data" is transmitted using the SMTP DATA command, or an alternative such as the BDAT command [RFC3030]. An SMTP "transaction" is the sequence of commands necessary to transmit a message, i.e. a message envelope followed by message data. It starts with a MAIL command and ends with the last of the message data or a reset. A "reset" is caused by the RSET, HELO, or EHLO commands. The metalinguistic notation used in this memo corresponds to the "Augmented BNF" used in other Internet mail system memos. The reader who is not familiar with that syntax should consult the ABNF specification [RFC4234]. Rules not defined in this specification are either defined in the ABNF core rules or in [RFC2821] or [RFC2822]. Metalanguage terms used in running text are surrounded by pointed brackets (e.g., ) for clarity. Finch Expires July 27, 2007 [Page 5] Internet-Draft SMTP transaction checkpointing January 2007 2. SMTP service extension for checkpoint/resume 2.1. Framework The SMTP service extension for checkpoint/resume is defined as follows: o The name of the service extension is "checkpoint/resume". o The EHLO keyword associated with the extension is "RESUME". o The RESUME EHLO keyword has no parameters. o This extension defines one additional command, which MAY appear anywhere in a pipelined group [RFC2920]. Syntax: resume-command = "RESUME" SP transid-value CRLF o This extension defines the 355 reply to the RESUME command. Syntax: resume-reply = "355" SP octet-offset SP text CRLF o This extension defines two additional parameters to the MAIL command. The TRANSID parameter has the following syntax: esmtp-param =/ transid-param transid-param = "TRANSID=" transid-value transid-value = "<" transid-spec ">" transid-spec = dot-string "@" domain The TRANSOFF parameter has the following syntax: esmtp-param =/ transoff-param transoff-param = "TRANSOFF=" octet-offset octet-offset = 1*20DIGIT o The maximum length of the MAIL command is increased by 297 characters. The maximum length of a is 256 characters. o There are no additional parameters to the RCPT command defined by this extension and its maximum length is not increased. Finch Expires July 27, 2007 [Page 6] Internet-Draft SMTP transaction checkpointing January 2007 o This extension is suitable for use with message submission [RFC4409]. 2.2. Overview When an SMTP client wishes to be able to resume a transaction in the event of a lost connection, it first issues the EHLO command to the SMTP server. If the server responds with code 250 to the EHLO command, and the response includes the EHLO keyword value RESUME, then the server is indicating that it supports checkpoint/resume as described below. The client then proceeds as usual, except that it issues MAIL commands with TRANSID and TRANSOFF parameters (Section 2.10). The TRANSID parameter's value is described in Section 2.3 and the TRANSOFF parameter's value is 0 (zero). The server periodically checkpoints the transaction, retaining state so that it can later be resumed (Section 2.5). If all goes well, the client will close the connection normally and the server can discard the resume state (Section 2.11). If the connection is lost (Section 2.7) then the client can reconnect (Section 2.8) and issue a RESUME command (Section 2.9) for each transaction in the lost connection to discover the of the data that needs to be retransmitted (Section 2.4). The client then issues MAIL commands with TRANSID and non-zero TRANSOFF parameters to resume any transactions that are missing data on the server or for which the client lost any of the server's replies. 2.3. Transaction IDs A is an identifier that serves to uniquely identify a particular SMTP transaction started by a particular client. The and client identity together form the "transaction ID". The is structured to ensure global uniqueness without any additional registry. Its domain part SHOULD be a valid domain name that uniquely identifies the SMTP client. Note that this is usually the same as the domain name given in the EHLO command, but not always. The EHLO domain name identifies the specific host the SMTP connection originated from, whereas the domain can refer to a group of hosts that collectively host a multi-homed SMTP client, or it can refer to a mobile client's home domain name, etc. The local part MUST be an identifier that distinguishes this SMTP transaction from any other originating from Finch Expires July 27, 2007 [Page 7] Internet-Draft SMTP transaction checkpointing January 2007 this SMTP client. Care must be used in constructing a to simultaneously ensure both uniqueness, unguessability, and the ability to reidentify interrupted transactions. It MUST be unguessable for security reasons; see Section 4. Despite the structured nature of a the server MUST treat the value as an opaque, case-sensitive string. Note that the contents of the [RFC2822] Message-ID: header field, or the MAIL FROM: ENVID parameter [RFC3461], typically are NOT appropriate for use as a , since such identifiers may be associated with multiple copies of the same message - e.g., as it is split during transmission down different network paths - and hence with multiple distinct SMTP transactions. For security reasons, servers MUST treat the same from different clients as different transaction IDs. Servers MUST use a secure identifier to distinguish clients, such as credentials from the SMTP AUTH command [RFC2554] or from TLS negotiation [RFC3207]. Note that these identifiers are independent of the IP address a client connects from: servers MUST allow authenticated mobile clients to reconnect and resume transactions from different IP addresses. If a client is not authenticated the server SHOULD use its IP address to identify it. 2.4. Octet offsets The represents an offset, counting from zero, to the particular octet in the actual message data the server expects to see next. (This is also a count of how many octets the server has received and stored successfully.) This offset SHALL NOT account for the message envelope. A value of 0 would indicate that the client needs to start sending the message from the beginning, a value of 1 would indicate that the client should skip one octet, and so on. Additional requirements apply depending on the command used by the client to transmit the message data: DATA [RFC2821] section 4.5.2: The SHALL NOT count any octets added by the dot-stuffing algorithm. The offset MUST also correspond to the start of a line, i.e. equal to zero or a point immediately after a . BDAT [RFC3030]: The SHALL count octets in the same way as the parameter to the BDAT command. There are no restrictions on where the can point within a chunk. Finch Expires July 27, 2007 [Page 8] Internet-Draft SMTP transaction checkpointing January 2007 BURL [RFC4468]: The SHALL count the size of the content retrieved from the URL parameter to the BURL command. The offset MUST NOT correspond to a point within the BURL data, that is the server stores the whole URL content or none of it. These requirements are consistent with the semantics of the SIZE parameter to the MAIL command [RFC1870]. 2.5. Server-side resume state The server's "resume state" is the additional information it retains in order to implement checkpoint/resume. The server SHOULD keep track of the transaction IDs associated with each connection, that is the s used by the client in RESUME commands or TRANSID parameters. This so that it can prevent multiple connections from trying to modify the same transaction, as described in Section 2.8, and so that the server can discard the resume state of the connection's transactions when the connection is closed cleanly, as described in Section 2.11. What is included in a transaction's resume state varies depending on whether or not the server has received any message data, and whether or not it has committed the transaction. The server commits the transaction when it decides what its final response to the last of the message data will be. Servers retain no resume state before they receive any message data. This implies that clients must resume the transaction from the beginning if the connection is lost while the client is transmitting the envelope. Before the server commits a transaction, its resume state comprises the message envelope (including all client commands and server replies) and any message data that the server has received so far. It is OPTIONAL for the server to retain this state. That is, a server MAY be configured to require some or all clients to resume interrupted transactions from the beginning. After the transaction is committed, its resume state comprises the message envelope (as before), the size of the message data, and the server's final response. If the client loses a connection and takes a while to reconnect and resume the transactions, it can be necessary for the server to retain this resume state for longer than is required to transmit the message on to its next destination. Finch Expires July 27, 2007 [Page 9] Internet-Draft SMTP transaction checkpointing January 2007 2.6. Transaction resets If the client terminates a transaction with a reset, the server SHALL discard any resume state for the transaction ID and remove the transaction ID from the list of those associated with the current connection. If the client issues a reset between transactions, the server SHALL retain any resume state for the preceding transaction(s). 2.7. Retry versus resume A client SHALL NOT attempt to resume a transaction unless the SMTP session was lost after the start of the transaction. A connection is lost if it is terminated without the server receiving a QUIT command from the client or the client receiving a 421 reply from the server. This includes SMTP-level timeouts as well as loss or premature closure of the underlying connection. If the connection is lost before the first MAIL command, or message transmission fails for another reason, the client MUST use the retry algorithm specified by [RFC2821]. These non-resume situations include, but are not limited to, the following: o failure to establish a connection; o failure to establish a security layer; o failure to authenticate; o a temporary failure indicated by a 4yz reply from the server; o a connection forcibly closed by the server with a 421 reply. 2.8. Re-establishing a connection When a connection has been lost as described in the previous section, the client MAY reconnect immediately. It SHALL first re-check the DNS if necessary, as determined by the DNS records' time-to-live. If the IP address (target of A or AAAA record) used to connect to the original server is still on this list it SHOULD be tried first, since this server is most likely to be capable of resuming the transaction. If that IP address is no longer on the list or if the connection fails, then the client SHOULD next try any other IP addresses associated with the host name (target of MX record) used to connect to the original server. If these also fail then the client SHOULD fall back to the ranking algorithm described in [RFC2821] section 5. A client SHOULD NOT wait for a server with an interrupted Finch Expires July 27, 2007 [Page 10] Internet-Draft SMTP transaction checkpointing January 2007 transaction's resume state to come back online. Multi-homed and clustered SMTP servers do exist, which means that it is entirely possible for a transaction to resume on a different server host. Note that connection loss can appear different from the client and the server ends, so a client might detect the loss and reconnect before the server detects the loss. This can lead to legitimate circumstances where there are multiple connections to the server that appear to be trying to use the same transaction ID. Therefore, if a client issues a command containing a that the client has used in another connection that is currently active from the server's point of view, then the server SHOULD prevent that other connection from performing further actions that affect the same transaction. For example, it MAY drop the other connection. 2.9. The RESUME command A client uses the RESUME command to discover how much message data the server has stored for a given , which is specific to that client. If the server has no resume state associated with the transaction ID, it SHALL reply with a 355 code and an of "0". If the server has resume state associated with the transaction ID the server SHALL reply with a 355 code and the of the end of the message data received by the server so far, which can be less than or equal to the message size. The field in a 355 reply is REQUIRED to be the first thing on the first line of the reply. It MUST be separated from any following by linear whitespace. The can vary and MUST be ignored by the client. For example, 355 64735 is the transaction offset The client MUST NOT issue a RESUME command during a MAIL transaction. The server SHOULD reject a RESUME command issued during a transaction with a "503 Bad sequence of commands" reply. After reconnecting, clients SHOULD issue RESUME commands for all the transactions in a lost connection, not just for the transactions that were interrupted. This is so that when the client QUITs the server can discard all their resume state. See Section 2.11. 2.10. The TRANSID and TRANSOFF parameters A client SHALL start or resume a resumable transaction by issuing a MAIL command with one TRANSID parameter and one TRANSOFF parameter. Finch Expires July 27, 2007 [Page 11] Internet-Draft SMTP transaction checkpointing January 2007 It MUST NOT include more than one of either parameter, or omit either parameter. (However see Section 3 for the meaning of a TRANSID parameter without a TRANSOFF parameter.) If this is a new transaction, the TRANSOFF MUST be 0 (zero). The server SHALL discard any resume state that it may have retained for the given , which is specific to that client. The transaction then proceeds as normal. If the client is resuming a transaction, then it MUST have previously issued a RESUME command in this connection with the same as in the TRANSID parameter. The value of the TRANSOFF parameter MUST be the same as the given in the server's reply to the RESUME command. If the client issues a MAIL command with a non-zero TRANSOFF that does not meet these requirements then the server SHALL reply with "503 Bad sequence of commands". Otherwise the server SHALL give the same reply to the MAIL command that it retained in the transaction's resume state. After the client has issued a MAIL command with a non-zero TRANSOFF, it MAY re-issue the transaction's RCPT commands. The server SHALL give the same reply to each RCPT command that it retained in the transaction's resume state. If the client issues a RCPT command that was not retained in the resume state then the server SHALL reject it with a "553 Requested action not taken: mailbox name not allowed" reply. After the client has issued the envelope commands, it SHALL issue another data transfer command (e.g. DATA or BDAT) and send the remaining message data starting from the . If the is equal to the message size, then the client SHALL send no data (e.g. DATA. or BDAT 0 LAST) and the server SHALL give the reply it retained in the transaction's resume state. 2.11. The QUIT command The client MUST NOT pipeline the QUIT command; that is, it SHALL wait to receive the replies to all outstanding commands before issuing the QUIT command. Note that this is stricter than [RFC2920] which allows the QUIT command to appear as the last command in a pipelined group. This requirement means that when the server receives a QUIT command, it can be sure that the client has received all the replies to all the transactions in the connection, so the server MAY then discard any resume state associated with these transactions. Finch Expires July 27, 2007 [Page 12] Internet-Draft SMTP transaction checkpointing January 2007 3. SMTP service extension for checkpoint/restart This section defines checkpoint/restart according to the SMTP extension framework specified in [RFC2821]. The working of this extension is described in terms of checkpoint/resume. We go on to list differences between this specification and [RFC1845]. Finally, we explain why checkpoint/resume is better than checkpoint/restart. 3.1. Framework The SMTP service extension for checkpoint/restart is defined as follows: o The name of the service extension is "checkpoint/restart". o The EHLO keyword associated with the extension is "CHECKPOINT". o The CHECKPOINT EHLO keyword has no parameters. o This extension defines one additional parameter to the MAIL command. The TRANSID parameter has the syntax defined in Section 2. o The maximum length of the MAIL command is increased by 88 characters. The maximum length of a is 77 characters. If the server also supports the checkpoint/resume extension, then its larger limits apply instead of (not in addition to) these limits. o This extension defines the 355 reply to the MAIL command which has the syntax defined in Section 2. o There are no additional parameters to the RCPT command defined by this extension and its maximum length is not increased. o There are no additional commands defined by this extension. o This extension is suitable for use with message submission [RFC4409]. 3.2. Description A server that supports the SMTP service extension for checkpoint/ restart SHALL include the CHECKPOINT keyword in its response to the client's EHLO command. A client that wishes to use this extension MUST issue a MAIL command with a TRANSID parameter and without a TRANSOFF parameter. Such a MAIL command MUST appear as the last command in a pipelined group; note that this is stricter than the Finch Expires July 27, 2007 [Page 13] Internet-Draft SMTP transaction checkpointing January 2007 usual pipelining requirements for the MAIL command specified in [RFC2920]. Issuing a MAIL FROM:<...> TRANSID= command to a server that supports checkpoint/restart is equivalent to issuing the following pair of commands to a server that supports checkpoint/ resume: RESUME MAIL FROM:<...> TRANSID= TRANSOFF= The value of the TRANSOFF parameter is the same as the given in the server's hypothetical 355 reply to the RESUME command. One of the two hypothetical replies to this pair of checkpoint/resume commands is given in response to the equivalent real checkpoint/ restart MAIL...TRANSID command. The reply is chosen according to the follwing rules: o If the hypothetical reply to the RESUME command would have indicated failure (i.e. not 355) then that is used as the real reply; o Otherwise, if the hypothetical reply to the MAIL FROM command would have indicated failure (i.e. not 250) then that is used as the real reply; o Otherwise, if the hypothetical given by the server's 355 reply and used in the MAIL FROM command would have been zero, then the hypothetical 250 reply to the MAIL FROM command is used as the real reply; (This is the case for new transactions.) o Otherwise, the hypothetical 355 reply to the RESUME command is used as the real reply. (This is the case for resumed transactions.) When a client wishes to start a new transaction that the server will checkpoint so that it can be resumed in the event of a lost connection, it SHALL issue a MAIL...TRANSID command. If it does not get the expected 250 reply (which indicates an accidental or malicious transaction ID collision) it SHALL issue a RSET command to reset the transaction, and re-issue the MAIL...TRANSID command. When a client wishes to resume a transaction after a lost connection, it SHALL issue the same MAIL...TRANSID command again. If it receives a 250 reply, it MUST repeat the transaction in full. If it receives Finch Expires July 27, 2007 [Page 14] Internet-Draft SMTP transaction checkpointing January 2007 a 355 reply, it MAY re-issue the rest of the transaction's envelope commands, then it MUST resume transmission of the message data at the exact indicated in the reply. In all other respects this extension works in the same way as checkpoint/resume (Section 2). 3.3. Changes from RFC 1845 This section lists differences between checkpoint/restart as specified in this memo and as specified in [RFC1845]. We also give the reasons for each change. o Interworking with pipelining [RFC2920] is improved in a number of ways: * We explicitly state that a checkpoint/restart MAIL command has stricter pipelining requirements than specified in [RFC2920]. * A client that pipelines its envelope commands and message data (using BDAT [RFC3030] or BURL [RFC4468]) can lose the server's envelope replies when a connection is lost. Clients may now re-issue envelope commands when resuming a transaction. * A client can pipeline the end of one transaction with the start of the next, so a new transaction does not indicate that the previous transaction is complete from the client's point of view. Servers are no longer permitted to discard a transaction's resume state at this point. o We now specify the semantics of the when the message data is transmitted using the BDAT [RFC3030] or BURL [RFC4468] commands. o The server's data retention requirements have been loosened. This allows a server to advertise support for checkpointing to less trustworthy clients, with less security exposure. See Section 4. o In [RFC1845] the is coupled to the client's EHLO host name. This specification prefers to use higher-level authenticated client identities where possible, so that mobile clients can re-connect from a different IP address (which implies a different EHLO host name) and resume interrupted transactions. o This specification allows a client to re-connect to the server sooner when resuming a transaction than is usual for normal SMTP retries, as specified in [RFC2821] section 4.5.4. This is friendlier to message submission clients that involve a human in Finch Expires July 27, 2007 [Page 15] Internet-Draft SMTP transaction checkpointing January 2007 the retry strategy. o We have identified some security concerns that are discussed in Section 4. o A couple of trivial matters: * [RFC2119] keywords for normative requirements. * Corrected octet counts in the MAIL command length limit increase. 3.4. Prefer checkpoint/resume to checkpoint/restart Even after updating and clarifying [RFC1845], there remains a significant problem with checkpoint/restart, which is why this memo defines and prefers the not-quite-backwards-compatible variant checkpoint/resume. The problem is that checkpoint/restart adds a pipeline stall to each transaction. This increases the time taken to send a message, which is particularly undesirable for message submission clients on high- latency connections. It also prevents clients from using the BDAT command [RFC3030] to stream messages to the server without pipeline stalls. This is particularly unfortunate because losing a connection during pipelined streaming can affect multiple messages, so it is desirable to have some way of recovering the state of transactions after a lost connection. The pipeline stall in checkpoint/restart has two functions, which checkpoint/resume handles in ways that avoid the stall. o In checkpoint/restart, the server tells the client whether the transaction is new or is being resumed. In checkpoint/resume, this information goes from the client to the server, which removes the need for an extra round trip in normal transactions. o When recovering from a lost connection, the client must find out the at which it will resume transmitting message data. This implies a client-server round trip before resuming the transaction. In checkpoint/resume this round trip is detached from the transaction using the RESUME command. The RESUME command can be pipelined, so if multiple transactions are to be resumed only one extra round trip per connection is needed, not one per message. Finch Expires July 27, 2007 [Page 16] Internet-Draft SMTP transaction checkpointing January 2007 4. Security considerations The difficulties fall into three areas: additional storage requirements on the server; vulnerabilities associated with spoofed transaction IDs; and undesirable bounce messages. Both are significantly mitigated if checkpoint/resume is only permitted for trustworthy clients (which generally implies secure authentication). This section also applies to [RFC1845], which does not discuss security. 4.1. Server storage A significant difference between partial transactions and complete transactions is that the server can recover the storage used by complete transactions by delivering their messages. Thus if a server gives partial transactions a long lifetime, it can be easier for an attacker to exhaust the server's disk space than with un-extended SMTP. The attacker does not have to outrun the server's ability to process messages, nor search for a destination address to which the server cannot deliver immediately (such as an over-quota user). The extensions in this memo permit a server to choose whether or not to store partial messages according to operational needs. For example, partial transaction storage might be permitted for authorized message submission clients, but MX clients might be required to recover from failed transactions by retrying the transaction from the start. When partial messages are stored, their lifetime should be scaled according to the typical time it takes for a client to recover from a lost connection, which will depend on the operational environment but will often be on the order of several minutes. Once a transaction is complete, the server still keeps some resume state so that clients can recover from connection loss during the [RFC1047] synchronization gap. Abusive clients can waste this space by failing to close connections cleanly with the QUIT command. The server has to restrict the lifetime of this resume state in a similar way to partial transactions. This resume state is relatively small, and is important for correctness (avoidance of duplicate messages) so its lifetime can be longer than that of partial messages. This memo does not provide a mechanism for clients to discover the lifetime of partial transactions and resume state on the server. Finch Expires July 27, 2007 [Page 17] Internet-Draft SMTP transaction checkpointing January 2007 4.2. Transaction IDs If an attacker can guess a client's transaction IDs, it can perform a variety of attacks based on confusing a client about the state of a transaction or inserting unwanted data into a message. These attacks are made much easier by the extensions defined in this memo than in un-extended SMTP. We reduce the opportunity for attack by making transaction IDs unguessable and by tying them to the client's authenticated identity. Resumable transactions can be used to turn a truncation attack into a message modification attack. If an attacker can cause the client's connection to break in the middle of a message, the attacker can resume the transaction and append any data to the end of the message. TCP truncation attacks are much easier than TCP modification attacks. Un-guessable transaction IDs prevent attackers that cannot sniff the client-server connection (e.g. because they are off the client-server path or because the connection has a security layer) from doing this. Unguessability can be achieved by including enough random data. [RFC4086] provides some guidelines on how to do this securely. An attacker that guesses a transaction ID before the client uses it could potentially append data to the start of the message. In checkpoint/resume this is prevented by the client asserting that a transaction is new. In checkpoint/restart this is prevented by requiring clients to detect this attack and reset the transaction. As well as these client-side protections, the server has to tie transaction IDs to the client's authenticated identity. Even if an attacker can guess a transaction ID, it also has to break the client's authentication credentials in order to succeed. If it doesn't manage to do both then the server will consider the client's and attacker's transactions to be different. 4.3. Unnecessary bounces It is generally preferable for SMTP servers to reject messages during the SMTP transaction instead of accepting them and later generating a bounce message. This allows message submission clients to present the error to the user immediately in a friendly manner. It also reduces the problem of backscatter from forged spam, assuming that spam sending software has a partial SMTP implementation that doesn't emit bounces when its spam is rejected. The RESUME specification requires a server to honor its original responses to the RCPT commands. If the status of a recipient address changes from good to bad between the start of a transaction and its Finch Expires July 27, 2007 [Page 18] Internet-Draft SMTP transaction checkpointing January 2007 completion, it would seem that the server is forced to accept the message then generate a bounce. This should be a rare situation if we assume clients will prefer to resume transactions promptly. However it is possible for servers to reduce the problem by giving a negative response to the end of the message data: in general a 450 response if not all the recipients' statuses have changed, so that the client will retry the delivery later. Finch Expires July 27, 2007 [Page 19] Internet-Draft SMTP transaction checkpointing January 2007 5. IANA Considerations This memo defines the SMTP service extension for checkpoint/resume, with the EHLO keyword value "RESUME", in Section 2. This memo replaces RFC 1845 as the definition of the SMTP service extension for checkpoint/restart, with the EHLO keyword value "CHECKPOINT", in Section 3. Finch Expires July 27, 2007 [Page 20] Internet-Draft SMTP transaction checkpointing January 2007 6. References 6.1. Normative references [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. [RFC2821] Klensin, J., "Simple Mail Transfer Protocol", RFC 2821, April 2001. [RFC2822] Resnick, P., "Internet Message Format", RFC 2822, April 2001. [RFC2920] Freed, N., "SMTP Service Extension for Command Pipelining", STD 60, RFC 2920, September 2000. [RFC3030] Vaudreuil, G., "SMTP Service Extensions for Transmission of Large and Binary MIME Messages", RFC 3030, December 2000. [RFC4234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", RFC 4234, October 2005. [RFC4409] Gellens, R. and J. Klensin, "Message Submission for Mail", RFC 4409, April 2006. 6.2. Informative references [RFC1047] Partridge, C., "Duplicate messages and SMTP", RFC 1047, February 1988. [RFC1845] Crocker, D. and N. Freed, "SMTP Service Extension for Checkpoint/Restart", RFC 1845, September 1995. [RFC1870] Klensin, J., Freed, N., and K. Moore, "SMTP Service Extension for Message Size Declaration", STD 10, RFC 1870, November 1995. [RFC2554] Myers, J., "SMTP Service Extension for Authentication", RFC 2554, March 1999. [RFC3207] Hoffman, P., "SMTP Service Extension for Secure SMTP over Transport Layer Security", RFC 3207, February 2002. [RFC3461] Moore, K., "Simple Mail Transfer Protocol (SMTP) Service Extension for Delivery Status Notifications (DSNs)", RFC 3461, January 2003. Finch Expires July 27, 2007 [Page 21] Internet-Draft SMTP transaction checkpointing January 2007 [RFC4086] Eastlake, D., Schiller, J., and S. Crocker, "Randomness Requirements for Security", BCP 106, RFC 4086, June 2005. [RFC4468] Newman, C., "Message Submission BURL Extension", RFC 4468, May 2006. Finch Expires July 27, 2007 [Page 22] Internet-Draft SMTP transaction checkpointing January 2007 Appendix A. Draft discussion venue Feedback about this draft should be emailed to the author, or to the IETF SMTP discussion mailing list, , or to the Lemonade working group mailing list, . Finch Expires July 27, 2007 [Page 23] Internet-Draft SMTP transaction checkpointing January 2007 Author's Address Tony Finch University of Cambridge Computing Service New Museums Site Pembroke Street Cambridge CB2 3QH ENGLAND Phone: +44 797 040 1426 Email: dot@dotat.at URI: http://dotat.at/ Finch Expires July 27, 2007 [Page 24] Internet-Draft SMTP transaction checkpointing January 2007 Full Copyright Statement Copyright (C) The IETF Trust (2007). This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights. This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Intellectual Property The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79. Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr. The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org. Acknowledgment Funding for the RFC Editor function is provided by the IETF Administrative Support Activity (IASA). Finch Expires July 27, 2007 [Page 25]