Addressing and sessions

One thing I skimmed over in my previous article is addressing. A system’s addressing architecture is often a good basis for explaining the rest of its architecture.

The Internet’s addressing architecture was originally very simple. There were straightforward mappings between host names and IP addresses, and between service names and port numbers. The general model was that of academic computing, where a large central host provides a number of different services to its users.
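As a minimal sketch of that original model, the two mappings can be treated as independent lookup tables. The host name and address below are made up for illustration (192.0.2.0/24 is reserved for documentation), and `locate` is a hypothetical helper, not a real API; the port numbers are the well-known ones.

```python
# Illustrative tables only: the host name and IP address are invented.
HOSTS = {"mainframe.example.edu": "192.0.2.10"}
SERVICES = {"ftp": 21, "telnet": 23, "smtp": 25}


def locate(host_name: str, service_name: str) -> tuple[str, int]:
    """Resolve a (host, service) pair with two independent table lookups."""
    return HOSTS[host_name], SERVICES[service_name]


print(locate("mainframe.example.edu", "smtp"))
```

The point is that neither lookup knows anything about the other: the host name says nothing about the service, and the port says nothing about the host.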

However, it isn’t completely clean: port numbers aren’t just used to identify services, they are also used for multiplexing. Furthermore, multi-homing adds complexity to the host addressing model.

This simplicity didn’t survive beyond the mid-1990s, because it is too limiting when you get away from mainframes. Nowadays it is common for multiple host names to map to the same IP address, or for a host name to map to multiple IP addresses. We often run multiple instances of the same service on a host, rather than single instances of different services. A set of related services (such as IMAP/POP/SMTP) is often run on different (but related) hosts.

One thing that the Internet does have now that it didn’t then is a well-developed application-level addressing system - the Uniform Resource Identifier. (Probably the most interesting early application-level address is the email address, followed by pre-URL ftp locators.) One consequence of the over-simple foundation that URIs are built on is that they end up being somewhat redundant: e.g. the www in <http://www.cam.ac.uk/> or the second imap in <imap://fanf2@imap.hermes.cam.ac.uk/>.
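The redundancy can be seen by pulling a URI apart with Python's standard `urllib.parse`. This is only a rough sketch: `redundant_label` is a hypothetical helper, and treating "www" as an alias for the http scheme is an assumption made for the example.

```python
from urllib.parse import urlsplit

# Assumption for this sketch: "www" is the conventional host label for web servers.
ALIASES = {"http": {"www"}, "https": {"www"}}


def redundant_label(uri: str) -> bool:
    """Return True if the first host-name label repeats the URI's service."""
    parts = urlsplit(uri)
    first_label = (parts.hostname or "").split(".")[0]
    return first_label == parts.scheme or first_label in ALIASES.get(parts.scheme, set())


for uri in ("http://www.cam.ac.uk/", "imap://fanf2@imap.hermes.cam.ac.uk/"):
    print(uri, redundant_label(uri))
```

In both URIs the service is named twice, once in the scheme and once in the host name, because the host-name lookup has no other way to learn which service is wanted.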

In my model I divide the problem into addressing, routing, and multiplexing. Addresses are used to establish a session, including selection of the service, and they are only loosely coupled to the route to the server. Routing gets packets between the programs at either end, so there are multiple routing endpoints per host to support concurrent sessions. Multiplexing within the session is no longer muddled with service selection: it just divides the packets into requests or streams etc.
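One way to picture the three-way split is as three separate data types. This is a speculative sketch, not a protocol design: the class names, fields, and the `deliver` method are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Address:
    """Application-level address: names the service, not the machine."""
    service: str
    name: str


@dataclass(frozen=True)
class Route:
    """Where packets go: a host plus one endpoint per concurrent session."""
    host: str
    endpoint: int


@dataclass
class Session:
    """Binds an address to a route; stream ids only multiplex within it."""
    address: Address
    route: Route
    streams: dict[int, list[bytes]] = field(default_factory=dict)

    def deliver(self, stream_id: int, payload: bytes) -> None:
        # Multiplexing: sort payloads into streams, with no service selection here.
        self.streams.setdefault(stream_id, []).append(payload)


addr = Address("imap", "fanf2@example.org")
first = Session(addr, Route("192.0.2.20", 1))
second = Session(addr, Route("192.0.2.20", 2))  # same host, separate endpoint
first.deliver(1, b"request A")
first.deliver(2, b"request B")
```

Two sessions to the same address get distinct routing endpoints on the same host, and the stream id never has to say which service is meant.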

In the previous article I said that if you squint you can view DNS as a vestigial session layer, which does the mapping from application-level addresses to routes. Note that in most cases the DNS lookup doesn’t include any mention of the service, which is why it gets encoded in host names as I pointed out above. Some applications make more advanced use of the DNS and avoid the problem, which is why you can have email addresses and Jabber IDs like <fanf2@cam.ac.uk> rather than <fanf2@mx.cam.ac.uk> or <fanf2@chat.cam.ac.uk>.
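A sketch of what such a service-aware lookup looks like, using an in-memory table in place of real DNS queries. The records are illustrative rather than the actual contents of the cam.ac.uk zone, and `servers_for` is a hypothetical helper. The `_service._tcp.domain` naming follows the DNS SRV convention, 25 is the standard SMTP port, and 5222 is the usual XMPP client port.

```python
# Illustrative records only: not the real contents of any DNS zone.
RECORDS = {
    ("cam.ac.uk", "MX"): ["mx.cam.ac.uk"],
    ("_xmpp-client._tcp.cam.ac.uk", "SRV"): [("chat.cam.ac.uk", 5222)],
}


def servers_for(address: str, service: str) -> list[tuple[str, int]]:
    """Map user@domain plus a service name to (host, port) candidates."""
    _user, domain = address.split("@", 1)
    if service == "smtp":
        # Mail uses MX records; the port is implied by the service.
        return [(host, 25) for host in RECORDS.get((domain, "MX"), [])]
    # Other services put the service name into the query itself.
    return list(RECORDS.get((f"_{service}._tcp.{domain}", "SRV"), []))


print(servers_for("fanf2@cam.ac.uk", "smtp"))
print(servers_for("fanf2@cam.ac.uk", "xmpp-client"))
```

Because the service is part of the query rather than part of the host name, the same short address works for both mail and chat.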

The full session layer I have in mind is much more dynamic than this, though. It ought to be an elegant replacement for the routing and reliability hacks that we currently use, such as round-robin DNS, load-balancing routers, application-level redirecting proxies, BGP anycast, and so on. Think of something like IMAP mailbox referrals or HTTP redirects, but implemented in an application-agnostic manner.
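To make the idea slightly more concrete, here is a toy sketch of referral-following done once, below the application, instead of separately in each protocol. Everything in it is hypothetical: the `REFERRALS` table, the `establish` function, the addresses, and the hop limit are invented for illustration and do not describe any existing system.

```python
# Hypothetical session-layer state: an address either refers to another
# address or resolves to a route (here just a documentation-range IP).
REFERRALS = {
    ("imap", "fanf2@example.org"): ("refer", ("imap", "fanf2@store2.example.org")),
    ("imap", "fanf2@store2.example.org"): ("route", "192.0.2.20"),
}


def establish(address: tuple[str, str], max_hops: int = 5) -> str:
    """Follow referrals until a route is found, whatever the service is."""
    for _ in range(max_hops):
        kind, value = REFERRALS[address]
        if kind == "route":
            return value
        address = value  # referral: retry with the new address
    raise RuntimeError("too many referrals")


print(establish(("imap", "fanf2@example.org")))
```

The application only ever sees the original address; where the session actually lands is decided by the layer underneath.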

All very pie/sky…