What I am working on

I feel like https://www.youtube.com/watch?v=Zhoos1oY404

Most of the items below involve chunks of software development that are or will be open source. Too much of this, perhaps, is me saying "that's not right" and going in and fixing things or re-doing it properly...

Exchange 2013 recipient verification and LDAP, autoreplies to bounce messages, Exim upgrade, and logging

We have a couple of colleges deploying Exchange 2013, which means I need to implement a new way to do recipient verification for their domains. Until now, we have relied on internal mail servers rejecting invalid recipients when they get a RCPT TO command from our border mail relay. Our border relay rejects invalid recipients using Exim's verify=recipient/callout feature, so when we get a RCPT TO we try it on the target internal mail server and immediately return any error to the sender.

However Exchange 2013 does not reject invalid recipients until it has received the whole message; it rejects the entire message if any single recipient is invalid. This means if you have several users on a mailing list and one of them leaves, all of your users will stop receiving messages from the list and will eventually be unsubscribed.

The way to work around this problem is (instead of SMTP callout verification) to do LDAP queries against the Exchange server's Active Directory. To support this we need to update our Exim build to include LDAP support so we can add the necessary configuration.

Another Exchange-related annoyance is that we (postmaster) get a LOT of auto-replies to bounce messages, because Exchange does not implement RCF 3834 properly and instead has a non-standard Microsoft proprietary header for suppressing automatic replies. I have a patch to add Microsoft X-Auto-Response-Suppress headers to Exim's bounce messages which I hope will reduce this annoyance.

When preparing an Exim upgrade that included these changes, I found that exim-4.86 included a logging change that makes the +incoming_interface option also log the outgoing interface. I thought I might be able to make this change more consistent with the existing logging options, but sadly I missed the release deadline. In the course of fixing this I found Exim was running out of space for logging options so I have done a massive refactoring to make them easily expandible.

gitweb improvements

For about a year we have been carrying some patches to gitweb which improve its handling of breadcrumbs, subdirectory restrictions, and categories. These patches are in use on our federated gitolite service, git.csx. They have not been incorporated upstream owing to lack of code review. But at last I have some useful feedback so with a bit more work I should be able to get the patches committed so I can run stock gitweb again.

Delegation updates, and superglue

We now subscribe to the ISC SNS service. As well as running f.root-servers.net, the ISC run a global anycast secondary name service used by a number of ccTLDs and other deserving organizations. Fully deploying the delegation change for our 125 public zones has been delayed an embarrassingly long time.

When faced with the tedious job of updating over 100 delegations, I think to myself, I know, I'll automate it! We have domain registrations with JANET (for ac.uk and some reverse zones), Nominet (other .uk), Gandi (non-uk), and RIPE (other reverse zones), and they all have radically different APIs: EPP (Nominet), XML-RPC (Gandi), REST (RIPE), and, um, CasperJS (JANET).

But the automation is not just for this job: I want to be able to automate DNSSEC KSK rollovers. My plan is to have some code that takes a zone's apex records and uses the relevant registrar API to ensure the delegation matches. In practice KSK rollovers may use a different DNSKEY RRset than the zone's published one, but the principle is to make the registrar interface uniformly dead simple by encapsulating the APIs and non-APIs.

I have some software called "superglue" which nearly does what I want, but it isn't quite finished and at least needs its user interface and internal interfaces made consistent and coherent before I feel happy suggesting that others might want to use it.

But I probably have enough working to actually make the delegation changes so I seriously need to go ahead and do that and tell the hostmasters of our delegated subdomains that they can use ISC SNS too.

Configuration management of secrets

Another DNSSEC-related difficulty is private key management - and not just DNSSEC, but also ssh (host keys), API credentials (see above), and other secrets.

What I want is something for storing encrypted secrets in git. I'm not entirely happy with existing solutions. Often they try to conceal whether my secrets are in the clear or not, whereas I want it to be blatantly obvious whether I am in the safe or risky state. Often they use home-brew crypto whereas I would be much happier with something widely-reviewed like gpg.

My current working solution is a git repo containing a half-arsed bit of shell and a makefile that manage a gpg-encrypted tarball containing a git repo full of secrets. As a background project I have about 1/3rd of a more refined "git privacy guard" based on the same basic principle, but I have not found time to work on it seriously since March. Which is slightly awkward because when finished it should make some of my other projects significantly easier.

DNS RPZ, and metazones

My newest project is to deploy a "DNS firewall", that is, blocking access to malicious domains. The aim is to provide some extra coverage for nasties that get through our spam filters or AV software. It will use BIND's DNS response policy zones feature, with a locally-maintained blacklist and whitelist, plus subscriptions to commercial blacklists.

The blocks will only apply to people who use our default recursive servers. We will also provide alternative unfiltered servers for those who need them. Both filtered and unfiltered servers will be provided by the same instances of BIND, using "views" to select the policy.

This requires a relatively simple change to our BIND dynamic configuration update machinery, which ensures that all our DNS servers have copies of the right zones. At the moment we're using ssh to push updates, but I plan to eliminate this trust relationship leaving only DNS TSIG (which is used for zone transfers). The new setup will use nsnotifyd's simplified metazone support.

I am amused that nsnotifyd started off as a quick hack for a bit of fun but rapidly turned out to have many uses, and several users other than me!

Other activities

I frequently send little patches to the BIND developers. My most important patch (as in, running in production) which has not yet been committed upstream is automatic size limits for zone journal files.

Mail and DNS user support. Say no more.

IETF activities. I am listed as an author of the DANE SRV draft which is now in the RFC editor queue. (Though I have not had the tuits to work on DANE stuff for a long time now.)

Pending projects

Things that need doing but haven't reached the top of the list yet include:

DHCP server refresh: upgrade OS to the same version as the DNS servers and combine the DHCP Ansible setup into the main DNS Ansible setup.
Federated DHCP log access: so other University institutions that are using our DHCP service have some insight into what is happening on their networks.
Ansible management of the ipreg stuff on Jackdaw.
DNS records for mail authentication: SPF/DKIM/DANE. (We have getting on for 400 mail domains with numerous special cases so this is not entirely trivial.)
Divorce ipreg database from Jackdaw.
Overhaul ipreg user interface or replace it entirely. (Multi-year project.)