.@ Tony Finch – blog


I had a surprisingly productive day today, considering I’m still suffering a bit from the tail end of a fever I had over the weekend. The main things I wanted to do were to upgrade the Exim configuration on ppswitch with a few minor changes, disable insecure access to Hermes for those people who have not used it since the middle of November, and prepare an announcement about it.

The Exim change became slightly more interesting than it might otherwise have been following a complaint from a user who was being irritated by blank messages. I investigated this a bit and became somewhat confused by Exim’s handling of its $message_size variable. It turned out that all that is required to stop junk blank messages is to put deny condition = ${if ={0}{$message_size} } in the right place, but this is by no means obvious since $message_size has four slightly different meanings in various circumstances.

It became even more “interesting” when, during the roll-out of the new config, one of the changes alerted me to a long-standing bug. PPswitch performs two levels of address verification: either a basic check of the plausibility of the mail domain after the @, or a full call-out check which aims to validate the local part before the @ too. The latter is a bit problematic because of the quantity of legitimate but misconfigured email out there, so we have a list of domains for which we do callouts, which includes domains that are well-configured and frequent victims of forged spam, such as aol.com.

The change was supposed to add Cambridge domains to this list, because part of the long-standing bug was that we weren’t thoroughly checking email for systems like CUS which ppswitch doesn’t know everything about. However, after the upgrade ppswitch started doing call-out verification for all sender addresses! The problem I hadn’t noticed was that when we were doing sender verification, we were checking the recipient’s domain against the list rather than the sender’s domain; since the recipient is always (in this context) a Cambridge address the callout was always happening. This was caused by using the domains condition instead of the sender_domains condition: a common mistake…
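The mix-up is easier to see in a toy model. The following is only a Python sketch of the logic, not Exim ACL syntax; the domain list, the function names and the example addresses are all invented for illustration.

```python
# Toy model of the callout decision. Everything here is illustrative:
# the real check lives in an Exim ACL, not in Python.

# Hypothetical list of domains for which full callout verification is done.
CALLOUT_DOMAINS = {"aol.com", "cam.ac.uk"}


def domain_of(address: str) -> str:
    """Return the lower-cased domain part after the last @."""
    return address.rpartition("@")[2].lower()


def needs_callout_buggy(sender: str, recipient: str) -> bool:
    # Analogue of testing the recipient's domain: during sender
    # verification the recipient is always local, so this is always true
    # once local domains are on the list.
    return domain_of(recipient) in CALLOUT_DOMAINS


def needs_callout_fixed(sender: str, recipient: str) -> bool:
    # Analogue of testing the sender's domain, which is what was intended.
    return domain_of(sender) in CALLOUT_DOMAINS
```

With a local recipient such as user@cam.ac.uk, the buggy version asks for a callout whatever the sender is, while the fixed version only does so for senders in the listed domains.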

My checks in preparation for removing people from the insecure list threw up another lurking bug. Our audit script which checks for misconfigured users was failing to notice any ~/mail users since the start of the new year. This turned out to be because my regex for extracting yesterday’s log lines assumed dates in the form Jan 04 whereas they were actually in the form Jan 4. Bah! syslogd really is the pits. I had to write a script to retro-analyse 8 days’ data, which required a fair amount of care, and I also had to ensure that this did not cause me to falsely class people as safe to remove from the list.
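For illustration, here is a minimal Python sketch of this kind of date-matching bug and one tolerant fix. It is not the audit script itself: the function names and the sample log line are invented, and the year is arbitrary. Traditional syslog timestamps pad single-digit days with a space rather than a zero, so the tolerant pattern accepts one or more spaces and an optional leading zero.

```python
import re
from datetime import date

# Fixed English abbreviations, as used in traditional syslog timestamps.
MONTHS = "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split()


def naive_pattern(day: date) -> re.Pattern:
    """Assumes a zero-padded day such as 'Jan 04' (the buggy assumption)."""
    return re.compile(rf"^{MONTHS[day.month - 1]} {day.day:02d} ")


def tolerant_pattern(day: date) -> re.Pattern:
    """Accepts 'Jan 4', 'Jan  4' and 'Jan 04', but not 'Jan 14' or 'Jan 40'."""
    return re.compile(rf"^{MONTHS[day.month - 1]} +0?{day.day} ")


# Invented sample line in space-padded syslog style.
SAMPLE = "Jan  4 03:12:01 hermes imapd[123]: hypothetical log entry"


def matches(pattern: re.Pattern, line: str) -> bool:
    return pattern.match(line) is not None
```

Against the sample line, the naive pattern for 4 January fails to match and the tolerant one succeeds. From the 10th of the month onwards both patterns agree, which is why a bug like this only shows up in the first nine days of a month.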

Anyway, all that meant I didn’t manage to get the announcement out before the end of the afternoon. However, rather than spodding, I spent a little time fiddling with my Hermesified authentication module for jabberd-2.0. Hermes uses cdb files for most of its important configuration tables, including the password files, so I ripped the cdb code out of Exim and tidied it up a bit so that jabberd2 could use the same password files. This evening I brought mmap-based cdb reading back from the dead, which may improve its performance slightly (though probably to an immeasurable degree). After I’ve done some more testing on the dotat.at Jabber server it should be ready to contribute to the jabberd2 maintainers.
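To give a flavour of what mmap-based cdb reading involves, here is a small Python sketch written against the published cdb file format. It is not the Exim code or the jabberd2 module, which are in C; the function names are made up, and the writer exists only so that the reader has a file to look at.

```python
import mmap
import struct


def cdb_hash(key: bytes) -> int:
    """The cdb hash: start at 5381, then h = (h * 33) ^ byte, mod 2**32."""
    h = 5381
    for c in key:
        h = (((h << 5) + h) ^ c) & 0xFFFFFFFF
    return h


def write_cdb(path: str, items: dict) -> None:
    """Write a cdb file from a dict of bytes keys and values (test helper)."""
    buckets = [[] for _ in range(256)]
    with open(path, "wb") as f:
        f.write(b"\0" * 2048)          # placeholder for the 256-entry header
        pos = 2048
        for key, value in items.items():
            f.write(struct.pack("<II", len(key), len(value)) + key + value)
            h = cdb_hash(key)
            buckets[h & 255].append((h, pos))
            pos += 8 + len(key) + len(value)
        header = b""
        for bucket in buckets:
            nslots = 2 * len(bucket)   # tables are kept half empty
            slots = [(0, 0)] * nslots
            for h, rpos in bucket:
                i = (h >> 8) % nslots
                while slots[i][1] != 0:        # linear probing
                    i = (i + 1) % nslots
                slots[i] = (h, rpos)
            header += struct.pack("<II", pos, nslots)
            for slot in slots:
                f.write(struct.pack("<II", *slot))
            pos += 8 * nslots
        f.seek(0)
        f.write(header)


def cdb_get(path: str, key: bytes):
    """Look up key via a read-only memory map; return the value or None."""
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        h = cdb_hash(key)
        tpos, nslots = struct.unpack_from("<II", m, 8 * (h & 255))
        if nslots == 0:
            return None
        i = (h >> 8) % nslots
        for _ in range(nslots):
            slot_hash, rpos = struct.unpack_from("<II", m, tpos + 8 * i)
            if rpos == 0:              # empty slot: key is absent
                return None
            if slot_hash == h:
                klen, dlen = struct.unpack_from("<II", m, rpos)
                if m[rpos + 8:rpos + 8 + klen] == key:
                    start = rpos + 8 + klen
                    return m[start:start + dlen]
            i = (i + 1) % nslots
        return None
```

A lookup touches only the header entry, a few hash slots and one record, so with mmap the kernel pages in just those parts of the file. Mapping the file afresh on every call, as this sketch does, throws away most of that benefit; a long-running server would presumably keep the mapping open.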