Just been to a party hosted by 👤j4 and 👤sion_a. Nice and small and friendly. Had some excellent old well-matured beer – a Belgian lambic and some local beer left over from the New Year’s party, which had undergone some secondary fermentation. A unique brew now enjoyed to the full. Food-wise there was some fresh parkin, which was gorgeous but fearsomely rich, and some interesting choccy brought by 👤simonb, of which the one with crystallized ginger was utterly sublime.
Spent the afternoon reading about text classification algorithms and writing a short essay on making better use of them in the fight against spam. On Thursday night after the pub I decided to find out whether anyone in the Computer Laboratory was interested in this problem and could help me learn more about the state of the art and how to improve on what’s out there so far. On Friday morning I got a nice reply from Ted Briscoe, who’s intending to use exactly this application as part of a grant proposal. Nice.
Then on Friday evening at cam.pm Hanna Wallach turned up and I managed to enthuse her about the whole thing. I don’t think that building knowledge into the parser that feeds features into the statistical engine (as in Grahamian filters) is the right way, since it’s very vulnerable to intelligent attack by spammers. I’m inclined to rely on clever maths synthesizing higher-level information from a braindead parser – the KISS principle. Hanna was the first person I’d spoken to who actually knows something about the field, and she agrees with my approach, which is very encouraging. And she’s cute :-)
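To make the "braindead parser plus statistics" idea a bit more concrete, here is a rough sketch of what I have in mind. It is only an illustration: it uses plain naive Bayes with add-one smoothing as a stand-in for the cleverer maths, and the names `tokenize`, `NaiveBayes` and `spam_log_odds` are made up for this example rather than taken from any existing filter.

```python
import math
from collections import Counter


def tokenize(text):
    # Deliberately dumb parser: lowercase and split on whitespace.
    # No knowledge of headers, HTML or spammer tricks is built in.
    return text.lower().split()


class NaiveBayes:
    """Hypothetical minimal classifier; all the work is in the counting."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.counts[label].update(tokenize(text))
        self.docs[label] += 1

    def spam_log_odds(self, text):
        # Vocabulary size, plus one slot for tokens never seen in training.
        vocab = len(set(self.counts["spam"]) | set(self.counts["ham"])) + 1
        # Smoothed prior odds of spam versus ham.
        score = math.log((self.docs["spam"] + 1) / (self.docs["ham"] + 1))
        for label, sign in (("spam", 1), ("ham", -1)):
            total = sum(self.counts[label].values())
            for tok in tokenize(text):
                # Add-one smoothed per-token likelihood under this label.
                p = (self.counts[label][tok] + 1) / (total + vocab)
                score += sign * math.log(p)
        return score


nb = NaiveBayes()
nb.train("buy cheap pills now", "spam")
nb.train("cheap pills cheap offer", "spam")
nb.train("meeting at the pub on friday", "ham")
nb.train("lambic beer at the party", "ham")

# Positive log-odds lean towards spam, negative towards ham.
print(nb.spam_log_odds("cheap pills"))
print(nb.spam_log_odds("beer at the pub"))
```

The point of the design is that the tokenizer has nothing in it for a spammer to outwit; anything smarter would have to come from the statistical side.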
So this is all good stuff. The job I’m doing was always going to be a mixed sysadmin/developer thing, but being able to get a bit of developer/research into the mix, along with a bit of training perhaps, is ace.
PS. Going into news-free mode in order to avoid war-induced depression.