.@ Tony Finch – blog


Following yesterday's chrooted /dev/random problem I thought I would try to improve BIND's error logging in this situation.

BIND has a function dst__openssl_toresult() which converts an OpenSSL error into an ISC_R_ error code. In the existing code it doesn't do much, just translating ERR_R_MALLOC_FAILURE into ISC_R_NOMEMORY. I added a case to convert ECDSA_R_RANDOM_NUMBER_GENERATION_FAILED into ISC_R_NOENTROPY. So now when I have a misconfigured /dev/random and I try to sign a zone with ECDSA, named logs "sign_apex:add_sigs: out of entropy" instead of "sign failure". This would probably have been enough to clue me in a lot faster yesterday.

After running with this patch for a while, I got a DNSSEC validation failure trying to resolve t.co. Weirdly no-one else was seeing the same problem, so I tried backing out my patch and it worked again. How on earth could ECDSA error handling affect RSA validation?!

There is an ambiguity in the DNSSEC specs to do with canonicalization of signer names in RRSIG records. In the .co TLD, the signer name is upper case, for instance:

    co.  RRSIG  DNSKEY 8 1 518400 20120908022542 20120809021825 2044 CO. (...)
    CO.  RRSIG  DNSKEY 8 1 518400 20120908022542 20120809021825 33228 CO. (...)

When it encounters one of these signatures, named logs this:

    17-Aug-2012 18:42:48.028 general: info:
        sucessfully validated after lower casing signer 'CO'

This message is triggered when named tries to validate the record verbatim, fails, retries in lower case, and succeeds:

	if (ret == DST_R_VERIFYFAILURE && !downcase) {
		downcase = ISC_TRUE;
		goto again;
	}

The DST_R_VERIFYFAILURE code comes from BIND's RSA code, and is passed through dst__openssl_toresult() in case the failure was caused by a memory allocation problem. The original error from OpenSSL is RSA_R_BAD_SIGNATURE, which dst__openssl_toresult() ought not to recognise, so it should pass on the DST_R_VERIFYFAILURE code unchanged. But OpenSSL's reason codes are not unique:

    openssl/rsa.h:   #define RSA_R_BAD_SIGNATURE                     104
    openssl/ecdsa.h: #define ECDSA_R_RANDOM_NUMBER_GENERATION_FAILED 104

You also have to check the OpenSSL library code (ERR_LIB_ECDSA in this case) to disambiguate its error codes. Sheesh.
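
To make the collision concrete, here is a rough sketch of the two approaches. It is not BIND's actual code: the macros below are stand-ins modelled on OpenSSL's packed error layout so that the example compiles without the OpenSSL headers, the numeric library values are assumptions, and the names toresult_naive(), toresult_checked() and result_t are made up for illustration.

```c
/* Stand-in definitions so the sketch compiles without OpenSSL headers.
 * They imitate the packed error layout (library number in the top byte,
 * reason in the low 12 bits); the numeric library values are assumptions. */
#define ERR_PACK(l, f, r) \
	((((unsigned long)(l) & 0xffUL) << 24) | \
	 (((unsigned long)(f) & 0xfffUL) << 12) | \
	 ((unsigned long)(r) & 0xfffUL))
#define ERR_GET_LIB(e)    ((int)(((e) >> 24) & 0xffUL))
#define ERR_GET_REASON(e) ((int)((e) & 0xfffUL))

#define ERR_LIB_RSA                             4   /* assumed value */
#define ERR_LIB_ECDSA                           42  /* assumed value */
#define ERR_R_MALLOC_FAILURE                    65  /* assumed value */
#define RSA_R_BAD_SIGNATURE                     104
#define ECDSA_R_RANDOM_NUMBER_GENERATION_FAILED 104

/* Hypothetical result codes standing in for ISC_R_* and DST_R_*. */
typedef enum { R_NOMEMORY, R_NOENTROPY, R_VERIFYFAILURE } result_t;

/* Looks at the reason code only, so an RSA bad signature (104) is
 * indistinguishable from an ECDSA random number failure (also 104). */
result_t
toresult_naive(unsigned long err, result_t fallback) {
	switch (ERR_GET_REASON(err)) {
	case ERR_R_MALLOC_FAILURE:
		return (R_NOMEMORY);
	case ECDSA_R_RANDOM_NUMBER_GENERATION_FAILED:
		return (R_NOENTROPY);
	default:
		return (fallback);
	}
}

/* Checks the library code as well as the reason code before treating
 * the error as an entropy failure. */
result_t
toresult_checked(unsigned long err, result_t fallback) {
	int lib = ERR_GET_LIB(err);
	int reason = ERR_GET_REASON(err);

	if (reason == ERR_R_MALLOC_FAILURE)
		return (R_NOMEMORY);
	if (lib == ERR_LIB_ECDSA &&
	    reason == ECDSA_R_RANDOM_NUMBER_GENERATION_FAILED)
		return (R_NOENTROPY);
	return (fallback);
}
```

Given an error packed as ERR_PACK(ERR_LIB_RSA, 0, RSA_R_BAD_SIGNATURE) and a fallback of R_VERIFYFAILURE, the naive version returns R_NOENTROPY while the checked version passes the fallback through unchanged.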

Because of this, BIND's validation code was getting ISC_R_NOENTROPY, so it did not do its lower case retry but simply failed to validate.
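
The knock-on effect on the retry can be modelled in a few lines. This is only a toy: verify_signer() is a hypothetical stand-in that pretends a signature verifies only when the signer name is in lower case, and passthrough() and colliding() are made-up converters representing the error translation step with and without the reason code collision.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical result codes standing in for ISC_R_* and DST_R_*. */
typedef enum { R_SUCCESS, R_VERIFYFAILURE, R_NOENTROPY } result_t;

/* Toy verifier: pretends the signature only checks out when the signer
 * name contains no upper case letters. */
result_t
verify_signer(const char *signer) {
	for (const char *p = signer; *p != '\0'; p++) {
		if (isupper((unsigned char)*p))
			return (R_VERIFYFAILURE);
	}
	return (R_SUCCESS);
}

typedef result_t (*convert_fn)(result_t);

/* Error conversion that leaves the verify failure alone. */
result_t
passthrough(result_t r) {
	return (r);
}

/* Error conversion hit by the collision: a verify failure comes out
 * looking like an entropy failure. */
result_t
colliding(result_t r) {
	return (r == R_VERIFYFAILURE ? R_NOENTROPY : r);
}

/* Same shape as the retry quoted earlier: try the name verbatim, and on
 * a verify failure try once more in lower case. */
result_t
validate(const char *signer, convert_fn convert) {
	char name[256];
	int downcase = 0;
	result_t ret;

	strncpy(name, signer, sizeof(name) - 1);
	name[sizeof(name) - 1] = '\0';
again:
	if (downcase) {
		for (char *p = name; *p != '\0'; p++)
			*p = (char)tolower((unsigned char)*p);
	}
	ret = convert(verify_signer(name));
	if (ret == R_VERIFYFAILURE && !downcase) {
		downcase = 1;
		goto again;
	}
	return (ret);
}
```

In this model validate("CO.", passthrough) takes the retry and succeeds, whereas validate("CO.", colliding) never sees R_VERIFYFAILURE, skips the retry, and returns R_NOENTROPY.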

A little added joy for a postscript: Russian GOST signing also requires random numbers, so it would benefit from the same error logging fix as ECDSA. However OpenSSL's GOST module is dynamically loaded, so it doesn't have a static library code that you can easily check. As an alternative you can do a string comparison against its name, but this is a bit much for a lightweight error code conversion routine. Amusingly, the ECDSA module lacks a library name string (which is a bug) so you can't use similar error identification code for both GOST and ECDSA ...

I believe the next release of BIND, 9.9.2, will include this error logging improvement for ECDSA, though not for GOST.