Vim blowfish encryption... or why you shouldn't roll your own crypto

tl;dr: If you use encryption in Vim, upgrade and :set cryptmethod=blowfish2.

The Vim editor has two modes of encryption: the old pkzip-based system (which is broken, but still the default for compatibility reasons) and the new (as of Vim 7.3) blowfish-based system.

If you know about cryptography you'll understand that just saying "blowfish" isn't specific enough to describe a complete encryption system. (You may wish to skip the next few paragraphs if you know what CBC or similar is.)

Blowfish is a block cipher, which means it encrypts one block of data (8 bytes) at a time. No state is kept between blocks. This is important to understand: it means the same input will always result in the same output (if the key is the same).

We can demonstrate this fairly easily:

$ perl -MCrypt::Blowfish -e'my $bf = Crypt::Blowfish->new("x" x 56); print $bf->encrypt("foofight")' | hd
00000000  40 83 6d c3 02 f1 04 9d                           |@.m.....|
00000008

If you change the print statement to loop, e.g. by adding " for 1 .. 100" to the end, you'll see the pattern repeats every 8 bytes. (To test this yourself you'll need Crypt::Blowfish installed, which is libcrypt-blowfish-perl on Debian-based systems. If your system lacks "hd", you'll find "hexdump -C" does the same.)

In fact hd will helpfully tell us the pattern repeats so we don't even have to eyeball it:

00000000  40 83 6d c3 02 f1 04 9d  40 83 6d c3 02 f1 04 9d  |@.m.....@.m.....|
*
00000320

The problem with this is that for anything more than 8 bytes of data it is very easy to see when the plain text repeats, or even when the same plain text is used in two different files. There are several ways of improving on using a plain key like this.

As Vim is not using public/private key crypto, it derives the key from a password supplied by the user. If the key were based on the supplied password alone, it would always be the same. Therefore a "salt" (some random data) is mixed into the key (in fact Vim uses SHA256 of the key and the salt, repeated many times).

Using a salt means the same password used on several files won't inadvertently leak information.
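
To make the salting idea concrete, here is a minimal Python sketch of a salted, iterated SHA-256 key derivation. It is not Vim's actual routine: the function name derive_key, the round count of 1000 and the way the salt is concatenated are all assumptions made for illustration.

```python
import hashlib
import os


def derive_key(password: bytes, salt: bytes, rounds: int = 1000) -> bytes:
    """Illustrative salted, iterated SHA-256 (NOT Vim's exact algorithm)."""
    digest = hashlib.sha256(password + salt).digest()
    for _ in range(rounds):
        # Re-hash the previous digest together with the salt each round.
        digest = hashlib.sha256(digest + salt).digest()
    return digest


password = b"myGOODkey"
key_a = derive_key(password, os.urandom(8))  # salt for file A
key_b = derive_key(password, os.urandom(8))  # salt for file B

# Same password, different random salts: the derived keys differ, so the
# two files do not share a key.
print(key_a.hex())
print(key_b.hex())
```

The salt is not secret (it is stored alongside the cipher text); its only job is to make each file's key different.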

In addition to this there are ways of combining the blocks in a file so that they depend on each other, which means the same plain text within the file will not encrypt to the same cipher text. These are called modes of operation.

Much like the salt used in the key derivation there is some initial random data that is used to initialise the mode (the IV). The mode Vim is attempting to use is CFB (Cipher Feedback). Wikipedia has a pretty diagram which I have borrowed:

[Figure: CFB encryption diagram, from Wikipedia]
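
For those who prefer code to diagrams, here is a rough Python sketch of how CFB is meant to chain blocks. Python's standard library has no Blowfish, so toy_block_encrypt is a stand-in (a truncated keyed hash, not a real cipher); this works for illustration because CFB only ever runs the block cipher in the forward direction. All names here are my own, not Vim's.

```python
import hashlib

BLOCK = 8  # Blowfish block size in bytes


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for Blowfish encryption of one block. NOT a real cipher."""
    return hashlib.sha256(key + block).digest()[:BLOCK]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def cfb_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    out = bytearray()
    feedback = iv
    for i in range(0, len(plaintext), BLOCK):
        keystream = toy_block_encrypt(key, feedback)
        chunk = xor(plaintext[i:i + BLOCK], keystream)
        out += chunk
        feedback = chunk  # the cipher text feeds into the next block
    return bytes(out)


def cfb_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    out = bytearray()
    feedback = iv
    for i in range(0, len(ciphertext), BLOCK):
        keystream = toy_block_encrypt(key, feedback)
        chunk = ciphertext[i:i + BLOCK]
        out += xor(chunk, keystream)
        feedback = chunk
    return bytes(out)


key = b"demo key"
iv = bytes(8)  # fixed IV so the demo is repeatable; real use needs a random one
plaintext = b"1234567\n" * 2
ciphertext = cfb_encrypt(key, iv, plaintext)

# Two identical plain text blocks, two different cipher text blocks.
print(ciphertext[:BLOCK].hex(), ciphertext[BLOCK:].hex())
```

Because each block's keystream depends on the previous cipher text block, repeated plain text does not show up as repeated cipher text when the mode is implemented correctly.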

You might have noticed I said "attempting". The issue is that Vim actually ends up using the same IV for the first 8 blocks (essentially repeating the first part of the diagram 8 times, then going on to the next operation that mixes in the output). So the result is something like CFB, but with the first 64 bytes lacking any protection.

We can see this if we look at the raw bytes in a VimCrypt blowfish file of the string "1234567\n" repeated twice:

00000000  56 69 6d 43 72 79 70 74  7e 30 32 21 50 7b 77 71  |VimCrypt~02!P{wq|
00000010  0c 8f b0 0c 01 c4 1b 2f  e6 65 51 e9 83 5c fd 3e  |......./.eQ..\.>|
00000020  0d 45 a6 8f 83 5c fd 3e  0d 45 a6 8f               |.E...\.>.E..|
0000002c

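Notice how the last 16 bytes are the same 8 bytes twice: "83 5c fd 3e 0d 45 a6 8f" appears at offset 0x1c and again at offset 0x24. The two identical plain text blocks have encrypted to two identical cipher text blocks.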
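
If you'd rather not eyeball hex dumps, the check is easy to script. The Python sketch below takes the bytes from the dump above and looks for repeated 8-byte blocks. It assumes the header is 28 bytes (a 12-byte magic string, then 8 bytes of salt and 8 bytes of IV); that layout and the helper name find_repeated_blocks are assumptions for this example.

```python
from collections import defaultdict

# The raw bytes from the hex dump above.
raw = bytes.fromhex(
    "56 69 6d 43 72 79 70 74 7e 30 32 21 50 7b 77 71"
    "0c 8f b0 0c 01 c4 1b 2f e6 65 51 e9 83 5c fd 3e"
    "0d 45 a6 8f 83 5c fd 3e 0d 45 a6 8f"
)

HEADER_LEN = 12 + 8 + 8  # assumed layout: magic, salt, IV


def find_repeated_blocks(data: bytes, block_size: int = 8) -> dict:
    """Map each repeated block (as hex) to the block indexes where it occurs."""
    seen = defaultdict(list)
    for index in range(0, len(data), block_size):
        seen[data[index:index + block_size]].append(index // block_size)
    return {blk.hex(): idx for blk, idx in seen.items() if len(idx) > 1}


payload = raw[HEADER_LEN:]
print(find_repeated_blocks(payload))
# prints {'835cfd3e0d45a68f': [0, 1]}
```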

So, this is all very well, but what does it actually mean?

The way CFB works is to compute a stream of data (the keystream), then XOR it with the plaintext. However, Vim is reusing the keystream. In pseudocode:

keystream = Blowfish(iv)
ciphertext1 = XOR(keystream, plaintext[0:7])
ciphertext2 = XOR(keystream, plaintext[8:15])

This means by a simple relationship we can recover the keystream:

keystream = XOR(ciphertext1, plaintext[0:7])
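
As a worked example, here is the same relationship in Python, applied to the two cipher text blocks from the hex dump above and the plain text we already know ("1234567\n"). The helper name xor_bytes is my own.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


# The two 8-byte cipher text blocks from the dump.
ciphertext1 = bytes.fromhex("83 5c fd 3e 0d 45 a6 8f")
ciphertext2 = bytes.fromhex("83 5c fd 3e 0d 45 a6 8f")

known_plaintext = b"1234567\n"  # what we know block 1 contains

# Recover the keystream from block 1, then use it on block 2.
keystream = xor_bytes(ciphertext1, known_plaintext)
recovered = xor_bytes(ciphertext2, keystream)

print(keystream.hex())
print(recovered)
```

Note that neither the password nor Blowfish itself appears anywhere in this: knowing one block of plain text is enough to read every other block that shares the keystream.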

This does mean we need to know some plaintext. However, it turns out to be trivial to bruteforce, because all we need to do is XOR. I have written a PoC script, vim-blowfish-bruteforce.pl:

# Carefully pick some words
egrep '^[A-Za-z]{7}$' /usr/share/dict/words | shuf | head -20 | tr '\n' ' ' > file
# Encrypt the file (you can change the key if you like)
vim --cmd 'set cryptmethod=blowfish' -c 'set key=myGOODkey' -c w -c q file
# Run the bruteforcer
perl vim-blowfish-bruteforce.pl file
valence goblins journal daytime pumpkin scoffed strands gadders
[Or whatever was in your file, up to the first 64 bytes.]

This runs on my laptop in 0.2s. Obviously it needs a very specific type of file, but given how fast it is I suspect bruteforcing any plain text would be trivial. The main limitation is that after the first 8 blocks (64 bytes) it is harder to decrypt. However, use cases such as a small set of passwords in a VimCrypt file are likely breakable (if some plain text, such as a username, is known).
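
To give a feel for why this is so fast, here is a small Python sketch of the general idea. It is not the PoC script itself. It assumes every 8-byte block is a 7-letter word plus a space (which is what the test file above contains) and that the attacker has a candidate word list; the reused keystream is simulated with random bytes rather than real Blowfish output, and the function name crack is hypothetical.

```python
import os

BLOCK = 8


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def crack(ciphertext: bytes, candidates: list):
    """Guess block 0, derive the keystream, and see if the rest makes sense."""
    blocks = [ciphertext[i:i + BLOCK] for i in range(0, len(ciphertext), BLOCK)]
    vocabulary = set(candidates)
    for guess in candidates:
        keystream = xor_bytes(blocks[0], guess)
        plain = [xor_bytes(block, keystream) for block in blocks]
        if all(p in vocabulary for p in plain):
            return b"".join(plain)
    return None


# Candidate words, each padded with a space to fill an 8-byte block.
words = [b"valence ", b"goblins ", b"journal ", b"daytime ", b"pumpkin "]

# Simulate the flaw: every block is XORed with the same unknown keystream.
secret_keystream = os.urandom(BLOCK)
message = [b"journal ", b"goblins ", b"pumpkin "]
ciphertext = b"".join(xor_bytes(m, secret_keystream) for m in message)

print(crack(ciphertext, words))
# prints b'journal goblins pumpkin '
```

Each guess costs a handful of XORs, so even a full dictionary is checked almost instantly. With very few blocks a wrong guess can occasionally look valid, so more cipher text makes the result more reliable.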

This is fixed in Vim 7.4.399. For backwards compatibility reasons it needs to be a new cryptmethod, so use :set cm=blowfish2 (see :help cryptmethod for some details). To convert an existing file you can do:

vim -c 'set cm=blowfish2' -c w -c q file

This will show [blowfish2] when you save it, and you can also check that the file starts with the bytes "VimCrypt~03".
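
If you have a lot of files to check, the header test is easy to automate. This Python sketch maps the 12-byte magic string at the start of a file to a cryptmethod name; the function names are my own, and it only knows about the three methods discussed here.

```python
# Magic strings at the start of a VimCrypt file and the method they indicate.
MAGIC_TO_METHOD = {
    b"VimCrypt~01!": "zip",
    b"VimCrypt~02!": "blowfish",
    b"VimCrypt~03!": "blowfish2",
}


def detect_cryptmethod(header: bytes):
    """Return the cryptmethod name for the given leading bytes, or None."""
    return MAGIC_TO_METHOD.get(header[:12])


def detect_file(path: str):
    """Read the first 12 bytes of a file and identify its cryptmethod."""
    with open(path, "rb") as handle:
        return detect_cryptmethod(handle.read(12))


print(detect_cryptmethod(b"VimCrypt~02!P{wq"))  # header from the dump above
# prints blowfish
```

Anything that reports "zip" or "blowfish" is worth re-saving with blowfish2.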

Going back to the title: don't roll your own crypto. There are many advantages to being interoperable, the key one here being that a standard design gets reviewed, so fundamental design problems like this one are caught.

Fun links:

(I presented a talk about this at OggCamp 14 in Oxford.)