Last month I wrote some notes on endianness, arguing that it is fundamentally about bit order: byte order arises from the combination of bit order and bytewide addressing. I also paid attention to the algebraic properties of endianaware bit strings, but I didn’t pay attention to performance.
 towards practical endianness
 good associations
 deserialize abstractly
 deserialize completely
 deserialize beautifully
 essential endianness
 mismatched memory
towards practical endianness
Last time I was thinking in purely functional style, but for most practical purposes we encounter endianness when reading from and writing to a buffer imperatively.
Also, in functional style, I only worked with the head/start of a bitstring, as if it were a traditional linked list. But in practice when working in a buffer we typically read from the start and write to the end.
So I have been wondering about code that deals with bitwise endianness in a practical manner while still benefiting from the algebraic properties.
good associations
Somehow last time I failed to mention “monoid” at all. Bitstrings, being strings, can be concatenated, which is an associative binary operator with the empty string as its identity. This makes a monoid, and monoids are useful enough that it’s worth paying attention when you spot one.
Concatenation works on either end of a string, which matches buffering better than last time’s design based on fold and unfold. So I need to explain how to serialize and deserialize at either end of a bitstring with either endianness.
Let’s play a trick, and allow ourselves to read a bitstring from either end, regardless of its endianness, i.e. abstracting away from its transmission order or representation in memory.
deserialize abstractly
We are looking at one end of a bistring, either the big end or the little end. We place a binary point just on this sice of the end closest to us, and extract a number:
Starting from the little end, we give each bit the value 2**i
where
i
counts up from 0. Divide the sum by 2**N
and take the remainder
as the value of our number; the quotient is what is left of the
bitstring.
Starting from the big end, we give each bit the value 2**i
where i
counts down from 1. Multiply the sum by 2**N
and take the integer
part as the value of our number; the fractional part is what is left
of the bitstring.
deserialize completely
Given a bitstring N bits long, if we convert all of it to an N bit number, we get the same value regardless of whether we start from the big end or the little end.
This gives us two equivalent dual definitions of how to convert between a bitstring and a number. And this definition is decoupled from the bistring’s endianness, and more separate from splitting and concatenation than the way I described it before.
deserialize beautifully
I originally chose the singleended way of looking at deserialization because of the way the quotient·remainder or integer·fraction parts map nicely to the split between the extracted number and the rest of the bitstring, and they way it firmly places the binary point. But there’s also a pleasing symmetry to a doubleended bidirectional conversion between bitstrings and numbers.
essential endianness
Now that we have described the value of a bitstring independently of its transmission order, all that is left of endiannes is the choice of whether the big end is first, or the little end is first.
mismatched memory
It gets messy when we need to represent bitstrings in memory, for two reasons:

memory is addressed in bytes, which makes endianness look superficially like a question of byte order rather than bit order

there is a conventional numbering of items in memory which might or might not match the bitstring’s endianness (for instance, 0based addressing doesn’t fit bigendianness comfortably)
The remaining challenge is to come up with an implementation that’s efficient without being ugly.