HAKMEMC -- HAKMEM Programming hacks in C (NOT proof-read)

This document attempts to make accessible to modern programmers the wonderful coding tricks invented at MIT in the late 60's/early 70's. It corresponds to the `Programming Hacks' section of
Beeler, M., Gosper, R.W., and Schroeppel, R. HAKMEM. MIT AI Memo 239, Feb. 29, 1972, made Web-available by Henry Baker.
The original examples were in PDP-10 assembler language. The PDP-10 had 36-bit data registers; I have tried to generalise the tricks so as to work on any word length (subject to restrictions as noted). Moreover I have chosen octal, decimal and hexadecimal notation to ease reading the magic numbers involved. I have also collected additional (non-HAKMEM) tricks.

PROGRAMMING HACKS

Note the examples have been re-ordered to give the most C-like ones first:

ITEM 155 (Liknaitzky):

Given two halfwords in a word w (here two 16 bit values in an unsigned32) to subtract the ls half from the ms half just do: w = w * 0xffff0001;

ITEM 161 (Gosper):

Original swapped the contents of two memory locations without using a work register using EXCHANGE reg,mem only. Note the similar trick for swapping two (distinct) variables u, v u ^= v; v ^= u; u ^= v; If u, v are in registers this can save the need for a temporary (which otherwise might cause other variables to spill!).

ITEM 167 (Gosper, Freiberg):

/* evenp7() adds even parity to its 7 bit character argument... */ unsigned evenp7(unsigned36 a) { return ((a * 002010040201) /* 5 adjacent copies */ & 021042104377) /* every 4th bit of left 4 copies + right copy */ % (15<<7); /* casting out 15.'s in hexadecimal shifted 7 */ } /* odd parity on 7 bits (Schroeppel) */ unsigned oddp7(unsigned36 a) { return ((a * 000010040201) /* 4 adjacent copies */ | 007555555400) /* leaves every 3rd bit+offset+right copy */ % (9<<7); /* powers of 2^3 are +-1 mod 9 */ /* changing 7555555400 to 27555555400 gives even parity */ } /* count number of 1's in 9-bit argument (Schroeppel) */ unsigned count_ones(unsigned36 a) { return ((a * 01001001001) /* 4 adjacent copies */ & 042104210421) /* every 4th bit */ % 15; /* casting out 15.'s in hexadecimal */ } /* if argument is 6 bit quantity, return 6 bits reversed (Schroeppel) */ unsigned reverse_6bits(unsigned36 a) { return ((a * 02020202) /* 4 adjacent copies shifted */ & 0104422010) /* where bits coincide with reverse repeated base 2^8 */ % 255; /* casting out 2^8 - 1's */ } /* reverse 7 bits (Schroeppel) */ unsigned reverse_7bits(unsigned36 a) { return ((a * 010004002001) /* 4 copies sep by 000's base 2 */ & 0210210210010) /* where bits coincide with reverse repeated base 2^8 */ % 255; /* casting out 2^8 - 1's */ /* reverse 8 bits (Schroeppel) */ unsigned reverse_8bits(unsigned41 a) { return ((a * 0x000202020202) /* 5 copies in 40 bits */ & 0x010884422010) /* where bits coincide with reverse repeated base 2^10 */ /* PDP-10: 041(6 bits):020420420020(35 bits) */ % 1023; /* casting out 2^10 - 1's */ }

ITEM 169 (in order of one-ups-manship: Gosper, Mann, Lenard, [Root and Mann]):

To count the ones in a 36-bit word: unsigned count_ones(unsigned36 a) { unsigned36 b; b = (a>>>1) & 0333333333333; a -= b; b = (b>>>1) & 0333333333333; a -= b; /* each octal digit is replaced by number of 1's in it */ a = ((a>>3) + a) & 0070707070707; return a % 63; /* casting out 63's */ /* This code (10 PDP-10 instructions), with constants extended, would work on word lengths up to 62; eleven suffice up to 254.

ITEM 175 (Gosper):

To get the next higher number with the same number of 1 bits: unsigned nexthi_same_count_ones(unsigned a) { /* works for any word length */ unsigned c = (a & -a); unsigned r = a+c; return (((r ^ a) >> 2) / c) | r); }

ITEM 174 (Gosper, Nelson):

21963283741 is a fixed point of the float function on the PDP-6/10, i.e., it is the only positive number whose floating point representation equals its fixed. (Mycroft) Using 32-bit IEEE arithmetic (with 1/8/23 bits having sign 0x80000000, exponent 0x7f800000, matissa 0x007fffff) and 32-bit int's the only solution to this equation is 0. I.e. union { int i; float f; } u; if ((float)u.i == u.f) /* then we have u.i = 0x00000000 */

ITEM 145 (Gosper):

Audio/Video program not yet translated to C.

ITEM 146: MUNCHING SQUARES

Audio/Video program not yet translated to C.

ITEM 147 (Schroeppel):

Munching squares is just views of the graph Y = X XOR T for consecutive values of T = time.

ITEM 148 (Cohen, Beeler):

Audio/Video program not yet translated to C.

ITEM 149 (Minsky): CIRCLE ALGORITHM

Here is an elegant way to draw almost circles on a point-plotting display:

NEW X = OLD X - epsilon * OLD Y
NEW Y = OLD Y + epsilon * NEW(!) X

This makes a very round ellipse centered at the origin with its size determined by the initial point. epsilon determines the angular velocity of the circulating point, and slightly affects the eccentricity. If epsilon is a power of 2, then we don't even need multiplication, let alone square roots, sines, and cosines! The "circle" will be perfectly stable because the points soon become periodic.

The circle algorithm was invented by mistake when I tried to save one register in a display hack! Ben Gurley had an amazing display hack using only about six or seven instructions, and it was a great wonder. But it was basically line-oriented. It occurred to me that it would be exciting to have curves, and I was trying to get a curve display hack with minimal instructions.

ITEM 150 (Schroeppel):

PROBLEM: Although the reason for the circle algorithm's stability is unclear, what is the number of distinct sets of radii? (Note: algorithm is invertible, so all points have predecessors.)

ITEM 151 (Gosper):

Separating X from Y in the above recurrence,

X(N+1) = (2 - epsilon^2) * X(N) - X(N-1)
Y(N+1) = (2 - epsilon) * Y(N) - Y(N-1).

These are just the Chebychev recurrence with cos theta (the angular increment) = 1-epsilon^2/2. Thus X(N) and Y(N) are expressible in the form R cos(N theta + phi). The phi's and R for X(N) and Y(N) can be found from N=0,1. The phi's will differ by less than pi/2 so that the curve is not really a circle. The algorithm is useful nevertheless, because it needs no sine or square root function, even to get started.

X(N) and Y(N) are also expressible in closed form in the algebra of ordered pairs described under linear recurrences, but they lack the remarkable numerical stability of the "simultaneous" form of the recurrence.

ITEM 152 (Salamin):

With exact arithmetic, the circle algorithm is stable iff |epsilon| < 2. In this case, all points lie on the ellipse

X^2 - epsilon X Y + Y^2 = constant,

where the constant is determined by the initial point. This ellipse has its major axis at 45 degrees (if epsilon > 0) or 135 degrees (if epsilon < 0) and has eccentricity

sqrt(epsilon/(1 + epsilon/2)).

ITEM 153 (Minsky):

To portray a 3-dimensional solid on a 2-dimensional display, we can use a single circle algorithm to compute orbits for the corners to follow. The (positive or negative) radius of each orbit is determined by the distance (forward or backward) from some origin to that corner. The solid will appear to wobble rigidly about the origin, instead of simply rotating.

ITEM 154 (Gosper):

The myth that any given programming language is machine independent is easily exploded by computing the sum of powers of 2.

If the result loops with period = 1 with sign +, you are on a sign-magnitude machine.
If the result loops with period = 1 at -1, you are on a twos-complement machine.
If the result loops with period > 1, including the beginning, you are on a ones-complement machine.
If the result loops with period > 1, not including the beginning, your machine isn't binary -- the pattern should tell you the base.
If you run out of memory, you are on a string or Bignum system.
If arithmetic overflow is a fatal error, some fascist pig with a read-only mind is trying to enforce machine independence. But the very ability to trap overflow is machine dependent.

By this strategy, consider the universe, or, more precisely, algebra:

let X = the sum of many powers of two = ...111111
now add X to itself; X + X = ...111110
thus, 2X = X - 1 so X = -1
therefore algebra is run on a machine (the universe) which is twos-complement.

ITEM 156 (Mitchell):

Unhelpful for C: making a AOBJN DEC-10 descriptor.

ITEM 157 (Freiberg):

Unhelpful for C: making a AOBJN DEC-10 descriptor.

ITEM 158 (Gosper, Salamin, Schroeppel):

A nice recursive sin/cos routine.

ITEM 159 (Gosper, Schroeppel):

Notes on ITEM 158 (Numbers herein are decimal.)

ITEM 160 (Gosper):

A way to get log base 2.

ITEM 162 (Gosper):

PDP-10 trick to swap two bits in an accumulator.

ITEM 163 (Sussman):

To exchange two variables in LISP without using a third variable:

(SETQ X (PROG2 0 Y (SETQ Y X)))

ITEM 164 (Samson):

PDP-10 trick to take MAX of two byte descriptors (the PDP-10 was word addressed).

ITEM 165 (Freiberg):

PDP-10 trick to convert a byte descriptors into a byte address.

ITEM 166 (Gosper, Liknaitzky):

How to use 3 `rotate' instructions to do a triple-word rotate.

ITEM 168 (PDP-1 hackers):

PDP-10 code to make `startling chords, arpeggios, and slides, with just the sign of the AC.'

ITEM 170 (Jensen):

Trick to pass negative values to routines which supposedly only print positive values in decimal to print other useful strings like CR-LF.

ITEM 171 (Gosper):

Since integer division can never produce a larger quotient than dividend, doubling the dividend and divisor beforehand will distinguish division by zero from division by 1 or anything else, on machines where division by zero leaves the dividend as result instead of trapping.

ITEM 172 (Gosper):

Trick using self-modifying code to take an element from a free-storage chain without requiring a work register.

ITEM 173 (Gosper):

Two-or-three instruction PDP-6/PDP-10 versions of int fix(float x).