This document attempts to make accessible to modern programmers the
wonderful coding tricks invented at MIT in the late 60's/early 70's.
It corresponds to the
`Programming Hacks'
section of
Beeler, M., Gosper, R.W., and Schroeppel, R.
HAKMEM. MIT AI Memo 239, Feb. 29, 1972, made
Web-available by
Henry Baker.
The original examples were in
PDP-10
assembler language.
The PDP-10 had 36-bit data registers; I have tried to generalise the
tricks so as to work on any word length (subject to restrictions as noted).
Moreover I have chosen octal, decimal and hexadecimal notation to ease reading
the magic numbers involved.
I have also collected
additional (non-HAKMEM) tricks.
PROGRAMMING HACKS
Note the examples have been re-ordered to give the most C-like ones first:
ITEM 155 (Liknaitzky):
Given two halfwords in a word w
(here two 16 bit values in an unsigned32)
to subtract the ls half from the ms half just do:
w = w * 0xffff0001;
ITEM 161 (Gosper):
Original swapped the contents of two memory locations without
using a work register using EXCHANGE reg,mem only.
Note the similar trick for swapping two (distinct) variables u, v
u ^= v;
v ^= u;
u ^= v;
If u, v are in registers this can save the need for
a temporary (which otherwise might cause other variables to spill!).
ITEM 167 (Gosper, Freiberg):
/* evenp7() adds even parity to its 7 bit character argument... */
unsigned evenp7(unsigned36 a) {
return ((a * 002010040201) /* 5 adjacent copies */
& 021042104377) /* every 4th bit of left 4 copies + right copy */
% (15<<7); /* casting out 15.'s in hexadecimal shifted 7 */
}
/* odd parity on 7 bits (Schroeppel) */
unsigned oddp7(unsigned36 a) {
return ((a * 000010040201) /* 4 adjacent copies */
| 007555555400) /* leaves every 3rd bit+offset+right copy */
% (9<<7); /* powers of 2^3 are +-1 mod 9 */
/* changing 7555555400 to 27555555400 gives even parity */
}
/* count number of 1's in 9-bit argument (Schroeppel) */
unsigned count_ones(unsigned36 a) {
return ((a * 01001001001) /* 4 adjacent copies */
& 042104210421) /* every 4th bit */
% 15; /* casting out 15.'s in hexadecimal */
}
/* if argument is 6 bit quantity, return 6 bits reversed (Schroeppel) */
unsigned reverse_6bits(unsigned36 a) {
return ((a * 02020202) /* 4 adjacent copies shifted */
& 0104422010) /* where bits coincide with reverse repeated base 2^8 */
% 255; /* casting out 2^8 - 1's */
}
/* reverse 7 bits (Schroeppel) */
unsigned reverse_7bits(unsigned36 a) {
return ((a * 010004002001) /* 4 copies sep by 000's base 2 */
& 0210210210010) /* where bits coincide with reverse repeated base 2^8 */
% 255; /* casting out 2^8 - 1's */
/* reverse 8 bits (Schroeppel) */
unsigned reverse_8bits(unsigned41 a) {
return ((a * 0x000202020202) /* 5 copies in 40 bits */
& 0x010884422010) /* where bits coincide with reverse repeated base 2^10 */
/* PDP-10: 041(6 bits):020420420020(35 bits) */
% 1023; /* casting out 2^10 - 1's */
}
ITEM 169 (in order of one-ups-manship: Gosper, Mann, Lenard, [Root and
Mann]):
To count the ones in a 36-bit word:
unsigned count_ones(unsigned36 a) {
unsigned36 b;
b = (a>>>1) & 0333333333333; a -= b;
b = (b>>>1) & 0333333333333; a -= b;
/* each octal digit is replaced by number of 1's in it */
a = ((a>>3) + a) & 0070707070707;
return a % 63; /* casting out 63's */
/*
This code (10 PDP-10 instructions), with constants extended,
would work on word lengths up to 62; eleven suffice up to 254.
ITEM 175 (Gosper):
To get the next higher number with the same number of 1 bits:
unsigned nexthi_same_count_ones(unsigned a) {
/* works for any word length */
unsigned c = (a & -a);
unsigned r = a+c;
return (((r ^ a) >> 2) / c) | r);
}
ITEM 174 (Gosper, Nelson):
21963283741 is a fixed point of the float function on the
PDP-6/10,
i.e., it is the only positive number whose floating
point representation equals its fixed.
(Mycroft) Using 32-bit IEEE arithmetic (with 1/8/23 bits having
sign 0x80000000, exponent 0x7f800000, matissa 0x007fffff) and 32-bit int's
the only solution to this equation is 0.
I.e.
union { int i; float f; } u;
if ((float)u.i == u.f) /* then we have u.i = 0x00000000 */
ITEM 145 (Gosper):
Audio/Video program not yet translated to C.
ITEM 146: MUNCHING SQUARES
Audio/Video program not yet translated to C.
ITEM 147 (Schroeppel):
Munching squares is just views of the graph Y = X XOR T for consecutive values
of T = time.
ITEM 148 (Cohen, Beeler):
Audio/Video program not yet translated to C.
ITEM 149 (Minsky): CIRCLE ALGORITHM
Here is an elegant way to draw almost circles on a point-plotting display:
NEW X = OLD X - epsilon * OLD Y
NEW Y = OLD Y + epsilon * NEW(!) X
This makes a very round ellipse centered at the origin with its size
determined by the initial point. epsilon determines the angular
velocity of the circulating point, and slightly affects the
eccentricity. If epsilon is a power of 2, then we don't even need
multiplication, let alone square roots, sines, and cosines! The
"circle" will be perfectly stable because the points soon become
periodic.
The circle algorithm was invented by mistake when I tried to save one
register in a display hack! Ben Gurley had an amazing display hack
using only about six or seven instructions, and it was a great wonder.
But it was basically line-oriented. It occurred to me that it would
be exciting to have curves, and I was trying to get a curve display
hack with minimal instructions.
ITEM 150 (Schroeppel):
PROBLEM: Although the reason for the circle algorithm's stability is
unclear, what is the number of distinct sets of radii? (Note:
algorithm is invertible, so all points have predecessors.)
ITEM 151 (Gosper):
Separating X from Y in the above recurrence,
X(N+1) = (2 - epsilon^2) * X(N) - X(N-1)
Y(N+1) = (2 - epsilon) * Y(N) - Y(N-1).
These are just the Chebychev recurrence with cos theta (the angular
increment) = 1-epsilon^2/2. Thus X(N) and Y(N) are expressible in the
form R cos(N theta + phi). The phi's and R for X(N) and Y(N) can be
found from N=0,1. The phi's will differ by less than pi/2 so that the
curve is not really a circle. The algorithm is useful nevertheless,
because it needs no sine or square root function, even to get
started.
X(N) and Y(N) are also expressible in closed form in the algebra of
ordered pairs described under linear recurrences, but they lack the
remarkable numerical stability of the "simultaneous" form of the
recurrence.
ITEM 152 (Salamin):
With exact arithmetic, the circle algorithm is stable iff |epsilon| < 2. In
this case, all points lie on the ellipse
X^2 - epsilon X Y + Y^2 = constant,
where the constant is determined by the initial point. This ellipse
has its major axis at 45 degrees (if epsilon > 0) or 135 degrees
(if epsilon < 0) and has eccentricity
sqrt(epsilon/(1 + epsilon/2)).
ITEM 153 (Minsky):
To portray a 3-dimensional solid on a 2-dimensional display, we can
use a single circle algorithm to compute orbits for the corners to
follow. The (positive or negative) radius of each orbit is determined
by the distance (forward or backward) from some origin to that corner.
The solid will appear to wobble rigidly about the origin, instead of
simply rotating.
ITEM 154 (Gosper):
The myth that any given programming language is machine independent is
easily exploded by computing the sum of powers of 2.
- If the result loops with period = 1 with sign +, you are on a
sign-magnitude machine.
- If the result loops with period = 1 at -1, you are on a
twos-complement machine.
- If the result loops with period > 1, including the beginning, you
are on a ones-complement machine.
- If the result loops with period > 1, not including the beginning,
your machine isn't binary -- the pattern should tell you the base.
- If you run out of memory, you are on a string or Bignum system.
- If arithmetic overflow is a fatal error, some fascist pig with a
read-only mind is trying to enforce machine independence. But the
very ability to trap overflow is machine dependent.
By this strategy, consider the universe, or, more precisely, algebra:
let X = the sum of many powers of two = ...111111
now add X to itself; X + X = ...111110
thus, 2X = X - 1 so X = -1
therefore algebra is run on a machine (the universe) which is twos-complement.
ITEM 156 (Mitchell):
Unhelpful for C: making a AOBJN DEC-10 descriptor.
ITEM 157 (Freiberg):
Unhelpful for C: making a AOBJN DEC-10 descriptor.
ITEM 158 (Gosper, Salamin, Schroeppel):
A nice recursive sin/cos routine.
ITEM 159 (Gosper, Schroeppel):
Notes on ITEM 158
(Numbers herein are decimal.)
ITEM 160 (Gosper):
A way to get log base 2.
ITEM 162 (Gosper):
PDP-10 trick to swap two bits in an accumulator.
ITEM 163 (Sussman):
To exchange two variables in LISP without using a third variable:
(SETQ X (PROG2 0 Y (SETQ Y X)))
ITEM 164 (Samson):
PDP-10 trick to take MAX of two byte descriptors (the PDP-10 was
word addressed).
ITEM 165 (Freiberg):
PDP-10 trick to convert a byte descriptors into a byte address.
ITEM 166 (Gosper, Liknaitzky):
How to use 3 `rotate' instructions to do a triple-word rotate.
ITEM 168 (PDP-1 hackers):
PDP-10 code to
make `startling chords, arpeggios, and slides, with just the sign of the AC.'
ITEM 170 (Jensen):
Trick to pass negative values to routines which supposedly
only print positive values in decimal to print other useful strings like
CR-LF.
ITEM 171 (Gosper):
Since integer division can never produce a larger quotient than
dividend, doubling the dividend and divisor beforehand will
distinguish division by zero from division by 1 or anything else, on
machines where division by zero leaves the dividend as result instead of
trapping.
ITEM 172 (Gosper):
Trick using self-modifying code to take an element from a free-storage
chain without requiring a work register.
ITEM 173 (Gosper):
Two-or-three instruction PDP-6/PDP-10 versions of int fix(float x).