If you’ve done any 3D programming, you’ve likely encountered the zoo of techniques and representations used when working with 3D rotations. Some of them are better than others, depending on the situation.
Linear-algebra-wise, the most straightforward representation is an orthonormal 3x3 matrix (with positive determinant). The three columns of a rotation matrix specify where the x, y, and z axes end up after the rotation.
Rotation matrices are particularly useful for transforming points: just multiply! Even better, rotation matrices can be composed with any other linear transformations via matrix multiplication. That’s why we use rotation matrices when actually drawing things on screen: only one matrix multiplication is required to transform a point from world-space to the screen. However, rotation matrices are not so useful for actually working with rotations: because they don’t form a vector space, adding together two rotation matrices will not give you a rotation matrix back. For example, animating an object by linearly interpolating between two rotation matrices adds scaling:
Another common representation is Euler angles, which specify three separate rotations about the x, y, and z axes (also known as pitch, yaw, and roll). The order in which the three component rotations are applied is an arbitrary convention—here we’ll apply x, then y, then z.
θx
θy
θz
Euler angles are generally well-understood and often used for authoring rotations. However, using them for anything else comes with some significant pitfalls. While it’s possible to manually create splines that nicely interpolate Euler angles, straightforward interpolation often produces undesirable results.
Euler angles suffer from gimbal lock when one component causes the other two axes of rotation to become parallel. Such configurations are called singularities. At a singularity, changing either of two ’locked’ angles will cause the same output rotation. You can demonstrate this phenomenon above by pressing the ’lock’ button and adjusting the x/z rotations (a quarter rotation about y aligns the z axis with the x axis).
Singularities break interpolation: if the desired path reaches a singularity, it gains a degree of freedom with which to represent its current position. Picking an arbitrary representation to continue with causes discontinuities in the interpolated output: even within an axis, interpolation won’t produce a constant angular velocity. That can be a problem if, for example, you’re using the output to drive a robot. Furthermore, since each component angle is cyclic, linear interpolation won’t always choose the shortest path between rotations.
θ0=000
θ(0.00)=−0.00−0.00−0.00
θ1=−3.140.00−3.14
Thankfully, interpolation is smooth if the path doesn’t go through a singularity, so these limitations can be worked around, especially if you don’t need to represent ‘straight up’ and ‘straight down.’
Quaternions
At this point, you might be expecting yet another article on quaternions—don’t worry, we’re not going to delve into hyper-complex numbers today. It suffices to say that unit quaternions are the standard tool for composing and interpolating rotations, since spherical linear interpolation (slerp) chooses a constant-velocity shortest path between any two quaternions. However, unit quaternions also don’t form a vector space, are unintuitive to author, and can be computationally costly to interpolate*. Further, there’s no intuitive notion of scalar multiplication, nor averaging.
But, they’re still fascinating! If you’d like to understand quaternions more deeply (or, perhaps, learn what they are in the first place), read this.
Q0=1000
Q(0.00)=−1.00−0.00−0.00−0.00
Q1=0010
Note that since quaternions double-cover the space of rotations, sometimes Q(1) will go to −Q1.
Axis/Angle Rotations
An axis/angle rotation is a 3D vector of real numbers. Its direction specifies the axis of rotation, and its magnitude specifies the angle to rotate about that axis. We’ll write axis/angle rotations as θu, where u is a unit-length vector and θ is the rotation angle.
ux
uy
uz
θ
Since axis/angle rotations are simply 3D vectors, they form a vector space: we can add, scale, and interpolate them to our heart’s content. Linearly interpolating between any two axis/angle rotations is smooth and imparts constant angular velocity. However, note that linearly interpolating between axis-angle rotations does not necessarily choose the shortest path: it depends on which axis/angle you use to specify the target rotation.
θ0u0=000
θu(0.00)=−0.00−0.00−0.00
θ1u1=03.140
Like quaternions, axis/angle vectors double-cover the space of rotations: sometimes θu(1) will go to (2π−θ1)(−u1).
The Exponential and Logarithmic Maps
Ideally, we could freely convert rotations between these diverse representations based on our use case. We will always want to get a rotation matrix out at the end, so we’ll consider matrices the ‘canonical’ form. Enter the exponential map: a function that takes a different kind of rotation object and gives us back an equivalent rotation matrix. The corresponding logarithmic map takes a rotation matrix and gives us back a rotation object. How these maps relate to the scalar exp and log functions will hopefully become clear later on.
Below, we’ll define an exp and log map translating between rotation matrices and axis/angle vectors. But first, to build up intuition, let us consider how rotations work in two dimensions.
Axis/Angle in 2D
In 2D, there’s only one axis to rotate around: the one pointing out of the plane. Hence, our ‘axis/angle’ rotations can be represented by just θ.
Given a 2D point p, how can we rotate p by θ? One way to visualize the transformation is by forming a coordinate frame in which the output is easy to describe. Consider p and its quarter (90∘) rotation Jp:
Using a bit of trigonometry, we can describe the rotated pθ in two components:
pθ=pcosθ+Jpsinθ=(cos(θ)I+sin(θ)J)p
But, what actually are I and J? The former should take a 2D vector and return it unchanged: it’s the 2x2 identity matrix. The latter should be similar, but swap and negate the two components:
IJ=[1001]=[01−10]
Just to make sure we got J right, let’s check what happens if we apply it twice (via J2):
J2=[01−10][01−10]=[−100−1]
We got J2=−I, which is a 180-degree rotation. So, J indeed represents 90-degree rotation.
That’s the standard 2D rotation matrix. What a coincidence!
Complex Rotations
If you’re familiar with complex numbers, you might notice that our first transform formula feels eerily similar to Euler’s formula, eix=cosx+isinx:
pθeiθp=(cos(θ)I+sin(θ)J)p=(cos(θ)+isin(θ))p
Where i and J both play the role of a quarter turn. We can see that in complex arithmetic, multiplying by i in fact has that effect:
Jp=[01−10][ab]=[−ba]
ip=i(a+bi)=ai+bi2=−b+ai
So, there must be some connection to the exponential function here.
The 2D Exponential Map
Recall the definition of the exponential function (or equivalently, its Taylor series):
ex=k=0∑∞k!xk=1+x+2!x2+3!x3+…
Using Euler’s formula, eiθ gave us a complex number representing a 2D rotation by θ. Can we do the same with eθJ? If we plug a matrix into the above definition, the arithmetic still works out: 2x2 matrices certainly support multiplication, addition, and scaling. (More on matrix exponentiation here.)
Let θJ=A and plug it in:
eA=k=0∑∞k!Ak=I+A+2!A2+3!A3+…
Let’s pull out the first four terms to inspect further:
These entries look familiar. Recall the Taylor series that describe the functions sin and cos:
sinxcosx=x−3!x3+5!x5−…=1−2!x2+4!x4−…
If we write out all the terms of eA, we’ll recover the expansions of sinθ and cosθ! Therefore:
eA=eθJ=[cosθsinθ−sinθcosθ]
We’ve determined that the exponential function eθJ converts our angle θ into a corresponding 2D rotation matrix. In fact, we’ve proved a version of Euler’s formula with 2x2 matrices instead of complex numbers:
⟹eθJpθ=(cos(θ)I+sin(θ)J)=eθJp
The 2D Logarithmic Map
The logarithmic map should naturally be the inverse of the exponential:
Note that in general, our exponential map is not injective. Clearly, exp(θJ)=exp((θ+2π)J), since adding an extra full turn will always give us back the same rotation matrix. Therefore, our logarithmic map can’t be surjective—we’ll define it as returning the smallest angle θJ corresponding to the given rotation matrix. Using atan2 implements this definition.
Interpolation
Consider two 2D rotation angles θ0 and θ1. The most obvious way to interpolate between these two rotations is to interpolate the angles and create the corresponding rotation matrix. This scheme is essentially a 2D version of axis-angle interpolation.
However, if θ0 and θ1 are separated by more than π, this expression will take the long way around: it’s not aware that angles are cyclic.
θ0=0
θ(0.00)=0.00
θ1=4.71
Instead, let’s devise an interpolation scheme based on our exp/log maps. Since we know the two rotation matrices R0, R1, we can express the rotation that takes us directly from the initial pose to the final pose: R1R0−1, i.e. first undo R0, then apply R1.
Using our logarithmic map, we can obtain the smallest angle that rotates from R0 to R1: log(R1R0−1). Since log gives us an axis-angle rotation, we can simply scale the result by t to perform interpolation. After scaling, we can use our exponential map to get back a rotation matrix. This matrix represents a rotation t of the way from R0 to R1.
Hence, our final parametric rotation matrix is R(t)=exp(tlog(R1R0−1)))R0.
Using exp/log for interpolation might seem like overkill for 2D—we could instead just check how far apart the angles are. But below, we’ll see how this interpolation scheme generalizes—without modification—to 3D, and in fact any number of dimensions.
Axis/Angle in 3D
We’re finally ready to derive an exponential and logarithmic map for 3D rotations. In 2D, our map arose from exponentiating θJ, i.e. θ times a matrix representing a counter-clockwise quarter turn about the axis of rotation. We will be able to do the same in 3D—but what transformation encodes a quarter turn about a 3D unit vector u?
The cross product u×p is typically defined as a vector normal to the plane containing both u and p. However, we could also interpret u×p as a quarter turn of the projection of p into the plane with normal u, which we will call p⊥:
So, if we can compute the quarter rotation of p⊥, it should be simple to recover the quarter rotation of p. Of course, p=p⊥+p∥, so we’ll just have to add back the parallel part p∥. This is correct because a rotation about u preserves p∥:
However, “u×” is not a mathematical object we can work with. Instead, we can devise a matrix u^ that when multiplied with a a vector p, outputs the same result as u×p:
We can see that u^T=−u^, so u^ is a skew-symmetric matrix. (i.e. it has zeros along the diagonal, and the two halves are equal but negated.) Note that in the 2D case, our quarter turn J was also skew-symmetric, and sneakily represented the 2D cross product! We must be on the right track.
The reason we want to use axis/angle rotations in the first place is because they form a vector space. So, let’s make sure our translation to skew-symmetric matrices maintains that property. Given two skew-symmetric matrices A1 and A2:
(A1+A2)T=A1T+A2T=−A1−A2=−(A1+A2)
Their sum is also a skew-symmetric matrix. Similarly:
(cA)T=c(AT)=−cA
Scalar multiplication also maintains skew-symmetry. The other vector space properties follow from the usual definition of matrix addition.
Finally, note that u×(u×(u×p))=−u×p. Taking the cross product three times would rotate p⊥ three-quarter turns about u, which is equivalent to a single negative-quarter turn. More generally, u^k+2=−u^k for any k>0. We could prove this by writing out all the terms, but the geometric argument is easier:
The 3D Exponential Map
Given an axis/angle rotation θu, we can make θu^ using the above construction. What happens when we exponentiate it? Using the identity u^k+2=−u^k:
In the last step, we again recover the Taylor expansions of sinθ and cosθ. Our final expression is known as Rodrigues’ formula.
This formula is already reminiscent of the 2D case: the latter two terms are building up a 2D rotation in the plane defined by u. To sanity check our 3D result, let’s compute our transform for θ=0:
e0u^p=(I+0u^+(1−1)u^2)p=p
Rotating by θ=0 preserves p, so the formula works. Then compute for θ=2π:
We end up with −p⊥+p∥, which is a half rotation. Hence θ=π is also correct.
So far, our formula checks out. Just to be sure, let’s prove that our 3D result is a rotation matrix, i.e. it’s orthonormal and has positive determinant. A matrix is orthonormal if ATA=I, so again using u^k+2=−u^k:
Therefore, eθu^ is orthonormal. We could show its determinant is positive (and therefore 1) by writing out all the terms, but it suffices to argue that:
Clearly, exp(0u^)=I=1
There is no θ, u^ such that exp(θu^)=0, since u^ and u^2 can never cancel out I.
exp is continuous with respect to θ and u^
Therefore, exp(0u^) can never become negative. That means exp(θu^) is a 3D rotation matrix!
The 3D Logarithmic Map
Similarly to the 2D case, the 3D exponential map is not injective, so the 3D logarithmic map will not be surjective. Instead, we will again define it to return the smallest magnitude axis-angle rotation corresponding to the given matrix. Our exponential map gave us:
R=exp(θu^)=I+sin(θ)u^+(1−cos(θ))u^2
We can take the trace (sum along the diagonal) of both sides:
tr(R)=tr(I)+sin(θ)tr(u^)+(1−cos(θ))tr(u^2)
Clearly tr(I)=3, and since u^ is skew-symmetric, its diagonal sum is zero. That just leaves u^2:
Our interpolation scheme produces all the nice properties of axis/angle rotations—and chooses the shortest path every time. This wouldn’t look so smooth with Euler angles!
Averaging Rotations
However, we would have gotten an equally good interpolation scheme by just using quaternions instead of messing about with all this matrix math. Let’s consider something interesting we can only easily do with axis/angle rotations: averaging a set of rotation matrices.
The most straightforward method is to convert each matrix into an axis/angle rotation, average the resulting vectors, and convert back. That is certainly a valid strategy, but the resulting behavior won’t be very intuitive:
In particular, summing axis-angle vectors can result in “catastrophic cancellation.” An extreme example is averaging [π00] and [−π00], resulting in zero—which is clearly not representative of the two equivalent rotations.
To find an alternative, let’s first consider a slightly unconventional way of averaging points in the plane. The average of a set of points is the point that minimizes total squared distance to all others. Hence, there’s an optimization-based algorithm for finding it. Given x0,…,xn, we can iterate the following procedure:
Pick an initial guess xˉ∈R2 (can be one of the points).
Repeat:
For each point, get its translation from the guess: ui←xi−xˉ
Average the vectors: u←n1∑i=1nui
Step toward the average direction: xˉ←xˉ+τu
while ∣u∣>ϵ.
As we run this procedure, xˉ will converge to the average point. Of course, we could have just averaged the points directly, but we’ll be able to translate this idea to the rotational case rather nicely.
Our logarithmic map lets us convert rotation matrices to axis axis/angle rotations, which are themselves just 3D points. So, what if we use the point averaging algorithm on rotations R0,…,Rn?
Pick an initial guess Rˉ∈R3×3 (can be I).
Repeat:
For each matrix, get its axis/angle rotation from the guess: ui←log(RiRˉ−1)
Average the vectors: u←n1∑i=1nui
Step toward the average rotation: Rˉ←exp(τu)Rˉ
while ∣u∣>ϵ.
The result of this algorithm is formally known as the Karcher mean. Just like how the average point minimizes total squared distance from all other points, the Karcher mean is a rotation that minimizes squared angular distance from all other rotations. Therefore, it won’t be subject to catastrophic cancellation—we’ll always end up with a non-zero in-between rotation.
Try comparing the two averaging algorithms—randomizing will keep them in sync. While the results are often similar, the Karcher mean exhibits more consistent behavior.
Quaternions (Again)
Warning: section assumes knowledge of quaternions
Okay, I couldn’t resist talking about quaternions at least a little bit, given how closely they’re related to axis/angle rotations. Just like how complex exponentiation turned out to be equivalent to (skew-symmetric) 2D matrix exponentiation, quaternion exponentiation is equivalent to (skew-symmetric) 3D matrix exponentiation.
In 2D, an axis/angle rotation was simply θ. We created a pure-imaginary complex number iθ and exponentiated it:
eiθ=cosθ+isinθ
We got back a complex number that when multiplied with a point, rotates it by θ. It’s always the case that ∣cosθ+isinθ∣=1, so 2D rotations can be represented as unit-norm complex numbers.
In 3D, an axis/angle rotation is a vector u such that ∣u∣=θ. What happens if we create a pure-imaginary quaternion q=uxi+uyj+uzk and exponentiate it, too?
To make evaluating eq easier, first derive the following using the quaternion multiplication rules:
Our result looks almost exactly like the 2D case, just with three imaginary axes instead of one. In 2D, our axis/angle rotation became a unit-norm complex number. In 3D, it became a unit-norm quaternion. Now we can use this quaternion to rotate 3D points! Pretty cool, right?
One advantage of using quaternions is how easy the exponential map is to compute—if you don’t need a rotation matrix, it’s a good option. The quaternion logarithmic map is similarly simple:
θ=arccos(ℜ(q)),u=sinθ1ℑ(q)
Finally, note that the way to rotate a point p by a quaternion q is by evaluating the conjugation qpq−1, where p=pxi+pyj+pzk is another pure-imaginary quaternion representing our point. The conjugation technically rotates the point by 2θ about u, but that’s easily accounted for by making ∣u∣=2θ in the beginning.
Further Reading
Made it this far? Well, there’s even more to learn about rotations.
Learn about quaternions here, and why geometric algebra is more intuitive here.
Beyond understanding the four representations covered here (plus geometric algebra), it can be enlightening to learn about the algebraic structure underlying all 3D rotations: the group SO(3). I found this video to be a great resource: it explains SO(3) both intuitively and visually, demonstrating how it relates it to the group SU(2) as well as why quaternions and axis/angle rotations double-cover 3D rotation matrices.
The wikipedia page on SO(3) is also informative, though very math heavy. It touches on connections with axis/angle rotations, topology, SU(2), quaternions, and Lie algebra. It turns out the vector space of skew-symmetric matrices we derived above makes up so(3), the Lie algebra that corresponds to SO(3).