These are some notes on what I have done with Rust over the last week, in the course of doing something highly experimental that could well turn out to be foolishly over-complicated. (I'll be interested to hear about other ways of doing this, in other programming languages as well as Rust!) But it involves some cool ideas - abstract interpretation, type-level programming in Rust, automatically optimized data structures, and a bit of geometric algebra for 3D graphics - and I want to share what I learned.
Health warning: this did turn out to be foolishly over-complicated - I suggest reading the followup before wading in to the rest of these notes, if you have not already read them.
Aside: There are a lot of code snippets in these notes, and a couple of Rust Playground links. The code is currently a barely-working proof-of-concept with very few comments. These notes are the first piece of documentation.
context
Recently I made some animations (involving Truchet tiles and the Hilbert curve), but I was lazy and hacked them together in C. Foolish! I missed a good opportunity to practise coding in Rust.
For my next animation I will use Rust, and I want to do some 3D graphics. I don't think I have done any 3D graphics since I was a teenager, so I needed to brush up my knowledge and remind myself how it works.
Back then the standard 3D graphics techniques were based on projective geometry and homogeneous coordinates. I found out that nowadays quaternions are often preferred for doing rotations instead of using a matrix. But even quaternions are being superseded.
geometric algebra
Geometric algebra is a framework that generalizes things like complex numbers, quaternions, vectors, matrices, etc. It can be used to model lots of things, only a few of which I understand: Euclidean and non-Euclidean geometry, projective geometry, the geometry of Einsteinian space-time, quantum mechanics, etc. I am going to use it for 3D projective geometry, as described in these slides about geometric algebra for computer graphics, from a tutorial at SIGGRAPH 2019. You can find the SIGGRAPH course notes and a lot more at biVector.net. I also found the Euclidean Space website and Eric Chisolm's introduction helpful. My friend and colleague Rich Wareham did a PhD on geometric algebra for computer graphics including non-Euclidean and conformal geometry.
multivectors
These notes are more about Rust than about geometric algebra; but geometric algebra is why I am doing wild things with Rust. The wild things are an answer to the question of how to represent geometric objects both elegantly and efficiently — but what makes them inefficient or inelegant?
Geometric algebra works with objects called multivectors. In PGA3D (projective geometric algebra for 3D graphics) a multivector has 5 parts:
0: A scalar part, which is just a real number.
1: A vector part, consisting of four numbers, which in PGA3D represents a plane or a rotation in that plane, like quaternions.
2: A bivector part, consisting of six numbers, which in PGA3D represents a line, like Plücker coordinates.
3: A trivector part, consisting of four numbers, which in PGA3D represents a point or direction, like a vector in homogeneous coordinates.
4: A pseudoscalar part, like an imaginary number, which is used for things like inversion.
That's a total of 16 numbers, same as the kind of 4x4 transformation matrix I used for 3D graphics when I was a teenager.
a dilemma
OK, so on the one hand we have an elegant way to describe all sorts of geometric objects and operations in terms of multivectors that contain 16 numbers.
And on the other hand, in a more traditional setup we have specialized objects for points (4 numbers), lines (6 numbers), and rotations (4 numbers).
In a program that is working with a lot of objects, we want them to be as compact as possible, so that we don't waste memory or bus bandwidth.
And the extra space used by multivectors is obnoxiously wasteful: it is mostly zeroes, and if we know what kind of object the multivector represents, we know which parts of it must be zero.
zero-sized zero
Unlike C and C++, Rust has zero-sized types. I can declare
struct Zero;
to create a type representing zeroes that vanish at run time: they take no space. I can implement Rust's operator overloading traits so that I can use my vanishing zeroes in arithmetic in the usual way. For example, when I add a number to zero, I get back the same number:
impl Add<Num> for Zero {
type Output = Num;
fn add(self, other: Num) -> Num {
other
}
}
but when I multiply a number by zero, the number vanishes and I get zero:
impl Mul<Num> for Zero {
type Output = Zero;
fn mul(self, _: Num) -> Zero {
Zero
}
}
Aside: I have pointlessly declared a type alias
type Num = f64
in case I want to change it, even though I probably won't.
variable-sized multivectors
I can define a fixed-size PGA3D multivector like
struct Multivector(V0, V1, V2, V3, V4);
using some helper types V0
... V4
that I have defined to stop me
from muddling up the various parts.
I can make this into a variable-sized multivector If I parameterize all these types:
struct Multivec<T0, T1, T2, T3, T4>
(V0<T0>, V1<T1>, V2<T2>, V3<T3>, V4<T4>);
Then a full-sized multivector is:
type Multivector =
Multivec<Num, Num, Num, Num, Num>;
And I can make parts of the multivector vanish by setting the corresponding type parameters to zero.
Aside: This
Multivec<>
is variable-sizes in the sense that it can have different shapes in different parts of the source code, but each shape is a fixed size at run time. In Rust terms, all the instances areSized
.
the big idea
I want to write geometric algebra code that is both efficient and elegant.
Efficient means that a multivector is specialized to the geometric object that it represents, and any parts that are known to be zero vanish: they are not present in memory and no calculations are performed on them.
Elegant means that the code I write looks like it is working with fully-general multivectors: I don't have to tell the compiler in detail which specialized type each variable has, because the compiler can work that out for itself.
When my code cares about both efficiency and elegance, I should be able to write a transformation in elegant style, and just specify the type of the result. For example, I might want to be sure that a rotated point is still just a point. The compiler should be able to tell me when I get it wrong.
I want to do this without too much machinery. Multiplying two multivectors, a full geometric product, is a beastly 16x16→16 monster expression. I want to write this once, and get the compiler to simplify it automatically for the various specialized cases.
I have a proof-of-concept that I think will be able to do this!
demo test
What does this idea look like in practice? Let's try a few basic tests of a simplified version that works on complex numbers instead of multivectors.
First, efficiency: complex numbers whose real or imaginary parts are zero should be the same size as a normal number.
let re = Complex(1.0, Zero);
assert_eq!(size_of_val(&re), size_of::<Num>());
let im = Complex(Zero, 1.0);
assert_eq!(size_of_val(&im), size_of::<Num>());
Elegance: I can add complex numbers without saying anything about which parts are zero.
let c = re + im;
The compiler has worked out for itself that neither real nor imaginary part is known to be zero, so the result has to contain two numbers.
assert_eq!(size_of_val(&c), size_of::<Num>() * 2);
Back to efficiency: I can add complex numbers without saying anything about the result type, and my half-size complex numbers do not become bigger when they don't need to.
let re2 = re + re;
assert_eq!(size_of_val(&re2), size_of::<Num>());
let im2 = im + im;
assert_eq!(size_of_val(&im2), size_of::<Num>());
Finally, a few arithmetic expressions work as expected.
assert_eq!(re2, 2.0 * re);
assert_eq!(im2, im * 2.0);
assert_eq!(im * im, -re);
assert_eq!(re * im, im);
assert_eq!(c * im, im - re);
assert_eq!(c, 1.0 + im);
OK, not too bad!
Here's how it works...
abstract interpretation
A compiler will often have some partial information about the inputs to an expression, and it needs to work out how that information propagates through the expression. This is abstract interpretation.
In our case, the information we want to track is whether a value is always guaranteed to be zero or not. When we are operating on simple scalar values, the rules for propagating zeroness are simple:
Add | Zero | Num Mul | Zero | Num
-----+------+----- -----+------+-----
Zero | Zero | Num Zero | Zero | Zero
Num | Num | Num Num | Zero | Num
The zeroness rules for multiplying multivectors are much more complicated! But most of the difficulty also happens in the same kind of way when we multiply complex numbers, so I'm going to explain how to implement complex multiplication with vanishing real and imaginary parts. Later on we'll come back to the issue of scaling up to multivectors.
Here's a straightforward multiplication function, without any type parameterization:
fn mul(a: Complex, b: Complex) -> Complex {
let Complex(ar, ai) = a;
let Complex(br, bi) = b;
Complex(ar * br - ai * bi,
ar * bi + ai * br)
}
Abstract interpretation is something a compiler will do to this expression as part of the compiler's built-in analysis. I want to implement my own zeroness analysis on top of the compiler not inside it, so I won't be doing abstract interpretation as such. Instead, alongside this expression which describes how to multiply numbers at run time, I'm going to write a second expression that calculates its zeroness at compile time.
This second expression does calculations inside Rust's type system. It
works out the types of each part of the run-time expression, either
Zero
or Num
, to describe which parts we know at compile time will
always be zero at run time. The compiler then makes Zero
values and
expressions vanish.
type-level function definitions
OK, how do I write these calculations inside Rust's type system?
A trait is a type-level function declaration:
trait Sum {
type Output;
}
The type-level function's argument is the (implicit) Self
type for
which we will implement the trait. The type-level function's result is
the associated type Output
. (It could have any name but Output
is
conventional.)
The body of this Sum
type-level function is all the impl Sum
trait
implementations. Each impl Sum
has a self type, which is its
argument pattern, and the body of the implementation is the result for
that argument. So a type-level function is defined using a collection
of pattern-result clauses, like functions in Haskell.
For example, here's a type-level function definition of our zeroness analysis for addition (or subtraction) of two values, as in the table above.
impl Sum for (Zero,Zero) {
type Output = Zero;
}
impl Sum for (Zero,Num) {
type Output = Num;
}
impl Sum for (Num,Zero) {
type Output = Num;
}
impl Sum for (Num,Num) {
type Output = Num;
}
In my Sum
type-level function I have used a tuple to bundle two
parameters into a single argument. But we can also define type-level
functions in a sort-of-Curried asymmetrical style, as I am about to
demonstrate with Mul
.
Later on I will show how to call a type-level function, but first, it turns out that some of the type-level functions I need are predefined by Rust.
a lesson from rustc
In the early stages of my struggle to get this to work, at one point I
had my own type-level functions for sums and products (similar to
Sum
above) guarding an expression for calculating the run-time
result, but the rest of the type-level machinery was still missing.
Now, in Rust when you are writing a (normal) generic function that
does arithmetic, you need to put trait bounds on the type parameters
to tell the compiler that the arithmetic will work. For example, the
Mul
trait bound here tells the compiler that the *
operator will
work on values of whatever type T
turns out to be.
struct Bar<T>(T);
fn babar<T>(a: Bar<T>, b: Bar<T>) -> Bar<T>
where T: Mul<Output = T>
{
Bar(a.0 * b.0)
}
Aside: I expect Rustaceans will think that this is a very strange description of trait bounds. Yes! Yes, it is. But I hope this odd perspective will make a bit more sense when we get to type-level function calls and expressions.
I was getting the Rust compiler to help me fill in the missing type
machinery, and I was amused when I realised that the error messages
were telling me to add trait bounds for Mul
that were exactly
parallel to my own type-level functions for abstract zeroness.
But why?
To implement operator-overloaded multiplication with vanished zeroes
at run time (or rather, to tell the compiler that it can optimize them
out), I need a few implementations of the std::ops::Mul
trait:
impl Mul<Zero> for Zero {
type Output = Zero;
fn mul(self, _: Zero) -> Zero {
Zero
}
}
impl Mul<Zero> for Num {
type Output = Zero;
fn mul(self, _: Zero) -> Zero {
Zero
}
}
impl Mul<Num> for Zero {
type Output = Zero;
fn mul(self, _: Num) -> Zero {
Zero
}
}
Rust does not assume that operators are commutative, so I need two
implementations for Num*Zero
and Zero*Num
. The Num*Num
case is already covered by Rust's f64*f64
implementation.
The Rust compiler is doing its own abstract interpretation of my
run-time expression to prove that the overloaded operators have the
necessary implementations, and to find out what their Output
types
are.
My type-level zeroness calculation must correspond to my run-time
expression, and it needs to calculate the same Output
types, so it
can piggy-back on the existing operator overloading machinery!
This is really sweet! And I love the way that the Rust compiler nudges me towards a better design. (This has happened to me a few times now!)
But all was not sweetness and light. I ran into some problems that made my type expressions explode and become impossibly unwieldy. Let's look at a simple example, then it will become clear how the types went wrong.
type-level function calls
Normally when defining overloaded operators, you are either dealing
with concrete types (as in the implementations of Mul
for Zero
and
Num
above), or dealing with generic types where the argument and
result types are the same (as in babar()
above). Either way, the
type expressions remain fairly simple.
However, my vanishing zeroes need to be fully generic, because every
part of my number might be Zero
or a Num
. This is what that looks
like when multiplying complex numbers:
use std::ops::{Add, Mul, Sub};
struct Complex<R, I>(R, I);
impl<Ar, Ai, Br, Bi>
Mul<Complex<Br, Bi>>
for Complex<Ar, Ai>
where
Ar: Copy,
Ai: Copy,
Br: Copy,
Bi: Copy,
Ar: Mul<Br>,
Ar: Mul<Bi>,
Ai: Mul<Br>,
Ai: Mul<Bi>,
<Ar as Mul<Br>>::Output: Sub<<Ai as Mul<Bi>>::Output>,
<Ar as Mul<Bi>>::Output: Add<<Ai as Mul<Br>>::Output>,
{
type Output = Complex<
<<Ar as Mul<Br>>::Output as Sub<
<Ai as Mul<Bi>>::Output>>::Output,
<<Ar as Mul<Bi>>::Output as Add<
<Ai as Mul<Br>>::Output>>::Output,
>;
fn mul(self, other: Complex<Br, Bi>)
-> Self::Output {
let Complex(ar, ai) = self;
let Complex(br, bi) = other;
Complex(ar * br - ai * bi,
ar * bi + ai * br)
}
}
[ Rust Playground ]
Our Complex
type is fully generic in its real and imaginary parts.
Our impl Mul for Complex
is also fully generic, with four
independent type variables for each component of each argument.
The where
clause is where most of the action happens.
Ar: Copy
...
These lines say we need to be able to copy each part of each argument so they can be used in both parts of the run-time expression. (So, not interesting for compile time.)
Ar: Mul<Br>
...
These lines say we need to be able to multiply each part of a
with
each part of b
. These trait bounds do two things: they allow me to
multiply at run time, as in ar*br
, and they allow me to call a
type-level function to work out zeroness at compile time.
How does that work?
The type-level meaning of Ar: Mul<Br>
is a declaration that I am
going to call the type-level function Mul
with arguments Ar
and
Br
.
<Ar as Mul<Br>>::Output
This expression is the actual type-level function call. It extracts
the result of the call from the trait implementation, which is the
Output
associated type. It corresponds to the type (and zeroness) of
the run time expression ar*br
.
Given the results of our four multiplications, we need to do an addition and a subtraction. This is a bit hairy:
<Ar as Mul<Br>>::Output: Sub<<Ai as Mul<Bi>>::Output>
If I abbreviate the multiplications like this:
ArBr = <Ar as Mul<Br>>::Output
AiBi = <Ai as Mul<Bi>>::Output
the trait bound looks like this:
ArBr: Sub<AiBi>
This allows me to write ar*br - ai*bi
in the run time expression,
and at compile time it allows me to calculate the type of the real
part of the result. The abbreviated version of that type looks like:
<ArBr as Sub<AiBi>>::Output
And you can see it in its full glory inside the definition of the
Output = Complex<...>
associated type:
<<Ar as Mul<Br>>::Output as Sub<
<Ai as Mul<Bi>>::Output>>::Output
Whew!
quadratic explosion
OK, that's already pretty unpalatable, if you compare the size of the run-time calculation to the size of the compile-time calculation. What happens if we try to scale it up to multivectors?
Well, each multivector argument has 5 generic type parameters, so
impl Mul
has 10 type parameters, and its trait bounds list 25
different An: Mul<Bn>
combinations.
But the real problem happens with the sums. In a full geometric product, there are 16 result expressions and many of the expressions are sums of 16 products.
In the trait bounds, we are required to write the type of every subexpression, in full, twice, and there are no ways to abbreviate the types:
we can't name intermediate types and refer back to them inside trait bounds;
hygiene prevents us from using C-style textual macrology to generate things like multiple trait bounds with a single macro invocation.
I struggled a lot trying to find a way to persuade the Rust compiler that surely it can work out these intermediate types for itself, but no, it has to be led by the hand through every step of the derivation.
The compiler finds it difficult because it is dealing with an open
world. Contrast pattern matching with Rust enum
s: we know the set of
possibilities is fixed, so the compiler can check the pattern is
exhaustive. But with trait bounds on type variables, someone can come
along and define a new type and new trait implementations that break
the assumptions in a generic function. So the assumptions have to be
written down and checked.
I wondered if I could persuade the compiler that the generic type
parameters could only be Zero
or Num
and no other types, but all
we can say in trait bounds is that the type has to implement a trait,
and traits are open to extension.
concrete brutalism
Eventually I found a combination of techniques to get these horrible type expressions under control:
Split out the run-time expression for each of the result parts, (for complex numbers, the real part or the imaginary part) and make them completely concrete. This means there's no need to convince the compiler that I am allowed to add an
f64
to anf64
.Choose which concrete version of each result expression to invoke based on the result types, instead of the argument types. Each part of the result can be either
Zero
or aNum
. So for each part I only need two concrete implementations: a no-opZero
, or the full result as aNum
expression.Lean on the optimizer more. The
Complex
Mul
above is using overloaded operators which are defined as no-ops when either of the arguments isZero
. In the concrete version, the code says that everyZero
in the arguments should be un-vanished into0.0
before doing the calculation; I am assuming that the generic multiply will be monomorphized, the concrete result expressions will be inlined, and constant propagation will turn expressions involving theZero
arguments back into no-ops.The type-level zeroness calculation is now decoupled from the run-time result calculation. This allows me to make the multivector type-level calculation more abstract and simpler, because it only needs to talk about the 5x5 combination of the parts, not the full 16x16 multiplication table. (The complex number version is too simple for this to make any difference.)
Make the type-level functions as flat and wide as possible. Every subexpression requires more clauses in the trait bounds, so wide and shallow expressions require fewer lines than narrow and deep expressions.
scalable multiplication
Let's see how these changes look in the simple setting of complex multiplication. The extra machinery makes the code a fair bit bigger, but it's a modest linear overhead so for multivectors it ends up being smaller compared to the quadratic growth of the naive design. These code snippets are taken from the full version on the Rust Playground
For our concrete output expressions, there's a trait which has a function for each part of the result. (This is an ordinary trait not a type-level function trait.)
trait CMul<A, B> {
fn re(_: A, _: B) -> Self;
fn im(_: A, _: B) -> Self;
}
This trait is implemented for return types of Zero
and Num
. It is
invoked separately for each part of the result, and each invocation
can have a different result type. The invocation looks simple, because
the types are specified elsewhere:
fn mul(self, other: Complex<Br, Bi>)
-> Self::Output {
Complex(CMul::re(self, other),
CMul::im(self, other))
}
The no-op CMul
trait implementation is simple: a Zero
result just
makes its arguments vanish.
impl<A, B> CMul<A, B> for Zero {
fn re(_: A, _: B) -> Self { Zero }
fn im(_: A, _: B) -> Self { Zero }
}
The CMul for Num
implementation does two things: it converts
arguments of a partially-vanished type to the fully-hydrated
Num
-everywhere concrete type, then does the calculations on the
concrete type. In the case of multivectors, the expressions are much
bigger than the conversions.
type ComplexNum = Complex<Num,Num>;
impl<Ar, Ai, Br, Bi>
CMul<Complex<Ar, Ai>,
Complex<Br, Bi>> for Num
where
ComplexNum: From<Complex<Ar, Ai>>,
ComplexNum: From<Complex<Br, Bi>>,
{
fn re(a: Complex<Ar, Ai>,
b: Complex<Br, Bi>) -> Self {
let Complex(ar, ai) = ComplexNum::from(a);
let Complex(br, bi) = ComplexNum::from(b);
ar * br - ai * bi
}
fn im(a: Complex<Ar, Ai>,
b: Complex<Br, Bi>) -> Self {
let Complex(ar, ai) = ComplexNum::from(a);
let Complex(br, bi) = ComplexNum::from(b);
ar * bi + ai * br
}
}
In the scalable impl Mul for Complex
the trait bounds start with the
same set of multiplications as the naive version. This time they are
only used for type-level calculation, because we don't need them for
run-time overloading any more.
Ar: Mul<Br>,
Ar: Mul<Bi>,
Ai: Mul<Br>,
Ai: Mul<Bi>,
At the top level we have an alias to make it nicer to write the type-level result of multiplication.
type Prod<A, B> = <A as Mul<B>>::Output;
If you remember way back when I introduced type-level function
definitions, I used trait Sum
as my example. Well, that
sum-of-a-tuple is how I make the type-level sum calculation wide and
shallow, because a tuple can wrap an arbitrary number of arguments.
Back to the trait bounds, next I declare how I will calculate my result types. The difference between addition and subtraction doesn't matter when we only care about zeroness.
(Prod<Ar, Br>, Prod<Ai, Bi>): Sum,
(Prod<Ar, Bi>, Prod<Ai, Br>): Sum,
Now a new part. I need to say that the zeroness calculated by Sum
determines which implementation of my CMul
concrete multiplication
to use.
<(Prod<Ar, Br>, Prod<Ai, Bi>) as Sum>::Output:
CMul<Complex<Ar, Ai>, Complex<Br, Bi>>,
<(Prod<Ar, Bi>, Prod<Ai, Br>) as Sum>::Output:
CMul<Complex<Ar, Ai>, Complex<Br, Bi>>,
Finally I need to satisfy a couple of run time requirements: I need to
copy my arguments into all of the result parts, and CMul
may need to
convert them.
Complex<Ar, Ai>: Copy + Into<ComplexNum>,
Complex<Br, Bi>: Copy + Into<ComplexNum>,
The zeronesses calculated by Sum
are also the types of each part of
the output type:
type Output = Complex<
<(Prod<Ar, Br>, Prod<Ai, Bi>) as Sum>::Output,
<(Prod<Ar, Bi>, Prod<Ai, Br>) as Sum>::Output,
>;
You can see a complete version of the complex number code on the Rust Playground
exponential growth
But I should warn you that there are a couple of issues with this design. Fortunately they don't seem to be troublesome so far, even though both of them involve exponential growth.
The first is an annoying aspect of the way Rust's conversion traits
are defined. I want to be able to specify how to convert Zero
into
Num
, then define ComplexNum
conversions in terms of anything that
can be converted into a Num
, like this:
impl<R,I> From<Complex<R,I>> for ComplexNum
where
R: Into<Num>,
I: Into<Num>,
{
fn from(c: Complex<R, I>) -> Self {
let Complex(r, i) = c;
Complex(r.into(), i.into())
}
}
But sadly the Rust compiler complains:
error[E0119]: conflicting implementations of trait
`std::convert::From<complex::Complex<f64, f64>>`
for type `complex::Complex<f64, f64>`:
= note: conflicting implementation in crate `core`:
- impl<T> From<T> for T;
Ugh! Instead I use a macro to implement the three conversions
involving Zero and leave out the no-op From<ComplexNum> for
ComplexNum
that conflicts with the standard library.
For multivectors, unfortunately I need 31 of these conversions. Yuck.
The other case where it goes exponential is my wide and shallow Sum
trait. For multivectors, I need to get the zeroness of the sum of up
to 9 (nine!) arguments, each of which can be Zero
or Num
.
Aside: I said earlier that the multivector result expressions often have 16 additions. Why is it only 9 now? This is where the abstract type-level zeroness expression can be simpler than the concrete run-time expression, because the type-level expression deals with the 5x5 parts, whereas the run-time expression deals with the 16x16 numbers.
In total the Sum
trait has 820 implementations! (I don't need
argument lists of every possible length.) This seems like a lot, but
the trait implementations don't involve any run time code, so they
cause no slow LLVM codegen / optimization. The compiler seems to
handle it without trouble.
Aside: in one of my futile detours, I tried using macros to generate all the 1024 possible concrete implementations of
Mul
for multivectors. This did make the compiler sad: it took well over ten seconds to compile or type-check a trivial program. The code described here compiles in less than a second which is much more tolerable.
To keep these 820 implementations linear at the source code level, I have used a little bit of macrology, so I can fit them all into 50 lines of code.
There's one last obvious problem with using this code for more than
wild experiments: my Num
type is currently fixed; ideally it should
be generic over numeric types. There are probably a few more
significant hurdles that would make this difficult, but for now I am
going to leave it on the todo list.
what next
Well these notes have become rather more than 4000 words according to
wc
! Well done if you read this far :-)
As I said earlier, what I have so far is only a proof-of-concept. It has passed some trivial tests, but I have not yet tried to use it properly. So the next thing is obviously to try to draw some pictures!