Vanishing zeroes for geometric algebra in Rust

These are some notes on what I have done with Rust over the last week, in the course of doing something highly experimental that could well turn out to be foolishly over-complicated. (I'll be interested to hear about other ways of doing this, in other programming languages as well as Rust!) But it involves some cool ideas - abstract interpretation, type-level programming in Rust, automatically optimized data structures, and a bit of geometric algebra for 3D graphics - and I want to share what I learned.

Health warning: this did turn out to be foolishly over-complicated - I suggest reading the followup before wading in to the rest of these notes, if you have not already read them.

Aside: There are a lot of code snippets in these notes, and a couple of Rust Playground links. The code is currently a barely-working proof-of-concept with very few comments. These notes are the first piece of documentation.

context

Recently I made some animations (involving Truchet tiles and the Hilbert curve), but I was lazy and hacked them together in C. Foolish! I missed a good opportunity to practise coding in Rust.

For my next animation I will use Rust, and I want to do some 3D graphics. I don't think I have done any 3D graphics since I was a teenager, so I needed to brush up my knowledge and remind myself how it works.

Back then the standard 3D graphics techniques were based on projective geometry and homogeneous coordinates. I found out that nowadays quaternions are often preferred for doing rotations instead of using a matrix. But even quaternions are being superseded.

geometric algebra

Geometric algebra is a framework that generalizes things like complex numbers, quaternions, vectors, matrices, etc. It can be used to model lots of things, only a few of which I understand: Euclidean and non-Euclidean geometry, projective geometry, the geometry of Einsteinian space-time, quantum mechanics, etc. I am going to use it for 3D projective geometry, as described in these slides about geometric algebra for computer graphics, from a tutorial at SIGGRAPH 2019. You can find the SIGGRAPH course notes and a lot more at biVector.net. I also found the Euclidean Space website and Eric Chisolm's introduction helpful. My friend and colleague Rich Wareham did a PhD on geometric algebra for computer graphics including non-Euclidean and conformal geometry.

multivectors

These notes are more about Rust than about geometric algebra; but geometric algebra is why I am doing wild things with Rust. The wild things are an answer to the question of how to represent geometric objects both elegantly and efficiently — but what makes them inefficient or inelegant?

Geometric algebra works with objects called multivectors. In PGA3D (projective geometric algebra for 3D graphics) a multivector has 5 parts:

0: A scalar part, which is just a real number.
1: A vector part, consisting of four numbers, which in PGA3D represents a plane or a rotation in that plane, like quaternions.
2: A bivector part, consisting of six numbers, which in PGA3D represents a line, like Plücker coordinates.
3: A trivector part, consisting of four numbers, which in PGA3D represents a point or direction, like a vector in homogeneous coordinates.
4: A pseudoscalar part, like an imaginary number, which is used for things like inversion.

That's a total of 16 numbers, same as the kind of 4x4 transformation matrix I used for 3D graphics when I was a teenager.

a dilemma

OK, so on the one hand we have an elegant way to describe all sorts of geometric objects and operations in terms of multivectors that contain 16 numbers.

And on the other hand, in a more traditional setup we have specialized objects for points (4 numbers), lines (6 numbers), and rotations (4 numbers).

In a program that is working with a lot of objects, we want them to be as compact as possible, so that we don't waste memory or bus bandwidth.

And the extra space used by multivectors is obnoxiously wasteful: it is mostly zeroes, and if we know what kind of object the multivector represents, we know which parts of it must be zero.

zero-sized zero

Unlike C and C++, Rust has zero-sized types. I can declare

    struct Zero;

to create a type representing zeroes that vanish at run time: they take no space. I can implement Rust's operator overloading traits so that I can use my vanishing zeroes in arithmetic in the usual way. For example, when I add a number to zero, I get back the same number:

    impl Add<Num> for Zero {
        type Output = Num;
        fn add(self, other: Num) -> Num {
            other
        }
    }

but when I multiply a number by zero, the number vanishes and I get zero:

    impl Mul<Num> for Zero {
        type Output = Zero;
        fn mul(self, _: Num) -> Zero {
            Zero
        }
    }

Aside: I have pointlessly declared a type alias type Num = f64 in case I want to change it, even though I probably won't.

variable-sized multivectors

I can define a fixed-size PGA3D multivector like

    struct Multivector(V0, V1, V2, V3, V4);

using some helper types V0 ... V4 that I have defined to stop me from muddling up the various parts.

I can make this into a variable-sized multivector If I parameterize all these types:

    struct Multivec<T0, T1, T2, T3, T4>
        (V0<T0>, V1<T1>, V2<T2>, V3<T3>, V4<T4>);

Then a full-sized multivector is:

    type Multivector =
             Multivec<Num, Num, Num, Num, Num>;

And I can make parts of the multivector vanish by setting the corresponding type parameters to zero.

Aside: This Multivec<> is variable-sizes in the sense that it can have different shapes in different parts of the source code, but each shape is a fixed size at run time. In Rust terms, all the instances are Sized.

the big idea

I want to write geometric algebra code that is both efficient and elegant.

Efficient means that a multivector is specialized to the geometric object that it represents, and any parts that are known to be zero vanish: they are not present in memory and no calculations are performed on them.

Elegant means that the code I write looks like it is working with fully-general multivectors: I don't have to tell the compiler in detail which specialized type each variable has, because the compiler can work that out for itself.

When my code cares about both efficiency and elegance, I should be able to write a transformation in elegant style, and just specify the type of the result. For example, I might want to be sure that a rotated point is still just a point. The compiler should be able to tell me when I get it wrong.

I want to do this without too much machinery. Multiplying two multivectors, a full geometric product, is a beastly 16x16→16 monster expression. I want to write this once, and get the compiler to simplify it automatically for the various specialized cases.

I have a proof-of-concept that I think will be able to do this!

demo test

What does this idea look like in practice? Let's try a few basic tests of a simplified version that works on complex numbers instead of multivectors.

First, efficiency: complex numbers whose real or imaginary parts are zero should be the same size as a normal number.

    let re = Complex(1.0, Zero);
    assert_eq!(size_of_val(&re), size_of::<Num>());
    let im = Complex(Zero, 1.0);
    assert_eq!(size_of_val(&im), size_of::<Num>());

Elegance: I can add complex numbers without saying anything about which parts are zero.

    let c = re + im;

The compiler has worked out for itself that neither real nor imaginary part is known to be zero, so the result has to contain two numbers.

    assert_eq!(size_of_val(&c), size_of::<Num>() * 2);

Back to efficiency: I can add complex numbers without saying anything about the result type, and my half-size complex numbers do not become bigger when they don't need to.

    let re2 = re + re;
    assert_eq!(size_of_val(&re2), size_of::<Num>());
    let im2 = im + im;
    assert_eq!(size_of_val(&im2), size_of::<Num>());

Finally, a few arithmetic expressions work as expected.

    assert_eq!(re2, 2.0 * re);
    assert_eq!(im2, im * 2.0);
    assert_eq!(im * im, -re);
    assert_eq!(re * im, im);
    assert_eq!(c * im, im - re);
    assert_eq!(c, 1.0 + im);

OK, not too bad!

Here's how it works...

abstract interpretation

A compiler will often have some partial information about the inputs to an expression, and it needs to work out how that information propagates through the expression. This is abstract interpretation.

In our case, the information we want to track is whether a value is always guaranteed to be zero or not. When we are operating on simple scalar values, the rules for propagating zeroness are simple:

    Add  | Zero | Num     Mul  | Zero | Num
    -----+------+-----    -----+------+-----
    Zero | Zero | Num     Zero | Zero | Zero
    Num  | Num  | Num     Num  | Zero | Num

The zeroness rules for multiplying multivectors are much more complicated! But most of the difficulty also happens in the same kind of way when we multiply complex numbers, so I'm going to explain how to implement complex multiplication with vanishing real and imaginary parts. Later on we'll come back to the issue of scaling up to multivectors.

Here's a straightforward multiplication function, without any type parameterization:

    fn mul(a: Complex, b: Complex) -> Complex {
        let Complex(ar, ai) = a;
        let Complex(br, bi) = b;
        Complex(ar * br - ai * bi,
                ar * bi + ai * br)
    }

Abstract interpretation is something a compiler will do to this expression as part of the compiler's built-in analysis. I want to implement my own zeroness analysis on top of the compiler not inside it, so I won't be doing abstract interpretation as such. Instead, alongside this expression which describes how to multiply numbers at run time, I'm going to write a second expression that calculates its zeroness at compile time.

This second expression does calculations inside Rust's type system. It works out the types of each part of the run-time expression, either Zero or Num, to describe which parts we know at compile time will always be zero at run time. The compiler then makes Zero values and expressions vanish.

type-level function definitions

OK, how do I write these calculations inside Rust's type system?

A trait is a type-level function declaration:

    trait Sum {
        type Output;
    }

The type-level function's argument is the (implicit) Self type for which we will implement the trait. The type-level function's result is the associated type Output. (It could have any name but Output is conventional.)

The body of this Sum type-level function is all the impl Sum trait implementations. Each impl Sum has a self type, which is its argument pattern, and the body of the implementation is the result for that argument. So a type-level function is defined using a collection of pattern-result clauses, like functions in Haskell.

For example, here's a type-level function definition of our zeroness analysis for addition (or subtraction) of two values, as in the table above.

    impl Sum for (Zero,Zero) {
        type Output = Zero;
    }
    impl Sum for (Zero,Num) {
        type Output = Num;
    }
    impl Sum for (Num,Zero) {
        type Output = Num;
    }
    impl Sum for (Num,Num) {
        type Output = Num;
    }

In my Sum type-level function I have used a tuple to bundle two parameters into a single argument. But we can also define type-level functions in a sort-of-Curried asymmetrical style, as I am about to demonstrate with Mul.

Later on I will show how to call a type-level function, but first, it turns out that some of the type-level functions I need are predefined by Rust.

a lesson from rustc

In the early stages of my struggle to get this to work, at one point I had my own type-level functions for sums and products (similar to Sum above) guarding an expression for calculating the run-time result, but the rest of the type-level machinery was still missing.

Now, in Rust when you are writing a (normal) generic function that does arithmetic, you need to put trait bounds on the type parameters to tell the compiler that the arithmetic will work. For example, the Mul trait bound here tells the compiler that the * operator will work on values of whatever type T turns out to be.

    struct Bar<T>(T);

    fn babar<T>(a: Bar<T>, b: Bar<T>) -> Bar<T>
    where T: Mul<Output = T>
    {
        Bar(a.0 * b.0)
    }

Aside: I expect Rustaceans will think that this is a very strange description of trait bounds. Yes! Yes, it is. But I hope this odd perspective will make a bit more sense when we get to type-level function calls and expressions.

I was getting the Rust compiler to help me fill in the missing type machinery, and I was amused when I realised that the error messages were telling me to add trait bounds for Mul that were exactly parallel to my own type-level functions for abstract zeroness.

But why?

To implement operator-overloaded multiplication with vanished zeroes at run time (or rather, to tell the compiler that it can optimize them out), I need a few implementations of the std::ops::Mul trait:

    impl Mul<Zero> for Zero {
        type Output = Zero;
        fn mul(self, _: Zero) -> Zero {
            Zero
        }
    }

    impl Mul<Zero> for Num {
        type Output = Zero;
        fn mul(self, _: Zero) -> Zero {
            Zero
        }
    }

    impl Mul<Num> for Zero {
        type Output = Zero;
        fn mul(self, _: Num) -> Zero {
            Zero
        }
    }

Rust does not assume that operators are commutative, so I need two implementations for Num*Zero and Zero*Num. The Num*Num case is already covered by Rust's f64*f64 implementation.

The Rust compiler is doing its own abstract interpretation of my run-time expression to prove that the overloaded operators have the necessary implementations, and to find out what their Output types are.

My type-level zeroness calculation must correspond to my run-time expression, and it needs to calculate the same Output types, so it can piggy-back on the existing operator overloading machinery!

This is really sweet! And I love the way that the Rust compiler nudges me towards a better design. (This has happened to me a few times now!)

But all was not sweetness and light. I ran into some problems that made my type expressions explode and become impossibly unwieldy. Let's look at a simple example, then it will become clear how the types went wrong.

type-level function calls

Normally when defining overloaded operators, you are either dealing with concrete types (as in the implementations of Mul for Zero and Num above), or dealing with generic types where the argument and result types are the same (as in babar() above). Either way, the type expressions remain fairly simple.

However, my vanishing zeroes need to be fully generic, because every part of my number might be Zero or a Num. This is what that looks like when multiplying complex numbers:

    use std::ops::{Add, Mul, Sub};

    struct Complex<R, I>(R, I);

    impl<Ar, Ai, Br, Bi>
        Mul<Complex<Br, Bi>>
        for Complex<Ar, Ai>
    where
        Ar: Copy,
        Ai: Copy,
        Br: Copy,
        Bi: Copy,
        Ar: Mul<Br>,
        Ar: Mul<Bi>,
        Ai: Mul<Br>,
        Ai: Mul<Bi>,
        <Ar as Mul<Br>>::Output: Sub<<Ai as Mul<Bi>>::Output>,
        <Ar as Mul<Bi>>::Output: Add<<Ai as Mul<Br>>::Output>,
    {
        type Output = Complex<
            <<Ar as Mul<Br>>::Output as Sub<
             <Ai as Mul<Bi>>::Output>>::Output,
            <<Ar as Mul<Bi>>::Output as Add<
             <Ai as Mul<Br>>::Output>>::Output,
        >;
        fn mul(self, other: Complex<Br, Bi>)
                -> Self::Output {
            let Complex(ar, ai) = self;
            let Complex(br, bi) = other;
            Complex(ar * br - ai * bi,
                    ar * bi + ai * br)
        }
    }

[ Rust Playground ]

Our Complex type is fully generic in its real and imaginary parts.

Our impl Mul for Complex is also fully generic, with four independent type variables for each component of each argument.

The where clause is where most of the action happens.

        Ar: Copy
        ...

These lines say we need to be able to copy each part of each argument so they can be used in both parts of the run-time expression. (So, not interesting for compile time.)

        Ar: Mul<Br>
        ...

These lines say we need to be able to multiply each part of a with each part of b. These trait bounds do two things: they allow me to multiply at run time, as in ar*br, and they allow me to call a type-level function to work out zeroness at compile time.

How does that work?

The type-level meaning of Ar: Mul<Br> is a declaration that I am going to call the type-level function Mul with arguments Ar and Br.

        <Ar as Mul<Br>>::Output

This expression is the actual type-level function call. It extracts the result of the call from the trait implementation, which is the Output associated type. It corresponds to the type (and zeroness) of the run time expression ar*br.

Given the results of our four multiplications, we need to do an addition and a subtraction. This is a bit hairy:

        <Ar as Mul<Br>>::Output: Sub<<Ai as Mul<Bi>>::Output>

If I abbreviate the multiplications like this:

        ArBr = <Ar as Mul<Br>>::Output
        AiBi = <Ai as Mul<Bi>>::Output

the trait bound looks like this:

        ArBr: Sub<AiBi>

This allows me to write ar*br - ai*bi in the run time expression, and at compile time it allows me to calculate the type of the real part of the result. The abbreviated version of that type looks like:

        <ArBr as Sub<AiBi>>::Output

And you can see it in its full glory inside the definition of the Output = Complex<...> associated type:

        <<Ar as Mul<Br>>::Output as Sub<
         <Ai as Mul<Bi>>::Output>>::Output

Whew!

quadratic explosion

OK, that's already pretty unpalatable, if you compare the size of the run-time calculation to the size of the compile-time calculation. What happens if we try to scale it up to multivectors?

Well, each multivector argument has 5 generic type parameters, so impl Mul has 10 type parameters, and its trait bounds list 25 different An: Mul<Bn> combinations.

But the real problem happens with the sums. In a full geometric product, there are 16 result expressions and many of the expressions are sums of 16 products.

In the trait bounds, we are required to write the type of every subexpression, in full, twice, and there are no ways to abbreviate the types:

we can't name intermediate types and refer back to them inside trait bounds;
hygiene prevents us from using C-style textual macrology to generate things like multiple trait bounds with a single macro invocation.

I struggled a lot trying to find a way to persuade the Rust compiler that surely it can work out these intermediate types for itself, but no, it has to be led by the hand through every step of the derivation.

The compiler finds it difficult because it is dealing with an open world. Contrast pattern matching with Rust enums: we know the set of possibilities is fixed, so the compiler can check the pattern is exhaustive. But with trait bounds on type variables, someone can come along and define a new type and new trait implementations that break the assumptions in a generic function. So the assumptions have to be written down and checked.

I wondered if I could persuade the compiler that the generic type parameters could only be Zero or Num and no other types, but all we can say in trait bounds is that the type has to implement a trait, and traits are open to extension.

concrete brutalism

Eventually I found a combination of techniques to get these horrible type expressions under control:

Split out the run-time expression for each of the result parts, (for complex numbers, the real part or the imaginary part) and make them completely concrete. This means there's no need to convince the compiler that I am allowed to add an f64 to an f64.
Choose which concrete version of each result expression to invoke based on the result types, instead of the argument types. Each part of the result can be either Zero or a Num. So for each part I only need two concrete implementations: a no-op Zero, or the full result as a Num expression.
Lean on the optimizer more. The Complex Mul above is using overloaded operators which are defined as no-ops when either of the arguments is Zero. In the concrete version, the code says that every Zero in the arguments should be un-vanished into 0.0 before doing the calculation; I am assuming that the generic multiply will be monomorphized, the concrete result expressions will be inlined, and constant propagation will turn expressions involving the Zero arguments back into no-ops.
The type-level zeroness calculation is now decoupled from the run-time result calculation. This allows me to make the multivector type-level calculation more abstract and simpler, because it only needs to talk about the 5x5 combination of the parts, not the full 16x16 multiplication table. (The complex number version is too simple for this to make any difference.)
Make the type-level functions as flat and wide as possible. Every subexpression requires more clauses in the trait bounds, so wide and shallow expressions require fewer lines than narrow and deep expressions.

scalable multiplication

Let's see how these changes look in the simple setting of complex multiplication. The extra machinery makes the code a fair bit bigger, but it's a modest linear overhead so for multivectors it ends up being smaller compared to the quadratic growth of the naive design. These code snippets are taken from the full version on the Rust Playground

For our concrete output expressions, there's a trait which has a function for each part of the result. (This is an ordinary trait not a type-level function trait.)

    trait CMul<A, B> {
        fn re(_: A, _: B) -> Self;
        fn im(_: A, _: B) -> Self;
    }

This trait is implemented for return types of Zero and Num. It is invoked separately for each part of the result, and each invocation can have a different result type. The invocation looks simple, because the types are specified elsewhere:

    fn mul(self, other: Complex<Br, Bi>)
            -> Self::Output {
        Complex(CMul::re(self, other),
                CMul::im(self, other))
    }

The no-op CMul trait implementation is simple: a Zero result just makes its arguments vanish.

    impl<A, B> CMul<A, B> for Zero {
        fn re(_: A, _: B) -> Self { Zero }
        fn im(_: A, _: B) -> Self { Zero }
    }

The CMul for Num implementation does two things: it converts arguments of a partially-vanished type to the fully-hydrated Num-everywhere concrete type, then does the calculations on the concrete type. In the case of multivectors, the expressions are much bigger than the conversions.

    type ComplexNum = Complex<Num,Num>;

    impl<Ar, Ai, Br, Bi>
        CMul<Complex<Ar, Ai>,
             Complex<Br, Bi>> for Num
    where
        ComplexNum: From<Complex<Ar, Ai>>,
        ComplexNum: From<Complex<Br, Bi>>,
    {
        fn re(a: Complex<Ar, Ai>,
              b: Complex<Br, Bi>) -> Self {
            let Complex(ar, ai) = ComplexNum::from(a);
            let Complex(br, bi) = ComplexNum::from(b);
            ar * br - ai * bi
        }
        fn im(a: Complex<Ar, Ai>,
              b: Complex<Br, Bi>) -> Self {
            let Complex(ar, ai) = ComplexNum::from(a);
            let Complex(br, bi) = ComplexNum::from(b);
            ar * bi + ai * br
        }
    }

In the scalable impl Mul for Complex the trait bounds start with the same set of multiplications as the naive version. This time they are only used for type-level calculation, because we don't need them for run-time overloading any more.

    Ar: Mul<Br>,
    Ar: Mul<Bi>,
    Ai: Mul<Br>,
    Ai: Mul<Bi>,

At the top level we have an alias to make it nicer to write the type-level result of multiplication.

    type Prod<A, B> = <A as Mul<B>>::Output;

If you remember way back when I introduced type-level function definitions, I used trait Sum as my example. Well, that sum-of-a-tuple is how I make the type-level sum calculation wide and shallow, because a tuple can wrap an arbitrary number of arguments.

Back to the trait bounds, next I declare how I will calculate my result types. The difference between addition and subtraction doesn't matter when we only care about zeroness.

    (Prod<Ar, Br>, Prod<Ai, Bi>): Sum,
    (Prod<Ar, Bi>, Prod<Ai, Br>): Sum,

Now a new part. I need to say that the zeroness calculated by Sum determines which implementation of my CMul concrete multiplication to use.

    <(Prod<Ar, Br>, Prod<Ai, Bi>) as Sum>::Output:
        CMul<Complex<Ar, Ai>, Complex<Br, Bi>>,
    <(Prod<Ar, Bi>, Prod<Ai, Br>) as Sum>::Output:
        CMul<Complex<Ar, Ai>, Complex<Br, Bi>>,

Finally I need to satisfy a couple of run time requirements: I need to copy my arguments into all of the result parts, and CMul may need to convert them.

    Complex<Ar, Ai>: Copy + Into<ComplexNum>,
    Complex<Br, Bi>: Copy + Into<ComplexNum>,

The zeronesses calculated by Sum are also the types of each part of the output type:

    type Output = Complex<
        <(Prod<Ar, Br>, Prod<Ai, Bi>) as Sum>::Output,
        <(Prod<Ar, Bi>, Prod<Ai, Br>) as Sum>::Output,
    >;

You can see a complete version of the complex number code on the Rust Playground

exponential growth

But I should warn you that there are a couple of issues with this design. Fortunately they don't seem to be troublesome so far, even though both of them involve exponential growth.

The first is an annoying aspect of the way Rust's conversion traits are defined. I want to be able to specify how to convert Zero into Num, then define ComplexNum conversions in terms of anything that can be converted into a Num, like this:

    impl<R,I> From<Complex<R,I>> for ComplexNum
    where
        R: Into<Num>,
        I: Into<Num>,
    {
        fn from(c: Complex<R, I>) -> Self {
            let Complex(r, i) = c;
            Complex(r.into(), i.into())
        }
    }

But sadly the Rust compiler complains:

    error[E0119]: conflicting implementations of trait
        `std::convert::From<complex::Complex<f64, f64>>`
        for type `complex::Complex<f64, f64>`:
    = note: conflicting implementation in crate `core`:
            - impl<T> From<T> for T;

Ugh! Instead I use a macro to implement the three conversions involving Zero and leave out the no-op From<ComplexNum> for ComplexNum that conflicts with the standard library.

For multivectors, unfortunately I need 31 of these conversions. Yuck.

The other case where it goes exponential is my wide and shallow Sum trait. For multivectors, I need to get the zeroness of the sum of up to 9 (nine!) arguments, each of which can be Zero or Num.

Aside: I said earlier that the multivector result expressions often have 16 additions. Why is it only 9 now? This is where the abstract type-level zeroness expression can be simpler than the concrete run-time expression, because the type-level expression deals with the 5x5 parts, whereas the run-time expression deals with the 16x16 numbers.

In total the Sum trait has 820 implementations! (I don't need argument lists of every possible length.) This seems like a lot, but the trait implementations don't involve any run time code, so they cause no slow LLVM codegen / optimization. The compiler seems to handle it without trouble.

Aside: in one of my futile detours, I tried using macros to generate all the 1024 possible concrete implementations of Mul for multivectors. This did make the compiler sad: it took well over ten seconds to compile or type-check a trivial program. The code described here compiles in less than a second which is much more tolerable.

To keep these 820 implementations linear at the source code level, I have used a little bit of macrology, so I can fit them all into 50 lines of code.

There's one last obvious problem with using this code for more than wild experiments: my Num type is currently fixed; ideally it should be generic over numeric types. There are probably a few more significant hurdles that would make this difficult, but for now I am going to leave it on the todo list.

what next

Well these notes have become rather more than 4000 words according to wc! Well done if you read this far :-)

As I said earlier, what I have so far is only a proof-of-concept. It has passed some trivial tests, but I have not yet tried to use it properly. So the next thing is obviously to try to draw some pictures!