Deriving the quaternion product
Introduction
I spent a little time this week getting to grips with quaternions in order to better understand some orientation filters I was using at work. I had worked with quaternions before but never had the time to digest them properly. That’s not a position I like to be in as someone who thinks of themselves as a mathematician; using a mathematical tool without understanding it is iffy at the best of times, and I’m a big believer in learning for its own sake anyway.
Sure enough, my limited understanding led to a few simple tasks taking a lot longer than they should have. In the end I spent a morning reading (ok, skimming) a 150-page book on quaternions, following a few worked examples, proving a few theorems and properties, and felt much better about them.
I decided to write up what I learned, and to write a small library to do the work, in order to further cement and test my understanding. That work is presented here. This article covers
- a definition of quaternions,
- a derivation of the quaternion product.
In later articles I will discuss the creation of the C++ header-only quaternion library, and talk about the main practical use of quaternions: 3D rotations.
Quaternions
Quaternion definition
Quaternions are an extension of the complex numbers which introduces two new imaginary units, called j and k. These are imaginary in the sense that they have the property j² = −1, k² = −1, but they are not equal to each other or to the more familiar imaginary unit i. The usual set of statements that define these numbers is as follows:
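In the standard presentation these statements are Hamilton's relation plus the cyclic products that follow from it. The block below assumes that is the form meant here, and numbers it (1) so that the later references to equation 1 have a target.

```latex
\[
i^2 = j^2 = k^2 = ijk = -1,
\qquad ij = k, \quad jk = i, \quad ki = j
\tag{1}
\]
```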
Quaternion multiplication is, in general, non-commutative, and the products of distinct imaginary units in equation 1 are in fact anti-commutative. For example, ij = k implies ji = −k.
A quaternion, then, is an element of a four-dimensional real vector space whose standard basis is the set {1, i, j, k}. In this document I will usually use the Cartesian notation, which is exactly analogous to the Cartesian form of complex numbers. In this notation, a quaternion is given by
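Assuming real coefficients named q₀, q₁, q₂, q₃ (the naming used in the rest of this article), the Cartesian form reads:

```latex
\[
q = q_0 + q_1 i + q_2 j + q_3 k,
\qquad q_0, q_1, q_2, q_3 \in \mathbb{R}
\]
```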
q₀ may be called the real part or the scalar part of the quaternion, and the other elements taken together may be called the imaginary part or the vector part. A pure quaternion is one where the real part is equal to zero, and the identity quaternion has scalar part 1 and vector part 0.
Defining the quaternion product
Using the Cartesian form for quaternions, let’s define two quaternions, p and q, such that
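With the same component naming as before, a natural reading of these two definitions is:

```latex
\[
\begin{aligned}
p &= p_0 + p_1 i + p_2 j + p_3 k, \\
q &= q_0 + q_1 i + q_2 j + q_3 k.
\end{aligned}
\]
```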
On the assumption that multiplication is distributive over addition, we can expand the product pq to give
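Multiplying every term of p by every term of q gives sixteen terms. The sketch below keeps the imaginary units in the order they appear (p's unit first), since they do not commute:

```latex
\[
\begin{aligned}
pq ={}& p_0 q_0 + p_0 q_1\, i + p_0 q_2\, j + p_0 q_3\, k \\
     & + p_1 q_0\, i + p_1 q_1\, i^2 + p_1 q_2\, ij + p_1 q_3\, ik \\
     & + p_2 q_0\, j + p_2 q_1\, ji + p_2 q_2\, j^2 + p_2 q_3\, jk \\
     & + p_3 q_0\, k + p_3 q_1\, ki + p_3 q_2\, kj + p_3 q_3\, k^2
\end{aligned}
\]
```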
Using our definition statement in equation 1 we can simplify this rather lengthy expression by replacing imaginary squares with −1, and products of different imaginary numbers with the single imaginary number they correspond to according to equation 1. We also group all terms involving i together, and likewise for j and k.
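Carrying out those substitutions (squares become −1, ij = k, jk = i, ki = j, and the reversed products pick up a minus sign) should give the grouped form below. It is a reconstruction, laid out with the scalar terms on the first line:

```latex
\[
\begin{aligned}
pq ={}& p_0 q_0 - p_1 q_1 - p_2 q_2 - p_3 q_3 \\
     & + (p_0 q_1 + p_1 q_0 + p_2 q_3 - p_3 q_2)\, i \\
     & + (p_0 q_2 - p_1 q_3 + p_2 q_0 + p_3 q_1)\, j \\
     & + (p_0 q_3 + p_1 q_2 - p_2 q_1 + p_3 q_0)\, k
\end{aligned}
\]
```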
From here on we can forget about the imaginary nature of i, j, k and treat them simply as unit basis vectors in the four-dimensional quaternion space. The computer scientists among you may like to stop here, as this is the most conveniently computable form of the product. For the mathematicians, it should be easy to convince yourself that the first line, the one with no imaginary unit attached, is just p₀q₀ − p·q,
where, inside the dot product, p and q stand for the vector parts (p₁, p₂, p₃) and (q₁, q₂, q₃) of the quaternions p and q. So
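Substituting the dot product for the scalar terms gives what the text below refers to as equation 7. In this sketch the bold symbols \mathbf{p} and \mathbf{q} are used for the vector parts, to keep them apart from the quaternions p and q:

```latex
\[
\begin{aligned}
pq ={}& p_0 q_0 - \mathbf{p} \cdot \mathbf{q} \\
     & + (p_0 q_1 + p_1 q_0 + p_2 q_3 - p_3 q_2)\, i \\
     & + (p_0 q_2 - p_1 q_3 + p_2 q_0 + p_3 q_1)\, j \\
     & + (p_0 q_3 + p_1 q_2 - p_2 q_1 + p_3 q_0)\, k
\end{aligned}
\tag{7}
\]
```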
It’s not quite as obvious how the remaining parts can be simplified, but those of you who have done enough linear algebra can probably sense a cross product trying to break out. Let’s work out the cross product of two 3-vectors p and q, and see if we can shake it loose.
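For reference, the standard cross product of the two vector parts, written with i, j, k as the unit basis vectors and bold symbols for the 3-vectors, works out as:

```latex
\[
\mathbf{p} \times \mathbf{q}
= \begin{vmatrix}
    i   & j   & k   \\
    p_1 & p_2 & p_3 \\
    q_1 & q_2 & q_3
  \end{vmatrix}
= (p_2 q_3 - p_3 q_2)\, i + (p_3 q_1 - p_1 q_3)\, j + (p_1 q_2 - p_2 q_1)\, k
\]
```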
Now that we have it here for comparison, we can see that the cross product is indeed present in equation 7, but it doesn’t account for all the terms. Let’s write it in and see what we’re left with.
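Pulling the six cross-product terms out of the i, j and k brackets should leave the following (again a reconstruction, with bold symbols for the vector parts):

```latex
\[
\begin{aligned}
pq ={}& p_0 q_0 - \mathbf{p} \cdot \mathbf{q} + \mathbf{p} \times \mathbf{q} \\
     & + (p_0 q_1 + p_1 q_0)\, i + (p_0 q_2 + p_2 q_0)\, j + (p_0 q_3 + p_3 q_0)\, k
\end{aligned}
\]
```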
By collecting together all the terms containing p₀, and all the terms containing q₀, we can see that we have a term p₀q = p₀(q₁i + q₂j + q₃k), and a similar term q₀p. Therefore we can write what remains as a couple of scalar multiples of the vector parts of our quaternions. This is the result:
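In the bold vector-part notation, the product in its scalar and vector form is the well-known expression:

```latex
\[
pq = p_0 q_0 - \mathbf{p} \cdot \mathbf{q}
   + p_0 \mathbf{q} + q_0 \mathbf{p} + \mathbf{p} \times \mathbf{q}
\]
```

The first two terms are the scalar part of the result, and the last three together are its vector part.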
This is about the best we can do, and it is the canonical form of the quaternion product. You’re probably not looking forward to doing any quaternion calculations by hand, and that’s pretty reasonable of you. If you’re like me you’re probably having flashbacks to inverting a 4×4 matrix, or something along those lines. Luckily there aren’t really any contexts outside of education where you’d ever need to do this by hand. The most common use of quaternions is in computing 3D rotations in computer graphics, or in electromechanical settings where you need to track the orientation of something in three-dimensional space. In both of those situations you will certainly be using a computer to do all the calculating.
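To give a flavour of what letting the computer do it looks like, here is a minimal C++ sketch of the product in its component form. The type name Quat, its field names and the helper check_basis_products are hypothetical names chosen for illustration; this is not the interface of the library mentioned at the end of this article.

```cpp
#include <cassert>

// Hypothetical minimal type for illustration only.
// Represents w + x*i + y*j + z*k.
struct Quat {
    double w, x, y, z;
};

// Hamilton product in component form: one line per basis element.
constexpr Quat operator*(const Quat& p, const Quat& q) {
    return {
        p.w * q.w - p.x * q.x - p.y * q.y - p.z * q.z,  // scalar part
        p.w * q.x + p.x * q.w + p.y * q.z - p.z * q.y,  // i component
        p.w * q.y - p.x * q.z + p.y * q.w + p.z * q.x,  // j component
        p.w * q.z + p.x * q.y - p.y * q.x + p.z * q.w   // k component
    };
}

// Exact comparison; fine for the small integer-valued checks below.
constexpr bool operator==(const Quat& a, const Quat& b) {
    return a.w == b.w && a.x == b.x && a.y == b.y && a.z == b.z;
}

// Checks a few of the defining relations: ij = k, ji = -k, i*i = -1.
inline void check_basis_products() {
    constexpr Quat i{0.0, 1.0, 0.0, 0.0};
    constexpr Quat j{0.0, 0.0, 1.0, 0.0};
    constexpr Quat k{0.0, 0.0, 0.0, 1.0};
    assert(i * j == k);
    assert(j * i == (Quat{0.0, 0.0, 0.0, -1.0}));
    assert(i * i == (Quat{-1.0, 0.0, 0.0, 0.0}));
}
```

Calling check_basis_products() from a test or from main exercises the anti-commutativity described earlier: swapping the operands of i * j flips the sign of the result.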
Visual multiplication
If you’d like a more visual way to follow the terms through this multiplication, consider the following table.
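A table of this kind can be sketched as below, assuming the rows are the terms of p, the columns are the terms of q, and each cell is the product of its row and column headers. In this layout the top-left cell is the scalar term p₀q₀, the rest of the first row and first column are the terms of p₀q and q₀p, the remaining diagonal cells sum to −p·q, and the six off-diagonal cells pair up by unit (i, j or k) into the three components of the cross product.

```latex
\[
\begin{array}{c|cccc}
\times & q_0 & q_1 i & q_2 j & q_3 k \\ \hline
p_0   & p_0 q_0      & p_0 q_1\, i  & p_0 q_2\, j  & p_0 q_3\, k  \\
p_1 i & p_1 q_0\, i  & -p_1 q_1     & p_1 q_2\, k  & -p_1 q_3\, j \\
p_2 j & p_2 q_0\, j  & -p_2 q_1\, k & -p_2 q_2     & p_2 q_3\, i  \\
p_3 k & p_3 q_0\, k  & p_3 q_1\, j  & -p_3 q_2\, i & -p_3 q_3
\end{array}
\]
```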
The blue shaded regions are p₀q and q₀p. The red region is −p·q. The three green regions are the three terms of the cross product p × q. The grey region is the remaining scalar term p₀q₀. The sum of all these regions is the full quaternion product. The algebra might have made it seem like some terms came out of nowhere, but hopefully the table helps convince you there’s no magic involved.
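The same bookkeeping can be sketched in C++ by computing each region separately and summing them. The names Quat, Vec3, multiply_by_regions and check_regions_example are hypothetical and exist only for this illustration; the colour names in the comments refer to the regions described above.

```cpp
#include <array>
#include <cassert>

using Vec3 = std::array<double, 3>;

// Hypothetical helper type: scalar part s plus vector part v.
struct Quat {
    double s;
    Vec3 v;
};

inline double dot(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

inline Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]};
}

// Builds the product as the sum of the regions:
// grey p0*q0, red -p.q, blue p0*q + q0*p, green p x q.
inline Quat multiply_by_regions(const Quat& p, const Quat& q) {
    const double grey = p.s * q.s;
    const double red = -dot(p.v, q.v);
    const Vec3 green = cross(p.v, q.v);
    Quat r{grey + red, {}};
    for (int n = 0; n < 3; ++n) {
        const double blue = p.s * q.v[n] + q.s * p.v[n];
        r.v[n] = blue + green[n];
    }
    return r;
}

// Worked example: (1 + 2i + 3j + 4k)(5 + 6i + 7j + 8k).
inline void check_regions_example() {
    const Quat p{1.0, {2.0, 3.0, 4.0}};
    const Quat q{5.0, {6.0, 7.0, 8.0}};
    const Quat r = multiply_by_regions(p, q);
    assert(r.s == -60.0);
    assert(r.v[0] == 12.0 && r.v[1] == 30.0 && r.v[2] == 24.0);
}
```

In the worked example the grey and red regions contribute 5 and −65 to the scalar part, the blue regions contribute (16, 22, 28) and the green regions (−4, 8, −4) to the vector part.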
Quaternions for rotation
If you’re here at all, you’re almost certainly aware that the main practical use of quaternions is for computing 3D rotations without having to fight the limitations of Euler angles. I will be discussing rotations in another article.
C++ quaternion library
As part of my journey to understand quaternions, I wrote a header-only C++ library which you can find here: