Skip to content

Posts tagged ‘math’

24
Jan

The lost art of fixed-point math – Part One

Long time before floating point numbers were so commonly in use as today, fixed point representation was far more popular due to its high performance and accuracy, even on low-powered computers.
Since then, fixed point representation have become something like a “lost art”, at least on PC and full-sized game consoles (although it shouldn’t be nonsense saying some PSX games made use of it): the actual market offers powerful and cheap GPU that lets a programmer compute a 3d dot-product between two vectors in a single instruction, so why bother with fixed point anymore?
The growing popularity of cellphones and hand-held computers brings fixed point math back to the forefront of game and 3d graphics development: specifically, the market pressure to hardware manufacturers to make small and low-cost embedded chips eliminates the possibility of having a dedicated FPU for some time to come, but… hey! The ALU is still there!

Fixed point (fp) numbers are based upon the computer representation of integers and are used to represent a subset of the real numbers: like integers representation which fixed point rely on, fp numbers are finite, so its physically impossible to be able to represent the whole set of real numbers.
Nevertheless, computing range and precision limitation of a fixed point representation is actually really simple and we’ll get to this later on: now let’s review an 8-bit unsigned value binary representation:

27 26 25 24 23 22 21 20
128 64 32 16 8 4 2 1

You should be familiar with this: each bit represents twice the value of the bit at its right, with the LSb representing 1.
This representation is the one of integral values, where no fractional values are involved at all, so, how an integer can be used to represent both integral and fractional quantities?
As long as decimal numbers can have a decimal point representing the break between integral and fractional values, binary values can have a binary point as well, or generically, a radix point.
In fact, looking to the aforementioned bit-table for an 8-bit unsigned value, you are indeed looking to a fixed point representation where the radix point is to the right of the rightmost bit!
If you still can’t get it and you feel confused, think about that invisible radix point, now we are going to visualize it:

27 26 25 24 23 22 21 20 .
128 64 32 16 8 4 2 1

Simply put, if you used integer math, you always used fixed point math, but without knowing it: so, you got the point. ;)
Who said the radix point does have to be placed there? In fact, you are free to decide where to move it and this is what we are going to do right now: we’ll move the radix point to the middle of the integer representation, giving this way some bit-space for representing fractional values as well at the expense of loosing bit-space for the integral values representation.
Fixing the radix point to an n-bit position as we’ll do, clearly gives sense to the so-called “fixed point arithmetic”.
So, let’s define our first fixed point format to be the one resulting from moving the radix point to the left just to be able to end up with 4 bits on its left and 4 bits on its right:

23 22 21 20 . 2-1 2-2 2-3 2-4
8 4 2 1 1/2 1/4 1/8 1/16

The standard nomenclature for a fixed point format is M-dot-N, thus we are dealing with an 4-dot-4 fixed point format; “M” stands for the number of integral bits while “N” stands for the number of fractional bits.
Still every bit represents twice the value of the bit at its right, but now we are representing 1/16 (0.0625) with the rightmost bit: this is the scaling factor for the 4-dot-4 format and represents its precision granularity; so a simple 32-bit integer can be thought as a 32-dot-0 fixed point format, where its precision granularity is 1.

In the next part i’ll address how to convert Integral and Real types to a fixed point representation, how to compute the four basic operations between two fixed point numbers and how to overcome overflow and underflow situations with precision reduction and how to compute numerical limits for a given fixed point format.