It was in the late ’80s when I started to become interested in fractals. At the time then computers very unbelievable slow compared to what we have today thus I tried to speed up the algorithms in several ways.

This article is about using integer operations to calculate fractal images, more specifically I’ll explain how to replace floating point operations by integer operations (It is not about the theory behind fractals. There’s a lot literature about that.).

### History

I used an Amiga 500 computer which was an incredible system in the late ’80s. The system design was far ahead of its time and also the performance was pretty high. Intel PCs at the time then looked like an old rusty bicycle compared to the Amiga computer. The Amiga’s core was a Motorola 68000 CPU which internally was a full 32 bit microprocessor using the CISC design. Thus, it was a pleasure to write assembler code compared to other CPUs. But still, most systems had no floating point unit as it usual for modern computers of today. Thus floating point operations had to be implemented in software.

### Background

CPUs internally have a set of registers (which could be considered as local variables) of a fixed bit width. In case of the 68000 these are 32 bit wide registers.^{1} Almost every CPU can execute the four basic arithmetical integer operations add, subtract, multiply, and divide^{2} where the arguments to these ops are the registers. Additionally there are logical operations such as bit shifting.

Not all operations are executed at the same speed. Add/subtract and bit shifting typically is faster than multiplications and divisions.^{3} All other mathematical operations and functions have to be constructed using a sequence of these basic arithmetic instructions and thus are obviously slower. Such functions can be e.g. operations on integers which are longer than the native bit size of the registers, floating point arithmetics, or even more complex algorithms such as trigonometric functions or similar.

The goal is to avoid expensive operations particularly in the inner loops of algorithms.

### Fixed Point Arithmetic

Replacing floating point operations by integer operations technically is a fixed point arithmetic. That is that we move the decimal point to another fixed position during the algorithm and finally move it back to get the result. This sounds weird but is pretty easy explained with an example. Consider the two random decimal numbers 3.14 and 0.69. If we add or muliply these two the results are 3.83 = 3.14 + 0.69, and 2.16 = 3.14 * 0.69. Unfortunately our CPU cannot deal with decimal numbers thus we loose the fractional part and in turn we will loose a lot of precision: 3 = 3 + 0, and 0 = 3 * 0.

To solve this we move the point 2 digits to the right. Mathematically this is multiplying by 100 in this example. Thus 314 = 3.14 * 100 and 69 = 0.69 * 100. Now we can add them since they are integer numbers: 383 = 314 + 69. To get the result just divide by 100 again which is 3.83 = 383 / 100.

This works with any multiplier *f*. Every variable is considered to be multiplied by *f*, e.g. *a → a · f*. We then have to make sure that the equations stay in balance if the operands are multiplied by *f*.

If *a + b = c* then also *a · f + b · f = c · f*. At multiplications (*a · b = c*) the result has to be divided by f again, because *a · f · b · f = c · f ^{2}*, thus

*c ·f = (a · f · b · f) / f*.

As already mentioned, multiplications and divisions are expensive, thus there is another optimization. If the factor *f* is chosen to be a power of 2, multiplications and divisions (of integer numbers) can be replaced by bit shift operations. Shifting a number to the left by one bit is equal to multiplying with 2.

### Application

The following code shows the inner loop which is executed for each pixel of a fractal image.

int iterate(double real0, double imag0) { double realq, imagq, real, imag; int i; real = real0; imag = imag0; for (i = 0; i < MAXITERATE; i++) { realq = real * real; imagq = imag * imag; if ((realq + imagq) > 4) break; imag = real * imag * 2 + imag0; real = realq - imagq + real0; } return i; }

There are several operations in it. To rewrite this to integer operations, additions can be directly done as discussed before. The result of the multiplications has to be shifted to the right by the number of bits which expresses the multiplication factor. The functions is called with the parameters `real0`

and `imag0`

already multiplied by *f*.

int iterate(nint_t real0, nint_t imag0) { nint_t realq, imagq, real, imag; int i; real = real0; imag = imag0; for (i = 0; i < MAXITERATE; i++) { realq = (real * real) >> NORM_BITS; imagq = (imag * imag) >> NORM_BITS; if ((realq + imagq) > (nint_t) 4 * NORM_FACT) break; imag = ((real * imag) >> (NORM_BITS - 1)) + imag0; real = realq - imagq + real0; } return i; }

In this code snippet the cpp macro NORM_FACT represents the factor f and NORM_BITS the number of bits. Thus NORM_FACT = 2^{NORM_BITS}. Line number 16 shows a bit shift to the right by just NORM_BITS – 1. This is because the result is multiplied by two again which is a bit shift to the left by one.

The full source code can be downloaded from cypherpunk.at/download/intfract.

### Limitations

Although this probably sounds great, there is a limitation and this is the range of numbers. Integer numbers have a fixed width of bits and thus a maximum number of different values.. If we use a multiplication factor as show above, we can do arithmetics on fractional numbers and thus also on smaller numbers between 0 and 1 but the maximum number of different values expressible by an integer does still not change. A 32 bit integer has a range from -2147483648 to 2147483647. If we choose the factor *f* to be 2^{13} = 8192, then the tiniest fractional number is 1 / 8192 which is approx. 0.000122.

- The Intel i286 had 16 bit registers, the i386 up to the Pentium generation had 32 bit registers, and modern Intel cores have 64 bit registers. In the embedded area a lot of 8 bit CPU are in uses such as e.g. the AVR cores, or the 8051-based cores. ↩
- E.g. the AVR core has no divide instruction. ↩
- On the M68000 a multiplication used up to 70 CPU cycles in contrast to an addition which took just 4 to 8 cycles. ↩