janmr blog

Implementing Multiple-Precision Arithmetic, Part 1


This article is the first in a series dealing with algorithms for multiple-precision arithmetic. The goal is to present both a theoretical foundation with high-level algorithm descriptions (based on Section 4.3.1, The Classical Algorithms, of The Art of Computer Programming, Volume 2, by Donald E. Knuth) and a portable C++ implementation of the algorithms. The theory and high-level algorithms will be quite universal and generic, whereas the presented code will be just one way to implement the algorithms in a specific programming language.

We start out by considering only non-negative integers. A number will be represented in radix using the notation

We will call an -digit number and , , etc., its digits. Unless stated otherwise we will always have that the most-significant digit is non-zero, here , and we will represent zero with no digits, . We have , implying for .

Let the word size of the data type T used for each digit be . For instance, if T is a 32 bit unsigned integer, we have . We will implement the algorithms using and exploit the fact that C++ does arithmetic on unsigned integers modulo (see paragraph 3.9.1 (4) of the C++ standard). This makes it possible to implement portable algorithms. They will not be optimal with respect to speed, however, and it will be noted when specialized operations, such as add-with-carry instructions, would lead to more efficient implementations.

Data Structures

A non-negative number will be represented in C++ as an instance of the class NonNegativeInteger:

template <typename T, typename V=detail::SimpleDigitVector<T> >
class NonNegativeInteger {
  boost::shared_ptr<V> digitvec;
  // ...
  NonNegativeInteger(T value);
  // ...

The type argument T is used to represent each digit. It must be integer and unsigned, so unsigned char, unsigned short, unsigned int, unsigned long, and unsigned long long can all be used (the type long long int is not standard C++, but is, e.g., supported by GCC). If a digit type with 8, 16, 32, or 64 bits is needed, the boost integer types uint8_t, uint16_t, uint32_t, uint64_t (from the namespace boost) can be used with portability ensured (uint64_t is not always available, but the macro BOOST_NO_INT64_T will tell you if it is not).

The type argument V is used as the container type for the digits. The default container is SimpleDigitVector (which at this time is also the only container supported). This default container simply wraps an array of size (at least) the number of digits.

Note that the digit container digitvec of NonNegativeInteger is wrapped in a boost shared pointer. Consider the small code excerpt:

NonNegativeInteger<unsigned> a=value1(), b=value2(), c;
c = a;
b += a;
c += b;

Because of the shared pointer, the first assignment, c = a, is very cheap and will not result in copying the digit container. Instead, the digitvec of both a and c will refer to the same instance of the digit container (which gets a reference count of 2). This also makes it cheap to pass arguments by value and returning numbers by value. The next statement, b += a, can add a to b in place (assuming there is enough space in b's digitvec to contain the result) because it is possible to check if b is the only instance referring to its digit container. Contrarywise, the last statement, c += a, cannot add a to c in place because a refers to the same container as c does.


Adding radix -numbers is quite straightforward and is easily done using the familiar pencil-and-paper method.

To formalize, we first consider adding two -digit numbers, , and , obtaining an -digit sum . Note that we here may end up with (in which case ). We have

Note that for , where is equal to if is true and equal to if is false. This means and furthermore . Using that and it is quite easy to show that, in fact, .

The cases where or is zero makes addition trivial. Similarly, if the number of digits in and are different, is it quite easy to adjust the algorithm above.

Let us now look at some implementation details. How do we compute and for ? We have several options, some of which are

  1. Use . Then and can be computed directly.
  2. Use and the CPU's add and add-with-carry instructions.
  3. Use , but the computations must be done in some portable C++ way.

Option 1 is actually not an option because we insist on using . Option 2 leads to the most efficient code, regarding both space and speed. The problem is that these special instructions are not directly accessible via the C++ standard. Some compilers, though, make it possible to use inline assembly. For instance, GCC has such capabilities.

Option 3 is the way to go. As mentioned earlier, C++ does calculations modulo , so comes ‘for free’ as simply z = x + y in C++. Left is how to detect whether a carry occurs during an addition. One way to do that is the following. Consider for which there are two possibilities. Either () which implies and , or we have () which implies , leading to and . So . Another way to detect whether a carry occurs is to split and into a low and high part, and then adding the low and high parts seperately—keeping track of a possible intermediate carry, of course.


An algorithm for multiple-precision subtraction is similar to addition, as just considered. Again we set and with . To ensure that the result is a non-negative integer we furthermore require that . We now have the algorithm:

Note that any digit of the result may end up being zero—we only know that . Note furthermore that . Verification of the algorithm is easily done using the fact that for .

The options when implementing the algorithm is, again, much like for addition. Using the CPU's subtract-with-borrow instruction would be ideal here, but it cannot be done portably in C++. The borrow when calculating can be computed as simply , or, to avoid branching instructions, we can split and into two parts and do the subtraction for each part seperately.


We seek an algorithm to compute where

We first, however, consider the simpler operation where

The following algorithm suggests itself:

If then obviously . If then may be zero, in which case , since

Note that since . The algorithm above is easily verified, using that .

We now turn to and get

From this we see that we can compute by doing a number of -type operations, if we start with and then do in-place updates for each . The -factor simply determines the ‘digit offset’ on which the updates should be done:

for . Note that may be zero, in which case , since .

Now for some implementation details. We note that the only non-trivial computation is , where , followed by computing and . Most CPUs have instructions that can multiply two word-size numbers and produce a double-word answer. As was the case for addition and subtraction, we don't have access to these instructions from standard C++. We have but every multiplication result in our portable C++ implementation must be smaller than . We can do this by using a new base number where (we assume that is chosen appropriately so is integer) and setting

We can now use the multiplication algorithm above on a ‘smaller scale’ to compute the product . We can furthermore expand the methods slightly and incorporate the addition of and in an elegant way:

We now have and .

Concluding Remarks

We have now covered addition, subtraction, and multiplication of non-negative integers of arbitrary magnitude. Left is how to do division, which will be the subject of the next article.

Update 2010-07-03: The code has undergone some changes since this article was written. See the project page for more information.