RT++ --- Tutorial and Reference Manual -- Long Integer Multiplication

$\"\"$ $\"\"$ $\"\"$
Go backward to Quicksort
Go up to Tutorial
Go forward to Long Integer Multiplication (Bag Version)

Long Integer Multiplication

The following piece of code depicts the skeleton of an RT++ program for the multiplication of two arbitrary precision integers (represented as arrays of digits). The full program can be found here.

#include <rt++.h>

typedef char Digit;
typedef Array<Digit> Number;
...

// ----------------------------------------------------------------------------
// sum of two thread results
// ----------------------------------------------------------------------------
Number sum(Thread<Number> ta, Thread<Number> tb)
{
  Number a = Wait(ta); Number b = Wait(tb);
  ...
  return(s);
}

// ----------------------------------------------------------------------------
// product of number and digit
// ----------------------------------------------------------------------------
Number prod(Number a, Digit b)
{
  ...
  return(p);
}

// ----------------------------------------------------------------------------
// multiplication of two numbers
// ----------------------------------------------------------------------------
Number mult(Number a, Number b)
{
  int lb = Length(b);
  List< Thread<Number> > l;                  // l = Thread<Number>::Nil();
  for (int i = 0; i < lb; i++)
    {
      Thread2<Number, Number, Digit> t(prod, a, b[i]);
      l.cons(t, l);                          // l = Cons(t, l);
    }
  int n = 2*lb-1;
  for (int j = 0; j < n-2 ; j += 2)
    {
      Thread<Number> ta(Head(l)); l.tail(); // l = Tail(l);
      Thread<Number> tb(Head(l)); l.tail(); // l = Tail(l);
      Thread2<Number, Thread<Number>, Thread<Number> > t(sum, ta, tb);
      l.cons(t, l);                         // l = Cons(t, l);
    }
  return(Wait(Head(l)));
}

int multmain(int argc, char *argv[]) 
{
  ...
  Number a(la); Number b(lb);
  ...
  Number c = mult(a, b);
  ...
}

The core of this program is the function mult which implements a parallel version of the traditional school algorithm where a number is an object of type Array<Digit>:

In the first phase, the algorithm creates a thread for each multiplication of number $a$ with some digit $b$ _i. All threads are collected in a list $l$ .
In the second phase, the algorithm creates a thread for the addition of each pair of thread results by removing the corresponding threads from $l$ and inserting the new thread into $l$ .

Eventually there is only thread left in

l

whose result denotes the overall computation result.

There are several issues remarkable about above implementation:

The program applies the RT++ type constructor List to construct element sequences of arbitrary length. New elements can be inserted at the head of a list and a list can be decomposed into its head element and the tail list.
The thread list $l$ has type List< Thread<Number> > i.e. each element of a list holds a thread returning a number. The type Thread<...> does (in contrast to the types Threadn<...>) not specify which argument types these threads have; correspondingly it can hold any kind of thread with the denoted result type (indeed both program phases insert threads of different argument types into $l$ ).
The program does not wait for the results of the first phase threads before creating the second phase threads. It simply extracts two appropriate threads from the list and creates a new thread that take the other threads as arguments. Threads are thus objects like any other kind of runtime data, in particular they can be stored in data structures (as shown above), passed as thread arguments and also returned as thread results.
The program has a design flaw by inserting the second phase threads at the begin of the thread list $l$ (i.e. using $l$ as a stack) and thus generating an unbalanced "thread tree" of depth $n-1$ (where $n$ is the number of threads). It would be better to insert the second phase threads at the end of $l$ (i.e. to use $l$ as a queue) and thus generate a fully balanced tree of depth $logn$ . This can be efficiently done by making use of RT++ destructive list operations.

The program makes heavy use of the RT++ type constructors and must provide the following top-level declarations for the garbage collector:

Atom(Digit);
Ref(Number);
Ref(Thread<Number>);

ThreadRes(Number);
ThreadArg2(A, Number, Number, Digit);
ThreadArg2(B, Number, Thread<Number>, Thread<Number>);

The first three declarations are necessary because digits, numbers, and threads are used as thread arguments and results and stored in lists and arrays, respectively. The remaining three declarations refer to the kinds of threads used in the program. The ThreadRes<...> declaration is required for every kind of Thread<...> type. If forgotten, the linker complains with error messages as the following

mult.o(.text+0xef8): undefined reference to 
  `Type<ResultThread<Number> >::init'
mult.o(.text+0xf0b): undefined reference to 
  `Type<ResultThread<Number> >::scan'

Author: Wolfgang Schreiner
Last Modification: April 12, 1997