Memory Access
- Row algorithm has computation/memory ratio O(n/p).
- Block algorithm has computation/memory ratio O(n/sqrt(p)).
- Block algorithm has higher data locality.
- Cache performance of algorithm improves.
- Large input matrices.
- Row algorithm: subsequent accesses to B cannot be cached
-> O(n^3/p) memory operations.
- Block algorithm: subsequent accesses to B can be cached
-> O(n^3/(pc)) memory operations.
- Important especially for distributed shared memory
architectures.
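The two computation/memory ratios above can be checked with a little arithmetic. The sketch below is only an illustration under stated assumptions: it counts one multiply-add per (i, j, k) triple, assumes each processor touches every matrix element it needs exactly once, and assumes p is a perfect square for the block partition. The function names row_ratio and block_ratio are made up for this example.

```python
import math

def row_ratio(n, p):
    """Multiply-adds per matrix element touched, row-wise partition (assumed model)."""
    ops = n**3 / p                          # n/p rows of C, n entries each, n multiply-adds per entry
    data = n * n / p + n * n + n * n / p    # own rows of A, all of B, own rows of C
    return ops / data                       # equals n / (p + 2), i.e. O(n/p)

def block_ratio(n, p):
    """Multiply-adds per matrix element touched, block partition (assumed model)."""
    q = math.isqrt(p)
    if q * q != p:
        raise ValueError("p must be a perfect square")
    ops = n**3 / p                          # one (n/q) x (n/q) block of C, n multiply-adds per entry
    data = 2 * n * (n / q) + n * n / p      # n/q rows of A, n/q columns of B, own block of C
    return ops / data                       # equals n / (2*q + 1), i.e. O(n/sqrt(p))
```

Under these assumptions, for n = 1024 and p = 16 the row partition gives 1024/18 and the block partition gives 1024/9, so each element fetched by the block algorithm is reused about twice as often.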
Reduce average memory latency time by increasing locality.
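As an illustration of increased locality, here is a minimal sequential sketch of tiled matrix multiplication. It is not the parallel block algorithm from these notes, only the same idea applied to one processor's loop nest: a small tile of B is reused for several rows of A before the computation moves on. The function name matmul_tiled and the parameter tile are hypothetical; in practice the tile size would be chosen so that the working tiles fit in cache.

```python
def matmul_tiled(A, B, tile=2):
    """Multiply square matrices A and B (lists of lists), one tile at a time."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for ii in range(0, n, tile):                # tile row of C
        for jj in range(0, n, tile):            # tile column of C
            for kk in range(0, n, tile):        # tile along the summation index
                # The tile B[kk:kk+tile][jj:jj+tile] is reused for every row i below.
                for i in range(ii, min(ii + tile, n)):
                    for k in range(kk, min(kk + tile, n)):
                        a = A[i][k]
                        for j in range(jj, min(jj + tile, n)):
                            C[i][j] += a * B[k][j]
    return C
```

The result is the same as for the plain triple loop; only the order in which elements of B are visited changes.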
Author: Wolfgang Schreiner
Last Modification: October 27, 1997