Vector Loop

Vectorized assembly code:

V1 <- A
V2 <- B
V3 <- V1+V2
A  <- V3
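
To make the effect of the four instructions concrete, here is a minimal Python sketch that emulates them on ordinary lists. It assumes that the loop being vectorized is the element-wise update A[i] = A[i] + B[i] and that the final store writes the sum V3 back to A. The function name `vector_add_in_place` is hypothetical and only serves as illustration.

```python
def vector_add_in_place(a, b):
    """Emulate the four vector instructions on Python lists (illustrative only)."""
    if len(a) != len(b):
        raise ValueError("both vectors must have the same length")
    v1 = list(a)                          # V1 <- A      (vector load)
    v2 = list(b)                          # V2 <- B      (vector load)
    v3 = [x + y for x, y in zip(v1, v2)]  # V3 <- V1+V2  (vector add)
    a[:] = v3                             # A  <- V3     (vector store, assumed target)
    return a


if __name__ == "__main__":
    a = [1, 2, 3]
    vector_add_in_place(a, [10, 20, 30])
    print(a)
```

Each statement handles the whole vector at once, so no per-element branch or index update appears in the code.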

This takes 4n clock cycles for vectors of n elements, because there is no loop iteration overhead (ignoring the speedup from pipelining).
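
The cycle count can be sketched as a small cost model. It assumes that each of the four vector instructions processes one element per clock cycle and that the instructions do not overlap. The scalar comparison is hypothetical: it assumes four one-cycle operations per element plus a per-iteration overhead whose value is passed in as a parameter and is not taken from this text.

```python
def vector_loop_cycles(n, instructions=4):
    """Cycles for the vector code: each instruction takes n cycles, no overlap."""
    return instructions * n


def scalar_loop_cycles(n, overhead_per_iteration, body_cycles=4):
    """Hypothetical scalar model: body plus loop overhead, repeated n times."""
    return (body_cycles + overhead_per_iteration) * n


if __name__ == "__main__":
    n = 64
    print(vector_loop_cycles(n))
    print(scalar_loop_cycles(n, overhead_per_iteration=2))
```

Under these assumptions the saving is exactly the loop overhead, n times the per-iteration overhead.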

Author: Wolfgang Schreiner
Last Modification: October 14, 1997