Processor-efficient sparse matrix-vector multiplication

L.S Heath; C.J Ribbens; S.V Pemmaraju

doi:10.1016/j.camwa.2003.06.009

Back

Processor-efficient sparse matrix-vector multiplication

Journal article

Open access

Peer reviewed

Processor-efficient sparse matrix-vector multiplication

L.S Heath, C.J Ribbens and S.V Pemmaraju

Computers & mathematics with applications (1987), Vol.48(3), pp.589-608

2004

DOI: 10.1016/j.camwa.2003.06.009

Files and links (1)

url

https://doi.org/10.1016/j.camwa.2003.06.009View

Published (Version of record) Open Access

Abstract

The matrix-vector multiplication operation is the kernel of most numerical algorithms.Typically, matrix-vector multiplication algorithms exploit the sparsity of a matrix, either to reduce the time taken or the memory used. In the case of parallel implementations of numerical algorithms, the sparsity of a matrix can also be used to reduce the number of processing elements (PEs) required. For example, in Leiserson's systolic array algorithm for matrix-vector multiplication the nonzeros in the matrix are partitioned into bands and each band is fed into a different PE. Similarly, in Melhem's algorithm, the nonzeros are partitioned into stripes and each stripe is fed into a distinct PE. However, in many practical applications, such as the numerical solution of partial differential equations, a matrix is much sparser than is indicated by its bandwidth or by its stripe width. A new measure of sparsity of a matrix called staircase width is introduced. A staircase in a matrix M = (m i,j)n×n is a set of matrix positions S = {(i, j) | m i,j ≠ 0, such that no position in S is strictly to the northeast of another position in S. The staircase cover of a matrix is a partition of the positions of nonzero entries in the matrix into staircases and the staircase width of a matrix is the size of a smallest staircase cover of the matrix. The staircase width of a matrix is never larger than the bandwidth or the stripe width of a matrix. Moreover, it is significantly smaller than either for some commonly occurring types of matrices such as arrowhead matrices. Experiments with sparse matrices derived from a variety of engineering problems suggest that, in practice, the staircase width of a matrix is about half the stripe width of the matrix. An efficient algorithm to compute a minimum width staircase cover of a matrix is presented. Staircase covers are used as the basis of a new parallel algorithm to compute a matrix-vector product on a linear systolic array of PEs. This algorithm uses as many PEs as the staircase width of the matrix, and therefore, saves on the number of PEs relative to algorithms of Leiserson and Melhem. It is shown that despite using fewer PEs, the algorithm uses no more time steps than Leiserson's or Melhem's algorithm.

Bandwidth

Matrix-vector multiplication

Partial orders

Sparse matrices

Staircase width

Stripe width

Systolic arrays

Details

Title: Subtitle: Processor-efficient sparse matrix-vector multiplication
Creators: L.S Heath - University of Iowa
C.J Ribbens - University of Iowa
S.V Pemmaraju - University of Iowa
Resource Type: Journal article
Publication Details: Computers & mathematics with applications (1987), Vol.48(3), pp.589-608
DOI: 10.1016/j.camwa.2003.06.009
ISSN: 0898-1221
eISSN: 1873-7668
Publisher: Elsevier Ltd
Language: English
Date published: 2004
Academic Unit: Computer Science
Record Identifier: 9984259437302771

Metrics

15 Record Views

5 Times Cited - Web of Science