We present a method for parallel block-sparse matrix-matrix multiplication.
A distributed quadtree matrix representation allows exploitation of data locality.
The quadtree structure is implemented using the Chunks and Tasks programming model.
Data locality is exploited without prior information about matrix sparsity pattern.
Constant communication per node on average is achieved in weak scaling tests.