GraphBLAS Passes and graphblas-opt¶
The mlir-graphblas package includes an drop-in replacement for mlir-opt
called graphblas-opt
which includes the custom graphblas
dialect along
with all the other dialects that ship with MLIR.
This document is not intended to be a complete tutorial on graphblas-opt
and acts more
as a reference manual for the passes exclusively available for graphblas-opt
. For tutorials
on these passes specific to graphblas-opt
, see our GraphBLAS Dialect Tutorials.
There are five passes specific to graphblas-opt
with two intended usages:
Standard Lowering
--graphblas-structuralize
: Lowers higher-levelgraphblas
ops into lower-level genericgraphblas
ops--graphblas-dwim
: Addsconvert-layout
calls to inputs to match the expected shape and type of genericgraphblas
ops--graphblas-optimize
: Fuses lower-level genericgraphblas
ops together to eliminate temporary tensors and redundant loops.--graphblas-lower
: Convertgraphblas
ops to thescf
,std
, andmemref
dialects.
Linalg Lowering
--graphblas-structuralize
: Lowers higher-levelgraphblas
ops into lower-level genericgraphblas
ops--graphblas-optimize
: Fuses lower-level genericgraphblas
ops together to eliminate temporary tensors and redundant loops.--graphblas-linalg-lower
: Convertgraphblas
ops to thelinalg
dialects, specifically tolinalg.generic
calls. These will be further lowered by thesparse_tensor
dialect.
--graphblas-structuralize
Pass¶
The structuralization pass performs eight generic transformations:
Transform
graphblas.matrix_multiply
ops into equivalentgraphblas.matrix_multiply_generic
ops.Transform
graphblas.apply
ops into equivalentgraphblas.apply_generic
ops.Transform
graphblas.select
ops into equivalentgraphblas.select_generic
ops.Transform
graphblas.union
ops into equivalentgraphblas.union_generic
ops.Transform
graphblas.intersect
ops into equivalentgraphblas.intersect_generic
ops.Transform
graphblas.update
ops into equivalentgraphblas.update_generic
ops.Transform
graphblas.reduce_to_vector
ops into equivalentgraphblas.reduce_to_vector_generic
ops.Transform
graphblas.reduce_to_scalar
ops into equivalentgraphblas.reduce_to_scalar_generic
ops.
--graphblas-dwim
Pass¶
The DWIM (Do-What-I-Mean) pass automatically converts the layout of input matrices to conform to the requirements each pass. DWIM transformations are included for:
graphblas.transpose
graphblas.reduce_to_vector
graphblas.matrix_multiply
graphblas.matrix_multiply_reduce_to_scalar
--graphblas-optimize
Pass¶
The optimization pass performs two transformations:
Combine
graphblas.matrix_multiply
followed bygraphblas.apply
into a singlegraphblas.matrix_multiply
with an additional scalar transformation of the output elements as they are written to memory.Combine
graphblas.matrix_multiply
followed bygraphblas.matrix_reduce_to_scalar
into a newgraphblas.matrix_multiply_reduce_to_scalar
op that does the sparse multiply and accumulate in a single pass (rather than two passes forgraphblas.matrix_multiply
).
--graphblas-lower
Pass¶
The lowering pass is intended to allow the graphblas
ops to be executed on
a CPU target. Although the lowering assumes the same C++ structure as is used in
the existing sparse tensor related passes in MLIR, --graphblas-lower
does
not use the linalg
dialect, as there is not yet support for sparse output.
The graphblas
lowering pass requires a small runtime based on the MLIR
sparse runtime, but with some additional functions that allow for the
allocation, resizing, and deallocation of the sparse tensor data structure.
In general, this pass uses scf
ops to handle traversing the sparse data
structure, with scf.parallel used when possible to indicate loops where the
loop body iterations are independent of each other, or are performing a
well-defined reduction.
--graphblas-linalg-lower
Pass¶
The linalg lowering will eventually be a full replacement for the standard lowering pass,
once improvements are made to the sparse_tensor
handling of linalg.generic
.
One benefit of linalg lowering is a much smaller and simpler lowering output to maintain. Another is the ability to handle more types of inputs, eliminating the need for the DWIM pass.