This page was generated from dialect/graphblas_dialect_tutorials/graphblas_lower/matrix_ops.ipynb.
Matrix Ops¶
This example will go over how to use the --graphblas-lower pass from graphblas-opt to lower several ops from the GraphBLAS dialect that deal with CSR or CSC matrices. This tutorial does not exhaustively cover all such ops. It merely intends to provide some examples to learn from.
Let’s first import some necessary modules and generate an instance of our JIT engine.
import mlir_graphblas
import numpy as np
from mlir_graphblas.tools.utils import sparsify_array
engine = mlir_graphblas.MlirJitEngine()
Using development graphblas-opt: /Users/pnguyen/code/mlir-graphblas/mlir_graphblas/src/build/bin/graphblas-opt
Here are the passes we’ll use.
passes = [
"--graphblas-structuralize",
"--graphblas-dwim",
"--graphblas-optimize",
"--graphblas-lower",
"--sparsification",
"--sparse-tensor-conversion",
"--linalg-bufferize",
"--func-bufferize",
"--tensor-bufferize",
"--finalizing-bufferize",
"--convert-linalg-to-loops",
"--convert-vector-to-llvm",
"--convert-math-to-llvm",
"--convert-math-to-libm",
"--convert-scf-to-cf",
"--convert-memref-to-llvm",
"--convert-openmp-to-llvm",
"--convert-arith-to-llvm",
"--convert-std-to-llvm",
"--reconcile-unrealized-casts",
]
Overview of graphblas.matrix_multiply¶
Here, we’ll show how to use the graphblas.matrix_multiply op.
graphblas.matrix_multiply performs a matrix multiply according to the given semiring and optional structural mask. See the ops reference for further details regarding the intended behavior.
We’ll show some examples here of how to use graphblas.matrix_multiply.
Let’s first create some example input tensors.
A_dense = np.array(
[
[1, 0, 0, 0],
[2, 0, 3, 4],
[0, 0, 0, 0],
[0, 0, 5, 6],
],
dtype=np.float64
)
A = sparsify_array(A_dense, [False, True])
B_dense = np.array(
[
[0, 7, 0, 7],
[0, 1, 0, 0],
[0, 1, 7, 0],
[0, 7, 2, 0],
],
dtype=np.float64
)
B = sparsify_array(B_dense, [False, True])
Here is some MLIR code to perform a conventional matrix-multiply by using graphblas.matrix_multiply with the plus-times semiring.
Note that both of the matrices above are CSR matrices. Thus, we’ll need to convert the layout of the second operand in our code below.
mlir_text = """
#CSR64 = #sparse_tensor.encoding<{
dimLevelType = [ "dense", "compressed" ],
dimOrdering = affine_map<(i,j) -> (i,j)>,
pointerBitWidth = 64,
indexBitWidth = 64
}>
#CSC64 = #sparse_tensor.encoding<{
dimLevelType = [ "dense", "compressed" ],
dimOrdering = affine_map<(i,j) -> (j,i)>,
pointerBitWidth = 64,
indexBitWidth = 64
}>
module {
func @matrix_multiply_plus_times(%a: tensor<?x?xf64, #CSR64>, %b: tensor<?x?xf64, #CSR64>) -> tensor<?x?xf64, #CSR64> {
%answer = graphblas.matrix_multiply %a, %b { semiring = "plus_times" } : (tensor<?x?xf64, #CSR64>, tensor<?x?xf64, #CSR64>) to tensor<?x?xf64, #CSR64>
return %answer : tensor<?x?xf64, #CSR64>
}
}
"""
Let’s run it and verify that it gets the same results as NumPy.
engine.add(mlir_text, passes)
['matrix_multiply_plus_times']
sparse_matmul_result = engine.matrix_multiply_plus_times(A, B)
sparse_matmul_result.toarray()
array([[ 0., 7., 0., 7.],
[ 0., 45., 29., 14.],
[ 0., 0., 0., 0.],
[ 0., 47., 47., 0.]])
np.all(A_dense @ B_dense == sparse_matmul_result.toarray())
True
We’ll now show how to perform a matrix-multiply with the plus-plus semiring with and without a structural mask to show how the behavior differs.
mask = sparsify_array(
np.array(
[
[0, 0, 9, 8],
[0, 0, 7, 6],
[0, 0, 5, 4],
[0, 0, 3, 2],
],
dtype=np.float64
),
[False, True]
)
mlir_text = """
#CSR64 = #sparse_tensor.encoding<{
dimLevelType = [ "dense", "compressed" ],
dimOrdering = affine_map<(i,j) -> (i,j)>,
pointerBitWidth = 64,
indexBitWidth = 64
}>
#CSC64 = #sparse_tensor.encoding<{
dimLevelType = [ "dense", "compressed" ],
dimOrdering = affine_map<(i,j) -> (j,i)>,
pointerBitWidth = 64,
indexBitWidth = 64
}>
module {
func @matrix_multiply_plus_plus_no_mask(%a: tensor<?x?xf64, #CSR64>, %b: tensor<?x?xf64, #CSR64>) -> tensor<?x?xf64, #CSR64> {
%answer = graphblas.matrix_multiply %a, %b { semiring = "plus_plus" } : (tensor<?x?xf64, #CSR64>, tensor<?x?xf64, #CSR64>) to tensor<?x?xf64, #CSR64>
return %answer : tensor<?x?xf64, #CSR64>
}
func @matrix_multiply_plus_plus(%a: tensor<?x?xf64, #CSR64>, %b: tensor<?x?xf64, #CSR64>, %m: tensor<?x?xf64, #CSR64>) -> tensor<?x?xf64, #CSR64> {
%answer = graphblas.matrix_multiply %a, %b, %m { semiring = "plus_plus" } : (tensor<?x?xf64, #CSR64>, tensor<?x?xf64, #CSR64>, tensor<?x?xf64, #CSR64>) to tensor<?x?xf64, #CSR64>
return %answer : tensor<?x?xf64, #CSR64>
}
}
"""
engine.add(mlir_text, passes)
['matrix_multiply_plus_plus_no_mask', 'matrix_multiply_plus_plus']
no_mask_result = engine.matrix_multiply_plus_plus_no_mask(A, B)
with_mask_result = engine.matrix_multiply_plus_plus(A, B, mask)
no_mask_result.toarray()
array([[ 0., 8., 0., 8.],
[ 0., 24., 16., 9.],
[ 0., 0., 0., 0.],
[ 0., 19., 20., 0.]])
with_mask_result.toarray()
array([[ 0., 0., 0., 8.],
[ 0., 0., 16., 9.],
[ 0., 0., 0., 0.],
[ 0., 0., 20., 0.]])
Note how the results in the masked output only have elements present in the positions where the mask had elements present.
Since we can’t verify the results via NumPy given that it doesn’t support semirings in its matrix multiply implementation, we’ll leave the task of verifying the results as an exercise for the reader. Note that the “multiplication” portion of the semiring applies intersection behavior on the elements, meaning if a position in the dot product has not overlap, that position will contribute no value to be aggregated.
Next, we’ll show how the mask_complement attribute for graphblas.matrix_multiply works. It simply makes the graphblas.matrix_multiply only calculate output values in positions not present in the mask.
mlir_text = """
#CSR64 = #sparse_tensor.encoding<{
dimLevelType = [ "dense", "compressed" ],
dimOrdering = affine_map<(i,j) -> (i,j)>,
pointerBitWidth = 64,
indexBitWidth = 64
}>
#CSC64 = #sparse_tensor.encoding<{
dimLevelType = [ "dense", "compressed" ],
dimOrdering = affine_map<(i,j) -> (j,i)>,
pointerBitWidth = 64,
indexBitWidth = 64
}>
module {
func @matrix_multiply_plus_plus_mask_complement(%a: tensor<?x?xf64, #CSR64>, %b: tensor<?x?xf64, #CSR64>, %m: tensor<?x?xf64, #CSR64>) -> tensor<?x?xf64, #CSR64> {
%answer = graphblas.matrix_multiply %a, %b, %m { semiring = "plus_plus", mask_complement = true } : (tensor<?x?xf64, #CSR64>, tensor<?x?xf64, #CSR64>, tensor<?x?xf64, #CSR64>) to tensor<?x?xf64, #CSR64>
return %answer : tensor<?x?xf64, #CSR64>
}
}
"""
engine.add(mlir_text, passes)
['matrix_multiply_plus_plus_mask_complement']
mask_complement_result = engine.matrix_multiply_plus_plus_mask_complement(A, B, mask)
mask_complement_result.toarray()
array([[ 0., 8., 0., 0.],
[ 0., 24., 0., 0.],
[ 0., 0., 0., 0.],
[ 0., 19., 0., 0.]])
All of the above behavior works similarly with vectors. We’ll leave exploring that as an exercise for the reader.
Overview of graphblas.apply¶
Here, we’ll show how to use the graphblas.apply op.
graphblas.apply applies in an element-wise fashion the function indicated by the apply_operator attribute to each element of the given sparse tensor. The operator can be unary or binary. Binary operators require a thunk. The supported binary operators are “min”, “div”, and “fill”. Unary operators cannot take a thunk. Unary operators cannot take a thunk. The supported unary operators are “abs”, “minv” (i.e. multiplicative inverse or 1/x), “ainv” (i.e. additive inverse or -x), and
“identity”.
The given sparse tensor must either be a CSR matrix, CSC matrix, or a sparse vector.
We’ll show some examples here of how to use graphblas.apply.
First, we’ll clip the values of a sparse matrix to be no higher than a given limit using the “min” operator.
mlir_text = """
#CSR64 = #sparse_tensor.encoding<{
dimLevelType = [ "dense", "compressed" ],
dimOrdering = affine_map<(i,j) -> (i,j)>,
pointerBitWidth = 64,
indexBitWidth = 64
}>
module {
func @clip(%sparse_tensor: tensor<?x?xf64, #CSR64>, %limit: f64) -> tensor<?x?xf64, #CSR64> {
%answer = graphblas.apply %sparse_tensor, %limit { apply_operator = "min" } : (tensor<?x?xf64, #CSR64>, f64) to tensor<?x?xf64, #CSR64>
return %answer : tensor<?x?xf64, #CSR64>
}
}
"""
engine.add(mlir_text, passes)
['clip']
sparse_result = engine.clip(A, 3)
sparse_result.toarray()
array([[1., 0., 0., 0.],
[2., 0., 3., 3.],
[0., 0., 0., 0.],
[0., 0., 3., 3.]])
Using the “min” operator is simple since it is a symmetric function. There are some binary operators that are not symmetric, e.g. the “div” operator is not symmetric since dividing all the elements in a sparse tensor by x will lead to a different result than dividing x by each element in a sparse tensor.
We’ll show some code below that’ll demonstrate how to specify the order in which the elements of the given sparse tensor and the thunk are passed to the operator. We’ll show this using the “div” operator.
mlir_text = """
#CV64 = #sparse_tensor.encoding<{
dimLevelType = [ "compressed" ],
pointerBitWidth = 64,
indexBitWidth = 64
}>
module {
func @apply_vector_left_thunk_div(%sparse_tensor: tensor<?xf64, #CV64>, %thunk: f64) -> tensor<?xf64, #CV64> {
%answer = graphblas.apply %thunk, %sparse_tensor { apply_operator = "div" } : (f64, tensor<?xf64, #CV64>) to tensor<?xf64, #CV64>
return %answer : tensor<?xf64, #CV64>
}
func @apply_vector_right_thunk_div(%sparse_tensor: tensor<?xf64, #CV64>, %thunk: f64) -> tensor<?xf64, #CV64> {
%answer = graphblas.apply %sparse_tensor, %thunk { apply_operator = "div" } : (tensor<?xf64, #CV64>, f64) to tensor<?xf64, #CV64>
return %answer : tensor<?xf64, #CV64>
}
}
"""
Note how in apply_vector_left_thunk_div, the thunk is passed as the first argument to graphblas.apply. This means that the thunk is the dividend,
Since the thunk is passed as the second argument to graphblas.apply in apply_vector_right_thunk_div, this means the thunk is the divisor.
engine.add(mlir_text, passes)
['apply_vector_left_thunk_div', 'apply_vector_right_thunk_div']
Since our code operators on vectors, let’s create some vector inputs.
dense_vector = np.array([0, 0, -100, 0, 0, 0, 200, -300, 0, 0, 400, 0, 0], dtype=np.float64)
sparse_vector = sparsify_array(dense_vector, [True])
Let’s see what we get as outputs.
thunk_dividend_result = engine.apply_vector_left_thunk_div(sparse_vector, 10)
thunk_dividend_result.toarray()
array([ 0. , 0. , -0.1 , 0. , 0. ,
0. , 0.05 , -0.03333333, 0. , 0. ,
0.025 , 0. , 0. ])
thunk_divisor_result = engine.apply_vector_right_thunk_div(sparse_vector, 10)
thunk_divisor_result.toarray()
array([ 0., 0., -10., 0., 0., 0., 20., -30., 0., 0., 40.,
0., 0.])
Notice that in the result of thunk_dividend_result.toarray(), we have zeros where there were missing values.
This is somewhat unintuitive. We’d expect some sort of NaN values where we see zeros since that’s the expected result for 10.0/0.0. This is because graphblas.apply only operates on values present in the given sparse tensor and does nothing to the missing values. MLIRSparseTensor.toarray materializes the missing values as zeros, but it could also materialize them as any other value.
thunk_dividend_result.toarray(missing=-1)
array([-1. , -1. , -0.1 , -1. , -1. ,
-1. , 0.05 , -0.03333333, -1. , -1. ,
0.025 , -1. , -1. ])
Our final example for graphblas.apply will demonstrate how to apply the absolute value function to each element of a sparse vector.
Because abs doesn’t require a thunk value, none is provided.
mlir_text = """
#CV64 = #sparse_tensor.encoding<{
dimLevelType = [ "compressed" ],
pointerBitWidth = 64,
indexBitWidth = 64
}>
module {
func @vector_abs(%sparse_tensor: tensor<?xf64, #CV64>) -> tensor<?xf64, #CV64> {
%answer = graphblas.apply %sparse_tensor { apply_operator = "abs" } : (tensor<?xf64, #CV64>) to tensor<?xf64, #CV64>
return %answer : tensor<?xf64, #CV64>
}
}
"""
engine.add(mlir_text, passes)
['vector_abs']
sparse_result = engine.vector_abs(sparse_vector)
sparse_result.toarray()
array([ 0., 0., 100., 0., 0., 0., 200., 300., 0., 0., 400.,
0., 0.])