This page was generated from tools/engine/matrix_plus_broadcasted_vector.ipynb.
JIT Engine: Matrix + Broadcasted Vector
This example goes over how to compile MLIR code that adds a vector to a matrix using broadcasting, so that vector element i is added to every element of row i.
In other words, we want to write a function equivalent to this NumPy code.
import numpy as np
matrix = np.full([4,3], 100, dtype=np.float32)
vector = np.arange(4, dtype=np.float32)
matrix + np.expand_dims(vector, 1)
array([[100., 100., 100.],
       [101., 101., 101.],
       [102., 102., 102.],
       [103., 103., 103.]], dtype=float32)
Let’s first import some necessary modules and generate an instance of our JIT engine.
import mlir_graphblas
engine = mlir_graphblas.MlirJitEngine()
Using development graphblas-opt: /Users/pnguyen/code/mlir-graphblas/mlir_graphblas/src/build/bin/graphblas-opt
Here’s the MLIR code we’ll use.
mlir_text = """
#trait_matplusvec = {
  indexing_maps = [
    affine_map<(i,j) -> (i,j)>,
    affine_map<(i,j) -> (i)>,
    affine_map<(i,j) -> (i,j)>
  ],
  iterator_types = ["parallel", "parallel"]
}

func @mat_plus_vec(%arga: tensor<10x?xf32>, %argb: tensor<10xf32>) -> tensor<10x?xf32> {
  %c1 = arith.constant 1 : index
  %arga_dim1 = tensor.dim %arga, %c1 : tensor<10x?xf32>
  %output_tensor = linalg.init_tensor [10, %arga_dim1] : tensor<10x?xf32>
  %answer = linalg.generic #trait_matplusvec
      ins(%arga, %argb : tensor<10x?xf32>, tensor<10xf32>)
      outs(%output_tensor: tensor<10x?xf32>) {
    ^bb(%a: f32, %b: f32, %x: f32):
      %sum = arith.addf %a, %b : f32
      linalg.yield %sum : f32
  } -> tensor<10x?xf32>
  return %answer : tensor<10x?xf32>
}
"""
Note that the input matrix has a static row count (10) but an arbitrary number of columns, indicated by the ? in tensor<10x?xf32>.
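The semantics that #trait_matplusvec gives the linalg.generic op can be sketched in plain Python: the two parallel iterators walk (i, j), the indexing maps read a[i, j] and b[i], and the block body writes their sum to out[i, j]. The following is a hedged NumPy sketch of those semantics (the function name mat_plus_vec_ref is ours, not part of the tutorial), not the code MLIR actually generates.

```python
import numpy as np

def mat_plus_vec_ref(a, b):
    """Reference semantics of the linalg.generic kernel:
    out[i, j] = a[i, j] + b[i], following the three affine indexing maps."""
    rows, cols = a.shape
    out = np.empty_like(a)
    for i in range(rows):       # iterator_types = ["parallel", "parallel"]
        for j in range(cols):
            out[i, j] = a[i, j] + b[i]
    return out

a = np.full([10, 4], 100, dtype=np.float32)
b = np.arange(10, dtype=np.float32)
assert np.array_equal(mat_plus_vec_ref(a, b), a + np.expand_dims(b, 1))
```

The second map, affine_map<(i,j) -> (i)>, is what implements the broadcast: it ignores j, so the same vector element b[i] is reused across an entire row.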
These are the passes we’ll use to optimize and compile our MLIR code.
passes = [
    "--graphblas-structuralize",
    "--graphblas-optimize",
    "--graphblas-lower",
    "--sparsification",
    "--sparse-tensor-conversion",
    "--linalg-bufferize",
    "--arith-bufferize",
    "--func-bufferize",
    "--tensor-bufferize",
    "--finalizing-bufferize",
    "--convert-linalg-to-loops",
    "--convert-scf-to-cf",
    "--convert-memref-to-llvm",
    "--convert-math-to-llvm",
    "--convert-openmp-to-llvm",
    "--convert-arith-to-llvm",
    "--convert-std-to-llvm",
    "--reconcile-unrealized-casts"
]
Let’s compile our MLIR code using our JIT engine.
engine.add(mlir_text, passes)
mat_plus_vec = engine.mat_plus_vec
Let’s see how well our function works.
# generate inputs
m = np.arange(40, dtype=np.float32).reshape([10,4])
v = np.arange(10, dtype=np.float32) / 10
# generate output
result = mat_plus_vec(m, v)
result
array([[ 0. ,  1. ,  2. ,  3. ],
       [ 4.1,  5.1,  6.1,  7.1],
       [ 8.2,  9.2, 10.2, 11.2],
       [12.3, 13.3, 14.3, 15.3],
       [16.4, 17.4, 18.4, 19.4],
       [20.5, 21.5, 22.5, 23.5],
       [24.6, 25.6, 26.6, 27.6],
       [28.7, 29.7, 30.7, 31.7],
       [32.8, 33.8, 34.8, 35.8],
       [36.9, 37.9, 38.9, 39.9]], dtype=float32)
Let’s verify that our result is correct.
np.all(result == m + np.expand_dims(v, 1))
True
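Since the second dimension of the input tensor is dynamic, the same compiled function accepts 10-row matrices of any width. As a pure-NumPy sketch of the expected result for a wider input (running mat_plus_vec itself requires the compiled JIT engine from above):

```python
import numpy as np

# The ? in tensor<10x?xf32> means the column count is dynamic, so a
# 10x7 matrix is also a valid input. This states the expected result
# in NumPy; calling mat_plus_vec(m7, v) should produce the same array.
m7 = np.arange(70, dtype=np.float32).reshape(10, 7)
v = np.arange(10, dtype=np.float32) / 10

expected = m7 + np.expand_dims(v, 1)  # out[i, j] = m7[i, j] + v[i]
assert expected.shape == (10, 7)
```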