This page was generated from tools/cli/using_debugresult.ipynb.
Using DebugResult
Here, we will show how to use DebugResult
to debug problems we might encounter when using our mlir-opt CLI wrapper.
Let’s first import the necessary class and create an instance of the mlir-opt CLI wrapper.
from mlir_graphblas import MlirOptCli
cli = MlirOptCli(executable=None, options=None)
Using development graphblas-opt: /Users/pnguyen/code/mlir-graphblas/mlir_graphblas/src/build/bin/graphblas-opt
Generate Example Input
Let’s say we have some MLIR code that we’re not familiar with.
mlir_string = """
#trait_sum_reduction = {
indexing_maps = [
affine_map<(i,j,k) -> (i,j,k)>, // A
affine_map<(i,j,k) -> ()> // x (scalar out)
],
iterator_types = ["reduction", "reduction", "reduction"],
doc = "x += SUM_ijk A(i,j,k)"
}
#sparseTensor = #sparse_tensor.encoding<{
dimLevelType = [ "compressed", "compressed", "compressed" ],
dimOrdering = affine_map<(i,j,k) -> (i,j,k)>,
pointerBitWidth = 64,
indexBitWidth = 64
}>
func @func_f32(%argA: tensor<10x20x30xf32, #sparseTensor>) -> f32 {
%out_tensor = linalg.init_tensor [] : tensor<f32>
%reduction = linalg.generic #trait_sum_reduction
ins(%argA: tensor<10x20x30xf32, #sparseTensor>)
outs(%out_tensor: tensor<f32>) {
^bb(%a: f32, %x: f32):
%0 = arith.addf %x, %a : f32
linalg.yield %0 : f32
} -> tensor<f32>
%answer = tensor.extract %reduction[] : tensor<f32>
return %answer : f32
}
"""
mlir_bytes = mlir_string.encode()
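Before debugging, it helps to know what this kernel is supposed to compute. Its doc attribute says x += SUM_ijk A(i,j,k): the function reduces every element of the 10x20x30 sparse tensor into a single f32 scalar. Here is a plain-Python sketch of that same reduction, using a small dense stand-in (not the sparse format the real kernel consumes):

```python
# Dense stand-in for the sparse 10x20x30 input; the kernel folds
# every element A(i, j, k) into a single scalar x.
A = [[[float(i + j + k) for k in range(30)]
      for j in range(20)]
     for i in range(10)]

x = 0.0
for plane in A:        # loop over i
    for row in plane:  # loop over j
        for value in row:  # loop over k
            x += value
print(x)  # 171000.0 for this stand-in data
```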
Since we’re not familiar with this code, we don’t know exactly which passes are necessary or in what order they should go.
Let’s say that this is the first set of passes we try.
passes = [
"--sparsification",
"--sparse-tensor-conversion",
"--linalg-bufferize",
"--arith-bufferize",
"--func-bufferize",
"--tensor-bufferize",
"--finalizing-bufferize",
"--convert-linalg-to-loops",
"--convert-vector-to-llvm",
"--convert-math-to-llvm",
"--convert-math-to-libm",
"--convert-memref-to-llvm",
"--convert-openmp-to-llvm",
"--convert-arith-to-llvm",
"--convert-std-to-llvm",
"--reconcile-unrealized-casts"
]
Let’s see what results we get.
result = cli.apply_passes(mlir_bytes, passes)
[stderr] <stdin>:20:16: error: failed to legalize operation 'builtin.unrealized_conversion_cast' that was explicitly marked illegal
[stderr] %reduction = linalg.generic #trait_sum_reduction
[stderr] ^
[stderr] <stdin>:20:16: note: see current operation: %4 = "builtin.unrealized_conversion_cast"(%3) : (i64) -> index
---------------------------------------------------------------------------
MlirOptError Traceback (most recent call last)
Input In [4], in <cell line: 1>()
----> 1 result = cli.apply_passes(mlir_bytes, passes)
File ~/code/mlir-graphblas/mlir_graphblas/cli.py:93, in MlirOptCli.apply_passes(self, file, passes)
91 input = self._read_input(fp)
92 err.debug_result = self.debug_passes(input, passes) if passes else None
---> 93 raise err
MlirOptError: <stdin>:20:16: error: failed to legalize operation 'builtin.unrealized_conversion_cast' that was explicitly marked illegal
%reduction = linalg.generic #trait_sum_reduction
^
We get an exception.
Unfortunately, the exception message isn’t very informative: it only gives us the immediate error message, without the context in which the error occurred, e.g. which pass (if any) raised it or whether any necessary passes are missing.
We only know that the operation builtin.unrealized_conversion_cast
shows up somewhere and that it’s a problem.
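One way to localize such a failure by hand is to apply longer and longer prefixes of the pass list and note which addition first triggers an error; this is essentially the bookkeeping that the debugging support automates for us. Here is a minimal sketch of that idea, using a hypothetical first_failing_pass helper and a stand-in fake_apply callable in place of cli.apply_passes (the real call needs a working mlir-opt binary):

```python
def first_failing_pass(apply, source, passes):
    """Apply increasingly long prefixes of `passes` and return the
    name of the first pass whose addition makes `apply` raise."""
    for i in range(1, len(passes) + 1):
        try:
            apply(source, passes[:i])
        except Exception:
            return passes[i - 1]
    return None  # every prefix succeeded

# Stand-in for cli.apply_passes: pretend the cast-cleanup pass fails.
def fake_apply(source, passes):
    if "--reconcile-unrealized-casts" in passes:
        raise RuntimeError("failed to legalize operation")

print(first_failing_pass(fake_apply, b"", [
    "--sparsification",
    "--convert-std-to-llvm",
    "--reconcile-unrealized-casts",
]))  # --reconcile-unrealized-casts
```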
Let’s use the debug_passes
method instead of apply_passes
to get more information. (As the traceback shows, apply_passes already attaches this same report to the raised exception as err.debug_result.)
result = cli.debug_passes(mlir_bytes, passes)
result
=================================================
Error when running reconcile-unrealized-casts
=================================================
<stdin>:24:10: error: failed to legalize operation 'builtin.unrealized_conversion_cast' that was explicitly marked illegal
%4 = builtin.unrealized_conversion_cast %3 : i64 to index
^
<stdin>:24:10: note: see current operation: %4 = "builtin.unrealized_conversion_cast"(%3) : (i64) -> index loc("<stdin>":24:10)
=======================================
Input to reconcile-unrealized-casts
=======================================
10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180 190 200
12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1|module attributes {llvm.data_layout = ""} {
2| llvm.func @malloc(i64) -> !llvm.ptr<i8>
3| llvm.func @sparseValuesF32(%arg0: !llvm.ptr<i8>) -> !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> attributes {llvm.emit_c_interface, sym_visibility = "private"} {
4| %0 = llvm.mlir.constant(1 : index) : i64
5| %1 = llvm.alloca %0 x !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> : (i64) -> !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>
6| llvm.call @_mlir_ciface_sparseValuesF32(%1, %arg0) : (!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>) -> ()
7| %2 = llvm.load %1 : !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>
8| llvm.return %2 : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
9| }
10| llvm.func @_mlir_ciface_sparseValuesF32(!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>) attributes {llvm.emit_c_interface, sym_visibility = "private"}
11| llvm.func @sparsePointers64(%arg0: !llvm.ptr<i8>, %arg1: i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> attributes {llvm.emit_c_interface, sym_visibility = "private"} {
12| %0 = llvm.mlir.constant(1 : index) : i64
13| %1 = llvm.alloca %0 x !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> : (i64) -> !llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>
14| llvm.call @_mlir_ciface_sparsePointers64(%1, %arg0, %arg1) : (!llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>, i64) -> ()
15| %2 = llvm.load %1 : !llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>
16| llvm.return %2 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
17| }
18| llvm.func @_mlir_ciface_sparsePointers64(!llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>, i64) attributes {llvm.emit_c_interface, sym_visibility = "private"}
19| llvm.func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
20| %0 = llvm.mlir.constant(0 : index) : i64
21| %1 = builtin.unrealized_conversion_cast %0 : i64 to index
22| %2 = builtin.unrealized_conversion_cast %1 : index to i64
23| %3 = llvm.mlir.constant(1 : index) : i64
24| %4 = builtin.unrealized_conversion_cast %3 : i64 to index
25| %5 = builtin.unrealized_conversion_cast %4 : index to i64
26| %6 = llvm.mlir.constant(2 : index) : i64
27| %7 = llvm.mlir.constant(0.000000e+00 : f32) : f32
28| %8 = llvm.call @sparsePointers64(%arg0, %0) : (!llvm.ptr<i8>, i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
29| %9 = builtin.unrealized_conversion_cast %8 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xi64>
30| %10 = builtin.unrealized_conversion_cast %9 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
31| %11 = llvm.call @sparsePointers64(%arg0, %3) : (!llvm.ptr<i8>, i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
32| %12 = builtin.unrealized_conversion_cast %11 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xi64>
33| %13 = builtin.unrealized_conversion_cast %12 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
34| %14 = llvm.call @sparsePointers64(%arg0, %6) : (!llvm.ptr<i8>, i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
35| %15 = builtin.unrealized_conversion_cast %14 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xi64>
36| %16 = builtin.unrealized_conversion_cast %15 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
37| %17 = llvm.call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
38| %18 = builtin.unrealized_conversion_cast %17 : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xf32>
39| %19 = builtin.unrealized_conversion_cast %18 : memref<?xf32> to !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
40| %20 = llvm.mlir.constant(1 : index) : i64
41| %21 = llvm.mlir.null : !llvm.ptr<f32>
42| %22 = llvm.getelementptr %21[%20] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>
43| %23 = llvm.ptrtoint %22 : !llvm.ptr<f32> to i64
44| %24 = llvm.call @malloc(%23) : (i64) -> !llvm.ptr<i8>
45| %25 = llvm.bitcast %24 : !llvm.ptr<i8> to !llvm.ptr<f32>
46| %26 = llvm.mlir.undef : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
47| %27 = llvm.insertvalue %25, %26[0] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
48| %28 = llvm.insertvalue %25, %27[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
49| %29 = llvm.mlir.constant(0 : index) : i64
50| %30 = llvm.insertvalue %29, %28[2] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
51| %31 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
52| llvm.store %7, %31 : !llvm.ptr<f32>
53| %32 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
54| %33 = llvm.load %32 : !llvm.ptr<f32>
55| %34 = llvm.extractvalue %10[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
56| %35 = llvm.getelementptr %34[%2] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
57| %36 = llvm.load %35 : !llvm.ptr<i64>
58| %37 = builtin.unrealized_conversion_cast %36 : i64 to index
59| %38 = llvm.extractvalue %10[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
60| %39 = llvm.getelementptr %38[%5] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
61| %40 = llvm.load %39 : !llvm.ptr<i64>
62| %41 = builtin.unrealized_conversion_cast %40 : i64 to index
63| %42 = scf.for %arg1 = %37 to %41 step %4 iter_args(%arg2 = %33) -> (f32) {
64| %46 = builtin.unrealized_conversion_cast %arg1 : index to i64
65| %47 = builtin.unrealized_conversion_cast %arg1 : index to i64
66| %48 = llvm.extractvalue %13[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
67| %49 = llvm.getelementptr %48[%47] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
68| %50 = llvm.load %49 : !llvm.ptr<i64>
69| %51 = builtin.unrealized_conversion_cast %50 : i64 to index
70| %52 = llvm.add %46, %3 : i64
71| %53 = builtin.unrealized_conversion_cast %52 : i64 to index
72| %54 = builtin.unrealized_conversion_cast %53 : index to i64
73| %55 = llvm.extractvalue %13[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
74| %56 = llvm.getelementptr %55[%54] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
75| %57 = llvm.load %56 : !llvm.ptr<i64>
76| %58 = builtin.unrealized_conversion_cast %57 : i64 to index
77| %59 = scf.for %arg3 = %51 to %58 step %4 iter_args(%arg4 = %arg2) -> (f32) {
78| %60 = builtin.unrealized_conversion_cast %arg3 : index to i64
79| %61 = builtin.unrealized_conversion_cast %arg3 : index to i64
80| %62 = llvm.extractvalue %16[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
81| %63 = llvm.getelementptr %62[%61] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
82| %64 = llvm.load %63 : !llvm.ptr<i64>
83| %65 = builtin.unrealized_conversion_cast %64 : i64 to index
84| %66 = llvm.add %60, %3 : i64
85| %67 = builtin.unrealized_conversion_cast %66 : i64 to index
86| %68 = builtin.unrealized_conversion_cast %67 : index to i64
87| %69 = llvm.extractvalue %16[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
88| %70 = llvm.getelementptr %69[%68] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
89| %71 = llvm.load %70 : !llvm.ptr<i64>
90| %72 = builtin.unrealized_conversion_cast %71 : i64 to index
91| %73 = scf.for %arg5 = %65 to %72 step %4 iter_args(%arg6 = %arg4) -> (f32) {
92| %74 = builtin.unrealized_conversion_cast %arg5 : index to i64
93| %75 = llvm.extractvalue %19[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
94| %76 = llvm.getelementptr %75[%74] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>
95| %77 = llvm.load %76 : !llvm.ptr<f32>
96| %78 = llvm.fadd %arg6, %77 : f32
97| scf.yield %78 : f32
98| }
99| scf.yield %73 : f32
100| }
101| scf.yield %59 : f32
102| }
103| %43 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
104| llvm.store %42, %43 : !llvm.ptr<f32>
105| %44 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
106| %45 = llvm.load %44 : !llvm.ptr<f32>
107| llvm.return %45 : f32
108| }
109|}
110|
================================
Input to convert-std-to-llvm
================================
module {
llvm.func @malloc(i64) -> !llvm.ptr<i8>
llvm.func @sparseValuesF32(%arg0: !llvm.ptr<i8>) -> !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> attributes {llvm.emit_c_interface, sym_visibility = "private"} {
%0 = llvm.mlir.constant(1 : index) : i64
%1 = llvm.alloca %0 x !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> : (i64) -> !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>
llvm.call @_mlir_ciface_sparseValuesF32(%1, %arg0) : (!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>) -> ()
%2 = llvm.load %1 : !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>
llvm.return %2 : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
}
llvm.func @_mlir_ciface_sparseValuesF32(!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>) attributes {llvm.emit_c_interface, sym_visibility = "private"}
llvm.func @sparsePointers64(%arg0: !llvm.ptr<i8>, %arg1: i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> attributes {llvm.emit_c_interface, sym_visibility = "private"} {
%0 = llvm.mlir.constant(1 : index) : i64
%1 = llvm.alloca %0 x !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> : (i64) -> !llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>
llvm.call @_mlir_ciface_sparsePointers64(%1, %arg0, %arg1) : (!llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>, i64) -> ()
%2 = llvm.load %1 : !llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>
llvm.return %2 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
}
llvm.func @_mlir_ciface_sparsePointers64(!llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>, i64) attributes {llvm.emit_c_interface, sym_visibility = "private"}
llvm.func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
%0 = llvm.mlir.constant(0 : index) : i64
%1 = builtin.unrealized_conversion_cast %0 : i64 to index
%2 = builtin.unrealized_conversion_cast %1 : index to i64
%3 = llvm.mlir.constant(1 : index) : i64
%4 = builtin.unrealized_conversion_cast %3 : i64 to index
%5 = builtin.unrealized_conversion_cast %4 : index to i64
%6 = llvm.mlir.constant(2 : index) : i64
%7 = llvm.mlir.constant(0.000000e+00 : f32) : f32
%8 = llvm.call @sparsePointers64(%arg0, %0) : (!llvm.ptr<i8>, i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%9 = builtin.unrealized_conversion_cast %8 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xi64>
%10 = builtin.unrealized_conversion_cast %9 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%11 = llvm.call @sparsePointers64(%arg0, %3) : (!llvm.ptr<i8>, i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%12 = builtin.unrealized_conversion_cast %11 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xi64>
%13 = builtin.unrealized_conversion_cast %12 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%14 = llvm.call @sparsePointers64(%arg0, %6) : (!llvm.ptr<i8>, i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%15 = builtin.unrealized_conversion_cast %14 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xi64>
%16 = builtin.unrealized_conversion_cast %15 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%17 = llvm.call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
%18 = builtin.unrealized_conversion_cast %17 : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xf32>
%19 = builtin.unrealized_conversion_cast %18 : memref<?xf32> to !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
%20 = llvm.mlir.constant(1 : index) : i64
%21 = llvm.mlir.null : !llvm.ptr<f32>
%22 = llvm.getelementptr %21[%20] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>
%23 = llvm.ptrtoint %22 : !llvm.ptr<f32> to i64
%24 = llvm.call @malloc(%23) : (i64) -> !llvm.ptr<i8>
%25 = llvm.bitcast %24 : !llvm.ptr<i8> to !llvm.ptr<f32>
%26 = llvm.mlir.undef : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
%27 = llvm.insertvalue %25, %26[0] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
%28 = llvm.insertvalue %25, %27[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
%29 = llvm.mlir.constant(0 : index) : i64
%30 = llvm.insertvalue %29, %28[2] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
%31 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
llvm.store %7, %31 : !llvm.ptr<f32>
%32 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
%33 = llvm.load %32 : !llvm.ptr<f32>
%34 = llvm.extractvalue %10[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%35 = llvm.getelementptr %34[%2] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%36 = llvm.load %35 : !llvm.ptr<i64>
%37 = builtin.unrealized_conversion_cast %36 : i64 to index
%38 = llvm.extractvalue %10[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%39 = llvm.getelementptr %38[%5] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%40 = llvm.load %39 : !llvm.ptr<i64>
%41 = builtin.unrealized_conversion_cast %40 : i64 to index
%42 = scf.for %arg1 = %37 to %41 step %4 iter_args(%arg2 = %33) -> (f32) {
%46 = builtin.unrealized_conversion_cast %arg1 : index to i64
%47 = builtin.unrealized_conversion_cast %arg1 : index to i64
%48 = llvm.extractvalue %13[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%49 = llvm.getelementptr %48[%47] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%50 = llvm.load %49 : !llvm.ptr<i64>
%51 = builtin.unrealized_conversion_cast %50 : i64 to index
%52 = llvm.add %46, %3 : i64
%53 = builtin.unrealized_conversion_cast %52 : i64 to index
%54 = builtin.unrealized_conversion_cast %53 : index to i64
%55 = llvm.extractvalue %13[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%56 = llvm.getelementptr %55[%54] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%57 = llvm.load %56 : !llvm.ptr<i64>
%58 = builtin.unrealized_conversion_cast %57 : i64 to index
%59 = scf.for %arg3 = %51 to %58 step %4 iter_args(%arg4 = %arg2) -> (f32) {
%60 = builtin.unrealized_conversion_cast %arg3 : index to i64
%61 = builtin.unrealized_conversion_cast %arg3 : index to i64
%62 = llvm.extractvalue %16[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%63 = llvm.getelementptr %62[%61] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%64 = llvm.load %63 : !llvm.ptr<i64>
%65 = builtin.unrealized_conversion_cast %64 : i64 to index
%66 = llvm.add %60, %3 : i64
%67 = builtin.unrealized_conversion_cast %66 : i64 to index
%68 = builtin.unrealized_conversion_cast %67 : index to i64
%69 = llvm.extractvalue %16[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%70 = llvm.getelementptr %69[%68] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%71 = llvm.load %70 : !llvm.ptr<i64>
%72 = builtin.unrealized_conversion_cast %71 : i64 to index
%73 = scf.for %arg5 = %65 to %72 step %4 iter_args(%arg6 = %arg4) -> (f32) {
%74 = builtin.unrealized_conversion_cast %arg5 : index to i64
%75 = llvm.extractvalue %19[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
%76 = llvm.getelementptr %75[%74] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>
%77 = llvm.load %76 : !llvm.ptr<f32>
%78 = llvm.fadd %arg6, %77 : f32
scf.yield %78 : f32
}
scf.yield %73 : f32
}
scf.yield %59 : f32
}
%43 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
llvm.store %42, %43 : !llvm.ptr<f32>
%44 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
%45 = llvm.load %44 : !llvm.ptr<f32>
llvm.return %45 : f32
}
}
==================================
Input to convert-arith-to-llvm
==================================
module {
llvm.func @malloc(i64) -> !llvm.ptr<i8>
llvm.func @sparseValuesF32(%arg0: !llvm.ptr<i8>) -> !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> attributes {llvm.emit_c_interface, sym_visibility = "private"} {
%0 = llvm.mlir.constant(1 : index) : i64
%1 = llvm.alloca %0 x !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> : (i64) -> !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>
llvm.call @_mlir_ciface_sparseValuesF32(%1, %arg0) : (!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>) -> ()
%2 = llvm.load %1 : !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>
llvm.return %2 : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
}
llvm.func @_mlir_ciface_sparseValuesF32(!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>) attributes {llvm.emit_c_interface, sym_visibility = "private"}
llvm.func @sparsePointers64(%arg0: !llvm.ptr<i8>, %arg1: i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> attributes {llvm.emit_c_interface, sym_visibility = "private"} {
%0 = llvm.mlir.constant(1 : index) : i64
%1 = llvm.alloca %0 x !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> : (i64) -> !llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>
llvm.call @_mlir_ciface_sparsePointers64(%1, %arg0, %arg1) : (!llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>, i64) -> ()
%2 = llvm.load %1 : !llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>
llvm.return %2 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
}
llvm.func @_mlir_ciface_sparsePointers64(!llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>, i64) attributes {llvm.emit_c_interface, sym_visibility = "private"}
llvm.func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
%0 = llvm.mlir.constant(0 : index) : i64
%1 = builtin.unrealized_conversion_cast %0 : i64 to index
%2 = builtin.unrealized_conversion_cast %1 : index to i64
%3 = llvm.mlir.constant(1 : index) : i64
%4 = builtin.unrealized_conversion_cast %3 : i64 to index
%5 = builtin.unrealized_conversion_cast %4 : index to i64
%6 = llvm.mlir.constant(2 : index) : i64
%7 = llvm.mlir.constant(0.000000e+00 : f32) : f32
%8 = llvm.call @sparsePointers64(%arg0, %0) : (!llvm.ptr<i8>, i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%9 = builtin.unrealized_conversion_cast %8 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xi64>
%10 = builtin.unrealized_conversion_cast %9 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%11 = llvm.call @sparsePointers64(%arg0, %3) : (!llvm.ptr<i8>, i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%12 = builtin.unrealized_conversion_cast %11 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xi64>
%13 = builtin.unrealized_conversion_cast %12 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%14 = llvm.call @sparsePointers64(%arg0, %6) : (!llvm.ptr<i8>, i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%15 = builtin.unrealized_conversion_cast %14 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xi64>
%16 = builtin.unrealized_conversion_cast %15 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%17 = llvm.call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
%18 = builtin.unrealized_conversion_cast %17 : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xf32>
%19 = builtin.unrealized_conversion_cast %18 : memref<?xf32> to !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
%20 = llvm.mlir.constant(1 : index) : i64
%21 = llvm.mlir.null : !llvm.ptr<f32>
%22 = llvm.getelementptr %21[%20] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>
%23 = llvm.ptrtoint %22 : !llvm.ptr<f32> to i64
%24 = llvm.call @malloc(%23) : (i64) -> !llvm.ptr<i8>
%25 = llvm.bitcast %24 : !llvm.ptr<i8> to !llvm.ptr<f32>
%26 = llvm.mlir.undef : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
%27 = llvm.insertvalue %25, %26[0] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
%28 = llvm.insertvalue %25, %27[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
%29 = llvm.mlir.constant(0 : index) : i64
%30 = llvm.insertvalue %29, %28[2] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
%31 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
llvm.store %7, %31 : !llvm.ptr<f32>
%32 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
%33 = llvm.load %32 : !llvm.ptr<f32>
%34 = llvm.extractvalue %10[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%35 = llvm.getelementptr %34[%2] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%36 = llvm.load %35 : !llvm.ptr<i64>
%37 = builtin.unrealized_conversion_cast %36 : i64 to index
%38 = llvm.extractvalue %10[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%39 = llvm.getelementptr %38[%5] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%40 = llvm.load %39 : !llvm.ptr<i64>
%41 = builtin.unrealized_conversion_cast %40 : i64 to index
%42 = scf.for %arg1 = %37 to %41 step %4 iter_args(%arg2 = %33) -> (f32) {
%46 = builtin.unrealized_conversion_cast %arg1 : index to i64
%47 = builtin.unrealized_conversion_cast %arg1 : index to i64
%48 = llvm.extractvalue %13[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%49 = llvm.getelementptr %48[%47] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%50 = llvm.load %49 : !llvm.ptr<i64>
%51 = builtin.unrealized_conversion_cast %50 : i64 to index
%52 = llvm.add %46, %3 : i64
%53 = builtin.unrealized_conversion_cast %52 : i64 to index
%54 = builtin.unrealized_conversion_cast %53 : index to i64
%55 = llvm.extractvalue %13[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%56 = llvm.getelementptr %55[%54] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%57 = llvm.load %56 : !llvm.ptr<i64>
%58 = builtin.unrealized_conversion_cast %57 : i64 to index
%59 = scf.for %arg3 = %51 to %58 step %4 iter_args(%arg4 = %arg2) -> (f32) {
%60 = builtin.unrealized_conversion_cast %arg3 : index to i64
%61 = builtin.unrealized_conversion_cast %arg3 : index to i64
%62 = llvm.extractvalue %16[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%63 = llvm.getelementptr %62[%61] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%64 = llvm.load %63 : !llvm.ptr<i64>
%65 = builtin.unrealized_conversion_cast %64 : i64 to index
%66 = llvm.add %60, %3 : i64
%67 = builtin.unrealized_conversion_cast %66 : i64 to index
%68 = builtin.unrealized_conversion_cast %67 : index to i64
%69 = llvm.extractvalue %16[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%70 = llvm.getelementptr %69[%68] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%71 = llvm.load %70 : !llvm.ptr<i64>
%72 = builtin.unrealized_conversion_cast %71 : i64 to index
%73 = scf.for %arg5 = %65 to %72 step %4 iter_args(%arg6 = %arg4) -> (f32) {
%74 = builtin.unrealized_conversion_cast %arg5 : index to i64
%75 = llvm.extractvalue %19[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
%76 = llvm.getelementptr %75[%74] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>
%77 = llvm.load %76 : !llvm.ptr<f32>
%78 = llvm.fadd %arg6, %77 : f32
scf.yield %78 : f32
}
scf.yield %73 : f32
}
scf.yield %59 : f32
}
%43 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
llvm.store %42, %43 : !llvm.ptr<f32>
%44 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
%45 = llvm.load %44 : !llvm.ptr<f32>
llvm.return %45 : f32
}
}
===================================
Input to convert-openmp-to-llvm
===================================
module {
llvm.func @malloc(i64) -> !llvm.ptr<i8>
func private @sparseValuesF32(!llvm.ptr<i8>) -> memref<?xf32> attributes {llvm.emit_c_interface}
func private @sparsePointers64(!llvm.ptr<i8>, index) -> memref<?xi64> attributes {llvm.emit_c_interface}
func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
%c0 = arith.constant 0 : index
%0 = builtin.unrealized_conversion_cast %c0 : index to i64
%c1 = arith.constant 1 : index
%1 = builtin.unrealized_conversion_cast %c1 : index to i64
%c2 = arith.constant 2 : index
%cst = arith.constant 0.000000e+00 : f32
%2 = call @sparsePointers64(%arg0, %c0) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%3 = builtin.unrealized_conversion_cast %2 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%4 = call @sparsePointers64(%arg0, %c1) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%5 = builtin.unrealized_conversion_cast %4 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%6 = call @sparsePointers64(%arg0, %c2) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%7 = builtin.unrealized_conversion_cast %6 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%8 = call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> memref<?xf32>
%9 = builtin.unrealized_conversion_cast %8 : memref<?xf32> to !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
%10 = llvm.mlir.constant(1 : index) : i64
%11 = llvm.mlir.null : !llvm.ptr<f32>
%12 = llvm.getelementptr %11[%10] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>
%13 = llvm.ptrtoint %12 : !llvm.ptr<f32> to i64
%14 = llvm.call @malloc(%13) : (i64) -> !llvm.ptr<i8>
%15 = llvm.bitcast %14 : !llvm.ptr<i8> to !llvm.ptr<f32>
%16 = llvm.mlir.undef : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
%17 = llvm.insertvalue %15, %16[0] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
%18 = llvm.insertvalue %15, %17[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
%19 = llvm.mlir.constant(0 : index) : i64
%20 = llvm.insertvalue %19, %18[2] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
%21 = llvm.extractvalue %20[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
llvm.store %cst, %21 : !llvm.ptr<f32>
%22 = llvm.extractvalue %20[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
%23 = llvm.load %22 : !llvm.ptr<f32>
%24 = llvm.extractvalue %3[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%25 = llvm.getelementptr %24[%0] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%26 = llvm.load %25 : !llvm.ptr<i64>
%27 = arith.index_cast %26 : i64 to index
%28 = llvm.extractvalue %3[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%29 = llvm.getelementptr %28[%1] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%30 = llvm.load %29 : !llvm.ptr<i64>
%31 = arith.index_cast %30 : i64 to index
%32 = scf.for %arg1 = %27 to %31 step %c1 iter_args(%arg2 = %23) -> (f32) {
%36 = builtin.unrealized_conversion_cast %arg1 : index to i64
%37 = llvm.extractvalue %5[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%38 = llvm.getelementptr %37[%36] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%39 = llvm.load %38 : !llvm.ptr<i64>
%40 = arith.index_cast %39 : i64 to index
%41 = arith.addi %arg1, %c1 : index
%42 = builtin.unrealized_conversion_cast %41 : index to i64
%43 = llvm.extractvalue %5[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%44 = llvm.getelementptr %43[%42] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%45 = llvm.load %44 : !llvm.ptr<i64>
%46 = arith.index_cast %45 : i64 to index
%47 = scf.for %arg3 = %40 to %46 step %c1 iter_args(%arg4 = %arg2) -> (f32) {
%48 = builtin.unrealized_conversion_cast %arg3 : index to i64
%49 = llvm.extractvalue %7[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%50 = llvm.getelementptr %49[%48] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%51 = llvm.load %50 : !llvm.ptr<i64>
%52 = arith.index_cast %51 : i64 to index
%53 = arith.addi %arg3, %c1 : index
%54 = builtin.unrealized_conversion_cast %53 : index to i64
%55 = llvm.extractvalue %7[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%56 = llvm.getelementptr %55[%54] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%57 = llvm.load %56 : !llvm.ptr<i64>
%58 = arith.index_cast %57 : i64 to index
%59 = scf.for %arg5 = %52 to %58 step %c1 iter_args(%arg6 = %arg4) -> (f32) {
%60 = builtin.unrealized_conversion_cast %arg5 : index to i64
%61 = llvm.extractvalue %9[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
%62 = llvm.getelementptr %61[%60] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>
%63 = llvm.load %62 : !llvm.ptr<f32>
%64 = arith.addf %arg6, %63 : f32
scf.yield %64 : f32
}
scf.yield %59 : f32
}
scf.yield %47 : f32
}
%33 = llvm.extractvalue %20[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
llvm.store %32, %33 : !llvm.ptr<f32>
%34 = llvm.extractvalue %20[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
%35 = llvm.load %34 : !llvm.ptr<f32>
return %35 : f32
}
}
===================================
Input to convert-memref-to-llvm
===================================
module {
func private @sparseValuesF32(!llvm.ptr<i8>) -> memref<?xf32> attributes {llvm.emit_c_interface}
func private @sparsePointers64(!llvm.ptr<i8>, index) -> memref<?xi64> attributes {llvm.emit_c_interface}
func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%c2 = arith.constant 2 : index
%cst = arith.constant 0.000000e+00 : f32
%0 = call @sparsePointers64(%arg0, %c0) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%1 = call @sparsePointers64(%arg0, %c1) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%2 = call @sparsePointers64(%arg0, %c2) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%3 = call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> memref<?xf32>
%4 = memref.alloc() : memref<f32>
memref.store %cst, %4[] : memref<f32>
%5 = memref.load %4[] : memref<f32>
%6 = memref.load %0[%c0] : memref<?xi64>
%7 = arith.index_cast %6 : i64 to index
%8 = memref.load %0[%c1] : memref<?xi64>
%9 = arith.index_cast %8 : i64 to index
%10 = scf.for %arg1 = %7 to %9 step %c1 iter_args(%arg2 = %5) -> (f32) {
%12 = memref.load %1[%arg1] : memref<?xi64>
%13 = arith.index_cast %12 : i64 to index
%14 = arith.addi %arg1, %c1 : index
%15 = memref.load %1[%14] : memref<?xi64>
%16 = arith.index_cast %15 : i64 to index
%17 = scf.for %arg3 = %13 to %16 step %c1 iter_args(%arg4 = %arg2) -> (f32) {
%18 = memref.load %2[%arg3] : memref<?xi64>
%19 = arith.index_cast %18 : i64 to index
%20 = arith.addi %arg3, %c1 : index
%21 = memref.load %2[%20] : memref<?xi64>
%22 = arith.index_cast %21 : i64 to index
%23 = scf.for %arg5 = %19 to %22 step %c1 iter_args(%arg6 = %arg4) -> (f32) {
%24 = memref.load %3[%arg5] : memref<?xf32>
%25 = arith.addf %arg6, %24 : f32
scf.yield %25 : f32
}
scf.yield %23 : f32
}
scf.yield %17 : f32
}
memref.store %10, %4[] : memref<f32>
%11 = memref.load %4[] : memref<f32>
return %11 : f32
}
}
=================================
Input to convert-math-to-libm
=================================
module {
func private @sparseValuesF32(!llvm.ptr<i8>) -> memref<?xf32> attributes {llvm.emit_c_interface}
func private @sparsePointers64(!llvm.ptr<i8>, index) -> memref<?xi64> attributes {llvm.emit_c_interface}
func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%c2 = arith.constant 2 : index
%cst = arith.constant 0.000000e+00 : f32
%0 = call @sparsePointers64(%arg0, %c0) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%1 = call @sparsePointers64(%arg0, %c1) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%2 = call @sparsePointers64(%arg0, %c2) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%3 = call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> memref<?xf32>
%4 = memref.alloc() : memref<f32>
memref.store %cst, %4[] : memref<f32>
%5 = memref.load %4[] : memref<f32>
%6 = memref.load %0[%c0] : memref<?xi64>
%7 = arith.index_cast %6 : i64 to index
%8 = memref.load %0[%c1] : memref<?xi64>
%9 = arith.index_cast %8 : i64 to index
%10 = scf.for %arg1 = %7 to %9 step %c1 iter_args(%arg2 = %5) -> (f32) {
%12 = memref.load %1[%arg1] : memref<?xi64>
%13 = arith.index_cast %12 : i64 to index
%14 = arith.addi %arg1, %c1 : index
%15 = memref.load %1[%14] : memref<?xi64>
%16 = arith.index_cast %15 : i64 to index
%17 = scf.for %arg3 = %13 to %16 step %c1 iter_args(%arg4 = %arg2) -> (f32) {
%18 = memref.load %2[%arg3] : memref<?xi64>
%19 = arith.index_cast %18 : i64 to index
%20 = arith.addi %arg3, %c1 : index
%21 = memref.load %2[%20] : memref<?xi64>
%22 = arith.index_cast %21 : i64 to index
%23 = scf.for %arg5 = %19 to %22 step %c1 iter_args(%arg6 = %arg4) -> (f32) {
%24 = memref.load %3[%arg5] : memref<?xf32>
%25 = arith.addf %arg6, %24 : f32
scf.yield %25 : f32
}
scf.yield %23 : f32
}
scf.yield %17 : f32
}
memref.store %10, %4[] : memref<f32>
%11 = memref.load %4[] : memref<f32>
return %11 : f32
}
}
=================================
Input to convert-math-to-llvm
=================================
module {
func private @sparseValuesF32(!llvm.ptr<i8>) -> memref<?xf32> attributes {llvm.emit_c_interface}
func private @sparsePointers64(!llvm.ptr<i8>, index) -> memref<?xi64> attributes {llvm.emit_c_interface}
func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%c2 = arith.constant 2 : index
%cst = arith.constant 0.000000e+00 : f32
%0 = call @sparsePointers64(%arg0, %c0) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%1 = call @sparsePointers64(%arg0, %c1) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%2 = call @sparsePointers64(%arg0, %c2) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%3 = call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> memref<?xf32>
%4 = memref.alloc() : memref<f32>
memref.store %cst, %4[] : memref<f32>
%5 = memref.load %4[] : memref<f32>
%6 = memref.load %0[%c0] : memref<?xi64>
%7 = arith.index_cast %6 : i64 to index
%8 = memref.load %0[%c1] : memref<?xi64>
%9 = arith.index_cast %8 : i64 to index
%10 = scf.for %arg1 = %7 to %9 step %c1 iter_args(%arg2 = %5) -> (f32) {
%12 = memref.load %1[%arg1] : memref<?xi64>
%13 = arith.index_cast %12 : i64 to index
%14 = arith.addi %arg1, %c1 : index
%15 = memref.load %1[%14] : memref<?xi64>
%16 = arith.index_cast %15 : i64 to index
%17 = scf.for %arg3 = %13 to %16 step %c1 iter_args(%arg4 = %arg2) -> (f32) {
%18 = memref.load %2[%arg3] : memref<?xi64>
%19 = arith.index_cast %18 : i64 to index
%20 = arith.addi %arg3, %c1 : index
%21 = memref.load %2[%20] : memref<?xi64>
%22 = arith.index_cast %21 : i64 to index
%23 = scf.for %arg5 = %19 to %22 step %c1 iter_args(%arg6 = %arg4) -> (f32) {
%24 = memref.load %3[%arg5] : memref<?xf32>
%25 = arith.addf %arg6, %24 : f32
scf.yield %25 : f32
}
scf.yield %23 : f32
}
scf.yield %17 : f32
}
memref.store %10, %4[] : memref<f32>
%11 = memref.load %4[] : memref<f32>
return %11 : f32
}
}
===================================
Input to convert-vector-to-llvm
===================================
module {
func private @sparseValuesF32(!llvm.ptr<i8>) -> memref<?xf32> attributes {llvm.emit_c_interface}
func private @sparsePointers64(!llvm.ptr<i8>, index) -> memref<?xi64> attributes {llvm.emit_c_interface}
func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%c2 = arith.constant 2 : index
%cst = arith.constant 0.000000e+00 : f32
%0 = call @sparsePointers64(%arg0, %c0) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%1 = call @sparsePointers64(%arg0, %c1) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%2 = call @sparsePointers64(%arg0, %c2) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%3 = call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> memref<?xf32>
%4 = memref.alloc() : memref<f32>
memref.store %cst, %4[] : memref<f32>
%5 = memref.load %4[] : memref<f32>
%6 = memref.load %0[%c0] : memref<?xi64>
%7 = arith.index_cast %6 : i64 to index
%8 = memref.load %0[%c1] : memref<?xi64>
%9 = arith.index_cast %8 : i64 to index
%10 = scf.for %arg1 = %7 to %9 step %c1 iter_args(%arg2 = %5) -> (f32) {
%12 = memref.load %1[%arg1] : memref<?xi64>
%13 = arith.index_cast %12 : i64 to index
%14 = arith.addi %arg1, %c1 : index
%15 = memref.load %1[%14] : memref<?xi64>
%16 = arith.index_cast %15 : i64 to index
%17 = scf.for %arg3 = %13 to %16 step %c1 iter_args(%arg4 = %arg2) -> (f32) {
%18 = memref.load %2[%arg3] : memref<?xi64>
%19 = arith.index_cast %18 : i64 to index
%20 = arith.addi %arg3, %c1 : index
%21 = memref.load %2[%20] : memref<?xi64>
%22 = arith.index_cast %21 : i64 to index
%23 = scf.for %arg5 = %19 to %22 step %c1 iter_args(%arg6 = %arg4) -> (f32) {
%24 = memref.load %3[%arg5] : memref<?xf32>
%25 = arith.addf %arg6, %24 : f32
scf.yield %25 : f32
}
scf.yield %23 : f32
}
scf.yield %17 : f32
}
memref.store %10, %4[] : memref<f32>
%11 = memref.load %4[] : memref<f32>
return %11 : f32
}
}
====================================
Input to convert-linalg-to-loops
====================================
module {
func private @sparseValuesF32(!llvm.ptr<i8>) -> memref<?xf32> attributes {llvm.emit_c_interface}
func private @sparsePointers64(!llvm.ptr<i8>, index) -> memref<?xi64> attributes {llvm.emit_c_interface}
func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%c2 = arith.constant 2 : index
%cst = arith.constant 0.000000e+00 : f32
%0 = call @sparsePointers64(%arg0, %c0) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%1 = call @sparsePointers64(%arg0, %c1) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%2 = call @sparsePointers64(%arg0, %c2) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%3 = call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> memref<?xf32>
%4 = memref.alloc() : memref<f32>
linalg.fill(%cst, %4) : f32, memref<f32>
%5 = memref.load %4[] : memref<f32>
%6 = memref.load %0[%c0] : memref<?xi64>
%7 = arith.index_cast %6 : i64 to index
%8 = memref.load %0[%c1] : memref<?xi64>
%9 = arith.index_cast %8 : i64 to index
%10 = scf.for %arg1 = %7 to %9 step %c1 iter_args(%arg2 = %5) -> (f32) {
%12 = memref.load %1[%arg1] : memref<?xi64>
%13 = arith.index_cast %12 : i64 to index
%14 = arith.addi %arg1, %c1 : index
%15 = memref.load %1[%14] : memref<?xi64>
%16 = arith.index_cast %15 : i64 to index
%17 = scf.for %arg3 = %13 to %16 step %c1 iter_args(%arg4 = %arg2) -> (f32) {
%18 = memref.load %2[%arg3] : memref<?xi64>
%19 = arith.index_cast %18 : i64 to index
%20 = arith.addi %arg3, %c1 : index
%21 = memref.load %2[%20] : memref<?xi64>
%22 = arith.index_cast %21 : i64 to index
%23 = scf.for %arg5 = %19 to %22 step %c1 iter_args(%arg6 = %arg4) -> (f32) {
%24 = memref.load %3[%arg5] : memref<?xf32>
%25 = arith.addf %arg6, %24 : f32
scf.yield %25 : f32
}
scf.yield %23 : f32
}
scf.yield %17 : f32
}
memref.store %10, %4[] : memref<f32>
%11 = memref.load %4[] : memref<f32>
return %11 : f32
}
}
=================================
Input to finalizing-bufferize
=================================
module {
func private @sparseValuesF32(!llvm.ptr<i8>) -> memref<?xf32> attributes {llvm.emit_c_interface}
func private @sparsePointers64(!llvm.ptr<i8>, index) -> memref<?xi64> attributes {llvm.emit_c_interface}
func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%c2 = arith.constant 2 : index
%cst = arith.constant 0.000000e+00 : f32
%0 = call @sparsePointers64(%arg0, %c0) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%1 = call @sparsePointers64(%arg0, %c1) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%2 = call @sparsePointers64(%arg0, %c2) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%3 = call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> memref<?xf32>
%4 = memref.alloc() : memref<f32>
linalg.fill(%cst, %4) : f32, memref<f32>
%5 = memref.load %4[] : memref<f32>
%6 = memref.load %0[%c0] : memref<?xi64>
%7 = arith.index_cast %6 : i64 to index
%8 = memref.load %0[%c1] : memref<?xi64>
%9 = arith.index_cast %8 : i64 to index
%10 = scf.for %arg1 = %7 to %9 step %c1 iter_args(%arg2 = %5) -> (f32) {
%12 = memref.load %1[%arg1] : memref<?xi64>
%13 = arith.index_cast %12 : i64 to index
%14 = arith.addi %arg1, %c1 : index
%15 = memref.load %1[%14] : memref<?xi64>
%16 = arith.index_cast %15 : i64 to index
%17 = scf.for %arg3 = %13 to %16 step %c1 iter_args(%arg4 = %arg2) -> (f32) {
%18 = memref.load %2[%arg3] : memref<?xi64>
%19 = arith.index_cast %18 : i64 to index
%20 = arith.addi %arg3, %c1 : index
%21 = memref.load %2[%20] : memref<?xi64>
%22 = arith.index_cast %21 : i64 to index
%23 = scf.for %arg5 = %19 to %22 step %c1 iter_args(%arg6 = %arg4) -> (f32) {
%24 = memref.load %3[%arg5] : memref<?xf32>
%25 = arith.addf %arg6, %24 : f32
scf.yield %25 : f32
}
scf.yield %23 : f32
}
scf.yield %17 : f32
}
memref.store %10, %4[] : memref<f32>
%11 = memref.load %4[] : memref<f32>
return %11 : f32
}
}
=============================
Input to tensor-bufferize
=============================
module {
func private @sparseValuesF32(!llvm.ptr<i8>) -> memref<?xf32> attributes {llvm.emit_c_interface}
func private @sparsePointers64(!llvm.ptr<i8>, index) -> memref<?xi64> attributes {llvm.emit_c_interface}
func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%c2 = arith.constant 2 : index
%cst = arith.constant 0.000000e+00 : f32
%0 = call @sparsePointers64(%arg0, %c0) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%1 = call @sparsePointers64(%arg0, %c1) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%2 = call @sparsePointers64(%arg0, %c2) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%3 = call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> memref<?xf32>
%4 = memref.alloc() : memref<f32>
linalg.fill(%cst, %4) : f32, memref<f32>
%5 = memref.load %4[] : memref<f32>
%6 = memref.load %0[%c0] : memref<?xi64>
%7 = arith.index_cast %6 : i64 to index
%8 = memref.load %0[%c1] : memref<?xi64>
%9 = arith.index_cast %8 : i64 to index
%10 = scf.for %arg1 = %7 to %9 step %c1 iter_args(%arg2 = %5) -> (f32) {
%13 = memref.load %1[%arg1] : memref<?xi64>
%14 = arith.index_cast %13 : i64 to index
%15 = arith.addi %arg1, %c1 : index
%16 = memref.load %1[%15] : memref<?xi64>
%17 = arith.index_cast %16 : i64 to index
%18 = scf.for %arg3 = %14 to %17 step %c1 iter_args(%arg4 = %arg2) -> (f32) {
%19 = memref.load %2[%arg3] : memref<?xi64>
%20 = arith.index_cast %19 : i64 to index
%21 = arith.addi %arg3, %c1 : index
%22 = memref.load %2[%21] : memref<?xi64>
%23 = arith.index_cast %22 : i64 to index
%24 = scf.for %arg5 = %20 to %23 step %c1 iter_args(%arg6 = %arg4) -> (f32) {
%25 = memref.load %3[%arg5] : memref<?xf32>
%26 = arith.addf %arg6, %25 : f32
scf.yield %26 : f32
}
scf.yield %24 : f32
}
scf.yield %18 : f32
}
memref.store %10, %4[] : memref<f32>
%11 = bufferization.to_tensor %4 : memref<f32>
%12 = tensor.extract %11[] : tensor<f32>
return %12 : f32
}
}
===========================
Input to func-bufferize
===========================
module {
func private @sparseValuesF32(!llvm.ptr<i8>) -> memref<?xf32> attributes {llvm.emit_c_interface}
func private @sparsePointers64(!llvm.ptr<i8>, index) -> memref<?xi64> attributes {llvm.emit_c_interface}
func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%c2 = arith.constant 2 : index
%cst = arith.constant 0.000000e+00 : f32
%0 = call @sparsePointers64(%arg0, %c0) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%1 = call @sparsePointers64(%arg0, %c1) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%2 = call @sparsePointers64(%arg0, %c2) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%3 = call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> memref<?xf32>
%4 = memref.alloc() : memref<f32>
linalg.fill(%cst, %4) : f32, memref<f32>
%5 = memref.load %4[] : memref<f32>
%6 = memref.load %0[%c0] : memref<?xi64>
%7 = arith.index_cast %6 : i64 to index
%8 = memref.load %0[%c1] : memref<?xi64>
%9 = arith.index_cast %8 : i64 to index
%10 = scf.for %arg1 = %7 to %9 step %c1 iter_args(%arg2 = %5) -> (f32) {
%13 = memref.load %1[%arg1] : memref<?xi64>
%14 = arith.index_cast %13 : i64 to index
%15 = arith.addi %arg1, %c1 : index
%16 = memref.load %1[%15] : memref<?xi64>
%17 = arith.index_cast %16 : i64 to index
%18 = scf.for %arg3 = %14 to %17 step %c1 iter_args(%arg4 = %arg2) -> (f32) {
%19 = memref.load %2[%arg3] : memref<?xi64>
%20 = arith.index_cast %19 : i64 to index
%21 = arith.addi %arg3, %c1 : index
%22 = memref.load %2[%21] : memref<?xi64>
%23 = arith.index_cast %22 : i64 to index
%24 = scf.for %arg5 = %20 to %23 step %c1 iter_args(%arg6 = %arg4) -> (f32) {
%25 = memref.load %3[%arg5] : memref<?xf32>
%26 = arith.addf %arg6, %25 : f32
scf.yield %26 : f32
}
scf.yield %24 : f32
}
scf.yield %18 : f32
}
memref.store %10, %4[] : memref<f32>
%11 = bufferization.to_tensor %4 : memref<f32>
%12 = tensor.extract %11[] : tensor<f32>
return %12 : f32
}
}
============================
Input to arith-bufferize
============================
module {
func private @sparseValuesF32(!llvm.ptr<i8>) -> memref<?xf32> attributes {llvm.emit_c_interface}
func private @sparsePointers64(!llvm.ptr<i8>, index) -> memref<?xi64> attributes {llvm.emit_c_interface}
func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%c2 = arith.constant 2 : index
%cst = arith.constant 0.000000e+00 : f32
%0 = call @sparsePointers64(%arg0, %c0) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%1 = call @sparsePointers64(%arg0, %c1) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%2 = call @sparsePointers64(%arg0, %c2) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%3 = call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> memref<?xf32>
%4 = memref.alloc() : memref<f32>
linalg.fill(%cst, %4) : f32, memref<f32>
%5 = memref.load %4[] : memref<f32>
%6 = memref.load %0[%c0] : memref<?xi64>
%7 = arith.index_cast %6 : i64 to index
%8 = memref.load %0[%c1] : memref<?xi64>
%9 = arith.index_cast %8 : i64 to index
%10 = scf.for %arg1 = %7 to %9 step %c1 iter_args(%arg2 = %5) -> (f32) {
%13 = memref.load %1[%arg1] : memref<?xi64>
%14 = arith.index_cast %13 : i64 to index
%15 = arith.addi %arg1, %c1 : index
%16 = memref.load %1[%15] : memref<?xi64>
%17 = arith.index_cast %16 : i64 to index
%18 = scf.for %arg3 = %14 to %17 step %c1 iter_args(%arg4 = %arg2) -> (f32) {
%19 = memref.load %2[%arg3] : memref<?xi64>
%20 = arith.index_cast %19 : i64 to index
%21 = arith.addi %arg3, %c1 : index
%22 = memref.load %2[%21] : memref<?xi64>
%23 = arith.index_cast %22 : i64 to index
%24 = scf.for %arg5 = %20 to %23 step %c1 iter_args(%arg6 = %arg4) -> (f32) {
%25 = memref.load %3[%arg5] : memref<?xf32>
%26 = arith.addf %arg6, %25 : f32
scf.yield %26 : f32
}
scf.yield %24 : f32
}
scf.yield %18 : f32
}
memref.store %10, %4[] : memref<f32>
%11 = bufferization.to_tensor %4 : memref<f32>
%12 = tensor.extract %11[] : tensor<f32>
return %12 : f32
}
}
=============================
Input to linalg-bufferize
=============================
module {
func private @sparseValuesF32(!llvm.ptr<i8>) -> memref<?xf32> attributes {llvm.emit_c_interface}
func private @sparsePointers64(!llvm.ptr<i8>, index) -> memref<?xi64> attributes {llvm.emit_c_interface}
func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%c2 = arith.constant 2 : index
%cst = arith.constant 0.000000e+00 : f32
%0 = call @sparsePointers64(%arg0, %c0) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%1 = call @sparsePointers64(%arg0, %c1) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%2 = call @sparsePointers64(%arg0, %c2) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%3 = call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> memref<?xf32>
%4 = memref.alloc() : memref<f32>
linalg.fill(%cst, %4) : f32, memref<f32>
%5 = memref.load %4[] : memref<f32>
%6 = memref.load %0[%c0] : memref<?xi64>
%7 = arith.index_cast %6 : i64 to index
%8 = memref.load %0[%c1] : memref<?xi64>
%9 = arith.index_cast %8 : i64 to index
%10 = scf.for %arg1 = %7 to %9 step %c1 iter_args(%arg2 = %5) -> (f32) {
%13 = memref.load %1[%arg1] : memref<?xi64>
%14 = arith.index_cast %13 : i64 to index
%15 = arith.addi %arg1, %c1 : index
%16 = memref.load %1[%15] : memref<?xi64>
%17 = arith.index_cast %16 : i64 to index
%18 = scf.for %arg3 = %14 to %17 step %c1 iter_args(%arg4 = %arg2) -> (f32) {
%19 = memref.load %2[%arg3] : memref<?xi64>
%20 = arith.index_cast %19 : i64 to index
%21 = arith.addi %arg3, %c1 : index
%22 = memref.load %2[%21] : memref<?xi64>
%23 = arith.index_cast %22 : i64 to index
%24 = scf.for %arg5 = %20 to %23 step %c1 iter_args(%arg6 = %arg4) -> (f32) {
%25 = memref.load %3[%arg5] : memref<?xf32>
%26 = arith.addf %arg6, %25 : f32
scf.yield %26 : f32
}
scf.yield %24 : f32
}
scf.yield %18 : f32
}
memref.store %10, %4[] : memref<f32>
%11 = bufferization.to_tensor %4 : memref<f32>
%12 = tensor.extract %11[] : tensor<f32>
return %12 : f32
}
}
=====================================
Input to sparse-tensor-conversion
=====================================
module {
func @func_f32(%arg0: tensor<10x20x30xf32, #sparse_tensor.encoding<{ dimLevelType = [ "compressed", "compressed", "compressed" ], dimOrdering = affine_map<(d0, d1, d2) -> (d0, d1, d2)>, pointerBitWidth = 64, indexBitWidth = 64 }>>) -> f32 {
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%c2 = arith.constant 2 : index
%cst = arith.constant 0.000000e+00 : f32
%0 = sparse_tensor.pointers %arg0, %c0 : tensor<10x20x30xf32, #sparse_tensor.encoding<{ dimLevelType = [ "compressed", "compressed", "compressed" ], dimOrdering = affine_map<(d0, d1, d2) -> (d0, d1, d2)>, pointerBitWidth = 64, indexBitWidth = 64 }>> to memref<?xi64>
%1 = sparse_tensor.pointers %arg0, %c1 : tensor<10x20x30xf32, #sparse_tensor.encoding<{ dimLevelType = [ "compressed", "compressed", "compressed" ], dimOrdering = affine_map<(d0, d1, d2) -> (d0, d1, d2)>, pointerBitWidth = 64, indexBitWidth = 64 }>> to memref<?xi64>
%2 = sparse_tensor.pointers %arg0, %c2 : tensor<10x20x30xf32, #sparse_tensor.encoding<{ dimLevelType = [ "compressed", "compressed", "compressed" ], dimOrdering = affine_map<(d0, d1, d2) -> (d0, d1, d2)>, pointerBitWidth = 64, indexBitWidth = 64 }>> to memref<?xi64>
%3 = sparse_tensor.values %arg0 : tensor<10x20x30xf32, #sparse_tensor.encoding<{ dimLevelType = [ "compressed", "compressed", "compressed" ], dimOrdering = affine_map<(d0, d1, d2) -> (d0, d1, d2)>, pointerBitWidth = 64, indexBitWidth = 64 }>> to memref<?xf32>
%4 = memref.alloc() : memref<f32>
linalg.fill(%cst, %4) : f32, memref<f32>
%5 = memref.load %4[] : memref<f32>
%6 = memref.load %0[%c0] : memref<?xi64>
%7 = arith.index_cast %6 : i64 to index
%8 = memref.load %0[%c1] : memref<?xi64>
%9 = arith.index_cast %8 : i64 to index
%10 = scf.for %arg1 = %7 to %9 step %c1 iter_args(%arg2 = %5) -> (f32) {
%13 = memref.load %1[%arg1] : memref<?xi64>
%14 = arith.index_cast %13 : i64 to index
%15 = arith.addi %arg1, %c1 : index
%16 = memref.load %1[%15] : memref<?xi64>
%17 = arith.index_cast %16 : i64 to index
%18 = scf.for %arg3 = %14 to %17 step %c1 iter_args(%arg4 = %arg2) -> (f32) {
%19 = memref.load %2[%arg3] : memref<?xi64>
%20 = arith.index_cast %19 : i64 to index
%21 = arith.addi %arg3, %c1 : index
%22 = memref.load %2[%21] : memref<?xi64>
%23 = arith.index_cast %22 : i64 to index
%24 = scf.for %arg5 = %20 to %23 step %c1 iter_args(%arg6 = %arg4) -> (f32) {
%25 = memref.load %3[%arg5] : memref<?xf32>
%26 = arith.addf %arg6, %25 : f32
scf.yield %26 : f32
}
scf.yield %24 : f32
}
scf.yield %18 : f32
}
memref.store %10, %4[] : memref<f32>
%11 = bufferization.to_tensor %4 : memref<f32>
%12 = tensor.extract %11[] : tensor<f32>
return %12 : f32
}
}
===========================
Input to sparsification
===========================
#trait_sum_reduction = {
indexing_maps = [
affine_map<(i,j,k) -> (i,j,k)>, // A
affine_map<(i,j,k) -> ()> // x (scalar out)
],
iterator_types = ["reduction", "reduction", "reduction"],
doc = "x += SUM_ijk A(i,j,k)"
}
#sparseTensor = #sparse_tensor.encoding<{
dimLevelType = [ "compressed", "compressed", "compressed" ],
dimOrdering = affine_map<(i,j,k) -> (i,j,k)>,
pointerBitWidth = 64,
indexBitWidth = 64
}>
func @func_f32(%argA: tensor<10x20x30xf32, #sparseTensor>) -> f32 {
%out_tensor = linalg.init_tensor [] : tensor<f32>
%reduction = linalg.generic #trait_sum_reduction
ins(%argA: tensor<10x20x30xf32, #sparseTensor>)
outs(%out_tensor: tensor<f32>) {
^bb(%a: f32, %x: f32):
%0 = arith.addf %x, %a : f32
linalg.yield %0 : f32
} -> tensor<f32>
%answer = tensor.extract %reduction[] : tensor<f32>
return %answer : f32
}
This large output may seem intimidating due to its size, but it is mostly large because it shows the input to each pass.
We know that the error happens when the builtin.unrealized_conversion_cast
operation occurs.
We can see from the output above that it happens during the convert-std-to-llvm
pass.
It’s likely that there’s something problematic in the input to that pass, so it’s worth looking into the IR that was given to the convert-std-to-llvm
pass, which we can see under the section labelled “Input to convert-std-to-llvm”. We’ll show a short snippet of it below.
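As a sketch of how one might automate this kind of search, the helper below (hypothetical, not part of mlir_graphblas) scans a debug dump for the “Input to <pass>” banners shown above and reports which pass inputs mention a given operation name:

```python
# Hypothetical helper (not part of mlir_graphblas): list the passes whose
# printed input IR contains a given operation name.
def passes_with_op(debug_output: str, op_name: str) -> list:
    matches = []
    current_pass = None
    for line in debug_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Input to "):
            # A banner line names the pass whose input follows.
            current_pass = stripped[len("Input to "):]
        elif current_pass is not None and op_name in line:
            # Record each pass at most once.
            if current_pass not in matches:
                matches.append(current_pass)
    return matches
```

Running something like this over the dump with an op name such as builtin.unrealized_conversion_cast would narrow attention to the pass inputs that still carry unrealized casts, which is how we zeroed in on convert-std-to-llvm above.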
# Convert the DebugResult to text and split it into lines.
result_string = str(result)
lines = result_string.splitlines()
# Start at the banner line just above the "Input to convert-std-to-llvm" label.
lines = lines[lines.index(" Input to convert-std-to-llvm ")-1:]
# Stop at the first blank line to keep the snippet short.
lines = lines[:lines.index("")]
print("\n".join(lines))
================================
Input to convert-std-to-llvm
================================
module {
llvm.func @malloc(i64) -> !llvm.ptr<i8>
llvm.func @sparseValuesF32(%arg0: !llvm.ptr<i8>) -> !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> attributes {llvm.emit_c_interface, sym_visibility = "private"} {
%0 = llvm.mlir.constant(1 : index) : i64
%1 = llvm.alloca %0 x !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> : (i64) -> !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>
llvm.call @_mlir_ciface_sparseValuesF32(%1, %arg0) : (!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>) -> ()
%2 = llvm.load %1 : !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>
llvm.return %2 : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
}
llvm.func @_mlir_ciface_sparseValuesF32(!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>) attributes {llvm.emit_c_interface, sym_visibility = "private"}
llvm.func @sparsePointers64(%arg0: !llvm.ptr<i8>, %arg1: i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> attributes {llvm.emit_c_interface, sym_visibility = "private"} {
%0 = llvm.mlir.constant(1 : index) : i64
%1 = llvm.alloca %0 x !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> : (i64) -> !llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>
llvm.call @_mlir_ciface_sparsePointers64(%1, %arg0, %arg1) : (!llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>, i64) -> ()
%2 = llvm.load %1 : !llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>
llvm.return %2 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
}
llvm.func @_mlir_ciface_sparsePointers64(!llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>, i64) attributes {llvm.emit_c_interface, sym_visibility = "private"}
llvm.func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
%0 = llvm.mlir.constant(0 : index) : i64
%1 = builtin.unrealized_conversion_cast %0 : i64 to index
%2 = builtin.unrealized_conversion_cast %1 : index to i64
%3 = llvm.mlir.constant(1 : index) : i64
%4 = builtin.unrealized_conversion_cast %3 : i64 to index
%5 = builtin.unrealized_conversion_cast %4 : index to i64
%6 = llvm.mlir.constant(2 : index) : i64
%7 = llvm.mlir.constant(0.000000e+00 : f32) : f32
%8 = llvm.call @sparsePointers64(%arg0, %0) : (!llvm.ptr<i8>, i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%9 = builtin.unrealized_conversion_cast %8 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xi64>
%10 = builtin.unrealized_conversion_cast %9 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%11 = llvm.call @sparsePointers64(%arg0, %3) : (!llvm.ptr<i8>, i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%12 = builtin.unrealized_conversion_cast %11 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xi64>
%13 = builtin.unrealized_conversion_cast %12 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%14 = llvm.call @sparsePointers64(%arg0, %6) : (!llvm.ptr<i8>, i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%15 = builtin.unrealized_conversion_cast %14 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xi64>
%16 = builtin.unrealized_conversion_cast %15 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%17 = llvm.call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
%18 = builtin.unrealized_conversion_cast %17 : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xf32>
%19 = builtin.unrealized_conversion_cast %18 : memref<?xf32> to !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
%20 = llvm.mlir.constant(1 : index) : i64
%21 = llvm.mlir.null : !llvm.ptr<f32>
%22 = llvm.getelementptr %21[%20] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>
%23 = llvm.ptrtoint %22 : !llvm.ptr<f32> to i64
%24 = llvm.call @malloc(%23) : (i64) -> !llvm.ptr<i8>
%25 = llvm.bitcast %24 : !llvm.ptr<i8> to !llvm.ptr<f32>
%26 = llvm.mlir.undef : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
%27 = llvm.insertvalue %25, %26[0] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
%28 = llvm.insertvalue %25, %27[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
%29 = llvm.mlir.constant(0 : index) : i64
%30 = llvm.insertvalue %29, %28[2] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
%31 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
llvm.store %7, %31 : !llvm.ptr<f32>
%32 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
%33 = llvm.load %32 : !llvm.ptr<f32>
%34 = llvm.extractvalue %10[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%35 = llvm.getelementptr %34[%2] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%36 = llvm.load %35 : !llvm.ptr<i64>
%37 = builtin.unrealized_conversion_cast %36 : i64 to index
%38 = llvm.extractvalue %10[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%39 = llvm.getelementptr %38[%5] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%40 = llvm.load %39 : !llvm.ptr<i64>
%41 = builtin.unrealized_conversion_cast %40 : i64 to index
%42 = scf.for %arg1 = %37 to %41 step %4 iter_args(%arg2 = %33) -> (f32) {
%46 = builtin.unrealized_conversion_cast %arg1 : index to i64
%47 = builtin.unrealized_conversion_cast %arg1 : index to i64
%48 = llvm.extractvalue %13[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%49 = llvm.getelementptr %48[%47] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%50 = llvm.load %49 : !llvm.ptr<i64>
%51 = builtin.unrealized_conversion_cast %50 : i64 to index
%52 = llvm.add %46, %3 : i64
%53 = builtin.unrealized_conversion_cast %52 : i64 to index
%54 = builtin.unrealized_conversion_cast %53 : index to i64
%55 = llvm.extractvalue %13[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%56 = llvm.getelementptr %55[%54] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%57 = llvm.load %56 : !llvm.ptr<i64>
%58 = builtin.unrealized_conversion_cast %57 : i64 to index
%59 = scf.for %arg3 = %51 to %58 step %4 iter_args(%arg4 = %arg2) -> (f32) {
%60 = builtin.unrealized_conversion_cast %arg3 : index to i64
%61 = builtin.unrealized_conversion_cast %arg3 : index to i64
%62 = llvm.extractvalue %16[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%63 = llvm.getelementptr %62[%61] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%64 = llvm.load %63 : !llvm.ptr<i64>
%65 = builtin.unrealized_conversion_cast %64 : i64 to index
%66 = llvm.add %60, %3 : i64
%67 = builtin.unrealized_conversion_cast %66 : i64 to index
%68 = builtin.unrealized_conversion_cast %67 : index to i64
%69 = llvm.extractvalue %16[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
%70 = llvm.getelementptr %69[%68] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%71 = llvm.load %70 : !llvm.ptr<i64>
%72 = builtin.unrealized_conversion_cast %71 : i64 to index
%73 = scf.for %arg5 = %65 to %72 step %4 iter_args(%arg6 = %arg4) -> (f32) {
%74 = builtin.unrealized_conversion_cast %arg5 : index to i64
%75 = llvm.extractvalue %19[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
%76 = llvm.getelementptr %75[%74] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>
%77 = llvm.load %76 : !llvm.ptr<f32>
%78 = llvm.fadd %arg6, %77 : f32
scf.yield %78 : f32
}
scf.yield %73 : f32
}
scf.yield %59 : f32
}
%43 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
llvm.store %42, %43 : !llvm.ptr<f32>
%44 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
%45 = llvm.load %44 : !llvm.ptr<f32>
llvm.return %45 : f32
}
}
While this is a good idea in general, it doesn’t seem to be useful here. When MLIR applies a pass, that pass is applied until quiescence, i.e. it keeps applying the pass until nothing changes (or until some limit on the number of applications is reached).
It seems that the convert-std-to-llvm
pass has already been applied a few times since we see several ops from the LLVM dialect already present in the IR shown under the Input to convert-std-to-llvm
section (for example, we see llvm.mlir.constant
).
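This kind of observation can be confirmed with a rough heuristic (a sketch, not part of the original notebook): count op mnemonics in the IR text by their dialect prefix. The regex and the sample IR below are illustrative assumptions.

```python
import re
from collections import Counter

def dialect_counts(ir_text):
    """Rough heuristic: count `dialect.op` mnemonics by their dialect prefix."""
    return Counter(re.findall(r"\b([a-z_]+)\.[a-z_]+", ir_text))

sample = """
%0 = llvm.mlir.constant(1 : index) : i64
%1 = arith.addi %0, %0 : i64
scf.yield %1 : i64
"""
print(dialect_counts(sample))
```

Running this over the "Input to convert-std-to-llvm" section would show how many `llvm.*` ops are already present before the pass runs.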
Another good place to look is in the output of the last pass right before we get our error. Let’s look at the result of the convert-math-to-llvm
pass.
lines = result_string.splitlines()
lines = lines[lines.index(" Input to convert-math-to-llvm ")-1:]
lines = lines[:lines.index("")]
print("\n".join(lines))
=================================
Input to convert-math-to-llvm
=================================
module {
func private @sparseValuesF32(!llvm.ptr<i8>) -> memref<?xf32> attributes {llvm.emit_c_interface}
func private @sparsePointers64(!llvm.ptr<i8>, index) -> memref<?xi64> attributes {llvm.emit_c_interface}
func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%c2 = arith.constant 2 : index
%cst = arith.constant 0.000000e+00 : f32
%0 = call @sparsePointers64(%arg0, %c0) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%1 = call @sparsePointers64(%arg0, %c1) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%2 = call @sparsePointers64(%arg0, %c2) : (!llvm.ptr<i8>, index) -> memref<?xi64>
%3 = call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> memref<?xf32>
%4 = memref.alloc() : memref<f32>
memref.store %cst, %4[] : memref<f32>
%5 = memref.load %4[] : memref<f32>
%6 = memref.load %0[%c0] : memref<?xi64>
%7 = arith.index_cast %6 : i64 to index
%8 = memref.load %0[%c1] : memref<?xi64>
%9 = arith.index_cast %8 : i64 to index
%10 = scf.for %arg1 = %7 to %9 step %c1 iter_args(%arg2 = %5) -> (f32) {
%12 = memref.load %1[%arg1] : memref<?xi64>
%13 = arith.index_cast %12 : i64 to index
%14 = arith.addi %arg1, %c1 : index
%15 = memref.load %1[%14] : memref<?xi64>
%16 = arith.index_cast %15 : i64 to index
%17 = scf.for %arg3 = %13 to %16 step %c1 iter_args(%arg4 = %arg2) -> (f32) {
%18 = memref.load %2[%arg3] : memref<?xi64>
%19 = arith.index_cast %18 : i64 to index
%20 = arith.addi %arg3, %c1 : index
%21 = memref.load %2[%20] : memref<?xi64>
%22 = arith.index_cast %21 : i64 to index
%23 = scf.for %arg5 = %19 to %22 step %c1 iter_args(%arg6 = %arg4) -> (f32) {
%24 = memref.load %3[%arg5] : memref<?xf32>
%25 = arith.addf %arg6, %24 : f32
scf.yield %25 : f32
}
scf.yield %23 : f32
}
scf.yield %17 : f32
}
memref.store %10, %4[] : memref<f32>
%11 = memref.load %4[] : memref<f32>
return %11 : f32
}
}
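The section slicing in the cell above can be wrapped in a small reusable helper. This is a sketch; it assumes the banner format shown above, where the section name appears padded with single spaces between two `===` rules, and sections are separated by blank lines.

```python
def extract_section(result_string, section_name):
    """Return one debug section (banner included) from a DebugResult string.

    Sketch: assumes the ``=== / name / ===`` banner format shown above,
    with the section name padded by one space on each side.
    """
    lines = result_string.splitlines()
    start = lines.index(f" {section_name} ") - 1  # back up to the opening === rule
    lines = lines[start:]
    end = lines.index("")  # sections are separated by a blank line
    return "\n".join(lines[:end])

# Illustrative input mimicking the banner format:
sample = "junk\n=====\n X \n=====\nbody\n\nrest"
print(extract_section(sample, "X"))
```

Calling `extract_section(result_string, "Input to convert-math-to-llvm")` would reproduce the slicing done inline above.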
We see that the ops are mostly from the standard, llvm, and builtin dialects. However, there are also some ops from the scf
dialect. It would make sense for the convert-std-to-llvm
pass to handle ops from the builtin dialect, and also ops from the llvm dialect since that's the target dialect. It's unclear whether the convert-std-to-llvm
pass can handle ops from the scf
dialect. Given the name of the
convert-std-to-llvm
pass, we can infer that it mostly handles ops from the std
dialect and cannot handle ops from the scf
dialect. Let's see if there are any passes that can convert ops out of the scf
dialect.
!mlir-opt --help | grep "scf"
Available Dialects: acc, affine, amx, arith, arm_neon, arm_sve, async, bufferization, builtin, cf, complex, dlti, emitc, gpu, linalg, llvm, math, memref, nvvm, omp, pdl, pdl_interp, quant, rocdl, scf, shape, sparse_tensor, spv, std, tensor, test, tosa, vector, x86vector
--async-parallel-for - Convert scf.parallel operations to multiple async compute ops executed concurrently for non-overlapping iteration ranges
--convert-linalg-tiled-loops-to-scf - Lower linalg tiled loops to SCF loops and parallel loops
--convert-openacc-to-scf - Convert the OpenACC ops to OpenACC with SCF dialect
--convert-parallel-loops-to-gpu - Convert mapped scf.parallel ops to gpu launch operations
--convert-scf-to-cf - Convert SCF dialect to ControlFlow dialect, replacing structured control flow with a CFG
--convert-scf-to-openmp - Convert SCF parallel loop to OpenMP parallel + workshare constructs.
--convert-scf-to-spirv - Convert SCF dialect to SPIR-V dialect.
--convert-vector-to-scf - Lower the operations from the vector dialect into the SCF dialect
--scf-bufferize - Bufferize the scf dialect.
--scf-for-loop-canonicalization - Canonicalize operations within scf.for loop bodies
--scf-for-loop-peeling - Peel `for` loops at their upper bounds.
--scf-for-loop-range-folding - Fold add/mul ops into loop range
--scf-for-loop-specialization - Specialize `for` loops for vectorization
--scf-for-to-while - Convert SCF for loops to SCF while loops
--scf-parallel-loop-collapsing - Collapse parallel loops to use less induction variables
--scf-parallel-loop-fusion - Fuse adjacent parallel loops
--scf-parallel-loop-specialization - Specialize parallel loops for vectorization
--scf-parallel-loop-tiling - Tile parallel loops
--test-scf-for-utils - test scf.for utils
--test-scf-if-utils - test scf.if utils
--test-scf-pipelining - test scf.forOp pipelining
--test-vector-transfer-full-partial-split - Test lowering patterns to split transfer ops via scf.if + linalg ops
--tosa-to-scf - Lower TOSA to the SCF dialect
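If shelling out to `grep` is inconvenient, the same filtering can be done in Python. This is a sketch operating on help text as a plain string; the sample help text below is an illustrative assumption (in practice the string would come from running `mlir-opt --help`, e.g. via `subprocess`).

```python
def find_passes(help_text, keyword):
    """Return pass flags from `mlir-opt --help`-style text mentioning keyword."""
    flags = []
    for line in help_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("--") and keyword in stripped:
            flags.append(stripped.split()[0])
    return flags

sample_help = """
  --convert-scf-to-cf    - Convert SCF dialect to ControlFlow dialect
  --scf-bufferize        - Bufferize the scf dialect.
  --convert-math-to-llvm - Convert Math dialect to LLVM
"""
print(find_passes(sample_help, "scf"))
```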
The pass convert-scf-to-cf
seems promising as it converts the scf
dialect to the cf
dialect.
Let's see if running the convert-scf-to-cf
pass before the conversion passes will get rid of our exception.
passes = [
"--sparsification",
"--sparse-tensor-conversion",
"--linalg-bufferize",
"--arith-bufferize",
"--func-bufferize",
"--tensor-bufferize",
"--finalizing-bufferize",
"--convert-scf-to-cf", # newly added
"--convert-linalg-to-loops",
"--convert-vector-to-llvm",
"--convert-math-to-llvm",
"--convert-math-to-libm",
"--convert-memref-to-llvm",
"--convert-openmp-to-llvm",
"--convert-arith-to-llvm",
"--convert-std-to-llvm",
"--reconcile-unrealized-casts"
]
result = cli.apply_passes(mlir_bytes, passes)
print(result[:1500])
module attributes {llvm.data_layout = ""} {
llvm.func @malloc(i64) -> !llvm.ptr<i8>
llvm.func @sparseValuesF32(%arg0: !llvm.ptr<i8>) -> !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> attributes {llvm.emit_c_interface, sym_visibility = "private"} {
%0 = llvm.mlir.constant(1 : index) : i64
%1 = llvm.alloca %0 x !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> : (i64) -> !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>
llvm.call @_mlir_ciface_sparseValuesF32(%1, %arg0) : (!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>) -> ()
%2 = llvm.load %1 : !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>
llvm.return %2 : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
}
llvm.func @_mlir_ciface_sparseValuesF32(!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>) attributes {llvm.emit_c_interface, sym_visibility = "private"}
llvm.func @sparsePointers64(%arg0: !llvm.ptr<i8>, %arg1: i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> attributes {llvm.emit_c_interface, sym_visibility = "private"} {
%0 = llvm.mlir.constant(1 : index) : i64
%1 = llvm.alloca %0 x !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> : (i64) -> !llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>
It looks like it fixed our issue!
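As a final sanity check (a sketch, not from the original notebook), we can verify that no scf ops survive in the lowered IR. The `has_dialect` helper below is a hypothetical name; it assumes the lowered IR is available as a plain string, like the `result` printed above.

```python
def has_dialect(ir_text, dialect):
    """True if any op mnemonic in the IR text belongs to the given dialect."""
    prefix = dialect + "."
    return any(tok.startswith(prefix) for tok in ir_text.split())

lowered = "llvm.return %45 : f32"
print(has_dialect(lowered, "scf"))   # False: no scf ops remain
print(has_dialect(lowered, "llvm"))  # True
```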