Using DebugResult

Here we show how to use DebugResult to debug problems we might encounter when using our mlir-opt CLI wrapper.

Let’s first import the necessary class and create an instance of our mlir-opt CLI wrapper.

from mlir_graphblas import MlirOptCli

cli = MlirOptCli(executable=None, options=None)
Using development graphblas-opt: /Users/pnguyen/code/mlir-graphblas/mlir_graphblas/src/build/bin/graphblas-opt

Generate Example Input

Let’s say we have a bunch of MLIR code that we’re not familiar with.

mlir_string = """
#trait_sum_reduction = {
  indexing_maps = [
    affine_map<(i,j,k) -> (i,j,k)>,  // A
    affine_map<(i,j,k) -> ()>        // x (scalar out)
  ],
  iterator_types = ["reduction", "reduction", "reduction"],
  doc = "x += SUM_ijk A(i,j,k)"
}

#sparseTensor = #sparse_tensor.encoding<{
  dimLevelType = [ "compressed", "compressed", "compressed" ],
  dimOrdering = affine_map<(i,j,k) -> (i,j,k)>,
  pointerBitWidth = 64,
  indexBitWidth = 64
}>

func @func_f32(%argA: tensor<10x20x30xf32, #sparseTensor>) -> f32 {
  %out_tensor = linalg.init_tensor [] : tensor<f32>
  %reduction = linalg.generic #trait_sum_reduction
     ins(%argA: tensor<10x20x30xf32, #sparseTensor>)
    outs(%out_tensor: tensor<f32>) {
      ^bb(%a: f32, %x: f32):
        %0 = arith.addf %x, %a : f32
        linalg.yield %0 : f32
  } -> tensor<f32>
  %answer = tensor.extract %reduction[] : tensor<f32>
  return %answer : f32
}
"""
mlir_bytes = mlir_string.encode()

Since we’re not familiar with this code, we don’t know exactly which passes are necessary or in what order they should run.

Let’s say that this is the first set of passes we try.

passes = [

Let’s see what results we get.

result = cli.apply_passes(mlir_bytes, passes)
[stderr] <stdin>:20:16: error: failed to legalize operation 'builtin.unrealized_conversion_cast' that was explicitly marked illegal
[stderr]   %reduction = linalg.generic #trait_sum_reduction
[stderr]                ^
[stderr] <stdin>:20:16: note: see current operation: %4 = "builtin.unrealized_conversion_cast"(%3) : (i64) -> index
MlirOptError                              Traceback (most recent call last)
Input In [4], in <cell line: 1>()
----> 1 result = cli.apply_passes(mlir_bytes, passes)

File ~/code/mlir-graphblas/mlir_graphblas/, in MlirOptCli.apply_passes(self, file, passes)
     91         input = self._read_input(fp)
     92 err.debug_result = self.debug_passes(input, passes) if passes else None
---> 93 raise err

MlirOptError: <stdin>:20:16: error: failed to legalize operation 'builtin.unrealized_conversion_cast' that was explicitly marked illegal
  %reduction = linalg.generic #trait_sum_reduction

We get an exception.

Unfortunately, the exception message isn’t very informative: it gives only the immediate error and says nothing about the context in which it occurred, e.g. which pass failed (if any) or whether any necessary passes are missing.
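The one piece of context the message does give us is a location, <stdin>:20:16, i.e. line 20, column 16 of the text we passed in. As a minimal sketch (the locate helper, the DIAGNOSTIC pattern and the tiny stand-in input are ours and are not part of mlir_graphblas), we can map such a location back to the offending source line:

```python
import re

# Matches one-line diagnostics of the form "<stdin>:LINE:COL: error: MESSAGE".
DIAGNOSTIC = re.compile(
    r"^(?P<file>[^:]+):(?P<line>\d+):(?P<col>\d+): "
    r"(?P<kind>error|note|warning): (?P<msg>.*)$"
)


def locate(diagnostic, source):
    """Return (line_number, column, source_line) for a one-line diagnostic."""
    match = DIAGNOSTIC.match(diagnostic)
    if match is None:
        raise ValueError(f"unrecognized diagnostic: {diagnostic!r}")
    line_no = int(match["line"])
    col = int(match["col"])
    # Diagnostic line numbers are 1-based.
    return line_no, col, source.splitlines()[line_no - 1]


# A short stand-in for real MLIR text, only to keep the sketch runnable.
example_source = "func @f() {\n  %0 = foo.bar\n  return\n}"
example_diagnostic = "<stdin>:2:8: error: failed to legalize operation 'foo.bar'"

line_no, col, text = locate(example_diagnostic, example_source)
print(f"line {line_no}, column {col}: {text.strip()}")
```

Calling locate with the real message and mlir_string should point at the linalg.generic line, which matches the snippet echoed in the traceback. It still doesn’t tell us which pass failed.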

We only know that the operation builtin.unrealized_conversion_cast shows up somewhere and that it’s a problem.
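One way to recover that context by hand would be to apply the passes one at a time and keep the input to each, stopping at the first failure. Here is a minimal sketch of the idea; first_failing_pass, run_pass and the toy pass names are hypothetical and only illustrate the approach, not how mlir_graphblas implements anything.

```python
def first_failing_pass(source, passes, run_pass):
    """Apply passes one at a time.

    run_pass(text, pass_name) is a hypothetical callable that returns the
    transformed text or raises an exception.
    Returns (failing_pass, input_to_that_pass, error), or None if all succeed.
    """
    current = source
    for name in passes:
        try:
            current = run_pass(current, name)
        except Exception as err:
            # `current` still holds the input that the failing pass received.
            return name, current, err
    return None


def toy_run_pass(text, name):
    """Toy stand-in for a real pass runner, only to make the sketch runnable."""
    if name == "--reject-casts" and "cast" in text:
        raise RuntimeError("failed to legalize 'cast'")
    if name == "--lower":
        return text.replace("add", "cast add")
    return text


outcome = first_failing_pass("add", ["--lower", "--reject-casts"], toy_run_pass)
failing_pass, failing_input, error = outcome
print(failing_pass)
print(failing_input)
print(error)
```

The returned tuple tells us which pass failed and what its input looked like, which is exactly the context the plain exception message lacks.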

Let’s use the debug_passes method instead of apply_passes to get more information.

result = cli.debug_passes(mlir_bytes, passes)


  Error when running reconcile-unrealized-casts
<stdin>:24:10: error: failed to legalize operation 'builtin.unrealized_conversion_cast' that was explicitly marked illegal
    %4 = builtin.unrealized_conversion_cast %3 : i64 to index
<stdin>:24:10: note: see current operation: %4 = "builtin.unrealized_conversion_cast"(%3) : (i64) -> index loc("<stdin>":24:10)

  Input to reconcile-unrealized-casts
             10        20        30        40        50        60        70        80        90        100       110       120       130       140       150       160       170       180       190       200
  1|module attributes {llvm.data_layout = ""} {
  2|  llvm.func @malloc(i64) -> !llvm.ptr<i8>
  3|  llvm.func @sparseValuesF32(%arg0: !llvm.ptr<i8>) -> !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> attributes {llvm.emit_c_interface, sym_visibility = "private"} {
  4|    %0 = llvm.mlir.constant(1 : index) : i64
  5|    %1 = llvm.alloca %0 x !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> : (i64) -> !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>
  6|    llvm.call @_mlir_ciface_sparseValuesF32(%1, %arg0) : (!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>) -> ()
  7|    %2 = llvm.load %1 : !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>
  8|    llvm.return %2 : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
  9|  }
 10|  llvm.func @_mlir_ciface_sparseValuesF32(!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>) attributes {llvm.emit_c_interface, sym_visibility = "private"}
 11|  llvm.func @sparsePointers64(%arg0: !llvm.ptr<i8>, %arg1: i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> attributes {llvm.emit_c_interface, sym_visibility = "private"} {
 12|    %0 = llvm.mlir.constant(1 : index) : i64
 13|    %1 = llvm.alloca %0 x !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> : (i64) -> !llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>
 14|    llvm.call @_mlir_ciface_sparsePointers64(%1, %arg0, %arg1) : (!llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>, i64) -> ()
 15|    %2 = llvm.load %1 : !llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>
 16|    llvm.return %2 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
 17|  }
 18|  llvm.func @_mlir_ciface_sparsePointers64(!llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>, i64) attributes {llvm.emit_c_interface, sym_visibility = "private"}
 19|  llvm.func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
 20|    %0 = llvm.mlir.constant(0 : index) : i64
 21|    %1 = builtin.unrealized_conversion_cast %0 : i64 to index
 22|    %2 = builtin.unrealized_conversion_cast %1 : index to i64
 23|    %3 = llvm.mlir.constant(1 : index) : i64
 24|    %4 = builtin.unrealized_conversion_cast %3 : i64 to index
 25|    %5 = builtin.unrealized_conversion_cast %4 : index to i64
 26|    %6 = llvm.mlir.constant(2 : index) : i64
 27|    %7 = llvm.mlir.constant(0.000000e+00 : f32) : f32
 28|    %8 = llvm.call @sparsePointers64(%arg0, %0) : (!llvm.ptr<i8>, i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
 29|    %9 = builtin.unrealized_conversion_cast %8 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xi64>
 30|    %10 = builtin.unrealized_conversion_cast %9 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
 31|    %11 = llvm.call @sparsePointers64(%arg0, %3) : (!llvm.ptr<i8>, i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
 32|    %12 = builtin.unrealized_conversion_cast %11 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xi64>
 33|    %13 = builtin.unrealized_conversion_cast %12 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
 34|    %14 = llvm.call @sparsePointers64(%arg0, %6) : (!llvm.ptr<i8>, i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
 35|    %15 = builtin.unrealized_conversion_cast %14 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xi64>
 36|    %16 = builtin.unrealized_conversion_cast %15 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
 37|    %17 = llvm.call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
 38|    %18 = builtin.unrealized_conversion_cast %17 : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xf32>
 39|    %19 = builtin.unrealized_conversion_cast %18 : memref<?xf32> to !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
 40|    %20 = llvm.mlir.constant(1 : index) : i64
 41|    %21 = llvm.mlir.null : !llvm.ptr<f32>
 42|    %22 = llvm.getelementptr %21[%20] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>
 43|    %23 = llvm.ptrtoint %22 : !llvm.ptr<f32> to i64
 44|    %24 = llvm.call @malloc(%23) : (i64) -> !llvm.ptr<i8>
 45|    %25 = llvm.bitcast %24 : !llvm.ptr<i8> to !llvm.ptr<f32>
 46|    %26 = llvm.mlir.undef : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
 47|    %27 = llvm.insertvalue %25, %26[0] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
 48|    %28 = llvm.insertvalue %25, %27[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
 49|    %29 = llvm.mlir.constant(0 : index) : i64
 50|    %30 = llvm.insertvalue %29, %28[2] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
 51|    %31 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
 52|    llvm.store %7, %31 : !llvm.ptr<f32>
 53|    %32 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
 54|    %33 = llvm.load %32 : !llvm.ptr<f32>
 55|    %34 = llvm.extractvalue %10[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
 56|    %35 = llvm.getelementptr %34[%2] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
 57|    %36 = llvm.load %35 : !llvm.ptr<i64>
 58|    %37 = builtin.unrealized_conversion_cast %36 : i64 to index
 59|    %38 = llvm.extractvalue %10[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
 60|    %39 = llvm.getelementptr %38[%5] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
 61|    %40 = llvm.load %39 : !llvm.ptr<i64>
 62|    %41 = builtin.unrealized_conversion_cast %40 : i64 to index
 63|    %42 = scf.for %arg1 = %37 to %41 step %4 iter_args(%arg2 = %33) -> (f32) {
 64|      %46 = builtin.unrealized_conversion_cast %arg1 : index to i64
 65|      %47 = builtin.unrealized_conversion_cast %arg1 : index to i64
 66|      %48 = llvm.extractvalue %13[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
 67|      %49 = llvm.getelementptr %48[%47] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
 68|      %50 = llvm.load %49 : !llvm.ptr<i64>
 69|      %51 = builtin.unrealized_conversion_cast %50 : i64 to index
 70|      %52 = llvm.add %46, %3  : i64
 71|      %53 = builtin.unrealized_conversion_cast %52 : i64 to index
 72|      %54 = builtin.unrealized_conversion_cast %53 : index to i64
 73|      %55 = llvm.extractvalue %13[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
 74|      %56 = llvm.getelementptr %55[%54] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
 75|      %57 = llvm.load %56 : !llvm.ptr<i64>
 76|      %58 = builtin.unrealized_conversion_cast %57 : i64 to index
 77|      %59 = scf.for %arg3 = %51 to %58 step %4 iter_args(%arg4 = %arg2) -> (f32) {
 78|        %60 = builtin.unrealized_conversion_cast %arg3 : index to i64
 79|        %61 = builtin.unrealized_conversion_cast %arg3 : index to i64
 80|        %62 = llvm.extractvalue %16[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
 81|        %63 = llvm.getelementptr %62[%61] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
 82|        %64 = llvm.load %63 : !llvm.ptr<i64>
 83|        %65 = builtin.unrealized_conversion_cast %64 : i64 to index
 84|        %66 = llvm.add %60, %3  : i64
 85|        %67 = builtin.unrealized_conversion_cast %66 : i64 to index
 86|        %68 = builtin.unrealized_conversion_cast %67 : index to i64
 87|        %69 = llvm.extractvalue %16[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
 88|        %70 = llvm.getelementptr %69[%68] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
 89|        %71 = llvm.load %70 : !llvm.ptr<i64>
 90|        %72 = builtin.unrealized_conversion_cast %71 : i64 to index
 91|        %73 = scf.for %arg5 = %65 to %72 step %4 iter_args(%arg6 = %arg4) -> (f32) {
 92|          %74 = builtin.unrealized_conversion_cast %arg5 : index to i64
 93|          %75 = llvm.extractvalue %19[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
 94|          %76 = llvm.getelementptr %75[%74] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>
 95|          %77 = llvm.load %76 : !llvm.ptr<f32>
 96|          %78 = llvm.fadd %arg6, %77  : f32
 97|          scf.yield %78 : f32
 98|        }
 99|        scf.yield %73 : f32
100|      }
101|      scf.yield %59 : f32
102|    }
103|    %43 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
104|    llvm.store %42, %43 : !llvm.ptr<f32>
105|    %44 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
106|    %45 = llvm.load %44 : !llvm.ptr<f32>
107|    llvm.return %45 : f32
108|  }
109|}

  Input to convert-std-to-llvm
module {
  llvm.func @malloc(i64) -> !llvm.ptr<i8>
  llvm.func @sparseValuesF32(%arg0: !llvm.ptr<i8>) -> !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> attributes {llvm.emit_c_interface, sym_visibility = "private"} {
    %0 = llvm.mlir.constant(1 : index) : i64
    %1 = llvm.alloca %0 x !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> : (i64) -> !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>
    llvm.call @_mlir_ciface_sparseValuesF32(%1, %arg0) : (!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>) -> ()
    %2 = llvm.load %1 : !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>
    llvm.return %2 : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
  }
  llvm.func @_mlir_ciface_sparseValuesF32(!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>) attributes {llvm.emit_c_interface, sym_visibility = "private"}
  llvm.func @sparsePointers64(%arg0: !llvm.ptr<i8>, %arg1: i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> attributes {llvm.emit_c_interface, sym_visibility = "private"} {
    %0 = llvm.mlir.constant(1 : index) : i64
    %1 = llvm.alloca %0 x !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> : (i64) -> !llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>
    llvm.call @_mlir_ciface_sparsePointers64(%1, %arg0, %arg1) : (!llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>, i64) -> ()
    %2 = llvm.load %1 : !llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>
    llvm.return %2 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
  }
  llvm.func @_mlir_ciface_sparsePointers64(!llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>, i64) attributes {llvm.emit_c_interface, sym_visibility = "private"}
  llvm.func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
    %0 = llvm.mlir.constant(0 : index) : i64
    %1 = builtin.unrealized_conversion_cast %0 : i64 to index
    %2 = builtin.unrealized_conversion_cast %1 : index to i64
    %3 = llvm.mlir.constant(1 : index) : i64
    %4 = builtin.unrealized_conversion_cast %3 : i64 to index
    %5 = builtin.unrealized_conversion_cast %4 : index to i64
    %6 = llvm.mlir.constant(2 : index) : i64
    %7 = llvm.mlir.constant(0.000000e+00 : f32) : f32
    %8 = llvm.call @sparsePointers64(%arg0, %0) : (!llvm.ptr<i8>, i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %9 = builtin.unrealized_conversion_cast %8 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xi64>
    %10 = builtin.unrealized_conversion_cast %9 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %11 = llvm.call @sparsePointers64(%arg0, %3) : (!llvm.ptr<i8>, i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %12 = builtin.unrealized_conversion_cast %11 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xi64>
    %13 = builtin.unrealized_conversion_cast %12 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %14 = llvm.call @sparsePointers64(%arg0, %6) : (!llvm.ptr<i8>, i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %15 = builtin.unrealized_conversion_cast %14 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xi64>
    %16 = builtin.unrealized_conversion_cast %15 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %17 = llvm.call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
    %18 = builtin.unrealized_conversion_cast %17 : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xf32>
    %19 = builtin.unrealized_conversion_cast %18 : memref<?xf32> to !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
    %20 = llvm.mlir.constant(1 : index) : i64
    %21 = llvm.mlir.null : !llvm.ptr<f32>
    %22 = llvm.getelementptr %21[%20] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>
    %23 = llvm.ptrtoint %22 : !llvm.ptr<f32> to i64
    %24 = llvm.call @malloc(%23) : (i64) -> !llvm.ptr<i8>
    %25 = llvm.bitcast %24 : !llvm.ptr<i8> to !llvm.ptr<f32>
    %26 = llvm.mlir.undef : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    %27 = llvm.insertvalue %25, %26[0] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    %28 = llvm.insertvalue %25, %27[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    %29 = llvm.mlir.constant(0 : index) : i64
    %30 = llvm.insertvalue %29, %28[2] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    %31 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    llvm.store %7, %31 : !llvm.ptr<f32>
    %32 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    %33 = llvm.load %32 : !llvm.ptr<f32>
    %34 = llvm.extractvalue %10[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %35 = llvm.getelementptr %34[%2] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
    %36 = llvm.load %35 : !llvm.ptr<i64>
    %37 = builtin.unrealized_conversion_cast %36 : i64 to index
    %38 = llvm.extractvalue %10[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %39 = llvm.getelementptr %38[%5] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
    %40 = llvm.load %39 : !llvm.ptr<i64>
    %41 = builtin.unrealized_conversion_cast %40 : i64 to index
    %42 = scf.for %arg1 = %37 to %41 step %4 iter_args(%arg2 = %33) -> (f32) {
      %46 = builtin.unrealized_conversion_cast %arg1 : index to i64
      %47 = builtin.unrealized_conversion_cast %arg1 : index to i64
      %48 = llvm.extractvalue %13[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
      %49 = llvm.getelementptr %48[%47] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
      %50 = llvm.load %49 : !llvm.ptr<i64>
      %51 = builtin.unrealized_conversion_cast %50 : i64 to index
      %52 = llvm.add %46, %3  : i64
      %53 = builtin.unrealized_conversion_cast %52 : i64 to index
      %54 = builtin.unrealized_conversion_cast %53 : index to i64
      %55 = llvm.extractvalue %13[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
      %56 = llvm.getelementptr %55[%54] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
      %57 = llvm.load %56 : !llvm.ptr<i64>
      %58 = builtin.unrealized_conversion_cast %57 : i64 to index
      %59 = scf.for %arg3 = %51 to %58 step %4 iter_args(%arg4 = %arg2) -> (f32) {
        %60 = builtin.unrealized_conversion_cast %arg3 : index to i64
        %61 = builtin.unrealized_conversion_cast %arg3 : index to i64
        %62 = llvm.extractvalue %16[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
        %63 = llvm.getelementptr %62[%61] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
        %64 = llvm.load %63 : !llvm.ptr<i64>
        %65 = builtin.unrealized_conversion_cast %64 : i64 to index
        %66 = llvm.add %60, %3  : i64
        %67 = builtin.unrealized_conversion_cast %66 : i64 to index
        %68 = builtin.unrealized_conversion_cast %67 : index to i64
        %69 = llvm.extractvalue %16[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
        %70 = llvm.getelementptr %69[%68] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
        %71 = llvm.load %70 : !llvm.ptr<i64>
        %72 = builtin.unrealized_conversion_cast %71 : i64 to index
        %73 = scf.for %arg5 = %65 to %72 step %4 iter_args(%arg6 = %arg4) -> (f32) {
          %74 = builtin.unrealized_conversion_cast %arg5 : index to i64
          %75 = llvm.extractvalue %19[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
          %76 = llvm.getelementptr %75[%74] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>
          %77 = llvm.load %76 : !llvm.ptr<f32>
          %78 = llvm.fadd %arg6, %77  : f32
          scf.yield %78 : f32
        }
        scf.yield %73 : f32
      }
      scf.yield %59 : f32
    }
    %43 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    llvm.store %42, %43 : !llvm.ptr<f32>
    %44 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    %45 = llvm.load %44 : !llvm.ptr<f32>
    llvm.return %45 : f32
  }
}

  Input to convert-arith-to-llvm
module {
  llvm.func @malloc(i64) -> !llvm.ptr<i8>
  llvm.func @sparseValuesF32(%arg0: !llvm.ptr<i8>) -> !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> attributes {llvm.emit_c_interface, sym_visibility = "private"} {
    %0 = llvm.mlir.constant(1 : index) : i64
    %1 = llvm.alloca %0 x !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> : (i64) -> !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>
    llvm.call @_mlir_ciface_sparseValuesF32(%1, %arg0) : (!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>) -> ()
    %2 = llvm.load %1 : !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>
    llvm.return %2 : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
  }
  llvm.func @_mlir_ciface_sparseValuesF32(!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>) attributes {llvm.emit_c_interface, sym_visibility = "private"}
  llvm.func @sparsePointers64(%arg0: !llvm.ptr<i8>, %arg1: i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> attributes {llvm.emit_c_interface, sym_visibility = "private"} {
    %0 = llvm.mlir.constant(1 : index) : i64
    %1 = llvm.alloca %0 x !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> : (i64) -> !llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>
    llvm.call @_mlir_ciface_sparsePointers64(%1, %arg0, %arg1) : (!llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>, i64) -> ()
    %2 = llvm.load %1 : !llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>
    llvm.return %2 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
  }
  llvm.func @_mlir_ciface_sparsePointers64(!llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>, i64) attributes {llvm.emit_c_interface, sym_visibility = "private"}
  llvm.func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
    %0 = llvm.mlir.constant(0 : index) : i64
    %1 = builtin.unrealized_conversion_cast %0 : i64 to index
    %2 = builtin.unrealized_conversion_cast %1 : index to i64
    %3 = llvm.mlir.constant(1 : index) : i64
    %4 = builtin.unrealized_conversion_cast %3 : i64 to index
    %5 = builtin.unrealized_conversion_cast %4 : index to i64
    %6 = llvm.mlir.constant(2 : index) : i64
    %7 = llvm.mlir.constant(0.000000e+00 : f32) : f32
    %8 = llvm.call @sparsePointers64(%arg0, %0) : (!llvm.ptr<i8>, i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %9 = builtin.unrealized_conversion_cast %8 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xi64>
    %10 = builtin.unrealized_conversion_cast %9 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %11 = llvm.call @sparsePointers64(%arg0, %3) : (!llvm.ptr<i8>, i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %12 = builtin.unrealized_conversion_cast %11 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xi64>
    %13 = builtin.unrealized_conversion_cast %12 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %14 = llvm.call @sparsePointers64(%arg0, %6) : (!llvm.ptr<i8>, i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %15 = builtin.unrealized_conversion_cast %14 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xi64>
    %16 = builtin.unrealized_conversion_cast %15 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %17 = llvm.call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
    %18 = builtin.unrealized_conversion_cast %17 : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xf32>
    %19 = builtin.unrealized_conversion_cast %18 : memref<?xf32> to !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
    %20 = llvm.mlir.constant(1 : index) : i64
    %21 = llvm.mlir.null : !llvm.ptr<f32>
    %22 = llvm.getelementptr %21[%20] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>
    %23 = llvm.ptrtoint %22 : !llvm.ptr<f32> to i64
    %24 = llvm.call @malloc(%23) : (i64) -> !llvm.ptr<i8>
    %25 = llvm.bitcast %24 : !llvm.ptr<i8> to !llvm.ptr<f32>
    %26 = llvm.mlir.undef : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    %27 = llvm.insertvalue %25, %26[0] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    %28 = llvm.insertvalue %25, %27[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    %29 = llvm.mlir.constant(0 : index) : i64
    %30 = llvm.insertvalue %29, %28[2] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    %31 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    llvm.store %7, %31 : !llvm.ptr<f32>
    %32 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    %33 = llvm.load %32 : !llvm.ptr<f32>
    %34 = llvm.extractvalue %10[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %35 = llvm.getelementptr %34[%2] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
    %36 = llvm.load %35 : !llvm.ptr<i64>
    %37 = builtin.unrealized_conversion_cast %36 : i64 to index
    %38 = llvm.extractvalue %10[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %39 = llvm.getelementptr %38[%5] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
    %40 = llvm.load %39 : !llvm.ptr<i64>
    %41 = builtin.unrealized_conversion_cast %40 : i64 to index
    %42 = scf.for %arg1 = %37 to %41 step %4 iter_args(%arg2 = %33) -> (f32) {
      %46 = builtin.unrealized_conversion_cast %arg1 : index to i64
      %47 = builtin.unrealized_conversion_cast %arg1 : index to i64
      %48 = llvm.extractvalue %13[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
      %49 = llvm.getelementptr %48[%47] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
      %50 = llvm.load %49 : !llvm.ptr<i64>
      %51 = builtin.unrealized_conversion_cast %50 : i64 to index
      %52 = llvm.add %46, %3  : i64
      %53 = builtin.unrealized_conversion_cast %52 : i64 to index
      %54 = builtin.unrealized_conversion_cast %53 : index to i64
      %55 = llvm.extractvalue %13[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
      %56 = llvm.getelementptr %55[%54] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
      %57 = llvm.load %56 : !llvm.ptr<i64>
      %58 = builtin.unrealized_conversion_cast %57 : i64 to index
      %59 = scf.for %arg3 = %51 to %58 step %4 iter_args(%arg4 = %arg2) -> (f32) {
        %60 = builtin.unrealized_conversion_cast %arg3 : index to i64
        %61 = builtin.unrealized_conversion_cast %arg3 : index to i64
        %62 = llvm.extractvalue %16[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
        %63 = llvm.getelementptr %62[%61] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
        %64 = llvm.load %63 : !llvm.ptr<i64>
        %65 = builtin.unrealized_conversion_cast %64 : i64 to index
        %66 = llvm.add %60, %3  : i64
        %67 = builtin.unrealized_conversion_cast %66 : i64 to index
        %68 = builtin.unrealized_conversion_cast %67 : index to i64
        %69 = llvm.extractvalue %16[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
        %70 = llvm.getelementptr %69[%68] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
        %71 = llvm.load %70 : !llvm.ptr<i64>
        %72 = builtin.unrealized_conversion_cast %71 : i64 to index
        %73 = scf.for %arg5 = %65 to %72 step %4 iter_args(%arg6 = %arg4) -> (f32) {
          %74 = builtin.unrealized_conversion_cast %arg5 : index to i64
          %75 = llvm.extractvalue %19[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
          %76 = llvm.getelementptr %75[%74] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>
          %77 = llvm.load %76 : !llvm.ptr<f32>
          %78 = llvm.fadd %arg6, %77  : f32
          scf.yield %78 : f32
        }
        scf.yield %73 : f32
      }
      scf.yield %59 : f32
    }
    %43 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    llvm.store %42, %43 : !llvm.ptr<f32>
    %44 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    %45 = llvm.load %44 : !llvm.ptr<f32>
    llvm.return %45 : f32
  }
}

  Input to convert-openmp-to-llvm
module {
  llvm.func @malloc(i64) -> !llvm.ptr<i8>
  func private @sparseValuesF32(!llvm.ptr<i8>) -> memref<?xf32> attributes {llvm.emit_c_interface}
  func private @sparsePointers64(!llvm.ptr<i8>, index) -> memref<?xi64> attributes {llvm.emit_c_interface}
  func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
    %c0 = arith.constant 0 : index
    %0 = builtin.unrealized_conversion_cast %c0 : index to i64
    %c1 = arith.constant 1 : index
    %1 = builtin.unrealized_conversion_cast %c1 : index to i64
    %c2 = arith.constant 2 : index
    %cst = arith.constant 0.000000e+00 : f32
    %2 = call @sparsePointers64(%arg0, %c0) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %3 = builtin.unrealized_conversion_cast %2 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %4 = call @sparsePointers64(%arg0, %c1) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %5 = builtin.unrealized_conversion_cast %4 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %6 = call @sparsePointers64(%arg0, %c2) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %7 = builtin.unrealized_conversion_cast %6 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %8 = call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> memref<?xf32>
    %9 = builtin.unrealized_conversion_cast %8 : memref<?xf32> to !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
    %10 = llvm.mlir.constant(1 : index) : i64
    %11 = llvm.mlir.null : !llvm.ptr<f32>
    %12 = llvm.getelementptr %11[%10] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>
    %13 = llvm.ptrtoint %12 : !llvm.ptr<f32> to i64
    %14 = llvm.call @malloc(%13) : (i64) -> !llvm.ptr<i8>
    %15 = llvm.bitcast %14 : !llvm.ptr<i8> to !llvm.ptr<f32>
    %16 = llvm.mlir.undef : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    %17 = llvm.insertvalue %15, %16[0] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    %18 = llvm.insertvalue %15, %17[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    %19 = llvm.mlir.constant(0 : index) : i64
    %20 = llvm.insertvalue %19, %18[2] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    %21 = llvm.extractvalue %20[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    llvm.store %cst, %21 : !llvm.ptr<f32>
    %22 = llvm.extractvalue %20[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    %23 = llvm.load %22 : !llvm.ptr<f32>
    %24 = llvm.extractvalue %3[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %25 = llvm.getelementptr %24[%0] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
    %26 = llvm.load %25 : !llvm.ptr<i64>
    %27 = arith.index_cast %26 : i64 to index
    %28 = llvm.extractvalue %3[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %29 = llvm.getelementptr %28[%1] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
    %30 = llvm.load %29 : !llvm.ptr<i64>
    %31 = arith.index_cast %30 : i64 to index
    %32 = scf.for %arg1 = %27 to %31 step %c1 iter_args(%arg2 = %23) -> (f32) {
      %36 = builtin.unrealized_conversion_cast %arg1 : index to i64
      %37 = llvm.extractvalue %5[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
      %38 = llvm.getelementptr %37[%36] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
      %39 = llvm.load %38 : !llvm.ptr<i64>
      %40 = arith.index_cast %39 : i64 to index
      %41 = arith.addi %arg1, %c1 : index
      %42 = builtin.unrealized_conversion_cast %41 : index to i64
      %43 = llvm.extractvalue %5[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
      %44 = llvm.getelementptr %43[%42] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
      %45 = llvm.load %44 : !llvm.ptr<i64>
      %46 = arith.index_cast %45 : i64 to index
      %47 = scf.for %arg3 = %40 to %46 step %c1 iter_args(%arg4 = %arg2) -> (f32) {
        %48 = builtin.unrealized_conversion_cast %arg3 : index to i64
        %49 = llvm.extractvalue %7[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
        %50 = llvm.getelementptr %49[%48] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
        %51 = llvm.load %50 : !llvm.ptr<i64>
        %52 = arith.index_cast %51 : i64 to index
        %53 = arith.addi %arg3, %c1 : index
        %54 = builtin.unrealized_conversion_cast %53 : index to i64
        %55 = llvm.extractvalue %7[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
        %56 = llvm.getelementptr %55[%54] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
        %57 = llvm.load %56 : !llvm.ptr<i64>
        %58 = arith.index_cast %57 : i64 to index
        %59 = scf.for %arg5 = %52 to %58 step %c1 iter_args(%arg6 = %arg4) -> (f32) {
          %60 = builtin.unrealized_conversion_cast %arg5 : index to i64
          %61 = llvm.extractvalue %9[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
          %62 = llvm.getelementptr %61[%60] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>
          %63 = llvm.load %62 : !llvm.ptr<f32>
          %64 = arith.addf %arg6, %63 : f32
          scf.yield %64 : f32
        }
        scf.yield %59 : f32
      }
      scf.yield %47 : f32
    }
    %33 = llvm.extractvalue %20[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    llvm.store %32, %33 : !llvm.ptr<f32>
    %34 = llvm.extractvalue %20[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    %35 = llvm.load %34 : !llvm.ptr<f32>
    return %35 : f32
  }
}

  Input to convert-memref-to-llvm
module {
  func private @sparseValuesF32(!llvm.ptr<i8>) -> memref<?xf32> attributes {llvm.emit_c_interface}
  func private @sparsePointers64(!llvm.ptr<i8>, index) -> memref<?xi64> attributes {llvm.emit_c_interface}
  func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
    %c0 = arith.constant 0 : index
    %c1 = arith.constant 1 : index
    %c2 = arith.constant 2 : index
    %cst = arith.constant 0.000000e+00 : f32
    %0 = call @sparsePointers64(%arg0, %c0) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %1 = call @sparsePointers64(%arg0, %c1) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %2 = call @sparsePointers64(%arg0, %c2) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %3 = call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> memref<?xf32>
    %4 = memref.alloc() : memref<f32>
    memref.store %cst, %4[] : memref<f32>
    %5 = memref.load %4[] : memref<f32>
    %6 = memref.load %0[%c0] : memref<?xi64>
    %7 = arith.index_cast %6 : i64 to index
    %8 = memref.load %0[%c1] : memref<?xi64>
    %9 = arith.index_cast %8 : i64 to index
    %10 = scf.for %arg1 = %7 to %9 step %c1 iter_args(%arg2 = %5) -> (f32) {
      %12 = memref.load %1[%arg1] : memref<?xi64>
      %13 = arith.index_cast %12 : i64 to index
      %14 = arith.addi %arg1, %c1 : index
      %15 = memref.load %1[%14] : memref<?xi64>
      %16 = arith.index_cast %15 : i64 to index
      %17 = scf.for %arg3 = %13 to %16 step %c1 iter_args(%arg4 = %arg2) -> (f32) {
        %18 = memref.load %2[%arg3] : memref<?xi64>
        %19 = arith.index_cast %18 : i64 to index
        %20 = arith.addi %arg3, %c1 : index
        %21 = memref.load %2[%20] : memref<?xi64>
        %22 = arith.index_cast %21 : i64 to index
        %23 = scf.for %arg5 = %19 to %22 step %c1 iter_args(%arg6 = %arg4) -> (f32) {
          %24 = memref.load %3[%arg5] : memref<?xf32>
          %25 = arith.addf %arg6, %24 : f32
          scf.yield %25 : f32
        }
        scf.yield %23 : f32
      }
      scf.yield %17 : f32
    }
    memref.store %10, %4[] : memref<f32>
    %11 = memref.load %4[] : memref<f32>
    return %11 : f32
  }
}

  Input to convert-math-to-libm
module {
  func private @sparseValuesF32(!llvm.ptr<i8>) -> memref<?xf32> attributes {llvm.emit_c_interface}
  func private @sparsePointers64(!llvm.ptr<i8>, index) -> memref<?xi64> attributes {llvm.emit_c_interface}
  func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
    %c0 = arith.constant 0 : index
    %c1 = arith.constant 1 : index
    %c2 = arith.constant 2 : index
    %cst = arith.constant 0.000000e+00 : f32
    %0 = call @sparsePointers64(%arg0, %c0) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %1 = call @sparsePointers64(%arg0, %c1) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %2 = call @sparsePointers64(%arg0, %c2) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %3 = call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> memref<?xf32>
    %4 = memref.alloc() : memref<f32>
    memref.store %cst, %4[] : memref<f32>
    %5 = memref.load %4[] : memref<f32>
    %6 = memref.load %0[%c0] : memref<?xi64>
    %7 = arith.index_cast %6 : i64 to index
    %8 = memref.load %0[%c1] : memref<?xi64>
    %9 = arith.index_cast %8 : i64 to index
    %10 = scf.for %arg1 = %7 to %9 step %c1 iter_args(%arg2 = %5) -> (f32) {
      %12 = memref.load %1[%arg1] : memref<?xi64>
      %13 = arith.index_cast %12 : i64 to index
      %14 = arith.addi %arg1, %c1 : index
      %15 = memref.load %1[%14] : memref<?xi64>
      %16 = arith.index_cast %15 : i64 to index
      %17 = scf.for %arg3 = %13 to %16 step %c1 iter_args(%arg4 = %arg2) -> (f32) {
        %18 = memref.load %2[%arg3] : memref<?xi64>
        %19 = arith.index_cast %18 : i64 to index
        %20 = arith.addi %arg3, %c1 : index
        %21 = memref.load %2[%20] : memref<?xi64>
        %22 = arith.index_cast %21 : i64 to index
        %23 = scf.for %arg5 = %19 to %22 step %c1 iter_args(%arg6 = %arg4) -> (f32) {
          %24 = memref.load %3[%arg5] : memref<?xf32>
          %25 = arith.addf %arg6, %24 : f32
          scf.yield %25 : f32
        }
        scf.yield %23 : f32
      }
      scf.yield %17 : f32
    }
    memref.store %10, %4[] : memref<f32>
    %11 = memref.load %4[] : memref<f32>
    return %11 : f32
  }
}

  Input to convert-math-to-llvm
module {
  func private @sparseValuesF32(!llvm.ptr<i8>) -> memref<?xf32> attributes {llvm.emit_c_interface}
  func private @sparsePointers64(!llvm.ptr<i8>, index) -> memref<?xi64> attributes {llvm.emit_c_interface}
  func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
    %c0 = arith.constant 0 : index
    %c1 = arith.constant 1 : index
    %c2 = arith.constant 2 : index
    %cst = arith.constant 0.000000e+00 : f32
    %0 = call @sparsePointers64(%arg0, %c0) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %1 = call @sparsePointers64(%arg0, %c1) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %2 = call @sparsePointers64(%arg0, %c2) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %3 = call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> memref<?xf32>
    %4 = memref.alloc() : memref<f32>
    memref.store %cst, %4[] : memref<f32>
    %5 = memref.load %4[] : memref<f32>
    %6 = memref.load %0[%c0] : memref<?xi64>
    %7 = arith.index_cast %6 : i64 to index
    %8 = memref.load %0[%c1] : memref<?xi64>
    %9 = arith.index_cast %8 : i64 to index
    %10 = scf.for %arg1 = %7 to %9 step %c1 iter_args(%arg2 = %5) -> (f32) {
      %12 = memref.load %1[%arg1] : memref<?xi64>
      %13 = arith.index_cast %12 : i64 to index
      %14 = arith.addi %arg1, %c1 : index
      %15 = memref.load %1[%14] : memref<?xi64>
      %16 = arith.index_cast %15 : i64 to index
      %17 = scf.for %arg3 = %13 to %16 step %c1 iter_args(%arg4 = %arg2) -> (f32) {
        %18 = memref.load %2[%arg3] : memref<?xi64>
        %19 = arith.index_cast %18 : i64 to index
        %20 = arith.addi %arg3, %c1 : index
        %21 = memref.load %2[%20] : memref<?xi64>
        %22 = arith.index_cast %21 : i64 to index
        %23 = scf.for %arg5 = %19 to %22 step %c1 iter_args(%arg6 = %arg4) -> (f32) {
          %24 = memref.load %3[%arg5] : memref<?xf32>
          %25 = arith.addf %arg6, %24 : f32
          scf.yield %25 : f32
        }
        scf.yield %23 : f32
      }
      scf.yield %17 : f32
    }
    memref.store %10, %4[] : memref<f32>
    %11 = memref.load %4[] : memref<f32>
    return %11 : f32
  }
}

  Input to convert-vector-to-llvm
module {
  func private @sparseValuesF32(!llvm.ptr<i8>) -> memref<?xf32> attributes {llvm.emit_c_interface}
  func private @sparsePointers64(!llvm.ptr<i8>, index) -> memref<?xi64> attributes {llvm.emit_c_interface}
  func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
    %c0 = arith.constant 0 : index
    %c1 = arith.constant 1 : index
    %c2 = arith.constant 2 : index
    %cst = arith.constant 0.000000e+00 : f32
    %0 = call @sparsePointers64(%arg0, %c0) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %1 = call @sparsePointers64(%arg0, %c1) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %2 = call @sparsePointers64(%arg0, %c2) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %3 = call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> memref<?xf32>
    %4 = memref.alloc() : memref<f32>
    memref.store %cst, %4[] : memref<f32>
    %5 = memref.load %4[] : memref<f32>
    %6 = memref.load %0[%c0] : memref<?xi64>
    %7 = arith.index_cast %6 : i64 to index
    %8 = memref.load %0[%c1] : memref<?xi64>
    %9 = arith.index_cast %8 : i64 to index
    %10 = scf.for %arg1 = %7 to %9 step %c1 iter_args(%arg2 = %5) -> (f32) {
      %12 = memref.load %1[%arg1] : memref<?xi64>
      %13 = arith.index_cast %12 : i64 to index
      %14 = arith.addi %arg1, %c1 : index
      %15 = memref.load %1[%14] : memref<?xi64>
      %16 = arith.index_cast %15 : i64 to index
      %17 = scf.for %arg3 = %13 to %16 step %c1 iter_args(%arg4 = %arg2) -> (f32) {
        %18 = memref.load %2[%arg3] : memref<?xi64>
        %19 = arith.index_cast %18 : i64 to index
        %20 = arith.addi %arg3, %c1 : index
        %21 = memref.load %2[%20] : memref<?xi64>
        %22 = arith.index_cast %21 : i64 to index
        %23 = scf.for %arg5 = %19 to %22 step %c1 iter_args(%arg6 = %arg4) -> (f32) {
          %24 = memref.load %3[%arg5] : memref<?xf32>
          %25 = arith.addf %arg6, %24 : f32
          scf.yield %25 : f32
        }
        scf.yield %23 : f32
      }
      scf.yield %17 : f32
    }
    memref.store %10, %4[] : memref<f32>
    %11 = memref.load %4[] : memref<f32>
    return %11 : f32
  }
}

  Input to convert-linalg-to-loops
module {
  func private @sparseValuesF32(!llvm.ptr<i8>) -> memref<?xf32> attributes {llvm.emit_c_interface}
  func private @sparsePointers64(!llvm.ptr<i8>, index) -> memref<?xi64> attributes {llvm.emit_c_interface}
  func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
    %c0 = arith.constant 0 : index
    %c1 = arith.constant 1 : index
    %c2 = arith.constant 2 : index
    %cst = arith.constant 0.000000e+00 : f32
    %0 = call @sparsePointers64(%arg0, %c0) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %1 = call @sparsePointers64(%arg0, %c1) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %2 = call @sparsePointers64(%arg0, %c2) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %3 = call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> memref<?xf32>
    %4 = memref.alloc() : memref<f32>
    linalg.fill(%cst, %4) : f32, memref<f32>
    %5 = memref.load %4[] : memref<f32>
    %6 = memref.load %0[%c0] : memref<?xi64>
    %7 = arith.index_cast %6 : i64 to index
    %8 = memref.load %0[%c1] : memref<?xi64>
    %9 = arith.index_cast %8 : i64 to index
    %10 = scf.for %arg1 = %7 to %9 step %c1 iter_args(%arg2 = %5) -> (f32) {
      %12 = memref.load %1[%arg1] : memref<?xi64>
      %13 = arith.index_cast %12 : i64 to index
      %14 = arith.addi %arg1, %c1 : index
      %15 = memref.load %1[%14] : memref<?xi64>
      %16 = arith.index_cast %15 : i64 to index
      %17 = scf.for %arg3 = %13 to %16 step %c1 iter_args(%arg4 = %arg2) -> (f32) {
        %18 = memref.load %2[%arg3] : memref<?xi64>
        %19 = arith.index_cast %18 : i64 to index
        %20 = arith.addi %arg3, %c1 : index
        %21 = memref.load %2[%20] : memref<?xi64>
        %22 = arith.index_cast %21 : i64 to index
        %23 = scf.for %arg5 = %19 to %22 step %c1 iter_args(%arg6 = %arg4) -> (f32) {
          %24 = memref.load %3[%arg5] : memref<?xf32>
          %25 = arith.addf %arg6, %24 : f32
          scf.yield %25 : f32
        }
        scf.yield %23 : f32
      }
      scf.yield %17 : f32
    }
    memref.store %10, %4[] : memref<f32>
    %11 = memref.load %4[] : memref<f32>
    return %11 : f32
  }
}

  Input to finalizing-bufferize
module {
  func private @sparseValuesF32(!llvm.ptr<i8>) -> memref<?xf32> attributes {llvm.emit_c_interface}
  func private @sparsePointers64(!llvm.ptr<i8>, index) -> memref<?xi64> attributes {llvm.emit_c_interface}
  func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
    %c0 = arith.constant 0 : index
    %c1 = arith.constant 1 : index
    %c2 = arith.constant 2 : index
    %cst = arith.constant 0.000000e+00 : f32
    %0 = call @sparsePointers64(%arg0, %c0) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %1 = call @sparsePointers64(%arg0, %c1) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %2 = call @sparsePointers64(%arg0, %c2) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %3 = call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> memref<?xf32>
    %4 = memref.alloc() : memref<f32>
    linalg.fill(%cst, %4) : f32, memref<f32>
    %5 = memref.load %4[] : memref<f32>
    %6 = memref.load %0[%c0] : memref<?xi64>
    %7 = arith.index_cast %6 : i64 to index
    %8 = memref.load %0[%c1] : memref<?xi64>
    %9 = arith.index_cast %8 : i64 to index
    %10 = scf.for %arg1 = %7 to %9 step %c1 iter_args(%arg2 = %5) -> (f32) {
      %12 = memref.load %1[%arg1] : memref<?xi64>
      %13 = arith.index_cast %12 : i64 to index
      %14 = arith.addi %arg1, %c1 : index
      %15 = memref.load %1[%14] : memref<?xi64>
      %16 = arith.index_cast %15 : i64 to index
      %17 = scf.for %arg3 = %13 to %16 step %c1 iter_args(%arg4 = %arg2) -> (f32) {
        %18 = memref.load %2[%arg3] : memref<?xi64>
        %19 = arith.index_cast %18 : i64 to index
        %20 = arith.addi %arg3, %c1 : index
        %21 = memref.load %2[%20] : memref<?xi64>
        %22 = arith.index_cast %21 : i64 to index
        %23 = scf.for %arg5 = %19 to %22 step %c1 iter_args(%arg6 = %arg4) -> (f32) {
          %24 = memref.load %3[%arg5] : memref<?xf32>
          %25 = arith.addf %arg6, %24 : f32
          scf.yield %25 : f32
        }
        scf.yield %23 : f32
      }
      scf.yield %17 : f32
    }
    memref.store %10, %4[] : memref<f32>
    %11 = memref.load %4[] : memref<f32>
    return %11 : f32
  }
}

  Input to tensor-bufferize
module {
  func private @sparseValuesF32(!llvm.ptr<i8>) -> memref<?xf32> attributes {llvm.emit_c_interface}
  func private @sparsePointers64(!llvm.ptr<i8>, index) -> memref<?xi64> attributes {llvm.emit_c_interface}
  func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
    %c0 = arith.constant 0 : index
    %c1 = arith.constant 1 : index
    %c2 = arith.constant 2 : index
    %cst = arith.constant 0.000000e+00 : f32
    %0 = call @sparsePointers64(%arg0, %c0) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %1 = call @sparsePointers64(%arg0, %c1) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %2 = call @sparsePointers64(%arg0, %c2) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %3 = call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> memref<?xf32>
    %4 = memref.alloc() : memref<f32>
    linalg.fill(%cst, %4) : f32, memref<f32>
    %5 = memref.load %4[] : memref<f32>
    %6 = memref.load %0[%c0] : memref<?xi64>
    %7 = arith.index_cast %6 : i64 to index
    %8 = memref.load %0[%c1] : memref<?xi64>
    %9 = arith.index_cast %8 : i64 to index
    %10 = scf.for %arg1 = %7 to %9 step %c1 iter_args(%arg2 = %5) -> (f32) {
      %13 = memref.load %1[%arg1] : memref<?xi64>
      %14 = arith.index_cast %13 : i64 to index
      %15 = arith.addi %arg1, %c1 : index
      %16 = memref.load %1[%15] : memref<?xi64>
      %17 = arith.index_cast %16 : i64 to index
      %18 = scf.for %arg3 = %14 to %17 step %c1 iter_args(%arg4 = %arg2) -> (f32) {
        %19 = memref.load %2[%arg3] : memref<?xi64>
        %20 = arith.index_cast %19 : i64 to index
        %21 = arith.addi %arg3, %c1 : index
        %22 = memref.load %2[%21] : memref<?xi64>
        %23 = arith.index_cast %22 : i64 to index
        %24 = scf.for %arg5 = %20 to %23 step %c1 iter_args(%arg6 = %arg4) -> (f32) {
          %25 = memref.load %3[%arg5] : memref<?xf32>
          %26 = arith.addf %arg6, %25 : f32
          scf.yield %26 : f32
        }
        scf.yield %24 : f32
      }
      scf.yield %18 : f32
    }
    memref.store %10, %4[] : memref<f32>
    %11 = bufferization.to_tensor %4 : memref<f32>
    %12 = tensor.extract %11[] : tensor<f32>
    return %12 : f32
  }
}

  Input to func-bufferize
module {
  func private @sparseValuesF32(!llvm.ptr<i8>) -> memref<?xf32> attributes {llvm.emit_c_interface}
  func private @sparsePointers64(!llvm.ptr<i8>, index) -> memref<?xi64> attributes {llvm.emit_c_interface}
  func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
    %c0 = arith.constant 0 : index
    %c1 = arith.constant 1 : index
    %c2 = arith.constant 2 : index
    %cst = arith.constant 0.000000e+00 : f32
    %0 = call @sparsePointers64(%arg0, %c0) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %1 = call @sparsePointers64(%arg0, %c1) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %2 = call @sparsePointers64(%arg0, %c2) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %3 = call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> memref<?xf32>
    %4 = memref.alloc() : memref<f32>
    linalg.fill(%cst, %4) : f32, memref<f32>
    %5 = memref.load %4[] : memref<f32>
    %6 = memref.load %0[%c0] : memref<?xi64>
    %7 = arith.index_cast %6 : i64 to index
    %8 = memref.load %0[%c1] : memref<?xi64>
    %9 = arith.index_cast %8 : i64 to index
    %10 = scf.for %arg1 = %7 to %9 step %c1 iter_args(%arg2 = %5) -> (f32) {
      %13 = memref.load %1[%arg1] : memref<?xi64>
      %14 = arith.index_cast %13 : i64 to index
      %15 = arith.addi %arg1, %c1 : index
      %16 = memref.load %1[%15] : memref<?xi64>
      %17 = arith.index_cast %16 : i64 to index
      %18 = scf.for %arg3 = %14 to %17 step %c1 iter_args(%arg4 = %arg2) -> (f32) {
        %19 = memref.load %2[%arg3] : memref<?xi64>
        %20 = arith.index_cast %19 : i64 to index
        %21 = arith.addi %arg3, %c1 : index
        %22 = memref.load %2[%21] : memref<?xi64>
        %23 = arith.index_cast %22 : i64 to index
        %24 = scf.for %arg5 = %20 to %23 step %c1 iter_args(%arg6 = %arg4) -> (f32) {
          %25 = memref.load %3[%arg5] : memref<?xf32>
          %26 = arith.addf %arg6, %25 : f32
          scf.yield %26 : f32
        }
        scf.yield %24 : f32
      }
      scf.yield %18 : f32
    }
    memref.store %10, %4[] : memref<f32>
    %11 = bufferization.to_tensor %4 : memref<f32>
    %12 = tensor.extract %11[] : tensor<f32>
    return %12 : f32
  }
}

  Input to arith-bufferize
module {
  func private @sparseValuesF32(!llvm.ptr<i8>) -> memref<?xf32> attributes {llvm.emit_c_interface}
  func private @sparsePointers64(!llvm.ptr<i8>, index) -> memref<?xi64> attributes {llvm.emit_c_interface}
  func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
    %c0 = arith.constant 0 : index
    %c1 = arith.constant 1 : index
    %c2 = arith.constant 2 : index
    %cst = arith.constant 0.000000e+00 : f32
    %0 = call @sparsePointers64(%arg0, %c0) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %1 = call @sparsePointers64(%arg0, %c1) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %2 = call @sparsePointers64(%arg0, %c2) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %3 = call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> memref<?xf32>
    %4 = memref.alloc() : memref<f32>
    linalg.fill(%cst, %4) : f32, memref<f32>
    %5 = memref.load %4[] : memref<f32>
    %6 = memref.load %0[%c0] : memref<?xi64>
    %7 = arith.index_cast %6 : i64 to index
    %8 = memref.load %0[%c1] : memref<?xi64>
    %9 = arith.index_cast %8 : i64 to index
    %10 = scf.for %arg1 = %7 to %9 step %c1 iter_args(%arg2 = %5) -> (f32) {
      %13 = memref.load %1[%arg1] : memref<?xi64>
      %14 = arith.index_cast %13 : i64 to index
      %15 = arith.addi %arg1, %c1 : index
      %16 = memref.load %1[%15] : memref<?xi64>
      %17 = arith.index_cast %16 : i64 to index
      %18 = scf.for %arg3 = %14 to %17 step %c1 iter_args(%arg4 = %arg2) -> (f32) {
        %19 = memref.load %2[%arg3] : memref<?xi64>
        %20 = arith.index_cast %19 : i64 to index
        %21 = arith.addi %arg3, %c1 : index
        %22 = memref.load %2[%21] : memref<?xi64>
        %23 = arith.index_cast %22 : i64 to index
        %24 = scf.for %arg5 = %20 to %23 step %c1 iter_args(%arg6 = %arg4) -> (f32) {
          %25 = memref.load %3[%arg5] : memref<?xf32>
          %26 = arith.addf %arg6, %25 : f32
          scf.yield %26 : f32
        }
        scf.yield %24 : f32
      }
      scf.yield %18 : f32
    }
    memref.store %10, %4[] : memref<f32>
    %11 = bufferization.to_tensor %4 : memref<f32>
    %12 = tensor.extract %11[] : tensor<f32>
    return %12 : f32
  }
}

  Input to linalg-bufferize
module {
  func private @sparseValuesF32(!llvm.ptr<i8>) -> memref<?xf32> attributes {llvm.emit_c_interface}
  func private @sparsePointers64(!llvm.ptr<i8>, index) -> memref<?xi64> attributes {llvm.emit_c_interface}
  func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
    %c0 = arith.constant 0 : index
    %c1 = arith.constant 1 : index
    %c2 = arith.constant 2 : index
    %cst = arith.constant 0.000000e+00 : f32
    %0 = call @sparsePointers64(%arg0, %c0) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %1 = call @sparsePointers64(%arg0, %c1) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %2 = call @sparsePointers64(%arg0, %c2) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %3 = call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> memref<?xf32>
    %4 = memref.alloc() : memref<f32>
    linalg.fill(%cst, %4) : f32, memref<f32>
    %5 = memref.load %4[] : memref<f32>
    %6 = memref.load %0[%c0] : memref<?xi64>
    %7 = arith.index_cast %6 : i64 to index
    %8 = memref.load %0[%c1] : memref<?xi64>
    %9 = arith.index_cast %8 : i64 to index
    %10 = scf.for %arg1 = %7 to %9 step %c1 iter_args(%arg2 = %5) -> (f32) {
      %13 = memref.load %1[%arg1] : memref<?xi64>
      %14 = arith.index_cast %13 : i64 to index
      %15 = arith.addi %arg1, %c1 : index
      %16 = memref.load %1[%15] : memref<?xi64>
      %17 = arith.index_cast %16 : i64 to index
      %18 = scf.for %arg3 = %14 to %17 step %c1 iter_args(%arg4 = %arg2) -> (f32) {
        %19 = memref.load %2[%arg3] : memref<?xi64>
        %20 = arith.index_cast %19 : i64 to index
        %21 = arith.addi %arg3, %c1 : index
        %22 = memref.load %2[%21] : memref<?xi64>
        %23 = arith.index_cast %22 : i64 to index
        %24 = scf.for %arg5 = %20 to %23 step %c1 iter_args(%arg6 = %arg4) -> (f32) {
          %25 = memref.load %3[%arg5] : memref<?xf32>
          %26 = arith.addf %arg6, %25 : f32
          scf.yield %26 : f32
        }
        scf.yield %24 : f32
      }
      scf.yield %18 : f32
    }
    memref.store %10, %4[] : memref<f32>
    %11 = bufferization.to_tensor %4 : memref<f32>
    %12 = tensor.extract %11[] : tensor<f32>
    return %12 : f32
  }
}

  Input to sparse-tensor-conversion
module {
  func @func_f32(%arg0: tensor<10x20x30xf32, #sparse_tensor.encoding<{ dimLevelType = [ "compressed", "compressed", "compressed" ], dimOrdering = affine_map<(d0, d1, d2) -> (d0, d1, d2)>, pointerBitWidth = 64, indexBitWidth = 64 }>>) -> f32 {
    %c0 = arith.constant 0 : index
    %c1 = arith.constant 1 : index
    %c2 = arith.constant 2 : index
    %cst = arith.constant 0.000000e+00 : f32
    %0 = sparse_tensor.pointers %arg0, %c0 : tensor<10x20x30xf32, #sparse_tensor.encoding<{ dimLevelType = [ "compressed", "compressed", "compressed" ], dimOrdering = affine_map<(d0, d1, d2) -> (d0, d1, d2)>, pointerBitWidth = 64, indexBitWidth = 64 }>> to memref<?xi64>
    %1 = sparse_tensor.pointers %arg0, %c1 : tensor<10x20x30xf32, #sparse_tensor.encoding<{ dimLevelType = [ "compressed", "compressed", "compressed" ], dimOrdering = affine_map<(d0, d1, d2) -> (d0, d1, d2)>, pointerBitWidth = 64, indexBitWidth = 64 }>> to memref<?xi64>
    %2 = sparse_tensor.pointers %arg0, %c2 : tensor<10x20x30xf32, #sparse_tensor.encoding<{ dimLevelType = [ "compressed", "compressed", "compressed" ], dimOrdering = affine_map<(d0, d1, d2) -> (d0, d1, d2)>, pointerBitWidth = 64, indexBitWidth = 64 }>> to memref<?xi64>
    %3 = sparse_tensor.values %arg0 : tensor<10x20x30xf32, #sparse_tensor.encoding<{ dimLevelType = [ "compressed", "compressed", "compressed" ], dimOrdering = affine_map<(d0, d1, d2) -> (d0, d1, d2)>, pointerBitWidth = 64, indexBitWidth = 64 }>> to memref<?xf32>
    %4 = memref.alloc() : memref<f32>
    linalg.fill(%cst, %4) : f32, memref<f32>
    %5 = memref.load %4[] : memref<f32>
    %6 = memref.load %0[%c0] : memref<?xi64>
    %7 = arith.index_cast %6 : i64 to index
    %8 = memref.load %0[%c1] : memref<?xi64>
    %9 = arith.index_cast %8 : i64 to index
    %10 = scf.for %arg1 = %7 to %9 step %c1 iter_args(%arg2 = %5) -> (f32) {
      %13 = memref.load %1[%arg1] : memref<?xi64>
      %14 = arith.index_cast %13 : i64 to index
      %15 = arith.addi %arg1, %c1 : index
      %16 = memref.load %1[%15] : memref<?xi64>
      %17 = arith.index_cast %16 : i64 to index
      %18 = scf.for %arg3 = %14 to %17 step %c1 iter_args(%arg4 = %arg2) -> (f32) {
        %19 = memref.load %2[%arg3] : memref<?xi64>
        %20 = arith.index_cast %19 : i64 to index
        %21 = arith.addi %arg3, %c1 : index
        %22 = memref.load %2[%21] : memref<?xi64>
        %23 = arith.index_cast %22 : i64 to index
        %24 = scf.for %arg5 = %20 to %23 step %c1 iter_args(%arg6 = %arg4) -> (f32) {
          %25 = memref.load %3[%arg5] : memref<?xf32>
          %26 = arith.addf %arg6, %25 : f32
          scf.yield %26 : f32
        }
        scf.yield %24 : f32
      }
      scf.yield %18 : f32
    }
    memref.store %10, %4[] : memref<f32>
    %11 = bufferization.to_tensor %4 : memref<f32>
    %12 = tensor.extract %11[] : tensor<f32>
    return %12 : f32
  }
}

  Input to sparsification

#trait_sum_reduction = {
  indexing_maps = [
    affine_map<(i,j,k) -> (i,j,k)>,  // A
    affine_map<(i,j,k) -> ()>        // x (scalar out)
  ],
  iterator_types = ["reduction", "reduction", "reduction"],
  doc = "x += SUM_ijk A(i,j,k)"
}

#sparseTensor = #sparse_tensor.encoding<{
  dimLevelType = [ "compressed", "compressed", "compressed" ],
  dimOrdering = affine_map<(i,j,k) -> (i,j,k)>,
  pointerBitWidth = 64,
  indexBitWidth = 64
}>

func @func_f32(%argA: tensor<10x20x30xf32, #sparseTensor>) -> f32 {
  %out_tensor = linalg.init_tensor [] : tensor<f32>
  %reduction = linalg.generic #trait_sum_reduction
     ins(%argA: tensor<10x20x30xf32, #sparseTensor>)
    outs(%out_tensor: tensor<f32>) {
      ^bb(%a: f32, %x: f32):
        %0 = arith.addf %x, %a : f32
        linalg.yield %0 : f32
  } -> tensor<f32>
  %answer = tensor.extract %reduction[] : tensor<f32>
  return %answer : f32
}

This large output may seem intimidating due to its size, but it’s mostly large because it shows the input to each pass.
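
To get an overview before reading any IR, it can help to list the section headers and how many lines each pass’s input occupies. The following is a minimal sketch, assuming each section begins with a line of the form "Input to <pass-name>" as in the output above and that any banner lines consist only of "=" characters. The helper summarize_sections and the sample text are hypothetical and not part of mlir_graphblas.

```python
import re

# Matches a section header such as "  Input to convert-std-to-llvm  ".
HEADER = re.compile(r"^\s*Input to (\S+)\s*$")

def summarize_sections(debug_text):
    """Return (pass_name, line_count) pairs in the order the sections appear."""
    summary = []
    for line in debug_text.splitlines():
        match = HEADER.match(line)
        if match:
            summary.append([match.group(1), 0])
        elif summary and line.strip("= \t"):
            # Count lines that are neither blank nor made up only of "=".
            summary[-1][1] += 1
    return [tuple(item) for item in summary]

# Tiny stand-in for str(result); the real output is far longer.
sample = """\
  Input to convert-std-to-llvm
module {
  llvm.func @malloc(i64) -> !llvm.ptr<i8>
}

  Input to sparsification
func @func_f32() -> f32 {
}
"""

for name, count in summarize_sections(sample):
    print(f"{name}: {count} lines")
```

With the real output, passing str(result) instead of sample would give one entry per pass.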

We know from the error message that the failure involves a builtin.unrealized_conversion_cast operation.

We can see from the output above that it happens during the convert-std-to-llvm pass.
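
One way to check this kind of observation is to search each section for the operation named in the error message. The sketch below assumes sections are introduced by "Input to <pass-name>" lines. The helper sections_containing is hypothetical, and the sample string stands in for str(result).

```python
def sections_containing(debug_text, needle):
    """Map each pass name to the number of lines of its input that mention needle."""
    counts = {}
    current = None
    for line in debug_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Input to "):
            # A new section starts; remember which pass it belongs to.
            current = stripped[len("Input to "):]
            counts[current] = 0
        elif current is not None and needle in line:
            counts[current] += 1
    # Keep only the sections where the operation actually appears.
    return {name: n for name, n in counts.items() if n}

# Tiny stand-in for str(result).
sample = """\
  Input to convert-std-to-llvm
    %1 = builtin.unrealized_conversion_cast %0 : i64 to index
    %2 = builtin.unrealized_conversion_cast %1 : index to i64

  Input to convert-memref-to-llvm
    %c0 = arith.constant 0 : index
"""

hits = sections_containing(sample, "builtin.unrealized_conversion_cast")
print(hits)
```

Several sections may match in practice. The casts that are already present in the input of the failing pass are the ones worth inspecting first.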

It’s likely that there’s something problematic in the input to that pass, so it’s worth looking into the IR that was given to the convert-std-to-llvm pass, which we can see under the section labelled "Input to convert-std-to-llvm". We’ll show a short snippet of it below.

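# Convert the DebugResult to text and split it into lines.
result_string = str(result)
lines = result_string.splitlines()
# Start one line above the section header and stop at the first blank line.
lines = lines[lines.index("  Input to convert-std-to-llvm  ")-1:]
lines = lines[:lines.index("")]
print("\n".join(lines))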
  Input to convert-std-to-llvm
module {
  llvm.func @malloc(i64) -> !llvm.ptr<i8>
  llvm.func @sparseValuesF32(%arg0: !llvm.ptr<i8>) -> !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> attributes {llvm.emit_c_interface, sym_visibility = "private"} {
    %0 = llvm.mlir.constant(1 : index) : i64
    %1 = llvm.alloca %0 x !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> : (i64) -> !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>
    llvm.call @_mlir_ciface_sparseValuesF32(%1, %arg0) : (!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>) -> ()
    %2 = llvm.load %1 : !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>
    llvm.return %2 : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
  }
  llvm.func @_mlir_ciface_sparseValuesF32(!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>) attributes {llvm.emit_c_interface, sym_visibility = "private"}
  llvm.func @sparsePointers64(%arg0: !llvm.ptr<i8>, %arg1: i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> attributes {llvm.emit_c_interface, sym_visibility = "private"} {
    %0 = llvm.mlir.constant(1 : index) : i64
    %1 = llvm.alloca %0 x !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> : (i64) -> !llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>
    llvm.call @_mlir_ciface_sparsePointers64(%1, %arg0, %arg1) : (!llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>, i64) -> ()
    %2 = llvm.load %1 : !llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>
    llvm.return %2 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
  }
  llvm.func @_mlir_ciface_sparsePointers64(!llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>, i64) attributes {llvm.emit_c_interface, sym_visibility = "private"}
  llvm.func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
    %0 = llvm.mlir.constant(0 : index) : i64
    %1 = builtin.unrealized_conversion_cast %0 : i64 to index
    %2 = builtin.unrealized_conversion_cast %1 : index to i64
    %3 = llvm.mlir.constant(1 : index) : i64
    %4 = builtin.unrealized_conversion_cast %3 : i64 to index
    %5 = builtin.unrealized_conversion_cast %4 : index to i64
    %6 = llvm.mlir.constant(2 : index) : i64
    %7 = llvm.mlir.constant(0.000000e+00 : f32) : f32
    %8 = llvm.call @sparsePointers64(%arg0, %0) : (!llvm.ptr<i8>, i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %9 = builtin.unrealized_conversion_cast %8 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xi64>
    %10 = builtin.unrealized_conversion_cast %9 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %11 = llvm.call @sparsePointers64(%arg0, %3) : (!llvm.ptr<i8>, i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %12 = builtin.unrealized_conversion_cast %11 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xi64>
    %13 = builtin.unrealized_conversion_cast %12 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %14 = llvm.call @sparsePointers64(%arg0, %6) : (!llvm.ptr<i8>, i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %15 = builtin.unrealized_conversion_cast %14 : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xi64>
    %16 = builtin.unrealized_conversion_cast %15 : memref<?xi64> to !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %17 = llvm.call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
    %18 = builtin.unrealized_conversion_cast %17 : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> to memref<?xf32>
    %19 = builtin.unrealized_conversion_cast %18 : memref<?xf32> to !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
    %20 = llvm.mlir.constant(1 : index) : i64
    %21 = llvm.mlir.null : !llvm.ptr<f32>
    %22 = llvm.getelementptr %21[%20] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>
    %23 = llvm.ptrtoint %22 : !llvm.ptr<f32> to i64
    %24 = llvm.call @malloc(%23) : (i64) -> !llvm.ptr<i8>
    %25 = llvm.bitcast %24 : !llvm.ptr<i8> to !llvm.ptr<f32>
    %26 = llvm.mlir.undef : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    %27 = llvm.insertvalue %25, %26[0] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    %28 = llvm.insertvalue %25, %27[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    %29 = llvm.mlir.constant(0 : index) : i64
    %30 = llvm.insertvalue %29, %28[2] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    %31 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    llvm.store %7, %31 : !llvm.ptr<f32>
    %32 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    %33 = llvm.load %32 : !llvm.ptr<f32>
    %34 = llvm.extractvalue %10[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %35 = llvm.getelementptr %34[%2] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
    %36 = llvm.load %35 : !llvm.ptr<i64>
    %37 = builtin.unrealized_conversion_cast %36 : i64 to index
    %38 = llvm.extractvalue %10[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
    %39 = llvm.getelementptr %38[%5] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
    %40 = llvm.load %39 : !llvm.ptr<i64>
    %41 = builtin.unrealized_conversion_cast %40 : i64 to index
    %42 = scf.for %arg1 = %37 to %41 step %4 iter_args(%arg2 = %33) -> (f32) {
      %46 = builtin.unrealized_conversion_cast %arg1 : index to i64
      %47 = builtin.unrealized_conversion_cast %arg1 : index to i64
      %48 = llvm.extractvalue %13[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
      %49 = llvm.getelementptr %48[%47] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
      %50 = llvm.load %49 : !llvm.ptr<i64>
      %51 = builtin.unrealized_conversion_cast %50 : i64 to index
      %52 = llvm.add %46, %3  : i64
      %53 = builtin.unrealized_conversion_cast %52 : i64 to index
      %54 = builtin.unrealized_conversion_cast %53 : index to i64
      %55 = llvm.extractvalue %13[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
      %56 = llvm.getelementptr %55[%54] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
      %57 = llvm.load %56 : !llvm.ptr<i64>
      %58 = builtin.unrealized_conversion_cast %57 : i64 to index
      %59 = scf.for %arg3 = %51 to %58 step %4 iter_args(%arg4 = %arg2) -> (f32) {
        %60 = builtin.unrealized_conversion_cast %arg3 : index to i64
        %61 = builtin.unrealized_conversion_cast %arg3 : index to i64
        %62 = llvm.extractvalue %16[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
        %63 = llvm.getelementptr %62[%61] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
        %64 = llvm.load %63 : !llvm.ptr<i64>
        %65 = builtin.unrealized_conversion_cast %64 : i64 to index
        %66 = llvm.add %60, %3  : i64
        %67 = builtin.unrealized_conversion_cast %66 : i64 to index
        %68 = builtin.unrealized_conversion_cast %67 : index to i64
        %69 = llvm.extractvalue %16[1] : !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>
        %70 = llvm.getelementptr %69[%68] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
        %71 = llvm.load %70 : !llvm.ptr<i64>
        %72 = builtin.unrealized_conversion_cast %71 : i64 to index
        %73 = scf.for %arg5 = %65 to %72 step %4 iter_args(%arg6 = %arg4) -> (f32) {
          %74 = builtin.unrealized_conversion_cast %arg5 : index to i64
          %75 = llvm.extractvalue %19[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
          %76 = llvm.getelementptr %75[%74] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>
          %77 = llvm.load %76 : !llvm.ptr<f32>
          %78 = llvm.fadd %arg6, %77  : f32
          scf.yield %78 : f32
        }
        scf.yield %73 : f32
      }
      scf.yield %59 : f32
    }
    %43 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    llvm.store %42, %43 : !llvm.ptr<f32>
    %44 = llvm.extractvalue %30[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
    %45 = llvm.load %44 : !llvm.ptr<f32>
    llvm.return %45 : f32
  }
}

While this is a good idea in general, it doesn’t seem to be useful here. When MLIR applies a pass, that pass is applied until quiescence, i.e. it keeps applying the pass until nothing changes (or until some limit on the number of applications is reached).
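
As a rough illustration of what "applied until quiescence" means, here is a toy sketch in plain Python. It is not MLIR's implementation; the function names and the string-based "rewrite" are hypothetical stand-ins chosen only to show the loop structure (repeat until nothing changes, or until an iteration limit is hit).

```python
from typing import Callable, Tuple


def apply_until_quiescence(
    ir: str, rewrite: Callable[[str], str], max_iterations: int = 10
) -> Tuple[str, int]:
    """Apply `rewrite` repeatedly until its output stops changing.

    Returns the final text and the number of applications that changed it.
    Gives up after `max_iterations` applications.
    """
    for iteration in range(max_iterations):
        new_ir = rewrite(ir)
        if new_ir == ir:
            # Nothing changed, so we have reached quiescence.
            return ir, iteration
        ir = new_ir
    return ir, max_iterations


def drop_one_double_negation(text: str) -> str:
    """Hypothetical toy rewrite: remove a single 'not not ' per application."""
    return text.replace("not not ", "", 1)


final_text, rounds = apply_until_quiescence(
    "not not not not x", drop_one_double_negation
)
print(final_text, rounds)
```

In this toy model, a single application only makes partial progress, and the loop is what drives the text to a stable form.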

It seems that the convert-std-to-llvm pass has already been applied a few times, since several ops from the LLVM dialect (for example, llvm.mlir.constant) are already present in the IR shown under the Input to convert-std-to-llvm section.

Another good place to look is the IR produced just before the last passes that ran ahead of our error. Let’s look at the input to the convert-math-to-llvm pass.

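# Split the debug output into lines so we can slice out a single section.
lines = result_string.splitlines()
# Start one line above the section title for convert-math-to-llvm.
lines = lines[lines.index("  Input to convert-math-to-llvm  ")-1:]
# Stop at the first empty line after that point.
lines = lines[:lines.index("")]
print("\n".join(lines))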
  Input to convert-math-to-llvm
module {
  func private @sparseValuesF32(!llvm.ptr<i8>) -> memref<?xf32> attributes {llvm.emit_c_interface}
  func private @sparsePointers64(!llvm.ptr<i8>, index) -> memref<?xi64> attributes {llvm.emit_c_interface}
  func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
    %c0 = arith.constant 0 : index
    %c1 = arith.constant 1 : index
    %c2 = arith.constant 2 : index
    %cst = arith.constant 0.000000e+00 : f32
    %0 = call @sparsePointers64(%arg0, %c0) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %1 = call @sparsePointers64(%arg0, %c1) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %2 = call @sparsePointers64(%arg0, %c2) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %3 = call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> memref<?xf32>
    %4 = memref.alloc() : memref<f32>
    memref.store %cst, %4[] : memref<f32>
    %5 = memref.load %4[] : memref<f32>
    %6 = memref.load %0[%c0] : memref<?xi64>
    %7 = arith.index_cast %6 : i64 to index
    %8 = memref.load %0[%c1] : memref<?xi64>
    %9 = arith.index_cast %8 : i64 to index
    %10 = scf.for %arg1 = %7 to %9 step %c1 iter_args(%arg2 = %5) -> (f32) {
      %12 = memref.load %1[%arg1] : memref<?xi64>
      %13 = arith.index_cast %12 : i64 to index
      %14 = arith.addi %arg1, %c1 : index
      %15 = memref.load %1[%14] : memref<?xi64>
      %16 = arith.index_cast %15 : i64 to index
      %17 = scf.for %arg3 = %13 to %16 step %c1 iter_args(%arg4 = %arg2) -> (f32) {
        %18 = memref.load %2[%arg3] : memref<?xi64>
        %19 = arith.index_cast %18 : i64 to index
        %20 = arith.addi %arg3, %c1 : index
        %21 = memref.load %2[%20] : memref<?xi64>
        %22 = arith.index_cast %21 : i64 to index
        %23 = scf.for %arg5 = %19 to %22 step %c1 iter_args(%arg6 = %arg4) -> (f32) {
          %24 = memref.load %3[%arg5] : memref<?xf32>
          %25 = arith.addf %arg6, %24 : f32
          scf.yield %25 : f32
        }
        scf.yield %23 : f32
      }
      scf.yield %17 : f32
    }
    memref.store %10, %4[] : memref<f32>
    %11 = memref.load %4[] : memref<f32>
    return %11 : f32
  }
}

We see that the ops are mostly from the standard, llvm, and builtin dialects. However, there are also some ops from the scf dialect. It makes sense that the convert-std-to-llvm pass can handle ops from the builtin dialect, and that it can handle ops from the llvm dialect, since that is the target dialect. It is unclear whether the convert-std-to-llvm pass can handle ops from the scf dialect. Given its name, we can infer that it mostly handles ops from the std dialect and cannot handle ops from the scf dialect. Let’s see if there are any passes that can convert from the scf dialect to the std dialect.
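
Before searching for such a pass, it can help to confirm which dialects are actually present in a dump instead of eyeballing it. The following is a minimal sketch, assuming the IR is available as a plain string. The helper count_dialects and its regular expression are hypothetical and not part of mlir_graphblas. It only recognizes ops written with an explicit dialect prefix, so unprefixed ops such as call and return are ignored.

```python
import re
from collections import Counter

# Matches an optional "%result = " prefix followed by "dialect.op_name".
# Assumption: one op per line, written in MLIR's textual assembly format.
OP_NAME = re.compile(r"^(?:%[\w:#, %]+=\s*)?([a-z_]\w*)\.[\w.]+")


def count_dialects(ir_text: str) -> Counter:
    """Count ops per dialect prefix in a textual IR dump."""
    counts = Counter()
    for line in ir_text.splitlines():
        match = OP_NAME.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


# A few lines taken from the dump above.
sample_ir = """
    %c1 = arith.constant 1 : index
    %5 = memref.load %4[] : memref<f32>
    %23 = scf.for %arg5 = %19 to %22 step %c1 iter_args(%arg6 = %arg4) -> (f32) {
      %24 = memref.load %3[%arg5] : memref<?xf32>
      %25 = arith.addf %arg6, %24 : f32
      scf.yield %25 : f32
    }
    return %11 : f32
"""

dialect_counts = count_dialects(sample_ir)
print(dialect_counts)
```

Any dialect that shows up in the counts but is not covered by one of the conversion passes in our list is a candidate for the missing lowering step.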

!mlir-opt --help | grep "scf"
Available Dialects: acc, affine, amx, arith, arm_neon, arm_sve, async, bufferization, builtin, cf, complex, dlti, emitc, gpu, linalg, llvm, math, memref, nvvm, omp, pdl, pdl_interp, quant, rocdl, scf, shape, sparse_tensor, spv, std, tensor, test, tosa, vector, x86vector
      --async-parallel-for                              -   Convert scf.parallel operations to multiple async compute ops executed concurrently for non-overlapping iteration ranges
      --convert-linalg-tiled-loops-to-scf               -   Lower linalg tiled loops to SCF loops and parallel loops
      --convert-openacc-to-scf                          -   Convert the OpenACC ops to OpenACC with SCF dialect
      --convert-parallel-loops-to-gpu                   -   Convert mapped scf.parallel ops to gpu launch operations
      --convert-scf-to-cf                               -   Convert SCF dialect to ControlFlow dialect, replacing structured control flow with a CFG
      --convert-scf-to-openmp                           -   Convert SCF parallel loop to OpenMP parallel + workshare constructs.
      --convert-scf-to-spirv                            -   Convert SCF dialect to SPIR-V dialect.
      --convert-vector-to-scf                           -   Lower the operations from the vector dialect into the SCF dialect
      --scf-bufferize                                   -   Bufferize the scf dialect.
      --scf-for-loop-canonicalization                   -   Canonicalize operations within scf.for loop bodies
      --scf-for-loop-peeling                            -   Peel `for` loops at their upper bounds.
      --scf-for-loop-range-folding                      -   Fold add/mul ops into loop range
      --scf-for-loop-specialization                     -   Specialize `for` loops for vectorization
      --scf-for-to-while                                -   Convert SCF for loops to SCF while loops
      --scf-parallel-loop-collapsing                    -   Collapse parallel loops to use less induction variables
      --scf-parallel-loop-fusion                        -   Fuse adjacent parallel loops
      --scf-parallel-loop-specialization                -   Specialize parallel loops for vectorization
      --scf-parallel-loop-tiling                        -   Tile parallel loops
      --test-scf-for-utils                              -   test scf.for utils
      --test-scf-if-utils                               -   test scf.if utils
      --test-scf-pipelining                             -   test scf.forOp pipelining
      --test-vector-transfer-full-partial-split         -   Test lowering patterns to split transfer ops via scf.if + linalg ops
      --tosa-to-scf                                     -   Lower TOSA to the SCF dialect

The pass convert-scf-to-cf seems promising, since it converts the scf dialect to the cf (ControlFlow) dialect.
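
The same search can be done from Python if we want to filter the help text programmatically. This is a minimal sketch under a few assumptions: mlir-opt is on the PATH, its pass lines have the `--flag   -   description` layout shown above, and the helper names find_passes and mlir_opt_help are hypothetical ones introduced here for illustration.

```python
import subprocess
from typing import List, Tuple


def mlir_opt_help(executable: str = "mlir-opt") -> str:
    """Return the --help text (assumes `executable` is on the PATH)."""
    completed = subprocess.run(
        [executable, "--help"], capture_output=True, text=True, check=True
    )
    return completed.stdout


def find_passes(help_text: str, keyword: str) -> List[Tuple[str, str]]:
    """Return (flag, description) pairs for flag lines that mention `keyword`."""
    matches = []
    for line in help_text.splitlines():
        stripped = line.strip()
        # Only keep lines that describe a flag and contain the keyword.
        if not stripped.startswith("--") or keyword not in stripped:
            continue
        flag, _, description = stripped.partition(" - ")
        matches.append((flag.strip(), description.strip()))
    return matches


# Three lines copied from the help output above.
sample_help = """\
      --convert-scf-to-cf                               -   Convert SCF dialect to ControlFlow dialect, replacing structured control flow with a CFG
      --convert-vector-to-scf                           -   Lower the operations from the vector dialect into the SCF dialect
      --tosa-to-scf                                     -   Lower TOSA to the SCF dialect
"""

# Keep only passes that lower *from* scf.
lowering_candidates = [
    flag
    for flag, _ in find_passes(sample_help, "scf")
    if flag.startswith("--convert-scf-to-")
]
print(lowering_candidates)
```

To run this against a real installation, pass mlir_opt_help() instead of sample_help.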

Let’s see if running the convert-scf-to-cf pass before any of the conversion passes will get rid of our exception.

passes = [
    # ... (the passes from our earlier attempt are elided here) ...
    "--convert-scf-to-cf", # newly added
    # ...
]
result = cli.apply_passes(mlir_bytes, passes)
module attributes {llvm.data_layout = ""} {
  llvm.func @malloc(i64) -> !llvm.ptr<i8>
  llvm.func @sparseValuesF32(%arg0: !llvm.ptr<i8>) -> !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> attributes {llvm.emit_c_interface, sym_visibility = "private"} {
    %0 = llvm.mlir.constant(1 : index) : i64
    %1 = llvm.alloca %0 x !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> : (i64) -> !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>
    llvm.call @_mlir_ciface_sparseValuesF32(%1, %arg0) : (!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>) -> ()
    %2 = llvm.load %1 : !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>
    llvm.return %2 : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
  }
  llvm.func @_mlir_ciface_sparseValuesF32(!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>) attributes {llvm.emit_c_interface, sym_visibility = "private"}
  llvm.func @sparsePointers64(%arg0: !llvm.ptr<i8>, %arg1: i64) -> !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> attributes {llvm.emit_c_interface, sym_visibility = "private"} {
    %0 = llvm.mlir.constant(1 : index) : i64
    %1 = llvm.alloca %0 x !llvm.struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)> : (i64) -> !llvm.ptr<struct<(ptr<i64>, ptr<i64>, i64, array<1 x i64>, array<1 x i64>)>>

It looks like it fixed our issue!