[examples] Add Gemmini-baremetal examples and more conv2d examples. #451

Open: wants to merge 3 commits into base: main
Changes from 1 commit
21 changes: 21 additions & 0 deletions examples/GemminiDialect/baremetal/README.md
@@ -0,0 +1,21 @@
## Gemmini-Baremetal-Example

This example demonstrates how to use the gemmini dialect in a baremetal environment. Functional evaluation can be performed using spike, while RTL hardware-based evaluation uses verilator.

Compiling baremetal workloads requires ``riscv64-unknown-elf-gcc``. The riscv-toolchain in the gemmini chipyard environment already includes ``riscv64-unknown-elf-gcc``, so there's no need to reinstall it. Before running the following test cases, simply switch to the corresponding chipyard **conda environment**.

### Functional Evaluation
Build the test case with its make target, then run the generated executable under spike, for example:
```
make mvin-mvout-run-baremetal
spike --extension=gemmini mvin-mvout-baremetal
```

### Hardware Evaluation
Step 1. Execute `make <workload>-run-baremetal` in this folder to generate the baremetal test case.

Step 2. Copy the generated test case `<workload>-baremetal` to the gemmini workload storage path in the chipyard folder (`chipyard/generators/software/gemmini-rocc-tests/build/`).

Step 3. `cd chipyard/generators/gemmini/` and execute `./scripts/run-verilator.sh <workload>` to run the test. (This step may differ depending on the chipyard/gemmini version.)

It is strongly recommended to use chipyard's MEMORY_LOAD setting, which will significantly speed up the simulation startup time.
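
The three hardware-evaluation steps above can be sketched as a small dry-run helper. It only prints the commands rather than running them. The function name `hw_eval_commands` and the default chipyard path are hypothetical; the paths and script name are the ones listed in the steps and may differ between chipyard/gemmini versions.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for Steps 1-3 instead of executing them.
# hw_eval_commands and the default chipyard path are illustrative assumptions.
hw_eval_commands() {
  workload="$1"                  # e.g. mvin-mvout
  chipyard_dir="${2:-chipyard}"  # path to the chipyard checkout (assumed)
  # Step 1: build the baremetal test case in this folder.
  echo "make ${workload}-run-baremetal"
  # Step 2: copy it to the gemmini workload storage path.
  echo "cp ${workload}-baremetal ${chipyard_dir}/generators/software/gemmini-rocc-tests/build/"
  # Step 3: run the RTL simulation with verilator.
  echo "cd ${chipyard_dir}/generators/gemmini/ && ./scripts/run-verilator.sh ${workload}"
}

hw_eval_commands mvin-mvout
```

Note that the `<workload>-run-baremetal` pattern matches the `mvin-mvout` target directly. The conv2d targets in the makefile use a `gemmini-` prefix and produce differently named binaries (for example, `gemmini-conv2d-nchw-fchw-f32-run-baremetal` produces `conv-2d-nchw-fchw-f32-baremetal`), so the names would need adjusting for those.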
72 changes: 72 additions & 0 deletions examples/GemminiDialect/baremetal/makefile
@@ -0,0 +1,72 @@
#!/bin/bash
BUDDY_MLIR_DIR := ../../../
BUDDY_OPT := ${BUDDY_MLIR_DIR}/build/bin/buddy-opt
BUDDY_TRANSLATE := ${BUDDY_MLIR_DIR}/build/bin/buddy-translate
BUDDY_LLC := ${BUDDY_MLIR_DIR}/build/bin/buddy-llc

GEMMINI_EXAMPLE_DIR := ${BUDDY_MLIR_DIR}/examples/GemminiDialect/

ELF_CC := riscv64-unknown-elf-gcc
Member
Have not seen the E2E example.
Is the remaining E2E C++ error ('_start_main_argv') related to using riscv64-unknown-elf-gcc (expecting riscv64-unknown-elf-g++) for end-to-end?

Author
Yes, the E2E example is still blocked at the compile step with riscv64-unknown-elf-g++.
I will try to fix it and follow up with a commit containing the full version.

PK_CC := riscv64-unknown-linux-gnu-gcc

mvin-mvout-run-baremetal:
	@${BUDDY_OPT} ./${GEMMINI_EXAMPLE_DIR}/mvin-mvout.mlir -lower-gemmini | \
	${BUDDY_TRANSLATE} --buddy-to-llvmir | \
	${BUDDY_LLC} -filetype=obj -mtriple=riscv64 \
		-mattr=+buddyext,+D -float-abi=hard \
		-relocation-model=pic \
		-o log.o
	@${ELF_CC} -static -specs=htif_nano.specs log.o -o mvin-mvout-baremetal
	@spike --extension=gemmini mvin-mvout-baremetal

gemmini-conv2d-nchw-fchw-f32-run-baremetal:
	@${BUDDY_OPT} ./${GEMMINI_EXAMPLE_DIR}/conv_2d_nchw_fchw_f32.mlir \
		-convert-linalg-to-gemmini="acc_t=f32" \
		-convert-linalg-to-loops \
		-lower-gemmini="dim=4 acc_t=f32 elem_t=f32" | \
	${BUDDY_TRANSLATE} -buddy-to-llvmir | \
	${BUDDY_LLC} -filetype=obj -mtriple=riscv64 \
		-mattr=+buddyext,+D -float-abi=hard \
		-relocation-model=pic \
		-o log.o
	@${ELF_CC} -O2 -static -specs=htif_nano.specs log.o -o conv-2d-nchw-fchw-f32-baremetal
	@spike --extension=gemmini conv-2d-nchw-fchw-f32-baremetal

gemmini-conv2d-nchw-fchw-i8-run-baremetal:
	@${BUDDY_OPT} ./${GEMMINI_EXAMPLE_DIR}/conv_2d_nchw_fchw_i8.mlir \
		-convert-linalg-to-gemmini \
		-convert-linalg-to-loops \
		-lower-gemmini | \
	${BUDDY_TRANSLATE} -buddy-to-llvmir | \
	${BUDDY_LLC} -filetype=obj -mtriple=riscv64 \
		-mattr=+buddyext,+D -float-abi=hard \
		-relocation-model=pic \
		-o log.o
	@${ELF_CC} -O2 -static -specs=htif_nano.specs log.o -o conv-2d-nchw-fchw-i8-baremetal
	@spike --extension=gemmini conv-2d-nchw-fchw-i8-baremetal

gemmini-conv2d-nchw-fhwc-f32-run-baremetal:
	@${BUDDY_OPT} ./${GEMMINI_EXAMPLE_DIR}/conv_2d_nchw_fhwc_f32.mlir \
		-convert-linalg-to-gemmini="acc_t=f32" \
		-convert-linalg-to-loops \
		-lower-gemmini="dim=4 acc_t=f32 elem_t=f32" | \
	${BUDDY_TRANSLATE} -buddy-to-llvmir | \
	${BUDDY_LLC} -filetype=obj -mtriple=riscv64 \
		-mattr=+buddyext,+D -float-abi=hard \
		-relocation-model=pic \
		-o log.o
	@${ELF_CC} -O2 -static -specs=htif_nano.specs log.o -o conv-2d-nchw-fhwc-f32-baremetal
	@spike --extension=gemmini conv-2d-nchw-fhwc-f32-baremetal

gemmini-conv2d-nhwc-fhwc-i8-run-baremetal:
	@${BUDDY_OPT} ./${GEMMINI_EXAMPLE_DIR}/conv_2d_nhwc_fhwc_i8.mlir \
		-convert-linalg-to-gemmini \
		-convert-linalg-to-loops \
		-lower-gemmini | \
	${BUDDY_TRANSLATE} -buddy-to-llvmir | \
	${BUDDY_LLC} -filetype=obj -mtriple=riscv64 \
		-mattr=+buddyext,+D -float-abi=hard \
		-relocation-model=pic \
		-o log.o
	@${ELF_CC} -O2 -static -specs=htif_nano.specs log.o -o conv-2d-nhwc-fhwc-i8-baremetal
	@spike --extension=gemmini conv-2d-nhwc-fhwc-i8-baremetal
32 changes: 32 additions & 0 deletions examples/GemminiDialect/conv_2d_nhwc_fhwc_5x5_i8.mlir
@@ -0,0 +1,32 @@
// RUN: buddy-opt %s \
// RUN: --convert-linalg-to-gemmini | \
// RUN: FileCheck %s

memref.global "private" @input : memref<1x7x7x1xi8> = dense<[[[[1],[1],[1],[1],[1],[1],[1]],
[[1],[1],[1],[1],[1],[1],[1]],
[[1],[1],[1],[1],[1],[1],[1]],
[[1],[1],[1],[1],[1],[1],[1]],
[[1],[1],[1],[1],[1],[1],[1]],
[[1],[1],[1],[1],[1],[1],[1]],
[[1],[1],[1],[1],[1],[1],[1]]]]>

memref.global "private" @kernel : memref<1x5x5x1xi8> = dense<[[[[1], [1], [1], [1], [1]],
[[1], [1], [1], [1], [1]],
[[1], [1], [1], [1], [1]],
[[1], [1], [1], [1], [1]],
[[1], [1], [1], [1], [1]]]]>

func.func @main() -> i8 {
%0 = arith.constant 0 : i8
%input = memref.get_global @input : memref<1x7x7x1xi8>
%kernel = memref.get_global @kernel : memref<1x5x5x1xi8>
%output = memref.alloc() : memref<1x3x3x1xi8>

// CHECK: gemmini.tile_conv %{{[0-9]+}} %alloc_{{[0-9]+}} %alloc_{{[0-9]+}} %alloc_{{[0-9]+}} %{{.+}} %{{.+}} :
// CHECK-SAME: memref<1x7x7x1xi8> memref<25x1xi8> memref<1xi32> memref<9x1xi8> i64 i64
linalg.conv_2d_nhwc_fhwc
ins(%input, %kernel : memref<1x7x7x1xi8>, memref<1x5x5x1xi8>)
outs(%output : memref<1x3x3x1xi8>)
gemmini.print %output : memref<1x3x3x1xi8>
return %0 : i8
}
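
As a sanity check on the shapes in this test, the sketch below recomputes the output size and the flattened row counts that appear in the `CHECK-SAME` line. It assumes stride 1 and no padding. It also assumes that the flattened weights have `kh * kw * in_channels` rows and the flattened output has `batch * oh * ow` rows, which is an interpretation of the expected types rather than something stated in the PR. The helper name `conv_out_dim` is illustrative.

```shell
#!/bin/sh
# Output size of a convolution along one spatial dimension.
# Stride defaults to 1 and padding to 0 (assumptions for this test).
conv_out_dim() {
  in_dim="$1"; kernel_dim="$2"; stride="${3:-1}"; pad="${4:-0}"
  echo $(( (in_dim + 2 * pad - kernel_dim) / stride + 1 ))
}

out=$(conv_out_dim 7 5)                    # 7x7 input, 5x5 kernel
echo "output: 1x${out}x${out}x1"           # prints: output: 1x3x3x1
echo "weight rows: $(( 5 * 5 * 1 ))"       # 25, matching memref<25x1xi8>
echo "output rows: $(( 1 * out * out ))"   # 9, matching memref<9x1xi8>
```

The same helper gives 5 for the 7x7 input with a 3x3 kernel and 3 for the 5x5 input with a 3x3 kernel, consistent with the output memrefs allocated in the other conv2d tests of this PR.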
31 changes: 31 additions & 0 deletions examples/GemminiDialect/conv_2d_nhwc_fhwc_f32.mlir
@@ -0,0 +1,31 @@
// RUN: buddy-opt %s \
// RUN: --convert-linalg-to-gemmini="acc_t=f32" | \
// RUN: FileCheck %s

memref.global "private" @input : memref<1x5x5x1xf32> = dense<[[[[1.],[2.],[3.],[4.],[5.]],
[[6.],[7.],[8.],[9.],[10.]],
[[11.],[12.],[13.],[14.],[15.]],
[[16.],[17.],[18.],[19.],[20.]],
[[21.],[22.],[23.],[24.],[25.]]]]>

memref.global "private" @kernel : memref<1x3x3x1xf32> = dense<[[[[1.], [1.], [1.]],
[[1.], [1.], [1.]],
[[1.], [1.], [1.]]]]>


func.func @main() -> i8 {
%0 = arith.constant 0 : i8
// batch size = 1, input channels = 1
%input = memref.get_global @input : memref<1x5x5x1xf32>
// output channels = 1 (kernel layout: f, h, w, c)
%kernel = memref.get_global @kernel : memref<1x3x3x1xf32>
// batch size, h, w, output channels
%output = memref.alloc() : memref<1x3x3x1xf32>
// CHECK: gemmini.tile_conv %{{.+}} %alloc_{{[0-9]+}} %alloc_{{[0-9]+}} %alloc_{{[0-9]+}} %{{.+}} %{{.+}} :
// CHECK-SAME: memref<1x5x5x1xf32> memref<9x1xf32> memref<1xf32> memref<9x1xf32> i64 i64
linalg.conv_2d_nhwc_fhwc
ins(%input, %kernel : memref<1x5x5x1xf32>, memref<1x3x3x1xf32>)
outs(%output : memref<1x3x3x1xf32>)
gemmini.print %output : memref<1x3x3x1xf32>
return %0 : i8
}
30 changes: 30 additions & 0 deletions examples/GemminiDialect/conv_2d_nhwc_fhwc_i8.mlir
@@ -0,0 +1,30 @@
// RUN: buddy-opt %s \
// RUN: --convert-linalg-to-gemmini | \
// RUN: FileCheck %s

memref.global "private" @input : memref<1x7x7x1xi8> = dense<[[[[1],[1],[1],[1],[1],[1],[1]],
[[1],[1],[1],[1],[1],[1],[1]],
[[1],[1],[1],[1],[1],[1],[1]],
[[1],[1],[1],[1],[1],[1],[1]],
[[1],[1],[1],[1],[1],[1],[1]],
[[1],[1],[1],[1],[1],[1],[1]],
[[1],[1],[1],[1],[1],[1],[1]]]]>

memref.global "private" @kernel : memref<1x3x3x1xi8> = dense<[[[[1], [1], [1]],
[[1], [1], [1]],
[[1], [1], [1]]]]>

func.func @main() -> i8 {
%0 = arith.constant 0 : i8
%input = memref.get_global @input : memref<1x7x7x1xi8>
%kernel = memref.get_global @kernel : memref<1x3x3x1xi8>
%output = memref.alloc() : memref<1x5x5x1xi8>

// CHECK: gemmini.tile_conv %{{[0-9]+}} %alloc_{{[0-9]+}} %alloc_{{[0-9]+}} %alloc_{{[0-9]+}} %{{.+}} %{{.+}} :
// CHECK-SAME: memref<1x7x7x1xi8> memref<9x1xi8> memref<1xi32> memref<25x1xi8> i64 i64
linalg.conv_2d_nhwc_fhwc
ins(%input, %kernel : memref<1x7x7x1xi8>, memref<1x3x3x1xi8>)
outs(%output : memref<1x5x5x1xi8>)
gemmini.print %output : memref<1x5x5x1xi8>
return %0 : i8
}
32 changes: 32 additions & 0 deletions examples/GemminiDialect/conv_2d_nhwc_hwcf_5x5_i8.mlir
@@ -0,0 +1,32 @@
// RUN: buddy-opt %s \
// RUN: --convert-linalg-to-gemmini | \
// RUN: FileCheck %s

memref.global "private" @input : memref<1x7x7x1xi8> = dense<[[[[1],[1],[1],[1],[1],[1],[1]],
[[1],[1],[1],[1],[1],[1],[1]],
[[1],[1],[1],[1],[1],[1],[1]],
[[1],[1],[1],[1],[1],[1],[1]],
[[1],[1],[1],[1],[1],[1],[1]],
[[1],[1],[1],[1],[1],[1],[1]],
[[1],[1],[1],[1],[1],[1],[1]]]]>

memref.global "private" @kernel : memref<5x5x1x1xi8> = dense<[[[[1]], [[1]], [[1]], [[1]], [[1]]],
[[[1]], [[1]], [[1]], [[1]], [[1]]],
[[[1]], [[1]], [[1]], [[1]], [[1]]],
[[[1]], [[1]], [[1]], [[1]], [[1]]],
[[[1]], [[1]], [[1]], [[1]], [[1]]]]>

func.func @main() -> i8 {
%0 = arith.constant 0 : i8
%input = memref.get_global @input : memref<1x7x7x1xi8>
%kernel = memref.get_global @kernel : memref<5x5x1x1xi8>
%output = memref.alloc() : memref<1x3x3x1xi8>

// CHECK: gemmini.tile_conv %{{[0-9]+}} %alloc_{{[0-9]+}} %alloc_{{[0-9]+}} %alloc_{{[0-9]+}} %{{.+}} %{{.+}} :
// CHECK-SAME: memref<1x7x7x1xi8> memref<25x1xi8> memref<1xi32> memref<9x1xi8> i64 i64
linalg.conv_2d_nhwc_hwcf
ins(%input, %kernel : memref<1x7x7x1xi8>, memref<5x5x1x1xi8>)
outs(%output : memref<1x3x3x1xi8>)
gemmini.print %output : memref<1x3x3x1xi8>
return %0 : i8
}
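
This test and the FHWC 5x5 test expect the same `gemmini.tile_conv` operand types even though the kernel memrefs differ (`5x5x1x1` here, `1x5x5x1` there). A minimal sketch of why, assuming plain row-major memref layouts: with one input channel and one output filter, both layouts place each weight at the same linear offset. The function names and argument order below are illustrative only.

```shell
#!/bin/sh
# Row-major linear offset of kernel element (f, h, w, c).
# Arguments: f h w c F H W C (indices first, then dimension sizes).
offset_fhwc() {
  echo $(( (($1 * $6 + $2) * $7 + $3) * $8 + $4 ))
}
offset_hwcf() {
  echo $(( (($2 * $7 + $3) * $8 + $4) * $5 + $1 ))
}

# F = 1, C = 1 (as in these tests): both layouts agree.
offset_fhwc 0 1 2 0  1 5 5 1   # prints 7
offset_hwcf 0 1 2 0  1 5 5 1   # prints 7

# F = 2 (hypothetical): the layouts diverge.
offset_fhwc 1 0 0 0  2 5 5 1   # prints 25
offset_hwcf 1 0 0 0  2 5 5 1   # prints 1
```

With more than one filter or channel the two layouts no longer coincide, so the single-channel tests here do not by themselves exercise any layout conversion.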
79 changes: 79 additions & 0 deletions examples/GemminiDialect/makefile
@@ -2484,3 +2484,82 @@ gemmini-conv-13x13-gemmini-run:
		-DDIALECT=2 performance-test.cpp \
		-O2 -static -o a.out -I${INTERFACES}
	@spike --extension=gemmini pk a.out


gemmini-linalg-conv2d-nhwc-fhwc-f32-lower:
	@${BUDDY_OPT} ./conv_2d_nhwc_fhwc_f32.mlir \
		-convert-linalg-to-gemmini="acc_t=f32" \
		-convert-linalg-to-loops \
		-lower-gemmini="dim=4 acc_t=f32 elem_t=f32" \
		-o log.mlir

gemmini-linalg-conv2d-nhwc-fhwc-f32-run-pk:
	@${BUDDY_OPT} ./conv_2d_nhwc_fhwc_f32.mlir \
		-convert-linalg-to-gemmini="acc_t=f32" \
		-convert-linalg-to-loops \
		-lower-gemmini="dim=4 acc_t=f32 elem_t=f32" | \
	${BUDDY_TRANSLATE} --buddy-to-llvmir | \
	${BUDDY_LLC} -filetype=obj -mtriple=riscv64 \
		-mattr=+buddyext,+D -float-abi=hard \
		-o log.o
	@riscv64-unknown-linux-gnu-gcc log.o -O2 -static -o a.out
	@spike --extension=gemmini pk a.out

gemmini-linalg-conv2d-nhwc-fhwc-i8-lower:
	@${BUDDY_OPT} ./conv_2d_nhwc_fhwc_i8.mlir \
		-convert-linalg-to-gemmini \
		-convert-linalg-to-loops \
		-lower-gemmini \
		-o log.mlir

gemmini-linalg-conv2d-nhwc-fhwc-i8-run-pk:
	@${BUDDY_OPT} ./conv_2d_nhwc_fhwc_i8.mlir \
		-convert-linalg-to-gemmini \
		-convert-linalg-to-loops \
		-lower-gemmini | \
	${BUDDY_TRANSLATE} --buddy-to-llvmir | \
	${BUDDY_LLC} -filetype=obj -mtriple=riscv64 \
		-mattr=+buddyext,+D -float-abi=hard \
		-o log.o
	@riscv64-unknown-linux-gnu-gcc log.o -O2 -static -o a.out
	@spike --extension=gemmini pk a.out

gemmini-conv2d-nhwc-hwcf-5x5-i8-lower:
	@${BUDDY_OPT} ./conv_2d_nhwc_hwcf_5x5_i8.mlir \
		-convert-linalg-to-gemmini \
		-convert-linalg-to-loops \
		-lower-gemmini \
		-o log.mlir

gemmini-conv2d-nhwc-hwcf-5x5-i8-run-pk:
	@${BUDDY_OPT} ./conv_2d_nhwc_hwcf_5x5_i8.mlir \
		-convert-linalg-to-gemmini \
		-convert-linalg-to-loops \
		-lower-gemmini | \
	${BUDDY_TRANSLATE} -buddy-to-llvmir | \
	${BUDDY_LLC} -filetype=obj -mtriple=riscv64 \
		-mattr=+buddyext,+D -float-abi=hard \
		-relocation-model=pic \
		-o log.o
	@riscv64-unknown-linux-gnu-gcc -O2 -static log.o -o a.out
	@spike --extension=gemmini pk a.out

gemmini-conv2d-nhwc-fhwc-5x5-i8-lower:
	@${BUDDY_OPT} ./conv_2d_nhwc_fhwc_5x5_i8.mlir \
		-convert-linalg-to-gemmini \
		-convert-linalg-to-loops \
		-lower-gemmini \
		-o log.mlir

gemmini-conv2d-nhwc-fhwc-5x5-i8-run-pk:
	@${BUDDY_OPT} ./conv_2d_nhwc_fhwc_5x5_i8.mlir \
		-convert-linalg-to-gemmini \
		-convert-linalg-to-loops \
		-lower-gemmini | \
	${BUDDY_TRANSLATE} -buddy-to-llvmir | \
	${BUDDY_LLC} -filetype=obj -mtriple=riscv64 \
		-mattr=+buddyext,+D -float-abi=hard \
		-relocation-model=pic \
		-o log.o
	@riscv64-unknown-linux-gnu-gcc -O2 -static log.o -o a.out
	@spike --extension=gemmini pk a.out