- The framework now supports padding inputs to a fixed length, which is what our FPGA realization requires (see the padding sketch below).
- Transpose the weight matrix of each `nn.Linear` in advance on the host; we don't leave this to the FPGA (see the pre-transpose sketch below).
- Capture the structure of the causal and attention masks, and replace their code with an Allo- and FPGA-friendly version (see the masked-attention reference below).
- Finished the Allo NumPy code for the transformer part! GPT-Neo can now run inference!
- Masked softmax passed Csim!
- Causal SDP passed Csim!
- `bias_add` passed Csim!
- GPT-Neo passes verification at atol=1e-2 most of the time, but fails at atol=1e-3 (see the tolerance check below).
- GPT-Neo should run on our accelerator once `make hw` passes!
- Added a `host.cpp` template for testing the hardware's correctness.
- Our FPGA accelerator for GPT-Neo can output tokens correctly!!!
- We have our first causal accelerator on FPGA!!!
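The sketches below illustrate a few of the steps above; all names, shapes, and constants are illustrative assumptions, not the repo's actual code. First, fixed-length padding: the FPGA buffers have a static size, so inputs are padded on the host. `MAX_SEQ_LEN` and `pad_to_fixed_length` are hypothetical.

```python
import numpy as np

MAX_SEQ_LEN = 1024  # assumed fixed length matching the FPGA buffers

def pad_to_fixed_length(ids: np.ndarray, pad_id: int = 0):
    """Right-pad a 1-D array of token ids to MAX_SEQ_LEN.

    Returns the padded array and the original valid length, which the
    attention mask needs so padded positions are ignored.
    """
    n = ids.shape[0]
    assert n <= MAX_SEQ_LEN, "input longer than the fixed hardware length"
    padded = np.full(MAX_SEQ_LEN, pad_id, dtype=ids.dtype)
    padded[:n] = ids
    return padded, n
```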
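Next, the host-side weight pre-transpose. PyTorch stores `nn.Linear` weights as `(out_features, in_features)` and computes `x @ W.T + b`, so transposing once on the host lets the device run a plain `x @ W + b` with contiguous rows. `export_linear` is a hypothetical helper.

```python
import numpy as np
import torch

def export_linear(layer: torch.nn.Linear):
    """Export nn.Linear parameters with the weight already transposed to
    (in_features, out_features), so the device computes y = x @ W + b."""
    w = layer.weight.detach().cpu().numpy().T.copy()  # one-time host transpose
    b = layer.bias.detach().cpu().numpy().copy()
    return w, b
```

Doing the transpose once on the host trades a little memory for removing a strided access pattern from every matmul on the device.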
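A NumPy reference for the masked-softmax and causal-SDP kernels, reflecting the point of the mask rewrite: the causal mask is a static lower-triangular pattern, so it can be baked into the kernel instead of being streamed in as a dynamic tensor. (The generic 1/sqrt(d) scaling is shown for completeness; GPT-Neo itself happens to omit it.)

```python
import numpy as np

def masked_softmax(scores: np.ndarray) -> np.ndarray:
    """Row-wise softmax over (seq, seq) scores with a static causal mask:
    query position i may only attend to key positions j <= i."""
    n = scores.shape[-1]
    causal = np.tril(np.ones((n, n), dtype=bool))  # static lower-triangular mask
    s = np.where(causal, scores, -np.inf)
    s = s - s.max(axis=-1, keepdims=True)  # numerically stable softmax
    e = np.exp(s)
    return e / e.sum(axis=-1, keepdims=True)

def causal_sdp(q: np.ndarray, k: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Causal scaled dot-product attention for one head; q, k, v: (seq, d)."""
    scale = 1.0 / np.sqrt(q.shape[-1])
    return masked_softmax((q @ k.T) * scale) @ v
```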
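Finally, the kind of element-wise tolerance check behind the atol observation, comparing accelerator output against a PyTorch reference (variable and function names are illustrative):

```python
import numpy as np

def check_outputs(out: np.ndarray, ref: np.ndarray) -> None:
    # Loose tolerance usually passes; the tighter one exposes accumulated
    # numerical error from the reordered hardware computation.
    print("atol=1e-2:", np.allclose(out, ref, atol=1e-2))
    print("atol=1e-3:", np.allclose(out, ref, atol=1e-3))
```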