- The framework now supports padding inputs to a fixed length, which is what our FPGA realization requires (see the padding sketch below).
- Transpose the weight matrix of each `nn.Linear` in advance on the host; we don't leave this to the FPGA (see the pre-transpose sketch below).
- Capture the structure of the causal and attention masks, and replace their code with an Allo- and FPGA-friendly version (see the masked-attention reference below).
- Finished the Allo NumPy code for the transformer part! GPT-Neo can now run inference!
- Masked softmax passed Csim!
- Causal SDP passed Csim!
- `bias_add` passed Csim!
- GPT-Neo passes verification at atol=1e-2 most of the time, but fails at atol=1e-3 (see the tolerance check below).
- GPT-Neo should run on our accelerator once `make hw` passes!
- Added a `host.cpp` template for testing the hardware's correctness.
- Our FPGA accelerator for GPT-Neo can output tokens correctly!!!
- We have our first causal accelerator on FPGA!!!
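The sketches below illustrate a few of the steps above; all names, shapes, and constants are illustrative assumptions, not the repo's actual code. First, fixed-length padding: the FPGA buffers have a static size, so inputs are padded on the host. `MAX_SEQ_LEN` and `pad_to_fixed_length` are hypothetical.

```python
import numpy as np

MAX_SEQ_LEN = 1024  # assumed fixed length matching the FPGA buffers

def pad_to_fixed_length(ids: np.ndarray, pad_id: int = 0):
    """Right-pad a 1-D array of token ids to MAX_SEQ_LEN.

    Returns the padded array and the original valid length, which the
    attention mask needs so padded positions are ignored.
    """
    n = ids.shape[0]
    assert n <= MAX_SEQ_LEN, "input longer than the fixed hardware length"
    padded = np.full(MAX_SEQ_LEN, pad_id, dtype=ids.dtype)
    padded[:n] = ids
    return padded, n
```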
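Next, the host-side weight pre-transpose. PyTorch stores `nn.Linear` weights as `(out_features, in_features)` and computes `x @ W.T + b`, so transposing once on the host lets the device run a plain `x @ W + b` with contiguous rows. `export_linear` is a hypothetical helper.

```python
import numpy as np
import torch

def export_linear(layer: torch.nn.Linear):
    """Export nn.Linear parameters with the weight already transposed to
    (in_features, out_features), so the device computes y = x @ W + b."""
    w = layer.weight.detach().cpu().numpy().T.copy()  # one-time host transpose
    b = layer.bias.detach().cpu().numpy().copy()
    return w, b
```

Doing the transpose once on the host trades a little memory for removing a strided access pattern from every matmul on the device.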
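A NumPy reference for the masked-softmax and causal-SDP kernels, reflecting the point of the mask rewrite: the causal mask is a static lower-triangular pattern, so it can be baked into the kernel instead of being streamed in as a dynamic tensor. (The generic 1/sqrt(d) scaling is shown for completeness; GPT-Neo itself happens to omit it.)

```python
import numpy as np

def masked_softmax(scores: np.ndarray) -> np.ndarray:
    """Row-wise softmax over (seq, seq) scores with a static causal mask:
    query position i may only attend to key positions j <= i."""
    n = scores.shape[-1]
    causal = np.tril(np.ones((n, n), dtype=bool))  # static lower-triangular mask
    s = np.where(causal, scores, -np.inf)
    s = s - s.max(axis=-1, keepdims=True)  # numerically stable softmax
    e = np.exp(s)
    return e / e.sum(axis=-1, keepdims=True)

def causal_sdp(q: np.ndarray, k: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Causal scaled dot-product attention for one head; q, k, v: (seq, d)."""
    scale = 1.0 / np.sqrt(q.shape[-1])
    return masked_softmax((q @ k.T) * scale) @ v
```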
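Finally, the kind of element-wise tolerance check behind the atol observation, comparing accelerator output against a PyTorch reference (variable and function names are illustrative):

```python
import numpy as np

def check_outputs(out: np.ndarray, ref: np.ndarray) -> None:
    # Loose tolerance usually passes; the tighter one exposes accumulated
    # numerical error from the reordered hardware computation.
    print("atol=1e-2:", np.allclose(out, ref, atol=1e-2))
    print("atol=1e-3:", np.allclose(out, ref, atol=1e-3))
```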