forked from Dao-AILab/flash-attention
fp8 forward #116
Merged
51 commits
77fe3a4 disable navi (micmelesse)
92fb040 start test (micmelesse)
957b0e6 test fp16 against fp8 (micmelesse)
a290a6d save scaling code so far (micmelesse)
3542300 global scaling (micmelesse)
0121712 add per_head_scaling (micmelesse)
1cef817 dump qk (micmelesse)
390e990 save dumping q, k and qk to fp32 tensor (micmelesse)
e834dd2 fix pointer bug (micmelesse)
2a4899a save reproducer (micmelesse)
65ad5f2 dump p and acc (micmelesse)
d080d33 fp8 working with my debug input (micmelesse)
daa7532 save (micmelesse)
b4c3dc3 change api for dequant (micmelesse)
dd5002d pass descale_p (micmelesse)
4dfe187 clean up (micmelesse)
7afec5c most working (micmelesse)
ba01300 save (micmelesse)
d1c3e46 save (micmelesse)
1fd2219 varlen half way (micmelesse)
277fac6 some varlen examples work (micmelesse)
554bee9 improve varlen debug input (micmelesse)
93fab91 varlen mostly working (micmelesse)
06982bf push working cases (micmelesse)
4c110bd fix ref bug (micmelesse)
ce51aac fix backward bug (micmelesse)
6e2dcbf fix varlen backward bug (micmelesse)
db4a331 use descale to set fp8 (micmelesse)
9a5b607 check arch fp8 support (micmelesse)
a811071 cache arch (micmelesse)
5037533 try again (micmelesse)
fb5c01e skip bad config on MI200 (micmelesse)
f38f6df skip decode nan config on MI200 (micmelesse)
3058fef fix mistake (micmelesse)
35ac3ef skip more (micmelesse)
323d8dc run full suit (micmelesse)
8a6fa25 Update amd_tests.yml (micmelesse)
d51ee78 address comments (micmelesse)
fd49369 navi ci is broken (micmelesse)
e728cab raise error tolerance to 2.5e-1 (micmelesse)
d1b6fd9 target MI300 directly (micmelesse)
3d9e0dd show gfx (micmelesse)
8cb52c7 try again (micmelesse)
e50bc0c don't fail matrix if one path fails (micmelesse)
15cb129 try upstream triton (micmelesse)
4dd92bd just get MI300 working (micmelesse)
ae8d4cf Fix install bug (micmelesse)
b715392 run ref on cpu (micmelesse)
0c54226 move ref test to navi machines (micmelesse)
ef7e107 pin triton (micmelesse)
924cad4 add bench deps (micmelesse)