Commit 93548eb

varun-sundar-rabindranath and Varun Sundar Rabindranath authored
[Kernel] Enable FP8 Cutlass for Ada Lovelace (vllm-project#6950)
Co-authored-by: Varun Sundar Rabindranath <[email protected]>
1 parent 460c188 commit 93548eb

1 file changed: +1 -7 lines changed

csrc/quantization/cutlass_w8a8/scaled_mm_entry.cu

Lines changed: 1 addition & 7 deletions
@@ -38,13 +38,7 @@ bool cutlass_scaled_mm_supports_fp8(int64_t cuda_device_capability) {
   if (cuda_device_capability >= 90) {
     return CUDA_VERSION >= 12000;
   } else if (cuda_device_capability >= 89) {
-    // CUTLASS Kernels have not been tuned for Ada Lovelace systems
-    // and are slower than torch.mm. Return false unconditionally in this case.
-    return false;
-
-    // Once the CUTLASS kernels have been optimized for Lovelace systems,
-    // use the following check:
-    // return CUDA_VERSION >= 12040;
+    return CUDA_VERSION >= 12040;
   }
 #endif
