[Hackathon 9th No.89] Integrate SageAttn v2/2++ into FastDeploy #1157

fangfangssj · 2025-09-18T09:28:08Z

RFC for integrating SageAttn v2++ into FastDeploy

paddle-bot · 2025-09-18T12:06:12Z

Your PR has been submitted. Thanks for your contribution!
Please check its format and content. For this, you can refer to Template and Demo.

luotao1 · 2025-09-19T08:21:19Z

@chang-wenbin

rfcs/FastDeploy/20250916_FastDeploy_add_sageattention.md

chang-wenbin · 2025-10-14T04:00:58Z

rfcs/FastDeploy/20250916_FastDeploy_add_sageattention.md

+- qk_int8_sv_f8_accum_f16_fuse_v_scale_attn_inst_buf # FP16 accumulation variant
+- qk_int8_sv_f8_accum_f32_fuse_v_scale_fuse_v_mean_attn # fused V mean
+#### sm90
+- qk_int8_sv_f8_accum_f32_fuse_v_scale_attn


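I suggest integrating the SM90 operators first to get the flow working end to end, then carrying out validation and adding the other architectures in parallel.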
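One way to stage the rollout suggested above is a small dispatch table keyed by compute capability, with only the SM90 entry registered at first. The sketch below is illustrative only: `SAGE_KERNELS`, `FALLBACK_KERNEL`, and `select_attention_kernel` are hypothetical names, not part of FastDeploy, and the only kernel name taken from the RFC excerpt is the sm90 one.

```python
from typing import Dict, Tuple

# Hypothetical registry: compute capability (major, minor) -> kernel name.
# Only SM90 is registered at first, per the review suggestion.
SAGE_KERNELS: Dict[Tuple[int, int], str] = {
    (9, 0): "qk_int8_sv_f8_accum_f32_fuse_v_scale_attn",
}

# Hypothetical name for whatever attention path is used when no
# SageAttention kernel is registered for the device.
FALLBACK_KERNEL = "default_attention"


def select_attention_kernel(capability: Tuple[int, int]) -> str:
    """Return the registered kernel name for a device, or the fallback."""
    return SAGE_KERNELS.get(tuple(capability), FALLBACK_KERNEL)


print(select_attention_kernel((9, 0)))
print(select_attention_kernel((8, 6)))
```

Adding another architecture later is then one more entry in the table, and devices without an entry keep using the existing path in the meantime.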

chang-wenbin · 2025-10-14T04:02:25Z

rfcs/FastDeploy/20250916_FastDeploy_add_sageattention.md

+The sm86 architecture is implemented with Triton operators
+#### sm86
+- attn_qk_int8_block_varlen
+- attn_qk_int8_per_block_causal_varlen


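For LLM serving, varlen support is required to see any performance gain. For the CUDA kernels, you can refer to the operator changes in the PaddleNLP PR for validation.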
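To illustrate why varlen matters for serving, here is a minimal sketch of the usual packed layout: sequences of different lengths are concatenated and addressed through cumulative offsets (often called `cu_seqlens`) instead of being padded to the longest one. The helper names and the sample lengths are assumptions made for illustration; they are not taken from the RFC or from any FastDeploy API.

```python
from itertools import accumulate
from typing import List


def build_cu_seqlens(seq_lens: List[int]) -> List[int]:
    """Cumulative offsets: sequence i occupies rows cu[i]:cu[i + 1] of the packed batch."""
    return [0] + list(accumulate(seq_lens))


def padded_token_count(seq_lens: List[int]) -> int:
    """Tokens processed when every sequence is padded to the longest one."""
    if not seq_lens:
        raise ValueError("seq_lens must not be empty")
    return len(seq_lens) * max(seq_lens)


def padding_waste(seq_lens: List[int]) -> float:
    """Fraction of the padded batch that is padding rather than real tokens."""
    return 1.0 - sum(seq_lens) / padded_token_count(seq_lens)


# Made-up request lengths, mixing short and long prompts as a serving batch might.
seq_lens = [5, 128, 17, 2048]
print(build_cu_seqlens(seq_lens))      # [0, 5, 133, 150, 2198]
print(padded_token_count(seq_lens))    # 8192
print(round(padding_waste(seq_lens), 3))
```

In this made-up batch, padding would make the kernel process 8192 token rows for 2198 real tokens, so a kernel without varlen support spends most of its work on padding, which can cancel out the gain from quantized attention.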

Create 20250916_FastDeploy_add_sageattention.md

51a457b

luotao1 mentioned this pull request Sep 18, 2025

[Hackathon 9th] Open Source Contribution Individual Challenge PaddlePaddle/Paddle#74773

Open

paddle-bot bot added the contributor label Sep 18, 2025

luotao1 self-assigned this Sep 19, 2025

chang-wenbin reviewed Sep 23, 2025

Update 20250916_FastDeploy_add_sageattention.md

54b5645

fangfangssj requested a review from chang-wenbin September 26, 2025 08:14

chang-wenbin reviewed Oct 14, 2025

Update 20250916_FastDeploy_add_sageattention.md

25136d2

fangfangssj requested a review from chang-wenbin October 14, 2025 03:59

chang-wenbin approved these changes Oct 14, 2025
