Fix chunked prefill regression #231

Merged 1 commit on Mar 26, 2025

Conversation

yuyanpeng-google
Collaborator

Fix a typo that expanded the wrong dimension.
Fix the incorrect position and true length calculation caused by the BOS token.

Tested by running the serving benchmark.
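
For readers unfamiliar with the issue, here is a minimal sketch of how a BOS token can affect positions and true lengths in chunked prefill. It is not the code from this PR: `BOS_ID`, `Chunk`, and `split_into_chunks` are hypothetical names, and it assumes BOS is prepended once before chunking so that every chunk keeps the same size.

```python
from dataclasses import dataclass
from typing import List

BOS_ID = 1  # hypothetical BOS token id, for illustration only


@dataclass
class Chunk:
    tokens: List[int]    # chunk contents, padded to chunk_size
    start_position: int  # absolute position of the chunk's first token
    true_length: int     # number of non-padding tokens in the chunk


def split_into_chunks(prompt_ids: List[int], chunk_size: int, pad_id: int = 0) -> List[Chunk]:
    """Prepend BOS once, then cut the sequence into fixed-size chunks.

    Because BOS is part of the sequence before it is split, positions and
    true lengths count it exactly once (assumption made for this sketch).
    """
    ids = [BOS_ID] + list(prompt_ids)
    chunks = []
    for start in range(0, len(ids), chunk_size):
        piece = ids[start:start + chunk_size]
        true_length = len(piece)
        padded = piece + [pad_id] * (chunk_size - true_length)
        chunks.append(Chunk(padded, start, true_length))
    return chunks


chunks = split_into_chunks([10, 11, 12, 13, 14], chunk_size=4)
for c in chunks:
    print(c.start_position, c.true_length, c.tokens)
```

With a five-token prompt and a chunk size of 4, the true lengths sum to 6 (the prompt plus one BOS), and the second chunk starts at position 4 rather than 3 or 5.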

@yuyanpeng-google yuyanpeng-google requested review from mailvijayasingh and vipannalla and removed request for vipannalla March 25, 2025 09:47
Collaborator

@vipannalla vipannalla left a comment

looks good

Fix typo: expand to the correct batch dimension
Keep the chunk size unchanged with BOS
Add chunked prefill definition to the engine API
@yuyanpeng-google yuyanpeng-google force-pushed the yuyan-fix-chunked-prefill branch from 3f9443a to 35a54b7 Compare March 26, 2025 09:38
@yuyanpeng-google
Collaborator Author

I will rewrite the chunked prefill part of the orchestrator unit test later.

@mailvijayasingh mailvijayasingh merged commit b8b9cb2 into main Mar 26, 2025
2 of 3 checks passed
@mailvijayasingh mailvijayasingh deleted the yuyan-fix-chunked-prefill branch March 26, 2025 19:13
3 participants