-
@qianqing13579 I'm writing up a document about the current design of dynamic shapes in MIGX, which I will put on this repository's Wiki. We can then discuss the plan going forward.
-
Here is the Wiki page: https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/wiki/Dynamic-Shapes-Design
-
The current dynamic batch solution requires multiple copies on the GPU, which is not feasible for LLMs. Let's forget about this solution: it is designed for ASICs that compile the model onto the chip in advance, in which case the batch size must be a fixed number. GPUs naturally support dynamic batches: an activation can be allocated when it is produced and freed when its data-dependency count drops to zero. On a general-purpose ASIC such as the Graphcore IPU, memory has to be reserved at compile time (like stack memory), whereas a GPU can request memory only when it is needed, dynamically from an HBM memory pool. Hence the solution should move to dynamic memory allocation for activations. Note that when the batch size is unknown, you cannot perfectly infer shapes for an arbitrary model containing Reshape and Resize ops (this has been proved).
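To make the "allocate when produced, free when the dependency count goes to zero" idea concrete, here is a minimal sketch in plain Python. It is not MIGraphX code: `run_graph`, `NODES`, and the node format are hypothetical names made up for illustration, and Python lists stand in for device buffers that a real runtime would take from a memory pool.

```python
from collections import defaultdict


def run_graph(nodes, inputs, outputs):
    """Run a graph with reference-counted activation lifetimes.

    nodes   : list of (name, fn, [input names]) in topological order
              (hypothetical format, not a MIGraphX structure)
    inputs  : dict mapping input name -> value
    outputs : set of names that must survive until the end
    Returns (results, peak), where peak is the largest number of
    buffers that were alive at the same time.
    """
    # Count how many times each value is consumed.
    remaining = defaultdict(int)
    for _, _, deps in nodes:
        for dep in deps:
            remaining[dep] += 1

    live = dict(inputs)
    peak = len(live)
    for name, fn, deps in nodes:
        # "Allocate" the activation at the moment it is produced.
        live[name] = fn(*(live[dep] for dep in deps))
        peak = max(peak, len(live))
        # Release every input whose last consumer has now run.
        for dep in deps:
            remaining[dep] -= 1
            if remaining[dep] == 0 and dep not in outputs:
                del live[dep]
    return {name: live[name] for name in outputs}, peak


# Toy graph: a = 2 * x, b = a + 1, c = a + b.
# No batch size is fixed anywhere in the graph description.
NODES = [
    ("a", lambda x: [2 * v for v in x], ["x"]),
    ("b", lambda a: [v + 1 for v in a], ["a"]),
    ("c", lambda a, b: [p + q for p, q in zip(a, b)], ["a", "b"]),
]

if __name__ == "__main__":
    for batch in ([1, 2, 3], [10]):
        result, peak = run_graph(NODES, {"x": batch}, {"c"})
        print(len(batch), result["c"], peak)
```

The same graph runs for any batch size because buffer sizes are only decided when each node executes, and `x` is released as soon as `a` has been computed, so nothing has to be sized or duplicated per batch size ahead of time.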
-
Discussion of the current design for dynamic batch support and how it might be extended for full dynamic shape support.