Add `device` as parameter to TP and rotary_embedding functions
#11888
+31
−25
This PR adds `device` as a parameter to the TP and rotary_embedding functions.

For the TP functions, including `get_tensor_model_parallel_world_size`, `get_tensor_model_parallel_rank`, and `get_tp_group`, we add `device` as a parameter to distinguish the behavior between CPU and GPU.

For rotary_embedding, `device="cuda"` was hard-coded. We add a `device` parameter and set the device of the tensors accordingly.
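
To illustrate the TP part of the change, here is a minimal sketch of how a `device` argument could select between CPU and GPU behavior. It is not the code from this PR: the `TPGroup` class, the `_TP_GROUPS` registry, the `"cuda"` default, and the world-size and rank values are all assumptions made up for the example. Only the three function names come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TPGroup:
    """Hypothetical stand-in for a tensor-parallel process group."""
    world_size: int
    rank: int


# Assumption: one group per device type. A real project would build its groups
# during distributed initialization; these values are invented for the demo.
_TP_GROUPS = {
    "cuda": TPGroup(world_size=4, rank=1),
    "cpu": TPGroup(world_size=1, rank=0),
}


def get_tp_group(device: str = "cuda") -> TPGroup:
    """Return the group registered for the given device type."""
    device_type = device.split(":")[0]  # "cuda:0" -> "cuda"
    try:
        return _TP_GROUPS[device_type]
    except KeyError:
        raise ValueError(f"no TP group registered for device {device!r}") from None


def get_tensor_model_parallel_world_size(device: str = "cuda") -> int:
    """World size of the TP group that serves `device`."""
    return get_tp_group(device).world_size


def get_tensor_model_parallel_rank(device: str = "cuda") -> int:
    """Rank of this process within the TP group that serves `device`."""
    return get_tp_group(device).rank
```

With this shape, a caller running on CPU passes `device="cpu"` explicitly, while existing GPU callers that omit the argument keep their previous behavior through the default.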