@GoyoUijin

If the client does not explicitly pass a request_id, the Triton server sets request_id=''.
All such requests then share the empty string as their key in CosyVoice2Model.hift_cache_dict, so different threads may write to the same entry simultaneously, and cache data from different requests gets mixed up.
I observed this in practice when serving CosyVoice2 via Triton Server: the output audio from different requests was mixed together.
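
The collision can be illustrated with a minimal sketch. It assumes the cache behaves like a plain dict keyed by request_id; the `process_chunk` helper and the list-valued cache entries are hypothetical stand-ins, not the actual CosyVoice2Model code. Interleaving is simulated sequentially so the result is deterministic.

```python
# Hypothetical stand-in for CosyVoice2Model.hift_cache_dict:
# a per-request cache keyed by request_id.
hift_cache_dict = {}


def process_chunk(request_id, chunk):
    """Append a chunk to the cache entry for request_id and return a copy of it."""
    cache = hift_cache_dict.setdefault(request_id, [])
    cache.append(chunk)
    return list(cache)


# Two different requests (A and B) both arrive with request_id == ''
# and their chunks interleave, so they share one cache entry.
process_chunk("", "A1")
process_chunk("", "B1")
mixed = process_chunk("", "A2")  # request A now sees B's data in its cache

# With distinct request ids, each request keeps its own entry.
hift_cache_dict.clear()
process_chunk("req-a", "A1")
process_chunk("req-b", "B1")
isolated = process_chunk("req-a", "A2")
```

Here `mixed` contains a chunk from request B, while `isolated` holds only request A's chunks.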

Example

On the client side:

# request_id is not explicitly specified
sync_triton_client.async_stream_infer(
    "cosyvoice2",
    inputs,
    outputs=outputs,
    enable_empty_final_response=True,
)

On the server side:

    def execute(self, requests):
        for request in requests:
            request_id = request.request_id()  # This value is ''

Therefore, I propose generating a unique UUID on the server side instead of relying on the client to provide a request_id.

- All token2wav requests within a single cosyvoice2 request must share the same request_id, so a new request_id is generated only when one does not already exist, and that same request_id is then passed along consistently.
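
A minimal sketch of this "generate only if missing" rule is shown below. The helper name `ensure_request_id` is hypothetical and is not taken from the actual change; it only assumes that an empty string means no request_id was supplied.

```python
import uuid


def ensure_request_id(request_id: str) -> str:
    """Return request_id unchanged if it is non-empty, else a fresh UUID string.

    Hypothetical helper: the name and signature are illustrative only.
    """
    return request_id if request_id else str(uuid.uuid4())


# The cosyvoice2 request arrives without an id, so one is generated once.
parent_id = ensure_request_id("")

# Each downstream token2wav call receives the existing id and reuses it,
# so every chunk of this request maps to the same cache key.
child_ids = [ensure_request_id(parent_id) for _ in range(3)]
```

Because the id is created once at the top-level request and only passed through afterwards, two concurrent requests without a client-supplied request_id end up with different cache keys.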