Set _speech_start_time when VAD START_OF_SPEECH activates #5027

Open
hudson-worden wants to merge 4 commits into livekit:main from hudson-worden:fix-vad-start-time

Conversation

@hudson-worden
Contributor

@hudson-worden hudson-worden commented Mar 6, 2026

This image attempts to convey the issue. I have plotted where the ChatMessage was placed (as determined by MetricsReport; the box shows that it is attributed to the whole span) vs. where I believe it should be (on the right).

Screenshot 2026-03-06 at 10 01 29 AM

Basically, it should align with this custom span I added, which represents the VAD start of speech.

image

I believe this happens because we set _speech_start_time during INFERENCE_DONE: it is set as soon as VAD picks up speech, even though that moment doesn't match the activation. Instead, it should be reset when a START_OF_SPEECH event occurs. This would align it with the activation threshold and with a few other places (here and here).

We're not removing the INFERENCE_DONE branch because it handles the case where no START_OF_SPEECH event follows. I'm open to removing it too, though.
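To make the intended behavior concrete, here is a minimal sketch of the event handling described above. It is not the actual livekit code: `VADEvent`, `SpeechStartTracker`, the `timestamp` and `speaking` fields, and the "only set on INFERENCE_DONE if unset" rule are all stand-ins I'm assuming for illustration. The point is just that START_OF_SPEECH always overwrites whatever INFERENCE_DONE set earlier, while INFERENCE_DONE stays as a fallback.

```python
import enum
from dataclasses import dataclass
from typing import Optional


class VADEventType(enum.Enum):
    # Hypothetical stand-in for the real VAD event enum.
    START_OF_SPEECH = "start_of_speech"
    INFERENCE_DONE = "inference_done"


@dataclass
class VADEvent:
    # Hypothetical event shape; field names are assumptions.
    type: VADEventType
    timestamp: float        # time of the event, in seconds
    speaking: bool = False  # whether this inference detected speech


class SpeechStartTracker:
    """Tracks the speech start time from a stream of VAD events."""

    def __init__(self) -> None:
        self._speech_start_time: Optional[float] = None

    @property
    def speech_start_time(self) -> Optional[float]:
        return self._speech_start_time

    def on_vad_event(self, ev: VADEvent) -> None:
        if ev.type is VADEventType.START_OF_SPEECH:
            # Always overwrite: this event lines up with the activation
            # threshold, so it is the more accurate value.
            self._speech_start_time = ev.timestamp
        elif ev.type is VADEventType.INFERENCE_DONE:
            # Fallback only (assumption): keep an early estimate in case
            # no START_OF_SPEECH event arrives.
            if ev.speaking and self._speech_start_time is None:
                self._speech_start_time = ev.timestamp
```

With this shape, an INFERENCE_DONE at t=1.0 followed by a START_OF_SPEECH at t=1.3 leaves the start time at 1.3, and later INFERENCE_DONE events don't move it.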

…overwrite the time set by INFERENCE_DONE and should b/c it's more accurate.
devin-ai-integration[bot]

This comment was marked as resolved.

@hudson-worden
Contributor Author

This is the related issue thread.
