Skip to content

Add remote track channelCount #3025

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 17 commits into
base: main
Choose a base branch
from
Open

Conversation

henbos
Copy link
Contributor

@henbos henbos commented Nov 26, 2024

Fixes #3011


Preview | Diff

@henbos
Copy link
Contributor Author

henbos commented Dec 5, 2024

(Jan-Ivar's fiddle from meeting: https://jsfiddle.net/jib1/3Lh9zaxq/)

webrtc.html Outdated
<td>
As a setting, this is the number of channels of the latest
audio frame received, reflecting the number of
{{RTCRtpCodec/channels}} that the decoder outputted.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment from Youenn: make this more clear that you MUST do what the decoder is capable of

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also to refresh our memories about the conclusion: we have go-ahead to return 2 here and as per discussion, saying MUST here was preferred.

I updated this sentence to say "A stereo capable decoder MUST produce stereo and return the value 2.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The WPT also reflects the "always do stereo on the receiving track" decision.

Copy link
Member

@dontcallmedom dontcallmedom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix and simplify amendment description

@dontcallmedom
Copy link
Member

the CI error is something that needs to be fixed in the amendment script AFAICT, will take a look

webrtc.html Outdated
Comment on lines 16703 to 16704
{{RTCRtpCodec/channels}} that the decoder outputted. A stereo
capable decoder MUST produce stereo and return the value 2.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Define "stereo capable" ?
For opus I interpret the intention to be what RFC 6716 2.1.2 refers to as "stereo decoder". If so this seems fine, since it reflects the data represented by the track. I'm not sure that's obvious from the proposed text though. The bit with "that the decoder outputted" seems clearer, but OTOH seems to contradict "latest audio frame received" in case it doesn't match the decoder configuration.

Copy link
Contributor Author

@henbos henbos Mar 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intent is that is the browser has implemented opus stereo, then the decoder will output stereo - regardless if the signal on the wire is mono or stereo. You could in theory flip-flop between mono and stereo here, but our audio folks did not want to do this because they didn't see the use case for it and they were concerned that re-initializing the opus decoder in the middle of the stream (the use case is CSRC where the stream can switch between mono and stereo at any time and multiple times throughout the meeting).

Is this OK with you? I can clarify the text to make it more clear, I agree that the latest frame makes it sound like I'm going somewhere else

Copy link
Contributor Author

@henbos henbos Mar 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I.e. I would only expect this to change from 2 to 1 or vice versa if you switch which codec you use, but as long as you use opus you should always get 2 regardless of number of channels on the wire

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can tell of the libwebrtc opus decoder (referenced above in #3025 (comment)) it outputs what was signaled. Not sure what you mean by "implements" above but it doesn't sound like you mean "signaled" or "configured"? Are you proposing to return 2 even when negotiating stereo=0?

Copy link
Contributor Author

@henbos henbos Mar 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, you're right, that nuance got lost on me, my mistake! If we signal stereo=0, we're asking for mono, then I think it makes perfect sense that you get mono, and as per your link that's what we're already doing.

What we want is that is stereo=1 is signalled, then your track has 2 channels regardless if the stream being sent is mono or stereo.

And ultimately we want stereo=1 to be signalled by default because we want you to be able to replaceTrack between mono and stereo tracks on the sender side without having to re-negotiate and without the receiver having to have special logic to deal with this.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about "negotiated". Since the stereo param is receive-only it makes sense to me if the setting matches the stereo param in the local description. Other people know signaling better than me, to say whether those are equivalent for this case.

Otherwise LGTM

Copy link
Contributor Author

@henbos henbos Mar 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK - changed from "negotiated" to "when the local SDP was set", which is more precise. (For the infomrative "NOTE" I think the choice of language there doesn't matter)

@jan-ivar @Pehrsons I think this fully captures what we want now and includes that "MUST" language. Can I have the "Editors can integrate" back? :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After disscussions with @padenot, we've found value in returning channelCount 1 when a stereo capable decoder is receiving mono packets. It's useful because down stream websites can cut processing in half.

I'll comment on the issue in a bit (after updating a fiddle to show this)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand if you want to get it right on the first PR, but considering I'll be OOO for months and this PR currently reflects how libwebrtc-based opus decoders already work today, meaning all browsers that change the default to stereo=1 and exposed channelCount would pass these WPTs, can we leave "channelCount:1 on a stereo=1 decoder" as a separate issue and pull request?

We can already claim consensus on everyone wanting stereo opus by default so it would be good to make progress even if we don't have consensus on "mono on stereo opus" which is a new feature request. @jakobivarsson

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Alternatively we can make the language of this PR less strong, but I'd like to get this off my plate before parental leave and leave the rest to the WG)

Copy link
Member

@jan-ivar jan-ivar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, I'm recording disagreement on the proposed behavior here, so I think we need to resolve this before we can merge this PR and tests.

Sorry about your timeframe constraints, but I'm confident we can make progress on this in a much shorter timeframe.

I'm happy to commandeer the PR to help assist with that, or feel free to nominate someone else of your choosing.

Comment on lines +16747 to +16748
was set. Before any frames have been decoded, this setting
MUST NOT [=map/exist=].
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe the constrainable pattern allows absence. Is there a precedent in the spec for a setting to not exist and come into being later? I don't think we should allow that. I've filed #3045

Suggested change
was set. Before any frames have been decoded, this setting
MUST NOT [=map/exist=].
was set.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a precedent in the spec for a setting to not exist and come into being later?

Yes this is how remote track settings already work in the video frames (width, height and frameRate) so I thought we should be consistent with audio frames.

@henbos
Copy link
Contributor Author

henbos commented Mar 28, 2025

”Mono on stereo=1 decoder” is behavior I proposed to my colleagues already. My arguments in favor of this were basically:

  1. It makes sense.
  2. It makes end-to-end testing easier.

You brought up a point a about down stream processing being cut in half that I believe was not discussed at the time. Still, the current state of the PR represents what I was able to get agreement on, which was to exclude this particular use case and focus on what was already implemented (stereo=1 being always stereo) for now.

That’s why I thought it might be easier to do it in two steps since the first step (specifically talking about ”we want stereo” and ”we want channelCount) has agreement and is already implemented in libwebrtc, while the second one (mono over stereo=1) isn’t implemented by anyone yet (if I understand correctly) and may need further discussions with my colleagues.

But if your preference is to do both steps in the same PR I am happy to appoint it to you.

@henbos
Copy link
Contributor Author

henbos commented Mar 28, 2025

The way to decouple step 1 from step 2 would be to delete the sentence "The number of channels that the decoder produces MUST reflect the number of channels configured when the local SDP was set." and to delete the NOTE. Then it would be up to the implementation. But we can also wait until we know what the plan is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remote track.getSettings().channelCount
5 participants