-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Damage not being handled correctly #158
Comments
I made some progress on trying to fix this. It looks like pipewire buffers are not making their way fast enough from the sink to be filled with data. Consider this:
So, what I think is happening is that when a frame is skipped because there is no buffer available to copy to, the damage state will be cleared anyway. So that means the next time it tries to do a frame copy, it won't get the damage notification since it's already been cleared in the failed frame beforehand. There are two things that can be done:
I am investigating using flushing events in pipewire. However it seems even after pipewire fires the drained event, sometimes a buffer is not available still. In my testing though before the time the capture loop stalls due to no buffer, the bug seems to be fixed. Maybe we should try double/tripple buffering? @emersion Does it seem right that the damage state is being cleared prematurely? I just wanted to clarify so I don't go down a rabbit hole. |
@Need4Speed402 If you are using obs xdg-desktop-portal-wlr will usually settle on only one buffer, which gets blocked by obs asynchronous rendering. If you want to try tripple buffering, you can increase XDPW_PWR_BUFFERS to 3 in https://github.com/emersion/xdg-desktop-portal-wlr/blob/master/include/pipewire_screencast.h#L6. Be aware, that pipewire clients will skip to the newest buffer in the queue if multiple are available. ¹I've read you are using obs and chromium, so mostly part 2 is important for each. |
@columbarius I'm actually working on your columbarius:screencast-pw-driven branch. I cloned it hoping it would fix the problem, but it hasn't. Instead, I got a bunch of spam in the console:
So looking into that made me realize about the problem. I'm working on this branch because it seems like that's the direction we want to go in anyway. A part of me is telling me to scrap all the code. I'd love to fix things like multi monitor and sharing of a certain region going. I'm just not sure how I would coordinate with you guys. I work from home and I use screen sharing often especially on terminals, I'm highly motivated to get all this stuff working. After all, it just seems like some basic state needs to get passed around. A buffer and some metadata right? I might be flying high on mount stupid right now though. I'm still trying to understand all the code before I make a rash decision. |
I figured out the race condition and here is the patch so far. It's built on @columbarius screencast-pw-driven branch. It only handles the happy path, I basically nuked the code that used to handle Foot terminal now renders correctly through obs and chromium. In fact, there's a bonus: the cursor ran into the exact same damage tracking issue as foot. Foot just made it really obvious. Now it looks like the cursor on obs renders at 144hz and also every other client. Before I assumed that obs was just asking for 60hz. Apparently not. Now, for the question I'm having trouble answering: should I clean up the code and submit it? Or should I do a more extensive refactor? I'm leaning towards a refactor simply because errors never seemed to get handled that well anyway, I'm not sure how to augment my hacky code to get proper error handling and clean up. 2021-07-25.15-27-30.mp4 |
Thanks for testing my patch.
Yeah this is a result of driving the pipewire loop directly from the wlroots loop and not passing back and forth between the wlroots and the pipewire loop. Increasing XDPW_PWR_BUFFERS should remove most Could you please retry your tests with a higher XDPW_PWR_BUFFERS, which should remove those out of buffer events for the cost of some unused copies.
If started contributing to this project from a similar position a year ago. If you like to improve sharing of a certain region, you could take up #156 and add some way to set the region via chooser or the config file. |
It would be nice, if you could clean up your patch only containing relevant changes. That would make it easier to review it. I tried to reproduce your issue, but coudn't find any skipped frame. And the Maybe you could create a script with sth. like ydotool helping me to reproduce it? |
I run 2x [email protected] displays, I have quite an exotic setup that probably needs to push a lot of bandwidth. I'm a little surprised that my issue is not reproducible on others machines, I've been having this issue ever since I started using sway a year ago. It's not as if my computer sometimes has a happy day and decides not to do it, it was a consistent bug for me. Perhaps on a normal 1080p screen, the pipewire buffers usually make their way back to the return queue faster and so running out of pipewire buffers is rare? The bottleneck is in the capturer while in my situation its in the sink.
In my experience, it would complain about missing pipewire buffers 20-30 times until it finally was able to grab one to capture to. We're not talking about maybe setting it to 3 would completely fix the problem. The real fix needs to have explicit synchronization. In my patch, I tried to do it with flush events but that's not doing exactly what I want it to, sometimes even then it would still exhaust pipewire buffers and then start skipping frames like before. Apparently there should be a pipewire I can find a example of the process event used here for sinks: https://gitlab.freedesktop.org/pipewire/pipewire/-/blob/master/src/examples/video-play.c
I was referring to a race condition I accidentally created when trying to implement my patch. As far as I know, there is no race condition other running out of pipewire buffers and prematurely clearing the damage state.
From my understanding, it would seem easy simulate my situation by intentionally adding a delay for pipewire buffers making their way back to the capturer. I'm not sure how to make it happen externally though. Perhaps we can create a dummy client for testing? |
@Need4Speed402 could you please try if we are still skipping frames in #141? Added a loop to wait for pipewire buffers without draining the queue every time. |
2021-08-08.12-21-15.mp4Unfortunately, this doesn't fix my issue, and worse, it consumes 25% cpu time. Anyway, I wouldn't worry too much about it. I'm doing some work on my local branch where |
@Nefsen402 Do you want to test if the issue still remains? The buffercopy got reworked a lot since then. |
Unfortunately, the bug is still present. In fact, my GPU reset and had a complete freakout after a coulpe seconds of testing 🤷 |
I cannot reproduce this anymore. |
When recording the screen using pipewire and only when using certain clients, every other frame is dropped and not visible in the output. I have recorded a video with obs and the foot terminal on arch linux to demonstrate the issue. At first, I am typing
a
in a consistent rhythmic manner, but you only seeaa
pop up meaning it skipped the frame between when the twoa
's were being typed. The issue exists with both obs and webrtc screen capture on chromium.Not all clients exhibit this behavior, I have only ever seen foot consistently have this issue. Foot maybe having this issue because it only commits wl_buffers only when needed, while many other clients either commit multiple wl_buffers per commit, or just commit multiple times, but that of course is just speculation.
xdg-desktop-portal-wlr/src/screencast/wlr_screencast.c
Line 130 in 6cc3a01
Rummaging around a little bit, I have found that replacing
copy_with_damage
withcopy
, the issue seemingly goes away. Although that maybe because the render loop continuously sends frames, so maybe the bug still exists but now manifests as added latency. This also does not seem like awlroots
issue, because usingwf-recorder
will have the correct behavior even though it usescopy_with_damage
under the hood. My monitor does normally run at 144hz, so maybe it's dropping every other frame to try to reach 60hz? That issue still exists when I force the monitor to run at 60hz though.2021-07-24.17-52-42.mp4
Additional information:
wlroots/sway: 0.14.1 / 1.6.1
Pipewire: 0.3.32
used freshly compiled
xdg-desktop-portal-wlr
on commit 6cc3a01GPU: Radeon RX 6900xt
OS: Arch linux
The text was updated successfully, but these errors were encountered: