Realtime error: Could not get pidns: Could not fstatat ns/pid: Not a directory #1653

Open
andrew-sayers opened this issue Feb 26, 2025 · 3 comments

@andrew-sayers
Contributor

andrew-sayers commented Feb 26, 2025

Operating System

Debian

XDG Desktop Portal version

Git

XDG Desktop Portal version (Other)

No response

Desktop Environment

Cinnamon / MATE / Xfce

Desktop Environment (Other)

No response

Expected Behavior

Playing a YouTube video should not cause any error messages.

Current Behavior

The following message appears in my systemd log:

Realtime error: Could not get pidns: Could not fstatat ns/pid: Not a directory

It is often, but not always, accompanied by messages from pipewire. These usually include at least one warning that mentions "xrun" and sometimes include an error that says "snd_pcm_mmap_commit error: Broken pipe". They are not included here because this problem can happen without them.

Steps to Reproduce

  1. run sudo journalctl -f in a terminal
  2. open Firefox
  3. play any YouTube video
  4. observe the output of sudo journalctl -f

(this probably isn't specific to YouTube or even Firefox, I just happened to trigger the issue with them)

Anything else we should know?

The message is directly generated by xdp_pidfd_get_namespace() in xdp-utils.c, called from map_pid() in realtime.c.

I'm not familiar with pidfds and namespaces, but the function seems to be passed an anonymous pidfd while expecting an fd for a directory like /proc/$pid/task/$pid. Might it be getting called incorrectly?

Note: PR #1655 improved the error message so I could debug the issue, but doesn't fix anything.

@kotauskas

> It is often, but not always, accompanied by messages from pipewire. These usually include at least one warning that mentions "xrun"

FYI: this is because PipeWire's libpipewire-module-rt module relies on RTKit or the Realtime portal to set real-time priority for audio playback threads. On my Arch system, setting rtportal.enabled = false doesn't seem to help, which suggests that the Realtime portal may be intercepting RTKit calls for one reason or another. With no real-time priority, audio threads are prone to being preempted or not scheduled at inopportune times, leading to periodic xruns.
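
A rough way to check whether a process's threads actually received real-time scheduling, assuming a Linux `/proc` layout as described in proc(5), where fields 40 and 41 of each thread's `stat` file are the real-time priority and the scheduling policy. The helper names `thread_policies` and `has_realtime_thread` are hypothetical, and this is only a diagnostic sketch, not something PipeWire or the portal provides.

```python
import os

# Linux scheduling policy numbers, as listed in sched(7).
POLICY_NAMES = {
    0: "SCHED_OTHER",
    1: "SCHED_FIFO",
    2: "SCHED_RR",
    3: "SCHED_BATCH",
    5: "SCHED_IDLE",
    6: "SCHED_DEADLINE",
}
REALTIME_POLICIES = {"SCHED_FIFO", "SCHED_RR"}


def thread_policies(pid: int) -> dict:
    """Map each thread ID of `pid` to (policy name, real-time priority)."""
    policies = {}
    for entry in os.listdir(f"/proc/{pid}/task"):
        try:
            with open(f"/proc/{pid}/task/{entry}/stat") as fh:
                stat = fh.read()
        except (FileNotFoundError, ProcessLookupError):
            continue  # the thread exited while we were scanning
        # Field 2 (comm) may contain spaces or parentheses, so split after
        # the last ")". The remaining list starts at field 3 (state).
        fields = stat.rsplit(")", 1)[1].split()
        rt_priority = int(fields[37])  # field 40
        policy = int(fields[38])       # field 41
        name = POLICY_NAMES.get(policy, f"unknown ({policy})")
        policies[int(entry)] = (name, rt_priority)
    return policies


def has_realtime_thread(pid: int) -> bool:
    """True if any thread of `pid` runs under SCHED_FIFO or SCHED_RR."""
    return any(name in REALTIME_POLICIES for name, _ in thread_policies(pid).values())


if __name__ == "__main__":
    for tid, (name, prio) in sorted(thread_policies(os.getpid()).items()):
        print(tid, name, prio)
```

Calling `has_realtime_thread` with PipeWire's PID would show whether any of its threads ended up with SCHED_FIFO or SCHED_RR, or whether they were all left on the default policy.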

My current workaround is systemctl --user stop xdg-desktop-portal && systemctl --user mask xdg-desktop-portal. Note that this obviously breaks everything that depends on the Desktop Portal.

I'm running kernel version 6.6.52-rt43-arch1-2-rt-lts, and current Debian probably has an "old" kernel as well. My working theory at the moment is that a bleeding-edge kernel version added the ability to traverse pidfds as if they were /proc FDs, and the current PID namespace traversal code was written with an accidental dependency on that new behavior. The date that code was written suggests this bug lay dormant until very recently and was exposed by an RTKit-to-Realtime-portal D-Bus redirect, or whatever it is that makes PipeWire contact the Realtime portal with rtportal.enabled = false.

It's also possible that the code wrongly assumed /proc FDs and pidfds to be the same thing, when /proc FDs really provide a superset of pidfd functionality. That would create a dependency on /proc FDs, which then broke when whatever code produces those FDs changed.

Note to maintainers: pidfds have ioctls made specifically for this purpose, and switching to them will definitely fix this bug.

@andrew-sayers
Contributor Author

> I'm running kernel version 6.6.52-rt43-arch1-2-rt-lts, and current Debian probably has an "old" kernel as well. One working theory I have at the moment is that a bleeding-edge kernel version added the ability to traverse pidfds as if they were /proc FDs, and the current PID namespace traversal code was written with an accidental dependency on that new behavior.

For the record, I'm getting this on Debian unstable running 6.12:

$ uname -a
Linux andrews-2024-laptop 6.12.16-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.12.16-1 (2025-02-22) x86_64 GNU/Linux

@kotauskas

Ah, then it's probably pidfd misuse.


2 participants