|
| 1 | +## Asynchronous cancellation |
| 2 | + |
| 3 | +The [`pthread_cancel`](https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_cancel.html) function is defective by design and should not be used. See [How to stop Linux threads cleanly](https://mazzo.li/posts/stopping-linux-threads.html) for a tour of the problem and various better solutions. That said, some applications still (misguidedly) make use of `pthread_cancel`, and some special caveats apply when polyfilling said applications. |
| 4 | + |
| 5 | +## What does `pthread_cancel` do? |
| 6 | + |
| 7 | +Every thread has three pieces of state: |
| 8 | + |
| 9 | +|Variable name|Possible values|Initial value|Changed by| |
| 10 | +|-------------|---------------|-------------|----------| |
| 11 | +|Cancel state|ENABLE, DISABLE|ENABLE|[`pthread_setcancelstate`](https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_setcancelstate.html)| |
| 12 | +|Cancel type|DEFERRED, ASYNCHRONOUS|DEFERRED|[`pthread_setcanceltype`](https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_setcancelstate.html)| |
| 13 | +|Cancel requested|NO, YES|NO|[`pthread_cancel`](https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_cancel.html) (always changes to YES)| |
| 14 | + |
| 15 | +A thread will be terminated (possibly with cleanup functions / destructors being called) if either of the following state combinations occurs: |
| 16 | +1. _Cancel requested_ is YES, _Cancel state_ is ENABLE, _Cancel type_ is ASYNCHRONOUS. |
| 17 | +2. _Cancel requested_ is YES, _Cancel state_ is ENABLE, _Cancel type_ is DEFERRED, and the thread calls a libc function defined as a cancellation point. |
| 18 | + |
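
The deferred case (combination 2) can be observed with a small test program. The following is a minimal sketch, not taken from the polyfill itself: the names `deferred_worker` and `run_deferred_cancel_demo` and the flag variables are hypothetical, and it assumes a POSIX threads implementation such as glibc.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical flags used only to observe the worker's progress. */
static atomic_int worker_ready;
static atomic_int cancel_sent;
static atomic_int survived_while_disabled;
static atomic_int ran_past_cancellation_point;

static void *deferred_worker(void *arg)
{
    int old_state;
    (void)arg;

    /* Cancel state DISABLE: a request is recorded but not acted upon. */
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_state);
    atomic_store(&worker_ready, 1);

    /* Wait until the main thread has called pthread_cancel. */
    while (!atomic_load(&cancel_sent))
        sched_yield();

    pthread_testcancel(); /* no effect while Cancel state is DISABLE */
    atomic_store(&survived_while_disabled, 1);

    /* Cancel state ENABLE, Cancel type still DEFERRED (the initial value). */
    pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, &old_state);

    pthread_testcancel(); /* cancellation point: the thread terminates here */
    atomic_store(&ran_past_cancellation_point, 1);
    return NULL;
}

/* Returns 1 if the worker behaved as described, 0 if not, -1 on error. */
int run_deferred_cancel_demo(void)
{
    pthread_t thread;
    void *result = NULL;

    if (pthread_create(&thread, NULL, deferred_worker, NULL) != 0)
        return -1;
    while (!atomic_load(&worker_ready))
        sched_yield();

    if (pthread_cancel(thread) != 0) /* Cancel requested becomes YES */
        return -1;
    atomic_store(&cancel_sent, 1);

    if (pthread_join(thread, &result) != 0)
        return -1;
    return result == PTHREAD_CANCELED
        && atomic_load(&survived_while_disabled)
        && !atomic_load(&ran_past_cancellation_point);
}
```

A cancelled thread reports `PTHREAD_CANCELED` as its exit value to `pthread_join`, which is what the final check relies on.
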
| 19 | +Roughly speaking, every libc function that _could_ block execution for a non-trivial amount of time is defined as a cancellation point. This includes, but is not limited to, the following functions: |
| 20 | +* `accept` |
| 21 | +* `close` |
| 22 | +* `connect` |
| 23 | +* `copy_file_range` |
| 24 | +* `epoll_wait` / `epoll_pwait` / `epoll_pwait2` |
| 25 | +* `fcntl` (when called with `F_SETLKW` or `F_OFD_SETLKW`) |
| 26 | +* `fsync` / `fdatasync` / `sync_file_range` / `msync` |
| 27 | +* `getrandom` |
| 28 | +* `open` / `openat` / `open_by_handle_at` |
| 29 | +* `pause` / `sigpause` / `sigsuspend` |
| 30 | +* `pthread_testcancel` |
| 31 | +* `read` / `readv` / `pread` |
| 32 | +* `recv` / `recvfrom` / `recvmsg` / `recvmmsg` / `msgrcv` / `mq_receive` / `mq_timedreceive` |
| 33 | +* `send` / `sendto` / `sendmsg` / `sendmmsg` / `msgsnd` / `mq_send` / `mq_timedsend` |
| 34 | +* `sigwait` / `sigwaitinfo` / `sigtimedwait` |
| 35 | +* `sleep` / `usleep` / `nanosleep` / `clock_nanosleep` |
| 36 | +* `write` / `writev` / `pwrite` |
| 37 | + |
| 38 | +## glibc semantics of cancellation points |
| 39 | + |
| 40 | +Taking `epoll_pwait2` as an example, at the time of writing, the glibc implementation of `epoll_pwait2` is roughly: |
| 41 | + |
| 42 | +```c |
| 43 | +old_type = pthread_setcanceltype(ASYNCHRONOUS); |
| 44 | +result = syscall(epoll_pwait2, ...); |
| 45 | +pthread_setcanceltype(old_type); |
| 46 | +check_for_pthread_cancel_race(); |
| 47 | +return result; |
| 48 | +``` |
| 49 | +
|
| 50 | +When `pthread_cancel` is called, if the target thread has _Cancel state_ of ENABLE and _Cancel type_ of ASYNCHRONOUS, then a signal is sent to the target thread. Shortly thereafter, the signal will be received by the target thread, wherein the signal handler confirms that _Cancel state_ is still ENABLE and _Cancel type_ is still ASYNCHRONOUS. If so, the thread will be terminated. If not, the signal handler will do nothing (though delivery of the signal might cause an `EINTR` result from an unrelated syscall that was being made at the time of delivery). |
| 51 | +
|
| 52 | +There are (at least) two issues with this implementation: |
| 53 | +1. `pthread_cancel` could send the signal _before_ `pthread_setcanceltype(old_type)`, but the signal might arrive _after_ `pthread_setcanceltype(old_type)`. |
| 54 | +2. A signal (unrelated to cancellation) could be delivered during the `syscall`, and the handler for that signal could perform a `longjmp`, thereby causing `pthread_setcanceltype(old_type)` to not be called. |
| 55 | +
|
| 56 | +The 1<sup>st</sup> point is addressed by the `check_for_pthread_cancel_race` call: yet another piece of per-thread state tracks whether a `pthread_cancel` call is in the middle of sending a signal, and if so, `check_for_pthread_cancel_race` waits for the signal delivery, thereby preventing it from causing `EINTR` on an unrelated syscall.
| 57 | +
|
| 58 | +The 2<sup>nd</sup> point is unaddressed. It is tracked as part of [glibc bug 12683](https://sourceware.org/bugzilla/show_bug.cgi?id=12683), where a solution has been proposed, but not yet implemented. |
| 59 | +
|
| 60 | +## Polyfilled semantics of cancellation points |
| 61 | +
|
| 62 | +For functions that don't meaningfully block in practice, such as `open_by_handle_at` and `getrandom`, the polyfill implementation of these functions inserts a `pthread_testcancel` call, as in: |
| 63 | +```c |
| 64 | +pthread_testcancel(); |
| 65 | +return syscall(open_by_handle_at, ...); |
| 66 | +``` |
| 67 | + |
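
As a compilable illustration of this pattern, here is a minimal sketch for `getrandom`. The name `polyfill_getrandom` is hypothetical, and the sketch assumes Linux with kernel headers that define `SYS_getrandom`; it is not necessarily how the actual polyfill is written.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper: act on any pending deferred cancellation request
 * first, then issue the raw syscall, which is not itself cancellable. */
ssize_t polyfill_getrandom(void *buf, size_t len, unsigned int flags)
{
    pthread_testcancel();
    return syscall(SYS_getrandom, buf, len, flags);
}
```

With this shape, a cancellation request that arrives after `pthread_testcancel` returns is simply left pending until the next cancellation point, which is acceptable for calls that return promptly.
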
| 68 | +For functions that really can block, the polyfill implementation takes a two-pronged strategy: |
| 69 | +1. If the ambient glibc provides the function being polyfilled, call that. |
| 70 | +2. Otherwise, implement it similarly to how glibc currently does, albeit without the `check_for_pthread_cancel_race` call: |
| 71 | + |
| 72 | + ```c |
| 73 | + old_type = pthread_setcanceltype(ASYNCHRONOUS); |
| 74 | + result = syscall(epoll_pwait2, ...); |
| 75 | + pthread_setcanceltype(old_type); |
| 76 | + return result; |
| 77 | + ``` |
| 78 | +
|
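
The second strategy can be tried out in isolation. The sketch below applies the same shape to `nanosleep` rather than `epoll_pwait2`, purely so that the demonstration needs no file descriptors; `cancellable_nanosleep`, `sleeping_worker`, and `run_async_cancel_demo` are hypothetical names, and `SYS_nanosleep` is assumed to be available on the target architecture (it is on x86-64 and AArch64 Linux).

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical polyfill-style wrapper: switch to asynchronous cancellation
 * for the duration of the blocking syscall, then restore the old type. */
static long cancellable_nanosleep(const struct timespec *request)
{
    int old_type, ignored;
    long result;

    pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, &old_type);
    result = syscall(SYS_nanosleep, request, NULL);
    pthread_setcanceltype(old_type, &ignored);
    return result;
}

static void *sleeping_worker(void *arg)
{
    struct timespec thirty_seconds = { .tv_sec = 30, .tv_nsec = 0 };
    (void)arg;
    cancellable_nanosleep(&thirty_seconds);
    return NULL; /* not reached if the thread is cancelled */
}

/* Returns 1 if the worker was cancelled, 0 if not, -1 on error. */
int run_async_cancel_demo(void)
{
    pthread_t thread;
    void *result = NULL;

    if (pthread_create(&thread, NULL, sleeping_worker, NULL) != 0)
        return -1;
    if (pthread_cancel(thread) != 0)
        return -1;
    if (pthread_join(thread, &result) != 0)
        return -1;
    return result == PTHREAD_CANCELED;
}
```

The join is expected to complete long before the 30 second sleep would have elapsed, whether the request arrives before the switch to ASYNCHRONOUS or while the thread is blocked in the syscall.
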
| 79 | +If glibc one day fixes [bug 12683](https://sourceware.org/bugzilla/show_bug.cgi?id=12683), then using the ambient glibc implementation will give bug-free behaviour (when running with a sufficiently new glibc). On older glibc versions, the polyfill implementation is only marginally worse than the glibc implementation: the lack of a `check_for_pthread_cancel_race` call is unfortunate, but applications should be prepared to handle `EINTR` anyway. |
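
For reference, handling `EINTR` in application code usually amounts to a retry loop. A minimal sketch follows; the helper name `read_retrying_on_eintr` is hypothetical and the helper is not part of the polyfill.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: repeat the read for as long as it fails only
 * because a signal handler ran while the thread was blocked. */
ssize_t read_retrying_on_eintr(int fd, void *buf, size_t count)
{
    ssize_t n;

    do {
        n = read(fd, buf, count);
    } while (n == -1 && errno == EINTR);
    return n;
}
```

Whether an unconditional retry is appropriate depends on the application: code that uses signals to interrupt blocking calls on purpose would check its own stop condition before looping.
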