Skip to content

Add nonunix platform support#274

Open
ryanofsky wants to merge 19 commits into
bitcoin-core:masterfrom
ryanofsky:pr/wins
Open

Add nonunix platform support#274
ryanofsky wants to merge 19 commits into
bitcoin-core:masterfrom
ryanofsky:pr/wins

Conversation

@ryanofsky

Copy link
Copy Markdown
Collaborator

This PR implements API changes and fixes needed to allow libmultiprocess to work on nonunix platforms.

These changes were originally part of #231, which adds windows support, but were split out to allow windows and nonwindows changes to be reviewed separately.

@DrahtBot

DrahtBot commented Apr 22, 2026

Copy link
Copy Markdown

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK Sjors

If your review is incorrectly listed, please copy-paste <!--meta-tag:bot-skip--> into the comment that the bot should ignore.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #296 (ci: Bump channel to nixos-26.05 by maflcko)
  • #288 (Create support branch for CI scripts, documentation, and examples by ryanofsky)
  • #287 (Split repository into master (library source) and support (CI, docs, examples) branches. by ryanofsky)
  • #269 (proxy: add local connection limit to ListenConnections by enirox001)
  • #209 (cmake: Increase cmake policy version by ryanofsky)
  • #175 (Set cmake_minimum_required(VERSION 3.22) by maflcko)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

LLM Linter (✨ experimental)

Possible typos and grammar issues:

  • nonunix -> non-Unix [standard spelling/capitalization; the current form is a typo in the release note]

2026-06-22 21:32:45

Comment thread include/mp/util.h Outdated
Comment thread src/mp/util.cpp Outdated
Comment thread include/mp/util.h Outdated

@ryanofsky ryanofsky left a comment

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review! Will update with suggestions.

Comment thread include/mp/util.h Outdated
Comment thread src/mp/util.cpp Outdated
Comment thread include/mp/util.h Outdated
ryanofsky and others added 13 commits June 10, 2026 21:33
Add ProcessId = int type alias and apply it to WaitProcess, SpawnProcess
(pid output argument), and callers. ProcessId type will be different on
windows so this provides more portability.
Add SocketId = int and SocketError = -1 type aliases and apply SocketId
to SpawnProcess (return type and callback parameter) and callers. SocketId type
will be different on Windows, so this provides more portability.

Co-authored-by: Sjors Provoost <sjors@sprovoost.nl>
Add SpawnConnectInfo type alias to pass socket handle from parent process to
child process in more platform independent way.
gen.cpp used fork() directly via <unistd.h> to invoke the capnp compiler as a
subprocess, but fork() is not available on Windows, so shouldn't be used in
application code.

Add a StartProcess(const std::vector<std::string>& args) function to
util.h/util.cpp that spawns a process and returns its ProcessId, leaving
the caller responsible for WaitProcess. On POSIX it uses posix_spawn;
on Windows it can use CreateProcess.

Update gen.cpp to replace the inline fork/exec/wait with
mp::WaitProcess(mp::StartProcess(args)).

There a change in behavior if starting the process fails, because
KJ_FAIL_SYSCALL is now used for error reporting and the
fs::weakly_canonical call is dropped (it is probably more useful to see
literal executable name passed). Also previously, the child process
could deadlock if exec() failed because threads in the parent may have
held allocator or stdio mutex locks that are now leaked in the child.
posix_spawn() avoids this entirely by never running arbitrary code in
the child process.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Extract socket pair creation from SpawnProcess into a standalone
SocketPair() function, and use it to replace the inline socketpair()
call. This is pretty much a refactoring but technically there are two
small behavior changes:

- FD_CLOEXEC is now used to ensure sockets are not leaked if processes
  are spawned. This has no impact on SpawnProcess, but is better hygiene
  and could affect future callers of SocketPair().

- KJ_SYSCALL is now used to simplify error-handling.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…objects

Replace the m_wait_fd/m_post_fd raw int members with
m_wait_stream/m_post_stream kj::Own<kj::AsyncIoStream> and
m_post_writer kj::Own<kj::OutputStream>.

The constructor uses provider->newTwoWayPipe() instead of calling
socketpair() directly. The loop() and post() methods write through
m_post_writer instead of calling write() with a raw fd, and
EventLoopRef::reset does the same.
kj::AsyncIoStream::getFd() was added in capnproto 0.9 (commit
d27bfb8a4175b32b783de68d93dd1dbafadddea5, first released in 0.9.0). The
code now uses getFd() in proxy.cpp, so 0.7 is no longer a sufficient
minimum.

Set olddeps version to 0.9.2, which is the patched 0.9.x release for
CVE-2022-46149.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…m objects

Instead of accepting raw file descriptor integers and wrapping them
internally, ConnectStream and ServeStream now accept
kj::Own<kj::AsyncIoStream> directly. This removes the assumption that
the transport is always a local unix fd, making the API easier to adapt
to other I/O types (e.g. Windows handles).

The Stream type alias (kj::Own<kj::AsyncIoStream>) is added as a
convenience.

Callers are updated to wrap their fd with wrapSocketFd() before calling.
Flush pending Cap'n Proto release messages before closing the stream.
When one side of a socket pair closes, the other side does not receive
an onDisconnect event, so it relies on receiving release messages from
the closing side to free its ProxyServer objects and shut down cleanly.
Without this, Server objects are not freed by Cap'n Proto on
disconnection.
…ibraries

On macOS, when libcapnp is built as a dynamic library and Bitcoin Core
REDUCE_EXPORT option is used the RTTI typeinfo for kj::Exception has a
different address in libcapnp.dylib versus the calling binary. This
means catch (const kj::Exception& e) in the calling binary silently
fails to match exceptions thrown by capnp, so the DISCONNECTED exception
from shutdownWrite() propagates as a fatal uncaught exception instead of
being suppressed as intended.

This causes the Bitcoin Core macOS native CI job to fail with:
  Fatal uncaught kj::Exception: kj/async-io-unix.c++:491: disconnected:
    shutdown(fd, SHUT_WR): Socket is not connected

The fix is to use kj::runCatchingExceptions/kj::throwRecoverableException,
which use KJ's own thread-level exception interception mechanism rather
than C++ RTTI-based matching, and therefore work correctly across dynamic
library boundaries. This is the same approach used elsewhere in the
codebase (proxy.cpp EventLoop::post, type-context.h server request handler)
for the same reason.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
MSVC error when building multiprocess.vcxproj:

  mp/util.h(146,46): error C2280:
    'std::variant<T *,T>::variant(const std::variant<T *,T> &)':
    attempting to reference a deleted function [with T=mp::Lock]

The PtrOrValue constructor used a ternary expression to initialize data:

  data(ptr ? ptr : std::variant<T*, T>{std::in_place_type<T>, args...})

Both arms are prvalues of type std::variant<T*,T>, so under C++17's
mandatory copy elision no copy/move constructor should be invoked. GCC
and Clang apply this correctly. MSVC does not apply guaranteed copy
elision to ternary expressions in this context: it materializes the
temporary and then attempts to copy-construct data from it. Since
std::variant<Lock*,Lock> has a deleted copy constructor (Lock holds a
std::unique_lock which is move-only), MSVC fails.

Fix by initializing data to hold T*=ptr in the member initializer list,
then emplacing T in-place in the constructor body if ptr is null. This
avoids the ternary entirely and requires only the in-place constructor
of T, not any variant copy or move.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
MSVC warns (C4305, treated as error) about truncation from 'int' to
'const bool' when initializing static const bool members from integer
bitwise-and expressions. Use constexpr bool with explicit != 0 to
make the boolean conversion unambiguous.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@Sjors

Sjors commented Jun 17, 2026

Copy link
Copy Markdown
Member

I opened stratum-mining/sv2-tp#110 to test the changes in the Template Provider.

@Sjors Sjors left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Studied two first three commits...

Comment thread include/mp/util.h
Comment thread include/mp/util.h
Comment thread include/mp/util.h
Comment thread include/mp/util.h
Comment thread include/mp/util.h
Comment thread src/mp/util.cpp
Comment thread src/mp/util.cpp Outdated
// exec, so the fd survives into the spawned process regardless of how
// the socket was created.
int flags = fcntl(fds[0], F_GETFD);
if (flags == -1) throw std::system_error(errno, std::system_category(), "fcntl F_GETFD");

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In 24c5e57 util: Clear FD_CLOEXEC on child socket before exec: to not make things more non-fork-safe:

diff --git a/src/mp/util.cpp b/src/mp/util.cpp
index 15400215e4..b2a545ef5f 100644
--- a/src/mp/util.cpp
+++ b/src/mp/util.cpp
@@ -62,4 +62,12 @@ size_t MaxFd()
 }

+template <size_t N>
+[[noreturn]] void ChildProcessError(const char (&msg)[N])
+{
+    const ssize_t writeResult = ::write(STDERR_FILENO, msg, N - 1);
+    (void)writeResult;
+    _exit(126);
+}
+
 } // namespace

@@ -144,8 +152,5 @@ std::tuple<ProcessId, SocketId> SpawnProcess(ConnectInfoToArgsFn&& connect_info_
             throw std::system_error(errno, std::system_category(), "close");
         }
-        static constexpr char msg[] = "SpawnProcess(child): close(fds[1]) failed\n";
-        const ssize_t writeResult = ::write(STDERR_FILENO, msg, sizeof(msg) - 1);
-        (void)writeResult;
-        _exit(126);
+        ChildProcessError("SpawnProcess(child): close(fds[1]) failed\n");
     }

@@ -164,8 +169,12 @@ std::tuple<ProcessId, SocketId> SpawnProcess(ConnectInfoToArgsFn&& connect_info_
         // the socket was created.
         int flags = fcntl(fds[0], F_GETFD);
-        if (flags == -1) throw std::system_error(errno, std::system_category(), "fcntl F_GETFD");
+        if (flags == -1) {
+            ChildProcessError("SpawnProcess(child): fcntl(fds[0], F_GETFD) failed\n");
+        }
         if (flags & FD_CLOEXEC) {
             flags &= ~FD_CLOEXEC;
-            if (fcntl(fds[0], F_SETFD, flags) == -1) throw std::system_error(errno, std::system_category(), "fcntl F_SETFD");
+            if (fcntl(fds[0], F_SETFD, flags) == -1) {
+                ChildProcessError("SpawnProcess(child): fcntl(fds[0], F_SETFD) failed\n");
+            }
         }

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was there something in particular that led you this (hardening) commit?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #274 (comment)

In 24c5e57 util: Clear FD_CLOEXEC on child socket before exec: to not make things more non-fork-safe:

Good catch. An easier to way to make it safe was just to move the CLOEXEC code before the fork() call so now doing that

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #274 (comment)

Was there something in particular that led you this (hardening) commit?

I don't remember exactly. It was definitely a bug of some kind. I think an earlier version of the code was using newTwoWayPipe which wraps a socket pair and sets CLOEXEC. Also originally SpawnProcess required callers to construct the socket pair, taking the child socket as an argument instead of both sockets. But this made calling code more complicated, so now SpawnProcess returns a parent socket instead of accepting a child socket.

Now since SpawnProcess is creating the socket pair it knows whether CLOEXEC is set so this code is not necessary. But I think it's good to use CLOEXEC to prevent accidental leaks so I did keep some of this and integrated it better with the earlier commit introducing SocketPair.

@Sjors Sjors left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the rest as well, except the CMake changes in c9aa806 and 7cb83a5.

Concept ACK

Comment thread include/mp/util.h
//! errors in python unit tests.
std::string LogEscape(const kj::StringTree& string, size_t max_size);

using Stream = kj::Own<kj::AsyncIoStream>;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In 091f5e1 proxy, refactor: Change ConnectStream and ServeStream to accept stream objects: would it make sense to introduce this earlier, in 3c81cf2?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #274 (comment)

In 091f5e1 proxy, refactor: Change ConnectStream and ServeStream to accept stream objects: would it make sense to introduce this earlier, in 3c81cf2?

It could make sense but I feel it's a little clearer if the EventLoop code is using kj::Own<kj::AsyncIoStream> and kj::Own<kj::OutputStream> types directly and not tied to the mp::Stream type for external callers and meant to be more opaque.

But I did extend this commit to use Stream stream type in ConnectStream and ServeStream functions since these are external functions.

Comment thread include/mp/proxy-io.h Outdated
Comment thread src/mp/proxy.cpp Outdated
// resources and shutting down cleanly.
try {
m_stream->shutdownWrite();
} catch (const kj::Exception& e) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In bfc2db7 proxy: Call shutdownWrite() in Connection destructor: maybe squash 28e4c7f into this?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #274 (comment)

In bfc2db7 proxy: Call shutdownWrite() in Connection destructor: maybe squash 28e4c7f into this?

Didn't squash these because I think 28e4c7f is good documentation of the macos REDUCE_EXPORTS bug and will be probably be useful to reference in the future (the bug can happen very easily just by throwing and catching exceptions.)

Comment thread include/mp/util.h Outdated
ryanofsky and others added 6 commits June 21, 2026 20:23
MSVC does not correctly apply SFINAE when the substitution failure
occurs in a default function argument that uses decltype of another
function parameter (MSVC error C2039). Instead of silently excluding
the overload, MSVC instantiates the function body and reports a hard
error, e.g.:

  type-interface.h(62,56): error C2039: 'Calls': is not a member of
  'capnp::List<mp::test::messages::Pair<capnp::Text,capnp::Text>,
  capnp::Kind::STRUCT>::Builder'

The root cause is that MSVC evaluates default argument expressions
outside the SFINAE immediate context when they reference function
parameters via decltype. GCC and Clang treat this as a substitution
failure and silently exclude the overload, as the standard intends.

Replace all uses of the `* enable = nullptr` default-argument SFINAE
pattern with C++20 requires clauses, which are well-supported on all
three compilers and give cleaner constraint-violation diagnostics.
Add two concepts to util.h to reduce repetition across the type-*.h
headers:

  FieldTypeIs<T, U>   - T's .get() returns exactly type U
  InterfaceField<T>   - T's .get() returns a capnp interface type
                        (a type that exposes a nested ::Calls member)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This repo has introduced API changes to add Windows support to
libmultiprocess (HANDLE-based IPC alongside the existing fd-based IPC).
These changes require corresponding updates to Bitcoin Core, which are
pending in bitcoin/bitcoin#35084. Until that PR merges, the Bitcoin Core
CI jobs fail against master because Bitcoin Core has not yet been updated
to use the new API.

Switch the Bitcoin Core checkout in both jobs to use
refs/pull/35084/merge so CI tests against the compatible version. A
BITCOIN_CORE_REF env var is introduced at the top of the file; once
(and keep the var in place for any future API compatibility cycles).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
On MSVC, std::terminate() does not print the exception message before
calling abort()/fastfail, so exceptions thrown during mpgen execution
appear as a bare 0xC0000409 exit code with no diagnostic output. Wrap
main() in a try-catch to explicitly print the error to stderr and
return 1 instead of crashing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…bols

Use target_compile_definitions on mpgen to expose CAPNP_EXECUTABLE,
CAPNPC_CXX_EXECUTABLE (via $<TARGET_FILE:...> generator expressions on
the CapnProto::capnp_tool and CapnProto::capnpc_cpp imported targets),
and CAPNP_INCLUDE_DIRS (from the CAPNP_INCLUDE_DIRS variable set by
find_package). gen.cpp uses these directly instead of constructing paths
from capnp_PREFIX. Remove capnp_PREFIX from config.h.in as it is no
longer needed there. Add compat fallbacks in compat_config.cmake to
synthesize the tool imported targets and CAPNP_INCLUDE_DIRS from older
variables when using an older CapnProto package.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Ubuntu Noble's libcapnp-dev 1.0.1 cmake config file is installed under
/usr/lib/x86_64-linux-gnu/cmake/CapnProto/ but its _IMPORT_PREFIX
calculation goes up only 3 directory levels to /usr/lib instead of 4
levels to /usr, so IMPORTED_LOCATION for CapnProto::capnp_tool is set
to /usr/lib/bin/capnp (non-existent) rather than /usr/bin/capnp.

The previous compat_config.cmake fallback only fired when the target
didn't exist at all (NOT TARGET), so it didn't catch this case where
the target exists but has a wrong path.

Add a validation pass that iterates over both tool targets after they
are created (either by the package or by our own fallback). For each
target, check whether any IMPORTED_LOCATION (config-specific or
generic) resolves to an existing file. If none do, use find_program
(with capnp_PREFIX/bin as a hint) to locate the actual binary and
override all stored locations on that target.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@ryanofsky ryanofsky left a comment

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reviews!

Rebased 7cb83a5 -> 68ed129 (pr/wins.1 -> pr/wins.2, compare) implementing review suggestions and fixing conflict with #279

Updated 68ed129 -> a43e5a8 (pr/wins.2 -> pr/wins.3, compare) to fix environ undeclared on macOS/BSD: add explicit declaration before posix_spawn() https://github.com/bitcoin-core/libmultiprocess/actions/runs/27974097183/job/82787313836

Updated a43e5a8 -> 6cc729f (pr/wins.3 -> pr/wins.4, compare) to fix clang-tidy readability-redundant-declaration on environ declaration https://github.com/bitcoin-core/libmultiprocess/actions/runs/27978804911/job/82803323635

Comment thread include/mp/util.h Outdated
Comment thread src/mp/util.cpp Outdated
Comment thread include/mp/util.h
Comment thread include/mp/util.h
Comment thread include/mp/util.h
Comment thread src/mp/util.cpp Outdated
// exec, so the fd survives into the spawned process regardless of how
// the socket was created.
int flags = fcntl(fds[0], F_GETFD);
if (flags == -1) throw std::system_error(errno, std::system_category(), "fcntl F_GETFD");

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #274 (comment)

Was there something in particular that led you this (hardening) commit?

I don't remember exactly. It was definitely a bug of some kind. I think an earlier version of the code was using newTwoWayPipe which wraps a socket pair and sets CLOEXEC. Also originally SpawnProcess required callers to construct the socket pair, taking the child socket as an argument instead of both sockets. But this made calling code more complicated, so now SpawnProcess returns a parent socket instead of accepting a child socket.

Now since SpawnProcess is creating the socket pair it knows whether CLOEXEC is set so this code is not necessary. But I think it's good to use CLOEXEC to prevent accidental leaks so I did keep some of this and integrated it better with the earlier commit introducing SocketPair.

Comment thread include/mp/util.h
//! errors in python unit tests.
std::string LogEscape(const kj::StringTree& string, size_t max_size);

using Stream = kj::Own<kj::AsyncIoStream>;

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #274 (comment)

In 091f5e1 proxy, refactor: Change ConnectStream and ServeStream to accept stream objects: would it make sense to introduce this earlier, in 3c81cf2?

It could make sense but I feel it's a little clearer if the EventLoop code is using kj::Own<kj::AsyncIoStream> and kj::Own<kj::OutputStream> types directly and not tied to the mp::Stream type for external callers and meant to be more opaque.

But I did extend this commit to use Stream stream type in ConnectStream and ServeStream functions since these are external functions.

Comment thread include/mp/proxy-io.h Outdated
Comment thread src/mp/proxy.cpp Outdated
// resources and shutting down cleanly.
try {
m_stream->shutdownWrite();
} catch (const kj::Exception& e) {

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #274 (comment)

In bfc2db7 proxy: Call shutdownWrite() in Connection destructor: maybe squash 28e4c7f into this?

Didn't squash these because I think 28e4c7f is good documentation of the macos REDUCE_EXPORTS bug and will be probably be useful to reference in the future (the bug can happen very easily just by throwing and catching exceptions.)

Comment thread include/mp/util.h Outdated
@ryanofsky ryanofsky force-pushed the pr/wins branch 2 times, most recently from a43e5a8 to 6cc729f Compare June 22, 2026 21:32

@Sjors Sjors left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 6cc729f

I still didn't review the cmake commits though.

Comment thread src/mp/util.cpp
}
_exit(1);
ProcessId pid;
if (int err = posix_spawn(&pid, argv[0], nullptr, nullptr, argv.data(), ::environ)) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In ddd3945 util, refactor: Do not fork() and exec() separately: not posix_spawnp?

Comment thread include/mp/util.h

//! Create a socket pair that can be used to communicate within a process or
//! between parent and child processes.
std::array<SocketId, 2> SocketPair();

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In d9a5a68 util, refactor: Add SocketPair() and use it in SpawnProcess: it seems a bit odd to set FD_CLOEXEC and then unset it again for the child. The unset code also assume it was the only flag, since it sets 0. Maybe it's better to give this function a boolean argument for whether to set this flag on the child?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants