@@ -19,6 +19,7 @@ jailer --id <id> \
  [--chroot-base-dir <chroot_base>]
  [--netns <netns>]
  [--daemonize]
+ [--new-pid-ns]
  [--...extra arguments for Firecracker]
```
@@ -45,6 +46,10 @@ jailer --id <id> \
  jailer will use this to join the associated network namespace.
  - When present, the `--daemonize` flag causes the jailer to call `setsid()` and
  redirect all three standard I/O file descriptors to `/dev/null`.
+ - When present, the `--new-pid-ns` flag causes the jailer to `fork()` and then
+ exec the provided binary into a new PID namespace. As a result, the jailer and
+ the process running the exec file have different PIDs. The PID of the child
+ process is stored in the jail root directory inside `<exec_file_name>.pid`.
  - The jailer adheres to the "end of command options" convention, meaning
  all parameters specified after `--` are forwarded to Firecracker. For
  example, this can be paired with the `--config-file` Firecracker argument to
@@ -98,6 +103,13 @@ After starting, the Jailer goes through the following operations:
  namespace.
  - If `--daemonize` is specified, call `setsid()` and redirect `STDIN`,
  `STDOUT`, and `STDERR` to `/dev/null`.
+ - If `--new-pid-ns` is specified, call `unshare()` into a new PID namespace.
+ This will not have any effect on the current process, but its first
+ child will assume the role of init(1) in the new namespace. Next, the
+ jailer is duplicated by a `fork()` call, so that the child process
+ belongs to the previously created PID namespace. The parent will store
+ the child's PID inside `<exec_file_name>.pid`, while the child drops privileges
+ and `exec()`s into the `<exec_file_name>`, as described below.
  - Drop privileges via setting the provided `uid` and `gid`.
  - Exec into `<exec_file_name> --id=<id>
  --start-time-us=<opaque> --start-time-cpu-us=<opaque>` (and also forward
@@ -224,11 +236,10 @@ Note: default value for `<api-sock>` is `/run/firecracker.socket`.
  this involves registering handlers with the cgroup `notify_on_release`
  mechanism, while being wary about potential race conditions (the instance
  crashing before the subscription process is complete, for example).
- - For extra resilience, the jailer expects to be spawned by the user in a new
- PID namespace, most likely via a combination of `clone()` with the
- `CLONE_NEWPID` flag and `exec()`. A process must be created in a new PID
- namespace in order to become a pseudo-init process, and the other option is
- to use a `clone()` in the jailer, which seems unnecessary.
+ - For extra resilience, the `--new-pid-ns` flag enables the Jailer to exec the
+ binary file in a new PID namespace, in order to become a pseudo-init process.
+ Alternatively, the user can spawn the jailer in a new PID namespace via a
+ combination of `clone()` with the `CLONE_NEWPID` flag and `exec()`.
  - When running with `--daemonize`, the jailer will fail to start if it's a
  process group leader, because `setsid()` returns an error in this case.
  Spawning the jailer via `clone()` and `exec()` also ensures it cannot be a
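The `--new-pid-ns` sequence described in this change (unshare, fork, write the PID file in the parent, exec in the child) can be sketched roughly as follows. This is a minimal Python illustration, not the jailer's actual Rust implementation: the function name `spawn_with_pid_file` and its parameters are hypothetical, the privilege drop is only marked by a comment, and the `os.unshare` path assumes Python 3.12+ on Linux with `CAP_SYS_ADMIN`.

```python
import os
import sys


def spawn_with_pid_file(argv, pid_file, new_pid_ns=False):
    """Fork, record the child's PID in pid_file, and exec argv in the child.

    Hypothetical helper for illustration only. Returns the child's PID as
    seen from the parent's PID namespace.
    """
    if new_pid_ns:
        # Assumption: Python 3.12+ on Linux, running with CAP_SYS_ADMIN.
        # The caller stays in its current PID namespace; only children
        # forked afterwards land in the new one, the first of them as PID 1.
        os.unshare(os.CLONE_NEWPID)

    pid = os.fork()
    if pid == 0:
        # Child: a real jailer would drop privileges (gid, then uid) here.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed; never return into the caller.
            os._exit(127)

    # Parent: persist the child's PID so the process can be located later.
    with open(pid_file, "w") as f:
        f.write(str(pid))
    return pid
```

Calling `spawn_with_pid_file([sys.executable, "-c", "pass"], "child.pid")` followed by `os.waitpid` on the returned PID exercises the fork, PID file, and exec steps without privileges. Passing `new_pid_ns=True` additionally places the child in a fresh PID namespace, where it sees itself as PID 1 while the PID file still holds the value visible from the parent's namespace.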