Skip to content

Commit 65f3d09

Browse files
author
Akilesh Kailash
committed
init: Wait for snapuserd before starting second stage
This is a race between init process and bionic libc initialization of snapuserd. init->fork() ----------------> SecondStageMain() -> PropertyInit() | | v execveat ---> __libc_init_common() -> __system_properties_init() (snapuserd) When init process calls PropertyInit(), /dev/__properties__ directory is created. When bionic libc of snapuserd daemon invokes __system_properties_init _after_ init process PropertyInit() function is invoked, libc will try to initialize the property by reading /system/etc/selinux/plat_property_contexts. Since any reads on /system has to be served by snapuserd, this specific read from libc cannot be serviced leading to deadlock. Reproduce the race by inducing a sleep of 1500ms just before execveat() so that init process calls PropertyInit() before bionic libc initialization. This leads to deadlock immediately and with additional kernel instrumentation with debug logs confirms the failure: ====================================================== init: Relaunched snapuserd with pid: 428 ext4_file_open: SNAPUSERD: path /system/etc/selinux/plat_property_contexts - Pid: 428 comm 8 ext4_file_read_iter: SNAPUSERD for path: /system/etc/selinux/plat_property_contexts pid: 428 comm 8 [ 25.418043][ T428] ext4_file_read_iter+0x3dc/0x3e0 [ 25.423000][ T428] vfs_read+0x2e0/0x354 [ 25.426986][ T428] ksys_read+0x7c/0xec [ 25.430894][ T428] __arm64_sys_read+0x20/0x30 [ 25.435419][ T428] el0_svc_common.llvm.17612735770287389485+0xd0/0x1e0 [ 25.442095][ T428] do_el0_svc+0x28/0xa0 [ 25.446100][ T428] el0_svc+0x14/0x24 [ 25.449825][ T428] el0_sync_handler+0x88/0xec [ 25.454343][ T428] el0_sync+0x1c0/0x200 ===================================================== Fix: Before starting init second stage, we will wait for snapuserd daemon to be up and running. We do a simple probe by reading system partition. This read will eventually be serviced by daemon confirming that daemon is up and running. Furthermore, we are still in the kernel domain and sepolicy has not been enforced yet. Thus, access to these device mapper block devices are ok even though we may see audit logs. Note that daemon will re-initialize the __system_property_init() as part of WaitForSocket() call. This is subtle but important; since bionic libc initialized had failed silently, it is important that this re-initialization is done. Bug: 207298357 Test: Induce the failure by explicitly delaying the call of execveat(). With fix, no issues observed. Tested incremental OTA on pixel ~15 times. Ignore-AOSP-First: cherry-pick from AOSP Signed-off-by: Akilesh Kailash <[email protected]> Change-Id: I86c2de977de052bfe9dcdc002dcbd9026601d0f3
1 parent 8d3065d commit 65f3d09

File tree

2 files changed

+59
-0
lines changed

2 files changed

+59
-0
lines changed

init/snapuserd_transition.cpp

Lines changed: 58 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -32,6 +32,7 @@
3232
#include <android-base/strings.h>
3333
#include <android-base/unique_fd.h>
3434
#include <cutils/sockets.h>
35+
#include <fs_avb/fs_avb.h>
3536
#include <libsnapshot/snapshot.h>
3637
#include <libsnapshot/snapuserd_client.h>
3738
#include <private/android_filesystem_config.h>
@@ -227,6 +228,56 @@ void SnapuserdSelinuxHelper::FinishTransition() {
227228
}
228229
}
229230

231+
/*
232+
* Before starting init second stage, we will wait
233+
* for snapuserd daemon to be up and running; bionic libc
234+
* may read /system/etc/selinux/plat_property_contexts file
235+
* before invoking main() function. This will happen if
236+
* init initializes property during second stage. Any access
237+
* to /system without snapuserd daemon will lead to a deadlock.
238+
*
239+
* Thus, we do a simple probe by reading system partition. This
240+
* read will eventually be serviced by daemon confirming that
241+
* daemon is up and running. Furthermore, we are still in the kernel
242+
* domain and sepolicy has not been enforced yet. Thus, access
243+
* to these device mapper block devices are ok even though
244+
* we may see audit logs.
245+
*/
246+
bool SnapuserdSelinuxHelper::TestSnapuserdIsReady() {
247+
std::string dev = "/dev/block/mapper/system"s + fs_mgr_get_slot_suffix();
248+
android::base::unique_fd fd(open(dev.c_str(), O_RDONLY | O_DIRECT));
249+
if (fd < 0) {
250+
PLOG(ERROR) << "open " << dev << " failed";
251+
return false;
252+
}
253+
254+
void* addr;
255+
ssize_t page_size = getpagesize();
256+
if (posix_memalign(&addr, page_size, page_size) < 0) {
257+
PLOG(ERROR) << "posix_memalign with page size " << page_size;
258+
return false;
259+
}
260+
261+
std::unique_ptr<void, decltype(&::free)> buffer(addr, ::free);
262+
263+
int iter = 0;
264+
while (iter < 10) {
265+
ssize_t n = TEMP_FAILURE_RETRY(pread(fd.get(), buffer.get(), page_size, 0));
266+
if (n < 0) {
267+
// Wait for sometime before retry
268+
std::this_thread::sleep_for(100ms);
269+
} else if (n == page_size) {
270+
return true;
271+
} else {
272+
LOG(ERROR) << "pread returned: " << n << " from: " << dev << " expected: " << page_size;
273+
}
274+
275+
iter += 1;
276+
}
277+
278+
return false;
279+
}
280+
230281
void SnapuserdSelinuxHelper::RelaunchFirstStageSnapuserd() {
231282
auto fd = GetRamdiskSnapuserdFd();
232283
if (!fd) {
@@ -248,6 +299,13 @@ void SnapuserdSelinuxHelper::RelaunchFirstStageSnapuserd() {
248299
setenv(kSnapuserdFirstStagePidVar, std::to_string(pid).c_str(), 1);
249300

250301
LOG(INFO) << "Relaunched snapuserd with pid: " << pid;
302+
303+
if (!TestSnapuserdIsReady()) {
304+
PLOG(FATAL) << "snapuserd daemon failed to launch";
305+
} else {
306+
LOG(INFO) << "snapuserd daemon is up and running";
307+
}
308+
251309
return;
252310
}
253311

init/snapuserd_transition.h

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -51,6 +51,7 @@ class SnapuserdSelinuxHelper final {
5151
private:
5252
void RelaunchFirstStageSnapuserd();
5353
void ExecSnapuserd();
54+
bool TestSnapuserdIsReady();
5455

5556
std::unique_ptr<SnapshotManager> sm_;
5657
BlockDevInitializer block_dev_init_;

0 commit comments

Comments
 (0)