
restore fails when bash exec present #11439

Open
josharian opened this issue Feb 5, 2025 · 3 comments
Labels
type: bug Something isn't working

Comments


josharian commented Feb 5, 2025

Description

I am experimenting with gVisor's save/restore functionality to decide whether to build atop it. I immediately hit a stumbling block: restore fails, with no apparent workaround.

If you exec bash in the container, checkpoint, and then restore, the restore fails due to host FD issues.

I would hope that restore would succeed, closing the FDs associated with the exec'd bash process that could not be restored. Or that there would be some path to bringing the container back up, even if in a slightly broken state. (Shortcomings can be documented, worked around, and built around; a simple failure to restore cannot.)

(I am also left wondering what other surprises might be in store for me, given that I hit this one almost immediately; see also #3281.)

Thanks!

Steps to reproduce

Terminal 1:

runsc --platform=systrap create x
runsc start x

Terminal 2:

runsc exec x bash

Terminal 1:

runsc checkpoint -image-path=/tmp/x x
runsc delete x
runsc --platform=systrap create x
runsc restore -image-path=/tmp/x x
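
For convenience, here is a single-script approximation of the two-terminal repro above (a sketch; the backgrounded non-interactive exec stands in for the interactive bash session and is assumed to trigger the same live-exec-session condition):

#!/bin/sh
# Repro sketch. Assumes runsc is on PATH and the current directory
# is an OCI bundle, as the commands above also assume.
set -ex

runsc --platform=systrap create x
runsc start x

# Keep an exec session alive while the checkpoint runs; this stands
# in for the interactive `runsc exec x bash` in terminal 2.
runsc exec x sleep 1000 &
sleep 1   # give the exec session time to start

runsc checkpoint -image-path=/tmp/x x
runsc delete x
runsc --platform=systrap create x
runsc restore -image-path=/tmp/x x   # expected to fail as shown below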

Either way, the restore fails as follows:

starting container: restoring container "x": failed to load kernel: error executing callbacks: no host FD available for :host:0, map: map[__no_name_0:/:3 __no_name_0:host:0:256 __no_name_0:host:1:257 __no_name_0:host:2:258]:
goroutine 23 [running]:
gvisor.dev/gvisor/pkg/state.safely.func1()
	pkg/state/state.go:309 +0x188
panic({0xc8e3e0?, 0x4000460e40?})
	bazel-out/aarch64-opt/bin/external/io_bazel_rules_go/stdlib_/src/runtime/panic.go:785 +0x124
gvisor.dev/gvisor/pkg/sentry/fsimpl/host.(*inode).afterLoad(0x40000e4608, {0xe830415b7db8?, 0x40004d9260?})
	pkg/sentry/fsimpl/host/save_restore.go:77 +0x20c
gvisor.dev/gvisor/pkg/sentry/fsimpl/host.(*inode).StateLoad.func1()
	bazel-out/aarch64-opt/bin/pkg/sentry/fsimpl/host/host_state_autogen.go:127 +0x28
gvisor.dev/gvisor/pkg/state.userCallback.callbackRun(0x40001b2268?)
	pkg/state/decode.go:48 +0x24
gvisor.dev/gvisor/pkg/state.(*decodeState).checkComplete(0x40001b21e0, 0x4000345e30)
	pkg/state/decode.go:188 +0x15c
gvisor.dev/gvisor/pkg/state.(*decodeState).Load.func3()
	pkg/state/decode.go:683 +0x88
gvisor.dev/gvisor/pkg/state.safely(0x0?)
	pkg/state/state.go:322 +0x54
gvisor.dev/gvisor/pkg/state.(*decodeState).Load(0x40001b21e0, {0xedde00?, 0x40004fe008?, 0x40004d8090?})
	pkg/state/decode.go:679 +0x3a0
gvisor.dev/gvisor/pkg/state.Load.func1()
	pkg/state/state.go:121 +0x9c
gvisor.dev/gvisor/pkg/state.safely(0x40004f4b28?)
	pkg/state/state.go:322 +0x54
gvisor.dev/gvisor/pkg/state.Load({0xe830415b7db8, 0x40004d9260}, {0x10ea3a0, 0x40002fe080}, {0xedf340, 0x40004fe008})
	pkg/state/state.go:120 +0x138
gvisor.dev/gvisor/pkg/sentry/kernel.(*Kernel).LoadFrom(0x40004fe008, {0x110d2f0, 0x40004d9260}, {0x10ea3a0, 0x40002fe080}, {0x0, 0x0}, 0x0, 0x0, 0x0, ...)
	pkg/sentry/kernel/kernel.go:806 +0x514
gvisor.dev/gvisor/pkg/sentry/state.LoadOpts.Load({{0x10ea260, 0x40002f96f0}, 0x0, 0x0, 0x0, {0x0, 0x0, 0x0}}, {0x110d2f0, 0x40004d9260}, ...)
	pkg/sentry/state/state.go:183 +0x214
gvisor.dev/gvisor/runsc/boot.(*restorer).restore(0x4000080360, 0x40000d3188)
	runsc/boot/restore.go:236 +0xa38
gvisor.dev/gvisor/runsc/boot.(*restorer).restoreContainerInfo(0x4000080360, 0x40000d3188, 0x40000d3198)
	runsc/boot/restore.go:132 +0x418
gvisor.dev/gvisor/runsc/boot.(*containerManager).Restore(0x4000020f40, 0x40004ce0e0, 0x0?)
	runsc/boot/controller.go:597 +0x8c4
reflect.Value.call({0x4000118a80?, 0x4000244710?, 0x40004f5c98?}, {0xeed790, 0x4}, {0x40004f5ef0, 0x3, 0x821988?})
	bazel-out/aarch64-opt/bin/external/io_bazel_rules_go/stdlib_/src/reflect/value.go:581 +0x97c
reflect.Value.Call({0x4000118a80?, 0x4000244710?, 0x50?}, {0x40004f5ef0?, 0x40004ce0e0?, 0x0?})
	bazel-out/aarch64-opt/bin/external/io_bazel_rules_go/stdlib_/src/reflect/value.go:365 +0x94
gvisor.dev/gvisor/pkg/urpc.(*Server).handleOne(0x400008c910, 0x4000166270)
	pkg/urpc/urpc.go:338 +0x468
gvisor.dev/gvisor/pkg/urpc.(*Server).handleRegistered(...)
	pkg/urpc/urpc.go:433
gvisor.dev/gvisor/pkg/urpc.(*Server).StartHandling.func1()
	pkg/urpc/urpc.go:453 +0x68
created by gvisor.dev/gvisor/pkg/urpc.(*Server).StartHandling in goroutine 21
	pkg/urpc/urpc.go:451 +0x74

for object host.inode{CachedMappable:kernfs.CachedMappable{mapsMu:sync.Mutex{m:sync.CrossGoroutineMutex{m:sync.Mutex{state:0, sema:0x0}}}, mappings:memmap.MappingSet{root:memmap.Mappingnode{nrSegments:0, parent:(*memmap.Mappingnode)(nil), parentIndex:0, hasChildren:false, maxGap:memmap.MappingdynamicGap{}, keys:[5]memmap.MappableRange{memmap.MappableRange{Start:0x0, End:0x0}, memmap.MappableRange{Start:0x0, End:0x0}, memmap.MappableRange{Start:0x0, End:0x0}, memmap.MappableRange{Start:0x0, End:0x0}, memmap.MappableRange{Start:0x0, End:0x0}}, values:[5]memmap.MappingsOfRange{memmap.MappingsOfRange(nil), memmap.MappingsOfRange(nil), memmap.MappingsOfRange(nil), memmap.MappingsOfRange(nil), memmap.MappingsOfRange(nil)}, children:[6]*memmap.Mappingnode{(*memmap.Mappingnode)(nil), (*memmap.Mappingnode)(nil), (*memmap.Mappingnode)(nil), (*memmap.Mappingnode)(nil), (*memmap.Mappingnode)(nil), (*memmap.Mappingnode)(nil)}}}, pf:kernfs.inodePlatformFile{NoBufferedIOFallback:memmap.NoBufferedIOFallback{}, hostFD:50, fdRefsMu:sync.Mutex{m:sync.CrossGoroutineMutex{m:sync.Mutex{state:0, sema:0x0}}}, fdRefs:fsutil.FrameRefSet{root:fsutil.FrameRefnode{nrSegments:0, parent:(*fsutil.FrameRefnode)(nil), parentIndex:0, hasChildren:false, maxGap:fsutil.FrameRefdynamicGap{}, keys:[5]memmap.FileRange{memmap.FileRange{Start:0x0, End:0x0}, memmap.FileRange{Start:0x0, End:0x0}, memmap.FileRange{Start:0x0, End:0x0}, memmap.FileRange{Start:0x0, End:0x0}, memmap.FileRange{Start:0x0, End:0x0}}, values:[5]fsutil.FrameRefSegInfo{fsutil.FrameRefSegInfo{refs:0x0, memCgID:0x0}, fsutil.FrameRefSegInfo{refs:0x0, memCgID:0x0}, fsutil.FrameRefSegInfo{refs:0x0, memCgID:0x0}, fsutil.FrameRefSegInfo{refs:0x0, memCgID:0x0}, fsutil.FrameRefSegInfo{refs:0x0, memCgID:0x0}}, children:[6]*fsutil.FrameRefnode{(*fsutil.FrameRefnode)(nil), (*fsutil.FrameRefnode)(nil), (*fsutil.FrameRefnode)(nil), (*fsutil.FrameRefnode)(nil), (*fsutil.FrameRefnode)(nil), (*fsutil.FrameRefnode)(nil)}}}, fileMapper:fsutil.HostFileMapper{refsMu:fsutil.refsMutex{mu:sync.Mutex{m:sync.CrossGoroutineMutex{m:sync.Mutex{state:0, sema:0x0}}}}, refs:map[uint64]int32(nil), mapsMu:fsutil.mapsMutex{mu:sync.Mutex{m:sync.CrossGoroutineMutex{m:sync.Mutex{state:0, sema:0x0}}}}, mappings:map[uint64]fsutil.mapping{}}, fileMapperInitOnce:sync.Once{done:atomic.Uint32{_:atomic.noCopy{}, v:0x0}, m:sync.Mutex{state:0, sema:0x0}}}}, InodeNoStatFS:kernfs.InodeNoStatFS{}, InodeAnonymous:kernfs.InodeAnonymous{}, InodeNotDirectory:kernfs.InodeNotDirectory{InodeAlwaysValid:kernfs.InodeAlwaysValid{}}, InodeNotSymlink:kernfs.InodeNotSymlink{}, InodeTemporary:kernfs.InodeTemporary{}, InodeWatches:kernfs.InodeWatches{watches:vfs.Watches{mu:sync.RWMutex{m:sync.CrossGoroutineRWMutex{w:sync.CrossGoroutineMutex{m:sync.Mutex{state:0, sema:0x0}}, writerSem:0x0, readerSem:0x0, readerCount:0, readerWait:0}}, ws:map[uint64]*vfs.Watch(nil)}}, InodeFSOwned:kernfs.InodeFSOwned{}, locks:vfs.FileLocks{bsd:lock.Locks{mu:sync.Mutex{m:sync.CrossGoroutineMutex{m:sync.Mutex{state:0, sema:0x0}}}, locks:lock.LockSet{root:lock.Locknode{nrSegments:0, parent:(*lock.Locknode)(nil), parentIndex:0, hasChildren:false, maxGap:lock.LockdynamicGap{}, keys:[5]lock.LockRange{lock.LockRange{Start:0x0, End:0x0}, lock.LockRange{Start:0x0, End:0x0}, lock.LockRange{Start:0x0, End:0x0}, lock.LockRange{Start:0x0, End:0x0}, lock.LockRange{Start:0x0, End:0x0}}, values:[5]lock.Lock{lock.Lock{Readers:map[lock.UniqueID]lock.OwnerInfo(nil), Writer:lock.UniqueID(nil), WriterInfo:lock.OwnerInfo{PID:0, OFD:false}}, 
lock.Lock{Readers:map[lock.UniqueID]lock.OwnerInfo(nil), Writer:lock.UniqueID(nil), WriterInfo:lock.OwnerInfo{PID:0, OFD:false}}, lock.Lock{Readers:map[lock.UniqueID]lock.OwnerInfo(nil), Writer:lock.UniqueID(nil), WriterInfo:lock.OwnerInfo{PID:0, OFD:false}}, lock.Lock{Readers:map[lock.UniqueID]lock.OwnerInfo(nil), Writer:lock.UniqueID(nil), WriterInfo:lock.OwnerInfo{PID:0, OFD:false}}, lock.Lock{Readers:map[lock.UniqueID]lock.OwnerInfo(nil), Writer:lock.UniqueID(nil), WriterInfo:lock.OwnerInfo{PID:0, OFD:false}}}, children:[6]*lock.Locknode{(*lock.Locknode)(nil), (*lock.Locknode)(nil), (*lock.Locknode)(nil), (*lock.Locknode)(nil), (*lock.Locknode)(nil), (*lock.Locknode)(nil)}}}, blockedQueue:waiter.Queue{list:waiter.waiterList{head:(*waiter.Entry)(nil), tail:(*waiter.Entry)(nil)}, mu:sync.RWMutex{m:sync.CrossGoroutineRWMutex{w:sync.CrossGoroutineMutex{m:sync.Mutex{state:0, sema:0x0}}, writerSem:0x0, readerSem:0x0, readerCount:0, readerWait:0}}}}, posix:lock.Locks{mu:sync.Mutex{m:sync.CrossGoroutineMutex{m:sync.Mutex{state:0, sema:0x0}}}, locks:lock.LockSet{root:lock.Locknode{nrSegments:0, parent:(*lock.Locknode)(nil), parentIndex:0, hasChildren:false, maxGap:lock.LockdynamicGap{}, keys:[5]lock.LockRange{lock.LockRange{Start:0x0, End:0x0}, lock.LockRange{Start:0x0, End:0x0}, lock.LockRange{Start:0x0, End:0x0}, lock.LockRange{Start:0x0, End:0x0}, lock.LockRange{Start:0x0, End:0x0}}, values:[5]lock.Lock{lock.Lock{Readers:map[lock.UniqueID]lock.OwnerInfo(nil), Writer:lock.UniqueID(nil), WriterInfo:lock.OwnerInfo{PID:0, OFD:false}}, lock.Lock{Readers:map[lock.UniqueID]lock.OwnerInfo(nil), Writer:lock.UniqueID(nil), WriterInfo:lock.OwnerInfo{PID:0, OFD:false}}, lock.Lock{Readers:map[lock.UniqueID]lock.OwnerInfo(nil), Writer:lock.UniqueID(nil), WriterInfo:lock.OwnerInfo{PID:0, OFD:false}}, lock.Lock{Readers:map[lock.UniqueID]lock.OwnerInfo(nil), Writer:lock.UniqueID(nil), WriterInfo:lock.OwnerInfo{PID:0, OFD:false}}, lock.Lock{Readers:map[lock.UniqueID]lock.OwnerInfo(nil), Writer:lock.UniqueID(nil), WriterInfo:lock.OwnerInfo{PID:0, OFD:false}}}, children:[6]*lock.Locknode{(*lock.Locknode)(nil), (*lock.Locknode)(nil), (*lock.Locknode)(nil), (*lock.Locknode)(nil), (*lock.Locknode)(nil), (*lock.Locknode)(nil)}}}, blockedQueue:waiter.Queue{list:waiter.waiterList{head:(*waiter.Entry)(nil), tail:(*waiter.Entry)(nil)}, mu:sync.RWMutex{m:sync.CrossGoroutineRWMutex{w:sync.CrossGoroutineMutex{m:sync.Mutex{state:0, sema:0x0}}, writerSem:0x0, readerSem:0x0, readerCount:0, readerWait:0}}}}}, inodeRefs:host.inodeRefs{refCount:atomicbitops.Int64{_:sync.NoCopy{}, value:1}}, hostFD:0, restoreKey:vfs.RestoreID{ContainerName:"", Path:"host:0"}, ino:0x4, ftype:0x2000, epollable:true, seekable:false, isTTY:true, savable:true, readonly:false, queue:waiter.Queue{list:waiter.waiterList{head:(*waiter.Entry)(nil), tail:(*waiter.Entry)(nil)}, mu:sync.RWMutex{m:sync.CrossGoroutineRWMutex{w:sync.CrossGoroutineMutex{m:sync.Mutex{state:0, sema:0x0}}, writerSem:0x0, readerSem:0x0, readerCount:0, readerWait:0}}}, virtualOwner:host.virtualOwner{enabled:true, mu:sync.Mutex{m:sync.CrossGoroutineMutex{m:sync.Mutex{state:0, sema:0x0}}}, uid:atomicbitops.Uint32{_:sync.NoCopy{}, value:0x0}, gid:atomicbitops.Uint32{_:sync.NoCopy{}, value:0x0}, mode:atomicbitops.Uint32{_:sync.NoCopy{}, value:0x2190}}, bufMu:sync.Mutex{m:sync.CrossGoroutineMutex{m:sync.Mutex{state:0, sema:0x0}}}, haveBuf:atomicbitops.Uint32{_:sync.NoCopy{}, value:0x0}, buf:[]uint8(nil)}:
goroutine 23 [running]:
gvisor.dev/gvisor/pkg/state.safely.func1()
	pkg/state/state.go:309 +0x188
panic({0xd170a0?, 0x40004d55c0?})
	bazel-out/aarch64-opt/bin/external/io_bazel_rules_go/stdlib_/src/runtime/panic.go:785 +0x124
gvisor.dev/gvisor/pkg/state.Failf(...)
	pkg/state/state.go:269
gvisor.dev/gvisor/pkg/state.(*decodeState).Load(0x40001b21e0, {0xedde00?, 0x40004fe008?, 0x40004d8090?})
	pkg/state/decode.go:694 +0x624
gvisor.dev/gvisor/pkg/state.Load.func1()
	pkg/state/state.go:121 +0x9c
gvisor.dev/gvisor/pkg/state.safely(0x40004f4b28?)
	pkg/state/state.go:322 +0x54
gvisor.dev/gvisor/pkg/state.Load({0xe830415b7db8, 0x40004d9260}, {0x10ea3a0, 0x40002fe080}, {0xedf340, 0x40004fe008})
	pkg/state/state.go:120 +0x138
gvisor.dev/gvisor/pkg/sentry/kernel.(*Kernel).LoadFrom(0x40004fe008, {0x110d2f0, 0x40004d9260}, {0x10ea3a0, 0x40002fe080}, {0x0, 0x0}, 0x0, 0x0, 0x0, ...)
	pkg/sentry/kernel/kernel.go:806 +0x514
gvisor.dev/gvisor/pkg/sentry/state.LoadOpts.Load({{0x10ea260, 0x40002f96f0}, 0x0, 0x0, 0x0, {0x0, 0x0, 0x0}}, {0x110d2f0, 0x40004d9260}, ...)
	pkg/sentry/state/state.go:183 +0x214
gvisor.dev/gvisor/runsc/boot.(*restorer).restore(0x4000080360, 0x40000d3188)
	runsc/boot/restore.go:236 +0xa38
gvisor.dev/gvisor/runsc/boot.(*restorer).restoreContainerInfo(0x4000080360, 0x40000d3188, 0x40000d3198)
	runsc/boot/restore.go:132 +0x418
gvisor.dev/gvisor/runsc/boot.(*containerManager).Restore(0x4000020f40, 0x40004ce0e0, 0x0?)
	runsc/boot/controller.go:597 +0x8c4
reflect.Value.call({0x4000118a80?, 0x4000244710?, 0x40004f5c98?}, {0xeed790, 0x4}, {0x40004f5ef0, 0x3, 0x821988?})
	bazel-out/aarch64-opt/bin/external/io_bazel_rules_go/stdlib_/src/reflect/value.go:581 +0x97c
reflect.Value.Call({0x4000118a80?, 0x4000244710?, 0x50?}, {0x40004f5ef0?, 0x40004ce0e0?, 0x0?})
	bazel-out/aarch64-opt/bin/external/io_bazel_rules_go/stdlib_/src/reflect/value.go:365 +0x94
gvisor.dev/gvisor/pkg/urpc.(*Server).handleOne(0x400008c910, 0x4000166270)
	pkg/urpc/urpc.go:338 +0x468
gvisor.dev/gvisor/pkg/urpc.(*Server).handleRegistered(...)
	pkg/urpc/urpc.go:433
gvisor.dev/gvisor/pkg/urpc.(*Server).StartHandling.func1()
	pkg/urpc/urpc.go:453 +0x68
created by gvisor.dev/gvisor/pkg/urpc.(*Server).StartHandling in goroutine 21
	pkg/urpc/urpc.go:451 +0x74

The container is a fairly vanilla Alpine image running sleep infinity.
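
For reference, a minimal sketch of how such a bundle might be set up (the ./rootfs layout and the use of runsc spec and jq are my assumptions, not taken from the report):

# Sketch: build a minimal OCI bundle that runs `sleep infinity`.
mkdir -p bundle/rootfs
# ... populate bundle/rootfs with an Alpine filesystem, e.g. by
# extracting the alpine container image ...
cd bundle
runsc spec   # generates a default config.json in the bundle
# Point the container's process at sleep infinity:
jq '.process.args = ["sleep", "infinity"]' config.json > config.json.tmp \
  && mv config.json.tmp config.json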

runsc version

# runsc -version
runsc version release-20250127.0
spec: 1.1.0-rc.1

uname

Linux lima-sketch 6.11.0-13-generic #14-Ubuntu SMP PREEMPT_DYNAMIC Sun Dec  1 00:22:04 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux
josharian added the type: bug label on Feb 5, 2025
ayushr2 (Collaborator) commented Feb 5, 2025

Yeah, checkpointing a container that has live exec sessions is not supported, since we cannot re-establish such exec sessions on restore. IIUC, your suggestion is that we silently kill the exec session on restore and clean up all its resources?
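
Until something like that exists, a workaround sketch is to end any exec sessions before checkpointing (the pkill target below assumes the only live session is the bash from the repro steps, and that busybox pkill is available in the Alpine image):

# Workaround sketch: terminate exec'd processes, then checkpoint.
runsc exec x pkill bash   # or simply exit the shell in terminal 2
runsc checkpoint -image-path=/tmp/x x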

josharian (Author) commented

> IIUC, your suggestion is that we silently kill the exec session on restore and clean up all its resources?

Yep! Maybe behind a flag.
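
Something like the following, where the flag name is purely hypothetical and only illustrates the suggested UX:

# Hypothetical UX sketch -- no such flag exists in runsc today.
runsc restore -image-path=/tmp/x -drop-exec-sessions x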

josharian (Author) commented

For even more fun:

$ runsc exec x bash
:/# dtach -n /tmp/z watch ps
:/# exit

Then checkpoint and restore. Both succeed, but the detached processes show up in ps looking like kernel processes (bracketed names) and are not actually executing.

# runsc exec x ps
PID   USER     TIME  COMMAND
    1 root      0:00 sleep infinity
    4 root      0:00 [dtach]
    5 root      0:00 [watch]
   27 root      0:00 ps
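
One way to confirm these are leftover husks rather than running processes (a diagnostic sketch; PID 4 is taken from the ps output above):

# Diagnostic sketch: a bracketed name in ps usually means an empty
# cmdline; the State line shows whether the task is actually running.
runsc exec x sh -c 'wc -c /proc/4/cmdline; grep State /proc/4/status'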
