Skip to content

Conversation

@zulinx86
Copy link
Contributor

@zulinx86 zulinx86 commented Oct 28, 2025

Fixes #5322.

Changes

  • Call KVM_KVMCLOCK_CTRL not only after pause but also before snapshot resume

Reason

KVM_KVMCLOCK_CTRL ioctl sets pvclock_set_guest_stopped_request flag of kvm_vcpu_arch 1. On the next guest time update, if the flag is set, KVM ORs in PVCLOCK_GUEST_STOPPED and kvm_setup_guest_pvclock() pushes the hv_clock into the guest's pvclock page 2. If the hv_clock has not been written to the guest's pvclock page when taking a snapshot, it is not saved in the snapshot memory (i.e. PVCLOCK_GUEST_STOPPED isn't set in resumed VMs). So we should call KVM_KVMCLOCK_CTRL ioctl before resuming a VM rather than after pausing a VM. That covers both the pause-and-resume case and the restore-and-resume case.

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • I have read and understand CONTRIBUTING.md.
  • I have run tools/devtool checkbuild --all to verify that the PR passes
    build checks on all supported architectures.
  • I have run tools/devtool checkstyle to verify that the PR passes the
    automated style checks.
  • I have described what is done in these changes, why they are needed, and
    how they are solving the problem in a clear and encompassing way.
  • [ ] I have updated any relevant documentation (both in code and in the docs)
    in the PR.
  • I have mentioned all user-facing changes in CHANGELOG.md.
  • If a specific issue led to this PR, this PR closes the issue.
  • [ ] When making API changes, I have followed the
    Runbook for Firecracker API changes.
  • I have tested all new and changed functionalities in unit tests and/or
    integration tests.
  • [ ] I have linked an issue to every new TODO.

  • This functionality cannot be added in rust-vmm.

@codecov
Copy link

codecov bot commented Oct 28, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.84%. Comparing base (cc20162) to head (0894dbc).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #5494   +/-   ##
=======================================
  Coverage   82.84%   82.84%           
=======================================
  Files         269      269           
  Lines       27737    27737           
=======================================
  Hits        22978    22978           
  Misses       4759     4759           
Flag Coverage Δ
5.10-m5n.metal 83.01% <100.00%> (+<0.01%) ⬆️
5.10-m6a.metal 82.28% <100.00%> (+<0.01%) ⬆️
5.10-m6g.metal 79.66% <100.00%> (-0.02%) ⬇️
5.10-m6i.metal 83.01% <100.00%> (+0.01%) ⬆️
5.10-m7a.metal-48xl 82.27% <100.00%> (+<0.01%) ⬆️
5.10-m7g.metal 79.66% <100.00%> (-0.02%) ⬇️
5.10-m7i.metal-24xl 82.99% <100.00%> (+0.01%) ⬆️
5.10-m7i.metal-48xl 82.98% <100.00%> (+<0.01%) ⬆️
5.10-m8g.metal-24xl 79.66% <100.00%> (-0.01%) ⬇️
5.10-m8g.metal-48xl 79.66% <100.00%> (-0.02%) ⬇️
6.1-m5n.metal 83.04% <100.00%> (+<0.01%) ⬆️
6.1-m6a.metal 82.31% <100.00%> (+<0.01%) ⬆️
6.1-m6g.metal 79.66% <100.00%> (-0.01%) ⬇️
6.1-m6i.metal 83.04% <100.00%> (+<0.01%) ⬆️
6.1-m7a.metal-48xl 82.30% <100.00%> (+<0.01%) ⬆️
6.1-m7g.metal 79.66% <100.00%> (-0.01%) ⬇️
6.1-m7i.metal-24xl 83.05% <100.00%> (+<0.01%) ⬆️
6.1-m7i.metal-48xl 83.05% <100.00%> (+<0.01%) ⬆️
6.1-m8g.metal-24xl 79.66% <100.00%> (-0.01%) ⬇️
6.1-m8g.metal-48xl 79.66% <100.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@zulinx86 zulinx86 force-pushed the kvmclock_ctrl_before_resume branch from 301f879 to ff48e0e Compare October 28, 2025 13:53
@zulinx86 zulinx86 force-pushed the kvmclock_ctrl_before_resume branch from ff48e0e to 3fc6bf3 Compare October 28, 2025 14:05
@zulinx86 zulinx86 enabled auto-merge (rebase) October 28, 2025 14:06
@zulinx86 zulinx86 added the Status: Awaiting review Indicates that a pull request is ready to be reviewed label Oct 28, 2025
@zulinx86 zulinx86 force-pushed the kvmclock_ctrl_before_resume branch 2 times, most recently from 64f27a2 to 0dfdc66 Compare October 28, 2025 15:51
kalyazin
kalyazin previously approved these changes Oct 28, 2025
@Manciukic
Copy link
Contributor

To double check what's the impact on the resume path, I've kicked off a BK A/B build: https://buildkite.com/firecracker/performance-a-b-tests/builds/736

@zulinx86
Copy link
Contributor Author

To double check what's the impact on the resume path, I've kicked off a BK A/B build: https://buildkite.com/firecracker/performance-a-b-tests/builds/736

thanks!

@zulinx86 zulinx86 force-pushed the kvmclock_ctrl_before_resume branch 2 times, most recently from 07cf216 to ab0f55e Compare October 29, 2025 08:56
@zulinx86 zulinx86 changed the title fix: Move KVM_KVMCLOCK_CTRL from after pause to before resume fix: Call KVM_KVMCLOCK_CTRL not only after pause but also before snapshot resume Oct 29, 2025
KVM_KVMCLOCK_CTRL ioctl sets `pvclock_set_guest_stopped_request` flag of
`kvm_vcpu_arch` [1]. On the next guest time update, if the flag is set,
KVM ORs in `PVCLOCK_GUEST_STOPPED` and `kvm_setup_guest_pvclock()`
pushes the `hv_clock` into the guest's pvclock page [2]. If the
`hv_clock` has not been written to the guest's pvclock page when taking
a snapshot, it is not saved in the snapshot memory (i.e.
`PVCLOCK_GUEST_STOPPED` isn't set in resumed VMs). So we should call
KVM_KVMCLOCK_CTRL ioctl before resuming a VM in addition to after
pausing a VM.

[1]: https://elixir.bootlin.com/linux/v6.16.3/source/arch/x86/kvm/x86.c#L5734
[2]: https://elixir.bootlin.com/linux/v6.16.3/source/arch/x86/kvm/x86.c#L3286-L3295
Signed-off-by: Takahiro Itazuri <[email protected]>
@zulinx86 zulinx86 force-pushed the kvmclock_ctrl_before_resume branch from ab0f55e to 0894dbc Compare October 29, 2025 09:00
@zulinx86 zulinx86 requested a review from kalyazin October 29, 2025 10:51
@zulinx86 zulinx86 merged commit d33011c into firecracker-microvm:main Oct 29, 2025
8 checks passed
@zulinx86 zulinx86 deleted the kvmclock_ctrl_before_resume branch October 29, 2025 10:57
maggie-lou added a commit to buildbuddy-io/buildbuddy that referenced this pull request Oct 30, 2025
Also includes this patch:
firecracker-microvm/firecracker#5494

I'm interested in these bug fixes since v1.11.0 (which we were on
previously):
firecracker-microvm/firecracker#5122
firecracker-microvm/firecracker#5260

We should also look into enabling PCI. From their release notes:

> In our micro-benchmarks, we measured up to 50% better latency for
block and network, up to 70% better block throughput, and up to 25%
higher network throughput (results depend on instance type and kernel).

Also I had previously cherry-picked a firecracker patch that fixed some
of the rcu_sched failures. It was formally merged into the firecracker
repo
[here](firecracker-microvm/firecracker#5494) and
this final version has some implementation differences from the patch
we'd applied. I can't directly apply the updated patch on v1.11.0 due to
some merge conflicts, but it applies cleanly on top of v1.13.0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Status: Awaiting review Indicates that a pull request is ready to be reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Kernel panic when resuming from a memory snapshot

4 participants