[CIQ 6.12.y] Merge v6.12.17 & config Updates #194

PlaidCat · 2025-04-03T19:16:43Z

This is just a base update with refresh of the configs along with a merge rather than rebase ... its not how we usually operate but it might be the only way to work on this.

git checkout -b {jmaple}_merge_test_ciq-6.12.y
git merge v6.12.17
../kernel-src-tree-tools/lt_rebase.sh
git status
git diff

Last Commit is the changes CIQ has made otherwise everything else merged:
9117443

[CIQ] v6.12.17 config updates
All configs dropped the follwoing since its dependent on ARCH_MVEBU
which is not configured on so there is no reason to ask.
 # CONFIG_CZNIC_PLATFORMS is not set
See upstream commit: https://github.com/ctrliq/kernel-src-tree/commit/dd0f05b98925111f4530d7dab774398cdb32e9e3

x86_64 configs also dropped a previously defined y config
 -CONFIG_IMX_SCMI_MISC_DRV=y
 This comes from firmware: imx: IMX_SCMI_MISC_DRV should depend on ARCH_MXC
See Upstream Commit: https://github.com/ctrliq/kernel-src-tree/commit/be6686b823b30a69b1f71bde228ce042c78a1941
Whats a little confusing is that the fedora kernel-ark says that this is
marked as a `y`
$ cat redhat/configs/rhel/generic/CONFIG_IMX_SCMI_MISC_DRV
CONFIG_IMX_SCMI_MISC_DRV=y

[kernel-ark]$ ls redhat/configs/kernel-6.13.8-x86_64*
redhat/configs/kernel-6.13.8-x86_64-automotive.config
redhat/configs/kernel-6.13.8-x86_64.config
redhat/configs/kernel-6.13.8-x86_64-rt.config
redhat/configs/kernel-6.13.8-x86_64-automotive-debug.config
redhat/configs/kernel-6.13.8-x86_64-debug.config
redhat/configs/kernel-6.13.8-x86_64-rt-debug.config

[kernel-ark]$ grep CONFIG_IMX_SCMI_MISC_DRV redhat/configs/kernel-6.13.8-x86_64*
[kernel-ark]$

Do to this we're leaving this as the default Kconfig of off for x86_64

Build

Will be done by PR Builders

Testing

No Testing for this PR, we're still in development and this merge had no conflicts from the last PR we merged.

PROT_MTE (memory tagging extensions) is not supported on all user mmap() types for various reasons (memory attributes, backing storage, CoW handling). The arm64 arch_validate_flags() function checks whether the VM_MTE_ALLOWED flag has been set for a vma during mmap(), usually by arch_calc_vm_flag_bits(). Linux prior to 6.13 does not support PROT_MTE hugetlb mappings. This was added by commit 25c17c4 ("hugetlb: arm64: add mte support"). However, earlier kernels inadvertently set VM_MTE_ALLOWED on (MAP_ANONYMOUS | MAP_HUGETLB) mappings by only checking for MAP_ANONYMOUS. Explicitly check MAP_HUGETLB in arch_calc_vm_flag_bits() and avoid setting VM_MTE_ALLOWED for such mappings. Fixes: 9f34193 ("arm64: mte: Add PROT_MTE support to mmap() and mprotect()") Cc: <[email protected]> # 5.10.x-6.12.x Reported-by: Naresh Kamboju <[email protected]> Signed-off-by: Catalin Marinas <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit dddcb19 ] When we introduce xe_syncs, we don't wait for internal OA programming batches to complete. That is, xe_syncs are signaled asynchronously. In anticipation for this, separate out batch submission from waiting for completion of those batches. v2: Change return type of xe_oa_submit_bb to "struct dma_fence *" (Matt B) v3: Retain init "int err = 0;" in xe_oa_submit_bb (Jose) Reviewed-by: Jonathan Cavitt <[email protected]> Signed-off-by: Ashutosh Dixit <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Stable-dep-of: f0ed398 ("xe/oa: Fix query mode of operation for OAR/OAC") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit c8507a2 ] Now that we have laid the groundwork, introduce OA sync properties in the uapi and parse the input xe_sync array as is done elsewhere in the driver. Also add DRM_XE_OA_CAPS_SYNCS bit in OA capabilities for userspace. v2: Fix and document DRM_XE_SYNC_TYPE_USER_FENCE for OA (Matt B) Add DRM_XE_OA_CAPS_SYNCS bit to OA capabilities (Jose) Acked-by: José Roberto de Souza <[email protected]> Reviewed-by: Jonathan Cavitt <[email protected]> Signed-off-by: Ashutosh Dixit <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Stable-dep-of: f0ed398 ("xe/oa: Fix query mode of operation for OAR/OAC") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 2fb4350 ] Add input fence dependencies which will make OA configuration wait till these dependencies are met (till input fences signal). v2: Change add_deps arg to xe_oa_submit_bb from bool to enum (Matt Brost) Reviewed-by: Jonathan Cavitt <[email protected]> Signed-off-by: Ashutosh Dixit <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Stable-dep-of: f0ed398 ("xe/oa: Fix query mode of operation for OAR/OAC") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 5503983 ] This is a set of squashed commits to facilitate smooth applying to stable. Each commit message is retained for reference. 1) Allow a GGTT mapped batch to be submitted to user exec queue For a OA use case, one of the HW registers needs to be modified by submitting an MI_LOAD_REGISTER_IMM command to the users exec queue, so that the register is modified in the user's hardware context. In order to do this a batch that is mapped in GGTT, needs to be submitted to the user exec queue. Since all user submissions use q->vm and hence PPGTT, add some plumbing to enable submission of batches mapped in GGTT. v2: ggtt is zero-initialized, so no need to set it false (Matt Brost) 2) xe/oa: Use MI_LOAD_REGISTER_IMMEDIATE to enable OAR/OAC To enable OAR/OAC, a bit in RING_CONTEXT_CONTROL needs to be set. Setting this bit cause the context image size to change and if not done correct, can cause undesired hangs. Current code uses a separate exec_queue to modify this bit and is error-prone. As per HW recommendation, submit MI_LOAD_REGISTER_IMM to the target hardware context to modify the relevant bit. In v2 version, an attempt to submit everything to the user-queue was made, but it failed the unprivileged-single-ctx-counters test. It appears that the OACTXCONTROL must be modified from a remote context. In v3 version, all context specific register configurations were moved to use LOAD_REGISTER_IMMEDIATE and that seems to work well. This is a cleaner way, since we can now submit all configuration to user exec_queue and the fence handling is simplified. v2: (Matt) - set job->ggtt to true if create job is successful - unlock vm on job error (Ashutosh) - don't wait on job submission - use kernel exec queue where possible v3: (Ashutosh) - Fix checkpatch issues - Remove extra spaces/new-lines - Add Fixes: and Cc: tags - Reset context control bit when OA stream is closed - Submit all config via MI_LOAD_REGISTER_IMMEDIATE (Umesh) - Update commit message for v3 experiment - Squash patches for easier port to stable v4: (Ashutosh) - No need to pass q to xe_oa_submit_bb - Do not support exec queues with width > 1 - Fix disabling of CTX_CTRL_OAC_CONTEXT_ENABLE v5: (Ashutosh) - Drop reg_lri related comments - Use XE_OA_SUBMIT_NO_DEPS in xe_oa_load_with_lri Fixes: 8135f1c ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close") Signed-off-by: Umesh Nerlige Ramappa <[email protected]> Reviewed-by: Matthew Brost <[email protected]> # commit 1 Reviewed-by: Ashutosh Dixit <[email protected]> Cc: [email protected] Reviewed-by: Jonathan Cavitt <[email protected]> Signed-off-by: Ashutosh Dixit <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Stable-dep-of: f0ed398 ("xe/oa: Fix query mode of operation for OAR/OAC") Signed-off-by: Sasha Levin <[email protected]>

…page_io() [ Upstream commit 928b4de ] The function extent_writepage_io() will submit the dirty sectors inside the page for the write. But recently to co-operate with the incoming subpage compression enhancement, a new bitmap is introduced to btrfs_bio_ctrl::submit_bitmap, to only avoid a subset of the dirty range. This is because we can have the following cases with 64K page size: 0 16K 32K 48K 64K | |/////////| |/| 52K For range [16K, 32K), we queue the dirty range for compression, which is ran in a delayed workqueue. Then for range [48K, 52K), we go through the regular submission path. In that case, our btrfs_bio_ctrl::submit_bitmap will exclude the range [16K, 32K). The dirty flags for the range [16K, 32K) is only cleared when the compression is done, by the extent_clear_unlock_delalloc() call inside submit_one_async_extent(). This patch fix the false alert by removing the btrfs_folio_assert_not_dirty() check, since it's no longer correct for subpage compression cases. Signed-off-by: Qu Wenruo <[email protected]> Signed-off-by: David Sterba <[email protected]> Stable-dep-of: 8bf334b ("btrfs: fix double accounting race when extent_writepage_io() failed") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 2bca8eb ] Currently for subpage (sector size < page size) cases, we reuse subpage locked bitmap to find out all delalloc ranges we have locked, and run all those found ranges. However such reuse is not perfect, e.g.: 0 32K 64K 96K 128K | |////////||///////| |////| 120K For above range, writepage_delalloc() for page 0 will handle the range [32K, 96k), note delalloc range can be beyond the page boundary. But writepage_delalloc() for page 64K will only handle range [120K, 128K), as the previous run on page 0 has already handled range [64K, 96K). Meanwhile for the writeback we should expect range [64K, 96K) to also be locked, this leads to the mismatch from locked bitmap and delalloc range. This is not causing problems yet, but it's still an inconsistent behavior. So instead of relying on the subpage locked bitmap, move the delalloc range search using local @delalloc_bitmap, so that we can remove the existing btrfs_folio_find_writer_locked(). Signed-off-by: Qu Wenruo <[email protected]> Signed-off-by: David Sterba <[email protected]> Stable-dep-of: 8bf334b ("btrfs: fix double accounting race when extent_writepage_io() failed") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit c96d0e3 ] Currently we only mark sectors as locked if there is a *NEW* delalloc range for it. But NEW delalloc range is not the same as dirty sectors we want to submit, e.g: 0 32K 64K 96K 128K | |////////||///////| |////| 120K For above 64K page size case, writepage_delalloc() for page 0 will find and lock the delalloc range [32K, 96K), which is beyond the page boundary. Then when writepage_delalloc() is called for the page 64K, since [64K, 96K) is already locked, only [120K, 128K) will be locked. This means, although range [64K, 96K) is dirty and will be submitted later by extent_writepage_io(), it will not be marked as locked. This is fine for now, as we call btrfs_folio_end_writer_lock_bitmap() to free every non-compressed sector, and compression is only allowed for full page range. But this is not safe for future sector perfect compression support, as this can lead to double folio unlock: Thread A | Thread B ---------------------------------------+-------------------------------- | submit_one_async_extent() | |- extent_clear_unlock_delalloc() extent_writepage() | |- btrfs_folio_end_writer_lock() |- btrfs_folio_end_writer_lock_bitmap()| |- btrfs_subpage_end_and_test_writer() | | | |- atomic_sub_and_test() | | | /* Now the atomic value is 0 */ |- if (atomic_read() == 0) | | |- folio_unlock() | |- folio_unlock() The root cause is the above range [64K, 96K) is dirtied and should also be locked but it isn't. So to make everything more consistent and prepare for the incoming sector perfect compression, mark all dirty sectors as locked. Signed-off-by: Qu Wenruo <[email protected]> Signed-off-by: David Sterba <[email protected]> Stable-dep-of: 8bf334b ("btrfs: fix double accounting race when extent_writepage_io() failed") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 8511074 ] This function is not really suitable to lock a folio, as it lacks the proper mapping checks, thus the locked folio may not even belong to btrfs. And due to the above reason, the last user inside lock_delalloc_folios() is already removed, and we can remove this function. Signed-off-by: Qu Wenruo <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]> Stable-dep-of: 8bf334b ("btrfs: fix double accounting race when extent_writepage_io() failed") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 336e69f ] Since commit d7172f5 ("btrfs: use per-buffer locking for extent_buffer reading"), metadata read no longer relies on the subpage reader locking. This means we do not need to maintain a different metadata/data split for locking, so we can convert the existing reader lock users by: - add_ra_bio_pages() Convert to btrfs_folio_set_writer_lock() - end_folio_read() Convert to btrfs_folio_end_writer_lock() - begin_folio_read() Convert to btrfs_folio_set_writer_lock() - folio_range_has_eb() Remove the subpage->readers checks, since it is always 0. - Remove btrfs_subpage_start_reader() and btrfs_subpage_end_reader() Signed-off-by: Qu Wenruo <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]> Stable-dep-of: 8bf334b ("btrfs: fix double accounting race when extent_writepage_io() failed") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 0f71202 ] Since there is no user of reader locks, rename the writer locks into a more generic name, by removing the "_writer" part from the name. And also rename btrfs_subpage::writer into btrfs_subpage::locked. Signed-off-by: Qu Wenruo <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]> Stable-dep-of: 8bf334b ("btrfs: fix double accounting race when extent_writepage_io() failed") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 011a9a1 ] As extent_writepage() is internal helper we should use our inode type, so change it from struct inode. Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Anand Jain <[email protected]> Signed-off-by: David Sterba <[email protected]> Stable-dep-of: 8bf334b ("btrfs: fix double accounting race when extent_writepage_io() failed") Signed-off-by: Sasha Levin <[email protected]>

@null

[ Upstream commit 72dad8e ] [BUG] When running btrfs with block size (4K) smaller than page size (64K, aarch64), there is a very high chance to crash the kernel at generic/750, with the following messages: (before the call traces, there are 3 extra debug messages added) BTRFS warning (device dm-3): read-write for sector size 4096 with page size 65536 is experimental BTRFS info (device dm-3): checking UUID tree hrtimer: interrupt took 5451385 ns BTRFS error (device dm-3): cow_file_range failed, root=4957 inode=257 start=1605632 len=69632: -28 BTRFS error (device dm-3): run_delalloc_nocow failed, root=4957 inode=257 start=1605632 len=69632: -28 BTRFS error (device dm-3): failed to run delalloc range, root=4957 ino=257 folio=1572864 submit_bitmap=8-15 start=1605632 len=69632: -28 ------------[ cut here ]------------ WARNING: CPU: 2 PID: 3020984 at ordered-data.c:360 can_finish_ordered_extent+0x370/0x3b8 [btrfs] CPU: 2 UID: 0 PID: 3020984 Comm: kworker/u24:1 Tainted: G OE 6.13.0-rc1-custom+ #89 Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022 Workqueue: events_unbound btrfs_async_reclaim_data_space [btrfs] pc : can_finish_ordered_extent+0x370/0x3b8 [btrfs] lr : can_finish_ordered_extent+0x1ec/0x3b8 [btrfs] Call trace: can_finish_ordered_extent+0x370/0x3b8 [btrfs] (P) can_finish_ordered_extent+0x1ec/0x3b8 [btrfs] (L) btrfs_mark_ordered_io_finished+0x130/0x2b8 [btrfs] extent_writepage+0x10c/0x3b8 [btrfs] extent_write_cache_pages+0x21c/0x4e8 [btrfs] btrfs_writepages+0x94/0x160 [btrfs] do_writepages+0x74/0x190 filemap_fdatawrite_wbc+0x74/0xa0 start_delalloc_inodes+0x17c/0x3b0 [btrfs] btrfs_start_delalloc_roots+0x17c/0x288 [btrfs] shrink_delalloc+0x11c/0x280 [btrfs] flush_space+0x288/0x328 [btrfs] btrfs_async_reclaim_data_space+0x180/0x228 [btrfs] process_one_work+0x228/0x680 worker_thread+0x1bc/0x360 kthread+0x100/0x118 ret_from_fork+0x10/0x20 ---[ end trace 0000000000000000 ]--- BTRFS critical (device dm-3): bad ordered extent accounting, root=4957 ino=257 OE offset=1605632 OE len=16384 to_dec=16384 left=0 BTRFS critical (device dm-3): bad ordered extent accounting, root=4957 ino=257 OE offset=1622016 OE len=12288 to_dec=12288 left=0 Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008 BTRFS critical (device dm-3): bad ordered extent accounting, root=4957 ino=257 OE offset=1634304 OE len=8192 to_dec=4096 left=0 CPU: 1 UID: 0 PID: 3286940 Comm: kworker/u24:3 Tainted: G W OE 6.13.0-rc1-custom+ #89 Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022 Workqueue: btrfs_work_helper [btrfs] (btrfs-endio-write) pstate: 404000c5 (nZcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : process_one_work+0x110/0x680 lr : worker_thread+0x1bc/0x360 Call trace: process_one_work+0x110/0x680 (P) worker_thread+0x1bc/0x360 (L) worker_thread+0x1bc/0x360 kthread+0x100/0x118 ret_from_fork+0x10/0x20 Code: f84086a1 f9000fe1 53041c21 b9003361 (f9400661) ---[ end trace 0000000000000000 ]--- Kernel panic - not syncing: Oops: Fatal exception SMP: stopping secondary CPUs SMP: failed to stop secondary CPUs 2-3 Dumping ftrace buffer: (ftrace buffer empty) Kernel Offset: 0x275bb9540000 from 0xffff800080000000 PHYS_OFFSET: 0xffff8fbba0000000 CPU features: 0x100,00000070,00801250,8201720b [CAUSE] The above warning is triggered immediately after the delalloc range failure, this happens in the following sequence: - Range [1568K, 1636K) is dirty 1536K 1568K 1600K 1636K 1664K | |/////////|////////| | Where 1536K, 1600K and 1664K are page boundaries (64K page size) - Enter extent_writepage() for page 1536K - Enter run_delalloc_nocow() with locked page 1536K and range [1568K, 1636K) This is due to the inode having preallocated extents. - Enter cow_file_range() with locked page 1536K and range [1568K, 1636K) - btrfs_reserve_extent() only reserved two extents The main loop of cow_file_range() only reserved two data extents, Now we have: 1536K 1568K 1600K 1636K 1664K | |<-->|<--->|/|///////| | 1584K 1596K Range [1568K, 1596K) has an ordered extent reserved. - btrfs_reserve_extent() failed inside cow_file_range() for file offset 1596K This is already a bug in our space reservation code, but for now let's focus on the error handling path. Now cow_file_range() returned -ENOSPC. - btrfs_run_delalloc_range() do error cleanup <<< ROOT CAUSE Call btrfs_cleanup_ordered_extents() with locked folio 1536K and range [1568K, 1636K) Function btrfs_cleanup_ordered_extents() normally needs to skip the ranges inside the folio, as it will normally be cleaned up by extent_writepage(). Such split error handling is already problematic in the first place. What's worse is the folio range skipping itself, which is not taking subpage cases into consideration at all, it will only skip the range if the page start >= the range start. In our case, the page start < the range start, since for subpage cases we can have delalloc ranges inside the folio but not covering the folio. So it doesn't skip the page range at all. This means all the ordered extents, both [1568K, 1584K) and [1584K, 1596K) will be marked as IOERR. And these two ordered extents have no more pending ios, they are marked finished, and *QUEUED* to be deleted from the io tree. - extent_writepage() do error cleanup Call btrfs_mark_ordered_io_finished() for the range [1536K, 1600K). Although ranges [1568K, 1584K) and [1584K, 1596K) are finished, the deletion from io tree is async, it may or may not happen at this time. If the ranges have not yet been removed, we will do double cleaning on those ranges, triggering the above ordered extent warnings. In theory there are other bugs, like the cleanup in extent_writepage() can cause double accounting on ranges that are submitted asynchronously (compression for example). But that's much harder to trigger because normally we do not mix regular and compression delalloc ranges. [FIX] The folio range split is already buggy and not subpage compatible, it was introduced a long time ago where subpage support was not even considered. So instead of splitting the ordered extents cleanup into the folio range and out of folio range, do all the cleanup inside writepage_delalloc(). - Pass @null as locked_folio for btrfs_cleanup_ordered_extents() in btrfs_run_delalloc_range() - Skip the btrfs_cleanup_ordered_extents() if writepage_delalloc() failed So all ordered extents are only cleaned up by btrfs_run_delalloc_range(). - Handle the ranges that already have ordered extents allocated If part of the folio already has ordered extent allocated, and btrfs_run_delalloc_range() failed, we also need to cleanup that range. Now we have a concentrated error handling for ordered extents during btrfs_run_delalloc_range(). Fixes: d1051d6 ("btrfs: Fix error handling in btrfs_cleanup_ordered_extents") CC: [email protected] # 5.15+ Reviewed-by: Boris Burkov <[email protected]> Signed-off-by: Qu Wenruo <[email protected]> Signed-off-by: David Sterba <[email protected]> Stable-dep-of: 8bf334b ("btrfs: fix double accounting race when extent_writepage_io() failed") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 8bf334b ] [BUG] If submit_one_sector() failed inside extent_writepage_io() for sector size < page size cases (e.g. 4K sector size and 64K page size), then we can hit double ordered extent accounting error. This should be very rare, as submit_one_sector() only fails when we failed to grab the extent map, and such extent map should exist inside the memory and has been pinned. [CAUSE] For example we have the following folio layout: 0 4K 32K 48K 60K 64K |//| |//////| |///| Where |///| is the dirty range we need to writeback. The 3 different dirty ranges are submitted for regular COW. Now we hit the following sequence: - submit_one_sector() returned 0 for [0, 4K) - submit_one_sector() returned 0 for [32K, 48K) - submit_one_sector() returned error for [60K, 64K) - btrfs_mark_ordered_io_finished() called for the whole folio This will mark the following ranges as finished: * [0, 4K) * [32K, 48K) Both ranges have their IO already submitted, this cleanup will lead to double accounting. * [60K, 64K) That's the correct cleanup. The only good news is, this error is only theoretical, as the target extent map is always pinned, thus we should directly grab it from memory, other than reading it from the disk. [FIX] Instead of calling btrfs_mark_ordered_io_finished() for the whole folio range, which can touch ranges we should not touch, instead move the error handling inside extent_writepage_io(). So that we can cleanup exact sectors that ought to be submitted but failed. This provides much more accurate cleanup, avoiding the double accounting. CC: [email protected] # 5.15+ Signed-off-by: Qu Wenruo <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit d91060e ] Access KVM's emulated APIC base MSR value directly instead of bouncing through a helper, as there is no reason to add a layer of indirection, and there are other MSRs with a "set" but no "get", e.g. EFER. No functional change intended. Reviewed-by: Kai Huang <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Stable-dep-of: 04bc93c ("KVM: nVMX: Defer SVI update to vmcs01 on EOI when L2 is active w/o VID") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit adfec1f ] Inline kvm_get_apic_mode() in lapic.h to avoid a CALL+RET as well as an export. The underlying kvm_apic_mode() helper is public information, i.e. there is no state/information that needs to be hidden from vendor modules. No functional change intended. Reviewed-by: Kai Huang <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Stable-dep-of: 04bc93c ("KVM: nVMX: Defer SVI update to vmcs01 on EOI when L2 is active w/o VID") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 04bc93c ] If KVM emulates an EOI for L1's virtual APIC while L2 is active, defer updating GUEST_INTERUPT_STATUS.SVI, i.e. the VMCS's cache of the highest in-service IRQ, until L1 is active, as vmcs01, not vmcs02, needs to track vISR. The missed SVI update for vmcs01 can result in L1 interrupts being incorrectly blocked, e.g. if there is a pending interrupt with lower priority than the interrupt that was EOI'd. This bug only affects use cases where L1's vAPIC is effectively passed through to L2, e.g. in a pKVM scenario where L2 is L1's depriveleged host, as KVM will only emulate an EOI for L1's vAPIC if Virtual Interrupt Delivery (VID) is disabled in vmc12, and L1 isn't intercepting L2 accesses to its (virtual) APIC page (or if x2APIC is enabled, the EOI MSR). WARN() if KVM updates L1's ISR while L2 is active with VID enabled, as an EOI from L2 is supposed to affect L2's vAPIC, but still defer the update, to try to keep L1 alive. Specifically, KVM forwards all APICv-related VM-Exits to L1 via nested_vmx_l1_wants_exit(): case EXIT_REASON_APIC_ACCESS: case EXIT_REASON_APIC_WRITE: case EXIT_REASON_EOI_INDUCED: /* * The controls for "virtualize APIC accesses," "APIC- * register virtualization," and "virtual-interrupt * delivery" only come from vmcs12. */ return true; Fixes: c7c9c56 ("x86, apicv: add virtual interrupt delivery support") Cc: [email protected] Link: https://lore.kernel.org/kvm/[email protected] Reported-by: Markku Ahvenjärvi <[email protected]> Closes: https://lore.kernel.org/all/[email protected] Cc: Janne Karhunen <[email protected]> Signed-off-by: Chao Gao <[email protected]> [sean: drop request, handle in VMX, write changelog] Tested-by: Chao Gao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit b042004 ] [Why] For Header related changes for core [How] Refactoring if and endif statements to enable DC_LOGGER Reviewed-by: Mounika Adhuri <[email protected]> Reviewed-by: Alvin Lee <[email protected]> Signed-off-by: Lohita Mudimela <[email protected]> Signed-off-by: Tom Chung <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Stable-dep-of: f88192d ("drm/amd/display: Correct register address in dcn35") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit a1fc283 ] [why] hw register offset delta Reviewed-by: Martin Leung <[email protected]> Signed-off-by: Charlene Liu <[email protected]> Signed-off-by: Aurabindo Pillai <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Stable-dep-of: f88192d ("drm/amd/display: Correct register address in dcn35") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit f88192d ] [Why] the offset address of mmCLK5_spll_field_8 was incorrect for dcn35 which causes SSC not to be enabled. Reviewed-by: Charlene Liu <[email protected]> Signed-off-by: Lo-An Chen <[email protected]> Signed-off-by: Zaeem Mohamed <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected] Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit a4c5a46 ] Different connectivity boards may be attached to the same platform. For example, QCA6698-based boards can support either a two-antenna or three-antenna solution, both of which work on the sa8775p-ride platform. Due to differences in connectivity boards and variations in RF performance from different foundries, different NVM configurations are used based on the board ID. Therefore, in the firmware-name property, if the NVM file has an extension, the NVM file will be used. Otherwise, the system will first try the .bNN (board ID) file, and if that fails, it will fall back to the .bin file. Possible configurations: firmware-name = "QCA6698/hpnv21"; firmware-name = "QCA6698/hpnv21.bin"; Signed-off-by: Cheng Jiang <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]> Stable-dep-of: a2fad24 ("Bluetooth: qca: Fix poor RF performance for WCN6855") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit a2fad24 ] For WCN6855, board ID specific NVM needs to be downloaded once board ID is available, but the default NVM is always downloaded currently. The wrong NVM causes poor RF performance, and effects user experience for several types of laptop with WCN6855 on the market. Fix by downloading board ID specific NVM if board ID is available. Fixes: 095327f ("Bluetooth: hci_qca: Add support for QTI Bluetooth chip wcn6855") Cc: [email protected] # 6.4 Signed-off-by: Zijun Hu <[email protected]> Tested-by: Johan Hovold <[email protected]> Reviewed-by: Johan Hovold <[email protected]> Tested-by: Steev Klimaszewski <[email protected]> #Thinkpad X13s Signed-off-by: Luiz Augusto von Dentz <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

…ports [ Upstream commit 0e45a09 ] serio_pause_rx() and serio_continue_rx() are usually used together to temporarily stop receiving interrupts/data for a given serio port. Define "serio_pause_rx" guard for this so that the port is always resumed once critical section is over. Example: scoped_guard(serio_pause_rx, elo->serio) { elo->expected_packet = toupper(packet[0]); init_completion(&elo->cmd_done); } Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Dmitry Torokhov <[email protected]> Stable-dep-of: 08bd5b7 ("Input: synaptics - fix crash when enabling pass-through port") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 08bd5b7 ] When enabling a pass-through port an interrupt might come before psmouse driver binds to the pass-through port. However synaptics sub-driver tries to access psmouse instance presumably associated with the pass-through port to figure out if only 1 byte of response or entire protocol packet needs to be forwarded to the pass-through port and may crash if psmouse instance has not been attached to the port yet. Fix the crash by introducing open() and close() methods for the port and check if the port is open before trying to access psmouse instance. Because psmouse calls serio_open() only after attaching psmouse instance to serio port instance this prevents the potential crash. Reported-by: Takashi Iwai <[email protected]> Fixes: 100e169 ("Input: libps2 - attach ps2dev instances as serio port's drvdata") Link: https://bugzilla.suse.com/show_bug.cgi?id=1219522 Cc: [email protected] Reviewed-by: Takashi Iwai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Dmitry Torokhov <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 541011d ] The stop trigger invokes rz_ssi_stop() and rz_ssi_stream_quit(). - The purpose of rz_ssi_stop() is to disable TX/RX, terminate DMA transactions, and set the controller to idle. - The purpose of rz_ssi_stream_quit() is to reset the substream-specific software data by setting strm->running and strm->substream appropriately. The function rz_ssi_is_stream_running() checks if both strm->substream and strm->running are valid and returns true if so. Its implementation is as follows: static inline bool rz_ssi_is_stream_running(struct rz_ssi_stream *strm) { return strm->substream && strm->running; } When the controller is configured in full-duplex mode (with both playback and capture active), the rz_ssi_stop() function does not modify the controller settings when called for the first substream in the full-duplex setup. Instead, it simply sets strm->running = 0 and returns if the companion substream is still running. The following code illustrates this: static int rz_ssi_stop(struct rz_ssi_priv *ssi, struct rz_ssi_stream *strm) { strm->running = 0; if (rz_ssi_is_stream_running(&ssi->playback) || rz_ssi_is_stream_running(&ssi->capture)) return 0; // ... } The controller settings, along with the DMA termination (for the last stopped substream), are only applied when the last substream in the full-duplex setup is stopped. While applying the controller settings only when the last substream stops is not problematic, terminating the DMA operations for only one substream causes failures when starting and stopping full-duplex operations multiple times in a loop. To address this issue, call dmaengine_terminate_async() for both substreams involved in the full-duplex setup when the last substream in the setup is stopped. Fixes: 4f8cd05 ("ASoC: sh: rz-ssi: Add full duplex support") Cc: [email protected] Reviewed-by: Biju Das <[email protected]> Signed-off-by: Claudiu Beznea <[email protected]> Reviewed-by: Geert Uytterhoeven <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 82a0a3e ] My static checker rule complains about this code. The concern is that if "sample_space" is negative then the "sample_space >= runtime->channels" condition will not work as intended because it will be type promoted to a high unsigned int value. strm->fifo_sample_size is SSI_FIFO_DEPTH (32). The SSIFSR_TDC_MASK is 0x3f. Without any further context it does seem like a reasonable warning and it can't hurt to add a check for negatives. Cc: [email protected] Fixes: 03e786b ("ASoC: sh: Add RZ/G2L SSIF-2 driver") Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Geert Uytterhoeven <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit d9d959c ] In order to remove the deprecated function pcim_iomap_regions_request_all(), a few drivers need an interface to request all BARs a PCI device offers. Make pcim_request_all_regions() a public interface. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Philipp Stanner <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Reviewed-by: Ilpo Järvinen <[email protected]> Stable-dep-of: d555ed4 ("PCI: Restore original INTX_DISABLE bit by pcim_intx()") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit f546e80 ] pci_intx() is a hybrid function which sometimes performs devres operations, depending on whether pcim_enable_device() has been used to enable the pci_dev. This sometimes-managed nature of the function is problematic. Notably, it causes the function to allocate under some circumstances which makes it unusable from interrupt context. Export pcim_intx() (which is always managed) and rename __pcim_intx() (which is never managed) to pci_intx_unmanaged() and export it as well. Then all callers of pci_intx() can be ported to the version they need, depending whether they use pci_enable_device() or pcim_enable_device(). Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Philipp Stanner <[email protected]> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Stable-dep-of: d555ed4 ("PCI: Restore original INTX_DISABLE bit by pcim_intx()") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit dfa2f4d ] pci_intx() is a hybrid function which can sometimes be managed through devres. This hybrid nature is undesirable. Since all users of pci_intx() have by now been ported either to always-managed pcim_intx() or never-managed pci_intx_unmanaged(), the devres functionality can be removed from pci_intx(). Consequently, pci_intx_unmanaged() is now redundant, because pci_intx() itself is now unmanaged. Remove the devres functionality from pci_intx(). Have all users of pci_intx_unmanaged() call pci_intx(). Remove pci_intx_unmanaged(). Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Philipp Stanner <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Acked-by: Paolo Abeni <[email protected]> Stable-dep-of: d555ed4 ("PCI: Restore original INTX_DISABLE bit by pcim_intx()") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit d555ed4 ] pcim_intx() tries to restore the INTx bit at removal via devres, but there is a chance that it restores a wrong value. Because the value to be restored is blindly assumed to be the negative of the enable argument, when a driver calls pcim_intx() unnecessarily for the already enabled state, it'll restore to the disabled state in turn. That is, the function assumes the case like: // INTx == 1 pcim_intx(pdev, 0); // old INTx value assumed to be 1 -> correct but it might be like the following, too: // INTx == 0 pcim_intx(pdev, 0); // old INTx value assumed to be 1 -> wrong Also, when a driver calls pcim_intx() multiple times with different enable argument values, the last one will win no matter what value it is. This can lead to inconsistency, e.g. // INTx == 1 pcim_intx(pdev, 0); // OK ... pcim_intx(pdev, 1); // now old INTx wrongly assumed to be 0 This patch addresses those inconsistencies by saving the original INTx state at the first pcim_intx() call. For that, get_or_create_intx_devres() is folded into pcim_intx() caller side; it allows us to simply check the already allocated devres and record the original INTx along with the devres_alloc() call. Link: https://lore.kernel.org/r/[email protected] Fixes: 25216af ("PCI: Add managed pcim_intx()") Link: https://lore.kernel.org/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Philipp Stanner <[email protected]> Cc: [email protected] # v6.11+ Signed-off-by: Sasha Levin <[email protected]>

commit 539bd20 upstream. 'commit 18bcb4a ("mtd: spi-nor: sst: Factor out common write operation to `sst_nor_write_data()`")' introduced a bug where only one byte of data is written, regardless of the number of bytes passed to sst_nor_write_data(), causing a kernel crash during the write operation. Ensure the correct number of bytes are written as passed to sst_nor_write_data(). Call trace: [ 57.400180] ------------[ cut here ]------------ [ 57.404842] While writing 2 byte written 1 bytes [ 57.409493] WARNING: CPU: 0 PID: 737 at drivers/mtd/spi-nor/sst.c:187 sst_nor_write_data+0x6c/0x74 [ 57.418464] Modules linked in: [ 57.421517] CPU: 0 UID: 0 PID: 737 Comm: mtd_debug Not tainted 6.12.0-g5ad04afd91f9 #30 [ 57.429517] Hardware name: Xilinx Versal A2197 Processor board revA - x-prc-02 revA (DT) [ 57.437600] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 57.444557] pc : sst_nor_write_data+0x6c/0x74 [ 57.448911] lr : sst_nor_write_data+0x6c/0x74 [ 57.453264] sp : ffff80008232bb40 [ 57.456570] x29: ffff80008232bb40 x28: 0000000000010000 x27: 0000000000000001 [ 57.463708] x26: 000000000000ffff x25: 0000000000000000 x24: 0000000000000000 [ 57.470843] x23: 0000000000010000 x22: ffff80008232bbf0 x21: ffff000816230000 [ 57.477978] x20: ffff0008056c0080 x19: 0000000000000002 x18: 0000000000000006 [ 57.485112] x17: 0000000000000000 x16: 0000000000000000 x15: ffff80008232b580 [ 57.492246] x14: 0000000000000000 x13: ffff8000816d1530 x12: 00000000000004a4 [ 57.499380] x11: 000000000000018c x10: ffff8000816fd530 x9 : ffff8000816d1530 [ 57.506515] x8 : 00000000fffff7ff x7 : ffff8000816fd530 x6 : 0000000000000001 [ 57.513649] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000 [ 57.520782] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0008049b0000 [ 57.527916] Call trace: [ 57.530354] sst_nor_write_data+0x6c/0x74 [ 57.534361] sst_nor_write+0xb4/0x18c [ 57.538019] mtd_write_oob_std+0x7c/0x88 [ 57.541941] mtd_write_oob+0x70/0xbc [ 57.545511] mtd_write+0x68/0xa8 [ 57.548733] mtdchar_write+0x10c/0x290 [ 57.552477] vfs_write+0xb4/0x3a8 [ 57.555791] ksys_write+0x74/0x10c [ 57.559189] __arm64_sys_write+0x1c/0x28 [ 57.563109] invoke_syscall+0x54/0x11c [ 57.566856] el0_svc_common.constprop.0+0xc0/0xe0 [ 57.571557] do_el0_svc+0x1c/0x28 [ 57.574868] el0_svc+0x30/0xcc [ 57.577921] el0t_64_sync_handler+0x120/0x12c [ 57.582276] el0t_64_sync+0x190/0x194 [ 57.585933] ---[ end trace 0000000000000000 ]--- Cc: [email protected] Fixes: 18bcb4a ("mtd: spi-nor: sst: Factor out common write operation to `sst_nor_write_data()`") Signed-off-by: Amit Kumar Mahapatra <[email protected]> Reviewed-by: Pratyush Yadav <[email protected]> Reviewed-by: Tudor Ambarus <[email protected]> Reviewed-by: Bence Csókás <[email protected]> [[email protected]: add Cc stable tag] Signed-off-by: Pratyush Yadav <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 2b9df00 upstream. Replace dma_request_channel() with dma_request_chan_by_mask() and use helper functions to return proper error code instead of fixed -EBUSY. Fixes: ec4ba01 ("mtd: rawnand: Add new Cadence NAND driver to MTD subsystem") Cc: [email protected] Signed-off-by: Niravkumar L Rabara <[email protected]> Signed-off-by: Miquel Raynal <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit d76d22b upstream. Remap the slave DMA I/O resources to enhance driver portability. Using a physical address causes DMA translation failure when the ARM SMMU is enabled. Fixes: ec4ba01 ("mtd: rawnand: Add new Cadence NAND driver to MTD subsystem") Cc: [email protected] Signed-off-by: Niravkumar L Rabara <[email protected]> Signed-off-by: Miquel Raynal <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit f37d135 upstream. dma_map_single is using physical/bus device (DMA) but dma_unmap_single is using framework device(NAND controller), which is incorrect. Fixed dma_unmap_single to use correct physical/bus device. Fixes: ec4ba01 ("mtd: rawnand: Add new Cadence NAND driver to MTD subsystem") Cc: [email protected] Signed-off-by: Niravkumar L Rabara <[email protected]> Signed-off-by: Miquel Raynal <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 782cffe upstream. According to the latest event list, update the event constraint tables for Lion Cove core. The general rule (the event codes < 0x90 are restricted to counters 0-3.) has been removed. There is no restriction for most of the performance monitoring events. Fixes: a932aa0 ("perf/x86: Add Lunar Lake and Arrow Lake support") Reported-by: Amiri Khalil <[email protected]> Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 4cb7779 upstream. Christoph reports that their rk3399 system dies since commit 773c05f ("irqchip/gic-v3: Work around insecure GIC integrations"). It appears that some rk3399 have secure payloads, and that the firmware sets SCR_EL3.FIQ==1. Obivously, disabling security in that configuration leads to even more problems. Revisit the workaround by: - making it rk3399 specific - checking whether Group-0 is available, which is a good proxy for SCR_EL3.FIQ being 0 - either apply the workaround if Group-0 is available, or disable pseudo-NMIs if not Note that this doesn't mean that the secure side is able to receive interrupts, as all interrupts are made non-secure anyway. Clearly, nobody ever tested secure interrupts on this platform. Fixes: 773c05f ("irqchip/gic-v3: Work around insecure GIC integrations") Reported-by: Christoph Fritz <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Christoph Fritz <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/all/[email protected] Closes: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 860ca5e upstream. Add check for the return value of cifs_buf_get() and cifs_small_buf_get() in receive_encrypted_standard() to prevent null pointer dereference. Fixes: eec04ea ("smb: client: fix OOB in receive_encrypted_standard()") Cc: [email protected] Signed-off-by: Haoxiang Li <[email protected]> Signed-off-by: Steve French <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit c158647 upstream. The previous implementation incorrectly configured the cmn_interrupt_2_enable register for interrupt handling. Using cmn_interrupt_2_enable to configure Tag, Data RAM ECC interrupts would lead to issues like double handling of the interrupts (EL1 and EL3) as cmn_interrupt_2_enable is meant to be configured for interrupts which needs to be handled by EL3. EL1 LLCC EDAC driver needs to use cmn_interrupt_0_enable register to configure Tag, Data RAM ECC interrupts instead of cmn_interrupt_2_enable. Fixes: 2745065 ("drivers: edac: Add EDAC driver support for QCOM SoCs") Signed-off-by: Komal Bajaj <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Manivannan Sadhasivam <[email protected]> Cc: <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 57b76be upstream. The function tracer should record the preemption level at the point when the function is invoked. If the tracing subsystem decrement the preemption counter it needs to correct this before feeding the data into the trace buffer. This was broken in the commit cited below while shifting the preempt-disabled section. Use tracing_gen_ctx_dec() which properly subtracts one from the preemption counter on a preemptible kernel. Cc: [email protected] Cc: Wander Lairson Costa <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: ce5e480 ("ftrace: disable preemption when recursion locked") Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Tested-by: Wander Lairson Costa <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 38b1406 upstream. Function graph uses a subops and manager ops mechanism to attach to ftrace. The manager ops connects to ftrace and the functions it connects to is defined by a list of subops that it manages. The function hash that defines what the above ops attaches to limits the functions to attach if the hash has any content. If the hash is empty, it means to trace all functions. The creation of the manager ops hash is done by iterating over all the subops hashes. If any of the subops hashes is empty, it means that the manager ops hash must trace all functions as well. The issue is in the creation of the manager ops. When a second subops is attached, a new hash is created by starting it as NULL and adding the subops one at a time. But the NULL ops is mistaken as an empty hash, and once an empty hash is found, it stops the loop of subops and just enables all functions. # echo "f:myevent1 kernel_clone" >> /sys/kernel/tracing/dynamic_events # cat /sys/kernel/tracing/enabled_functions kernel_clone (1) tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60 # echo "f:myevent2 schedule_timeout" >> /sys/kernel/tracing/dynamic_events # cat /sys/kernel/tracing/enabled_functions trace_initcall_start_cb (1) tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60 run_init_process (1) tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60 try_to_run_init_process (1) tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60 x86_pmu_show_pmu_cap (1) tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60 cleanup_rapl_pmus (1) tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60 uncore_free_pcibus_map (1) tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60 uncore_types_exit (1) tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60 uncore_pci_exit.part.0 (1) tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60 kvm_shutdown (1) tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60 vmx_dump_msrs (1) tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60 vmx_cleanup_l1d_flush (1) tramp: 0xffffffffc0309000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60 [..] Fix this by initializing the new hash to NULL and if the hash is NULL do not treat it as an empty hash but instead allocate by copying the content of the first sub ops. Then on subsequent iterations, the new hash will not be NULL, but the content of the previous subops. If that first subops attached to all functions, then new hash may assume that the manager ops also needs to attach to all functions. Cc: [email protected] Cc: Masami Hiramatsu <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Alexander Gordeev <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: 5fccc75 ("ftrace: Add subops logic to allow one ops to manage many") Reviewed-by: Masami Hiramatsu (Google) <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 8eb4b09 upstream. Check if a function is already in the manager ops of a subops. A manager ops contains multiple subops, and if two or more subops are tracing the same function, the manager ops only needs a single entry in its hash. Cc: [email protected] Cc: Mark Rutland <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Alexander Gordeev <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: 4f554e9 ("ftrace: Add ftrace_set_filter_ips function") Tested-by: Heiko Carstens <[email protected]> Reviewed-by: Masami Hiramatsu (Google) <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 22bec11 upstream. When the function tracing_set_tracer() switched over to using the guard() infrastructure, it did not need to save the 'ret' variable and would just return the value when an error arised, instead of setting ret and jumping to an out label. When CONFIG_TRACER_SNAPSHOT is enabled, it had code that expected the "ret" variable to be initialized to zero and had set 'ret' while holding an arch_spin_lock() (not used by guard), and then upon releasing the lock it would check 'ret' and exit if set. But because ret was only set when an error occurred while holding the locks, 'ret' would be used uninitialized if there was no error. The code in the CONFIG_TRACER_SNAPSHOT block should be self contain. Make sure 'ret' is also set when no error occurred. Cc: Mathieu Desnoyers <[email protected]> Link: https://lore.kernel.org/[email protected] Reported-by: kernel test robot <[email protected]> Reported-by: Dan Carpenter <[email protected]> Closes: https://lore.kernel.org/r/[email protected]/ Fixes: d33b10c ("tracing: Switch trace.c code over to use guard()") Signed-off-by: Steven Rostedt (Google) <[email protected]> Acked-by: Masami Hiramatsu (Google) <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 488fb6e upstream. Fix a deadlock in pse_pi_get_current_limit and pse_pi_set_current_limit caused by consecutive mutex_lock calls. One in the function itself and another in pse_pi_get_voltage. Resolve the issue by using the unlocked version of pse_pi_get_voltage instead. Fixes: e0a5e2b ("net: pse-pd: Use power limit at driver side instead of current limit") Signed-off-by: Kory Maincent <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

… task_can_run_on_remote_rq() commit f3f08c3 upstream. While fixing migration disabled task handling, 3296682 ("sched_ext: Fix migration disabled handling in targeted dispatches") assumed that a migration disabled task's ->cpus_ptr would only have the pinned CPU. While this is eventually true for migration disabled tasks that are switched out, ->cpus_ptr update is performed by migrate_disable_switch() which is called right before context_switch() in __scheduler(). However, the task is enqueued earlier during pick_next_task() via put_prev_task_scx(), so there is a race window where another CPU can see the task on a DSQ. If the CPU tries to dispatch the migration disabled task while in that window, task_allowed_on_cpu() will succeed and task_can_run_on_remote_rq() will subsequently trigger SCHED_WARN(is_migration_disabled()). WARNING: CPU: 8 PID: 1837 at kernel/sched/ext.c:2466 task_can_run_on_remote_rq+0x12e/0x140 Sched_ext: layered (enabled+all), task: runnable_at=-10ms RIP: 0010:task_can_run_on_remote_rq+0x12e/0x140 ... <TASK> consume_dispatch_q+0xab/0x220 scx_bpf_dsq_move_to_local+0x58/0xd0 bpf_prog_84dd17b0654b6cf0_layered_dispatch+0x290/0x1cfa bpf__sched_ext_ops_dispatch+0x4b/0xab balance_one+0x1fe/0x3b0 balance_scx+0x61/0x1d0 prev_balance+0x46/0xc0 __pick_next_task+0x73/0x1c0 __schedule+0x206/0x1730 schedule+0x3a/0x160 __do_sys_sched_yield+0xe/0x20 do_syscall_64+0xbb/0x1e0 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fix it by converting the SCHED_WARN() back to a regular failure path. Also, perform the migration disabled test before task_allowed_on_cpu() test so that BPF schedulers which fail to handle migration disabled tasks can be noticed easily. While at it, adjust scx_ops_error() message for !task_allowed_on_cpu() case for brevity and consistency. Signed-off-by: Tejun Heo <[email protected]> Fixes: 3296682 ("sched_ext: Fix migration disabled handling in targeted dispatches") Acked-by: Andrea Righi <[email protected]> Reported-by: Jake Hillion <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 4603618 upstream. The mm kselftests are currently built with no optimisation (-O0). It's unclear why, and besides being obviously suboptimal, this also prevents the pkeys tests from working as intended. Let's build all the tests with -O2. [[email protected]: silence unused-result warnings] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Kevin Brodsky <[email protected]> Cc: Aruna Ramakrishna <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Joey Gouly <[email protected]> Cc: Keith Lucas <[email protected]> Cc: Ryan Roberts <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> [ Yifei: This commit also fix the failure of pkey_sighandler_tests_64, which is also in linux-6.12.y, thus backport this commit. ] Signed-off-by: Yifei Liu <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

… plus lts commit a6a7cba upstream. In general the delay should be added by the PHY instead of the MAC, and this improves network stability on some boards which seem to need different delay. Fixes: 387b3bb ("arm64: dts: rockchip: Add Xunlong OrangePi R1 Plus LTS") Cc: [email protected] # 6.6+ Signed-off-by: Tianling Shen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Heiko Stuebner <[email protected]> [Fix conflicts due to missing dtsi conversion] Signed-off-by: Tianling Shen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit b35eb91 upstream. When mesa started using compute queues more often we started seeing additional hangs with compute queues. Disabling gfxoff seems to mitigate that. Manually control gfxoff and gfx pg with command submissions to avoid any issues related to gfxoff. KFD already does the same thing for these chips. v2: limit to compute v3: limit to APUs v4: limit to Raven/PCO v5: only update the compute ring_funcs v6: Disable GFX PG v7: adjust order Reviewed-by: Lijo Lazar <[email protected]> Suggested-by: Błażej Szczygieł <[email protected]> Suggested-by: Sergey Kovalenko <[email protected]> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3861 Link: https://lists.freedesktop.org/archives/amd-gfx/2025-January/119116.html Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected] # 6.12.x Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 55ed2b1 upstream. Bump the driver version for RV/PCO compute stability fix so mesa can use this check to enable compute queues on RV/PCO. Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected] # 6.12.x Signed-off-by: Greg Kroah-Hartman <[email protected]>

Link: https://lore.kernel.org/r/[email protected] Tested-by: FLorian Fainelli <[email protected]> Tested-by: Peter Schneider <[email protected]> Tested-by: Shuah Khan <[email protected]> Tested-by: Ron Economos <[email protected]> Link: https://lore.kernel.org/r/[email protected] Tested-by: Jon Hunter <[email protected]> Tested-by: Harshit Mogalapalli <[email protected]> Tested-by: Mark Brown <[email protected]> Tested-by: Peter Schneider <[email protected]> Tested-by: Markus Reichelt <[email protected]> Tested-by: Slade Watkins <[email protected]> Tested-by: Florian Fainelli <[email protected]> Tested-by: Linux Kernel Functional Testing <[email protected]> Tested-by: Salvatore Bonaccorso <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

This is the 6.12.17 stable release

All configs dropped the follwoing since its dependent on ARCH_MVEBU which is not configured on so there is no reason to ask. # CONFIG_CZNIC_PLATFORMS is not set See upstream commit: dd0f05b x86_64 configs also dropped a previously defined y config -CONFIG_IMX_SCMI_MISC_DRV=y This comes from firmware: imx: IMX_SCMI_MISC_DRV should depend on ARCH_MXC See Upstream Commit: be6686b Whats a little confusing is that the fedora kernel-ark says that this is marked as a `y` $ cat redhat/configs/rhel/generic/CONFIG_IMX_SCMI_MISC_DRV CONFIG_IMX_SCMI_MISC_DRV=y [kernel-ark]$ ls redhat/configs/kernel-6.13.8-x86_64* redhat/configs/kernel-6.13.8-x86_64-automotive.config redhat/configs/kernel-6.13.8-x86_64.config redhat/configs/kernel-6.13.8-x86_64-rt.config redhat/configs/kernel-6.13.8-x86_64-automotive-debug.config redhat/configs/kernel-6.13.8-x86_64-debug.config redhat/configs/kernel-6.13.8-x86_64-rt-debug.config [kernel-ark]$ grep CONFIG_IMX_SCMI_MISC_DRV redhat/configs/kernel-6.13.8-x86_64* [kernel-ark]$ Do to this we're leaving this as the default Kconfig of off for x86_64

bmastbergen

🥌

josephtate

🆗

PlaidCat · 2025-04-03T22:11:12Z

This did does not look like it will do what what we want.

I'm going to revert back to the previous rebase process and fall back to git push --force to do the update date.

This will keep our stuff directly dangling off the end.

ctmarinas and others added 30 commits February 27, 2025 04:30

Amit Kumar Mahapatra and others added 21 commits February 27, 2025 04:30

Merge tag 'v6.12.17' into {jmaple}_merge_test_ciq-6.12.y

e039ecc

This is the 6.12.17 stable release

PlaidCat self-assigned this Apr 3, 2025

PlaidCat requested review from kerneltoast, bmastbergen and thefossguy-ciq April 3, 2025 19:17

bmastbergen approved these changes Apr 3, 2025

View reviewed changes

josephtate approved these changes Apr 3, 2025

View reviewed changes

Kemotaha approved these changes Apr 3, 2025

View reviewed changes

PlaidCat closed this Apr 3, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[CIQ 6.12.y] Merge v6.12.17 & config Updates #194

[CIQ 6.12.y] Merge v6.12.17 & config Updates #194

PlaidCat commented Apr 3, 2025

bmastbergen left a comment

josephtate left a comment

PlaidCat commented Apr 3, 2025

[CIQ 6.12.y] Merge v6.12.17 & config Updates #194

[CIQ 6.12.y] Merge v6.12.17 & config Updates #194

Conversation

PlaidCat commented Apr 3, 2025

Build

Testing

bmastbergen left a comment

Choose a reason for hiding this comment

josephtate left a comment

Choose a reason for hiding this comment

PlaidCat commented Apr 3, 2025