Skip to content

Commit 2d448db

Browse files
surenbaghdasaryangregkh
authored andcommitted
userfaultfd: do not block on locking a large folio with raised refcount
commit 37b338e upstream. Lokesh recently raised an issue about UFFDIO_MOVE getting into a deadlock state when it goes into split_folio() with raised folio refcount. split_folio() expects the reference count to be exactly mapcount + num_pages_in_folio + 1 (see can_split_folio()) and fails with EAGAIN otherwise. If multiple processes are trying to move the same large folio, they raise the refcount (all tasks succeed in that) then one of them succeeds in locking the folio, while others will block in folio_lock() while keeping the refcount raised. The winner of this race will proceed with calling split_folio() and will fail returning EAGAIN to the caller and unlocking the folio. The next competing process will get the folio locked and will go through the same flow. In the meantime the original winner will be retried and will block in folio_lock(), getting into the queue of waiting processes only to repeat the same path. All this results in a livelock. An easy fix would be to avoid waiting for the folio lock while holding folio refcount, similar to madvise_free_huge_pmd() where folio lock is acquired before raising the folio refcount. Since we lock and take a refcount of the folio while holding the PTE lock, changing the order of these operations should not break anything. Modify move_pages_pte() to try locking the folio first and if that fails and the folio is large then return EAGAIN without touching the folio refcount. If the folio is single-page then split_folio() is not called, so we don't have this issue. Lokesh has a reproducer [1] and I verified that this change fixes the issue. [1] https://github.com/lokeshgidra/uffd_move_ioctl_deadlock [[email protected]: reflow comment to 80 cols, s/end/end up/] Link: https://lkml.kernel.org/r/[email protected] Fixes: adef440 ("userfaultfd: UFFDIO_MOVE uABI") Signed-off-by: Suren Baghdasaryan <[email protected]> Reported-by: Lokesh Gidra <[email protected]> Reviewed-by: Peter Xu <[email protected]> Acked-by: Liam R. Howlett <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Barry Song <[email protected]> Cc: Barry Song <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Jann Horn <[email protected]> Cc: Kalesh Singh <[email protected]> Cc: Lorenzo Stoakes <[email protected]> Cc: Matthew Wilcow (Oracle) <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
1 parent 5ae31c5 commit 2d448db

File tree

1 file changed

+16
-1
lines changed

1 file changed

+16
-1
lines changed

mm/userfaultfd.c

Lines changed: 16 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1215,6 +1215,7 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
12151215
*/
12161216
if (!src_folio) {
12171217
struct folio *folio;
1218+
bool locked;
12181219

12191220
/*
12201221
* Pin the page while holding the lock to be sure the
@@ -1234,12 +1235,26 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
12341235
goto out;
12351236
}
12361237

1238+
locked = folio_trylock(folio);
1239+
/*
1240+
* We avoid waiting for folio lock with a raised
1241+
* refcount for large folios because extra refcounts
1242+
* will result in split_folio() failing later and
1243+
* retrying. If multiple tasks are trying to move a
1244+
* large folio we can end up livelocking.
1245+
*/
1246+
if (!locked && folio_test_large(folio)) {
1247+
spin_unlock(src_ptl);
1248+
err = -EAGAIN;
1249+
goto out;
1250+
}
1251+
12371252
folio_get(folio);
12381253
src_folio = folio;
12391254
src_folio_pte = orig_src_pte;
12401255
spin_unlock(src_ptl);
12411256

1242-
if (!folio_trylock(src_folio)) {
1257+
if (!locked) {
12431258
pte_unmap(&orig_src_pte);
12441259
pte_unmap(&orig_dst_pte);
12451260
src_pte = dst_pte = NULL;

0 commit comments

Comments
 (0)