Skip to content
This repository was archived by the owner on Sep 11, 2019. It is now read-only.

Commit f591822

Browse files
yishaihjgunthorpe
authored andcommitted
IB/mlx5: Fix implicit MR release flow
Once implicit MR is being called to be released by ib_umem_notifier_release() its leaves were marked as "dying". However, when dereg_mr()->mlx5_ib_free_implicit_mr()->mr_leaf_free() is called, it skips running the mr_leaf_free_action (i.e. umem_odp->work) when those leaves were marked as "dying". As such ib_umem_release() for the leaves won't be called and their MRs will be leaked as well. When an application exits/killed without calling dereg_mr we might hit the above flow. This fatal scenario is reported by WARN_ON() upon mlx5_ib_dealloc_ucontext() as ibcontext->per_mm_list is not empty, the call trace can be seen below. Originally the "dying" mark as part of ib_umem_notifier_release() was introduced to prevent pagefault_mr() from returning a success response once this happened. However, we already have today the completion mechanism so no need for that in those flows any more. Even in case a success response will be returned the firmware will not find the pages and an error will be returned in the following call as a released mm will cause ib_umem_odp_map_dma_pages() to permanently fail mmget_not_zero(). Fix the above issue by dropping the "dying" from the above flows. The other flows that are using "dying" are still needed it for their synchronization purposes. WARNING: CPU: 1 PID: 7218 at drivers/infiniband/hw/mlx5/main.c:2004 mlx5_ib_dealloc_ucontext+0x84/0x90 [mlx5_ib] CPU: 1 PID: 7218 Comm: ibv_rc_pingpong Tainted: G E 5.2.0-rc6+ GrapheneOS#13 Call Trace: uverbs_destroy_ufile_hw+0xb5/0x120 [ib_uverbs] ib_uverbs_close+0x1f/0x80 [ib_uverbs] __fput+0xbe/0x250 task_work_run+0x88/0xa0 do_exit+0x2cb/0xc30 ? __fput+0x14b/0x250 do_group_exit+0x39/0xb0 get_signal+0x191/0x920 ? _raw_spin_unlock_bh+0xa/0x20 ? inet_csk_accept+0x229/0x2f0 do_signal+0x36/0x5e0 ? put_unused_fd+0x5b/0x70 ? __sys_accept4+0x1a6/0x1e0 ? inet_hash+0x35/0x40 ? release_sock+0x43/0x90 ? _raw_spin_unlock_bh+0xa/0x20 ? inet_listen+0x9f/0x120 exit_to_usermode_loop+0x5c/0xc6 do_syscall_64+0x182/0x1b0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 81713d3 ("IB/mlx5: Add implicit MR support") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Yishai Hadas <[email protected]> Reviewed-by: Artemy Kovalyov <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
1 parent e21a712 commit f591822

File tree

2 files changed

+9
-19
lines changed

2 files changed

+9
-19
lines changed

drivers/infiniband/core/umem_odp.c

-4
Original file line numberDiff line numberDiff line change
@@ -112,10 +112,6 @@ static int ib_umem_notifier_release_trampoline(struct ib_umem_odp *umem_odp,
112112
* prevent any further fault handling on this MR.
113113
*/
114114
ib_umem_notifier_start_account(umem_odp);
115-
umem_odp->dying = 1;
116-
/* Make sure that the fact the umem is dying is out before we release
117-
* all pending page faults. */
118-
smp_wmb();
119115
complete_all(&umem_odp->notifier_completion);
120116
umem_odp->umem.context->invalidate_range(
121117
umem_odp, ib_umem_start(umem_odp), ib_umem_end(umem_odp));

drivers/infiniband/hw/mlx5/odp.c

+9-15
Original file line numberDiff line numberDiff line change
@@ -579,7 +579,6 @@ static int pagefault_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr,
579579
u32 flags)
580580
{
581581
int npages = 0, current_seq, page_shift, ret, np;
582-
bool implicit = false;
583582
struct ib_umem_odp *odp_mr = to_ib_umem_odp(mr->umem);
584583
bool downgrade = flags & MLX5_PF_FLAGS_DOWNGRADE;
585584
bool prefetch = flags & MLX5_PF_FLAGS_PREFETCH;
@@ -594,7 +593,6 @@ static int pagefault_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr,
594593
if (IS_ERR(odp))
595594
return PTR_ERR(odp);
596595
mr = odp->private;
597-
implicit = true;
598596
} else {
599597
odp = odp_mr;
600598
}
@@ -682,19 +680,15 @@ static int pagefault_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr,
682680

683681
out:
684682
if (ret == -EAGAIN) {
685-
if (implicit || !odp->dying) {
686-
unsigned long timeout =
687-
msecs_to_jiffies(MMU_NOTIFIER_TIMEOUT);
688-
689-
if (!wait_for_completion_timeout(
690-
&odp->notifier_completion,
691-
timeout)) {
692-
mlx5_ib_warn(dev, "timeout waiting for mmu notifier. seq %d against %d. notifiers_count=%d\n",
693-
current_seq, odp->notifiers_seq, odp->notifiers_count);
694-
}
695-
} else {
696-
/* The MR is being killed, kill the QP as well. */
697-
ret = -EFAULT;
683+
unsigned long timeout = msecs_to_jiffies(MMU_NOTIFIER_TIMEOUT);
684+
685+
if (!wait_for_completion_timeout(&odp->notifier_completion,
686+
timeout)) {
687+
mlx5_ib_warn(
688+
dev,
689+
"timeout waiting for mmu notifier. seq %d against %d. notifiers_count=%d\n",
690+
current_seq, odp->notifiers_seq,
691+
odp->notifiers_count);
698692
}
699693
}
700694

0 commit comments

Comments
 (0)