net/mlx5e: Prevent encap offload when neigh update is running

The cited commit adds a completion to remove the dependency on the rtnl
lock, but it causes a deadlock for flows with multiple encapsulations:

 crash> bt ffff8aece8a64000
 PID: 1514557  TASK: ffff8aece8a64000  CPU: 3  COMMAND: "tc"
  #0 [ffffa6d14183f368] __schedule at ffffffffb8ba7f45
  #1 [ffffa6d14183f3f8] schedule at ffffffffb8ba8418
  #2 [ffffa6d14183f418] schedule_preempt_disabled at ffffffffb8ba8898
  #3 [ffffa6d14183f428] __mutex_lock at ffffffffb8baa7f8
  #4 [ffffa6d14183f4d0] mutex_lock_nested at ffffffffb8baabeb
  #5 [ffffa6d14183f4e0] mlx5e_attach_encap at ffffffffc0f48c17 [mlx5_core]
  #6 [ffffa6d14183f628] mlx5e_tc_add_fdb_flow at ffffffffc0f39680 [mlx5_core]
  #7 [ffffa6d14183f688] __mlx5e_add_fdb_flow at ffffffffc0f3b636 [mlx5_core]
  #8 [ffffa6d14183f6f0] mlx5e_tc_add_flow at ffffffffc0f3bcdf [mlx5_core]
  #9 [ffffa6d14183f728] mlx5e_configure_flower at ffffffffc0f3c1d1 [mlx5_core]
 #10 [ffffa6d14183f790] mlx5e_rep_setup_tc_cls_flower at ffffffffc0f3d529 [mlx5_core]
 #11 [ffffa6d14183f7a0] mlx5e_rep_setup_tc_cb at ffffffffc0f3d714 [mlx5_core]
 #12 [ffffa6d14183f7b0] tc_setup_cb_add at ffffffffb8931bb8
 #13 [ffffa6d14183f810] fl_hw_replace_filter at ffffffffc0dae901 [cls_flower]
 #14 [ffffa6d14183f8d8] fl_change at ffffffffc0db5c57 [cls_flower]
 #15 [ffffa6d14183f970] tc_new_tfilter at ffffffffb8936047
 #16 [ffffa6d14183fac8] rtnetlink_rcv_msg at ffffffffb88c7c31
 #17 [ffffa6d14183fb50] netlink_rcv_skb at ffffffffb8942853
 #18 [ffffa6d14183fbc0] rtnetlink_rcv at ffffffffb88c1835
 #19 [ffffa6d14183fbd0] netlink_unicast at ffffffffb8941f27
 #20 [ffffa6d14183fc18] netlink_sendmsg at ffffffffb8942245
 #21 [ffffa6d14183fc98] sock_sendmsg at ffffffffb887d482
 #22 [ffffa6d14183fcb8] ____sys_sendmsg at ffffffffb887d81a
 #23 [ffffa6d14183fd38] ___sys_sendmsg at ffffffffb88806e2
 #24 [ffffa6d14183fe90] __sys_sendmsg at ffffffffb88807a2
 #25 [ffffa6d14183ff28] __x64_sys_sendmsg at ffffffffb888080f
 #26 [ffffa6d14183ff38] do_syscall_64 at ffffffffb8b9b6a8
 #27 [ffffa6d14183ff50] entry_SYSCALL_64_after_hwframe at ffffffffb8c0007c

 crash> bt 0xffff8aeb07544000
 PID: 1110766  TASK: ffff8aeb07544000  CPU: 0  COMMAND: "kworker/u20:9"
  #0 [ffffa6d14e6b7bd8] __schedule at ffffffffb8ba7f45
  #1 [ffffa6d14e6b7c68] schedule at ffffffffb8ba8418
  #2 [ffffa6d14e6b7c88] schedule_timeout at ffffffffb8baef88
  #3 [ffffa6d14e6b7d10] wait_for_completion at ffffffffb8ba968b
  #4 [ffffa6d14e6b7d60] mlx5e_take_all_encap_flows at ffffffffc0f47ec4 [mlx5_core]
  #5 [ffffa6d14e6b7da0] mlx5e_rep_update_flows at ffffffffc0f3e734 [mlx5_core]
  #6 [ffffa6d14e6b7df8] mlx5e_rep_neigh_update at ffffffffc0f400bb [mlx5_core]
  #7 [ffffa6d14e6b7e50] process_one_work at ffffffffb80acc9c
  #8 [ffffa6d14e6b7ed0] worker_thread at ffffffffb80ad012
  #9 [ffffa6d14e6b7f10] kthread at ffffffffb80b615d
 #10 [ffffa6d14e6b7f50] ret_from_fork at ffffffffb8001b2f

After the first encap is attached, the flow is added to that encap
entry's flows list. If a neigh update is running at this time, the
thread attaching the remaining encaps of the flow cannot acquire
encap_tbl_lock and sleeps. If the neigh update thread is in turn
waiting for that flow's init_done, a deadlock occurs.

Fix it by taking the lock outside of the for loop that attaches the
encaps. If a neigh update is running, prevent encap flows from
offloading. Since the lock is now held across the whole loop, encap
entries can no longer be created concurrently, so remove the
unnecessary wait_for_completion() call for res_ready.

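The locking change can be illustrated with a minimal userspace sketch.
This is not the driver code: struct encap_tbl, its fields, TBL_CAPACITY
and both function bodies are hypothetical stand-ins, with a pthread
mutex playing the role of encap_tbl_lock and a boolean standing in for
a lockdep-style "lock held" assertion. It only shows the pattern of
taking the lock once around the loop rather than once per encap.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define TBL_CAPACITY 4	/* hypothetical limit, for illustration only */

/* Hypothetical stand-in for the encap table. */
struct encap_tbl {
	pthread_mutex_t lock;	/* plays the role of encap_tbl_lock */
	bool lock_held;		/* crude substitute for a lockdep assertion */
	int attached;		/* encap entries attached so far */
};

/* Attach one encap entry. The caller must already hold tbl->lock. */
static int attach_encap(struct encap_tbl *tbl)
{
	assert(tbl->lock_held);
	if (tbl->attached >= TBL_CAPACITY)
		return -1;
	tbl->attached++;
	return 0;
}

/* Attach all encaps of one flow inside a single critical section. */
static int set_encap_dests(struct encap_tbl *tbl, int n_encaps)
{
	int err = 0;

	pthread_mutex_lock(&tbl->lock);
	tbl->lock_held = true;
	for (int i = 0; i < n_encaps; i++) {
		err = attach_encap(tbl);
		if (err)
			break;
	}
	tbl->lock_held = false;
	pthread_mutex_unlock(&tbl->lock);
	return err;
}
```

Because the lock is not dropped between iterations, another thread
taking the same lock cannot run between the first and the last attach
of a flow in this model.
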
Fixes: 95435ad7999b ("net/mlx5e: Only access fully initialized flows in neigh update")
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>