net/mlx5: Lag, do bond only if slaves agree on roce state
authorMaher Sanalla <msanalla@nvidia.com>
Wed, 22 May 2024 19:26:52 +0000 (22:26 +0300)
committerDavid S. Miller <davem@davemloft.net>
Fri, 24 May 2024 12:27:07 +0000 (13:27 +0100)
Currently, the driver does not enforce that lag bond slaves must have
matching roce capabilities. Yet, in mlx5_do_bond(), the driver attempts
to enable roce on all vports of the bond slaves, causing the following
syndrome when one slave has no roce fw support:

mlx5_cmd_out_err:809:(pid 25427): MODIFY_NIC_VPORT_CONTEXT(0×755) op_mod(0×0)
failed, status bad parameter(0×3), syndrome (0xc1f678), err(-22)

Thus, create HW lag only if bond's slaves agree on roce state,
either all slaves have roce support resulting in a roce lag bond,
or none do, resulting in a raw eth bond.

Fixes: 7907f23adc18 ("net/mlx5: Implement RoCE LAG feature")
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c

index f7f0476a4a58d350b52b9966a0efaa0d6964a5a8..d0871c46b8c54376154cdf36e90c13636b7582ef 100644 (file)
@@ -719,6 +719,7 @@ bool mlx5_lag_check_prereq(struct mlx5_lag *ldev)
        struct mlx5_core_dev *dev;
        u8 mode;
 #endif
+       bool roce_support;
        int i;
 
        for (i = 0; i < ldev->ports; i++)
@@ -743,6 +744,11 @@ bool mlx5_lag_check_prereq(struct mlx5_lag *ldev)
                if (mlx5_sriov_is_enabled(ldev->pf[i].dev))
                        return false;
 #endif
+       roce_support = mlx5_get_roce_state(ldev->pf[MLX5_LAG_P1].dev);
+       for (i = 1; i < ldev->ports; i++)
+               if (mlx5_get_roce_state(ldev->pf[i].dev) != roce_support)
+                       return false;
+
        return true;
 }
 
@@ -910,8 +916,10 @@ static void mlx5_do_bond(struct mlx5_lag *ldev)
                } else if (roce_lag) {
                        dev0->priv.flags &= ~MLX5_PRIV_FLAGS_DISABLE_IB_ADEV;
                        mlx5_rescan_drivers_locked(dev0);
-                       for (i = 1; i < ldev->ports; i++)
-                               mlx5_nic_vport_enable_roce(ldev->pf[i].dev);
+                       for (i = 1; i < ldev->ports; i++) {
+                               if (mlx5_get_roce_state(ldev->pf[i].dev))
+                                       mlx5_nic_vport_enable_roce(ldev->pf[i].dev);
+                       }
                } else if (shared_fdb) {
                        int i;