[ofa-general] Re: [Bug report / partial patch] OFED 1.3 send max_sge lower than reported by ib_query_device

Michael S. Tsirkin mst at dev.mellanox.co.il
Thu Sep 27 13:55:41 PDT 2007


> Quoting Jim Mott <jimmott at austin.rr.com>:
> Subject: [Bug report / partial patch] OFED 1.3 send max_sge lower than reported by ib_query_device
> 
>   This is a two-part bug report.  One is a conceptual problem
> that may just be a problem of understanding on my part.  The
> other is what I believe to be a bug in the mlx4 driver.
> 
> 1) ib_create_qp() fails with max_sge 
>   If you use ib_query_device() to return the device specific 
> attribute max_sge, it seems reasonable to expect you can create
> a QP with max_send_sge=max_sge.  The problem is that this often
> fails.
> 
>   The reason is that depending on the QP type (RC, UD, etc.) and
> how the QP will be used (send, RDMA, atomic, etc.), there can be
> extra segments required in the WQE that eat up SGE entries.  So
> while some send WQE might have max_sge available SGEs, many will
> not.
> 
>   Normally the difference between max_sge and the actual maximum
> value allowed (and checked) for max_send_sge is 1 or 2.
> 
>   This issue may need API extensions to definitively resolve.  In
> the short term, it would be very nice if max_sge reported by 
> ib_query_device() could always return a value that ib_create_qp()
> could use.  Think of it as the minimum max_send_sge value that
> will work for all QP types.
> 
> 
> 2) mlx4 setting of max send SGEs
>   The recent patch to support shrinking WQEs introduces a
> large gap between the number of send SGEs that mlx4 checks
> against (61, where it should be 59 or 60) and the number
> reported by ib_query_device() (32, chosen to equal the
> receive side max_rq_sg value).
> 
>   The patch that follows will allow an MLX4 to support the
> number of send SGEs returned by ib_query_device(), and in fact
> quite a few more.  It probably breaks shrinking WQEs and thus
> should not be applied directly.
> 
>   Note that if ib_query_device() returned max_sge adjusted
> for the raddr and atomic segments, this fix would not be
> needed.  MLX4 would still support more SGEs in hardware than
> can be used through the API, but that is a different problem.  
> 
> --- ofa_1_3_dev_kernel.orig/drivers/infiniband/hw/mlx4/qp.c     2007-09-26 13:27:47.000000000 -0500
> +++ ofa_1_3_dev_kernel/drivers/infiniband/hw/mlx4/qp.c  2007-09-26 13:36:40.000000000 -0500
> @@ -370,7 +370,7 @@ static int set_kernel_sq_size(struct mlx
>                 qp->sq.wqe_shift = ilog2(roundup_pow_of_two(s));
>  
>         for (;;) {
> -               if (1 << qp->sq.wqe_shift > dev->dev->caps.max_sq_desc_sz)
> +               if (s > dev->dev->caps.max_sq_desc_sz)
>                         return -EINVAL;
>  
>                 qp->sq_max_wqes_per_wr = DIV_ROUND_UP(s, 1 << qp->sq.wqe_shift);
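
To make the effect of the one-line change concrete, here is a small
standalone sketch (not driver code) comparing the two checks on the
path where wqe_shift is derived from roundup_pow_of_two(s).  The
helper names and the limit of 1008 bytes used in the note below are
assumptions for illustration only.

```c
/* Smallest power of two >= v.  Assumes 1 <= v <= 2^31. */
static unsigned int roundup_pow2(unsigned int v)
{
	unsigned int p = 1;

	while (p < v)
		p <<= 1;
	return p;
}

/* Old check: compares the power-of-two rounded stride to the limit. */
static int check_rounded(unsigned int s, unsigned int max_desc_sz)
{
	return roundup_pow2(s) > max_desc_sz ? -1 : 0;
}

/* Patched check: compares the actual descriptor size to the limit. */
static int check_exact(unsigned int s, unsigned int max_desc_sz)
{
	return s > max_desc_sz ? -1 : 0;
}
```

With a hypothetical limit of 1008, a descriptor of s = 528 bytes is
rejected by check_rounded() because it rounds up to 1024, while
check_exact() accepts it.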

Good idea, but that patch needs more work: max_send_sge returned
to user should be made smaller to avoid corrupting the WQE.
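
One way to picture the smaller value is to reserve room for the worst
case non-data segments before counting SGEs.  The sketch below is
standalone C, not the mlx4 code: the segment sizes are illustrative
assumptions rather than values taken from the driver headers, and
sge_fit() and safe_max_send_sge() are hypothetical helpers.

```c
/* Illustrative segment sizes in bytes (assumed, not from mlx4 headers). */
enum {
	CTRL_SEG_SZ   = 16,	/* control segment in every send WQE */
	RADDR_SEG_SZ  = 16,	/* remote address segment (RDMA, atomic) */
	ATOMIC_SEG_SZ = 16,	/* atomic operand segment */
	DATA_SEG_SZ   = 16	/* one scatter/gather entry */
};

/* SGEs that fit in desc_sz bytes once overhead bytes are reserved. */
static unsigned int sge_fit(unsigned int desc_sz, unsigned int overhead)
{
	if (overhead >= desc_sz)
		return 0;
	return (desc_sz - overhead) / DATA_SEG_SZ;
}

/*
 * Conservative value to report: assume every WQE carries control,
 * raddr and atomic segments, then cap at the receive side limit so
 * that one max_sge value is usable for both queues.
 */
static unsigned int safe_max_send_sge(unsigned int max_sq_desc_sz,
				      unsigned int max_rq_sge)
{
	unsigned int n = sge_fit(max_sq_desc_sz,
				 CTRL_SEG_SZ + RADDR_SEG_SZ + ATOMIC_SEG_SZ);

	return n < max_rq_sge ? n : max_rq_sge;
}
```

Under these assumed sizes a 1008 byte descriptor leaves room for 60
SGEs in the worst case, so a reported value of 32 would always be
usable; a 256 byte descriptor would have to report 13.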

-- 
MST