[ofa-general] [Bug report / partial patch] OFED 1.3 send max_sge lower than reported by ib_query_device
Tom Tucker
tom at opengridcomputing.com
Thu Sep 27 13:37:13 PDT 2007
On Wed, 2007-09-26 at 14:06 -0500, Jim Mott wrote:
> This is a two-part bug report. One part is a conceptual problem that may just be a misunderstanding on my part. The other is
> what I believe to be a bug in the mlx4 driver.
mthca has the same issue.
>
> 1) ib_create_qp() fails with max_sge
> If you use ib_query_device() to return the device specific
> attribute max_sge, it seems reasonable to expect you can create
> a QP with max_send_sge=max_sge. The problem is that this often
> fails.
>
> The reason is that depending on the QP type (RC, UD, etc.) and
> how the QP will be used (send, RDMA, atomic, etc.), there can be
> extra segments required in the WQE that eat up SGE entries. So
> while some send WQE might have max_sge available SGEs, many will
> not.
>
> Normally the difference between max_sge and the actual maximum
> value allowed (and checked) for max_send_sge is 1 or 2.
>
> This issue may need API extensions to definitively resolve. In
> the short term, it would be very nice if max_sge reported by
> ib_query_device() could always return a value that ib_create_qp()
> could use. Think of it as the minimum max_send_sge value that
> will work for all QP types.
>
>
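To make the point in (1) concrete, here is a minimal user-space sketch of why the usable send SGE count depends on how the QP is used. Everything in it is an assumption made for illustration: the struct, the function names, and the byte sizes are hypothetical and are not taken from the mlx4 or mthca headers. It only models the idea that extra segments share the WQE with the scatter/gather list, and that a "safe" max_sge would be the minimum over all uses.

```c
#include <assert.h>

/* Hypothetical per-work-request overhead, in bytes, that shares the send
 * WQE with the scatter/gather list. Sizes are illustrative only. */
struct wqe_overhead {
    unsigned ctrl;    /* control segment, present in every send WQE */
    unsigned raddr;   /* remote address segment (RDMA and atomic) */
    unsigned atomic;  /* atomic segment (atomic operations only) */
};

/* Number of data segments that still fit after the overhead is removed. */
static unsigned usable_sge(unsigned desc_sz, unsigned overhead,
                           unsigned data_seg_sz)
{
    assert(data_seg_sz > 0);
    if (overhead >= desc_sz)
        return 0;
    return (desc_sz - overhead) / data_seg_sz;
}

/* Smallest SGE count over plain send, RDMA, and atomic work requests:
 * a value that would be accepted regardless of how the QP is used. */
static unsigned safe_max_send_sge(unsigned desc_sz, unsigned data_seg_sz,
                                  const struct wqe_overhead *o)
{
    unsigned send = usable_sge(desc_sz, o->ctrl, data_seg_sz);
    unsigned rdma = usable_sge(desc_sz, o->ctrl + o->raddr, data_seg_sz);
    unsigned atom = usable_sge(desc_sz, o->ctrl + o->raddr + o->atomic,
                               data_seg_sz);
    unsigned min = send;

    if (rdma < min)
        min = rdma;
    if (atom < min)
        min = atom;
    return min;
}
```

With a made-up 256-byte descriptor and 16-byte segments, a plain send has room for 15 SGEs, an RDMA request for 14, and an atomic request for 13, which is the "off by 1 or 2" gap described above.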
> 2) mlx4 setting of max send SGEs
> The recent patch to support shrinking WQEs introduces a
> behavior that creates a big difference between the number of
> send SGEs mlx4 actually supports (checked against 61; it should
> be 59 or 60) and the number reported by ib_query_device() (32,
> to match the receive-side max_rq_sg value).
>
> The patch that follows will allow an MLX4 to support the
> number of send SGEs returned by ib_query_device(), and in fact
> quite a few more. It probably breaks shrinking WQEs and thus
> should not be applied directly.
>
> Note that if ib_query_device() returned max_sge adjusted
> for the raddr and atomic segments, this fix would not be
> needed. MLX4 would still support more SGEs in hardware than
> can be used through the API, but that is a different problem.
>
> --- ofa_1_3_dev_kernel.orig/drivers/infiniband/hw/mlx4/qp.c 2007-09-26 13:27:47.000000000 -0500
> +++ ofa_1_3_dev_kernel/drivers/infiniband/hw/mlx4/qp.c 2007-09-26 13:36:40.000000000 -0500
> @@ -370,7 +370,7 @@ static int set_kernel_sq_size(struct mlx
> qp->sq.wqe_shift = ilog2(roundup_pow_of_two(s));
>
> for (;;) {
> - if (1 << qp->sq.wqe_shift > dev->dev->caps.max_sq_desc_sz)
> + if (s > dev->dev->caps.max_sq_desc_sz)
> return -EINVAL;
>
> qp->sq_max_wqes_per_wr = DIV_ROUND_UP(s, 1 << qp->sq.wqe_shift);
>
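The effect of the one-line change in the diff can be sketched in user space. Taking the context line as shown, wqe_shift starts out as ilog2(roundup_pow_of_two(s)), so on the first pass the original test compares the rounded-up size against max_sq_desc_sz, while the patched test compares s itself. The helper names and the limit of 1000 bytes below are hypothetical, chosen only to show the two checks disagreeing; this is a model of the comparison, not the driver code.

```c
#include <assert.h>

/* User-space stand-in for the kernel's roundup_pow_of_two(). */
static unsigned roundup_pow2(unsigned n)
{
    unsigned p = 1;

    assert(n > 0 && n <= (1u << 31));
    while (p < n)
        p <<= 1;
    return p;
}

/* Original check: the power-of-two stride must fit in the descriptor. */
static int fits_rounded(unsigned s, unsigned max_desc_sz)
{
    return roundup_pow2(s) <= max_desc_sz;
}

/* Patched check: only the bytes actually needed must fit. */
static int fits_exact(unsigned s, unsigned max_desc_sz)
{
    return s <= max_desc_sz;
}
```

For a request needing 600 bytes against a hypothetical 1000-byte limit, the rounded size is 1024, so the original check rejects it and the patched check accepts it. A 500-byte request passes both, and a 1001-byte request fails both.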