[ofa-general] [Bug report / partial patch] OFED 1.3 send max_sge lower than reported by ib_query_device

Jim Mott jimmott at austin.rr.com
Wed Sep 26 12:06:25 PDT 2007


  This is a two-part bug report.  The first part is a conceptual
problem that may just be a misunderstanding on my part.  The
second is what I believe to be a bug in the mlx4 driver.

1) ib_create_qp() fails with max_sge 
  If you use ib_query_device() to return the device specific 
attribute max_sge, it seems reasonable to expect you can create
a QP with max_send_sge=max_sge.  The problem is that this often
fails.

  The reason is that, depending on the QP type (RC, UD, etc.) and
how the QP will be used (send, RDMA, atomic, etc.), extra
segments can be required in the WQE, and these eat up SGE
entries.  So while some send WQEs might have max_sge SGEs
available, many will not.

  Normally the difference between max_sge and the actual maximum
value allowed (and checked) for max_send_sge is 1 or 2.

  This issue may need API extensions to definitively resolve.  In
the short term, it would be very nice if max_sge reported by 
ib_query_device() could always return a value that ib_create_qp()
could use.  Think of it as the minimum max_send_sge value that
will work for all QP types.
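
  As a rough illustration of the point above, here is a minimal
sketch of how per-WQE overhead reduces the usable SGE count, and
how a "safe for every QP type" value could be derived.  The
segment sizes, the wr_kind enum, and the helpers usable_sges() and
safe_max_sge() are all assumptions made for this example; they
are not taken from any driver or from the verbs API.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative segment sizes in bytes. These are assumptions for
 * the sketch, not values read from any particular HCA or driver. */
enum {
	CTRL_SEG   = 16,	/* control segment, present in every WQE */
	RADDR_SEG  = 16,	/* remote address segment (RDMA, atomic) */
	ATOMIC_SEG = 16,	/* atomic operand segment */
	DGRAM_SEG  = 48,	/* datagram (address vector) segment for UD */
	DATA_SEG   = 16		/* one scatter/gather entry */
};

/* Hypothetical classification of how a send WQE will be used. */
enum wr_kind { WR_SEND_RC, WR_RDMA, WR_ATOMIC, WR_SEND_UD };

/* Bytes of a WQE consumed by non-data segments for a given use. */
static size_t overhead(enum wr_kind kind)
{
	switch (kind) {
	case WR_SEND_RC: return CTRL_SEG;
	case WR_RDMA:    return CTRL_SEG + RADDR_SEG;
	case WR_ATOMIC:  return CTRL_SEG + RADDR_SEG + ATOMIC_SEG;
	case WR_SEND_UD: return CTRL_SEG + DGRAM_SEG;
	}
	assert(!"unknown wr_kind");
	return 0;
}

/* Number of SGEs that fit in a descriptor of desc_sz bytes. */
static size_t usable_sges(size_t desc_sz, enum wr_kind kind)
{
	size_t oh = overhead(kind);

	if (desc_sz < oh)
		return 0;
	return (desc_sz - oh) / DATA_SEG;
}

/* Smallest usable SGE count over all uses: a value that would be
 * safe to report as max_sge under the assumptions above. */
static size_t safe_max_sge(size_t desc_sz)
{
	static const enum wr_kind kinds[] = {
		WR_SEND_RC, WR_RDMA, WR_ATOMIC, WR_SEND_UD
	};
	size_t min = usable_sges(desc_sz, kinds[0]);
	size_t i;

	for (i = 1; i < sizeof(kinds) / sizeof(kinds[0]); i++) {
		size_t n = usable_sges(desc_sz, kinds[i]);

		if (n < min)
			min = n;
	}
	return min;
}
```

  With a hypothetical 1024-byte descriptor, a plain RC send has
room for 63 SGEs under these assumptions, while a UD send has
room for only 60, so 60 is the value that works everywhere.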


2) mlx4 setting of max send SGEs
  The recent patch to support shrinking WQEs introduces a
behavior that creates a big difference between the number of
send SGEs mlx4 actually supports (checked against 61, should be
59 or 60) and the number reported by ib_query_device() (32, to
equal the receive side max_rq_sg value).

  The patch that follows will allow mlx4 to support the
number of send SGEs returned by ib_query_device(), and in fact
quite a few more.  It probably breaks shrinking WQEs and thus
should not be applied directly.

  Note that if ib_query_device() returned max_sge adjusted
for the raddr and atomic segments, this fix would not be
needed.  MLX4 would still support more SGEs in hardware than
can be used through the API, but that is a different problem.  

--- ofa_1_3_dev_kernel.orig/drivers/infiniband/hw/mlx4/qp.c     2007-09-26 13:27:47.000000000 -0500
+++ ofa_1_3_dev_kernel/drivers/infiniband/hw/mlx4/qp.c  2007-09-26 13:36:40.000000000 -0500
@@ -370,7 +370,7 @@ static int set_kernel_sq_size(struct mlx
                qp->sq.wqe_shift = ilog2(roundup_pow_of_two(s));
 
        for (;;) {
-               if (1 << qp->sq.wqe_shift > dev->dev->caps.max_sq_desc_sz)
+               if (s > dev->dev->caps.max_sq_desc_sz)
                        return -EINVAL;
 
                qp->sq_max_wqes_per_wr = DIV_ROUND_UP(s, 1 << qp->sq.wqe_shift);
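
  The difference between the two checks in the hunk above can be
sketched in isolation.  This models only the case where the WQE
stride is the requested size s rounded up to a power of two; it
does not model the shrinking-WQE path.  round_up_pow2() is a
stand-in for the kernel's roundup_pow_of_two(), the two helper
names are hypothetical, and the 1008-byte limit used in the note
below is an assumed value chosen because it is not a power of
two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Smallest power of two >= n, for n >= 1. User-space stand-in for
 * the kernel's roundup_pow_of_two(). */
static size_t round_up_pow2(size_t n)
{
	size_t p = 1;

	assert(n >= 1);
	while (p < n)
		p <<= 1;
	return p;
}

/* Original check: compares the power-of-two stride with the limit. */
static bool rejected_by_stride_check(size_t s, size_t max_desc_sz)
{
	return round_up_pow2(s) > max_desc_sz;
}

/* Patched check: compares the actual descriptor size with the limit. */
static bool rejected_by_size_check(size_t s, size_t max_desc_sz)
{
	return s > max_desc_sz;
}
```

  With an assumed limit of 1008 bytes, a 528-byte descriptor is
rejected by the original check (its stride rounds up to 1024) but
accepted by the patched one, while a 512-byte descriptor passes
both.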



