[ofa-general] [Bug report / partial patch] OFED 1.3 send max_sge lower than reported by ib_query_device

Jim Mott jimmott at austin.rr.com
Wed Sep 26 18:41:44 PDT 2007


The same bug exists with mthca.  I originally saw it in the kernel while doing RDS work, but I have now reproduced it with a short user-space test.

ibv_query_device(MT25204) returns max_sge=30
  - ibv_create_qp with qp_attr.cap.max_send_sge = dev_attr.max_sge fails
  - ibv_create_qp with qp_attr.cap.max_send_sge = dev_attr.max_sge-1 works

I only have the two types of adapters to test with.
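
For anyone who wants to check another adapter, here is a rough sketch of the probing logic only; it is not the test program I ran. It walks down from the reported max_sge until QP creation succeeds. The names largest_working_send_sge, try_sge_fn and fake_try_create are made up for illustration. In a real test the callback would presumably set qp_attr.cap.max_send_sge to the candidate value, call ibv_create_qp, and call ibv_destroy_qp on success; the fake callback below only models a device whose real limit is lower than the reported one, so the sketch compiles without libibverbs.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Callback type (hypothetical): returns nonzero if a QP with `sge` send
 * scatter/gather entries could be created. A real implementation would
 * wrap ibv_create_qp(); that wiring is an assumption and is not shown.
 */
typedef int (*try_sge_fn)(int sge, void *arg);

/*
 * Walk down from the limit reported by the device query until QP
 * creation succeeds. Returns the largest working value, or 0 if no
 * value in [1, reported_max_sge] works.
 */
static int largest_working_send_sge(int reported_max_sge,
                                    try_sge_fn try_create, void *arg)
{
    assert(try_create != NULL);

    for (int sge = reported_max_sge; sge >= 1; sge--) {
        if (try_create(sge, arg))
            return sge;
    }
    return 0;
}

/*
 * Stand-in for a device (illustration only): accepts any request up to
 * the limit stored in *(int *)arg and rejects anything larger.
 */
static int fake_try_create(int sge, void *arg)
{
    int real_limit = *(int *)arg;

    return sge <= real_limit;
}
```

With the fake limit set to 29 and a reported max_sge of 30, the helper returns 29, which matches what I see on the MT25204.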
-----Original Message-----
From: Roland Dreier [mailto:rdreier at cisco.com] 
Sent: Wednesday, September 26, 2007 5:32 PM
To: Jim Mott
Cc: general at lists.openfabrics.org
Subject: Re: [ofa-general] [Bug report / partial patch] OFED 1.3 send max_sge lower than reported by ib_query_device

 > A minimal API change that could help would be to add two new fields
 > to ib_device_attr structure returned by ib_query_device:
 >   - delta_sge_sg
 >   - delta_sge_rd

Hmm, a cute idea but I'm still left wondering if it's worth the ABI
breakage etc just to give a few more S/G entries in some situations.

 > The behavior would be that in all cases using max_sge for send or
 > receive SGE count in create_qp would always succeed.  That means
 > the current value the drivers return there would have to be reduced
 > to fix this bug.  All existing codes would continue to run.

Actually, are there any drivers other than patched mlx4 where max_sge
doesn't always work?  I agree we do want to get this right, but I
thought we had fixed all such bugs.  (And we should make sure that any
"shrinking WQE" patch for mlx4 doesn't introduce new bugs.)

(BTW, I see a different bug in unpatched mlx4, namely that it might
report a too-big number of S/G entries allowed for the SQ.)

 > It looks like there is some movement in this direction already
 > with the fields:
 >   - max_sge_rd (nes, amso1100, ehca, cxgb3 only)

This field is obsolete, since we don't handle RD and almost certainly
never will.  I'm not sure why anyone is setting a value.

 >   - max_srq_sge (amso1100, mthca, mlx4, ehca, ipath only)

Any devices that handle SRQ should set this.  I think cxgb3 does not
support SRQ.

 - R.



