[ofa-general] [Bug report / partial patch] OFED 1.3 send max_sge lower than reported by ib_query_device
Jim Mott
jimmott at austin.rr.com
Wed Sep 26 18:41:44 PDT 2007
The same bug exists with mthca. I originally saw it in the kernel while doing RDS work, but I have now put together a short user-space test that reproduces it.
ibv_query_device(MT25204) returns max_sge=30
- ibv_create_qp with qp_attr.cap.max_send_sge = dev_attr.max_sge fails
- ibv_create_qp with qp_attr.cap.max_send_sge = dev_attr.max_sge-1 works
I only have these two types of adapters (mlx4 and mthca) to test with.
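
For illustration only (this is not the actual test program), the probe described above can be sketched as a loop that starts at the reported max_sge and walks down until QP creation succeeds. The names largest_working_sge, try_create_fn and fake_try_create are hypothetical. In a real test the callback would presumably wrap ibv_create_qp() with qp_attr.cap.max_send_sge set to the candidate value; here a stand-in callback simulates an adapter whose real limit is below what it reports.

```c
/* Hypothetical callback type: returns nonzero if a QP with `sge` send
 * S/G entries could be created.  On real hardware this would wrap
 * ibv_create_qp() with qp_attr.cap.max_send_sge = sge. */
typedef int (*try_create_fn)(int sge, void *ctx);

/* Walk down from the value reported by the device query until QP
 * creation succeeds.  Returns the largest working SGE count, or 0 if
 * no count in 1..reported_max works. */
int largest_working_sge(int reported_max, try_create_fn try_create, void *ctx)
{
    for (int sge = reported_max; sge > 0; sge--) {
        if (try_create(sge, ctx))
            return sge;
    }
    return 0;
}

/* Stand-in for the adapter (an assumption for this sketch): creation
 * succeeds only when sge does not exceed the limit stored in *ctx. */
int fake_try_create(int sge, void *ctx)
{
    const int *real_limit = ctx;
    return sge <= *real_limit;
}
```

With the stand-in limit set to 29 and a reported max_sge of 30, largest_working_sge(30, fake_try_create, &limit) returns 29, which matches the behavior seen above on the MT25204.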
-----Original Message-----
From: Roland Dreier [mailto:rdreier at cisco.com]
Sent: Wednesday, September 26, 2007 5:32 PM
To: Jim Mott
Cc: general at lists.openfabrics.org
Subject: Re: [ofa-general] [Bug report / partial patch] OFED 1.3 send max_sge lower than reported by ib_query_device
> A minimal API change that could help would be to add two new fields
> to ib_device_attr structure returned by ib_query_device:
> - delta_sge_sg
> - delta_sge_rd
Hmm, a cute idea but I'm still left wondering if it's worth the ABI
breakage etc just to give a few more S/G entries in some situations.
> The behavior would be that in all cases using max_sge for send or
> receive SGE count in create_qp would always succeed. That means
> the current value the drivers return there would have to be reduced
> to fix this bug. All existing code would continue to run.
Actually are there any drivers other than patched mlx4 where max_sge
doesn't always work? I agree we do want to get this right, but I
thought we had fixed all such bugs. (And we should make sure that any
"shrinking WQE" patch for mlx4 doesn't introduce new bugs)
(BTW I see a different bug in unpatched mlx4, namely that it might
report a too-big number of S/G entries allowed for the SQ)
> It looks like there is some movement in this direction already
> with the fields:
> - max_sge_rd (nes, amso1100, ehca, cxgb3 only)
This field is obsolete, since we don't handle RD and almost certainly
never will. I'm not sure why anyone is setting a value.
> - max_srq_sge (amso1100, mthca, mlx4, ehca, ipath only)
Any devices that handle SRQ should set this. I think cxgb3 does not
support SRQ.
- R.