[openib-general] max_send_sge < max_sge

Pete Wyckoff pw at osc.edu
Tue Jun 27 13:21:03 PDT 2006


mst at mellanox.co.il wrote on Tue, 27 Jun 2006 09:42 +0300:
> Quoting r. Pete Wyckoff <pw at osc.edu>:
> > Is this a known issue?
> 
> Yes. The fact that ibv_query_device returns some value in hca_cap cannot
> guarantee that ibv_create_qp with these parameters will succeed. For example,
> the system administrator might have imposed a limit on the amount of memory
> you can pin down, and you will get ENOMEM.

I was hoping to get a guaranteed maximum from ibv_query_device, so
that I would know calls to ibv_create_qp would not fail because I
asked for too many SG entries.  My code has some idea of how many it
wants (16), and compares that to the hca_cap values to settle for
what it can get.  I only happened to notice while debugging that 30
wouldn't work, even though the device claimed it would.

> > Should I always subtract 1 from the reported max on the send side?  Just for
> > this hardware?
> 
> Unless you use it, passing the absolute maximum value supported by hardware does
> not seem, to me, to make sense - it will just slow you down, and waste
> resources.  Is there a protocol out there that actually has a use for 30 sge?

Perhaps I don't understand what is more resource-costly about using
29 sge when they are supported by the hardware.  I'm using them on
the send side to avoid having to either:
    1.  memcpy 29 little buffers into one big buffer
or
    2.  send 29 rdma writes instead of a single rdma write with 29 sges
The buffer on the receiver is contiguous and big enough to hold
everything.
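
To make the single-write case concrete, here is a minimal sketch of
building the gather list.  The struct below is a stand-in with the
same three fields as struct ibv_sge; the piece type, the function
name, and the assumption that all pieces sit in one registered region
(one lkey) are hypothetical and only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in with the same three fields as struct ibv_sge from
 * <infiniband/verbs.h>; real code would use that type directly. */
struct sge {
    uint64_t addr;    /* start of one local buffer */
    uint32_t length;  /* bytes in that buffer */
    uint32_t lkey;    /* local key of the registered memory region */
};

/* Hypothetical description of one little source buffer. */
struct piece {
    void *buf;
    uint32_t len;
};

/*
 * Fill sg[0..n-1] from n pieces, all assumed to live in one registered
 * region identified by lkey.  Returns the total byte count, or 0 if the
 * pieces would not fit in the contiguous destination of dest_len bytes.
 */
uint64_t build_gather_list(struct sge *sg, const struct piece *p,
                           int n, uint32_t lkey, uint64_t dest_len)
{
    uint64_t total = 0;

    assert(sg != NULL && p != NULL && n > 0);
    for (int i = 0; i < n; i++) {
        sg[i].addr = (uint64_t)(uintptr_t)p[i].buf;
        sg[i].length = p[i].len;
        sg[i].lkey = lkey;
        total += p[i].len;
    }
    return total <= dest_len ? total : 0;
}
```

With the real verbs types, the filled array would go into the sg_list
field of a struct ibv_send_wr, with num_sge set to n and opcode set
to IBV_WR_RDMA_WRITE, so one work request covers all the pieces.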

> In my opinion, for the application to be robust it has to either use small
> values that empirically work on most systems, or be able to scale down to
> require fewer resources if an allocation fails.

Scale down?  So if ibv_create_qp fails, you think I should look at
the return value (which is NULL, not ENOMEM or EINVAL or anything
informative), and then gradually reduce the values for max_recv_sge,
max_send_sge, max_recv_wr, max_send_wr, max_inline_data below the
reported HCA maximum until I find something that works?
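
For what it's worth, a scale-down loop over just one of those values
might look like the sketch below.  The callback type and both function
names are hypothetical; in real code the callback would fill in a
struct ibv_qp_init_attr and call ibv_create_qp, returning the QP or
NULL.  The fake callback only simulates a device that accepts fewer
SGEs than it reports.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: try to create a QP asking for max_send_sge
 * entries; return an opaque handle on success, NULL on failure. */
typedef void *(*try_create_fn)(void *ctx, uint32_t max_send_sge);

/*
 * Start at min(want, reported_max) and step down by one until creation
 * succeeds or the request would drop below min_ok, the smallest count
 * the caller can live with.  On success *got holds the value that worked.
 */
void *create_scaling_down(try_create_fn try_create, void *ctx,
                          uint32_t want, uint32_t reported_max,
                          uint32_t min_ok, uint32_t *got)
{
    assert(try_create != NULL && got != NULL && min_ok >= 1);

    uint32_t n = want < reported_max ? want : reported_max;
    for (; n >= min_ok; n--) {
        void *qp = try_create(ctx, n);
        if (qp != NULL) {
            *got = n;
            return qp;
        }
    }
    return NULL;
}

/* Simulated device for illustration only: ctx points at the largest
 * SGE count it really accepts, whatever it may have reported. */
void *fake_try_create(void *ctx, uint32_t max_send_sge)
{
    uint32_t real_limit = *(uint32_t *)ctx;
    return max_send_sge <= real_limit ? ctx : NULL;
}
```

This only searches one dimension.  Doing the same across max_recv_sge,
max_send_wr, max_recv_wr and max_inline_data at once is a much larger
search, and a NULL return gives no hint about which value to lower.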

I'll subtract 1 from the hca_cap.max_sge for Mellanox hardware
before doing the comparison against how many SGEs I'd like to get.
Otherwise I can't see much alternative to trusting the hca_cap
values that are returned.
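
Roughly, the comparison with that workaround comes out as the sketch
below.  The function name and the headroom parameter are hypothetical;
headroom would be 1 for the hardware discussed here and 0 where the
reported max_sge is usable as-is.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pick the send-side SGE count to request: the number we want, capped
 * at the reported max_sge minus a per-device headroom.
 */
uint32_t pick_send_sge(uint32_t want, uint32_t reported_max_sge,
                       uint32_t headroom)
{
    assert(reported_max_sge > headroom);

    uint32_t usable = reported_max_sge - headroom;
    return want < usable ? want : usable;
}
```

The result would be what goes into cap.max_send_sge in the
struct ibv_qp_init_attr passed to ibv_create_qp.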

		-- Pete



