[openib-general] max_send_sge < max_sge

Mon Jun 26 23:42:34 PDT 2006

Quoting r. Pete Wyckoff <pw at osc.edu>:
> Subject: max_send_sge < max_sge
> 
> Using stock 2.6.17.1, with verbs 1.0.3-1.fc4 and mthca 1.0.2-1.fc4
> with MT25204, this line:
> 
>     ret = ibv_query_device(ctx, &hca_cap);
> 
> tells me that hca_cap.max_sge = 30.
> 
> However, this code fails, with the last kernel write returning EINVAL:
> 
>     memset(&att, 0, sizeof(att));
>     att.send_cq = 1024;
>     att.recv_cq = 1024;
>     att.cap.max_recv_wr = 512;
>     att.cap.max_send_wr = 512;
>     att.cap.max_recv_sge = 30;
>     att.cap.max_send_sge = 30;
>     att.qp_type = IBV_QPT_RC; 
>     qp = ibv_create_qp(pd, &att);

Some Mellanox HCAs support different maximum sge values for the send queue
than for the receive queue, or for different QP types. ibv_query_device returns
the maximum value the hardware can support across these cases.

> Is this a known issue?

Yes. The fact that ibv_query_device reports some value in hca_cap does not
guarantee that ibv_create_qp will succeed with those parameters. For example,
the system administrator might have imposed a limit on the amount of memory you
can pin down, in which case you will get ENOMEM.

> Should I always subtract 1 from the reported max on the send side?  Just for
> this hardware?

Unless you actually use them, requesting the absolute maximum number of sge
entries supported by the hardware does not seem, to me, to make sense: it will
just slow you down and waste resources.  Is there a protocol out there that
actually has a use for 30 sge?

In my opinion, for an application to be robust it has to either use small
values that empirically work on most systems, or be able to scale down and
require fewer resources if an allocation fails.
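
The scale-down option could look roughly like the sketch below. It is only an
illustration: try_create_fn is a hypothetical callback that would wrap the
ibv_create_qp call and return non-zero on success, and create_with_fallback
and fake_create are made-up names, not part of libibverbs. fake_create just
simulates a device that rejects anything above a given limit, so the retry
loop can be exercised without hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: attempt QP creation with the given sge counts.
 * In a real application this would fill in ibv_qp_init_attr and call
 * ibv_create_qp; here it is an assumption made for illustration.
 * Returns non-zero on success, zero on failure. */
typedef int (*try_create_fn)(int send_sge, int recv_sge, void *arg);

/* Start at want_sge and step down by one until creation succeeds or we
 * drop below min_sge. Returns the sge count that worked, or 0 if none did. */
int create_with_fallback(try_create_fn try_create, void *arg,
                         int want_sge, int min_sge)
{
    assert(try_create != NULL);

    for (int sge = want_sge; sge >= min_sge && sge > 0; sge--) {
        if (try_create(sge, sge, arg))
            return sge;
    }
    return 0;
}

/* Simulated device for testing: accepts requests only up to *(int *)arg. */
int fake_create(int send_sge, int recv_sge, void *arg)
{
    int limit = *(int *)arg;

    return send_sge <= limit && recv_sge <= limit;
}
```

With a simulated limit of 29, a request for 30 would settle on 29 after one
retry. Stepping down by one is the simplest policy; an application that only
needs a few sge entries would do better to ask for that smaller number up
front.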

-- 
MST