[openib-general] Re: ib_mad.c comments
Hal Rosenstock
halr at voltaire.com
Fri Sep 10 14:43:36 PDT 2004
Hi Sean,
Thanks for reviewing this early version, which is incomplete.
On Wed, 2004-09-08 at 19:47, Sean Hefty wrote:
> Here's a list of comments from reviewing ib_mad.c. I'll use this list as my to-do list for the GSI. Several of these items can be deferred during implementation. After we meet tomorrow, I will begin creating patches. Overall, it's a good start.
>
> ib_mad_reg():
> - Need to lock when checking/setting version/class/methods.
This item is on my TODO list for ib_mad.c.
> - Need to support registrations for "all" methods of a given class.
> (We may want the initial implementation to only do this for now,
> to shorten the development time.)
This can be done today, depending on the definition of "all". "All" is the
entire method bit mask; if "all" means all the methods valid for a class,
that is a specific bit mask per class. Is this proposing a shortcut way
to do this? If so, is it much of a savings for the client?
Do you think it adds that much more complexity to have multiple clients
for methods of the same class? Wouldn't supporting only "all methods for
a class" initially be a problem for running the SM and SMA concurrently?
> - Should we reference qp0 and qp1 with the registration?
Either way (struct ib_qp * (for qp0 or qp1) or enum ib_qp_type) is fine
with me. It seems to me that the enum approach is easier for the client.
Does the client need a pointer to the QP for something else?
> - Need to ensure unique tids in case of wrapping.
Are you referring to making sure that the high TID is not in use? It is
on my TODO list to change this to use the client_id as an index (to save
walking a linked list, and instead index directly into a table based on
the client_id). This will be done after all the more straightforward
changes.
> ib_mad_post_send():
> - We should return the error code from ib_post_send in order to
> handle overruns differently.
Fixed. Posted patch for this.
> - The print level should be lowered from error.
There are 2 prints, and I'm not sure which should be lowered. Both, or
only the one after ib_post_send?
> - Should we avoid casting the list_head to a structure where possible?
I will address this as part of my response to Roland's comments on
ib_mad.c.
> allocate_method_table():
> - Can just use memset to clear the table.
I changed this to use memset. Posted patch for this.
> check_class_table():
> - Has an extra '{'.
Where is the extra '{'?
> ib_mad_recv_done_handler():
> ib_mad_send_done_handler():
> - Not sure why these calls search for the corresponding work request.
On the send side, it should just be the head entry; I will change that.
Right now on the receive side, there is one receive list, and the
receives are posted for QPs 0 and 1 (and ultimately other QPs), so the
right posting needs to be found. I think there is additional information
in the WR which is needed for the callbacks. I didn't get far enough on
the receive side, so I will defer this part of the answer for now.
> ib_mad_post_receive_mads():
> - I think we can just pass &qp0 or &qp1, rather than a type to
> ib_mad_post_receive_mad.
Right now, the QP type is saved as part of the private MAD. This may be
unnecessary; I won't be sure until I complete the receive side.
> - Print level should be lowered from error
There are 2 prints, and I'm not sure which should be lowered. Both, or
only the one after ib_post_recv?
> - We can track the number of posted receives to avoid posting overruns.
This seems like an optimization. I will put it on my TODO list for now.
> struct ib_mad_device_private:
> - If we make qp0 and qp1 an array, it may simplify the code and remove
>   several checks from the code.
Good point. It does (simplify the code :-). I already posted a patch for
this.
> ib_mad_device_open():
> - A nit, but it's logically initializing a port on the device.
Fixed. Posted patch for this.
> - Remove +20 to CQ size.
Fixed. Posted patch for this.
> - We could change to using 1 PD/device, versus 1 PD/port.
Is there any advantage or disadvantage one way or the other?
> - Not sure if we need to support max_sge on the send queue. This may
>   be substantially larger than what we need. At a minimum, I think
>   that we need 2 for optimal RMPP support. I'm not sure where the
>   trade-off between SGEs versus copying into a single buffer lies.
I'm not following where the minimum of 2 SGEs for optimal RMPP support
comes from.
-- Hal