[openib-general] ib_mad.c comments
Roland Dreier
roland at topspin.com
Wed Sep 8 20:07:23 PDT 2004
Huh... I didn't even notice it was checked in... anyway, my comments
follow after some comments on Sean's comments:
Sean> Need to lock when checking/setting version/class/methods.
I agree for the initial implementation. Ultimately RCU seems better
but I would recommend sticking with locking to start with since it's
much easier to code correctly.
Sean> We should return the error code from ib_post_send in order
Sean> to handle overruns differently.
What did we decide about how to handle someone posting more sends than
the underlying work queue can hold? In any case I agree with this.
Sean> Should we avoid casting the list_head to a structure where
Sean> possible?
Yes, definitely. It's much better to do something like
&mystruct->list rather than relying on the fact that mystruct has a
struct list_head as its first member. In fact the usage of list.h is
pretty broken throughout ib_mad.c, see below.
Sean> Not sure why these calls search for the corresponding work request.
Yes -- we know the next request to complete will always be the oldest
one we have around, right?
Sean> Not sure if we need to support max_sge on the send queue.
Sean> This may be substantially larger than what we need. At a
Sean> minimum, I think that we need 2 for optimal RMPP support.
Sean> I'm not sure where the trade-off between SGE versus
Sean> copying into a single buffer lies.
I'm not sure there's much practical difference between copying and
using the HCA to do a gather/scatter on a buffer of size 256. The big
difference is memory per WQE (at least for mthca): supporting the
max_sge means each WQE will be about 1 KB, while using a smaller
number means each WQE could be about 128 bytes.
OK, my comments (which are based on only a quick read and therefore
focused mostly on low-level coding details):
kmem_cache_t *ib_mad_cache;
seems to be unused -- should be static anyway.
static u32 ib_mad_client_id = 0;
needs to be protected by a lock when used later
#define IB_MAD_DEVICE_LIST_LOCK_VAR unsigned long ib_mad_device_list_sflags
#define IB_MAD_DEVICE_LIST_LOCK() spin_lock_irqsave(&ib_mad_device_list_lock, ib_mad_device_list_sflags)
#define IB_MAD_DEVICE_LIST_UNLOCK() spin_unlock_irqrestore(&ib_mad_device_list_lock, ib_mad_device_list_sflags)
Don't use this idiom ... just use the spinlock functions directly. It
makes locking code harder to read and review, it leads to wasteful
stuff like the below (in ib_mad_reg()):
        IB_MAD_DEVICE_LIST_LOCK_VAR;
        IB_MAD_AGENT_LIST_LOCK_VAR;
and besides, Documentation/CodingStyle says
"macros that depend on having a local variable with a magic name
might look like a good thing, but it's confusing as hell when one
reads the code and it's prone to breakage from seemingly innocent
changes."
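For example, instead of the macro pair, the list manipulation could
just be written out directly, something like (a sketch; I'm assuming
the private struct has a list member named "list"):

```c
        unsigned long flags;

        spin_lock_irqsave(&ib_mad_device_list_lock, flags);
        list_add_tail(&priv->list, &ib_mad_device_list);
        spin_unlock_irqrestore(&ib_mad_device_list_lock, flags);
```

which is shorter than the macro version and makes the locking obvious
to anyone reading or reviewing the function.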
/*
 * ib_mad_reg - Register to send/receive MADs.
 * @device - The device to register with.
Start with /** for kernel doc to pick this up. Might be better to put
it in a header file so that it's easier to find the documentation (but
it's OK to leave it in a .c).
        struct ib_mad_device_private *entry, *priv = NULL,
                *head = (struct ib_mad_device_private *) &ib_mad_device_list;
This definition of head is totally broken, since ib_mad_device_list is
declared as:
static struct list_head ib_mad_device_list;
so trying to use it as a struct ib_mad_device_private is just going
off into random memory. However there's no reason to even have a
variable named head, since it seems you only use it in:
list_for_each(entry, head) {
This really should be
list_for_each_entry(entry, &ib_mad_device_list, list) {
and the definition of struct ib_mad_device_private needs to be fixed from
struct ib_mad_device_private {
        struct ib_mad_device_private *next;
to
struct ib_mad_device_private {
        struct list_head list;
(you don't have to use the name list for your struct list_head member;
that's just my habit).
        list_for_each(entry2, head2) {
                if (entry2->agent == mad_agent_priv->agent) {
                        list_del((struct list_head *)entry2);
                        break;
                }
        }
This is broken for a couple of reasons: misuse of list_for_each as
just described; also, you can't delete items from a list while walking
through it with list_for_each (use list_for_each_safe instead);
finally, there's no reason to walk a list to find the entry you just
added in the same function -- just call list_del on the entry
directly, since you should still have it around.
Pretty much all of these comments apply to all use of the list.h
macros in the file -- most look wrong.
What context is it allowed to call ib_mad_post_send() from? We never
discussed this, but since the current implementation allocates work
requests with
mad_send_wr = kmalloc(sizeof *mad_send_wr, GFP_KERNEL);
right now it can only be called from process context with no locks
held. This seems like it violates the principle of least surprise,
because ib_post_send() can be called from any context. Also, the
failure case
        if (!mad_send_wr) {
                printk(KERN_ERR "No memory for ib_mad_send_wr_private\n");
                return -ENOMEM;
        }
needs to set bad_send_wr.
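i.e., something like this (a sketch; I'm assuming the output parameter
is named bad_send_wr, following ib_post_send()):

```c
        if (!mad_send_wr) {
                printk(KERN_ERR "No memory for ib_mad_send_wr_private\n");
                *bad_send_wr = send_wr;  /* report which WR failed */
                return -ENOMEM;
        }
```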
ib_mad_recv_done_handler() seems to be missing a call to pci_unmap_single().
static u8 convert_mgmt_class(struct ib_mad_reg_req *mad_reg_req)
{
        u8 mgmt_class;

        /* Alias IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE to 0 */
        if (mad_reg_req->mgmt_class == IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE) {
                mgmt_class = 0;
        } else {
                mgmt_class = mad_reg_req->mgmt_class;
        }
        return mgmt_class;
}
I'd rewrite this as
static inline u8 convert_mgmt_class(struct ib_mad_reg_req *mad_reg_req)
{
        return mad_reg_req->mgmt_class == IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE ?
                0 : mad_reg_req->mgmt_class;
}
or just open code it in the two places it's used.
static int allocate_method_table(struct ib_mad_mgmt_method_table **method)
{
        /* .. */
        return ENOMEM;
probably should be -ENOMEM;
static void ib_mad_completion_handler(struct ib_mad_device_private *priv)
{
        /*
         * For stack overflow safety reason, WC is static here.
         * This callback may not be called more than once at the same time.
         */
        static struct ib_wc wc;
Seems like a bad plan to me -- on an SMP machine with multiple HCAs
(or even multiple ports on a single HCA) it seems like we want to
multithread MAD processing rather than serializing it. (In fact Yaron
has made a lot of noise about running on giant SGI NUMA machines with
millions of HCAs, where this looks especially bad.) Also the comment
seems to be wrong -- there seems to be one thread per HCA, so multiple
copies of the callback can run at once.
static int ib_mad_thread(void *param)
{
        struct ib_mad_device_private *priv = param;
        struct ib_mad_thread_data *thread_data = &priv->thread_data;

        lock_kernel();
        daemonize("ib_mad-%-6s-%-2d", priv->device->name, priv->port);
        unlock_kernel();
Just use kthread_create() to start your thread and handle all this.
Even though the current Topspin stack uses a MAD processing thread per
HCA, I'm not sure it's the best design. Why do we need to defer work
to process context?
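A kthread-based version of the startup might look something like this
(a sketch; the mad_thread member name is made up):

```c
        priv->mad_thread = kthread_create(ib_mad_thread, priv,
                                          "ib_mad-%s-%d",
                                          priv->device->name, priv->port);
        if (IS_ERR(priv->mad_thread))
                return PTR_ERR(priv->mad_thread);
        wake_up_process(priv->mad_thread);
```

which gets rid of the lock_kernel()/daemonize()/unlock_kernel() dance
entirely.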
sema_init(&thread_data->sem, 0);
Seems like a race condition here ... what happens if someone else
tries to use the semaphore before the thread has gotten a chance to
run? In any case...
        while (1) {
                if (down_interruptible(&thread_data->sem)) {
                        printk(KERN_DEBUG "Exiting ib_mad thread\n");
                        break;
                }
I don't think it's a good idea to use a semaphore and signals to
control the worker thread. Better would be a wait queue and something
like wait_event().
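For example, assuming a wait_queue_head_t in thread_data and some
hypothetical mad_work_pending() predicate, the loop could become:

```c
        while (!kthread_should_stop()) {
                wait_event_interruptible(thread_data->wait,
                                         mad_work_pending(priv) ||
                                         kthread_should_stop());
                /* drain the completion queue here */
        }
```

and the completion callback then just does wake_up(&thread_data->wait)
instead of up()'ing a semaphore, with no reliance on signals for
shutdown.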
#define IB_MAD_DEVICE_SET_UP(__device__) {\
        IB_MAD_DEVICE_LIST_LOCK_VAR;\
        IB_MAD_DEVICE_LIST_LOCK();\
        (__device__)->up = 1;\
        IB_MAD_DEVICE_LIST_UNLOCK();}
#define IB_MAD_DEVICE_SET_DOWN(__device__) {\
        IB_MAD_DEVICE_LIST_LOCK_VAR;\
        IB_MAD_DEVICE_LIST_LOCK();\
        (__device__)->up = 0;\
        IB_MAD_DEVICE_LIST_UNLOCK();}
These don't seem to merit being macros. If you really want, they
could be inline functions, but I don't see any use of the "up" member
outside of the macros anyway, so maybe you can just kill them. It
seems hard to think of a way to test "up" that isn't racy.
        for (i = 0; i < num_ports; i++) {
                ret = ib_mad_device_open(device, i);
This is wrong -- for a CA you need to handle ports 1 ... num_ports,
while a switch just uses port 0.
Thanks,
Roland