[openib-general] ib_mad.c comments
Hal Rosenstock
halr at voltaire.com
Sat Sep 11 11:12:17 PDT 2004
On Sat, 2004-09-11 at 11:54, Roland Dreier wrote:
> Roland> static u32 ib_mad_client_id = 0;
>
> Hal> This will be locked. Rather than a linked list, this will
> Hal> become an indexed table as this will be more efficient when
> Hal> looking up the client in the receive path.
>
> Hmm... seems like we wouldn't want a static limit on the table size.
> Maybe we should use <linux/idr.h> to handle allocating the 32-bit IDs.
There will be a static limit to the number of clients. We could either
use this as a direct index (or have an additional level of indirection)
and have the client IDs be from a larger ID space. As I don't see a
benefit to this right now (I don't think clients will be very dynamic),
I will start with the direct mapping approach. It can easily be changed
if this is the wrong decision.
> Hal> There used to be a problem with not doing this with some old
> Hal> Linux kernels. I will eliminate the static WC and the
> Hal> comment. Sent patch for this.
>
> What was the problem?
I don't know the exact problem, just what I wrote about the "origin" of
this. The precise origin of this has not been explained to me (and it is
not worth chasing down).
> Hal> To minimize time spent in non process context. Is this an
> Hal> overly conservative approach ?
>
> It's probably OK for now. Probably the correct answer is to use
> tasklets and run in softirq context though (since process context can
> be starved for an arbitrarily long time).
OK. I will put changing this on the futures list for now.
> Roland> I don't think it's a good idea to use a semaphore and
> Roland> signals to control the worker thread. Better would be a
> Roland> wait queue and something like wait_event().
>
> Hal> Is there a problem with this or is this an efficiency issue ?
>
> Using signals tends to be a bad idea because your thread can get
> killed when init sends SIGKILL to all processes while shutting down,
> and then if you need to send MADs later, say to unmount IB-attached
> storage, you're in trouble. The semaphore is not so good because if
> 100 wakeups come before the thread runs, it will wake up 100 times in
> a row (even if it does all the work on the first iteration). Plus
> it's just not idiomatic in the kernel so it makes the code harder to
> review and maintain.
Chainging this is now on my short term list.
> Roland> These don't seem to merit being macros. If you really
> Roland> want they could be inline functions but I don't see any
> Roland> use of the "up" member outside of the macros anyway, so
> Roland> maybe you can just kill them. It seems hard to think of
> Roland> how to test "up" in a way that's not racy.
>
> Hal> up was to be used to qualify whether to proceed with posting
> Hal> MADs to send. Why is it racy if the check of it also takes
> Hal> the lock ?
>
> Well, you have to be careful of
>
> context #1 context #2
>
> lock
> see that up is set
> unlock
> lock
> clear up
> unlock
> do work relying on up
> being set...oops
>
> Basically it means that anything that relies on up being set has to
> hold the lock across the whole operation. This includes posting sends
> I assume, which means that the lock has to be a spinlock (since the
> send posting can happen in interrupt context). But that means you
> can't do any operations that sleep and rely on up staying consistent.
> So it starts to look messy to me (it does seem doable but there may be
> details I'm missing).
Got it. It will be removed. I think it means there may be some other
cleanup cases to deal with but they might have been there anyway.
Thanks.
-- Hal
More information about the general
mailing list