[ofa-general] NetEffect, iw_nes and kernel warning
Roland Dreier
rdreier at cisco.com
Tue Jan 27 15:53:16 PST 2009
Interesting... looks like an unfortunate interaction with unclear
locking rules. See below for full explanation.
BTW, what workload are you running to hit this?
I assume you have CONFIG_HIGHMEM set?
> WARNING: at kernel/softirq.c:136 local_bh_enable+0x9b/0xa0()
I assume this is
	WARN_ON_ONCE(in_irq() || irqs_disabled());
The interesting parts of the stack trace seem to be (reversing the order
so the story makes sense):
[<e8e3f815>] nes_netdev_start_xmit+0x815/0x8a0 [iw_nes]
nes_netdev_start_xmit() calls skb_linearize() for nonlinear skbs it
can't handle, which calls __pskb_pull_tail():
[<c048982c>] __pskb_pull_tail+0x5c/0x2e0
__pskb_pull_tail() calls skb_copy_bits():
[<c0489c05>] skb_copy_bits+0x155/0x290
At least in some cases, skb_copy_bits() calls kmap_skb_frag() and, more
to the point, kunmap_skb_frag(), which looks like:
static inline void kunmap_skb_frag(void *vaddr)
{
	kunmap_atomic(vaddr, KM_SKB_DATA_SOFTIRQ);
#ifdef CONFIG_HIGHMEM
	local_bh_enable();
#endif
}
which leads to:
[<c012a79b>] local_bh_enable+0x9b/0xa0
which hits the irqs_disabled() warning because iw_nes is using LLTX, and
nes_netdev_start_xmit() does:
	local_irq_save(flags);
	if (!spin_trylock(&nesnic->sq_lock)) {
at the very beginning.
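To make the sequence concrete, here is a rough userspace model of the chain above. It is only a sketch, not kernel code: every model_* name is made up for illustration, the in_irq() half of the check is left out, and it assumes the kmap_skb_frag() side does a matching local_bh_disable() under CONFIG_HIGHMEM (the code quoted above only shows the unmap side).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace model only: none of these model_* names exist in the kernel. */
static bool model_irqs_off;  /* stands in for irqs_disabled() */
static int model_bh_count;   /* nesting depth of BH-disabled sections */
static int model_warnings;   /* warnings raised during one model_xmit() */

static void model_local_bh_disable(void)
{
	model_bh_count++;
}

static void model_local_bh_enable(void)
{
	/* Models the irqs_disabled() half of the WARN_ON_ONCE() check. */
	if (model_irqs_off) {
		model_warnings++;
		fprintf(stderr, "WARNING: local_bh_enable() with IRQs disabled\n");
	}
	assert(model_bh_count > 0);
	model_bh_count--;
}

/* Models the CONFIG_HIGHMEM map/unmap pair around one fragment copy. */
static void model_copy_one_frag(void)
{
	model_local_bh_disable();  /* assumed kmap_skb_frag() side */
	/* ... the copy out of the mapped page would happen here ... */
	model_local_bh_enable();   /* kunmap_skb_frag() side, as quoted above */
}

/* Returns the number of warnings raised while "transmitting" one nonlinear skb. */
static int model_xmit(bool irqs_saved_by_driver)
{
	model_warnings = 0;
	model_irqs_off = irqs_saved_by_driver;  /* local_irq_save() in the driver */
	model_copy_one_frag();                  /* skb_linearize() -> skb_copy_bits() */
	model_irqs_off = false;                 /* local_irq_restore() */
	return model_warnings;
}
```

In this model, model_xmit(true) (the LLTX-style path that saves IRQs first) raises one warning, while model_xmit(false) (a caller that leaves IRQs enabled) raises none.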
The best solution is probably for iw_nes to stop using LLTX and use the
main netdev lock... but actually I still don't see how it's safe for a
net driver to call skb_linearize() from its transmit routine, since
there's a chance that doing so will unconditionally enable BHs?
- R.