[ofa-general] Re: Kernel panic in IPoIB stability testing
Moni Shoua
monis at Voltaire.COM
Wed Feb 4 07:45:28 PST 2009
> I doubt it. You are leaving a deleted path record as part of the path list.
> This is list corruption (since the list pointers themselves are part of the
> path record structure -- what if this returned storage is re-allocated?).
>
You are right.
> I think the correct fix (after your previous posted comment) is:
> path = __path_find(dev, phdr->hwaddr + 4);
> if (!path || !path->valid) {
> int had_path = 0;
> if (!path)
> path = path_rec_create(dev, phdr->hwaddr + 4);
> else
> had_path = 1;
> if (path) {
> /* put pseudoheader back on for next time */
> skb_push(skb, sizeof *phdr);
> __skb_queue_tail(&path->queue, skb);
>
> if (path_rec_start(dev, path)) {
> if (had_path) {
> list_del(&path->list);
> rb_erase(&path->rb_node,
> &priv->path_tree);
> }
> spin_unlock_irqrestore(&priv->lock, flags);
> path_free(dev, path);
> return;
> } else
> __path_add(dev, path);
> } else {
> ++dev->stats.tx_dropped;
> dev_kfree_skb_any(skb);
> }
>
> spin_unlock_irqrestore(&priv->lock, flags);
> return;
> }
>
> My only question here is:
> Do we have to worry about netif_tx_lock_bh(dev) (as taken in ipoib_flush_paths)?
> (If we do, we have a problem).
>
Besides the locking issue that I hadn't think about yet what if we this fix looks the right thing to do.
But what if we leave the path without freeing it even if path_rec_start() fails?
This would leave a path which is not valid in path_list which is not forbidden state as
I conclude (after all this is the state the function was called)
In this way, I think that we don't have to worry about locks.
and the code will look like this
if (!path || !path->valid) {
if (!path)
path = path_rec_create(dev, phdr->hwaddr + 4);
if (path) {
/* put pseudoheader back on for next time */
skb_push(skb, sizeof *phdr);
__skb_queue_tail(&path->queue, skb);
if (!path->query && path_rec_start(dev, path)) {
spin_unlock_irqrestore(&priv->lock, flags);
- path_free(dev, path);
return;
} else
__path_add(dev, path);
} else {
++priv->stats.tx_dropped;
dev_kfree_skb_any(skb);
}
spin_unlock_irqrestore(&priv->lock, flags);
return;
}
More information about the general
mailing list