[ofa-general] Re: Kernel panic in IPoIB stability testing
Jack Morgenstein
jackm at dev.mellanox.co.il
Tue Feb 3 23:20:25 PST 2009
On Wednesday 04 February 2009 08:46, Jack Morgenstein wrote:
> On Tuesday 03 February 2009 19:56, Yossi Etigin wrote:
> > I think it comes from unicast_arp_send.
> > Consider this scenario:
> > - paths are flushed (opensm up/down).
> > - unicast_arp_send() is called with a path in priv->path_list. path->valid is 0.
> > - path_rec_start() fails with -EAGAIN (-11) because alloc_mad() fails - no sm ah (yet)
> > (see the prints just before the panic).
> > - unicast_arp_send() calls path_free().
> > - path memory is overwritten.
> > - __ipoib_dev_flush() is called again.
> > - mark_paths_invalid() tries to iterate over priv->path_list and gets kernel panic
> > because path->list became invalid.
>
> I think you are right.
How about this:
    path = __path_find(dev, phdr->hwaddr + 4);
    if (!path || !path->valid) {
        int had_path = 0;

        if (!path)
            path = path_rec_create(dev, phdr->hwaddr + 4);
        else
            had_path = 1;
        if (path) {
            /* put pseudoheader back on for next time */
            skb_push(skb, sizeof *phdr);
            __skb_queue_tail(&path->queue, skb);

            if (path_rec_start(dev, path)) {
                if (had_path) {
                    list_del(&path->list);
                    rb_erase(&path->rb_node,
                             &priv->path_tree);
                }
                spin_unlock(&priv->lock);
                path_free(dev, path);
                return;
            } else if (!had_path)
                __path_add(dev, path);
        } else {
            ++dev->stats.tx_dropped;
            dev_kfree_skb_any(skb);
        }

        spin_unlock(&priv->lock);
        return;
    }
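
For anyone who wants to see the idea outside the driver, here is a rough userspace sketch of the same pattern: unlink the entry from the list before freeing it when the lookup fails, so that a later walk over the list never touches freed memory. This is not the IPoIB code. The list helpers only imitate the kernel's circular list, every demo_* name is made up for illustration, and the rb-tree and locking are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal userspace stand-in for the kernel's circular struct list_head. */
struct list_node {
	struct list_node *next, *prev;
};

/* Hypothetical, stripped-down path entry; not the real struct ipoib_path. */
struct demo_path {
	struct list_node list;
	int valid;
};

static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct list_node *node, struct list_node *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

static void list_del(struct list_node *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->next = node->prev = NULL;
}

/* Walks the list roughly as mark_paths_invalid() does; returns entries seen. */
static int demo_mark_paths_invalid(struct list_node *head)
{
	int n = 0;

	for (struct list_node *p = head->next; p != head; p = p->next) {
		struct demo_path *path = (struct demo_path *)
			((char *)p - offsetof(struct demo_path, list));
		path->valid = 0;
		n++;
	}
	return n;
}

/* Stand-in for path_rec_start() failing while no SM address handle exists. */
static int demo_path_rec_start(struct demo_path *path)
{
	(void)path;
	return -EAGAIN;
}

/* Error path under discussion: unlink a listed entry first, then free it. */
static int demo_send(struct list_node *head, struct demo_path *path,
		     int had_path)
{
	int ret = demo_path_rec_start(path);

	if (ret) {
		if (had_path)
			list_del(&path->list);
		free(path);
		return ret;
	}
	if (!had_path)
		list_add_tail(&path->list, head);
	return 0;
}

/* Builds a three-entry list, fails a lookup on the middle entry,
 * then walks whatever is left and reports how many entries remain. */
static int demo_run(void)
{
	struct list_node head;
	struct demo_path *p[3];
	int ret, remaining;

	list_init(&head);
	for (int i = 0; i < 3; i++) {
		p[i] = calloc(1, sizeof(*p[i]));
		if (!p[i])
			abort();
		list_add_tail(&p[i]->list, &head);
	}

	ret = demo_send(&head, p[1], 1);
	assert(ret == -EAGAIN);

	remaining = demo_mark_paths_invalid(&head);
	free(p[0]);
	free(p[2]);
	return remaining;
}
```

Calling demo_run() from a test harness should leave two entries on the list after the failed lookup. Without the list_del() call in demo_send(), the walk would follow pointers into freed memory, which is the situation Yossi describes.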
- Jack