[ofa-general] Re: Kernel panic in IPoIB stability testing

Jack Morgenstein jackm at dev.mellanox.co.il
Tue Feb 3 22:46:48 PST 2009


On Tuesday 03 February 2009 19:56, Yossi Etigin wrote:
> I think it comes from unicast_arp_send.
> Consider this scenario:
> - paths are flushed (opensm up/down).
> - unicast_arp_send() is called with a path in priv->path_list. path->valid is 0.
> - path_rec_start() fails with -EAGAIN (-11) because alloc_mad() fails - no sm ah (yet)
>   (see the prints just before the panic).
> - unicast_arp_send calls() path_free().
> - path memory is overwritten.
> - __ipoib_dev_flush() is called again.
> - mark_paths_invalid() tries to iterate over priv->path_list and gets kernel panic
>   because path->list became invalid.

I think you are right.

In unicast_arp_send, we have the following code:
	path = __path_find(dev, phdr->hwaddr + 4);
	if (!path || !path->valid) {
		if (!path)
			path = path_rec_create(dev, phdr->hwaddr + 4);
		if (path) {
			/* put pseudoheader back on for next time */
			skb_push(skb, sizeof *phdr);
			__skb_queue_tail(&path->queue, skb);

			if (path_rec_start(dev, path)) {
				spin_unlock(&priv->lock);
				path_free(dev, path);
				return;
			} else
				__path_add(dev, path);
		} else {
			++dev->stats.tx_dropped;
			dev_kfree_skb_any(skb);
		}

		spin_unlock(&priv->lock);
		return;
	}

It was originally written without the path->valid check in the "if", and so was based on the path record
being allocated within the "if".  In this case, the path record was not yet inserted into the path list.
When you added the "valid" processing, you did not take this into account.

You need code something like the following:

	path = __path_find(dev, phdr->hwaddr + 4);
	if (!path || !path->valid) {
		int had_path = 0;
		if (!path)
			path = path_rec_create(dev, phdr->hwaddr + 4);
		else
		    had_path = 1;
		if (path) {
			/* put pseudoheader back on for next time */
			skb_push(skb, sizeof *phdr);
			__skb_queue_tail(&path->queue, skb);

			if (path_rec_start(dev, path)) {
				if (had_path)
					/* detach from path list here under spinlock */
				spin_unlock(&priv->lock);
				path_free(dev, path);
				return;
			} else if (!had_path)
				__path_add(dev, path);
		} else {
			++dev->stats.tx_dropped;
			dev_kfree_skb_any(skb);
		}

		spin_unlock(&priv->lock);
		return;
	}

- Jack



More information about the general mailing list