[ofa-general] Re: [RFC][PATCH] last wqe event handler patch

Roland Dreier rdreier at cisco.com
Wed Jun 25 14:43:19 PDT 2008


 > According to our stress test, a 5 second timeout is way too short. What do
 > you mean by adding locking there?

Interesting... I wonder whether it really takes that long for everything
to finish draining, or whether the system is so busy that it sees a
spurious timeout.  The intention of all of this is that the timeout
should "never happen" unless the hardware really is stuck.

Anyway, by locking I meant that we take the lock around

			/*
			 * assume the HW is wedged and just free up everything.
			 */
			list_splice_init(&priv->cm.rx_flush_list,
					 &priv->cm.rx_reap_list);
			list_splice_init(&priv->cm.rx_error_list,
					 &priv->cm.rx_reap_list);

so that we don't end up moving entries out from under other code that
is processing these lists.  But now I see that the lock is already held
at that point, so maybe that's nonsense.
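
To make the idea concrete, here is a minimal userspace sketch of "splice
both lists onto the reap list while holding the lock".  It is not the
driver code: the pthread mutex stands in for whatever lock the driver
holds, and struct rx_lists, rx_force_reap() and the small list helpers
are hypothetical names made up for illustration.  Only the splice
behaviour (move all entries to the front of the target, then
reinitialise the source) is meant to mirror list_splice_init().

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head (hypothetical). */
struct list_head {
	struct list_head *next, *prev;
};

static inline void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static inline int list_is_empty(const struct list_head *h)
{
	return h->next == h;
}

static inline void list_add_tail_entry(struct list_head *entry,
					struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

static inline size_t list_count(const struct list_head *h)
{
	size_t n = 0;
	const struct list_head *p;

	for (p = h->next; p != h; p = p->next)
		n++;
	return n;
}

/*
 * Move every entry of 'from' to the front of 'to' and leave 'from'
 * empty, which is the contract of the kernel's list_splice_init().
 */
static inline void splice_init(struct list_head *from, struct list_head *to)
{
	struct list_head *first, *last, *at;

	if (list_is_empty(from))
		return;

	first = from->next;
	last = from->prev;
	at = to->next;

	first->prev = to;
	to->next = first;
	last->next = at;
	at->prev = last;
	list_init(from);
}

/* Hypothetical container for the three RX lists and their lock. */
struct rx_lists {
	pthread_mutex_t lock;	/* assumption: stands in for the driver lock */
	struct list_head flush_list;
	struct list_head error_list;
	struct list_head reap_list;
};

static inline void rx_lists_init(struct rx_lists *rx)
{
	pthread_mutex_init(&rx->lock, NULL);
	list_init(&rx->flush_list);
	list_init(&rx->error_list);
	list_init(&rx->reap_list);
}

/*
 * Timeout path: stop waiting for the drain and hand everything to the
 * reap list.  Both splices happen under the lock, so no other thread
 * that takes the same lock can see a half-moved list.
 */
static inline void rx_force_reap(struct rx_lists *rx)
{
	pthread_mutex_lock(&rx->lock);
	splice_init(&rx->flush_list, &rx->reap_list);
	splice_init(&rx->error_list, &rx->reap_list);
	assert(list_is_empty(&rx->flush_list));
	assert(list_is_empty(&rx->error_list));
	pthread_mutex_unlock(&rx->lock);
}
```

If the caller already holds the lock when it reaches the timeout branch,
taking it again as rx_force_reap() does would be wrong (a plain mutex or
spinlock would deadlock), which is why the extra locking may be
unnecessary in that case.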

What exactly is causing the crash here?

 - R.


