[openib-general] [PATCH] RDMA/iwcm: Bugs in cm_conn_req_handler()

Roland Dreier rdreier at cisco.com
Thu Feb 8 20:23:18 PST 2007


BTW, while looking at iwcm.c, I noticed the following highly dubious
code for the first time:

	static int iwcm_deref_id(struct iwcm_id_private *cm_id_priv)
	{
		int ret = 0;
	
		BUG_ON(atomic_read(&cm_id_priv->refcount)==0);
		if (atomic_dec_and_test(&cm_id_priv->refcount)) {
			BUG_ON(!list_empty(&cm_id_priv->work_list));
			if (waitqueue_active(&cm_id_priv->destroy_comp.wait)) {
				BUG_ON(cm_id_priv->state != IW_CM_STATE_DESTROYING);
				BUG_ON(test_bit(IWCM_F_CALLBACK_DESTROY,
						&cm_id_priv->flags));
				ret = 1;
			}
			complete(&cm_id_priv->destroy_comp);
		}
	
		return ret;
	}

The test of waitqueue_active on destroy_comp.wait looks really bad for
two reasons: first, it is relying on an internal implementation detail
of struct completion that really shouldn't be used by generic code.
And second, it seems to me that this doesn't even work right, since
there is a race something like the following:

iw_destroy_cm_id():
destroy_cm_id(cm_id); // still 1 ref left

				cm_work_handler():
					if (iwcm_deref_id()) // drop last ref
						return;
					// no one waiting yet, doesn't
					// return, but destroy_comp is
					// signaled

wait_for_completion(&cm_id_priv->destroy_comp);
// destroy_comp is signaled, proceed
kfree(cm_id_priv);

					// continue using cm_id_priv
					// OOPS

I don't understand this code well enough for the fix to be obvious.

 - R.




More information about the general mailing list