[ofw] IBAL CEP reference counting is... interesting

Sean Hefty sean.hefty at intel.com
Fri Jan 9 11:54:36 PST 2009


While trying to track down a ctrl-c hang, I noticed that the CM cep code uses a
ref_cnt that is allowed to drop to 0 while references are still held on the cep
(connection endpoint).

The problem is most easily seen in process_timewait(), where the ref_cnt of the
cep must be 0 before the cep is processed.  That means the cep structure is
expected to sit on a timewait list with a ref_cnt of 0.  Also, at the end of the
loop in process_timewait(), if the cep state is not cep_state_destroy, its state
is set to idle and the cep is left dangling.  (Either there's no reference on
the cep, or whatever has a reference on it has not incremented the ref_cnt.)
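
For comparison, here is a minimal sketch of the convention one would normally
expect, where the timewait list itself holds a reference and ref_cnt only
reaches 0 once nothing points at the cep.  This is a simplified model, not the
actual IBAL code: cep_t, cep_ref(), cep_deref(), timewait_insert() and
timewait_expire() are all hypothetical names, and locking is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model; not the real IBAL cep structure. */
typedef enum { CEP_IDLE, CEP_TIMEWAIT, CEP_DESTROY } cep_state_t;

typedef struct cep {
    int         ref_cnt;           /* one count per holder, lists included */
    cep_state_t state;
    bool        on_timewait_list;
    bool        freed;             /* stands in for actually freeing the cep */
} cep_t;

static void cep_ref(cep_t *cep)
{
    cep->ref_cnt++;
}

/* Drops one reference; returns true if it was the last one. */
static bool cep_deref(cep_t *cep)
{
    assert(cep->ref_cnt > 0);
    if (--cep->ref_cnt == 0) {
        cep->freed = true;
        return true;
    }
    return false;
}

/* The list takes its own reference, so a listed cep never has ref_cnt 0. */
static void timewait_insert(cep_t *cep)
{
    cep_ref(cep);
    cep->state = CEP_TIMEWAIT;
    cep->on_timewait_list = true;
}

/* On expiry the list gives its reference back instead of requiring 0. */
static bool timewait_expire(cep_t *cep)
{
    cep->on_timewait_list = false;
    if (cep->state != CEP_DESTROY)
        cep->state = CEP_IDLE;
    return cep_deref(cep);
}
```

Under this model a cep that goes idle after timewait is only kept alive if
something else still holds a counted reference.  Whether the existing cep code
can be moved to that model is a separate question.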

I'm not sure how to fix this.
 
As for the ctrl-c hang, I tracked that problem down to destroying the cep in the
established state.  (I have the remote endpoint of the connection blocked from
responding.)  The issue is that the cep being destroyed sends a DREQ, which
takes a reference on the cep.  That reference is not released until the DREQ
has been retried and has completely timed out, so the upper-level code is
blocked waiting for the destroy callback.

Destroying a cep needs to complete within some reasonable, bounded time, rather
than taking seconds or minutes depending on the remote CM response.  For large
clusters, the CM timeout can be huge.  My idea for a fix is to send the DREQ
once, without tying it to the cep, when it is initiated from the destroy call.
Comments?
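
A rough sketch of that idea, as a simplified model rather than a patch against
the real code.  All names here (cep_t, dreq_send_t, make_dreq(),
dreq_send_done()) are made up for illustration, and the destroy callback is
modeled as a flag that is set when ref_cnt reaches 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a cep whose destroy callback fires at ref_cnt 0. */
typedef struct cep {
    int  ref_cnt;
    bool destroy_cb_called;
} cep_t;

/* Hypothetical model of an outstanding DREQ send. */
typedef struct dreq_send {
    cep_t *owner;         /* NULL when the send is not tied to a cep */
    int    retries_left;
} dreq_send_t;

static void cep_deref(cep_t *cep)
{
    assert(cep->ref_cnt > 0);
    if (--cep->ref_cnt == 0)
        cep->destroy_cb_called = true;
}

/*
 * From the destroy path: send once, no retries, no reference on the cep.
 * From a normal disconnect: keep today's behavior, where the send holds
 * a reference until it completes or times out.
 */
static dreq_send_t make_dreq(cep_t *cep, bool from_destroy, int max_retries)
{
    dreq_send_t send;

    if (from_destroy) {
        send.owner = NULL;
        send.retries_left = 0;
    } else {
        cep->ref_cnt++;
        send.owner = cep;
        send.retries_left = max_retries;
    }
    return send;
}

/* Called when the send completes or finally times out. */
static void dreq_send_done(dreq_send_t *send)
{
    if (send->owner != NULL) {
        cep_deref(send->owner);
        send->owner = NULL;
    }
}
```

With the detached send, dropping the caller's reference is enough for the
destroy callback to run, no matter how long the remote CM takes to answer (or
whether it answers at all).  The cost is that a lost DREQ is never retried, so
the remote side would have to notice the disconnect some other way.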

- Sean



