[ofw] IBAL CM ignores responder resources of the passive side

Leonid Keller leonid at mellanox.co.il
Mon Jun 27 10:24:54 PDT 2011


I think there are several application, that should be fixed and tested with your propose.
See, for example, __ndi_fill_cm_rep()

BTW, could someone remind me why do we have ib_cm_rep (which doesn't contain resp_res at all) and iba_cm_rep ?
Is the first one regarded as outdated ?
Do we have two CM APIs ?
What's the diff ?

> -----Original Message-----
> From: Fab Tillier [mailto:ftillier at microsoft.com]
> Sent: Monday, June 27, 2011 7:07 PM
> To: Leonid Keller; Hefty, Sean
> Cc: ofw_list
> Subject: RE: IBAL CM ignores responder resources of the passive side
> 
> Leonid Keller wrote on Mon, 27 Jun 2011 at 00:54:40
> 
> > We did the below patch and got instantly the assert.
> 
> Running which ULP in WinOFED?  Looking at the code, the ND provider does the
> right thing.  WSD seems to not set resp_res and I'd say that was a bug.  Since
> the WSD provider gets updated with any driver update, there's little chance of
> users running into this if both get fixed concurrently.
> 
> > Which brought me to understanding, that assert should be removed, but 'if'
> > should be kept.
> 
> I think it's important to allow the CEP Manager to support 0 as a valid value,
> indicating that no RDMA reads are supported.  I would keep the assert and
> remove the if.  If you had a different value to indicate "auto-negotiate" you
> could have the proxy trap for zero and convert before passing down to the CEP
> manager.
> 
> > Why?
> > Because the active side do not look at ReponderResources and InitDepth of
> > the passive side (sent in REP).
> > It allows CM to set all the parameters.
> > The old code gave to the passive side a possibility not to negotiate
> > parameters, but to leave it to CM.
> > And I think, that we should keep this possibility.
> 
> Both the active and passive sides should support auto-negotiating, but it
> shouldn't be done using the value 0.
> 
> I think clients should be able to specify whatever values they desire, and the
> CM should then cap to the abilities of the hardware/negotiate on behalf of the
> client.  The actual values can always be examined during connection
> establishment (when retrieving the QP attributes for QP transitions) or
> queried from the QP.
> 
> If an application knows that it could have up to 8 RDMA reads in flight at a
> time, it could always use 8 as the value.  If the hardware doesn't support 8,
> it would cap it to whatever it does support.  The client would only get an
> error if the underlying hardware had a limit of zero (unable to do RDMA
> reads), though I'm not sure that's even allowed by the IB standard.
> 
> -Fab
> 
> >> -----Original Message-----
> >> From: Leonid Keller
> >> Sent: Thursday, June 23, 2011 3:08 PM
> >> To: 'Fab Tillier'; Hefty, Sean
> >> Cc: ofw_list
> >> Subject: RE: IBAL CM ignores responder resources of the passive side
> >>
> >> I'd say, it's not "broken" applications, but applications, that do not
> >> care what Responder Resources should be. And if an application in a
> >> third company works with previous drivers on 1000 computers and now
> >> they get new drivers and install them on 1000 computers and suddenly
> >> the application fails ...
> >>
> >> I'm going to add in our code the following lines:
> >>
> >>  ASSERT(p_cm_rep->resp_res);
> >>  if (p_cm_rep->resp_res)
> >>  	p_cep->resp_res = p_cm_rep->resp_res;
> >> You may think of leaving in the OFED code only the last line.
> >>
> >>
> >>> -----Original Message-----
> >>> From: Fab Tillier [mailto:ftillier at microsoft.com]
> >>> Sent: Thursday, June 23, 2011 2:53 AM
> >>> To: Leonid Keller; Hefty, Sean
> >>> Cc: ofw_list
> >>> Subject: RE: IBAL CM ignores responder resources of the passive side
> >>>
> >>> I'd just let existing broken apps get fixed and fix it in the CM
> >>> properly.  That is, do the straight assignment, not the conditional
> >>> assignment.
> >>>
> >>> Cheers,
> >>> -Fab
> >>>
> >>> -----Original Message-----
> >>> From: Leonid Keller [mailto:leonid at mellanox.co.il]
> >>> Sent: Wednesday, June 22, 2011 10:03 AM
> >>> To: Hefty, Sean; Fab Tillier
> >>> Cc: ofw_list
> >>> Subject: RE: IBAL CM ignores responder resources of the passive side
> >>>
> >>> I'm afraid that just
> >>>
> >>> p_cep->resp_res = p_cm_rep->resp_res;
> >>>
> >>> can break some existing applications, which do not fill p_cm_rep-
> >>>> resp_res and still work.
> >>> Maybe
> >>>
> >>> if (p_cm_rep->resp_res)
> >>> 	p_cep->resp_res = p_cm_rep->resp_res;
> >>>
> >>>> -----Original Message-----
> >>>> From: Hefty, Sean [mailto:sean.hefty at intel.com]
> >>>> Sent: Wednesday, June 22, 2011 7:04 PM
> >>>> To: Leonid Keller; Fab Tillier
> >>>> Cc: ofw_list
> >>>> Subject: RE: IBAL CM ignores responder resources of the passive side
> >>>>
> >>>> without looking at more of the code, it looks like a bug
> >>>>
> >>>>
> >>>>> __save_user_rep(
> >>>>>
> >>>>>                 IN
> >>>>> cep_agent_t* const p_port_cep,
> >>>>>
> >>>>>                 IN
> >>>>> kcep_t* const p_cep,
> >>>>>
> >>>>>                 IN                           const     iba_cm_rep*
> >>>> const
> >>>>> p_cm_rep,
> >>>>>
> >>>>>                 IN
> >>>>> uint8_t
> >>>>> rnr_nak_timeout )
> >>>>>
> >>>>> {
> >>>>>
> >>>>>                 AL_ENTER( AL_DBG_CM );
> >>>>>
> >>>>>
> >>>>>                 p_cep->local_qpn = p_cm_rep->qpn;
> >>>>>
> >>>>>                 p_cep->rq_psn = p_cm_rep->starting_psn;
> >>>>>
> >>>>>                 p_cep->init_depth = p_cm_rep->init_depth;
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>> I think we can just always do the assignment:
> >>>>
> >>>> 	p_cep->resp_res = p_cm_rep->resp_res;
> >>>>
> >>>> here.  Is the p_cep->resp_res the value that actually goes into the
> >>>> REP?
> >>>>
> >>>>
> >>>>>                 ci_ca_lock_attr( p_port_cep->h_ca->obj.p_ci_ca );
> >>>>>
> >>>>>                 /* Check the CA's responder resource max and trim
> >>> if
> >>>>> necessary. */
> >>>>>
> >>>>>                 if( p_port_cep->h_ca->obj.p_ci_ca->p_pnp_attr-
> >>>>>> max_qp_resp_res <
> >>>>
> >>>>
> >>>> whine: The max_qp_resp_res isn't going to dynamically change.  We
> >>>> should use a saved value somewhere that doesn't require serializing
> >>>> all CM users for this check.
> >>>>
> >>>> - Sean
> >



More information about the ofw mailing list