[ofw] IBAL CM ignores responder resources of the passive side
Leonid Keller
leonid at mellanox.co.il
Mon Jun 27 00:54:40 PDT 2011
We did the below patch and got instantly the assert.
Which brought me to understanding, that assert should be removed, but 'if' should be kept.
Why?
Because the active side do not look at ReponderResources and InitDepth of the passive side (sent in REP).
It allows CM to set all the parameters.
The old code gave to the passive side a possibility not to negotiate parameters, but to leave it to CM.
And I think, that we should keep this possibility.
> -----Original Message-----
> From: Leonid Keller
> Sent: Thursday, June 23, 2011 3:08 PM
> To: 'Fab Tillier'; Hefty, Sean
> Cc: ofw_list
> Subject: RE: IBAL CM ignores responder resources of the passive side
>
> I'd say, it's not "broken" applications, but applications, that do not care
> what Responder Resources should be.
> And if an application in a third company works with previous drivers on 1000
> computers and now they get new drivers and install them on 1000 computers and
> suddenly the application fails ...
>
> I'm going to add in our code the following lines:
>
> ASSERT(p_cm_rep->resp_res);
> if (p_cm_rep->resp_res)
> p_cep->resp_res = p_cm_rep->resp_res;
>
> You may think of leaving in the OFED code only the last line.
>
>
> > -----Original Message-----
> > From: Fab Tillier [mailto:ftillier at microsoft.com]
> > Sent: Thursday, June 23, 2011 2:53 AM
> > To: Leonid Keller; Hefty, Sean
> > Cc: ofw_list
> > Subject: RE: IBAL CM ignores responder resources of the passive side
> >
> > I'd just let existing broken apps get fixed and fix it in the CM
> > properly. That is, do the straight assignment, not the conditional
> > assignment.
> >
> > Cheers,
> > -Fab
> >
> > -----Original Message-----
> > From: Leonid Keller [mailto:leonid at mellanox.co.il]
> > Sent: Wednesday, June 22, 2011 10:03 AM
> > To: Hefty, Sean; Fab Tillier
> > Cc: ofw_list
> > Subject: RE: IBAL CM ignores responder resources of the passive side
> >
> > I'm afraid that just
> >
> > p_cep->resp_res = p_cm_rep->resp_res;
> >
> > can break some existing applications, which do not fill p_cm_rep-
> > >resp_res and still work.
> > Maybe
> >
> > if (p_cm_rep->resp_res)
> > p_cep->resp_res = p_cm_rep->resp_res;
> >
> > > -----Original Message-----
> > > From: Hefty, Sean [mailto:sean.hefty at intel.com]
> > > Sent: Wednesday, June 22, 2011 7:04 PM
> > > To: Leonid Keller; Fab Tillier
> > > Cc: ofw_list
> > > Subject: RE: IBAL CM ignores responder resources of the passive side
> > >
> > > without looking at more of the code, it looks like a bug
> > >
> > >
> > > > __save_user_rep(
> > > >
> > > > IN
> > > > cep_agent_t* const
> > p_port_cep,
> > > >
> > > > IN
> > > > kcep_t* const
> > > p_cep,
> > > >
> > > > IN const iba_cm_rep*
> > > const
> > > > p_cm_rep,
> > > >
> > > > IN
> > > > uint8_t
> > > > rnr_nak_timeout )
> > > >
> > > > {
> > > >
> > > > AL_ENTER( AL_DBG_CM );
> > > >
> > > >
> > > >
> > > > p_cep->local_qpn = p_cm_rep->qpn;
> > > >
> > > > p_cep->rq_psn = p_cm_rep->starting_psn;
> > > >
> > > > p_cep->init_depth = p_cm_rep->init_depth;
> > > >
> > > >
> > > >
> > >
> > >
> > > I think we can just always do the assignment:
> > >
> > > p_cep->resp_res = p_cm_rep->resp_res;
> > >
> > > here. Is the p_cep->resp_res the value that actually goes into the
> > > REP?
> > >
> > >
> > > > ci_ca_lock_attr( p_port_cep->h_ca->obj.p_ci_ca );
> > > >
> > > > /* Check the CA's responder resource max and trim
> > if
> > > > necessary. */
> > > >
> > > > if( p_port_cep->h_ca->obj.p_ci_ca->p_pnp_attr-
> > > > >max_qp_resp_res <
> > >
> > >
> > > whine: The max_qp_resp_res isn't going to dynamically change. We
> > > should use a saved value somewhere that doesn't require serializing
> > all
> > > CM users for this check.
> > >
> > > - Sean
More information about the ofw
mailing list