[openib-general] [PATCH 2/2] ib_cm: fix REJ due to invalid GID

Michael S. Tsirkin mst at mellanox.co.il
Tue Jul 18 11:24:18 PDT 2006


Quoting r. Sean Hefty <mshefty at ichips.intel.com>:
> Subject: Re: [openib-general] [PATCH 2/2] ib_cm: fix REJ due to invalid GID
> 
> Michael S. Tsirkin wrote:
> >> 		ret = cm_init_av_by_path(&work->path[1], &cm_id_priv->alt_av);
> >>-		if (ret)
> >>+		if (ret) {
> >>+			cm_issue_rej(work->port, work->mad_recv_wc,
> >>+				     IB_CM_INVALID_ALT_GID, CM_MSG_RESPONSE_REQ,
> >>+				     NULL, 0);
> >>+			reject = 0;
> >> 			goto error3;
> >>+		}
> >> 	}
> > 
> > 
> > Hmm ... it looks like cm_init_av_by_path can also fail if ib_find_cached_pkey
> > returns an error - is it right that your patch will return invalid gid
> > in this case?
> > 
> > Maybe the right thing to do is to
> > 1. Make cm_init_av_by_path return a more specific error in case of GID
> >    mismatch.  ENXIO might be a good fit, but we can always add our own
> > 2. Teach cm_destroy_id to send invalid gid reject on this error
> 
> I'm not sure what the correct reject message would be for an invalid pkey...
> 
> I agree that being more specific would be good though.
> 
> - Sean
> 

By the way, AFAIK the cache might, by design, be out of sync with the actual
hardware. Roland, could you confirm this please?

So if we look things up in the cache and they are not there, there must be
a retry strategy, and that is missing if we simply reject the connection.
A quick solution would be to force a cache update before rejecting,
or to query the device directly.
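
To make the idea concrete, here is a rough userspace sketch of "refresh the
cache once, then retry, and only reject if the lookup still fails". This is
not the actual ib_cm or ib_cache code: struct gid_table, gid_lookup,
cache_refresh and gid_lookup_with_retry are made-up names, and the "device"
is just a second table in memory. It also assumes -ENXIO is used as the
specific error for a GID mismatch, as floated in the quoted text.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define GID_LEN   16
#define MAX_GIDS  8

/* Hypothetical stand-in for a GID table. One instance plays the role of
 * the device's current state, another the software cache that may lag. */
struct gid_table {
	unsigned char gid[MAX_GIDS][GID_LEN];
	size_t count;
};

/* Look a GID up in a table: 0 on a hit, -ENXIO on a miss (assumed here
 * to be the dedicated "GID mismatch" error). */
static int gid_lookup(const struct gid_table *t, const unsigned char *gid)
{
	size_t i;

	assert(t->count <= MAX_GIDS);
	for (i = 0; i < t->count; i++)
		if (!memcmp(t->gid[i], gid, GID_LEN))
			return 0;
	return -ENXIO;
}

/* Simulated "force cache update": copy the device state into the cache. */
static void cache_refresh(struct gid_table *cache, const struct gid_table *hw)
{
	*cache = *hw;
}

/* Try the cache first. On a GID miss, refresh once and retry. Only a
 * miss that survives the refresh should lead to a reject. */
static int gid_lookup_with_retry(struct gid_table *cache,
				 const struct gid_table *hw,
				 const unsigned char *gid)
{
	int ret = gid_lookup(cache, gid);

	if (ret != -ENXIO)
		return ret;
	cache_refresh(cache, hw);
	return gid_lookup(cache, gid);
}
```

With something along these lines, the caller would issue the invalid GID
reject only when gid_lookup_with_retry() still returns -ENXIO, so a stale
cache alone would not cause a connection to be rejected.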

Comments?


-- 
MST



