[openib-general] [PATCH] [CM] add private data comparison to match REQs with listens
Tom Tucker
tom at opengridcomputing.com
Fri Dec 2 14:14:10 PST 2005
Am I correct to assume that this functionality is unique to the IB CM
and is not going to be exposed through the CMA?
On Fri, 2005-12-02 at 16:22 -0500, Rimmer, Todd wrote:
> > -----Original Message-----
> > From: Caitlin Bestler [mailto:caitlinb at broadcom.com]
> > Sent: Friday, December 02, 2005 1:59 PM
> > To: Sean Hefty
> > Cc: openib-general at openib.org
> > Subject: RE: [openib-general] [PATCH] [CM] add private data
> > comparison to match REQs with listens
> >
> >
> > openib-general-bounces at openib.org wrote:
> > > Sean Hefty wrote:
> > >> As an update: further testing revealed that there is an issue with
> > >> this implementation that is also found in the original code. The
> > >> issue deals with how listen requests that rely on a data mask are
> > >> inserted and located in the red/black tree. I'm trying to come up
> > >> with a fix for this.
> > >
> > > > After looking into this, I'm coming to the conclusion
> > > that there does not exist an efficient way to sort/search for
> > > listens without adding some restrictions.
> > >
> > > > For example, a client listens on id1 with mask1. A request
> > > > is matched with the listen if (serviceid & mask1) == id1.
> > > > If a second client listens on id2 with mask2, then a request
> > > > must be checked against both listens, or until a
> > > > match is found. There's no method that I can find that can
> > > be used to filter checks that works in a generic fashion,
> > > resulting in requests needing to walk a linear list of
> > > listens. There are several potential fixes for this, with
> > > only a couple mentioned below.
> > >
> > > > One way around this is to have the IB CM only listen on
> > > service IDs, and remove the mask parameter from the API.
> > > This requires SDP to change to only listen on ports that have a
> > > listener.
> > >
> > > Another alternative is to restrict the type of masks that are
> > > supported. If masks are restricted to a series of most
> > > significant bits, then the existing algorithm can be used.
> > > For instance, we can support masks 0xFF00 and 0xFFF0, but not
> > > 0x00FF or 0xFF0F. This restriction would work for both SDP and the
> > > > CMA. To be clear, the API could change from a mask to the number of
> > > > bits to match.
> > >
> > > Matching on private data can either be done by clients, or
> > > restrictions can be placed on it as well. For private data,
> > > I believe that a restriction that all listen requests on the
> > > same service ID use the same mask is sufficient.
> > >
> > > Hopefully this makes sense to people. Thoughts?
> > >
> >
> > Just listen on the Service ID / Port and let the ULP sort them
> > out by destination IP address.
>
> One approach is to make the sort criteria of the tree dependent on a comparison function.
>
> For example the sort could have a multi-faceted compare.
>
> We solved this problem in our stack (which allows listening by SID, sender GUID, receiver port, private data, etc.) with the following set of functions. These were called for each red/black tree comparison (inserts and searches each used a comparison function, potentially different ones). I realize these would not be used exactly as given, but they can provide some ideas on how to do it. ListenMap is the red/black tree our stack used to keep track of all listening CEPs in the system.
>
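The idea of one comparator for inserts and a different one for searches can be sketched with the standard C library, using a sorted array in place of a red/black tree. The `listener` and `request` structs and all function names are hypothetical, and wildcards are deliberately left out to keep the ordering simple.

```c
#include <stdint.h>
#include <stdlib.h>

struct listener { uint64_t sid; uint64_t lid; };             /* hypothetical */
struct request  { uint64_t service_id; uint64_t dest_lid; }; /* hypothetical */

/* Ordering used when the table is built: listener vs. listener. */
static int listener_cmp(const void *a, const void *b)
{
    const struct listener *x = a, *y = b;
    if (x->sid != y->sid) return x->sid < y->sid ? -1 : 1;
    if (x->lid != y->lid) return x->lid < y->lid ? -1 : 1;
    return 0;
}

/* Ordering used on lookup: request (key) vs. listener (element).
 * bsearch always passes the key first and the array element second. */
static int request_cmp(const void *key, const void *elem)
{
    const struct request *r = key;
    const struct listener *l = elem;
    if (r->service_id != l->sid) return r->service_id < l->sid ? -1 : 1;
    if (r->dest_lid != l->lid) return r->dest_lid < l->lid ? -1 : 1;
    return 0;
}

static void sort_listeners(struct listener *tbl, size_t n)
{
    qsort(tbl, n, sizeof *tbl, listener_cmp);
}

static const struct listener *find_listener(const struct listener *tbl,
                                            size_t n,
                                            const struct request *req)
{
    return bsearch(req, tbl, n, sizeof *tbl, request_cmp);
}
```

The two comparators must agree on field order so that the search walks the structure the same way the insert built it.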
> // ListenMap Key Compare functions
> // Three functions are provided:
> //
> // CepListenAddrCompare - is used to insert cep entries into the ListenMap and
> // is the primary key_compare function for the ListenMap
> //
> // CepReqAddrCompare - is used to search the ListenMap as part of processing
> // an inbound REQ
> //
> // CepSidrReqAddrCompare - is used to search the ListenMap as part of
> // processing an inbound SIDR_REQ
> //
> // To provide the maximum flexibility, the key for a CEP bound address is
> // sophisticated and allows wildcarded/optional fields. This allows
> // a listener to simply bind for all traffic of a given SID or to refine the
> // scope by binding for traffic to/from specific addresses, or specific
> // private data. The QPN/EECN/CaGUID aspect is used to allow multiple
> // outbound Peer Connects to still be considered unique.
> //
> // The result of this approach is a very flexible CM bind. The same SID
> // can be used on different ports or between different node pairs for
> // completely different meanings. However a SID used between a given
> // pair of nodes must be used for a single model (Listen, Peer, Sidr).
> // In addition, for Peer connects, each connect must have a unique
> // QPN/EECN/CaGUID.
> //
> // Comparison allows for wildcarding in all but SID.
> // A value of 0 is a wildcard. See ib_helper.h:WildcardGidCompare for
> // the rules of GID comparison, which are more involved due to multiple Gid
> // formats
> //
> // Fields are used by the models as follows:
> // Collating order is:                 Listen    Peer Connect  Sidr Register
> // SID                                 Y         Y             Y
> // local GID                           option    Y             future option
> // local LID                           option    Y             future option
> // QPN                                 wildcard  Y             wildcard
> // EECN                                wildcard  Y             wildcard
> // CaGUID                              wildcard  Y             wildcard
> // remote GID                          option    Y             future option
> // remote LID                          option    Y             future option
> // private data discriminator length   option    option        option
> // private data discriminator value    option    option        option
> //
> // if bPeer is 0 for either CEP, the QPN, EECN and CaGUID are treated as a match
> //
> // FUTURE: add a sid masking option so can easily listen on a group
> // of SIDs with 1 listen (such as if low bits of sid have a private meaning)
> //
> // FUTURE: add a pkey option so can easily listen on a partition
> //
> // FUTURE: for SIDR to support GID/LID they will have to come from the LRH
> // and GRH headers to the CM mad. local GID and lid could be used to merely
> // select the local port number
>
>
> // A qmap key_compare function to compare the bound address for
> // two listener, SIDR or Peer Connect CEPs
> //
> // key1 - CEP1 pointer
> // key2 - CEP2 pointer
> //
> // Returns:
> // -1: cep1 bind address < cep2 bind address
> // 0: cep1 bind address = cep2 bind address (accounting for wildcards)
> // 1: cep1 bind address > cep2 bind address
> int
> CepListenAddrCompare(uint64 key1, uint64 key2)
> {
>     IN CM_CEP_OBJECT* pCEP1 = (CM_CEP_OBJECT*)(uintn)key1;
>     IN CM_CEP_OBJECT* pCEP2 = (CM_CEP_OBJECT*)(uintn)key2;
>     int res;
>
>     if (pCEP1->SID < pCEP2->SID)
>         return -1;
>     else if (pCEP1->SID > pCEP2->SID)
>         return 1;
>     res = WildcardGidCompare(&pCEP1->PrimaryPath.LocalGID, &pCEP2->PrimaryPath.LocalGID);
>     if (res != 0)
>         return res;
>     res = WildcardCompareU64(pCEP1->PrimaryPath.LocalLID, pCEP2->PrimaryPath.LocalLID);
>     if (res != 0)
>         return res;
>     if (pCEP1->bPeer && pCEP2->bPeer)
>     {
>         res = CompareU64(pCEP1->LocalEndPoint.QPN, pCEP2->LocalEndPoint.QPN);
>         if (res != 0)
>             return res;
>         res = CompareU64(pCEP1->LocalEndPoint.EECN, pCEP2->LocalEndPoint.EECN);
>         if (res != 0)
>             return res;
>         res = CompareU64(pCEP1->LocalEndPoint.CaGUID, pCEP2->LocalEndPoint.CaGUID);
>         if (res != 0)
>             return res;
>     }
>     res = WildcardGidCompare(&pCEP1->PrimaryPath.RemoteGID, &pCEP2->PrimaryPath.RemoteGID);
>     if (res != 0)
>         return res;
>     res = WildcardCompareU64(pCEP1->PrimaryPath.RemoteLID, pCEP2->PrimaryPath.RemoteLID);
>     if (res != 0)
>         return res;
>     // a length of 0 matches any private data, so this too is a wildcard compare
>     if (pCEP1->DiscriminatorLen == 0 || pCEP2->DiscriminatorLen == 0)
>         return 0;
>     res = CompareU64(pCEP1->DiscriminatorLen, pCEP2->DiscriminatorLen);
>     if (res != 0)
>         return res;
>     res = MemoryCompare(pCEP1->Discriminator, pCEP2->Discriminator, pCEP1->DiscriminatorLen);
>     return res;
> }
>
> // A qmap key_compare function to search the ListenMap for a match with
> // a given REQ
> //
> // key1 - CEP pointer
> // key2 - REQ pointer
> //
> // Returns:
> // -1: cep1 bind address < req remote address
> // 0: cep1 bind address = req remote address (accounting for wildcards)
> // 1: cep1 bind address > req remote address
> //
> // The QPN/EECN/CaGUID are not part of the search, hence multiple Peer Connects
> // could be matched (the one that was started earliest should then be found
> // by a linear search among the neighbors of the matching CEP)
> int
> CepReqAddrCompare(uint64 key1, uint64 key2)
> {
>     IN CM_CEP_OBJECT* pCEP = (CM_CEP_OBJECT*)(uintn)key1;
>     IN CMM_REQ* pREQ = (CMM_REQ*)(uintn)key2;
>     int res;
>
>     if (pCEP->SID < pREQ->ServiceID)
>         return -1;
>     else if (pCEP->SID > pREQ->ServiceID)
>         return 1;
>     // local and remote are from the perspective of the sender (the remote
>     // node in this case), so we compare local to remote and vice versa
>     res = WildcardGidCompare(&pCEP->PrimaryPath.LocalGID, &pREQ->PrimaryRemoteGID);
>     if (res != 0)
>         return res;
>     res = WildcardCompareU64(pCEP->PrimaryPath.LocalLID, pREQ->PrimaryRemoteLID);
>     if (res != 0)
>         return res;
>     // do not compare QPN/EECN/CaGUID
>     res = WildcardGidCompare(&pCEP->PrimaryPath.RemoteGID, &pREQ->PrimaryLocalGID);
>     if (res != 0)
>         return res;
>     res = WildcardCompareU64(pCEP->PrimaryPath.RemoteLID, pREQ->PrimaryLocalLID);
>     if (res != 0)
>         return res;
>     // a length of 0 matches any private data, so this too is a wildcard compare
>     if (pCEP->DiscriminatorLen == 0)
>         return 0;
>     res = MemoryCompare(pCEP->Discriminator, pREQ->PrivateData+pCEP->DiscrimPrivateDataOffset, pCEP->DiscriminatorLen);
>     return res;
> }
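
The final step of CepReqAddrCompare compares the listener's discriminator against the REQ private data at a stored offset. A standalone sketch of that step with an explicit bounds check is shown here. The function name, the `priv_len` parameter, and the choice to return 1 when the discriminator does not fit are all assumptions for illustration; the original relies on MemoryCompare and does not show such a check.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Three-way compare of a listener discriminator against private data.
 * Returns 0 on match (or when disc_len is 0, the wildcard case),
 * otherwise -1 or 1. */
static int discriminator_compare(const uint8_t *disc, size_t disc_len,
                                 size_t offset,
                                 const uint8_t *priv, size_t priv_len)
{
    int res;

    if (disc_len == 0)
        return 0;               /* length 0 matches any private data */
    /* Assumption: a discriminator that does not fit inside the private
     * data can never match; 1 is an arbitrary non-zero result. */
    if (offset > priv_len || disc_len > priv_len - offset)
        return 1;
    res = memcmp(disc, priv + offset, disc_len);
    return (res > 0) - (res < 0);   /* normalize to -1, 0 or 1 */
}
```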
>
> // A qmap key_compare function to search the ListenMap for a match with
> // a given SIDR_REQ
> //
> // key1 - CEP pointer
> // key2 - SIDR_REQ pointer
> //
> // Returns:
> // -1: cep bind address < sidr_req address
> // 0: cep bind address = sidr_req address (accounting for wildcards)
> // 1: cep bind address > sidr_req address
> //
> // The QPN/EECN/CaGUID are not part of the search.
> int
> CepSidrReqAddrCompare(uint64 key1, uint64 key2)
> {
>     IN CM_CEP_OBJECT* pCEP = (CM_CEP_OBJECT*)(uintn)key1;
>     IN CMM_SIDR_REQ* pSIDR_REQ = (CMM_SIDR_REQ*)(uintn)key2;
>     int res;
>
>     if (pCEP->SID < pSIDR_REQ->ServiceID)
>         return -1;
>     else if (pCEP->SID > pSIDR_REQ->ServiceID)
>         return 1;
>     // GID and LIDs are wildcarded/not available at this time
>     // do not compare QPN/EECN/CaGUID
>     // a length of 0 matches any private data, so this too is a wildcard compare
>     if (pCEP->DiscriminatorLen == 0)
>         return 0;
>     res = MemoryCompare(pCEP->Discriminator, pSIDR_REQ->PrivateData+pCEP->DiscrimPrivateDataOffset, pCEP->DiscriminatorLen);
>     return res;
> }
>
> /* non-Wildcarded compare of 2 64 bit values
> * Return:
> * 0 : v1 == v2
> * -1: v1 < v2
> * 1 : v1 > v2
> */
> static __inline int
> CompareU64(uint64 v1, uint64 v2)
> {
>     if (v1 == v2)
>         return 0;
>     else if (v1 < v2)
>         return -1;
>     else
>         return 1;
> }
>
> /* Wildcarded compare of 2 64 bit values
> * Return:
> * 0 : v1 == v2
> * -1: v1 < v2
> * 1 : v1 > v2
> * if v1 or v2 is 0, they are considered wildcards and match any value
> */
> static __inline int
> WildcardCompareU64(uint64 v1, uint64 v2)
> {
>     if (v1 == 0 || v2 == 0 || v1 == v2)
>         return 0;
>     else if (v1 < v2)
>         return -1;
>     else
>         return 1;
> }
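
One property of the wildcard compare is easy to see in isolation. The sketch below restates the same logic as WildcardCompareU64 using `<stdint.h>` types so it compiles on its own; the lowercase name is hypothetical.

```c
#include <stdint.h>

/* Same logic as WildcardCompareU64, restated with standard types:
 * 0 on either side is a wildcard and compares equal to anything. */
static int wildcard_compare_u64(uint64_t v1, uint64_t v2)
{
    if (v1 == 0 || v2 == 0 || v1 == v2)
        return 0;
    return v1 < v2 ? -1 : 1;
}
```

Here `wildcard_compare_u64(1, 0)` and `wildcard_compare_u64(0, 2)` both return 0 while `wildcard_compare_u64(1, 2)` returns -1, so "equal" under this comparator is not transitive, which is worth keeping in mind when it is used to order a tree.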
>
> /* Compare Gid1 to Gid2 (host byte order)
> * Return:
> * 0 : Gid1 == Gid2
> * -1: Gid1 < Gid2
> * 1 : Gid1 > Gid2
> * This also allows for Wildcarded compare.
> * A MC Gid with the lower 56 bits all 0, will match any MC gid
> * A SubnetPrefix of 0 will match any top 64 bits of a non-MC gid
> * A InterfaceID of 0 will match any low 64 bits of a non-MC gid
> * Collating order:
> * non-MC Subnet Prefix (0 is wildcard and comes first)
> * non-MC Interface ID (0 is wildcard and comes first)
> * MC wildcard
> * MC by value of low 56 bits (0 is wildcard and comes first)
> */
> static __inline int
> WildcardGidCompare(IN const IB_GID* const pGid1, IN const IB_GID* const pGid2 )
> {
>     if (pGid1->Type.Multicast.s.FormatPrefix == IPV6_MULTICAST_PREFIX
>         && pGid2->Type.Multicast.s.FormatPrefix == IPV6_MULTICAST_PREFIX)
>     {
>         /* Multicast compare: compare low 120 bits, 120 bits of 0 is wildcard */
>         uint64 h1 = pGid1->AsReg64s.H & ~IB_GID_MCAST_FORMAT_MASK_H;
>         uint64 h2 = pGid2->AsReg64s.H & ~IB_GID_MCAST_FORMAT_MASK_H;
>         /* check for 120 bits of wildcard */
>         if ((h1 == 0 && pGid1->AsReg64s.L == 0)
>             || (h2 == 0 && pGid2->AsReg64s.L == 0))
>         {
>             return 0;
>         } else if (h1 < h2) {
>             return -1;
>         } else if (h1 > h2) {
>             return 1;
>         } else {
>             /* high parts are equal, so order by the low 64 bits */
>             return CompareU64(pGid1->AsReg64s.L, pGid2->AsReg64s.L);
>         }
>     } else if (pGid1->Type.Multicast.s.FormatPrefix == IPV6_MULTICAST_PREFIX) {
>         /* Gid1 is MC, Gid2 is other, treat MC as > others */
>         return 1;
>     } else if (pGid2->Type.Multicast.s.FormatPrefix == IPV6_MULTICAST_PREFIX) {
>         /* Gid1 is other, Gid2 is MC, treat other as < MC */
>         return -1;
>     } else {
>         /* Non-Multicast compare: compare high 64 bits */
>         /* Note all other GID formats are essentially a prefix in upper */
>         /* 64 bits and a identifier in the low 64 bits */
>         /* so this covers link local, site local, global formats */
>         int res = WildcardCompareU64(pGid1->AsReg64s.H, pGid2->AsReg64s.H);
>         if (res == 0)
>         {
>             return WildcardCompareU64(pGid1->AsReg64s.L, pGid2->AsReg64s.L);
>         } else {
>             return res;
>         }
>     }
> }