[openib-general] [PATCH] [CM] add private data comparison to match REQs with listens

Tom Tucker tom at opengridcomputing.com
Fri Dec 2 14:14:10 PST 2005


Am I correct to assume that this functionality is unique to the IB CM
and is not going to be exposed through the CMA?

On Fri, 2005-12-02 at 16:22 -0500, Rimmer, Todd wrote:
> > -----Original Message-----
> > From: Caitlin Bestler [mailto:caitlinb at broadcom.com]
> > Sent: Friday, December 02, 2005 1:59 PM
> > To: Sean Hefty
> > Cc: openib-general at openib.org
> > Subject: RE: [openib-general] [PATCH] [CM] add private data comparison to
> > match REQs with listens
> > 
> > 
> > openib-general-bounces at openib.org wrote:
> > > Sean Hefty wrote:
> > >> As an update: further testing revealed that there is an issue with
> > >> this implementation that is also found in the original code.  The
> > >> issue deals with how listen requests that rely on a data mask are
> > >> inserted and located in the red/black tree.  I'm trying to come up
> > >> with a fix for this. 
> > > 
> > > After researching into this, I'm coming to the conclusion
> > > that there does not exist an efficient way to sort/search for
> > > listens without adding some restrictions.
> > > 
> > > For example, a client listens on id1 with mask1.  A request
> > > is matched with the listen if its serviceid & mask1 = id1.
> > > If a second client listens on id2 with mask2, then a request
> > > must check against both requests for a match, or until a
> > > match is found.  There's no method that I can find that can
> > > be used to filter checks that works in a generic fashion,
> > > resulting in requests needing to walk a linear list of
> > > listens.  There are several potential fixes for this, with
> > > only a couple mentioned below.
> > > 
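The matching rule described above (a request matches a listen when serviceid & mask = id) can be sketched as follows. This is only an illustration of why arbitrary masks lead to a linear walk; struct listen_entry and find_listen are hypothetical names and are not the IB CM's actual data structures or API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical listen entry, assumed for illustration only. */
struct listen_entry {
    uint64_t service_id;   /* ID the client listens on */
    uint64_t service_mask; /* bits of the request's service ID to compare */
};

/* Return the index of the first listen that matches sid, or -1 if none does.
 * With arbitrary masks there is no ordering to exploit, so in the worst
 * case every entry has to be checked. */
static int find_listen(const struct listen_entry *tbl, size_t n, uint64_t sid)
{
    for (size_t i = 0; i < n; i++) {
        if ((sid & tbl[i].service_mask) == tbl[i].service_id)
            return (int)i;
    }
    return -1;
}
```

With one listen on id 0x1200 / mask 0xFF00 and another on id 0x0034 / mask 0x00FF, a request for 0x5634 fails the first check and only matches on the second, which is the case a sorted tree cannot shortcut.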
> > > One solution around this is to have the IB CM only listen on
> > > service IDs, and remove the mask parameter from the API.
> > > This requires SDP to change to only listen on ports that have a
> > > listener. 
> > > 
> > > Another alternative is to restrict the type of masks that are
> > > supported.  If masks are restricted to a series of most
> > > significant bits, then the existing algorithm can be used.
> > > For instance, we can support masks 0xFF00 and 0xFFF0, but not
> > > 0x00FF or 0xFF0F.  This restriction would work for both SDP and the
> > > CMA. To be clear, the API could change from a mask to the number of
> > > bits to match.
> > > 
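The restriction to most-significant-bit masks, and the suggested change from a mask to a bit count, might look like the sketch below. It assumes a 64-bit service ID, so the short masks quoted above (0xFF00, 0xFFF0) are read as shorthand for the top bits of a 64-bit value. mask_to_prefix_len and prefix_len_to_mask are hypothetical helper names, not part of any existing API.

```c
#include <stdint.h>

/* Return the number of leading one bits if mask is a contiguous run of
 * most significant bits, or -1 if any one bit follows a zero bit. */
static int mask_to_prefix_len(uint64_t mask)
{
    int len = 0;

    while (mask & (UINT64_C(1) << 63)) {
        mask <<= 1;
        len++;
    }
    /* Anything left over means the mask was not a pure prefix. */
    return mask == 0 ? len : -1;
}

/* Rebuild the mask from a prefix length in the range 0..64. */
static uint64_t prefix_len_to_mask(int len)
{
    return len == 0 ? 0 : ~UINT64_C(0) << (64 - len);
}
```

Under this assumption a mask of 0xFF00000000000000 maps to a length of 8, while 0x00FF000000000000 and 0xFF0F000000000000 are rejected.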
> > > Matching on private data can either be done by clients, or
> > > restrictions can be placed on it as well.  For private data,
> > > I believe that a restriction that all listen requests on the
> > > same service ID use the same mask is sufficient.
> > > 
> > > Hopefully this makes sense to people.  Thoughts?
> > > 
> > 
> > Just listen on the Service ID / Port and let the ULP sort them
> > out by destination IP address.
> 
> One approach is to make the sort criteria of the tree dependent on a comparison function.
> 
> For example the sort could have a multi-faceted compare.
> 
> We solved this problem in our stack (which allows listen by SID, sender GUID, receiver Port, private data, etc) by the following set of functions.  These were called per red/black tree comparison (both inserts and searches used functions, potentially different).  I realize these would not be used exactly as given, but they can provide some ideas on how to do it.  ListenMap is the red/black tree our stack used to keep track of all listening CEPs in the system.
> 
> // ListenMap Key Compare functions
> // Three functions are provided:
> //
> // CepListenAddrCompare - is used to insert cep entries into the ListenMap and
> // is the primary key_compare function for the ListenMap
> //
> // CepReqAddrCompare - is used to search the ListenMap as part of processing
> // an inbound REQ
> //
> // CepSidrReqAddrCompare - is used to search the ListenMap as part of
> // processing an inbound SIDR_REQ
> //
> // To provide the maximum flexibility, the key for a CEP bound address is
> // sophisticated and allows wildcarded/optional fields.  This allows
> // a listener to simply bind for all traffic of a given SID or to refine the
> // scope by binding for traffic to/from specific addresses, or specific
> // private data.  The QPN/EECN/CaGUID aspect is used to allow multiple
> // outbound Peer Connects to still be considered unique.
> //
> // The result of this approach is a very flexible CM bind.  The same SID
> // can be used on different ports or between different node pairs for
> // completely different meanings.  However, a SID used between a given
> // pair of nodes must be used for a single model (Listen, Peer, Sidr).
> // In addition, for Peer connects, each connect must have a unique
> // QPN/EECN/CaGUID.
> //
> // Comparison allows for wildcarding in all but SID.
> // A value of 0 is a wildcard.  See ib_helper.h:WildcardGidCompare for
> // the rules of GID comparison, which are more involved due to multiple GID
> // formats.
> //
> //                                          Field is Used by models as follows:
> // Collating order is:                   Listen     Peer Connect   Sidr Register
> // SID                                     Y             Y              Y
> // local GID                             option          Y         future option
> // local LID                             option          Y         future option
> // QPN                                  wildcard         Y           wildcard
> // EECN                                 wildcard         Y           wildcard
> // CaGUID                               wildcard         Y           wildcard
> // remote GID                            option          Y         future option
> // remote LID                            option          Y         future option
> // private data discriminator length     option        option         option
> // private data discriminator value      option        option         option
> //
> // if bPeer is 0 for either CEP, the QPN, EECN and CaGUID are treated as a match
> //
> // FUTURE: add a sid masking option so can easily listen on a group
> // of SIDs with 1 listen (such as if low bits of sid have a private meaning)
> //
> // FUTURE: add a pkey option so can easily listen on a partition
> //
> // FUTURE: for SIDR to support GID/LID they will have to come from the LRH
> // and GRH headers to the CM mad.  local GID and lid could be used to merely
> // select the local port number
> 
> 
> // A qmap key_compare function to compare the bound address for
> // two listener, SIDR or Peer Connect CEPs
> //
> // key1 - CEP1 pointer
> // key2 - CEP2 pointer
> //
> // Returns:
> //     -1:      cep1 bind address < cep2 bind address
> //      0:      cep1 bind address = cep2 bind address (accounting for wildcards)
> //      1:      cep1 bind address > cep2 bind address
> int
> CepListenAddrCompare(uint64 key1, uint64 key2)
> {
>         IN CM_CEP_OBJECT* pCEP1 = (CM_CEP_OBJECT*)(uintn)key1;
>         IN CM_CEP_OBJECT* pCEP2 = (CM_CEP_OBJECT*)(uintn)key2;
>         int res;
> 
>         if (pCEP1->SID < pCEP2->SID)
>                 return -1;
>         else if (pCEP1->SID > pCEP2->SID)
>                 return 1;
>         res = WildcardGidCompare(&pCEP1->PrimaryPath.LocalGID, &pCEP2->PrimaryPath.LocalGID);
>         if (res != 0)
>                 return res;
>         res = WildcardCompareU64(pCEP1->PrimaryPath.LocalLID, pCEP2->PrimaryPath.LocalLID);
>         if (res != 0)
>                 return res;
>         if (pCEP1->bPeer && pCEP2->bPeer)
>         {
>                 res = CompareU64(pCEP1->LocalEndPoint.QPN, pCEP2->LocalEndPoint.QPN);
>                 if (res != 0)
>                         return res;
>                 res = CompareU64(pCEP1->LocalEndPoint.EECN, pCEP2->LocalEndPoint.EECN);
>                 if (res != 0)
>                         return res;
>                 res = CompareU64(pCEP1->LocalEndPoint.CaGUID, pCEP2->LocalEndPoint.CaGUID);
>                 if (res != 0)
>                         return res;
>         }
>         res = WildcardGidCompare(&pCEP1->PrimaryPath.RemoteGID, &pCEP2->PrimaryPath.RemoteGID);
>         if (res != 0)
>                 return res;
>         res = WildcardCompareU64(pCEP1->PrimaryPath.RemoteLID, pCEP2->PrimaryPath.RemoteLID);
>         if (res != 0)
>                 return res;
>         // a length of 0 matches any private data, so this too is a wildcard compare
>         if (pCEP1->DiscriminatorLen == 0 || pCEP2->DiscriminatorLen == 0)
>                 return 0;
>         res = CompareU64(pCEP1->DiscriminatorLen, pCEP2->DiscriminatorLen);
>         if (res != 0)
>                 return res;
>         res = MemoryCompare(pCEP1->Discriminator, pCEP2->Discriminator, pCEP1->DiscriminatorLen);
>         return res;
> }
> 
> // A qmap key_compare function to search the ListenMap for a match with
> // a given REQ
> //
> // key1 - CEP pointer
> // key2 - REQ pointer
> //
> // Returns:
> //     -1:      cep bind address < req remote address
> //      0:      cep bind address = req remote address (accounting for wildcards)
> //      1:      cep bind address > req remote address
> //
> // The QPN/EECN/CaGUID are not part of the search, hence multiple Peer Connects
> // could be matched (the one that was started earliest should then be found by
> // a linear search among the neighbors of the matching CEP)
> int
> CepReqAddrCompare(uint64 key1, uint64 key2)
> {
>         IN CM_CEP_OBJECT* pCEP = (CM_CEP_OBJECT*)(uintn)key1;
>         IN CMM_REQ* pREQ = (CMM_REQ*)(uintn)key2;
>         int res;
> 
>         if (pCEP->SID < pREQ->ServiceID)
>                 return -1;
>         else if (pCEP->SID > pREQ->ServiceID)
>                 return 1;
>         // local and remote are from the perspective of the sender (the remote
>         // node in this case), so we compare local to remote and vice versa
>         res = WildcardGidCompare(&pCEP->PrimaryPath.LocalGID, &pREQ->PrimaryRemoteGID);
>         if (res != 0)
>                 return res;
>         res = WildcardCompareU64(pCEP->PrimaryPath.LocalLID, pREQ->PrimaryRemoteLID);
>         if (res != 0)
>                 return res;
>         // do not compare QPN/EECN/CaGUID
>         res = WildcardGidCompare(&pCEP->PrimaryPath.RemoteGID, &pREQ->PrimaryLocalGID);
>         if (res != 0)
>                 return res;
>         res = WildcardCompareU64(pCEP->PrimaryPath.RemoteLID, pREQ->PrimaryLocalLID);
>         if (res != 0)
>                 return res;
>         // a length of 0 matches any private data, so this too is a wildcard compare
>         if (pCEP->DiscriminatorLen == 0)
>                 return 0;
>         res = MemoryCompare(pCEP->Discriminator, pREQ->PrivateData+pCEP->DiscrimPrivateDataOffset, pCEP->DiscriminatorLen);
>         return res;
> }
> 
> // A qmap key_compare function to search the ListenMap for a match with
> // a given SIDR_REQ
> //
> // key1 - CEP pointer
> // key2 - SIDR_REQ pointer
> //
> // Returns:
> //     -1:      cep bind address < SIDR_REQ address
> //      0:      cep bind address = SIDR_REQ address (accounting for wildcards)
> //      1:      cep bind address > SIDR_REQ address
> //
> // The QPN/EECN/CaGUID are not part of the search.
> int
> CepSidrReqAddrCompare(uint64 key1, uint64 key2)
> {
>         IN CM_CEP_OBJECT* pCEP = (CM_CEP_OBJECT*)(uintn)key1;
>         IN CMM_SIDR_REQ* pSIDR_REQ = (CMM_SIDR_REQ*)(uintn)key2;
>         int res;
> 
>         if (pCEP->SID < pSIDR_REQ->ServiceID)
>                 return -1;
>         else if (pCEP->SID > pSIDR_REQ->ServiceID)
>                 return 1;
>         // GID and LIDs are wildcarded/not available at this time
>         // do not compare QPN/EECN/CaGUID
>         // a length of 0 matches any private data, so this too is a wildcard compare
>         if (pCEP->DiscriminatorLen == 0)
>                 return 0;
>         res = MemoryCompare(pCEP->Discriminator, pSIDR_REQ->PrivateData+pCEP->DiscrimPrivateDataOffset, pCEP->DiscriminatorLen);
>         return res;
> }
> 
> /* non-Wildcarded compare of 2 64 bit values
>  * Return:
>  *      0 : v1 == v2
>  *      -1: v1 < v2
>  *      1 : v1 > v2
>  */
> static __inline int
> CompareU64(uint64 v1, uint64 v2)
> {
>         if (v1 == v2)
>                 return 0;
>         else if (v1 < v2)
>                 return -1;
>         else
>                 return 1;
> }
> 
> /* Wildcarded compare of 2 64 bit values
>  * Return:
>  *      0 : v1 == v2
>  *      -1: v1 < v2
>  *      1 : v1 > v2
>  *      if v1 or v2 is 0, they are considered wildcards and match any value
>  */
> static __inline int
> WildcardCompareU64(uint64 v1, uint64 v2)
> {
>         if (v1 == 0 || v2 == 0 || v1 == v2)
>                 return 0;
>         else if (v1 < v2)
>                 return -1;
>         else
>                 return 1;
> }
> 
> /* Compare Gid1 to Gid2 (host byte order)
>  * Return:
>  *      0 : Gid1 == Gid2
>  *      -1: Gid1 < Gid2
>  *      1 : Gid1 > Gid2
>  * This also allows for Wildcarded compare.
>  * A MC Gid with the lower 56 bits all 0, will match any MC gid
>  * A SubnetPrefix of 0 will match any top 64 bits of a non-MC gid
>  * An InterfaceID of 0 will match any low 64 bits of a non-MC gid
>  * Collating order:
>  *      non-MC Subnet Prefix (0 is wildcard and comes first)
>  *      non-MC Interface ID (0 is wildcard and comes first)
>  *      MC wildcard
>  *      MC by value of low 56 bits (0 is wildcard and comes first)
>  */
> static __inline int
> WildcardGidCompare(IN const IB_GID* const pGid1, IN const IB_GID* const pGid2 )
> {
>         if (pGid1->Type.Multicast.s.FormatPrefix == IPV6_MULTICAST_PREFIX
>                 && pGid2->Type.Multicast.s.FormatPrefix == IPV6_MULTICAST_PREFIX)
>         {
>                 /* Multicast compare: compare low 120 bits, 120 bits of 0 is wildcard */
>                 uint64 h1 = pGid1->AsReg64s.H & ~IB_GID_MCAST_FORMAT_MASK_H;
>                 uint64 h2 = pGid2->AsReg64s.H & ~IB_GID_MCAST_FORMAT_MASK_H;
>                 /* check for 120 bits of wildcard */
>                 if ((h1 == 0 && pGid1->AsReg64s.L == 0)
>                         || (h2 == 0 && pGid2->AsReg64s.L == 0))
>                 {
>                         return 0;
>                 } else if (h1 < h2) {
>                         return -1;
>                 } else if (h1 > h2) {
>                         return 1;
>                 } else {
>                         return CompareU64(pGid1->AsReg64s.L, pGid2->AsReg64s.L);
>                 }
>         } else if (pGid1->Type.Multicast.s.FormatPrefix == IPV6_MULTICAST_PREFIX) {
>                 /* Gid1 is MC, Gid2 is other, treat MC as > others */
>                 return 1;
>         } else if (pGid2->Type.Multicast.s.FormatPrefix == IPV6_MULTICAST_PREFIX) {
>                 /* Gid1 is other, Gid2 is MC, treat other as < MC */
>                 return -1;
>         } else {
>                 /* Non-Multicast compare: compare high 64 bits */
>                 /* Note all other GID formats are essentially a prefix in upper */
>                 /* 64 bits and an identifier in the low 64 bits */
>                 /* so this covers link local, site local, global formats */
>                 int res = WildcardCompareU64(pGid1->AsReg64s.H, pGid2->AsReg64s.H);
>                 if (res == 0)
>                 {
>                         return WildcardCompareU64(pGid1->AsReg64s.L, pGid2->AsReg64s.L);
>                 } else {
>                         return res;
>                 }
>         }
> }
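
To make the comparator-driven lookup concrete, here is a much reduced sketch of the same idea. It is not the stack described above: a sorted array stands in for the red/black tree, the key is cut down to SID plus one wildcardable LID, and every name (bind_key, bind_table, bind_compare, bind_find, bind_insert) is hypothetical. It assumes, as the compare functions above imply, that an insert is refused whenever the new key compares equal to an existing one, so a wildcard bind and a specific bind cannot coexist for the same SID.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_BINDS 16

/* Hypothetical, simplified bind key: SID plus one wildcardable field. */
struct bind_key {
    uint64_t sid; /* never a wildcard */
    uint64_t lid; /* 0 is a wildcard */
};

struct bind_table {
    struct bind_key keys[MAX_BINDS]; /* kept sorted by bind_compare */
    size_t count;
};

static int wildcard_compare_u64(uint64_t v1, uint64_t v2)
{
    if (v1 == 0 || v2 == 0 || v1 == v2)
        return 0;
    return v1 < v2 ? -1 : 1;
}

static int bind_compare(const struct bind_key *a, const struct bind_key *b)
{
    if (a->sid != b->sid)
        return a->sid < b->sid ? -1 : 1;
    return wildcard_compare_u64(a->lid, b->lid);
}

/* Binary search. Returns the index of an entry that compares equal to key
 * (wildcards included), or -1 with *pos set to the insertion point. */
static int bind_find(const struct bind_table *t, const struct bind_key *key,
                     size_t *pos)
{
    size_t lo = 0, hi = t->count;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int res = bind_compare(&t->keys[mid], key);

        if (res == 0)
            return (int)mid;
        if (res < 0)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (pos)
        *pos = lo;
    return -1;
}

/* Insert key. Returns 0 on success, or -1 if the table is full or an
 * existing bind already compares equal (an overlapping bind). */
static int bind_insert(struct bind_table *t, const struct bind_key *key)
{
    size_t pos;

    if (t->count == MAX_BINDS || bind_find(t, key, &pos) >= 0)
        return -1;
    memmove(&t->keys[pos + 1], &t->keys[pos],
            (t->count - pos) * sizeof(t->keys[0]));
    t->keys[pos] = *key;
    t->count++;
    return 0;
}
```

Refusing overlapping inserts is what keeps the search well defined here: because no two stored keys compare equal, a request with fully specified fields sees a consistent ordering even though the wildcard comparison is not a strict ordering in general.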