[Openib-windows] LID change event

Fabian Tillier ftillier at silverstorm.com
Fri Jul 7 10:19:01 PDT 2006


Hi Yossi,

On 7/7/06, Yossi Leybovich <sleybo at mellanox.co.il> wrote:
>
>
> > -----Original Message-----
> > From: ftillier.sst at gmail.com [mailto:ftillier.sst at gmail.com]
> > On Behalf Of Fabian Tillier
> > Sent: Thursday, July 06, 2006 9:57 PM
> >
> > Hi Yossi,
> >
> > On 7/4/06, Yossi Leybovich <sleybo at mellanox.co.il> wrote:
> > > Hi
> > >
> > > We found more cases where IPoIB discovers a duplicate LID in
> > > its endptlist
> > > (even after we clear the LID list in ipoib_reset_all). This can be
> > > caused by old packets in the network (a received packet creates the
> > > p_src endpoint if it does not exist, and the packet can carry the old
> > > LID). I think that this patch reduces the possibility of getting
> > > duplicate entries in the LID list.
> > > It inserts into the LID list only when the path record query returns
> > > (with the AV).
> >
> > Not inserting into the LID map until the AV is created means
> > that we won't ever report unicast packets until we've tried
> > to send to that node.  I don't know how big of an issue this
> > is, since most communication starts with an ARP exchange.
> >
> > However, there are cases where discarding unicast traffic
> > like this is the wrong thing to do.  Think of two systems, A
> > and B.  B resolves A's IP address via ARP (A responded, so
> > all is well).  A now loses its link, but B doesn't - this
> > flushes all of A's endpoint entries since the port went down
> > - all endpoints lose their LID assignment.  B now tries to
> > send unicast packets to A - it doesn't need to ARP again
> > since it just did.  The packets, when received by A, fail any
> > lookup by LID, and are discarded.
> >
>
> Isn't this what will happen if the SM changes A's LID?
> If A's LID is changed by the SM after the link is up (I am not really sure
> that the SM is allowed to do that), and B tries to send to the old LID,
> the packets will still be discarded.

This may happen when the SM changes the LID, but we don't want it to
happen when the LID does not change and we had a port go down due to a
cabling change.  I don't know if it's valid for the SM to change the
LID while the port is in the ACTIVE state.

> > > Moreover, just as we create the endpt entry in recv_arp (with LID 0,
> > > because the source LID may not be that of the original initiator), we
> > > should do the same in the recv_get_endpt function and wait for the
> > > LID from the path record query.
> >
> > Looking at it, I think recv_arp is wrong, and should include the LID.
> > Otherwise further unicast traffic will be discarded.
> >
> > > I also added an assert to check for duplicates in the path_record_cb.
> > >
> > > Another option is:
> > > To check on each insertion into the LID list whether the LID
> > > already exists
> > > in the list; if so, remove the entry from the LID list
> > > and zero the LID field of the endpt struct.
> >
> > I think the right thing to do is to remove the old entry, and
> > replace it with the new anytime the LID changes.  We can't
> > require every packet to include the GRH, as the IPoIB draft
> > states that implementations must handle receiving packets
> > without a GRH.
>
> This will also solve the cleanup I made for when the SM changes (first part
> of the patch).
>
> > I have to think about this a little more - I don't know what
> > to do with the "old" endpoint if a new one is being inserted
> > with a duplicate LID.  Do we just set its LID to zero, or do
> > we remove it altogether?
>
> I think we should set the LID to 0, clear its AV (if it exists), and
> remove it from the LID list.
> We should keep it in the MAC/GID list so that new sends to that
> destination will issue a PR query to resolve the LID and send the packet.

This will introduce the possibility of duplicates being found when
inserting into the MAC and GID maps.  Say a node X has its LID
changed, so it clears all the endpoint LIDs and removes them from the
LID map.  If that node now receives a packet from some other node, it
will try to create an endpoint for that node, and could fail inserting
into the MAC and GID maps because that endpoint already exists (even
if the sender did not change LIDs).  I think we need to keep the
endpoint around, but we need to trap duplicate insertions into all the
maps now when a packet is received.
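
To make that concrete, here is a rough sketch of what I have in mind.
It is not the IPoIB code: the types and function names (endpt_t,
endpt_mgr_t, endpt_set_lid, recv_get_or_add) are made up for
illustration, a flat array with linear scans stands in for the real
MAC, GID and LID maps, and it assumes the source LID handed to the
receive path really is the sender's LID.  The idea is that a receive
looks up the existing endpoint by MAC before creating one (so a
duplicate MAC insertion is trapped), and that assigning a LID evicts
any other endpoint that still claims it, leaving that endpoint in the
MAC map with LID 0 so the next send re-resolves it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_ENDPTS 16

/* Hypothetical, simplified endpoint; the real one carries much more state. */
typedef struct endpt {
    uint8_t  mac[6];
    uint16_t lid;     /* 0 means "unknown, needs a path record query" */
    int      has_av;  /* stands in for the address vector handle */
    int      in_use;
} endpt_t;

/* One flat table stands in for the MAC, GID and LID maps. */
typedef struct endpt_mgr {
    endpt_t endpts[MAX_ENDPTS];
} endpt_mgr_t;

static endpt_t *find_by_mac(endpt_mgr_t *m, const uint8_t mac[6])
{
    for (size_t i = 0; i < MAX_ENDPTS; i++)
        if (m->endpts[i].in_use && memcmp(m->endpts[i].mac, mac, 6) == 0)
            return &m->endpts[i];
    return NULL;
}

static endpt_t *find_by_lid(endpt_mgr_t *m, uint16_t lid)
{
    if (lid == 0)
        return NULL;  /* LID 0 is never present in the LID map */
    for (size_t i = 0; i < MAX_ENDPTS; i++)
        if (m->endpts[i].in_use && m->endpts[i].lid == lid)
            return &m->endpts[i];
    return NULL;
}

/* Assign a LID to ep, evicting any other endpoint that still claims it. */
static void endpt_set_lid(endpt_mgr_t *m, endpt_t *ep, uint16_t lid)
{
    endpt_t *old = find_by_lid(m, lid);
    if (old != NULL && old != ep) {
        old->lid = 0;     /* drop it from the LID map ...            */
        old->has_av = 0;  /* ... and its AV; it stays in the MAC map */
    }
    if (ep->lid != lid)
        ep->has_av = 0;   /* any AV was built for the previous LID */
    ep->lid = lid;
    assert(lid == 0 || find_by_lid(m, lid) == ep);  /* no duplicates left */
}

/* Receive path: reuse the endpoint for this MAC if it exists, else add one. */
static endpt_t *recv_get_or_add(endpt_mgr_t *m, const uint8_t mac[6],
                                uint16_t slid)
{
    endpt_t *ep = find_by_mac(m, mac);  /* traps a duplicate MAC insertion */
    if (ep == NULL) {
        for (size_t i = 0; i < MAX_ENDPTS && ep == NULL; i++)
            if (!m->endpts[i].in_use)
                ep = &m->endpts[i];
        if (ep == NULL)
            return NULL;  /* table full */
        memset(ep, 0, sizeof(*ep));
        memcpy(ep->mac, mac, 6);
        ep->in_use = 1;
    }
    if (slid != 0)
        endpt_set_lid(m, ep, slid);
    return ep;
}
```

With this shape, a stale packet carrying an old LID can still steal
the LID from the rightful owner for a while, but the maps never hold
two entries for the same key, and the evicted endpoint recovers on its
next path record query.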

> Anyway, we need to come up with something, because running on a Windows
> CCP 8-node cluster gets us into scenarios where LIDs change, and that
> hangs IPoIB.

I agree this needs to be fixed, though realistically the SM should not
be going up and down and reassigning LIDs in an 8-node cluster unless
the SM is misconfigured.

It's just a matter of finding the proper way to handle it at this point.

- Fab



