[ewg] Re: pkey change handling patch

Wed Apr 25 01:13:34 PDT 2007

On 4/19/07, Michael S. Tsirkin <mst at dev.mellanox.co.il> wrote:
> > Quoting Roland Dreier <rdreier at cisco.com>:
> > Subject: Re: pkey change handling patch
> >
> >  > So since all this thread was started by Moni because of IPoIB,
> >  > the path is clear in that respect, and would already be a step in the
> >  > right direction:
> >  >
> >  > - a patch to add ib_find_pkey() and ib_find_gid() to core
> >  > - a patch to replace cache usage in IPoIB / SRP with uncached
> >  >   hardware accesses on top of this
> >  > - pkey change handling patch on top of these
> >
> > Makes good sense to me.
>
> OK, let's do this for starters. Moni?

Before getting to the implementation phase, I would like to get your
opinion on two more things:

1. Direct access in ib_find_pkey will probably hurt the RC
connections-per-second rate.
2. What do you think about OrG's opinion (I'm copying it from the other thread):

Roland, Michael,

Please note that there is quite a big difference between UD and RC based
IB ULPs with respect to how they are affected by using a wrong pkey
index on their QP.

In the RC case, the receiving side transport level would not get any
packets and hence would not send acks etc. At some point the sending
side would get a completion with error and retry the connection.

In the UD case, nothing other than pkey-violation-counter/traps etc
would happen unless both sides re-initiate their QPs (this is
exactly what Moni is doing in ipoib in the patch that followed).

Hence, it is extremely important that UD based ULPs would react to the
async event of pkey change, and would retry reading the pkey from the
cache when getting ESTALE or any other error code from the cache.
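
To make the UD flow above concrete, here is a minimal user-space sketch
(not kernel code). Everything in it is an assumption made for
illustration: struct pkey_cache, struct ud_qp, cache_find_pkey(),
cache_refresh() and ud_handle_pkey_change() are hypothetical names, and
the "hardware" table is just an array passed in by the caller. It only
shows the shape of the logic: on a pkey change event, look the pkey up
again, and if the cache answers ESTALE, refresh it and retry.

```c
#include <errno.h>
#include <stdint.h>

#define PKEY_TBL_LEN 4

/* Hypothetical stand-in for a port's cached pkey table. */
struct pkey_cache {
    uint16_t tbl[PKEY_TBL_LEN];
    int stale;              /* nonzero once the port's table has changed */
};

/* Hypothetical UD QP state: just the pkey and its resolved index. */
struct ud_qp {
    uint16_t pkey;
    int pkey_index;
};

/* Returns the index of pkey, -ESTALE if the cache is outdated,
 * or -ENOENT if the pkey is not in the table. */
static int cache_find_pkey(const struct pkey_cache *c, uint16_t pkey)
{
    if (c->stale)
        return -ESTALE;
    for (int i = 0; i < PKEY_TBL_LEN; i++) {
        /* Compare the 15-bit key, ignoring the membership bit. */
        if ((c->tbl[i] & 0x7fff) == (pkey & 0x7fff))
            return i;
    }
    return -ENOENT;
}

/* Stand-in for re-reading the table from hardware. */
static void cache_refresh(struct pkey_cache *c, const uint16_t *hw_tbl)
{
    for (int i = 0; i < PKEY_TBL_LEN; i++)
        c->tbl[i] = hw_tbl[i];
    c->stale = 0;
}

/* What a UD ULP might do when it gets the pkey change async event. */
static int ud_handle_pkey_change(struct ud_qp *qp, struct pkey_cache *c,
                                 const uint16_t *hw_tbl)
{
    int idx = cache_find_pkey(c, qp->pkey);

    if (idx == -ESTALE) {
        /* Cache is outdated: refresh it and retry the lookup once. */
        cache_refresh(c, hw_tbl);
        idx = cache_find_pkey(c, qp->pkey);
    }
    if (idx < 0)
        return idx;

    /* A real ULP would re-initiate its QP with the new index here. */
    qp->pkey_index = idx;
    return 0;
}
```

The single retry is a simplification; a real ULP would also have to
decide what to do when the pkey has disappeared from the table
altogether (the -ENOENT case here).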

For the RC case, note that a) the connection would not break if the
change did not involve the index of the pkey used for it, and b) once
the connection breaks and is re-initiated by the ULP, the cache would
be very much --already updated--.

So the only case which might be problematic with a patch that does not
change the RC ULPs (and CM) code is when the cache changes in the exact
millisecond you set up your RC connection. I don't think the IB portion
of the ULP code has to be changed other than sensing the ESTALE error
and propagating it up. Higher layers would retry the connection and we
are done.
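
A rough user-space sketch of that RC argument, under stated
assumptions: struct rc_ctx, rc_connect_once() and
rc_connect_with_retry() are hypothetical names, and the "stale cache"
is modelled by a simple counter rather than a real pkey table. The
lower function only senses ESTALE and returns it; the retry lives in
the caller, standing in for the higher layer.

```c
#include <errno.h>

/* Hypothetical connection context. The first `stale_attempts` connect
 * calls fail with -ESTALE, modelling a cache that changes mid-setup. */
struct rc_ctx {
    int stale_attempts;
    int attempts;       /* how many times connect was tried */
};

/* IB portion of the ULP: no retry logic, just propagate the error. */
static int rc_connect_once(struct rc_ctx *ctx)
{
    ctx->attempts++;
    if (ctx->stale_attempts > 0) {
        ctx->stale_attempts--;
        return -ESTALE;
    }
    return 0;
}

/* Higher layer: retry the connection while the error is ESTALE,
 * giving up after max_tries attempts. */
static int rc_connect_with_retry(struct rc_ctx *ctx, int max_tries)
{
    int ret = -ESTALE;

    for (int i = 0; i < max_tries && ret == -ESTALE; i++)
        ret = rc_connect_once(ctx);
    return ret;
}
```

The retry bound is an arbitrary choice for the sketch; the point is
only that nothing below the retry loop needs to know about pkey
changes.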

Anyway, thanks for bringing all this up! While thinking about it I have
realized that the RDMA CM can (should) be enhanced to register for async
events and, for the pkey change event, issue a disconnect event on the
relevant UD unicast IDs and a multicast error event on the relevant UD
multicast IDs.
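
As a sketch of what such a dispatch could look like (user-space C, not
the actual RDMA CM code): the enums, struct cm_id and
cm_on_pkey_change() below are all hypothetical, and "relevant" is
assumed here to mean "bound to the port whose pkey table changed".

```c
#include <stddef.h>

/* Hypothetical ID kinds and events, for illustration only. */
enum id_kind  { ID_UD_UNICAST, ID_UD_MULTICAST, ID_RC };
enum cm_event { EV_NONE, EV_DISCONNECTED, EV_MULTICAST_ERROR };

struct cm_id {
    enum id_kind kind;
    int port;
    enum cm_event last_event;   /* last event delivered to this ID */
};

/* On a pkey change on `port`, deliver a disconnect event to UD unicast
 * IDs and a multicast error event to UD multicast IDs on that port.
 * RC IDs are left alone. Returns the number of IDs notified. */
static size_t cm_on_pkey_change(struct cm_id *ids, size_t n, int port)
{
    size_t notified = 0;

    for (size_t i = 0; i < n; i++) {
        if (ids[i].port != port)
            continue;
        switch (ids[i].kind) {
        case ID_UD_UNICAST:
            ids[i].last_event = EV_DISCONNECTED;
            notified++;
            break;
        case ID_UD_MULTICAST:
            ids[i].last_event = EV_MULTICAST_ERROR;
            notified++;
            break;
        default:
            break;
        }
    }
    return notified;
}
```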

-- Moni

>
> --
> MST
>