[Openib-windows] A major patch
Leonid Keller
leonid at mellanox.co.il
Thu Jun 8 03:59:02 PDT 2006
See below
> -----Original Message-----
> From: ftillier.sst at gmail.com [mailto:ftillier.sst at gmail.com]
> On Behalf Of Fabian Tillier
> Sent: Thursday, June 08, 2006 12:18 AM
> To: Leonid Keller
> Cc: openib-windows at openib.org
> Subject: Re: [Openib-windows] A major patch
>
> Hi Leo,
>
> On 6/6/06, Leonid Keller <leonid at mellanox.co.il> wrote:
> >
> > Hi Fab,
> > I've synced our repository with Openib, because a lot of changes
> > had accumulated.
> > Some of the patches intersected with the changes introduced for
> > FMR support, so I've also added the FMR patch.
>
> I wish you wouldn't have, but so be it. In the future,
> please back out the stuff that isn't critical. We're trying
> to stabilize the stack for WHQL now, and adding new features
> like FMR should not take priority over other bugs, like
> support for mixed rate fabrics. I would much rather have
> seen support for CQ resize than FMR, as that is actually used by WSD.
>
> Perhaps your internal development for new features should
> happen on branches so that the new features can be
> incorporated at the appropriate time without creating
> interdependencies with bug fixes.
>
> > I'm waiting for your comments on the changes in IBAL in the FMR
> > patch, if any.
>
> The code looks good, thanks. There are minor formatting
> issues that I'll go through and fix.
>
> Note that I am going to rename the functions from ib_xxx_fmr
> to mlnx_xxx_fmr, because we're not dealing with IB standard
> FMR support, and when we do it will just create confusion.
> It needs to be clear that FMR support as you implemented it
> is really a vendor specific extension to work around memory
> registration performance problems, not an IB spec standard verb.
>
> >
> > Here are the comments on the sync I've performed:
> >
> >
> > [MTHCA, IBAL]
> >
> > added FMR support;
> >
> > [MTHCA]
> >
> > 1. fixed (and now works) "livefish" support;
> >
> > 2. fixed (and now works) multiple HCA support;
> >
> > 3. support for running 32-bit tools with a 64-bit kernel;
> >
> > 4. treat the *bad_wr parameter in post/recv verbs as optional;
> >
> > 5. make the wait on a command completion alertable for user
> > processes;
>
> What happens when an operation wakes up due to an alert? I
> assume you then resume the wait?
No, and that seems wrong. But ...
To recall, this is a wait on a command sent to the HCA card.
The real reason for the change was to make it possible to cancel user
applications with Ctrl-C.
The original solution is:
    1. to wait in KernelMode with a timeout in a non-alertable state.
Other possible solutions are:
    2. to wait in UserMode with a timeout in a non-alertable state. On
alert, return an error and exit.
        This is usually allowable only for the highest-level
drivers;
    3. to wait in KernelMode with a timeout in an alertable state. On
alert, resume the wait.
        This can cause an APC to execute in a racing context, e.g.
the thread is waiting on a command during create_qp while the APC
destroys all of the thread's resources.
I tend to return to the original solution as the more robust one.
What do you think?
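
To make the three options easier to compare, here is a minimal sketch in
portable C. It is only a model, not driver code: the status codes,
fake_event, fake_wait and wait_for_command are hypothetical names invented
for illustration, and the scripted wait stands in for the real kernel wait
call. Option 2 is simplified to "the wait can be interrupted, and an
interruption is reported as an error".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes; stand-ins for real NTSTATUS values. */
typedef enum { CMD_DONE, CMD_TIMEOUT, CMD_ALERTED } wait_status;

/* The three options discussed above. */
typedef enum {
    POLICY_NON_ALERTABLE,   /* option 1: alerts never interrupt the wait */
    POLICY_ABORT_ON_ALERT,  /* option 2: an alert ends the wait with an error */
    POLICY_RESUME_ON_ALERT  /* option 3: an alert is seen, then the wait resumes */
} wait_policy;

/* Hypothetical scripted event: replays a fixed sequence of wake-up reasons
 * so the policies can be exercised outside a kernel. */
typedef struct {
    const wait_status *script;
    size_t len;
    size_t pos;
} fake_event;

/* Model of one wait call. A non-interruptible wait skips over alerts;
 * an exhausted script is treated as a timeout. */
wait_status fake_wait(fake_event *ev, int interruptible)
{
    assert(ev != NULL);
    while (ev->pos < ev->len) {
        wait_status s = ev->script[ev->pos++];
        if (s == CMD_ALERTED && !interruptible)
            continue;
        return s;
    }
    return CMD_TIMEOUT;
}

/* Returns 0 when the command completed, -1 on timeout, -2 when the wait
 * was abandoned because of an alert. */
int wait_for_command(fake_event *ev, wait_policy policy)
{
    int interruptible = (policy != POLICY_NON_ALERTABLE);

    for (;;) {
        wait_status s = fake_wait(ev, interruptible);

        if (s == CMD_ALERTED && policy == POLICY_RESUME_ON_ALERT)
            continue; /* a real driver would also shrink the remaining timeout */
        if (s == CMD_DONE)
            return 0;
        return (s == CMD_TIMEOUT) ? -1 : -2;
    }
}
```

With a script of "alert, then completion", options 1 and 3 both end with the
command completed, while option 2 gives up with an error. The model does not
show the APC that runs between the alert and the resumed wait in option 3,
which is exactly where the race described above would occur.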
>
> Thanks,
>
> - Fab
>
>
>