[openib-general] [PATCH] The ibv_cmd_* create functions need to set the context.
Michael S. Tsirkin
mst at mellanox.co.il
Wed Jan 31 10:25:32 PST 2007
> Quoting Steve Wise <swise at opengridcomputing.com>:
> Subject: Re: [PATCH] The ibv_cmd_* create functions need to set the context.
>
> On Wed, 2007-01-31 at 12:24 +0200, Michael S. Tsirkin wrote:
> > > Quoting Roland Dreier <rdreier at cisco.com>:
> > > Subject: Re: [PATCH] The ibv_cmd_* create functions need to set the context.
> > >
> > > Thanks, applied to master and stable branches.
> >
> > Did you test it?
> > This patch (8b3d225476c99ea29a68109a7d40e5ef353d4388) causes ibv_ud_pingpong
> > to segfault on libmthca: libmthca never calls ibv_cmd_create_ah to context is now
> > never set.
> >
> >
>
> I didn't test UD.
Well, when you touch the AH functions, UD is really the only way to test them.
>
> > Starting program: /usr/local/ofed/bin/ibv_ud_pingpong sw069
> > [Thread debugging using libthread_db enabled]
> > [New Thread 47299578320592 (LWP 5085)]
> > local address: LID 0x0002, QPN 0x090406, PSN 0x71bffb
> > remote address: LID 0x0001, QPN 0x040406, PSN 0x92316a
> > 4096000 bytes in 0.02 seconds = 1893.99 Mbit/sec
> > 1000 iters in 0.02 seconds = 17.30 usec/iter
> >
> > Program received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread 47299578320592 (LWP 5085)]
> > 0x00002b04ca3b7263 in __ibv_destroy_ah (ah=0x5050b0) at src/verbs.c:475
> > 475 return ah->context->ops.destroy_ah(ah);
> > (gdb) p ah->context
> > $1 = (struct ibv_context *) 0x0
> >
> > I actually think this approach is a wrong one: context should be
> > set in common code like ibv_create_ah, not in ibv_cmd_ which is
> > a library function low level driver might or might not call.
> > And certainly this kind of change does not seem appropriate for stable branch.
> >
> > I think the proper thing is for low level driver not to assume that
> > fields such as contex are intialized until create functions have returned.
> > Steve, pls fix your low level driver not to rely on this.
> >
>
> The issue is that the provider lib calls ibv_cmd_create_blah to create
> the object, then some failure happens (like a failure mmap()ing the
> object's DMA area to the process). At this point the provider lib must
> destroy this object that is created from the perspective of the ibv_cmd*
> interface. The only way to do that is to call the ibv_cmd_destroy_blah
> call, which needs the context field.
For stable, in case of error, set the context in the provider lib then?
> So I don't think solving this in the provider lib is the right thing to
> do.
At least for stable branch, this seams more sensible than the disruptive
patch that was applied. Roland, what do you think?
For master, maybe ibv_cmd destructors should get the context as a parameter?
> > Roland, I have reverted this in OFED, please revert on master and stable.
> >
>
> I think we should fix the bug introduced: set the context field in the
> ibv_create_blah service if its not set after calling the provider
> method.
This is ugly as well, but at least it would work.
--
MST
More information about the general
mailing list