[openib-general] RHEL 4 U3 - lost completions

Bill Hartner bhartner at austin.rr.com
Wed Oct 4 08:52:24 PDT 2006


"Michael S. Tsirkin" wrote:
> 
> Quoting r. glebn at voltaire.com <glebn at voltaire.com>:
> > AFAIR there is a bug in kernel 2.6.9 that makes it possible for page to
> > be changed in process's VM even though it is locked by get_user_pages().
> > That is why Mellanox driver used mlock() in addition to
> > get_user_pages(). I think this bug was fixed somewhere around 2.6.11.
> 
> I think it got fixed around 2.6.7. RHEL4 U3 has this fix,
> and AFAIK last SLES9 update has backported that to 2.6.7 too.

Another data point here. On gen1 stacks + RHEL 4 U3, the app I'm working
on mlock()s a region from user space and also calls get_user_pages() on
the same region from a kernel piece of the app.  When the adapter was
closed or the registration was freed, the IB stack munlock()ed the
region and the page structs changed out from under us, even though the
app still held its get_user_pages() references on the region.  Does this
indicate that get_user_pages() does not guarantee that a page stays in
place on RHEL 4 U3?

I created a test case using pthreads that simulates what the real app
does, but it does not reproduce the problem. I will continue to debug
the app.  I will also verify that no forks take place.
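
For reference, a minimal user-space sketch of that kind of pthreads test
might look like the following. This is not the actual test case: the
names run_lock_cycle_check() and lock_cycler() are made up for
illustration, and the get_user_pages() pin taken by the kernel piece is
not reproduced here at all. One thread cycles mlock()/munlock() on a
region while the main thread checks that a known byte pattern stays
intact.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on strict compilers */
#include <pthread.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper type: the region shared by both threads. */
struct region {
    unsigned char *buf;
    size_t len;
    int rounds;
};

/* Worker: repeatedly lock and unlock the region, standing in for the
 * IB stack munlock()ing memory that the app had mlock()ed. */
static void *lock_cycler(void *arg)
{
    struct region *r = arg;
    for (int i = 0; i < r->rounds; i++) {
        /* mlock() can fail under a low RLIMIT_MEMLOCK; the failure is
         * ignored because only the data check matters in this sketch. */
        (void)mlock(r->buf, r->len);
        (void)munlock(r->buf, r->len);
    }
    return NULL;
}

/* Returns the number of bytes seen without the expected pattern,
 * or -1 if the test could not be set up. */
long run_lock_cycle_check(size_t pages, int rounds)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || pages == 0)
        return -1;

    size_t len = pages * (size_t)page;
    unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;
    memset(buf, 0xA5, len);      /* known pattern, also faults pages in */

    struct region r = { buf, len, rounds };
    pthread_t t;
    if (pthread_create(&t, NULL, lock_cycler, &r) != 0) {
        munmap(buf, len);
        return -1;
    }

    /* Main thread: re-read the region while the worker cycles locks. */
    volatile unsigned char *p = buf;
    long bad = 0;
    for (int i = 0; i < rounds; i++)
        for (size_t off = 0; off < len; off++)
            if (p[off] != 0xA5)
                bad++;

    pthread_join(t, NULL);
    munmap(buf, len);
    return bad;
}
```

A result of 0 from something like run_lock_cycle_check(4, 100) only says
that the user-space view of the data stayed consistent. It says nothing
about whether the physical pages behind a get_user_pages() reference
stayed the same, which would have to be checked from the kernel side.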

-Bill



