[openib-general] ipath and current git woes

Roland Dreier rdreier at cisco.com
Thu Feb 1 20:45:11 PST 2007


 > After applying that patch the user space consumers load but we got a
 > kernel oops when we tried to run a test here :<
 > 
 > Unable to handle kernel NULL pointer dereference at 0000000000000918 RIP: 
 >  [<ffffffff88074c76>] :ib_ipath:ipath_mmap+0x37/0x95

So I had a look at this, and it seems that there are two bugs that
lead to this.

First of all, libipathverbs gets a response from the kernel that has a
64-bit kernel address in it, and passes that back into a call to
mmap(), where it uses that address as the offset.  On 32-bit
userspace, that chops off the high bits of the address and so the
ipath kernel driver can't find the address in its list.

So that explains why things don't work.  And unfortunately the obvious
fix for libipathverbs to use mmap64() instead of mmap() doesn't work,
because on Linux, mmap64() is implemented with the mmap2 system call,
which just allows the offset to be 12 bits bigger -- so it only gets
you to 44 bits, which is not enough to reach a 64-bit kernel address
(which is typically something like 0xffffc20000072000).  So you
probably want to use something like a 32-bit serial number to point at
your buffers or something like that.

The oops is caused by another more serious problem.  Obviously a buggy
libipathverbs shouldn't be able to crash the kernel, because even if
libipathverbs is fixed then malicious userspace could do the same
thing too.

It turns out that all the handling of pending_mmaps in the ipath
driver is not really careful about userspace screwing it up.  When
userspace creates a CQ, the CQ buffer is added to the device-wide list
of pending mmaps.  Of course 32-bit userspace never succeeds in
mapping that CQ, so it stays on the list (the only way it gets removed
is if it is successfully mmapped).  But then the destroy CQ operation
sees that the mmap is pending, and frees the structure holding the
information (without removing it from the list).  And of course when
that memory gets reused, then the pending mmap list gets corrupted,
etc etc.

Of course this is ugly to fix with the current data structure -- the
list of pending mmaps is singly-linked, which means I have to walk the
whole list to delete an entry.  It also makes the list walking in
ipath_mmap() is unnecessarily obfuscated.  I think it's much better to
just use the standard kernel list_head stuff if you're going to delete
things from the middle of the list, rather than implementing your own
singly-linked list.  Sure it costs an extra pointer in each entry but
no one ever has to worry about whether you're deleting things
correctly, etc.

There's some other silly stuff I noticed too, like:

    grep -n mmap_cnt *.[ch] /dev/null
    ipath_cq.c:232:		ip->mmap_cnt = 0;
    ipath_mmap.c:63:	ip->mmap_cnt++;
    ipath_mmap.c:70:	ip->mmap_cnt--;
    ipath_qp.c:837:			ip->mmap_cnt = 0;
    ipath_srq.c:162:		ip->mmap_cnt = 0;
    ipath_verbs.h:178:	unsigned mmap_cnt;

umm -- no one ever looks at mmap_cnt (there's a kref too), so why keep
it at all?

So Qlogic guys -- please fix this up!

 - R.




More information about the general mailing list