[ofa-general] mapping kernel memory to userspace in <= 2.6.14

Steve Wise swise at aoot.com
Tue Mar 13 07:08:23 PDT 2007


Hey Roland,

Remember my little OFED 1.2 bug where the Chelsio WQ and CQ memory
isn't getting mapped to userspace correctly on RHEL4U4?  Well, through
experimentation I've shown that it fails on every kernel up through
2.6.14 and works fine with 2.6.15 and beyond.

Perusing the mm/memory.c log from v2.6.14..v2.6.15 shows lots of
changes.  Many of the comments talk about no longer needing to set the
reserved bit on page entries.  On a hunch I hacked in calling
SetPageReserved() on each page entry for the memory allocated via
dma_alloc_coherent().  And BOOM...things start working.  Below is a
patch to show the hack.

Does this make sense to you?  I'm not a VM expert.  

Also, looking at other 2.6.14 drivers, I see that many of them seem to
do this same trick, apparently to make sure the pages aren't swapped out.
However, they also clear the bit before freeing the memory, which makes
sense.  My original hack had a ClearPageReserved() loop before freeing
my dma coherent memory, but I got crashes on process exit where the map
count on pages wasn't correct (the refcnt went to -1, apparently).  It hit
the BUG_ON() in page_remove_rmap() at line 487 in mm/rmap.c (2.6.14.7
kernel).
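
As a rough illustration of the set-then-clear pairing described above,
here is a userspace model only, not driver code.  The names
model_reserved, model_mark_pages and friends, the 16-entry table and the
4096-byte page size are all made up for this sketch, and a plain bool
array stands in for the per-page reserved bit.  It only shows that the
clear pass has to walk exactly the pages the set pass walked; it does not
model the map count or explain the BUG_ON().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u   /* assumed page size, for illustration only */
#define MODEL_NPAGES    16u     /* size of the pretend page-flag table */

/* Stand-in for the per-page reserved bit; not a kernel structure. */
static bool model_reserved[MODEL_NPAGES];

/* Number of pages a byte range covers, rounding a partial page up. */
static size_t model_pages_spanned(size_t size)
{
	return (size + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;
}

/* Set or clear the flag on every page covered by size bytes. */
static void model_mark_pages(size_t first_page, size_t size, bool reserved)
{
	size_t n = model_pages_spanned(size);
	size_t i;

	assert(first_page + n <= MODEL_NPAGES);
	for (i = 0; i < n; i++)
		model_reserved[first_page + i] = reserved;
}

/* Count pages still flagged, e.g. to look for leftovers after teardown. */
static size_t model_count_reserved(void)
{
	size_t i, count = 0;

	for (i = 0; i < MODEL_NPAGES; i++)
		if (model_reserved[i])
			count++;
	return count;
}
```

Calling model_mark_pages(first, size, true) at create time and
model_mark_pages(first, size, false) with the same arguments at destroy
time should leave model_count_reserved() back at zero; passing a
different size to the second call leaves pages flagged or clears pages
that were never set.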

So I'm thinking this hack isn't quite correct.  Got any ideas?

Thanks,

Steve.


diff -up /home/swise/git/ofed_1_2/drivers/infiniband/hw/cxgb3/iwch_provider.c drivers/infiniband/hw/cxgb3/iwch_provider.c
--- /home/swise/git/ofed_1_2/drivers/infiniband/hw/cxgb3/iwch_provider.c	2007-03-10 14:14:31.000000000 -0600
+++ drivers/infiniband/hw/cxgb3/iwch_provider.c	2007-03-12 22:24:42.000000000 -0500
@@ -139,6 +139,15 @@ static int iwch_destroy_cq(struct ib_cq 
 	return 0;
 }
 
+static void reserve_pages(void *p, int size)
+{
+	while (size > 0) {
+		SetPageReserved(virt_to_page(p));
+		p += PAGE_SIZE;
+		size -= PAGE_SIZE;
+	}
+}
+
 static struct ib_cq *iwch_create_cq(struct ib_device *ibdev, int entries,
 			     struct ib_ucontext *context,
 			     struct ib_udata *udata)
@@ -205,6 +214,7 @@ static struct ib_cq *iwch_create_cq(stru
 			iwch_destroy_cq(&chp->ibcq);
 			return ERR_PTR(-EFAULT);
 		}
+		reserve_pages(chp->cq.queue, entries * sizeof (struct t3_cqe));
 		mm->addr = uresp.physaddr;
 		mm->len = PAGE_ALIGN((1UL << uresp.size_log2) *
 					     sizeof (struct t3_cqe));
@@ -848,6 +843,7 @@ static struct ib_qp *iwch_create_qp(stru
 		insert_mmap(ucontext, mm1);
 		mm2->addr = uresp.doorbell & PAGE_MASK;
 		mm2->len = PAGE_SIZE;
+		reserve_pages(qhp->wq.queue, wqsize * sizeof(union t3_wr));
 		insert_mmap(ucontext, mm2);
 	}
 	qhp->ibqp.qp_num = qhp->wq.qpid;
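
A note on the loop in reserve_pages() above: it touches one page per
PAGE_SIZE chunk of the byte count it is given, rounding a partial page
up, so that count presumably needs to describe the same region as the
length recorded for the mmap.  The userspace sketch below redoes that
arithmetic so it can be checked outside the kernel.  The sketch_* names
and the 4096-byte page size are assumptions for illustration; nothing
here is taken from the driver.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096L  /* assumed; the real value is per-architecture */

/* Same rounding as the kernel's PAGE_ALIGN(), redone here for userspace. */
static long sketch_page_align(long len)
{
	return (len + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* Mirror of the reserve_pages() loop, counting pages instead of flagging. */
static long sketch_pages_touched(long size)
{
	long pages = 0;

	assert(size >= 0);
	while (size > 0) {
		pages++;
		size -= SKETCH_PAGE_SIZE;
	}
	return pages;
}
```

For any non-negative size, sketch_pages_touched(size) equals
sketch_page_align(size) / SKETCH_PAGE_SIZE, so the loop and a
PAGE_ALIGN()ed mapping length agree only when both start from the same
byte count.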





