[openib-general] RFC: [PATCH untested] IB/uverbs: optimize registration for huge pages
Michael S. Tsirkin
mst at mellanox.co.il
Tue Aug 15 14:13:19 PDT 2006
Quoting r. Roland Dreier <rdreier at cisco.com>:
> Subject: Re: question: ib_umem page_size
>
> Michael> Roland, could you please clarify what the page_size
> Michael> field in struct ib_umem does?
>
> It gives the page size for the user memory described by the struct.
> The idea was that if/when someone tries to optimize for huge pages,
> then the low-level driver can know that a region is using huge pages
> without having to walk through the page list and search for the
> minimum physically contiguous size.
OK, so here's a patch [warning: untested] that attempts to do this. We have
customers who run out of resources when they register lots of huge pages,
and this should help.
How does this look? Is this the intended usage?
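
To make the idea easier to check outside the kernel, here is a minimal
user-space sketch of the same mask computation. It is an illustration only,
not the kernel code: struct seg and min_contig_size() are hypothetical names
standing in for the DMA-mapped scatterlist entries and for the logic added to
ib_umem_get(), and ffs() is assumed to come from POSIX <strings.h>. As in the
patch, the start of the first segment and the end of the last one are not
folded into the mask.

```c
#include <stddef.h>
#include <stdint.h>
#include <strings.h>	/* ffs(), POSIX */

/* Hypothetical stand-in for one DMA-mapped scatterlist entry. */
struct seg {
	uint64_t addr;	/* bus address where the segment starts */
	uint32_t len;	/* segment length in bytes */
};

/*
 * Return the largest power of two that divides every address at which
 * the segment list stops being physically contiguous.  If the whole
 * list is contiguous, no break is recorded and 2^31 is returned.
 */
static uint32_t min_contig_size(const struct seg *sg, size_t n)
{
	uint32_t mask = 0;
	uint64_t seg_end = 0;
	size_t i;

	for (i = 0; i < n; ++i) {
		uint64_t a = sg[i].addr;

		/*
		 * A discontinuity: both the end of the previous run and
		 * the start of the new one limit the usable page size.
		 * Only the low 32 bits matter for sizes up to 2^31.
		 */
		if (i && a != seg_end) {
			mask |= (uint32_t)seg_end;
			mask |= (uint32_t)a;
		}
		seg_end = a + sg[i].len;
	}

	/* The lowest set bit of mask is the common alignment of all breaks. */
	return ffs((int)mask) ? 1U << (ffs((int)mask) - 1) : 1U << 31;
}
```

For example, two adjacent 2MB segments at 0x200000 and 0x400000 followed by
a third at 0xA00000 record the break addresses 0x600000 and 0xA00000, whose
lowest common set bit gives a page size of 2MB, while scattered 4KB pages
give 4KB.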
uverbs_mem.c | 14 +++++++++++++-
1 files changed, 13 insertions(+), 1 deletion(-)
--
Optimize memory registration for huge pages, by walking through
the page list and searching for the minimum physically contiguous
size.
Signed-off-by: Michael S. Tsirkin <mst at mellanox.co.il>
diff --git a/drivers/infiniband/core/uverbs_mem.c b/drivers/infiniband/core/uverbs_mem.c
index efe147d..f750652 100644
--- a/drivers/infiniband/core/uverbs_mem.c
+++ b/drivers/infiniband/core/uverbs_mem.c
@@ -73,6 +73,8 @@ int ib_umem_get(struct ib_device *dev, s
unsigned long lock_limit;
unsigned long cur_base;
unsigned long npages;
+ dma_addr_t a, seg_end;
+ u32 mask = 0;
int ret = 0;
int off;
int i;
@@ -87,7 +89,6 @@ int ib_umem_get(struct ib_device *dev, s
mem->user_base = (unsigned long) addr;
mem->length = size;
mem->offset = (unsigned long) addr & ~PAGE_MASK;
- mem->page_size = PAGE_SIZE;
mem->writable = write;
INIT_LIST_HEAD(&mem->chunk_list);
@@ -149,6 +150,15 @@ int ib_umem_get(struct ib_device *dev, s
goto out;
}
+ for (i = 0; i < chunk->nents; ++i) {
+ a = sg_dma_address(&chunk->page_list[i]);
+ if ((i || off) && a != seg_end) {
+ mask |= seg_end;
+ mask |= a;
+ }
+ seg_end = a + sg_dma_len(&chunk->page_list[i]);
+ }
+
ret -= chunk->nents;
off += chunk->nents;
list_add_tail(&chunk->list, &mem->chunk_list);
@@ -157,6 +167,8 @@ int ib_umem_get(struct ib_device *dev, s
ret = 0;
}
+ mem->page_size = ffs(mask) ? 1 << (ffs(mask) - 1) : (1 << 31);
+
out:
if (ret < 0)
__ib_umem_release(dev, mem, 0);
--
MST