[ofa-general] rdma_create_qp fails with -12
Or Gerlitz
ogerlitz at voltaire.com
Tue Jan 22 03:19:37 PST 2008
David Dillow wrote:
> On Mon, 2008-01-21 at 10:28 -0500, Shipman, Galen M. wrote:
>> We are seeing failures setting up a QP using rdma_create_qp.
>> This only occurs when:
>> init_qp_attr.cap.max_send_wr
>> init_qp_attr.cap.max_recv_wr
>> Totals to more than 16K.
> -12 is -ENOMEM
Hi Galen, indeed, this was discussed on the list before; the reason
for the -ENOMEM failure is the kmalloc mentioned below (and similar
allocations, as mentioned in
https://bugs.openfabrics.org/show_bug.cgi?id=331).
Basically, I see it more as a missing, questionable feature than a bug,
but you may want to get a response from Roland on that.
What application design or need would require a sender, or a bunch of
senders (in the case of SRQ), to have a few K of credits (= in-flight
messages)? Can you give an example of a concrete middleware/app that
would use that?
Or.
> You may want to add some printk's to mthca_alloc_wqe_buf() in
> mthca_qp.c. I think you're failing in the line
> qp->wrid = kmalloc((qp->rq.max + qp->sq.max) * sizeof (u64),
> GFP_KERNEL);
>
> rq.max gets set to max_recv_wr, sq.max gets max_send_wr, so you're
> trying to allocate 128KB when those total 16K -- that's trying to
> allocate 32 contiguous pages, which I believe is the max RHEL4's kernel
> will let you do via kmalloc(). New kernels may have alleviated this
> somewhat -- my Fedora 8 box has a 1MB slab/slub cache, but good luck
> actually allocating that if the box has been up any length of time.
>
> I'm not sure why it would work under userspace, but I've not looked very
> hard either. Perhaps the IOMMU is coming into play there?
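
The arithmetic in the quoted text above can be checked with a small
userspace sketch. It only mirrors the size expression from the quoted
kmalloc() call; the function names and the page size parameter are
made up for illustration, and it ignores any rounding the driver may
apply to rq.max and sq.max.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bytes requested for the wrid array: one 64-bit entry per send and
 * receive work request, mirroring
 *   (qp->rq.max + qp->sq.max) * sizeof (u64)
 * from the quoted kmalloc() call. Hypothetical helper, not driver code.
 */
size_t wrid_bytes(size_t max_send_wr, size_t max_recv_wr)
{
    return (max_recv_wr + max_send_wr) * sizeof(uint64_t);
}

/*
 * Number of contiguous pages needed for that many bytes, rounded up.
 * page_size is supplied by the caller; 4096 is only an assumption
 * that is typical for x86.
 */
size_t wrid_pages(size_t bytes, size_t page_size)
{
    assert(page_size != 0);
    return (bytes + page_size - 1) / page_size;
}
```

With max_send_wr + max_recv_wr = 16384 (for example 8192 each),
wrid_bytes() gives 131072 bytes (128KB), and with an assumed 4096-byte
page wrid_pages() gives 32, matching the figures above.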