[openib-general] [RFC/BUG] libibverbs: DMA vs. CQ race

Roland Dreier rdreier at cisco.com
Mon Jan 29 13:49:04 PST 2007


Hmm...

Well, first, the changes to the userspace libmthca must be such that a
new libmthca continues to work with old kernels.  I'm OK with saying
to people, "You upgraded your kernel so you also have to upgrade your
userspace library."  But I'm not OK with saying to people, "To get a
fix for that bug, you need to upgrade libmthca, which means you also
need to upgrade your kernel," and I also don't want to tell people,
"If you reboot into an older kernel then you need to downgrade your
userspace library."

Also,

 > +	off_t                  	    offset;

 > +	/* offset encodes CQ and cqn; lower PAGE_SHIFT bits MBZ */
 > +	offset = cq->cqn;
 > +	offset <<= 32;
 > +	offset += MTHCA_MAGIC_CQ_OFFSET * page_size;

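is obviously not going to work on architectures where off_t is 32 bits:
shifting a 32-bit value left by 32 is undefined in C, and in any case
there is no room left in the offset to hold the cqn.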
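
To make the width problem concrete, here is a minimal sketch of the same
encoding done in a fixed-width 64-bit type, plus a check for whether the
result fits in a given number of value bits.  The constants and function
names are hypothetical stand-ins (the real MTHCA_MAGIC_CQ_OFFSET and
page size come from the driver and the system), so treat this as an
illustration, not as the patch's code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical placeholder values, NOT the real driver constants. */
#define EXAMPLE_PAGE_SIZE       4096u
#define EXAMPLE_MAGIC_CQ_OFFSET 1u

/* Encode the cqn in the upper 32 bits, as the quoted patch intends.
 * Using uint64_t keeps the shift well defined on every architecture. */
static uint64_t encode_cq_offset(uint32_t cqn)
{
    return ((uint64_t)cqn << 32) +
           (uint64_t)EXAMPLE_MAGIC_CQ_OFFSET * EXAMPLE_PAGE_SIZE;
}

/* Recover the cqn from an encoded offset. */
static uint32_t decode_cqn(uint64_t offset)
{
    return (uint32_t)(offset >> 32);
}

/* Return 1 if 'value' is representable in 'bits' value bits, else 0.
 * A signed 32-bit off_t has 31 value bits; a signed 64-bit one has 63. */
static int fits_in_bits(uint64_t value, unsigned bits)
{
    assert(bits > 0 && bits < 64);
    return (value >> bits) == 0;
}
```

With these placeholders, any nonzero cqn yields an offset that fails
fits_in_bits(offset, 31), which is the 32-bit off_t case, while the same
offset passes with 63 bits.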

Even with that resolved, this all seems rather unfortunate to me.  I
don't like the idea of having the kernel keep all these buffers around
and then have the userspace library have to map the right buffer.  It
leads to awkwardness like the fact that mthca_resize_cq() seems to be
totally screwed if ibv_cmd_resize_cq() fails for some reason -- it
already munmap'ed the original buffer, and it can't map the new
buffer, and so the CQ is dead with no chance to recover.
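
For contrast, a recoverable ordering would obtain the new buffer first
and release the old one only after the resize command has succeeded.
The sketch below shows just that ordering.  Everything in it is an
assumption made for illustration: struct example_cq, the resize_cmd_fn
callback standing in for ibv_cmd_resize_cq(), and an anonymous mmap()
standing in for the buffer mapping.  Whether the kernel-owned-buffer
scheme in the patch can hold both mappings at once is a separate
question that this sketch does not answer.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical, stripped-down CQ: just a mapped buffer and its length. */
struct example_cq {
    void   *buf;
    size_t  len;
};

/* Hypothetical stand-in for the resize command; returns 0 on success. */
typedef int (*resize_cmd_fn)(struct example_cq *cq, size_t new_len);

/* Resize so that any failure leaves the original buffer mapped. */
static int example_resize_cq(struct example_cq *cq, size_t new_len,
                             resize_cmd_fn cmd)
{
    void *new_buf;

    assert(cq != NULL && cmd != NULL);

    /* Step 1: get the new buffer while the old one is still mapped. */
    new_buf = mmap(NULL, new_len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (new_buf == MAP_FAILED)
        return -1;              /* old buffer untouched */

    /* Step 2: issue the command; on failure drop only the new buffer. */
    if (cmd(cq, new_len) != 0) {
        munmap(new_buf, new_len);
        return -1;              /* old buffer still usable */
    }

    /* Step 3: only now release the old buffer and switch over. */
    munmap(cq->buf, cq->len);
    cq->buf = new_buf;
    cq->len = new_len;
    return 0;
}

/* Example stubs for the command: one always succeeds, one always fails. */
static int example_cmd_ok(struct example_cq *cq, size_t new_len)
{
    (void)cq; (void)new_len;
    return 0;
}

static int example_cmd_fail(struct example_cq *cq, size_t new_len)
{
    (void)cq; (void)new_len;
    return -1;
}
```

Calling example_resize_cq() with example_cmd_fail leaves cq->buf and
cq->len exactly as they were, so the caller can keep using the CQ.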

The really strange thing about this is that this Altix
coherent/consistent memory really isn't about the memory itself, but
about the relationship of that memory with DMA elsewhere -- as I
understand the code, doing dma_alloc_coherent() returns normal memory
with a special DMA address that tells the system to flush other DMAs
before doing DMA to the coherent region.  Which isn't really what most
people understand coherent memory to be, but it has the magic property
of making most drivers work.

So I'd really like a better solution, but I don't have one in mind
unfortunately.  Maybe we can all meditate on this and try to come up
with something cleaner -- I really hope there is a better way to
handle this.

 - R.



