[openib-general] Re: uverbs events

Grant Grundler iod00d at hp.com
Mon Apr 11 19:58:35 PDT 2005


On Mon, Apr 11, 2005 at 05:10:51PM -0700, Roland Dreier wrote:
>     ardavis> Redhat EL 4.0, 64-bit
> 
> OK, I found a system with that distro installed, although I can't test
> the results of the build.  However, I built libmthca with the same
> CFLAGS that rpm seems to use, namely "-g -O2 -m64 -pipe".  I found
> that mthca_tavor_arm_cq() compiles to the following tiny fragment:
> 
> 0000000000001d10 <mthca_tavor_arm_cq>:
>     1d10:       48 8b 07                mov    (%rdi),%rax
>     1d13:       48 8b 90 a8 ef ff ff    mov    0xffffffffffffefa8(%rax),%rdx
>     1d1a:       48 8b 44 24 f8          mov    0xfffffffffffffff8(%rsp),%rax
>     1d1f:       48 89 42 20             mov    %rax,0x20(%rdx)
>     1d23:       31 c0                   xor    %eax,%eax
>     1d25:       c3                      retq
> 
> in other words, the compiler seems to be discarding all the
> assignments to doorbell[0] and doorbell[1].

doorbell[] is a local variable and mthca_write64() is static inline.
I don't see a problem with the assignments to doorbell getting
optimized out, since the scope of that variable is completely
visible to gcc. A smart compiler would just keep the values in
registers and eliminate the 32-bit stores to the stack.
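
To make that concrete, here is a minimal sketch of the pattern, not the
libmthca source. The names write64_sketch(), arm_cq_sketch() and
fake_doorbell_reg are made up for illustration. The stores to the local
array only have to survive if something later reads them in a way the
compiler can see; this version reads them back with memcpy(), which
keeps them live whether or not they ever touch the stack.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the device doorbell register; real code
 * would write to a mapped device page instead. */
static volatile uint64_t fake_doorbell_reg;

/* Combine two 32-bit words into a single 64-bit store. memcpy() reads
 * the array as bytes, so the compiler has to treat the earlier stores
 * to val[] as live. */
static inline void write64_sketch(const uint32_t val[2])
{
	uint64_t v;

	memcpy(&v, val, sizeof v);
	fake_doorbell_reg = v;
}

/* Hypothetical caller shaped like the arm_cq path: build the two
 * doorbell words in a local array, then ring the doorbell. */
static int arm_cq_sketch(uint32_t cmd, uint32_t cqn)
{
	uint32_t doorbell[2];

	doorbell[0] = cmd | cqn;
	doorbell[1] = 0xffffffff;

	write64_sketch(doorbell);
	return 0;
}
```

With -O2 I would expect gcc to build the 64-bit value in a register and
emit one store, which is the "use registers" case described above.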

I do see a problem with the "(notify == IB_CQ_SOLICITED ? ....)" test
getting optimized away. "notify" is a passed-in parameter (not a
constant) and the function is only invoked through an indirect function
call. I don't see how gcc could know what value notify will have and
optimize the test away.

Hrm...maybe the bug is that "notify" is somehow overloaded to a constant.
You'd have to look at the intermediate "-E" (preprocessed) output.


> I'm not sure if this is a compiler bug or what -- I need to
> investigate further.  In any case
> can you try the following patch to libmthca and see if it fixes
> things:
> 
> Index: src/cq.c
> ===================================================================
> --- src/cq.c	(revision 2156)
> +++ src/cq.c	(working copy)
> @@ -441,6 +441,8 @@ int mthca_tavor_arm_cq(struct ibv_cq *cq
>  			    to_mcq(cq)->cqn);
>  	doorbell[1] = 0xffffffff;
>  
> +	mb();
> +
>  	mthca_write64(doorbell, to_mctx(cq->context), MTHCA_CQ_DOORBELL);

I don't get how this fixes the problem.
mthca_write64() uses a spinlock and I thought that has to enforce
some sort of memory/instruction ordering already. I'm sketchy on
details and can't look it up right now.
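
For reference, a minimal sketch of where a full barrier sits relative to
the doorbell write, in the shape of the patch above. It assumes a C11
compiler with <stdatomic.h>. mb_sketch(), staged[] and fake_reg[] are
hypothetical stand-ins, not the libmthca definitions of mb() or the
doorbell register.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for mb(): a full fence that memory accesses
 * are not reordered across. */
#define mb_sketch() atomic_thread_fence(memory_order_seq_cst)

/* Doorbell words staged in ordinary memory. */
static uint32_t staged[2];

/* Hypothetical stand-in for the device doorbell register. */
static volatile uint32_t fake_reg[2];

static void ring_doorbell_sketch(uint32_t word0, uint32_t word1)
{
	staged[0] = word0;
	staged[1] = word1;

	/* The stores above are ordered before the register writes below. */
	mb_sketch();

	fake_reg[0] = staged[0];
	fake_reg[1] = staged[1];
}
```

A lock acquire/release pair normally implies ordering of this kind as
well, which is why I'd expect the explicit barrier to be redundant
whenever the write goes through the spinlock.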

hth,
grant


