[ofa-general] atomic operations on ppc64
Rui Machado
ruimario at gmail.com
Wed Sep 17 10:14:37 PDT 2008
From: Rui Machado <ruimario at gmail.com>
Date: 2008/9/17
Subject: Re: [ofa-general] atomic operations on ppc64
To: Dotan Barak <dotanba at gmail.com>
2008/9/17 Dotan Barak <dotanba at gmail.com>:
> On Wed, Sep 17, 2008 at 5:54 PM, Rui Machado <ruimario at gmail.com> wrote:
>> 2008/9/17 Dotan Barak <dotanba at gmail.com>:
>>> On Wed, Sep 17, 2008 at 5:44 PM, Rui Machado <ruimario at gmail.com> wrote:
>>>> 2008/9/17 Dotan Barak <dotanba at gmail.com>:
>>>>> On Wed, Sep 17, 2008 at 5:28 PM, Rui Machado <ruimario at gmail.com> wrote:
>>>>>> Hey Dotan,
>>>>>>
>>>>>> 2008/9/17 Dotan Barak <dotanba at gmail.com>:
>>>>>>> On Wed, Sep 17, 2008 at 5:12 PM, Rui Machado <ruimario at gmail.com> wrote:
>>>>>>>> Hi list,
>>>>>>>>
>>>>>>>> Has anyone experienced problems using IB atomic operations
>>>>>>>> (fetch and add) on a ppc64 platform?
>>>>>>>> I tried a small example (using fetch and add) on x86 and ppc64; it
>>>>>>>> worked fine on x86 but not on ppc64.
>>>>>>>
>>>>>>> Do you handle the ntoh/hton or do you let the driver/HCA deal with it by itself?
>>>>>>
>>>>>> Nope, I don't use those. I guess I'm letting the driver/HCA deal with it, then...
>>>>>
>>>>> Do you see endianness issues or completely corrupted data?
>>>>>
>>>>
>>>> Just to make it clear (to me :) ): I'm talking about ppc64<-->ppc64
>>>> communication.
>>>> Should I still be concerned with converting data because of endianness?
>>>> What happens is that I ask for a fetch and add and it doesn't happen.
>>>> The value on the server doesn't get modified.
>>>
>>> This is weird behaviour indeed...
>>>
>>> Can you post the code in your program that fills the SR?
>>>
>>> Dotan
>>>
>>
>> Not sure what you mean by SR.
>> Here is the function inc(), which I call to increment by 1 on the
>> remote machine. The remote machine has its buffer full of zeroes.
>> That's what the client gets all the time, although I increment 3 times
>> in a row (with a sleep in between).
>>
>> Is this enough?
>> Thanks for the help
>>
>> void inc()
>> {
>>     struct ibv_qp_attr check_attr;
>>     struct ibv_qp_init_attr check_init_attr;
>>
>>     struct ibv_send_wr *bad_wr;
>>     struct ibv_wc wc;
>>     struct ibv_sge slist;
>>     struct ibv_send_wr swr3;
>>
>>     // local 8-byte buffer that receives the value fetched from the remote side
>>     slist.addr = (uintptr_t)buffer;
>>     slist.length = 8;
>>     slist.lkey = mr->lkey;
>>
>>     // remote target of the atomic and the amount to add
>>     swr3.wr.atomic.remote_addr = remote_node->mi.bufAddr;
>>     swr3.wr.atomic.rkey = remote_node->mi.buf_rkey;
>>     swr3.wr.atomic.compare_add = 1;
>>
>>     swr3.wr_id = 1;
>>     swr3.sg_list = &slist;
>>     swr3.num_sge = 1;
>>     swr3.opcode = IBV_WR_ATOMIC_FETCH_AND_ADD;
>>     swr3.send_flags = IBV_SEND_SIGNALED;
>>     swr3.next = NULL;
>>
>>     if (ibv_post_send(qp, &swr3, &bad_wr)) {
>>         printf("Couldn't post send...\n");
>>         return;
>>     }
>>
>>     // busy-poll until one completion arrives or polling fails
>>     int ne = 0;
>>     do {
>>         ne = ibv_poll_cq(cq, 1, &wc);
>>     } while (ne == 0);
>>
>>     if ((ne < 0) || (wc.status != IBV_WC_SUCCESS)) {
>>         // check qp status
>>         if (!ibv_query_qp(qp, &check_attr, IBV_QP_STATE, &check_init_attr))
>>             printf("The qp state is: %d\n", check_attr.qp_state);
>>     }
>> }
>>
>
> The code looks good and it should work...
> (I would have memset every structure before using it ..)
>
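The memset suggestion above can be sketched as follows. To keep the
example compilable without libibverbs, it uses a made-up stand-in struct
(demo_wr) and a made-up function (demo_fill_fetch_add); neither is the
real ibv_send_wr layout. The point is only the pattern: zero everything
first, then set the fields the request needs, so that anything left
unset holds zero rather than stack garbage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a work-request struct.
 * This is NOT the libibverbs definition of struct ibv_send_wr. */
struct demo_wr {
    uint64_t        wr_id;
    struct demo_wr *next;
    int             num_sge;
    unsigned int    send_flags;
    uint64_t        remote_addr;
    uint64_t        compare_add;
    uint64_t        swap;        /* not used by a fetch-and-add request */
    uint32_t        rkey;
};

/* Zero the whole struct, then fill in only what a fetch-and-add needs. */
void demo_fill_fetch_add(struct demo_wr *wr, uint64_t remote_addr,
                         uint32_t rkey, uint64_t add)
{
    assert(wr != NULL);

    memset(wr, 0, sizeof *wr);   /* every field starts out as zero */
    wr->wr_id       = 1;
    wr->num_sge     = 1;
    wr->remote_addr = remote_addr;
    wr->rkey        = rkey;
    wr->compare_add = add;
}
```

With the real structures the same idea would be a memset() of the
ibv_send_wr and ibv_sge variables at the top of inc(), before any field
is assigned.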
>
> Did you check the memory on the sender side or on the receiver side?
>
As I mentioned it does work on x86.
Actually on both:
server:
Initial counter at buffer is 0
counter at buffer is 0
counter at buffer is 0
counter at buffer is 0
counter at buffer is 0
counter at buffer is 0
counter at buffer is 0
counter at buffer is 0
client:
initial IB atomic counter 0
IB atomic counter 0
IB atomic counter 0
IB atomic counter 0
What could this be related to? Driver, HW?
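
On the endianness question from earlier in the thread: a byte swap
cannot turn a nonzero counter into zero, so an all-zero readout like the
one above points away from a pure byte-order problem. A minimal sketch
of that check follows. The helper names load_be64 and load_host64 are
made up for illustration and are not part of libibverbs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 8 bytes as a big-endian 64-bit integer,
 * independent of the host's byte order. */
uint64_t load_be64(const void *p)
{
    const unsigned char *b = p;
    uint64_t v = 0;

    assert(p != NULL);
    for (int i = 0; i < 8; i++)
        v = (v << 8) | b[i];     /* most significant byte comes first */
    return v;
}

/* Hypothetical helper: read the same 8 bytes in host byte order. */
uint64_t load_host64(const void *p)
{
    uint64_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof v);     /* no conversion, native layout */
    return v;
}
```

On a big-endian host such as ppc64 the two helpers agree for every
input. On little-endian x86 they differ for most nonzero values, but
both still return 0 for an all-zero buffer.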