Re: [ofa-general] atomic operations on ppc64
Dotan Barak
dotanba at gmail.com
Thu Sep 25 16:11:10 PDT 2008
Rui Machado wrote:
> 2008/9/25 Ronni Zimmermann <ronniz at mellanox.co.il>:
>
>> Rui Machado wrote:
>>
>>>> 2008/9/22 Ronni Zimmermann <ronniz at mellanox.co.il>:
>>>>
>>>>> Hi,
>>>>> We run tests which use atomic operations (both fetch and
>>>>> add and comp and swap) on PPC64 all the time, without
>>>>> experiencing any problem.
>>>>> Just to make sure I ran few simple tests, which use atomic
>>>>> operations, on our PPC64 machines, both with SLES10 SP1 and
>>>>> with RHAS5.1, and all of them passed.
>>>>> I was working with the latest OFED1.4 driver and mlx4 HCA
>>>>> with the latest released FW and with FW 2.3.000 (on the
>>>>> SLES10 SP1 machine).
>>>>> Given the above information I believe that there's either
>>>>> a problem with your code (although looking at the code you
>>>>> posted I couldn't see anything wrong) or it's an OFED1.2.5
>>>>> issue, as Dotan suggested.
>>>
>>> OK thanks for the feedback. We have ppc64 machines with mlx4
>>> and mthca0 (from ibv_devinfo). Both don't work. Any
>>> experience with the mthca0? It is older and should be better
>>> supported on 1.2.5, shouldn't it?
>>> My priority is the machines with the mlx4 but of course I
>>> would like to see both working.
>>>
>>>
>> Sorry, I have no experience with mthca0 on PPC64 machines.
>> It is indeed an older HCA, but I don't know whether or not it's working
>> properly on PPC64 with ofed 1.2.5.
>>
>>
>>> I also tried with a 2.6.26.2 kernel (had it at hand) with the same
>>> ofed1.2.5 installation and still see the problem.
>>> I guess my last and longest try is to install the whole ofed 1.4 package.
>>>
>>>
>> Please bear in mind that OFED 1.4 is RC2 and will probably be GA by the
>> end of October.
>> If installing a new driver on your machine is a big problem for you, and
>> you don't need the new features supported by ofed 1.4 and not by ofed
>> 1.3.1, maybe it'll be better for you to install ofed 1.3.1, which is
>> already GA.
>>
>>
>
> Actually I just tried with ofed 1.4 and still see the problem :(
> I think I installed it correctly with a 2.6.26.2 kernel although I see
> the warning:
> libibverbs: Warning: couldn't load driver 'mthca': libmthca-rdmav2.so:
> cannot open shared object file: No such file or directory
> A small example using RDMA read is working.
>
> I just wanted to see if the problem exists with 1.4 even if it is a
> RC. Probably I will install 1.3.1 when I solve this problem. And I
> really need to solve it!
>
>
The problem that you describe is pretty basic, and even an RC shouldn't
have this issue.
I think that you should upgrade the HCA's firmware, as Ronni suggested.
I also have a feeling that the problem is in your code:
you should access the buffer that the HCA reads/writes as volatile, to
tell the compiler that this memory can be modified by other components,
so that it doesn't optimize away your reads and actually loads the data
from memory each time.
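To illustrate what I mean, here is a minimal sketch in C. The helper names
read_hca_u64 and wait_for_change are hypothetical (they are not part of
libibverbs), and I am assuming the atomic result lands in an 8-byte-aligned
64-bit slot inside your registered buffer. The only point is that every read
of that slot goes through a volatile-qualified pointer:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a libibverbs call): read the 64-bit slot that
 * the HCA writes. The volatile qualifier forces a real load on every
 * call, so the compiler cannot reuse a value it fetched earlier.
 * Assumption: buf is 8-byte aligned and points at a 64-bit value. */
static uint64_t read_hca_u64(const void *buf)
{
    assert(buf != NULL);
    return *(const volatile uint64_t *)buf;
}

/* Hypothetical helper: re-read the slot until it differs from `old`,
 * giving up after `max_spins` reads. Returns 1 and stores the new value
 * in *out if a change was seen, 0 otherwise. */
static int wait_for_change(const void *buf, uint64_t old,
                           unsigned long max_spins, uint64_t *out)
{
    assert(out != NULL);
    for (unsigned long i = 0; i < max_spins; i++) {
        uint64_t now = read_hca_u64(buf);
        if (now != old) {
            *out = now;
            return 1;
        }
    }
    return 0;
}
```

In your test you would call read_hca_u64() on the local result buffer only
after the completion for the atomic work request has been polled from the CQ
(ibv_poll_cq); volatile only makes sure the read really happens, it does not
replace waiting for the completion.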
Dotan