[openib-general] [PATCH] OFED 1.1-rc3 is ready

Michael S. Tsirkin mst at mellanox.co.il
Thu Sep 14 22:03:50 PDT 2006


Well, it looks like the libipathverbs that went into the 1.1 branch was botched.
How come?
Please note that Mellanox, for one, is unable to test libipathverbs at all.
libipathverbs maintainers, please try to fix this by Sunday.
And please test your changes before you commit them.


Quoting r. Robert Walsh <rjwalsh at pathscale.com>:
Subject: Re: [openib-general] [PATCH] OFED 1.1-rc3 is ready

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Woodruff, Robert J wrote:
> Robert Walsh wrote, 
>> [woody at rkl-13 bin]$ ./ib_rdma_bw -n 10000 -t 1000 -s 2000000 rkl-12
>> 4730: | port=18515 | ib_port=1 | size=2000000 | tx_depth=1000 | iters=10000 | duplex=0 | cma=0 |
>> 4730: Local address:  LID 0x03, QPN 0x001d, PSN 0x9e070c RKey 0x2302400 VAddr 0x00002a95dd3480
>> 4730: Remote address: LID 0x04, QPN 0x001e, PSN 0x2bd6ba, RKey 0x2402500 VAddr 0x00002a95c85480
>> 4730:main: Completion with error at client:
>> 4730:main: Failed status 9: wr_id 3
>> 4730:main: scnt=7584, ccnt=6584
>> [woody at rkl-13 bin]$  
> 
>> Hi Woody,
> Robert Walsh wrote, 
>> When RC4 is available, there should be a patch in there that will fix
>> this.  Can you let us know if you continue to see problems?
> 
>> Regards,
>> Robert.
> 
> I installed RC5 and now it just hangs:
> 
> [woody at rkl-13 bin]$ ./ib_rdma_bw -n 10000 -t 1000 -s 2000000 rkl-12
> 4702: | port=18515 | ib_port=1 | size=2000000 | tx_depth=1000 | iters=10000 | duplex=0 | cma=0 |
> 4702: Local address:  LID 0x03, QPN 0x000d, PSN 0xf1b711 RKey 0x1101200 VAddr 0x00002a95dc8480
> 4702: Remote address: LID 0x04, QPN 0x000d, PSN 0xe62247, RKey 0x1101200 VAddr 0x00002a95c7c480
> It hangs here and I have to Ctrl-C the test.
> 
> 
> Intel MPI also fails with:
> # Barrier
> [1][rdma_iba.c:260] Intel MPI fatal error: DTO operation completed with error. status=0x8. cookie=0x514ee0
> rank 1 in job 4  rkl-13_32779   caused collective abort of all ranks
>   exit status of rank 1: killed by signal 9 

Hi Woody,

So, we built everything using RC5 plus the libipathverbs from subversion,
and we were able to run both ib_rdma_bw (with your arguments above) and
Intel MPI (a simple MPI hello world program) successfully.  I'm going to
continue testing with the Intel MPI testsuite and some ISV applications.

I'll keep you informed.

Regards,
 Robert.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iQEVAwUBRQoATfzvnpzTd9fxAQLUKQf9E1ps9XbbXplMm6+5O/XDdlWF0BQws1SC
L/aGygh34fZSkpGmCrfze3HhsaOqasu9gUOsJQ89jX6pKNkv4tJAxSJCr+n+bdG3
21Bqr9gcM0MbzrDvOcUDHqvnmC0THlCf0XhikjKg/FJR1e48BIiAOFUzfi0VvI36
G1ZtD8xZXydOfWq7Z4xvyf9Y3qNPIeSKR2JZGJQoGHjxY4+vcteK0UVHfic1Bgpy
9uql47af6tncN+CazYcwf8xnHegiDr34iEEre5wUz//Qy62j8JNPnxhit0W9lXij
zFszTkOHQeibxbFWi9ZRyigTmHanxxRUuznW54NL8NIF30jhnmcksQ==
=06gu
-----END PGP SIGNATURE-----


-- 
MST



