[openfabrics-ewg] rc3 installation issues.

Scott Weitzenkamp (sweitzen) sweitzen at cisco.com
Wed May 3 14:23:49 PDT 2006


I was using Intel MPI incorrectly: I was running the two procs on the
same node (I ran mpdboot incorrectly).
 
I am now getting 4.61 usec latency, with Cheetah 1.0.800 firmware.  I
tried with -genv I_MPI_USE_RENEDEZVOUS_RDMA_WRITE 1, too, but that
didn't help.
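
A minimal sketch of a check for the placement mistake described above, where both procs end up on one node. It assumes each rank's hostname has already been collected somehow (for example, by having every rank print it); the function name and the placement dictionaries are hypothetical, and the host names are the ones from the test logs quoted later in this thread.

```python
from collections import defaultdict

def colocated_ranks(rank_hosts):
    """Return {hostname: [ranks]} for every host running more than one rank."""
    by_host = defaultdict(list)
    for rank, host in sorted(rank_hosts.items()):
        by_host[host].append(rank)
    return {host: ranks for host, ranks in by_host.items() if len(ranks) > 1}

# Hypothetical placements: both ranks on one node vs. one rank per node.
bad = {0: "svbu-qa1850-1", 1: "svbu-qa1850-1"}
good = {0: "svbu-qa1850-1", 1: "svbu-qa1850-2"}

print(colocated_ranks(bad))   # hosts that carry more than one rank
print(colocated_ranks(good))  # empty when every rank has its own host
```

An empty result means each rank has its own node, which is what a two-node latency test needs.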
 
Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems
 


________________________________

	From: Davis, Arlin R [mailto:arlin.r.davis at intel.com] 
	Sent: Wednesday, May 03, 2006 2:10 PM
	To: Scott Weitzenkamp (sweitzen); Woodruff, Robert J; Aviram
Gutman; Tziporet Koren; Amit Mehrotra (amehrotr);
openfabrics-ewg at openib.org
	Cc: Alexander Smirnov
	Subject: RE: [openfabrics-ewg] rc3 installation issues.
	
	

	Scott,

	 

	What is your HCA firmware revision? We saw some RDMA read
performance issues with older firmware.

	 

	You can try " -genv I_MPI_USE_RENEDEZVOUS_RDMA_WRITE 1" to use
rdma writes instead of reads.

	 

	-arlin

	 

	
________________________________


	From: Scott Weitzenkamp (sweitzen) [mailto:sweitzen at cisco.com] 
	Sent: Monday, May 01, 2006 12:15 PM
	To: Davis, Arlin R; Woodruff, Robert J; Aviram Gutman; Tziporet
Koren; Amit Mehrotra (amehrotr); openfabrics-ewg at openib.org
	Cc: Alexander Smirnov
	Subject: RE: [openfabrics-ewg] rc3 installation issues.

	 

	We are using Intel MPI 2.0 (plan to upgrade to 2.0.1 soon) with
"mpiexec -genv I_MPI_DEBUG 3 -genv I_MPI_DEVICE rdma".  Are there any
tunable parameters we should be setting?
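
For reference, the command line quoted above could be assembled programmatically along these lines. This is only a sketch: the helper name, the "-n 2" process count, and the "./osu_latency.x" program path are assumptions made for illustration; only the two -genv settings come from the message itself.

```python
import shlex

def build_mpiexec(program, nprocs=2, genv=None):
    """Build an mpiexec argv list with global -genv settings placed first."""
    argv = ["mpiexec"]
    for name, value in (genv or {}).items():
        argv += ["-genv", name, str(value)]
    # "-n" is the standard MPI process-count option; 2 is an assumed value.
    argv += ["-n", str(nprocs), program]
    return argv

# "./osu_latency.x" is a hypothetical path to the benchmark binary.
cmd = build_mpiexec(
    "./osu_latency.x",
    genv={"I_MPI_DEBUG": 3, "I_MPI_DEVICE": "rdma"},
)
print(shlex.join(cmd))
```

Keeping the settings in a dictionary makes it easy to add or drop a tunable between runs without retyping the whole command.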

	 

	###TEST-D: MPI home dir is /data/software/qa/MPI/intel_mpi/intelmpi-2.0-`uname -m`.

	 

	###TEST-D: Intel-MPI uDAPL osu_latency test on hosts svbu-qa1850-1 svbu-qa1850-2
	# OSU MPI Latency Test (Version 2.1)
	# Size          Latency (us)
	0               6.89
	1               7.25
	2               7.43
	4               7.38
	8               7.25
	16              7.62
	32              7.45
	64              7.86
	128             8.46
	256             9.53
	512             6.55
	1024            7.29
	2048            8.57
	4096            11.46
	8192            17.19
	16384           29.26
	32768           52.75
	65536           82.67
	131072          159.05
	262144          282.31
	524288          527.08
	1048576         1019.57
	2097152         1980.07
	4194304         3921.89
	###TEST-D: Intel-MPI uDAPL osu_bw test on hosts svbu-qa1850-1 svbu-qa1850-2
	# OSU MPI Bandwidth Test (Version 2.1)
	# Size          Bandwidth (MB/s)
	1               0.426379
	2               0.852988
	4               1.711797
	8               3.370452
	16              6.725704
	32              13.475464
	64              25.936968
	128             48.790506
	256             93.297867
	512             170.188285
	1024            290.329228
	2048            452.958568
	4096            579.709399
	8192            627.702319
	16384           643.040963
	32768           652.009193
	65536           861.314972
	131072          995.337153
	262144          1112.127322
	524288          1184.367387
	1048576         1223.328478
	2097152         1245.594018
	4194304         1256.059934
	###TEST-D: Intel-MPI uDAPL osu_bibw test on hosts svbu-qa1850-1 svbu-qa1850-2
	# OSU MPI Bidirectional Bandwidth Test (Version 2.1)
	# Size          Bi-Bandwidth (MB/s)
	1               0.531761
	2               1.063121
	4               2.147913
	8               4.280755
	16              8.523452
	32              16.981693
	64              33.874987
	128             64.956544
	256             122.497886
	512             226.783855
	1024            367.798527
	2048            531.763593
	4096            613.404163
	8192            638.258306
	16384           645.862115
	32768           955.762031
	65536           1089.402315
	131072          1134.346706
	262144          1191.923405
	524288          1227.324266
	1048576         1245.633844
	2097152         1254.414692
	4194304         1261.847862
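
The tables above are easy to post-process. The sketch below parses OSU-style output and derives the transfer rate implied by the largest latency row. The parser is a hypothetical helper, not part of the OSU benchmarks, and it assumes 1 MB = 10^6 bytes, so that bytes per microsecond reads directly as MB/s.

```python
def parse_osu(text):
    """Parse OSU benchmark output into {message_size_bytes: value}."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep only "<integer size> <number>" rows; skip headers and log lines.
        if len(fields) != 2 or not fields[0].isdigit():
            continue
        try:
            rows[int(fields[0])] = float(fields[1])
        except ValueError:
            continue
    return rows

# A few rows copied from the osu_latency table in this message.
sample = """\
###TEST-D: Intel-MPI uDAPL osu_latency test on hosts svbu-qa1850-1 svbu-qa1850-2
# OSU MPI Latency Test (Version 2.1)
# Size          Latency (us)
0               6.89
1               7.25
4194304         3921.89
"""

lat = parse_osu(sample)
size = max(lat)
# Bytes per microsecond is numerically MB/s when 1 MB is taken as 10^6 bytes.
print(f"{size} bytes in {lat[size]} us -> {size / lat[size]:.1f} MB/s one-way")
```

The implied rate comes out below the osu_bw figure for the same size, which is expected: the latency test sends one message at a time, while the bandwidth test keeps several in flight.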

	 

	Scott Weitzenkamp

	SQA and Release Manager

	Server Virtualization Business Unit

	Cisco Systems

	 

		 

		
________________________________


		From: Davis, Arlin R [mailto:arlin.r.davis at intel.com] 
		Sent: Monday, May 01, 2006 11:29 AM
		To: Scott Weitzenkamp (sweitzen); Woodruff, Robert J;
Aviram Gutman; Tziporet Koren; Amit Mehrotra (amehrotr);
openfabrics-ewg at openib.org
		Cc: Alexander Smirnov
		Subject: RE: [openfabrics-ewg] rc3 installation issues.

		Scott,

		 

		Thanks for testing Intel MPI on OFED 1.0 rc3. Yes, the
socket CM uDAPL provider will work fine without uCMA and IPoIB
installed. Any idea why your latency is so high? It should be around
3-4us.

		 

		-arlin

		 

		
________________________________


		From: Scott Weitzenkamp (sweitzen)
[mailto:sweitzen at cisco.com] 
		Sent: Sunday, April 30, 2006 10:32 PM
		To: Woodruff, Robert J; Aviram Gutman; Tziporet Koren;
Amit Mehrotra (amehrotr); openfabrics-ewg at openib.org
		Cc: Alexander Smirnov; Davis, Arlin R
		Subject: RE: [openfabrics-ewg] rc3 installation issues.

		 

		Intel MPI 2.0 does work with OFED 1.0 rc3 on RHEL4 U3
despite the lack of uCMA. Here's example output on x86_64 w/PCI-E:

		 

		.../osu_latency.x
		CMA: unable to open /dev/infiniband/rdma_cm
		CMA: unable to open /dev/infiniband/rdma_cm
		CMA: unable to open /dev/infiniband/rdma_cm
		CMA: unable to open /dev/infiniband/rdma_cm
		I_MPI: [0] set_up_devices(): will use device: libmpi.rdma.so
		I_MPI: [0] set_up_devices(): will use DAPL provider: OpenIB-scm1
		CMA: unable to open /dev/infiniband/rdma_cm
		CMA: unable to open /dev/infiniband/rdma_cm
		CMA: unable to open /dev/infiniband/rdma_cm
		CMA: unable to open /dev/infiniband/rdma_cm
		I_MPI: [0] set_up_devices(): will use device: libmpi.rdma.so
		I_MPI: [0] set_up_devices(): will use DAPL provider: OpenIB-scm1
		# OSU MPI Latency Test (Version 2.1)
		# Size          Latency (us) 
		0               7.86
		1               7.53
		2               7.37
		4               7.43
		8               7.92
		16              7.29
		32              7.36
		64              7.50

		 

		Scott Weitzenkamp

		SQA and Release Manager

		Server Virtualization Business Unit

		Cisco Systems

		 

			 

			
________________________________


			From: openfabrics-ewg-bounces at openib.org
[mailto:openfabrics-ewg-bounces at openib.org] On Behalf Of Woodruff,
Robert J
			Sent: Tuesday, April 25, 2006 9:16 AM
			To: Aviram Gutman; Tziporet Koren; Amit Mehrotra
(amehrotr); openfabrics-ewg at openib.org
			Cc: Alexander Smirnov; Davis, Arlin R
			Subject: RE: [openfabrics-ewg] rc3 installation
issues.

			Intel MPI needs a stable uDAPL and supporting
usermode verbs/uCMA and such.

			If you have that, then Intel MPI should work. If
you want to get a copy of Intel MPI to use in your testing of OFED, I
can put you in contact with our Intel MPI lead and he can provide you
with the code and a license.

			 

			
________________________________


			From: Aviram Gutman
[mailto:aviram at mellanox.co.il] 
			Sent: Tuesday, April 25, 2006 9:09 AM
			To: Woodruff, Robert J; Tziporet Koren; Amit
Mehrotra (amehrotr); openfabrics-ewg at openib.org
			Cc: Alexander Smirnov; Davis, Arlin R
			Subject: RE: [openfabrics-ewg] rc3 installation
issues.

			Please work with Vlad to make sure you have what
you need in rc4.

			 

			Regards,

			   Aviram

			 

			 

			
________________________________


			From: Woodruff, Robert J
[mailto:robert.j.woodruff at intel.com] 
			Sent: Tuesday, April 25, 2006 6:56 PM
			To: Aviram Gutman; Tziporet Koren; Amit Mehrotra
(amehrotr); openfabrics-ewg at openib.org
			Cc: Alexander Smirnov; Davis, Arlin R
			Subject: RE: [openfabrics-ewg] rc3 installation
issues.

			Not yet. Last I heard, there was no uDAPL in
OFED, which is needed by Intel MPI. Maybe RC4.

			 

			 

			 Tziporet wrote,

			> Regarding uDAPL and RHEL4 (up2 & 3): Since we
have not completed the uCMA backport for kernel 2.6.9 we eliminated
uDAPL  

			 

			
________________________________


			From: Aviram Gutman
[mailto:aviram at mellanox.co.il] 
			Sent: Tuesday, April 25, 2006 8:48 AM
			To: Woodruff, Robert J; Tziporet Koren; Amit
Mehrotra (amehrotr); openfabrics-ewg at openib.org
			Cc: Alexander Smirnov; Davis, Arlin R
			Subject: RE: [openfabrics-ewg] rc3 installation
issues.

			Woody,

			 

			Are you testing Intel MPI on top of OFED?

			 

			Regards,

			   Aviram

			 

			 

			
________________________________


			From: openfabrics-ewg-bounces at openib.org
[mailto:openfabrics-ewg-bounces at openib.org] On Behalf Of Woodruff,
Robert J
			Sent: Monday, April 24, 2006 11:33 PM
			To: Tziporet Koren; Amit Mehrotra (amehrotr);
openfabrics-ewg at openib.org
			Cc: Alexander Smirnov; Davis, Arlin R
			Subject: RE: [openfabrics-ewg] rc3 installation
issues.

			 

			The latest patches I have are for SVN 6509 on the
trunk.

			https://openib.org/svn/gen2/branches/backport-to-2.6.9/

			infiniband-backport-svn6509-to-2.6.9-34.EL-kernel-fixups-00.diff
			infiniband-backport-svn6509-to-2.6.9-34.EL-kernel-fixups-01.diff
			infiniband-backport-svn6509-to-2.6.9-openib-drivers-02.diff
			infiniband-backport-svn6509-to-2.6.9-openib-fixups-03.diff

			I don't split the patches by component, but
rather have one to strip out the old IB

			code in RedHat EL4.0, one to apply needed kernel
changes to the RedHat kernel,

			one for the openib drivers, and one for openib
fixups that are needed.

			 

			 

			 

			 

			 

			
________________________________


			From: Tziporet Koren
[mailto:tziporet at mellanox.co.il] 
			Sent: Monday, April 24, 2006 11:43 AM
			To: Woodruff, Robert J; Amit Mehrotra
(amehrotr); openfabrics-ewg at openib.org
			Cc: Alexander Smirnov; Davis, Arlin R
			Subject: RE: [openfabrics-ewg] rc3 installation
issues.

			We plan to do a backport of uCMA for RC4.

			Can you point me to your backport of uCMA for
2.6.9-34EL?

			 

			Thanks,

			Tziporet

			 

			-----Original Message-----
			From: Bob Woodruff
[mailto:robert.j.woodruff at intel.com] 
			Sent: Monday, April 24, 2006 6:56 PM
			To: Tziporet Koren; Amit Mehrotra (amehrotr);
openfabrics-ewg at openib.org
			Cc: Alexander Smirnov; Davis, Arlin R
			Subject: RE: [openfabrics-ewg] rc3 installation
issues.

			 

			 Tziporet wrote,

			> Regarding uDAPL and RHEL4 (up2 & 3): Since we
have not completed the uCMA backport for kernel 2.6.9 we eliminated
uDAPL  

			>  installation in case of this kernel. We also
indicated this in the package limitation mail. 

			 

			Note that there are two uDAPL providers for
OpenIB. One uses standard sockets for connections (socket CM)

			and the other uses CMA. I would suggest that you
should include the socket CM uDAPL provider even

			if you do not have a backport for uCMA.
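
The fallback suggested above can be illustrated with a short sketch: prefer the CMA-based provider when the rdma_cm device node is present, otherwise use the socket CM one. The device path and the "OpenIB-scm1" name appear in log output elsewhere in this thread; "OpenIB-cma" is an assumed name for the CMA provider entry, and the function itself is hypothetical. Real provider selection is done through the uDAPL and MPI configuration, not through code like this.

```python
import os

# Device node that the "CMA: unable to open" log lines in this thread refer to.
RDMA_CM_DEVICE = "/dev/infiniband/rdma_cm"

def pick_provider(device=RDMA_CM_DEVICE, exists=os.path.exists):
    """Choose a uDAPL provider name based on whether the uCMA device exists.

    "OpenIB-cma" is an assumed provider name; "OpenIB-scm1" is from the logs.
    """
    return "OpenIB-cma" if exists(device) else "OpenIB-scm1"

print(pick_provider())
```

The existence check is passed in as a parameter so the logic can be exercised on a machine without InfiniBand hardware.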

			BTW, my backport patches contain a backport of
uCMA for 2.6.9-34-EL if you need an example.

			 

			woody

			 

			 
