[ewg] Need help for Infiniband optimisation for our cluster (MTU...)

giggzounet giggzounet at gmail.com
Tue Dec 7 07:32:05 PST 2010


Hi,

Thanks for your answer!

In particular the explanation of the difference between connected and
datagram mode (I can see the effect with the IMB-MPI1 benchmarks)!
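For anyone following along, here is roughly how such a comparison can be run (a sketch: the interface name `ib0`, the hostfile and the process count are placeholders, and note that the sysfs knob only changes the IPoIB mode, not the verbs transport that MPI itself uses):

```shell
# Sketch: switch the IPoIB mode and re-run the Intel MPI Benchmarks.
# 'ib0', 'hosts' and '-np 16' are placeholders -- adjust for your cluster.

# On each node (as root), pick 'connected' or 'datagram':
echo connected > /sys/class/net/ib0/mode
cat /sys/class/net/ib0/mode    # verify the mode currently in effect

# Then run bandwidth-oriented IMB tests over MPI:
mpirun -np 16 -hostfile hosts ./IMB-MPI1 PingPong Sendrecv
```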

The hardware we are using, in detail:
- on the master: Mellanox MHGH18-XTC ConnectX with VPI adapter, single
port 20 Gb/s, PCIe 2.0 x8 2.5 GT/s
- on the nodes: integrated Mellanox DDR InfiniBand 20 Gb/s ConnectX with
QSFP connector.

How can I find out the MTU size limit?
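One way to check (a sketch, assuming the standard OFED/libibverbs utilities are installed; the interface name `ib0` is an assumption):

```shell
# Query the HCA's supported and currently active link MTU:
ibv_devinfo | grep -i mtu
# (look for max_mtu and active_mtu in the port section; the values
#  depend on the HCA and on what the subnet manager has negotiated)

# The IPoIB interface MTU, which is what TCP/IP traffic sees;
# connected mode allows a much larger value than datagram mode:
cat /sys/class/net/ib0/mtu
```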


On the InfiniBand we only run MPI with different CFD programs, but
always MPI (Intel MPI or Open MPI). Should I use QoS?
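If QoS does turn out to be necessary, the relevant knobs live in opensm.conf; a minimal sketch, using option names documented for OpenSM (the values here are purely illustrative, not a recommendation):

```
# Fragment of /etc/opensm/opensm.conf -- illustrative values only.
qos TRUE
qos_max_vls 8            # number of virtual lanes (VLs) to use
qos_high_limit 255       # credit limit for the high-priority table
qos_vlarb_high 0:64      # VL:weight pairs, high-priority arbitration
qos_vlarb_low 1:64,2:32  # VL:weight pairs, low-priority arbitration
qos_sl2vl 0,1,2,3,4,5,6,7,0,1,2,3,4,5,6,7   # SL -> VL mapping
```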

Thanks for your help!


On 07/12/2010 16:09, Richard Croucher wrote:
> Connected mode will provide more throughput but datagram mode will provide
> lower latency.  
> You don't say which HCAs you are using.  Some of the optimizations for
> connected mode are only available for the newer ConnectX QDR HCAs.
> 
> Your HCA will probably limit the MTU size.  Leave this as large as possible.
> 
> If you are only running a single application on the InfiniBand you need not
> bother with QoS.  If you are running multiple applications, then you do need
> to set it up.  This is quite complex, since you need to define VLs and their
> arbitration policies, and assign SLs to them.  This is described in the
> OpenSM docs.  It is relevant even if you are using the embedded SM in the
> switch.
> 
> As a newbie, take a look in the ../OFED/docs directory.
> Most of what you need is probably there. Mellanox also has some useful docs
> on their website.
> 
> -----Original Message-----
> From: ewg-bounces at lists.openfabrics.org
> [mailto:ewg-bounces at lists.openfabrics.org] On Behalf Of giggzounet
> Sent: 07 December 2010 14:01
> To: ewg at lists.openfabrics.org
> Subject: [ewg] Need help for Infiniband optimisation for our cluster
> (MTU...)
> 
> Hi,
> 
> I'm new on this list. We have a small cluster in our laboratory:
> - master 8 cores
> - 8 nodes with 12 cores
> - DDR infiniband switch Mellanox MTS3600R
> 
> On these machines we run an OSCAR cluster with CentOS 5.5. We have
> installed the OFED 1.5.1 packages. The default InfiniBand configuration
> is used, so InfiniBand is running in connected mode.
> 
> Our cluster is used to solve CFD (Computational Fluid Dynamics)
> problems. I'm trying to optimize the InfiniBand network, and I have
> several questions:
> 
> - Is this the right mailing list to ask? (If not, where should I post?)
> 
> - Is there a how-to on InfiniBand optimisation?
> 
> - CFD computations need a lot of bandwidth. There is a lot of data
> exchange through MPI (we are using Intel MPI). Does the InfiniBand mode
> (connected or datagram) have an influence in this case? What is the
> "best" MTU for these computations?
> 
> 
> Best regards,
> Guillaume
> 
> _______________________________________________
> ewg mailing list
> ewg at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
