[ewg] Need help for Infiniband optimisation for our cluster (MTU...)
Tom Elken
tom.elken at qlogic.com
Tue Dec 7 08:53:11 PST 2010
ibv_devinfo is perhaps a more commonly available command; it shows the HCA in use and the maximum and active IB MTU sizes.
e.g.
$ ibv_devinfo
hca_id: qib0
        transport:                      InfiniBand (0)
        fw_ver:                         0.0.0
        node_guid:                      0011:7500:005a:6b02
        sys_image_guid:                 0011:7500:005a:6b02
        vendor_id:                      0x1175
        vendor_part_id:                 29474
        hw_ver:                         0x1
        board_id:                       InfiniPath_QLE7340
        phys_port_cnt:                  1
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                4096 (5)
                        active_mtu:             2048 (4)
                        sm_lid:                 56
                        port_lid:               248
                        port_lmc:               0x00
                        link_layer:             IB
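(The number in parentheses after each MTU is the IB MTU enum value:
1 = 256, 2 = 512, 3 = 1024, 4 = 2048, 5 = 4096 bytes.)  To pull out just
the MTU lines:

$ ibv_devinfo | grep -i mtu
                        max_mtu:                4096 (5)
                        active_mtu:             2048 (4)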
-Tom
> -----Original Message-----
> From: ewg-bounces at lists.openfabrics.org [mailto:ewg-
> bounces at lists.openfabrics.org] On Behalf Of Mike Heinz
> Sent: Tuesday, December 07, 2010 8:24 AM
> To: Richard Croucher; 'giggzounet'; ewg at lists.openfabrics.org
> Subject: Re: [ewg] Need help for Infiniband optimisation for our
> cluster (MTU...)
>
> Richard - that's odd, I don't see an "ibv_portstat" command on my boxes
> - do you know what package provides it?
>
> -----Original Message-----
> From: ewg-bounces at lists.openfabrics.org [mailto:ewg-
> bounces at lists.openfabrics.org] On Behalf Of Richard Croucher
> Sent: Tuesday, December 07, 2010 11:17 AM
> To: 'giggzounet'; ewg at lists.openfabrics.org
> Subject: Re: [ewg] Need help for Infiniband optimisation for our
> cluster (MTU...)
>
> The InfiniBand standard allows an MTU of 4096 bytes, but the HCA you
> are using limits this to 2048. This can be set and queried through
> your SM management.
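>
> For example, with OpenSM the partition MTU can be capped in
> /etc/opensm/partitions.conf (a sketch, not a tested config; the mtu=
> value is the IB MTU enum, so 4 = 2048 and 5 = 4096):
>
> Default=0x7fff, ipoib, mtu=5 : ALL=full;
>
> Restart (or HUP) opensm after changing this so the new limit is swept
> out to the fabric.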
>
> On the server side, the ibv_portstat command will show the current MTU
> size.
>
> All the information you need is in the docs.
> There are also multiple parties, including myself, who offer training
> workshops for this stuff.
>
> Richard
>
>
> -----Original Message-----
> From: ewg-bounces at lists.openfabrics.org
> [mailto:ewg-bounces at lists.openfabrics.org] On Behalf Of giggzounet
> Sent: 07 December 2010 15:32
> To: ewg at lists.openfabrics.org
> Subject: Re: [ewg] Need help for Infiniband optimisation for our
> cluster
> (MTU...)
>
> Hi,
>
> Thanks for your answer!
>
> In particular the explanation of the difference between connected and
> datagram mode (I can see it with the IMB-MPI1 benchmarks)!
>
> The hardware we are using, in detail:
> - on the master: Mellanox MHGH18-XTC ConnectX with VPI adapter, single
> port 20Gb/s, PCIe 2.0 x8 2.5GT/s
> - on the nodes: integrated Mellanox DDR InfiniBand 20Gb/s ConnectX with
> QSFP connector.
>
> How can I find out the MTU size limit?
>
>
> Over the InfiniBand we are only running MPI, with different CFD
> programs, but always through MPI (Intel MPI or Open MPI). Should I
> use QoS?
>
> Thanks for your help!
>
>
> On 07/12/2010 16:09, Richard Croucher wrote:
> > Connected mode will provide more throughput, but datagram mode will
> > provide lower latency.
> > You don't say what HCAs you are using. Some of the optimizations for
> > connected mode are only available on the newer ConnectX QDR HCAs.
> >
> > Your HCA will probably limit the MTU size. Leave this as large as
> > possible.
> >
> > If you are only running a single application on the InfiniBand, you
> > need not bother with QoS. If you are running multiple applications,
> > then you do need to set this up. It is quite complex, since you need
> > to define VLs, their arbitration policies, and the SLs assigned to
> > them. This is described in the OpenSM docs, and it is relevant even
> > if you are using the embedded SM in the switch.
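> >
> > As a rough illustration only (the option names below come from
> > opensm.conf, but the VL arbitration and SL-to-VL values here are
> > placeholders, not a recommendation):
> >
> > # /etc/opensm/opensm.conf
> > qos TRUE
> > qos_max_vls 8
> > qos_high_limit 255
> > qos_vlarb_high 0:64
> > qos_vlarb_low 1:64,2:32
> > qos_sl2vl 0,1,2,3,4,5,6,7,0,1,2,3,4,5,6,7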
> >
> > As a newbie, take a look in the ../OFED/docs directory; everything
> > you need is probably there. Mellanox also have some useful docs on
> > their website.
> >
> > -----Original Message-----
> > From: ewg-bounces at lists.openfabrics.org
> > [mailto:ewg-bounces at lists.openfabrics.org] On Behalf Of giggzounet
> > Sent: 07 December 2010 14:01
> > To: ewg at lists.openfabrics.org
> > Subject: [ewg] Need help for Infiniband optimisation for our cluster
> > (MTU...)
> >
> > Hi,
> >
> > I'm new to this list. In our laboratory we have a small cluster:
> > - a master with 8 cores
> > - 8 nodes with 12 cores each
> > - a Mellanox MTS3600R DDR InfiniBand switch
> >
> > On these machines we have an OSCAR cluster with CentOS 5.5. We have
> > installed the OFED 1.5.1 packages. The default InfiniBand
> > configuration is used, so IPoIB is running in connected mode.
> >
> > Our cluster is used to solve CFD (Computational Fluid Dynamics)
> > problems. I'm trying to optimize the InfiniBand network, and so I
> > have several questions:
> >
> > - Is this the right mailing list to ask? (If not, where should I
> > post?)
> >
> > - Is there a how-to on InfiniBand optimisation?
> >
> > - CFD computations need a lot of bandwidth; there is a lot of data
> > exchange through MPI (we are using Intel MPI). Does the InfiniBand
> > mode (connected or datagram) have an influence in this case? What is
> > the "best" MTU for these computations?
> >
> >
> > Best regards,
> > Guillaume
> >
>
> _______________________________________________
> ewg mailing list
> ewg at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg