[openib-general] OpenSM (again)

Roland Fehrenbacher rf at q-leap.de
Mon Apr 11 09:28:20 PDT 2005


Hi,

I got gen2 opensm running fine now (there was a problem with a wrong
include file), and managed to get IP running on a network of
currently 40 machines (final size will be 144). Performance is pretty
impressive (initial tests with a simple netpipe run): I got a latency of
18 microseconds, and a maximum throughput of approx. 400 MB/s at a packet
size of approx. 1 MB, which then levels off at about 340 MB/s for larger
packets.

One problem and two questions:

Problem: When I reboot all 40 nodes (apart from the one opensm is
running on), the network is non-functional (no pings go through, even
though the ports show status "Active") for quite a while (more than 10
minutes) after all the nodes have come up. It then recovers without
intervention. Is this normal? Single-node reboots don't affect
network operation. The osm log file is attached.

Question 1: Can I run opensm in a master/slave configuration? I noticed
that there is a priority command-line option, but I am not sure how to
use it.

Question 2: I plan to run the gen1/Mellanox IBGD drivers on the
compute nodes (which need fast MPI), and gen2 on the control/storage nodes
(which need only IP), with gen2 opensm running on the control nodes. Is there
any reason why this should not work reliably?

Roland

-------------- next part --------------
Name: osm-port1.log
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20050411/4d15ba5b/attachment.ksh>

