[Users] IB Partitioning

Jesper Larsen JLA at fcoo.dk
Tue Sep 18 01:24:06 PDT 2012


Hi David

Thanks for the help. I have now configured a subinterface (ib0.8001) and verified that it is used when I ping one of my nodes. Unfortunately, it only works with this opensm partition configuration:

#Default=0x7fff, rate=7, defmember=limited: ALL;
Default=0x7fff, ipoib, rate=7: ALL=full;
DevNet=0x0001, ipoib, rate=7, defmember=full: 0x0002c903004ef895, 0x78e7d1030023ffd5, 0x78e7d1030023fdfd, 0x78e7d10300239885, 0x78e7d1030023ff7d, 0x78e7d1030021ecad, 0x78e7d10300245fa5, 0x78e7d1030024578d;

It does not work when I use the commented-out line in place of the second line. I have checked the addresses in DevNet and they are correct. The pkeys also seem to check out (I assume the leading 8 instead of 0 is fine):

[root at dn003 ~]# cat /sys/class/net/ib0.8001/pkey
0x8001
[root at bifrost ~]# cat /sys/class/net/ib0.8001/pkey
0x8001
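
On the 8 versus 0 question: in a 16-bit P_Key the top bit is the membership bit (set means full member, clear means limited member) and the low 15 bits are the partition number. So 0x8001 reads as partition 0x0001 with full membership. A minimal Python sketch of that decoding follows; the function name decode_pkey is made up for illustration and is not part of any OFED tool.

```python
def decode_pkey(value):
    """Split a 16-bit P_Key into (partition_number, is_full_member).

    Accepts an int or a hex string such as the contents of
    /sys/class/net/ib0.8001/pkey ("0x8001").
    """
    pkey = int(value, 16) if isinstance(value, str) else int(value)
    if not 0 <= pkey <= 0xFFFF:
        raise ValueError("P_Key must fit in 16 bits")
    partition = pkey & 0x7FFF      # low 15 bits: partition number
    is_full = bool(pkey & 0x8000)  # top bit: membership (1 = full, 0 = limited)
    return partition, is_full


if __name__ == "__main__":
    for raw in ("0x8001", "0xffff", "0x7fff"):
        partition, is_full = decode_pkey(raw)
        kind = "full" if is_full else "limited"
        print(f"{raw}: partition 0x{partition:04x}, {kind} member")
```

Read this way, 0xffff and 0x7fff are the same default partition seen with full and limited membership respectively.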

My routing table is:

[root at bifrost ~]# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.128.0   *               255.255.255.0   U     0      0        0 ib0.8001
192.168.0.0     *               255.255.255.0   U     0      0        0 eth0
192.168.127.0   *               255.255.255.0   U     0      0        0 ib0
10.230.80.0     *               255.255.255.0   U     0      0        0 eth4
link-local      *               255.255.0.0     U     1002   0        0 eth0
link-local      *               255.255.0.0     U     1006   0        0 eth4
link-local      *               255.255.0.0     U     1013   0        0 ib0
link-local      *               255.255.0.0     U     1015   0        0 ib0.8001
default         10.230.80.3     0.0.0.0         UG    0      0        0 eth4
[root at bifrost ~]# ping 192.168.128.103
PING 192.168.128.103 (192.168.128.103) 56(84) bytes of data.
^C
--- 192.168.128.103 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2974ms

So it should use ib0.8001 for the ping command. Any hints on what I am doing wrong?
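
The expectation that ib0.8001 is selected can be checked by hand with a longest-prefix match over the table above. The sketch below is only an illustration under stated assumptions: the ROUTES list is transcribed from the route output (the link-local entries are left out), the name pick_interface is hypothetical, and metrics and policy routing are ignored. On the machine itself, `ip route get 192.168.128.103` asks the kernel directly.

```python
import ipaddress

# Transcribed from the `route` output above; link-local entries omitted.
ROUTES = [
    ("192.168.128.0/24", "ib0.8001"),
    ("192.168.0.0/24", "eth0"),
    ("192.168.127.0/24", "ib0"),
    ("10.230.80.0/24", "eth4"),
    ("0.0.0.0/0", "eth4"),  # default route via 10.230.80.3
]


def pick_interface(destination, routes=ROUTES):
    """Return the interface of the most specific route containing destination."""
    addr = ipaddress.ip_address(destination)
    best = None
    for prefix, iface in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, iface)
    return best[1] if best else None


if __name__ == "__main__":
    print(pick_interface("192.168.128.103"))
```

For 192.168.128.103 the only matches are the /24 on ib0.8001 and the default route, so the more specific ib0.8001 entry is chosen.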

I have also looked for tools that can show or sniff which pkey is actually used in a communication, but I have not been able to figure out what to use (I looked at smpdump and smpquery). Any pointers?

Best regards,
Jesper

On Fri, Sep 14, 2012 at 4:21 AM, Jesper Larsen <JLA at fcoo.dk> wrote:
Dear OFED users,

We have a cluster connected by a single IB switch which we want to split into two separate partitions (development/test and production). The partitions must not be able to talk to each other since we do not want errors happening on the development partition to be able to affect the production partition. I have therefore tried to make a partition configuration file for opensm:

/etc/opensm/partitions.conf

which at this point (just testing if I can make it work for a single non-default partition) looks something like this:

Default=0xffff, rate=7, defmember=limited: ALL;
DevNet=0x0001, ipoib, rate=7, defmember=full: 0x0002c903004ef895, 0x78e7d1030023ffd5, 0x78e7d1030023fdfd, 0x78e7d10300239885, 0x78e7d1030023ff7d, 0x78e7d1030021ecad, 0x78e7d10300245fa5, 0x78e7d1030024578d;

So I have made all the members of the default group limited so that they cannot talk to each other. The reason is that all nodes are members of the default group, and as far as I understand that cannot be changed. I have also made a new group containing only development nodes. But when I try to ping one development node from another (over ib0), it cannot find it (Destination Host Unreachable). And when I look at the two development nodes I see:

# cat /sys/class/net/ib0/pkey
0xffff

This will always show 0xffff since ib0 is always on the default partition.

Your 0x0001 partition will be used by ib0.8001 since IB partitions look like VLANs in the network interface world.

If you want to know what pkeys your HCA has, look at something like /sys/class/infiniband/*/ports/1/pkeys/*

Dave
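
The sysfs path mentioned above can also be walked from a script. The following Python sketch reads every pkey slot and prints the partition number and membership. It rests on several assumptions: the helper name read_pkey_table is hypothetical, slots containing 0x0000 are treated as unused, and the glob covers all ports rather than only port 1.

```python
import glob
import os


def read_pkey_table(root="/sys/class/infiniband"):
    """Return sorted (device, port, slot, partition, is_full) tuples.

    Walks <root>/<device>/ports/<port>/pkeys/<slot>; slots that read as
    0x0000 are skipped on the assumption that they are unused.
    """
    rows = []
    pattern = os.path.join(root, "*", "ports", "*", "pkeys", "*")
    for path in glob.glob(pattern):
        parts = path.split(os.sep)
        device, port, slot = parts[-5], int(parts[-3]), int(parts[-1])
        with open(path) as handle:
            pkey = int(handle.read().strip(), 16)
        if pkey == 0:
            continue
        # Low 15 bits: partition number; top bit: full membership.
        rows.append((device, port, slot, pkey & 0x7FFF, bool(pkey & 0x8000)))
    return sorted(rows)


if __name__ == "__main__":
    for device, port, slot, partition, is_full in read_pkey_table():
        kind = "full" if is_full else "limited"
        print(f"{device} port {port} slot {slot}: 0x{partition:04x} ({kind})")
```

On a node without InfiniBand hardware the glob matches nothing and the script prints nothing.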



Does this mean that it is only trying the default network? And if yes, how do I make it try the DevNet?

Best regards,
Jesper