[ofa-general] Infiniband back-to-back without OpenSM?

Talpey, Thomas Thomas.Talpey at netapp.com
Wed May 28 06:24:21 PDT 2008


At 09:03 AM 5/28/2008, Hal Rosenstock wrote:
>On Wed, 2008-05-28 at 08:56 -0400, Talpey, Thomas wrote:
>> At 08:39 AM 5/28/2008, Hal Rosenstock wrote:
>> >Tom,
>> >
>> >On Wed, 2008-05-28 at 08:06 -0400, Talpey, Thomas wrote:
>> >> Is it possible to manually configure two Infiniband ports to operate
>> >> with one another in back-to-back mode, without running OpenSM
>> >> on one of them?
>> >
>> >This is possible but something would need to do at least some subset of
>> >what the SM does depending on the precise requirements and the limits
>> >placed on the environment supported without a "full blown" SM.
>> 
>> Okay ... but IMO the only thing we need is a LID. Or at least, in my experience
>> all I've needed is a LID.
>
>The port also needs to be walked from init to active which takes
>coordination at both ends of the b2b link.

Yep. But, it has all it needs with a LID, right? No messages need to be
exchanged, for instance.
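
A minimal sketch of checking where a port currently stands, by reading its LID and logical state from sysfs. The ports/<n>/lid and ports/<n>/state files are what the kernel IB core exposes; the device name mthca0, port number 1, and the helper name show_port are illustrative assumptions. It only reads; it does not move the port out of init.

```shell
#!/bin/sh
# show_port BASE DEVICE PORT
# Print the LID and logical state of one IB port by reading sysfs.
# BASE is normally /sys/class/infiniband; it is a parameter only so the
# function can be tried against a scratch directory.
# (show_port is a hypothetical helper name, not an existing tool.)
show_port() {
    base=$1 dev=$2 port=$3
    dir="$base/$dev/ports/$port"
    if [ ! -r "$dir/lid" ] || [ ! -r "$dir/state" ]; then
        echo "no such port: $dev/$port" >&2
        return 1
    fi
    echo "lid=$(cat "$dir/lid") state=$(cat "$dir/state")"
}

# Example (device name and port are assumptions):
#   show_port /sys/class/infiniband mthca0 1
```

A port that has a link but no SM would typically be expected to report an unassigned LID and a state short of active.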

>
>> In a previous effort, we simply stole the low octet of an IP address, so we'd
>> "ifconfig ib0 1.2.3.X" and it would jam lid=X into the interface. Worked great.
>> If necessary, we would set a manual arp entry (using iproute) to avoid having
>> to broadcast.
>
>That could be done if that is what is desired and can be relied upon
>(that ib0 is configured and we only care about the first port).
>
>Is it just ARP support that is needed ?

Well, ARP is the precursor to establishing an IP send and a TCP connection,
which we need to do also. But, if the resulting ipaddr-hwaddr mapping is
installed, then ARP is unnecessary and the IP layer can send without using it.

When we did this before, we'd install a "permanent" ARP entry, in a two-line
shell script. Roughly, for peers configuring lids X and Y, it would do

peer X:
	ifconfig ib0 1.2.3.X
	# lladdr is Y's link address (i.e. Y's guid)
	ip neigh add 1.2.3.Y lladdr a.b.c.d.e.f....Y nud permanent dev ib0

peer Y:
	ifconfig ib0 1.2.3.Y
	# lladdr is X's link address
	ip neigh add 1.2.3.X lladdr a.b.c.d.e.f....X nud permanent dev ib0

And we'd be up and running for both IP and RDMA connections. We fixed a
bug in the old iproute2 command to allow the long IB link addresses.
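
The two-line script above could be generalized roughly as follows. This is a sketch under assumptions, not the script we actually used: the helper names low_octet and b2b_commands are made up for illustration, the peer's link address is passed in by the caller rather than guessed, and the function only prints the commands instead of running them. Note that on the OFA stack the ifconfig step does not assign a LID by itself; that was a property of the earlier implementation.

```shell
#!/bin/sh
# low_octet IP: print the last octet of a dotted-quad address,
# which the old scheme reused as the LID. (Hypothetical helper.)
low_octet() {
    echo "${1##*.}"
}

# b2b_commands LOCAL_IP PEER_IP PEER_LLADDR [IFACE]
# Print (do not run) the interface setup and the permanent neighbor entry.
# PEER_LLADDR is the peer's full IPoIB link address, supplied by the caller.
# IFACE defaults to ib0, i.e. this assumes only the first port matters.
b2b_commands() {
    local_ip=$1 peer_ip=$2 peer_lladdr=$3 iface=${4:-ib0}
    echo "# local lid $(low_octet "$local_ip"), peer lid $(low_octet "$peer_ip")"
    echo "ifconfig $iface $local_ip"
    echo "ip neigh add $peer_ip lladdr $peer_lladdr nud permanent dev $iface"
}
```

Peer X would call b2b_commands with its own address, Y's address, and Y's link address; peer Y would do the mirror image. Piping the output to sh would run it, once the printed commands look right.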

I'm thinking that using IPoIB to drive this kind of manual setup is one way
to approach it. It certainly would be simple, and it worked for us before there
was an OFA stack.

Maybe I'm getting ahead of myself, though. I'm still wondering if there's a way
to do it with what we have.

Tom.

>
>> >> We have done this on other IB implementations by manually assigning
>> >> LIDs, but I discover that the "lid" entry below /sys/class/infiniband/<device>
>> >> is not writable, at least for mthca.
>> >
>> >This can be done via MADs so user_mad kernel module would be needed to
>> >do this.
>> 
>> Okay, all kernel modules can be assumed to be in place. How do we tell it
>> to manage the LID, with a shell command?
>
>A new "command" would be needed.
>
>-- Hal
>
>> >> Also, I expect that the ipoib driver will
>> >> be unable to join the broadcast group, so will be unwilling to come up fully.
>> >
>> >Is IPoIB a requirement ?
>> 
>> I think so, for two reasons. One, principle of least surprise - the user will
>> expect to be able to ping, telnet etc if it has connectivity. Two, for NFS/RDMA
>> we require TCP and UDP connections in order to perform the mount and do
>> locking and recovery. We could do those over a parallel ethernet connection,
>> but that's kind of not the point.
>> 
>> >
>> >> With ethernet, and maybe iWARP, just a simple ifconfig can do this. So why
>> >> not IB?
>> >
>> >The simple answer is that it is the nature of IB management (being
>> >different than ethernet).
>> 
>> Which, IMO, we need to boil down to simplest-possible, for at least some
>> workable configuration.
>> 
>> Thanks for the ideas!
>> 
>> Tom.
>> 
>> >
>> >-- Hal
>> >
>> >> If you're wondering, my goal is to give NFS/RDMA users a way to avoid having
>> >> to install the many userspace modules needed to do this, including libibverbs,
>> >> opensm, etc. There's a lot to get wrong, and things go missing. Seeking an
>> >> "easy" way to get started with just the kernel and some shell commands.
>> >> 
>> >> Tom.
>> >> 
>> >> _______________________________________________
>> >> general mailing list
>> >> general at lists.openfabrics.org
>> >> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>> >> 
>> >> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
>> 
>