[openib-general] Re: RDMA connection and address translation API

Wed Aug 24 14:16:59 PDT 2005

Replying to the digest format, so lots all at once

----------------------------------------------------------------------

Message: 1
Date: Wed, 24 Aug 2005 11:59:57 -0700
From: "Fab Tillier" <ftillier at silverstorm.com>
Subject: RE: [openib-general] RDMA connection and address translation
	API
To: "'Caitlin Bestler'" <caitlin.bestler at gmail.com>
Cc: openib-general at openib.org
Message-ID: <000301c5a8de$14fc1cc0$6312000a at infiniconsys.com>
Content-Type: text/plain;	charset="us-ascii"

> From: Caitlin Bestler [mailto:caitlin.bestler at gmail.com]
> Sent: Wednesday, August 24, 2005 11:14 AM
> 
> On 8/24/05, Fab Tillier <ftillier at silverstorm.com> wrote:
> > > From: Roland Dreier [mailto:rolandd at cisco.com]
> > > Sent: Wednesday, August 24, 2005 10:16 AM
> > >
> > >     Fab> Knowledge of actual IP addresses would be up to the consumer.
> > >     Fab> However, the IB CM can facilitate checks by allowing the user
> > >     Fab> to specify an offset and length in the private data to match
> > >     Fab> to for incoming requests.
> > >
> > > This seems too complex and at the same time too limited to me.  
> > > For one thing -- although I think ATS should die -- this doesn't 
> > > support ATS reverse lookups.
> >
> > I think if all ULPs provide their source and destination IP in the 
> > private data, you can eliminate the reverse lookup altogether.  A 
> > simple forward lookup is all that's needed to validate that the 
> > source GID in the REQ matches the reported source IP in the private 
> > data.  The forward lookup could be done via ATS or via ARP, but the 
> > CM doesn't need to care which method is used.
> 
> That is not an option.
> 
> The applications are expecting source/destination network addresses 
> that come from a network layer, not from the peer application. IP has 
> no problem meeting this requirement. This is an IB problem that needs 
> to be solved within the scope of IB without changing any ULPs.

If the app wants to use source/destination network addresses, there isn't a
problem. The problem is the app wants to use IP addresses, which are *not*
network addresses in IB. So the app needs to decide between one of two
things - be aware of IB network addresses, or provide meaning to IP
addresses over IB. The latter can't be done reliably under the covers - ATS
reverse lookups won't tell you the IP the source actually used, and there's
no way to do so without either using private data in the CM REQ or requiring
a 1:1 mapping of IB:IP addresses.  The 1:1 IB:IP mapping is not feasible, so
the only way to know what IP address the application used is to embed that
into the private data.  I would expect protocols that try to use IP as their
addressing would accommodate this in their IB usage, just like SDP
accommodates it in the hello message.

- Fab

<caitlin>
The question under debate is precisely how to define an API for transport
neutral middleware and kernel applications. No one has proposed elimination
or deprecation of the IB CM optimized IB-specific API.

Any definition of a transport neutral "network address" is going to conform
to the semantics of an IPv6 address. IP networks will never evolve past the
legacy of the current definition of an IP address. In fact it could be
argued that IPoIB shows that IB networks won't either.

There is no need for a 1:1 IB:IP mapping. All that is required is:

a) The network address be mappable to a host name.
b) The host name be mappable back to the network address (even if the
   latter is a list).
c) A server can determine its local address and advertise it by
   application-specific means.
d) A server can determine the network address of a remote peer requesting
   service, and can:
	1) use it to identify the remote host (see a).
	2) use it to send packets back to the peer in a reliable connection.

Can you show me anything in the IB spec that prevents direct translation of
specific IPv6 subnets to IB networks? Subdivision of the 128-bit network
address space is a requirement for supporting multiple transports on the
same host anyway. There is no reason the same technique cannot be used
within a single IB device to distinguish between addresses that don't need
translation (network address is GID) and those that do (when IPv4 addresses
and/or multiple IPv6 addresses are desired).
</caitlin>
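
One way to picture the subdivision Caitlin describes is a prefix test on the
128-bit address. The sketch below is illustrative only and rests on an
assumption that is not stated in the thread: an address whose upper 64 bits
equal the local port's GID subnet prefix is treated as a GID and used
directly, and anything else (for example an IPv4 address carried in the low
4 bytes) is handed to a translation step. The names `addr_kind` and
`classify_addr` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical classification of a 128-bit transport-neutral address. */
enum addr_kind {
    ADDR_IS_GID,            /* usable directly as an IB GID */
    ADDR_NEEDS_TRANSLATION  /* must be resolved to a GID first */
};

/*
 * Assumption for this sketch: the upper 64 bits select the address family.
 * If they match the port's GID subnet prefix, the address is a GID;
 * otherwise it needs translation.
 */
static enum addr_kind classify_addr(const uint8_t addr[16],
                                    const uint8_t subnet_prefix[8])
{
    assert(addr != NULL && subnet_prefix != NULL);
    return memcmp(addr, subnet_prefix, 8) == 0 ? ADDR_IS_GID
                                               : ADDR_NEEDS_TRANSLATION;
}
```

Which prefixes are reserved for which purpose would be a policy decision;
the point is only that the decision can be made from the address bits alone.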

------------------------------

Message: 2
Date: Wed, 24 Aug 2005 11:59:57 -0700
From: "Fab Tillier" <ftillier at silverstorm.com>
Subject: RE: [openib-general] RDMA connection and address translation
	API
To: "'Roland Dreier'" <rolandd at cisco.com>
Cc: openib-general at openib.org
Message-ID: <000401c5a8de$2c32cce0$6312000a at infiniconsys.com>
Content-Type: text/plain;	charset="us-ascii"

> From: Roland Dreier [mailto:rolandd at cisco.com]
> Sent: Wednesday, August 24, 2005 11:03 AM
> 
>     Fab> Why can't the IPV field be ignored?  If a listen wants only
>     Fab> IPV4 addresses, it would specify a 16-byte compare buffer
>     Fab> with the first 12 bytes zero, the next 4 filled with the IPV4
>     Fab> address, and would set the offset to that of the hello
>     Fab> message's destination address (32).
> 
> Yes, you're right for SDP.  I guess if we're comfortable mandating 
> that all protocols put their source and destination IPs in the private 
> data for the IB case, then this works.  Of course it's somewhat 
> awkward to pass this information into the transport-neutral CM API but 
> I think this can be worked around.

I don't know if we need to mandate IP usage - it's up to the application.
Any application that wants semantics similar to the way socket listens work
(especially when bound to one of multiple IP addresses on a port) would have
to define its private data to accommodate this.

At the IB level, the contents of the private data are still opaque, even to
the CM.  The CM would only expose the ability to have it perform an initial
triage of requests by doing binary comparisons over regions of private data.
It doesn't know (or need to know) what the data represents - it only cares
about finding a match (or not).  The CM doesn't define any sort of policy
here, and I don't think it should.  It's just bytes to the CM, and it's doing
a blind comparison without interpreting the contents.

- Fab
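
For readers following along, the blind compare described above could look
roughly like the sketch below. It is not an existing CM interface: the
struct `pdata_filter` and both function names are made up for illustration.
The only values taken from the thread are the 16-byte compare buffer, the 12
leading zero bytes followed by the 4-byte IPv4 address, and the offset of 32.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical listen filter: a region of the REQ private data to match. */
struct pdata_filter {
    size_t  offset;       /* where in the private data the compare starts */
    size_t  len;          /* number of bytes to compare */
    uint8_t pattern[16];  /* bytes the listener expects to find there */
};

/* Blind binary compare: the bytes are never interpreted. */
static bool pdata_filter_match(const struct pdata_filter *f,
                               const uint8_t *pdata, size_t pdata_len)
{
    assert(f->len <= sizeof(f->pattern));
    if (f->offset > pdata_len || f->len > pdata_len - f->offset)
        return false;  /* compare region falls outside the private data */
    return memcmp(pdata + f->offset, f->pattern, f->len) == 0;
}

/* The IPv4 listen from the thread: 12 zero bytes, then the address. */
static struct pdata_filter ipv4_listen_filter(const uint8_t ipv4[4])
{
    struct pdata_filter f = { .offset = 32, .len = 16 };  /* pattern zeroed */
    memcpy(f.pattern + 12, ipv4, 4);
    return f;
}
```

A listener would build the filter once and the CM would run
`pdata_filter_match` on each incoming REQ before handing it up; what the
matched bytes mean stays entirely with the application.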

<caitlin>
If the application wants IB specific semantics it can use an IB specific API.
That is probably needed for options like CM redirection anyway.
</caitlin>

------------------------------

Message: 4
Date: Wed, 24 Aug 2005 15:09:18 -0400
From: "Tom Tucker" <tom at ammasso.com>
Subject: RE: [openib-general] RDMA connection and address translation
	API
To: "Roland Dreier" <rolandd at cisco.com>
Cc: openib-general at openib.org
Message-ID: <8E9D028761D8264D910612167E8457E8FA3B36 at mail2.ammasso.com>
Content-Type: text/plain;	charset="US-ASCII"

> -----Original Message-----
> From: Roland Dreier [mailto:rolandd at cisco.com]
> Sent: Wednesday, August 24, 2005 1:17 PM
> To: Tom Tucker
> Cc: openib-general at openib.org
> Subject: Re: [openib-general] RDMA connection and address translation 
> API
> 
>     Tom> Good point, although for iWARP it will work that way that you
>     Tom> expect.  For IB, admitedly it's more complex and would
>     Tom> require ATS. There seems to be significant reluctance around
>     Tom> ATS and I don't understand the issues. Can you provide a
>     Tom> quick synopsis?
> 
> My resistance is that ATS is just complexity without any benefit.  

IMHO the benefit is that you have a transport independent addressing
mechanism -- albeit with some limitations as you've mentioned. In this case,
the vast majority of clients enjoy the benefit without suffering the
limitations.

<caitlin>
Exactly. We are trying to define an *additional* API that has transport
independent addressing semantics. There is already an API for transports
that have IB semantics.  We don't need a second one.
</caitlin>

> ... It
> doesn't provide additional security.  It doesn't solve the 
> multi-homing problem we're talking about now.

Whenever a single GID maps to multiple IP addresses, I agree, it is a
limitation. However, I don't believe that this is strictly necessary.

> ... Once you've thrown away
> information by turning your IP address into an IB GID, there's no 
> magic way ATS can recreate that information and be psychic about which 
> of the multi-homed IPs you actually meant.

I agree, so don't do that. If you want it to work properly, then you need to
map GIDs to IP addresses.

<caitlin>
But you don't need to know which of the multi-homed IPs you actually meant.
You just need one that translates back to the same remote entity. The exact
same problem already exists in IP networks because of PNAT.
</caitlin>

> ... So why not just put the IP
> addressing information into the CM private data, the way that the SDP 
> protocol already does?
> 
>  - R.
> 

Because it would be better to configure your network "properly". Putting IP
addresses in private data is fundamentally insecure since any user mode
client can spoof the IP address. 

<caitlin>
The existing contract with the ULP (especially NFS) is that
the network layer identifies the remote peer and that said
identification is coming from the network layer not from the
remote application layer.

It's up to IB to decide how to meet those semantics. iWARP
has no problem with them.
</caitlin>

------------------------------

Message: 5
Date: Wed, 24 Aug 2005 12:11:38 -0700
From: "Sean Hefty" <sean.hefty at intel.com>
Subject: RE: [openib-general] RDMA connection and address translation
	API
To: "'Tom Tucker'" <tom at ammasso.com>,	"Roland Dreier"
	<rolandd at cisco.com>
Cc: openib-general at openib.org
Message-ID: <ORSMSX4011XvpFVjCRG000000c6 at orsmsx401.amr.corp.intel.com>
Content-Type: text/plain;	charset="us-ascii"

>Because it would be better to configure your network "properly". 
>Putting IP addresses in private data is fundamentally insecure since 
>any user mode client can spoof the IP address.

A simple forward lookup could detect this.

- Sean
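
As a rough illustration of the forward-lookup check being discussed: take
the source IP claimed in the private data, look up the GID(s) it maps to,
and accept the REQ only if its source GID is among them. The sketch below
uses a static table as a stand-in for whatever does the lookup (ATS or ARP
in the thread); the types `gid` and `fwd_entry` and the function name are
hypothetical, and who would run this check is left open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct gid { uint8_t raw[16]; };

/* Hypothetical forward-lookup result: one IPv4 address -> one GID.
 * Several entries may share a GID (multi-homing). */
struct fwd_entry {
    uint8_t    ip[4];
    struct gid gid;
};

/* True if the claimed source IP forward-resolves to the REQ's source GID. */
static bool req_source_is_consistent(const struct fwd_entry *table, size_t n,
                                     const uint8_t claimed_ip[4],
                                     const struct gid *req_sgid)
{
    assert(table != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (memcmp(table[i].ip, claimed_ip, 4) == 0 &&
            memcmp(table[i].gid.raw, req_sgid->raw,
                   sizeof(req_sgid->raw)) == 0)
            return true;
    }
    return false;  /* claimed IP does not map to this GID */
}
```

Note that this only shows the claimed IP is one the sending port could
legitimately use; it does not stop one client on a host from claiming
another IP that maps to the same GID.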

<caitlin>
A simple forward lookup by whom?

Again, the point is that identification of the remote peer provided
to the Consumer is supposed to be already validated.
</caitlin>

------------------------------

Message: 6
Date: Wed, 24 Aug 2005 22:12:57 +0300
From: "Yaron Haviv" <yaronh at voltaire.com>
Subject: RE: [openib-general] RDMA connection and address translation
	API
To: "Fab Tillier" <ftillier at silverstorm.com>,	"Roland Dreier"
	<rolandd at cisco.com>
Cc: openib-general at openib.org
Message-ID:
	<35EA21F54A45CB47B879F21A91F4862F7141F3 at taurus.voltaire.com>
Content-Type: text/plain;	charset="us-ascii"

> -----Original Message-----
> From: openib-general-bounces at openib.org
> [mailto:openib-general-bounces at openib.org] On Behalf Of Fab Tillier
> Sent: Wednesday, August 24, 2005 3:00 PM
> To: 'Roland Dreier'
> Cc: openib-general at openib.org
> Subject: RE: [openib-general] RDMA connection and address translation API
> 
> > From: Roland Dreier [mailto:rolandd at cisco.com]
> > Sent: Wednesday, August 24, 2005 11:03 AM
> >
> >     Fab> Why can't the IPV field be ignored?  If a listen wants only
> >     Fab> IPV4 addresses, it would specify a 16-byte compare buffer
> >     Fab> with the first 12 bytes zero, the next 4 filled with the IPV4
> >     Fab> address, and would set the offset to that of the hello
> >     Fab> message's destination address (32).
> >
> > Yes, you're right for SDP.  I guess if we're comfortable mandating
> > that all protocols put their source and destination IPs in the private
> > data for the IB case, then this works.  Of course it's somewhat
> > awkward to pass this information into the transport-neutral CM API but
> > I think this can be worked around.
> 
> I don't know if we need to mandate IP usage - it's up to the application.
> Any application that wants to have similar semantics to the way socket
> listens work (especially when bound to one of multiple IP addresses on a
> port) the application would have to define its private data to accommodate
> this.
> 

The context of this discussion is a common API for iWARP/IB ULPs. In that
case they all use IP addresses (since it's the common addressing).

If someone uses the IB-specific API beneath this abstraction level, he can
provide whatever data he wants to the CM.

Anyway, providing src/dst IPs in the CM private data is simple, and we can
come up with an IBTA extension blessing that data structure as a general way
to map IP-oriented protocols over IB (a 1-2 page draft at the most). This
way it can also address Caitlin's concerns regarding NFS & IETF (since now
it's a transport-specific issue).

Yaron

<caitlin>
Correct, an IBTA- and/or IETF-sanctioned standard use of CM Private Data to
supply IP addresses (especially when the data came from the stack rather
than the Consumer) would be perfectly fine.

It would be interoperable (not OpenIB dependent), and it would provide the
required API semantics (the network address supplied would be coming from
the network layer). No changes to the ULP would be required.
</caitlin>