<html>

<body>

<font size=3><br>

Isn't this getting a bit more complex than it needs to be?  Let me

see if I have this correct:<br><br>

1. Applications want to use the existing API to identify remote endnodes /

services.  <br><br>

2. Endnodes are identified by an IPv4 / IPv6 address and services by a port

number.<br><br>

3. The existing network stacks already comprehend how to discover routes

to endnodes using ARP / ND.  These protocols can determine whether

there is a single or multiple IP addresses and store these in the local

network stack route table.<br><br>

4. Route tables can contain any amount of layer 2 and 3 address

information (a function of the implementation) and various policies can be

constructed to make an intelligent decision on which layer 2 and 3

addresses to return to an application.<br><br>

5. iWARP can use the existing infrastructure without modification so no

changes are required to make it work.<br><br>

6. IB uses a different layer 2 address - not just a 48-bit MAC - so

while it differs from Ethernet, it conceptually works just the

same.  Both can support multiple IP addresses per layer 2

address, as it is really just a matter of replicating the information on a

per-IP-address basis.<br><br>

7. When a route look up occurs, a set of IP addresses is returned.

Depending upon the kernel interface, one can also return the layer 2

information either as part of this look up or as a separate query to the

route table.<br><br>

8. Layer 2 information provides the necessary data to construct CM

messages or to identify the path for the IP over IB ULP.<br><br>
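To make points 6 through 8 concrete, here is a minimal sketch in Python. It assumes a simplified in-memory table rather than any real kernel interface; the names Layer2Address, RouteTable, add, lookup_l2 and ips_for are hypothetical and chosen only for illustration. It also assumes the 20-byte IPoIB link-layer address layout (4 bytes of flags and QPN followed by the 16-byte GID). The only point is that the same table can hold a 6-byte MAC or an IB address, replicated per IP address.<br><br>

<pre>
```python
from dataclasses import dataclass, field
from typing import Dict, List

ETHERNET = "ethernet"      # label chosen for this sketch
INFINIBAND = "infiniband"  # label chosen for this sketch


@dataclass(frozen=True)
class Layer2Address:
    """Link-layer address; its length depends on the fabric."""
    link_type: str
    raw: bytes  # 6-byte MAC, or 20-byte IPoIB link-layer address

    def gid(self) -> bytes:
        """Return the 16-byte GID carried in an IPoIB link-layer address."""
        if self.link_type != INFINIBAND or len(self.raw) != 20:
            raise ValueError("not a 20-byte IPoIB link-layer address")
        return self.raw[4:]


@dataclass
class RouteTable:
    """Hypothetical neighbor table keyed by IP address (not a kernel API)."""
    neighbors: Dict[str, Layer2Address] = field(default_factory=dict)

    def add(self, ip: str, l2: Layer2Address) -> None:
        # Several IP addresses may share one layer 2 address; the entry
        # is simply replicated per IP address (point 6).
        self.neighbors[ip] = l2

    def lookup_l2(self, ip: str) -> Layer2Address:
        # Separate layer 2 query against the table (point 7).
        return self.neighbors[ip]

    def ips_for(self, l2: Layer2Address) -> List[str]:
        # All IP addresses currently bound to one layer 2 address.
        return sorted(ip for ip, addr in self.neighbors.items() if addr == l2)
```
</pre>

With something like this, middleware that wants to build a CM message would call lookup_l2 on the destination IP address and read the GID from the result (point 8), while the application itself still only deals with IP addresses and ports.<br><br>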

So, from the above, it seems the IP and IB worlds can operate using the

same code and work just fine.  So, where is the problem?  Is it

really just how management assigns IP addresses to IB interfaces, and how an

application should select or be informed of which IP address to use and

thus transparently identify the IB port?  Where is the connection

establishment problem?  The application does not see any

difference.  The network stack only acts as a repository for

routing information unless running directly over IP over IB, and thus is not

impacted.  The middleware simply needs to extract the layer 2

information to obtain the requisite data to construct the CM messages

when going straight to IB (no change is required here for iWARP, as

this is all native to its operation).  What am I missing

here?<br><br>

Mike<br><br>

<br><br>

At 10:10 AM 10/9/2005, Tom Tucker wrote:<br>

<blockquote type=cite class=cite cite="">On Sun, 2005-10-09 at 07:57

-0700, Sean Hefty wrote:<br>

> >It is theoretically possible to support all this on an IPoIB

based<br>

> >network. Multiple subnets, multiple routes to remote peers,

ICMP<br>

> >redirect, multiple IP addresses for each physical interface,

yada yada<br>

> >yada. But IMHO, the only way to do this would be to tie directly

into<br>

> >the existing routing,  ARP, ICMP, etc... subsystems in

Linux. Otherwise<br>

> >you'll end up recreating a gigantic (and I mean GIGANTIC) amount

of<br>

> <br>

> The current implementation ties into the standard Linux ARP

tables.  If<br>

> connections were made over TCP/IP, using IPoIB, then I don't think

that there<br>

> would be any issues.  The issues only arise because of the

desire to use TCP/IP<br>

> network addresses over a non-TCP/IP network.<br>

> <br>

> >code. This belief is why I've been a proponent of mapping GIDs

to one<br>

> >and only one IP address and treating it for management purposes

as the<br>

> >equivalent of an IP address. Without this, the whole mechanism

for<br>

> >determining routes, etc.. breaks down. If you treat the GID like

a MAC<br>

> >address -- it breaks, because a MAC address can have multiple

IP<br>

> >addresses -- the observation that led to the conclusion that

ATS was<br>

> >broken in the first place.<br>

> <br>

> We should be able to handle the case where a GID has multiple IP

addresses bound<br>

> to it.  But even if we added a 1:1 restriction, the connection

over IB issue<br>

> still exists.<br><br>

I agree, except for RARP.<br><br>

> <br>

> >I know there is significant resistance to this idea, but I just

don't<br>

> >see how we get this generically resolved without binding the

two<br>

> >addressing schemes more closely. With the current binding, I

just don't<br>

> >think it works.<br>

> <br>

> Again, I don't think that the binding is the issue, so much as the

desire to use<br>

> an address for a protocol that isn't actually being used for

communication.  <br><br>

Not to be pedantic, but if binding or mapping or somesuch weren't an<br>

issue we wouldn't need AT. <br><br>

> I<br>

> don't view a GID as an IP address because we're not sending and

receiving IP<br>

> packets on the GID.  IPoIB treats GIDs as only part of a MAC

address, which I<br>

> think is the proper view. <br>

><br>

> Anyway, returning back to the original problem of connecting to an

IB gateway if<br>

> given a destination IP address on a different subnet...  I'm

slowly convincing<br>

> myself that either the CMA or AT should do this.  (I believe

that the ib_addr<br>

> code will do this now, but still wasn't sure that we wanted it

to.)<br>

> <br><br>

IMHO, you need a service separate from the CMA to do address<br>

translation. My (iWARP's) rationale for this is that there are two<br>

clients of the service, the CM and IP. For CM, you need it to elect

a<br>

route and thereby a local interface. For IP you need it because

routes<br>

change and ARP entries time out. <br><br>

BTW, can you educate me ... is the following what you're

thinking:<br><br>

On the client side...<br><br>

- route is discovered by looking at the Linux routing table<br>

- local interface is IPoIB (looks at rdma_ptr embedded in netdev

struct)<br>

- send ARP AT message over local IB interface<br><br>

At the gateway...bridging to IP<br><br>

- ARP AT query received on IB interface<br>

- Lookup route to destination IP address in gateway's route table. <br>

- If next hop's Ethernet address is already known, it is returned <br>

- Otherwise, local interface identified is IPoEthernet<br>

- New ARP query goes out on the local interface from the route<br>

- When response comes back, answer is returned.<br><br>

At the gateway...bridging to IPoIB<br><br>

- ARP AT message received on IB interface, delivered to AT<br>

- Lookup route to destination IP address in gateway's route table<br>

- If next hop's Ethernet address is already known, it is returned<br>

- otherwise, local interface identified in route is IPoIB<br>

- New ARP AT query goes out on the local interface<br>

- When response comes back, answer is returned.<br><br>
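The gateway steps listed above could be sketched roughly as follows in Python. This is an illustration under stated assumptions, not the actual AT or ib_addr code: the function name resolve_at_gateway, the exact-match route dictionary (a real route table would use longest-prefix matching), and the synchronous send_query callback are all hypothetical simplifications.<br><br>

<pre>
```python
from typing import Callable, Dict, Optional, Tuple


def resolve_at_gateway(
    dest_ip: str,
    routes: Dict[str, Tuple[str, str]],   # dest_ip -> (next_hop_ip, out_interface)
    neighbor_cache: Dict[str, bytes],     # next_hop_ip -> layer 2 address
    send_query: Callable[[str, str], Optional[bytes]],
) -> Optional[bytes]:
    """Answer an address query received on the IB interface (sketch only)."""
    # Look up the route to the destination in the gateway's route table.
    route = routes.get(dest_ip)
    if route is None:
        return None
    next_hop, interface = route

    # If the next hop's layer 2 address is already known, return it.
    cached = neighbor_cache.get(next_hop)
    if cached is not None:
        return cached

    # Otherwise send a new query on the interface named by the route
    # (ARP on Ethernet, ARP AT on IPoIB) and return the answer.
    answer = send_query(interface, next_hop)
    if answer is not None:
        neighbor_cache[next_hop] = answer
    return answer
```
</pre>

The same function covers both gateway cases because only the outgoing interface and the kind of query differ. A real implementation would be asynchronous and would expire cache entries.<br><br>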

Thanks,<br><br>

<br><br>

> - Sean<br>

> <br>

> <br>

_______________________________________________<br>

openib-general mailing list<br>

openib-general@openib.org<br>

<a href="http://openib.org/mailman/listinfo/openib-general" eudora="autourl">

http://openib.org/mailman/listinfo/openib-general</a><br><br>


</blockquote></font></body>

</html>