[Openib-windows] RE: Geting remote and locale ip addresses - the functionality

Fab Tillier ftillier at silverstorm.com
Wed Sep 7 11:00:33 PDT 2005


> From: Tzachi Dar [mailto:tzachid at mellanox.co.il]
> Sent: Wednesday, September 07, 2005 2:46 AM
> 
> In a previous mail Fab asked the following questions:
> 
> What functionality should IPoIB provide?  Should it provide:
> 
> 1. IP to path
> 2. Ethernet MAC to path
> 3. IP to GID
> 4. Ethernet MAC to GID
> 5. Get all locally assigned IP addresses
> 6. Path validation (e.g. given an IP pair and a path, return validity of
> path)
> 
> WSD would care most about 1, 5, and 6.  DAPL probably only cares about 1 and
> 6. What about SDP?
> 
> Before I try to answer the questions, I'll say that I assume that each
> adapter has one MAC address and that it can have one or many IPs. This is
> not always the case, but it holds in the vast majority of cases.

Will we need to support multiple MAC addresses per instance?  What about
multiple GIDs per instance?

> The questions above are not very clear, since they don't say whether it is
> the remote IP or the local IP that should be used. Still, I'll try to explain
> the way I see things.

Items 1-4 in my list take as input pairs of IP or MAC addresses (source &
destination).  Items 3 and 4 return a pair of GIDs (source & destination).
Sorry for not clarifying this in my original mail.

> On SDP and WSD (I don't know about DAPL) one gets the remote IP that he has
> to connect to. He might also get the local IP.

WSD has the local IP - the WSD switch binds sockets to a specific IP before
using them in a connection.  So a connect request always has both the source and
destination IP addresses.
 
> If he doesn't get the local
> IP, he should be able to find it. Windows supplies the GetBestRoute function
> that does exactly this. This function exists in user mode; I'm not sure if
> it exists in kernel mode, and in any case this is not something that the
> IPOIB should know about. From here on I assume that we have both IPs.
> 
> Given the local IP and the remote IP, one should be able to find both ports'
> GIDs. This is something that the IPOIB should be able to supply.
> 
> Let's look at the problem of converting the local IP to a port's GID (and
> probably also the HCA's GID). This information should be known to the IPOIB,
> and we can make a query that gets the answer. In Appendix A I explain how
> one can get this information even without using the IPOIB module.
> We will have to decide if we want to take the approach from the appendix or
> make more changes to IPOIB. In any case, we have to note that this
> information changes. No matter how we do it, this makes the problem much
> more complicated (if you are pessimistic, or much more interesting if you
> are optimistic).

IPoIB has a list of all locally assigned IP addresses.  It currently only
supports IPv4, but should be simple enough to extend to support IPv6.  The local
IP addresses are stored in ipoib_adapter_t::ip_vector, and are of type
net_address_item_t.

> The second problem is getting the remote GID from the remote IP. This
> information is only known to the IPOIB module. After doing an ARP (which can
> be done easily from user mode), we should have the IP translated to the
> remote MAC. To my understanding, all the information should already be in the
> IPOIB driver as an "endpt" (the function __endpt_mgr_ref should return it
> immediately).

Right now, IPoIB doesn't sort endpoints by IP address - only GID, LID, and MAC.
The code only supports a 1:1:1 mapping at the moment, so a MAC can only have one
LID and GID, each LID can only have one associated MAC and GID, etc.  Is this
something we need to change?  If we add IP to the endpoints, we will need at
least support for many IPs to one endpoint.

An alternative to adding IP support to IPoIB is to use the network stack to get
us from IP to Ethernet MAC, and then perform a lookup based on that.  It is a
bit more cumbersome, but certainly possible.

> As for translating the two GIDs to a path: this can easily be done by each
> ULP and is not something specific to the IPOIB driver. Still, each ULP has to
> do it, so it might be a good idea for IPOIB to also do the path lookup. My
> feeling is that it makes things somewhat complicated, and we can avoid it,
> but as I said, it can be done.

I think supporting just the lookup from IP (or MAC) pair to GID pair
is good enough initially.  We can extend this later to add support for returning
the path so that we save on SA queries, or we can add caching to the access
layer for common SA records (like paths, node info, port info, etc).

> This also raises the question of doing it synchronously or asynchronously
> (which will make things even more complicated).

Assuming we're dealing with IRPs here and not a direct call interface, sync or
async is trivial - all we need to do is follow IRP processing rules.  If the
request is going to take some time we need to mark the IRP pending and return
(this is critical to allow clients to call this at DISPATCH_LEVEL).  Once the
request completes, complete the IRP.  It is then up to the caller to decide
whether to block waiting for a completion or use I/O completion notifications.

I would expect that for lookups into the IPoIB adapter's internal cache, the
operations would pretty much always complete immediately.  If we ever add logic
in IPoIB to send ARPs to resolve missing entries, asynchronous processing will
likely be necessary.  In any case, the only requirement on us is that we don't
perform any blocking operations in any of the IOCTL paths.

> Path validation - I'm not sure that I understand the question. If I have a
> path to validate, I can simply ask for the new path to be created and
> compare them, or is there anything else to this?

The idea of path validation is that if you receive a connection request that
claims to have a certain source and destination IP address pair, the recipient
needs to validate that those IP addresses are valid for the path record provided
by the CM in the REQ callback.  Basically, the code would do a lookup of the
source and destination IP addresses and compare the resulting GIDs to the source
and destination GIDs in the path record provided by the CM.  This provides a
level of validation to prevent malicious apps from pretending to have certain
trusted IP addresses to establish a connection when they otherwise should not be
allowed to.

However, thinking about it more, there's nothing that prevents the application
from doing the IP to GID lookup itself, and then comparing the results to the
path it received.  So please disregard this "feature".
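
As a rough illustration of that application-side check, here is a minimal
sketch.  The types and function names (example_gid_t, example_path_t,
path_matches_gids) are made up for this example and are not taken from the
IPoIB or access layer headers; the only assumption is that a GID is 16 bytes
and that the path record carries a source and a destination GID.

```c
#include <string.h>

/* Hypothetical stand-in for a 128-bit GID. */
typedef struct { unsigned char raw[16]; } example_gid_t;

/* Hypothetical stand-in for the GID fields of a path record. */
typedef struct {
    example_gid_t sgid;   /* source GID of the path */
    example_gid_t dgid;   /* destination GID of the path */
} example_path_t;

/* Byte-wise comparison of two GIDs; returns 1 if equal, 0 otherwise. */
static int gid_equal(const example_gid_t *a, const example_gid_t *b)
{
    return memcmp(a->raw, b->raw, sizeof a->raw) == 0;
}

/*
 * Returns 1 if the GIDs obtained from the IP to GID lookup match the
 * source and destination GIDs of the path, 0 otherwise.
 */
static int path_matches_gids(const example_path_t *path,
                             const example_gid_t *looked_up_src,
                             const example_gid_t *looked_up_dst)
{
    return gid_equal(&path->sgid, looked_up_src) &&
           gid_equal(&path->dgid, looked_up_dst);
}
```

A caller would reject the connection request when path_matches_gids returns 0.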

> So to summarize:
> 1)     Get all local IP addresses. We can do it or not do it; I recommend
> doing it in the IPOIB. As the information changes, it would be better to
> create some notification mechanism, but I suggest not implementing this
> mechanism at the start.

IPoIB already has this information, so I agree we should provide a way to get to
it - the WSD provider certainly needs it.  We don't need to implement any
notification mechanisms for when the addresses change, as the existing stack
already handles that.  Specifically, it doesn't matter for the DAPL case because
DAPL is static.  For WSD, the switch already polls the providers when it detects
that an update is needed.

The alternative is to use the IP helper functions, but there are missing headers
from the DDK that prevent that from working properly without manually copying
headers from the platform SDK into the DDK directories (<mprapi.h> is one such
header).

I think we probably want two calls here.  The first would return all CA and port
GUIDs in use by IPoIB, with no input parameter.  The output would be something
like an array of <CA GUID, PORT NUMBER, PORT GUID> tuples.  This allows a client
to open the CA (using the CA GUID), create a QP bound to the proper port (using
the port number), and query for IP addresses (using the port GUID).

The second call would take as input a port GUID, and return an array of IP
addresses assigned to that port.  The array entries should probably be 16 bytes
to accommodate IPv6 addresses, and use the standard method of storing IPv4
addresses (prefix with zero).
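
To make the 16-byte entry format concrete, here is a small sketch of storing
an IPv4 address with a zero prefix, as described above.  The names
(IPOIB_ADDR_LEN, ipv4_to_addr16, addr16_is_ipv4) are hypothetical, and the
choice of putting the four IPv4 bytes at the end of the buffer is an
assumption.  Note also that the IPv4-mapped IPv6 form uses a 0xFFFF marker
before the IPv4 bytes rather than all zeros; which form to use would need to
be agreed on.

```c
#include <stdint.h>
#include <string.h>

#define IPOIB_ADDR_LEN 16   /* hypothetical name: room for an IPv6 address */

/*
 * Store a 4-byte IPv4 address (network byte order) in the last four
 * bytes of a 16-byte entry and zero the 12-byte prefix.
 */
static void ipv4_to_addr16(const uint8_t ipv4[4], uint8_t out[IPOIB_ADDR_LEN])
{
    memset(out, 0, IPOIB_ADDR_LEN);
    memcpy(out + IPOIB_ADDR_LEN - 4, ipv4, 4);
}

/*
 * Return 1 if the 12-byte prefix is all zero, i.e. the entry has the
 * zero-prefixed layout written by ipv4_to_addr16.  (The IPv6 addresses
 * :: and ::1 also have this prefix, so a real format may need a marker.)
 */
static int addr16_is_ipv4(const uint8_t addr[IPOIB_ADDR_LEN])
{
    size_t i;

    for (i = 0; i < IPOIB_ADDR_LEN - 4; i++) {
        if (addr[i] != 0)
            return 0;
    }
    return 1;
}
```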

Does that make sense?

> 2)     Get remote IP - we must implement.

Assuming this takes a source and destination IP as input and returns a source
and destination GID as output, I agree.

How does something like this for the IOCTL input and output buffers sound:

struct _IPOIB_AT_IN
{
	UCHAR		SrcIp[16];
	UCHAR		DstIp[16];
};
struct _IPOIB_AT_OUT
{
	GID		SrcGid;
	GID		DstGid;
};
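
To show the intended semantics, here is a user-mode sketch of the translation
such an IOCTL might perform.  It is not driver code: the UCHAR and GID
typedefs are stand-ins so the example compiles on its own, and at_entry,
at_find and at_translate are hypothetical names.  The table is a flat list of
IP to GID pairs, which allows several IPs to map to the same GID.

```c
#include <string.h>

typedef unsigned char UCHAR;            /* stand-in for the Windows type */
typedef struct { UCHAR raw[16]; } GID;  /* stand-in: a GID is 128 bits */

struct _IPOIB_AT_IN  { UCHAR SrcIp[16]; UCHAR DstIp[16]; };
struct _IPOIB_AT_OUT { GID SrcGid; GID DstGid; };

/* Hypothetical cache entry: one IP address mapped to one GID. */
struct at_entry {
    UCHAR ip[16];
    GID   gid;
};

/* Linear search of the table; returns NULL if the IP is not cached. */
static const GID *at_find(const struct at_entry *tbl, size_t n,
                          const UCHAR ip[16])
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (memcmp(tbl[i].ip, ip, 16) == 0)
            return &tbl[i].gid;
    }
    return NULL;
}

/*
 * Fill the output buffer from the input buffer.
 * Returns 0 on success, -1 if either address is unknown
 * (out is left untouched in that case).
 */
static int at_translate(const struct at_entry *tbl, size_t n,
                        const struct _IPOIB_AT_IN *in,
                        struct _IPOIB_AT_OUT *out)
{
    const GID *src = at_find(tbl, n, in->SrcIp);
    const GID *dst = at_find(tbl, n, in->DstIp);

    if (src == NULL || dst == NULL)
        return -1;
    out->SrcGid = *src;
    out->DstGid = *dst;
    return 0;
}
```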

> 3)     Get/verify path we can implement but I believe that we shouldn't do
> it.

Yep, this isn't necessary - with the IP to GID lookup support an application can
easily implement this functionality itself.
 
> Appendix A - How to get all the local IP addresses (and even to map them to
> InfiniBand ports):
> Windows provides the function GetAdaptersInfo, which gives all IP adapters as
> well as their local IPs. For each adapter there is a "name" that is actually
> a GUID that uniquely describes the adapter. In the registry there is a key
> named:
> HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Network\{4D36E972-E325-11CE-BFC1-08002BE10318}
> that has an entry for each adapter. For each of the
> adapters there is a subkey called Connection. Under this there is a value
> called PnpInstanceID. This value starts with IBA\IPOIB for InfiniBand
> devices, so we know that this is "our" card. Next there is the GUID of the
> port.

Thanks for the information - the documentation doesn't really explain what the
name represents.  So I take it that given the name, we could query the registry
to find out if it's an IPoIB device, and if so we could even extract the port
GUID from the PnpInstanceID.
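
A minimal sketch of the "is it an IPoIB device" test on that value, assuming
the string has already been read from the registry into a narrow-character
buffer.  The function name is_ipoib_instance is hypothetical, and the
comparison is made case-insensitive as a precaution.  The sketch does not try
to parse the port GUID, since the exact layout of the rest of the string is
not spelled out here.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Return 1 if the PnpInstanceID string starts with "IBA\IPOIB"
 * (compared case-insensitively), 0 otherwise.
 */
static int is_ipoib_instance(const char *pnp_instance_id)
{
    static const char prefix[] = "IBA\\IPOIB";
    size_t i;

    if (pnp_instance_id == NULL)
        return 0;

    for (i = 0; i < sizeof prefix - 1; i++) {
        /* Input shorter than the prefix: not a match. */
        if (pnp_instance_id[i] == '\0')
            return 0;
        if (toupper((unsigned char)pnp_instance_id[i]) != prefix[i])
            return 0;
    }
    return 1;
}
```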

> Please note that since adapters (and more likely IP addresses) change with
> time, it is not that simple to cache the information in user mode.

The NotifyAddrChange IPHelper function can provide the notification mechanism
needed.  For the case of WSD, however, the switch can be used to drive this.

Thanks for the details!

- Fab



