[Openib-windows] RE: Geting remote and locale ip addresses - the functionality

Wed Sep 7 13:23:50 PDT 2005

>-----Original Message-----
>From: Fab Tillier [mailto:ftillier at silverstorm.com]
>Sent: Wednesday, September 07, 2005 9:01 PM
>To: 'Tzachi Dar'; openib-windows at openib.org
>Subject: RE: Geting remote and locale ip addresses - the functionality
>
>> From: Tzachi Dar [mailto:tzachid at mellanox.co.il]
>> Sent: Wednesday, September 07, 2005 2:46 AM
>>
>> In a previous mail Fab asked the following questions:
>>
>> What functionality should IPoIB provide?  Should it provide:
>>
>> 1. IP to path
>> 2. Ethernet MAC to path
>> 3. IP to GID
>> 4. Ethernet MAC to GID
>> 5. Get all locally assigned IP addresses
>> 6. Path validation (e.g. given an IP pair and a path, return validity of
>> path)
>>
>> WSD would care most about 1, 5, and 6.  DAPL probably only cares about 1
>> and 6. What about SDP?
>>
>> Before I try to answer the questions, I'll say that I assume that each
>> adapter has one MAC address and that it can have one or many IPs. This
>> is not always the case, but it holds in the vast majority of cases.
>
>Will we need to support multiple MAC addresses per instance?  What about
>multiple GIDs per instance?
>
I believe that in the near future we will not have to support such
scenarios. 

>> The questions above are not very clear, since they don't really say
>> whether the remote IP or the local IP should be used. Still, I'll try to
>> explain the way I see things.
>
>Items 1-4 in my list take as input pairs of IP or MAC addresses (source &
>destination).  Items 3 and 4 return a pair of GIDs (source & destination).
>Sorry for not clarifying this in my original mail.
>
>> On SDP and WSD (I don't know about DAPL) one gets the remote IP that he
>> has to connect to. He might also get the local IP.
>
>WSD has the local IP - the WSD switch binds sockets to a specific IP before
>using them in a connection.  So a connect request always has both the
>source and destination IP addresses.
>
>> If he doesn't get the local
>> IP, he should be able to find it. Windows supplies the GetBestRoute
>> function, which does exactly this. This function exists in user mode; I'm
>> not sure if it exists in kernel mode, and in any case this is not something
>> that IPoIB should know about. From here on I assume that we have both IPs.
>>
>> Given the local IP and the remote IP, one should be able to find both
>> ports' GIDs. This is something that IPoIB should be able to supply.
>>
>> Let's look at the problem of converting the local IP to a port's GID
>> (and probably also the HCA's GID). This information should be known to
>> IPoIB, and we can make a query that gets the answer. In Appendix A I
>> explain how one can get this information even without using the IPoIB
>> module. We will have to decide whether to take the approach from the
>> appendix or make more changes to IPoIB. In any case, we have to note that
>> this information changes. No matter how we do it, this makes the problem
>> much more complicated (if you are pessimistic) or much more interesting
>> (if you are optimistic).
>
>IPoIB has a list of all locally assigned IP addresses.  It currently only
>supports IPv4, but should be simple enough to extend to support IPv6.  The
>local IP addresses are stored in ipoib_adapter_t::ip_vector, and are of
>type net_address_item_t.
>
>> The second problem is getting the remote GID from the remote IP. This
>> information is only known to the IPoIB module. After doing an ARP (which
>> can be done easily from user mode), we should have the IP translated to
>> the remote MAC. To my understanding, all the information should already be
>> in the IPoIB driver as an "endpt" (the function __endpt_mgr_ref should
>> return it immediately).
>
>Right now, IPoIB doesn't sort endpoints by IP address - only GID, LID, and
>MAC.  The code only supports a 1:1:1 mapping at the moment, so a MAC can
>only have one LID and GID, each LID can only have one associated MAC and
>GID, etc.  Is this something we need to change?  If we add IP to the
>endpoints, we will need at least support for many IPs to one endpoint.
>
>An alternative to adding IP support to IPoIB is to use the network stack to
>get us from IP to Ethernet MAC, and then perform a lookup based on that.
>It is a bit more cumbersome, but certainly possible.
>
I believe that this is the straightforward way and this is what should be
used. Since there is a function that converts the remote IP to a MAC
address, I believe that we should use it and do the remote query based on
the MAC address obtained through ARP.

>> As for translating the two GIDs to a path: this can be easily done by
>> each ULP and is not something specific to the IPoIB driver. Still, each
>> ULP has to do it, so it might be a good idea to also do the path lookup.
>> My feeling is that it makes things somewhat complicated, and we can avoid
>> it, but as I said, it can be done.
>
>I think supporting just the lookup from IP (or MAC) pair to GID pair is
>good enough initially.  We can extend this later to add support for
>returning the path so that we save on SA queries, or we can add caching to
>the access layer for common SA records (like paths, node info, port info,
>etc).
>
>> This also raises the question of doing it synchronously or
>> asynchronously (which will make things even more complicated).
>
>Assuming we're dealing with IRPs here and not a direct call interface, sync
>or async is trivial - all we need to do is follow IRP processing rules.  If
>the request is going to take some time we need to mark the IRP pending and
>return (this is critical to allow clients to call this at DISPATCH_LEVEL).
>Once the request completes, complete the IRP.  It is then up to the caller
>to decide whether to block waiting for a completion or use I/O completion
>notifications.
>
>I would expect that for lookups into the IPoIB adapter's internal cache,
>the operations would pretty much always complete immediately.  If we ever
>add logic in IPoIB to send ARPs to resolve missing entries, asynchronous
>processing will likely be necessary.  In any case, the only requirement on
>us is that we don't perform any blocking operations in any of the IOCTL
>paths.
>
The function that will convert the remote IP to the remote MAC will very
likely do the ARP by itself, so I don't expect any blocking operation.

>> Path validation - I'm not sure that I understand the question. If I have
>> a path to validate, I can simply ask for a new path to be created and
>> compare them, or is there anything else to this?
>
>The idea of path validation is that if you receive a connection request
>that claims to have a certain source and destination IP address pair, the
>recipient needs to validate that those IP addresses are valid for the path
>record provided by the CM in the REQ callback.  Basically, the code would
>do a lookup of the source and destination IP addresses and compare the
>resulting GIDs to the source and destination GIDs in the path record
>provided by the CM.  This provides a level of validation to prevent
>malicious apps from pretending to have certain trusted IP addresses to
>establish a connection when they otherwise should not be allowed to.
>
>However, thinking about it more, there's nothing that prevents the
>application from doing the IP to GID lookup itself, and then comparing the
>results to the path it received.  So please disregard this "feature".
>
>> So to summarize:
>> 1)     Get all local IP addresses. We can do it or not do it; I
>> recommend doing it in IPoIB. As the information changes, it would be
>> better to create some notification mechanism, but I suggest not
>> implementing this mechanism at the start.
>
>IPoIB already has this information, so I agree we should provide a way to
>get to it - the WSD provider certainly needs it.  We don't need to
>implement any notification mechanisms for when the addresses change, as the
>existing stack already handles that.  Specifically, it doesn't matter for
>the DAPL case because DAPL is static.  For WSD, the switch already polls
>the providers when it detects that an update is needed.
>
>The alternative is to use the IP helper functions, but there are missing
>headers from the DDK that prevent that from working properly without
>manually copying headers from the platform SDK into the DDK directories
>(<mprapi.h> is one such header).
>
>I think we probably want two calls here.  The first would return all CA and
>port GUIDs in use by IPoIB, with no input parameter.  The output would be
>something like an array of <CA GUID, PORT NUMBER, PORT GUID> tuples.  This
>allows a client to open the CA (using the CA GUID), create a QP bound to
>the proper port (using the port number), and query for IP addresses (using
>the port GUID).
>
>The second call would take as input a port GUID, and return an array of IP
>addresses assigned to that port.  The array entries should probably be 16
>bytes to accommodate IPv6 addresses, and use the standard method of storing
>IPv4 addresses (prefix with zero).
>
>Does that make sense?
>
This does make sense; however, there is one thing that we might consider
changing: it seems to me that the application will first make the first call
and later query all the ports to get their IPs. Therefore I believe that we
could have one function that returns all this information and save some time
crossing from user mode to kernel. I'm not really sure that the code would
look better.
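
To make the single-call idea concrete, here is a rough sketch of what one
returned record could look like. Everything in it is an assumption for
discussion only: the names ipoib_port_info_t, store_ipv4 and port_add_ipv4,
the fixed MAX_IPS_PER_PORT limit, and my reading of "prefix with zero" as 12
zero bytes followed by the 4 IPv4 bytes. None of this exists in the IPoIB
code today.

```c
#include <stdint.h>
#include <string.h>

#define MAX_IPS_PER_PORT 4  /* arbitrary limit, chosen only for this sketch */

/* Hypothetical combined record: one per IPoIB port, with its IPs inline. */
typedef struct ipoib_port_info {
    uint64_t ca_guid;                     /* used to open the CA */
    uint64_t port_guid;                   /* identifies the port */
    uint8_t  port_num;                    /* used to bind the QP */
    uint8_t  ip_count;                    /* number of valid entries in ip[] */
    uint8_t  ip[MAX_IPS_PER_PORT][16];    /* 16-byte slots, IPv6-sized */
} ipoib_port_info_t;

/* Store an IPv4 address (network byte order) in a 16-byte slot.
 * Assumed layout: 12 zero bytes, then the 4 address bytes. */
static void store_ipv4(uint8_t slot[16], const uint8_t v4[4])
{
    memset(slot, 0, 12);
    memcpy(slot + 12, v4, 4);
}

/* Append an IPv4 address to a port record.
 * Returns 0 on success, -1 if the record is already full. */
static int port_add_ipv4(ipoib_port_info_t *p, const uint8_t v4[4])
{
    if (p->ip_count >= MAX_IPS_PER_PORT)
        return -1;
    store_ipv4(p->ip[p->ip_count], v4);
    p->ip_count++;
    return 0;
}
```

With a record like this the client gets the CA GUID, port number, port GUID
and addresses in one round trip, at the cost of a fixed upper bound on the
number of addresses per port (or a variable-length output buffer instead).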

>> 2)     Get remote IP - we must implement.
>
>Assuming this takes a source and destination IP as input and returns a
>source and destination GID as output, I agree.
>
>How does something like this for the IOCTL input and output buffers sound:
>
>struct _IPOIB_AT_IN
>{
>	UCHAR		SrcIp[16];
>	UCHAR		DstIp[16];
>};
>struct _IPOIB_AT_OUT
>{
>	GID		SrcGid;
>	GID		DstGid;
>};
>
As I said before, I believe that the functionality we are really looking for
is the remote MAC address translated into a remote LID. (This is the
information that we have.) So the interface should look like:
struct _IPOIB_AT_IN
{
	UCHAR		DstMac[6]; // Do we want this longer ?
};
struct _IPOIB_AT_OUT
{
	GID		DstGid;
};

Does this make sense?
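
A minimal sketch of how a lookup behind this interface might behave, assuming
a simple table of MAC/GID pairs. The GID and UCHAR typedefs, endpt_entry_t
and ipoib_at_lookup are stand-ins invented for illustration; the real driver
would go through its own endpoint manager and not a linear scan.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint8_t UCHAR;                    /* stand-in for the Windows type */
typedef struct { uint8_t raw[16]; } GID;  /* stand-in for a 128-bit IB GID */

struct _IPOIB_AT_IN  { UCHAR DstMac[6]; };
struct _IPOIB_AT_OUT { GID   DstGid;    };

/* Hypothetical cache entry: one remote endpoint, 1:1 MAC to GID. */
typedef struct endpt_entry {
    UCHAR mac[6];
    GID   gid;
} endpt_entry_t;

/* Look up the GID for the MAC in 'in'.
 * Returns 0 and fills 'out' on a hit, -1 if the MAC is not known. */
static int ipoib_at_lookup(const endpt_entry_t *tbl, size_t n,
                           const struct _IPOIB_AT_IN *in,
                           struct _IPOIB_AT_OUT *out)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (memcmp(tbl[i].mac, in->DstMac, sizeof(in->DstMac)) == 0) {
            out->DstGid = tbl[i].gid;
            return 0;
        }
    }
    return -1;
}
```

A miss would be reported to the caller as an error status, since with this
interface IPoIB itself never sends an ARP.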

>> 3)     Get/verify path - we can implement it, but I believe that we
>> shouldn't do it.
>
>Yep, this isn't necessary - with the IP to GID lookup support an
>application can easily implement this functionality itself.
>
>> Appendix A - How to get all the local IP addresses (and even to map them
>> to InfiniBand ports):
>> Windows provides the function GetAdaptersInfo, which gives all IP adapters
>> as well as their local IPs. For each adapter there is a "name" that is
>> actually a GUID that uniquely describes the adapter. In the registry there
>> is a key named:
>> HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Network\{4D36E972-E325-11CE-BFC1-08002BE10318}
>> that has an entry for each adapter. For each of the adapters there is a
>> subkey called Connection. Under this there is a value called
>> PnpInstanceID; this value starts with IBA\IPOIB for InfiniBand devices, so
>> we know that this is "our" card. Next there is the GUID of the port.
>
>Thanks for the information - the documentation doesn't really explain what
>the name represents.  So I take it that given the name, we could query the
>registry to find out if it's an IPoIB device, and if so we could even
>extract the port GUID from the PnpInstanceId.
>
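
The "is it our card" test on the PnpInstanceID value can be sketched as a
plain prefix check. The function name is_ipoib_instance_id is made up for
this example, and comparing case-insensitively is an assumption on my side.
The layout of the port GUID after the prefix is not spelled out here, so the
sketch does not try to parse it.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if 'id' starts with "IBA\IPOIB" (compared case-insensitively,
 * which is an assumption), 0 otherwise or if 'id' is NULL. */
static int is_ipoib_instance_id(const char *id)
{
    static const char prefix[] = "IBA\\IPOIB";
    size_t i;

    if (id == NULL)
        return 0;
    for (i = 0; prefix[i] != '\0'; i++) {
        /* A shorter 'id' ends here: its '\0' never matches the prefix. */
        if (toupper((unsigned char)id[i]) != prefix[i])
            return 0;
    }
    return 1;
}
```

The string itself would first have to be read from the Connection subkey of
each adapter with the usual registry calls.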
>> Please note that since adapters (and more likely IP addresses) change
>> with time, it is not that simple to cache the information in user mode.
>
>The NotifyAddrChange IPHelper function can provide the notification
>mechanism needed.  For the case of WSD, however, the switch can be used to
>drive this.
>
>Thanks for the details!
>
>- Fab