[openib-general] IPoIB IETF WG presentation updated again

Dror Goldenberg gdror at mellanox.co.il
Wed Aug 4 04:15:57 PDT 2004


>-----Original Message-----
>From: Hal Rosenstock [mailto:halr at voltaire.com]
>Sent: Tuesday, August 03, 2004 5:13 PM

>Hi Dror,
> 
>> It's good to see that MAX_ADDR_LEN has been changed to 32. Does that solve
>> all the IPoIB ARP related problems for 2.6 kernel ? Can we store all related
>> link information in this 32 bytes ? What is envisioned to be stored in this
>> 32 bytes - is it just the QPN+GID, or the entire path info, or the address
>> vector object too ?
> 
>If it holds 32 bytes, then it can hold GID + QPN with 13 bytes still available.
> 
>Other information you might want to hold:
>SL (1 byte)
>LID (2 bytes)
>MTU (for connected mode) (1 byte)
>Rate (1 byte)
>Network Layer:
>  Flow Label (3 bytes; 20 bits)
>  Hop Limit (1 byte)
>  TClass (1 byte)
> 

Source Path bits (1 byte)?

>So all the info for an AV could be stored there. Did I miss something needed ?
>I didn't double check this but there is still some room left over.
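
A quick way to double check the byte count above is to let the compiler do it. This is only a sketch: the struct name, the field names and the field order are hypothetical, and it assumes a 16-byte GID, a 3-byte QPN, and one byte for the source path bits, on top of the sizes listed in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ADDR_LEN 32 /* value discussed in this thread */

/*
 * Hypothetical layout, for byte counting only. Every member is a byte
 * array or a single byte, so the compiler inserts no padding.
 */
struct ipoib_hw_addr_sketch {
    uint8_t gid[16];       /* GID, 128 bits */
    uint8_t qpn[3];        /* QPN, 24 bits */
    uint8_t sl;            /* SL, 1 byte */
    uint8_t lid[2];        /* LID, 2 bytes */
    uint8_t mtu;           /* MTU (for connected mode), 1 byte */
    uint8_t rate;          /* Rate, 1 byte */
    uint8_t flow_label[3]; /* Flow Label, 20 bits stored in 3 bytes */
    uint8_t hop_limit;     /* Hop Limit, 1 byte */
    uint8_t tclass;        /* TClass, 1 byte */
    uint8_t src_path_bits; /* Source Path bits, 1 byte */
};

/* Fails at compile time if the fields no longer fit in the HW address. */
_Static_assert(sizeof(struct ipoib_hw_addr_sketch) <= MAX_ADDR_LEN,
               "AV fields do not fit in MAX_ADDR_LEN");

/* Number of bytes still unused in a MAX_ADDR_LEN-sized HW address. */
static size_t hw_addr_spare_bytes(void)
{
    return MAX_ADDR_LEN - sizeof(struct ipoib_hw_addr_sketch);
}
```

With these assumed sizes, GID + QPN take 19 bytes (leaving the 13 mentioned above), the remaining fields take 11, and 2 bytes are left over.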

This is the information for the AV. However, you don't want to create an AV
for each packet that you send. Although in Mellanox devices the creation of
AVs is relatively inexpensive (compared with other resources like QPs and
CQs), I don't think that it's the right way to go. I think the right way is
to store the AV handle in the ARP table as part of the HW address. And here
comes the problem...
If you're not notified when an ARP table entry is invalidated, then you
cannot really destroy the AV handle. So you now have to maintain some state
yourself. I was wondering if there is a way to solve that.
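
To make the "maintain some state yourself" point concrete, here is a minimal sketch of such a shadow table. Everything in it is hypothetical: the names, the fixed-size array, and the two stub functions that stand in for the real create/destroy AV verbs. It only illustrates that the AV is created once per neighbor and can be destroyed only if someone calls the invalidate hook.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ADDR_LEN 32  /* HW address length discussed in this thread */
#define SHADOW_SLOTS 8   /* arbitrary table size for the sketch */

typedef int av_handle_t; /* stand-in for a real HCA address handle */

struct shadow_entry {
    unsigned char hw_addr[MAX_ADDR_LEN];
    av_handle_t av;
    int in_use;
};

struct shadow_table {
    struct shadow_entry slot[SHADOW_SLOTS];
    int live_avs;    /* AVs created and not yet destroyed */
    int next_handle; /* last handle value handed out by the stub */
};

/* Stubs standing in for the real create/destroy AV verbs. */
static av_handle_t stub_create_av(struct shadow_table *t)
{
    t->live_avs++;
    return ++t->next_handle;
}

static void stub_destroy_av(struct shadow_table *t, av_handle_t av)
{
    (void)av;
    t->live_avs--;
}

static void shadow_init(struct shadow_table *t)
{
    memset(t, 0, sizeof(*t));
}

/* Return the AV for a neighbor, creating it only on first use.
 * Returns -1 if the table is full. */
static av_handle_t shadow_get_av(struct shadow_table *t,
                                 const unsigned char *hw_addr)
{
    struct shadow_entry *free_slot = NULL;

    for (size_t i = 0; i < SHADOW_SLOTS; i++) {
        struct shadow_entry *e = &t->slot[i];

        if (e->in_use && memcmp(e->hw_addr, hw_addr, MAX_ADDR_LEN) == 0)
            return e->av; /* datapath hit: no AV creation */
        if (!e->in_use && free_slot == NULL)
            free_slot = e;
    }
    if (free_slot == NULL)
        return -1;

    memcpy(free_slot->hw_addr, hw_addr, MAX_ADDR_LEN);
    free_slot->av = stub_create_av(t);
    free_slot->in_use = 1;
    return free_slot->av;
}

/* Hook to call when the ARP entry is invalidated.
 * Returns 1 if an AV was destroyed, 0 if the neighbor was unknown. */
static int shadow_invalidate(struct shadow_table *t,
                             const unsigned char *hw_addr)
{
    for (size_t i = 0; i < SHADOW_SLOTS; i++) {
        struct shadow_entry *e = &t->slot[i];

        if (e->in_use && memcmp(e->hw_addr, hw_addr, MAX_ADDR_LEN) == 0) {
            stub_destroy_av(t, e->av);
            e->in_use = 0;
            return 1;
        }
    }
    return 0;
}
```

If nothing ever calls shadow_invalidate(), live_avs only grows, which is exactly the problem described above.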

> 
>> I think that ideally, if a network device can replace the ARP functionality
>> in the kernel that'll be better. Because this way the IPoIB can get an
>> address resolution request from the IP stack, handle it by sending an ARP,
>> then SA query for the path record, then creation of HCA address handle, and
>> then place it in cache and pass back this address handle. When cache is
>> replaced or expires, IPoIB will destroy the HCA address handle.
>> If this is not supported, then IPoIB will still need to maintain a shadow
>> table.
> 
>Cloning an AH is probably faster than creating a new one from scratch. (We
>would need an additional verb for this). How much does this cost ? Is this
>optimization worth it ?

In Tavor the creation/modification of an AV is relatively inexpensive. It
involves writing the AV information to the attached DRAM or to main memory.
For userland applications, it may be more complicated in some modes of
operation. Writing an AV to DRAM involves PIO writes, which you probably want
to avoid on the datapath. So, you'd better have an AV ready and persistent
for each neighbor, as long as your cache is long enough.
BTW, I think that modification will cost you almost the same as new
creation.

>
>> Beyond that, it'll be nice if we could have gotten the IP datagram without
>> the "Ethernet" header. Currently the IPoIB driver has to chop it, and
>> replace it with the IPoIB encapsulation header. Anyway, this is just the
>> purity of the protocol stack layering.
> 
>There would need to be another way to identify the various protocols (aka
>ethertypes) being carried.
> 
Yes. That was what Roland mentioned. Implement hard_header.

-Dror

