[openib-general] ip over ib throughtput

Michael Krause krause at cup.hp.com
Thu Jan 6 09:05:00 PST 2005


At 04:43 AM 1/6/2005, Diego Crupnicoff wrote:
>I feel like we are talking about different things here:
>
>The ***IP*** MTU is relevant for IPoIB performance because it determines 
>the number of times that you are going to be hit by the per-packet 
>overhead of the ***host*** networking stack. My point was that the ***IP 
>MTU*** will not be tied to the ***IB*** MTU if a connected mode IPoIB (or 
>SDP) is used instead of the current IPoIB that uses IB UD transport 
>service. The IB MTU would then be irrelevant to this discussion.
>
>As for the eventual 2G ***IP*** MTU limit, it still sounds more than 
>reasonable to me. I wouldn't mind if a 10TB file gets split into some IP 
>packets up to 2GB?!?!? each.

Keep in mind that IP has a limit on its datagram size (normal and jumbo 
datagrams) which is far below 2GB.  IP datagrams are datagrams.  Large 
messages are expected to use segmentation and reassembly (SAR) across a set 
of datagrams to ensure forward progress with minimal impact on overall 
performance in the event of a transmission error.
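
As a rough illustration of the point above, here is a minimal sketch that 
counts how many IPv4 datagrams a message needs at a given IP MTU.  It 
assumes the 16-bit IPv4 Total Length field (65,535-byte ceiling) and a 
20-byte header with no options; the function name and constants are 
hypothetical, chosen only for this example.

```python
# Sketch only: counts datagrams for a message, assuming plain IPv4
# with a 20-byte header (no options). Names here are illustrative.

IPV4_MAX_DATAGRAM = 65_535  # ceiling imposed by the 16-bit Total Length field
IPV4_HEADER = 20            # assumed minimum IPv4 header size, in bytes


def datagrams_needed(message_bytes, ip_mtu, header_bytes=IPV4_HEADER):
    """Return how many datagrams are needed to carry message_bytes."""
    if ip_mtu > IPV4_MAX_DATAGRAM:
        raise ValueError("IP MTU exceeds the IPv4 datagram size limit")
    payload = ip_mtu - header_bytes
    if payload <= 0:
        raise ValueError("IP MTU leaves no room for payload")
    # Integer ceiling division avoids floating-point rounding.
    return -(-message_bytes // payload)


if __name__ == "__main__":
    two_gib = 2**31
    for mtu in (2044, 4092, IPV4_MAX_DATAGRAM):
        print(mtu, datagrams_needed(two_gib, mtu))
```

Whatever the underlying IB message size, a 2GB transfer still has to be 
cut into many datagrams; only the count changes with the IP MTU.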

>(With the exception of the UD transport service where IB messages are 
>limited to be single packet), the choice of ***IB*** MTU and its impact on 
>performance is a completely unrelated issue. IB messages are split into 
>packets and reassembled by the HCA HW. So the per-IB-message overhead of 
>the SW stack is independent of the IB MTU. The choice of IB MTU may indeed 
>affect performance for other reasons but it is not immediately obvious 
>that the largest available IB MTU is the best option for all cases. For 
>example, latency optimization of small high priority packets under load 
>may benefit from smaller IB MTUs (e.g. 256).

This is best handled by VL arbitration.  Changing the IB MTU to 256 for a 
UD-based implementation would violate the IP minimum datagram size 
requirement.
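
A small sketch of the arithmetic behind this, under stated assumptions: a 
4-byte IPoIB encapsulation header on UD, the 576-byte datagram that every 
IPv4 host must be able to accept, and the 1280-byte IPv6 minimum link MTU.  
Which of these minimums is meant above is my reading, and the names below 
are hypothetical.

```python
# Sketch only: compares the IP MTU available over IB UD with common
# IP minimums. The header size and thresholds are assumptions stated
# in the text above, not values taken from this thread.

IPOIB_ENCAP_HEADER = 4      # assumed IPoIB encapsulation header, in bytes
IPV4_MIN_DATAGRAM = 576     # datagram size every IPv4 host must accept
IPV6_MIN_LINK_MTU = 1280    # minimum link MTU required by IPv6


def ip_mtu_over_ud(ib_mtu):
    """IP MTU left after the encapsulation header on a single UD packet."""
    return ib_mtu - IPOIB_ENCAP_HEADER


def check_ib_mtu(ib_mtu):
    """Report whether an IB MTU leaves room for the assumed IP minimums."""
    ip_mtu = ip_mtu_over_ud(ib_mtu)
    return {
        "ip_mtu": ip_mtu,
        "ipv4_ok": ip_mtu >= IPV4_MIN_DATAGRAM,
        "ipv6_ok": ip_mtu >= IPV6_MIN_LINK_MTU,
    }


if __name__ == "__main__":
    for ib_mtu in (256, 512, 1024, 2048, 4096):
        print(ib_mtu, check_ib_mtu(ib_mtu))
```

With a 256-byte IB MTU the usable IP MTU falls below both thresholds, 
which is the violation described above.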

Mike

>
>Diego
>
>-----Original Message-----
>From: Stephen Poole [mailto:spoole at lanl.gov]
>Sent: Thursday, January 06, 2005 5:45 AM
>To: Diego Crupnicoff
>Cc: 'openib-general at openib.org'
>Subject: RE: [openib-general] ip over ib throughtput
>
>Have you done any "load" analysis of a 2K vs. 4K MTU? Your analogy of 
>having 2G as a total message size is potentially flawed. You seem to 
>assume that 2G is the end-all in size; it is not. What about when you want 
>to (down the road) use IB for files 1-10TB in size? Granted, we can 
>live with 2G, but it is not some nirvana number. Second, the 2G limit 
>on message sizes only determines the upper bound in overall size; I 
>could send 2G @ 32-byte MTU. So the question is, how much less of a 
>system load/impact would a 4K MTU be over a 2K MTU? Remember, even 
>Ethernet finally decided to go to Jumbo Frames. Why? System impact and 
>more. Remember HIPPI/GSN: the MTU was 64K. Reason: system impact. The 
>numbers I have seen running IPoIB really impact the system.
>
>Steve...
>
>At 10:38 AM -0800 1/5/05, Diego Crupnicoff wrote:
>>Note however that the relevant IB limit is the max ***message size*** 
>>which happens to be equal to the ***IB*** MTU for the current IPoIB (that 
>>runs on top of IB UD transport service where IB messages are limited to a 
>>single packet).
>>A connected mode IPoIB (that runs on top of IB RC/UC transport service) 
>>would allow IB messages up to 2GB long. That will allow for much larger 
>>(effectively as large as you may ever dream of) ***IP*** MTUs, regardless 
>>of the underlying IB MTU.
>>Diego
>> > -----Original Message-----
>> > From: Hal Rosenstock [mailto:halr at voltaire.com]
>> > Sent: Wednesday, January 05, 2005 2:21 PM
>> > To: Peter Buckingham
>> > Cc: openib-general at openib.org
>> > Subject: Re: [openib-general] ip over ib throughtput
>> >
>> >
>> > On Wed, 2005-01-05 at 12:23, Peter Buckingham wrote:
>> > > stupid question: why are we limited to a 2K MTU for IPoIB?
>> >
>> > The IB max MTU is 4K. The current HCAs support a max MTU of 2K.
>> >
>> > -- Hal
>> >
>> > _______________________________________________
>> > openib-general mailing list
>> > openib-general at openib.org
>> > http://openib.org/mailman/listinfo/openib-general
>> >
>> > To unsubscribe, please visit
>> > http://openib.org/mailman/listinfo/openib-general
>> >
>>
>
>
>
>--
>Steve Poole (spoole at lanl.gov)
>Los Alamos National Laboratory
>CCN - Special Projects / Advanced Development
>P.O. Box 1663, MS B255
>Los Alamos, NM. 87545
>Office: 505.665.9662
>Cell:   505.699.3807
>Fax:    505.665.7793
>03149801S
>