[openib-general] [PATCH] osm: PathRecord prefer 1K MTU for MT23108 devices

Eitan Zahavi eitan at mellanox.co.il
Mon Sep 18 13:17:20 PDT 2006


Rimmer, Todd wrote:

>>From: Michael S. Tsirkin [mailto:mst at mellanox.co.il]
>>Sent: Monday, September 18, 2006 2:06 PM
>>To: Rimmer, Todd
>>Cc: Eitan Zahavi; Or Gerlitz; OPENIB
>>Subject: Re: [openib-general] [PATCH] osm: PathRecord prefer 1K MTU
>>for MT23108 devices
>>
>>Quoting r. Rimmer, Todd <trimmer at silverstorm.com>:
>>>Subject: RE: [openib-general] [PATCH] osm: PathRecord prefer 1K MTU
>>>for MT23108 devices
>>>
>>>>From: Eitan Zahavi [mailto:eitan at mellanox.co.il]
>>>>Sent: Monday, September 18, 2006 11:20 AM
>>>>To: Rimmer, Todd
>>>>Cc: Or Gerlitz; Michael S. Tsirkin; OPENIB
>>>>Subject: Re: [openib-general] [PATCH] osm: PathRecord prefer 1K MTU
>>>>for MT23108 devices
>>>>
>>>>Hi Todd,
>>>>
>>>>Seems like your knowledge about the specific MTU best for the
>>>>application (MPI) you are running is good enough such that you will
>>>>be able to include the MTU in the PathRecord request, and thus the
>>>>patch described here will not affect your MPI at all.
>>>>The patch only applies if your request does not provide any MTU &
>>>>MTU SEL comp_mask
>>>>
>>>Eitan,
>>>
>>>The question is not about "our MPI"; rather, it's to ensure the Open
>>>Fabrics and OFED included MPIs and ULPs are capable of being tuned
>>>for optimal performance.  When a fabric runs more than 1 application,
>>>it's necessary to be able to tune this at the MPI, SDP, etc. level,
>>>not at the SM level.
>>>
>>We did not remove this ability at all. So it's there.
>>
>>>In order to be complete, this patch would need to
>>>include ULP level tunability in all the relevant ULPs (MPI, SDP,
>>>uDAPL, etc) to select the "MAX MTU" to use or to request.
>>>
>>This tunability is already there - that's what MTU selector in path
>>queries does.
>>
>>>This then begs the question, if proper tuning requires all the ULPs
>>>to have a configurable MAX MTU, why should the SA need to implement
>>>the quirk at all?
>>>
>>If ULP wants MAX MTU, it must set MTU selector to 3 in path query.
>>
>>If MTU selector is disabled in the query, SM will guess which MTU is
>>best to select. SM used a specific heuristic to perform that guess.
>>All we did is provide an option to use a different heuristic.
>>
>>This is useful because SM has data on the whole fabric, as opposed to
>>ULPs, which often only have data on the endnode.
>>
>
>The patch you submitted only modified OpenSM.  So please show me the
>patch where MVAPICH, Open MPI, SDP, SRP and other ULPs allow this to be
>tuned by the user or application.  Lacking that patch, all the "if a ULP
>wants" statements above are moot.  The goal is for OFED to provide a
>high performance standard solution.  If end users must modify the ULPs'
>source code to achieve that goal, OFED misses the mark.
>  
>
To the best of our knowledge, the change that automatically selects 1K
MTU for the above ULPs improves their performance.
Do you have any measurements on OFED 1.1 that show otherwise? In which
cases?
If so, then you basically do not have to do anything, and everything
works just as it used to.
But if we are correct, then a user can create an OpenSM cache file,
set enable_quirks to TRUE and restart the SM.
I am sure you can imagine this patch was not just another way for us
to spend our time...
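
To make the behaviour under discussion concrete, here is a rough sketch in C of the selection logic as described in this thread. It is not the OpenSM code: the struct, the flag names and choose_path_mtu() are hypothetical, and the treatment of a query that sets only the selector is my assumption. The MTU codes (1 = 256 ... 5 = 4096 bytes) and selector values (0 = greater than, 1 = less than, 2 = exactly, 3 = largest available) follow the PathRecord encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* PathRecord MTU codes: 1 = 256, 2 = 512, 3 = 1024, 4 = 2048, 5 = 4096 bytes. */
enum mtu_code { MTU_256 = 1, MTU_512, MTU_1024, MTU_2048, MTU_4096 };

/* PathRecord MTU selector values. */
enum mtu_selector {
    MTU_SEL_GREATER = 0,
    MTU_SEL_LESS    = 1,
    MTU_SEL_EXACT   = 2,
    MTU_SEL_LARGEST = 3
};

/* Hypothetical stand-ins for the MTU selector and MTU comp_mask bits. */
#define QUERY_HAS_MTU_SEL (1u << 0)
#define QUERY_HAS_MTU     (1u << 1)

/* Hypothetical, trimmed-down view of a PathRecord query. */
struct path_query {
    uint32_t comp_mask;
    enum mtu_selector mtu_sel;
    uint8_t mtu;                /* MTU code, meaningful only if QUERY_HAS_MTU */
};

/* Returns the MTU code to report, or 0 if no MTU satisfies the query. */
uint8_t choose_path_mtu(const struct path_query *q, uint8_t path_max_mtu,
                        bool endpoint_is_mt23108, bool enable_quirks)
{
    bool has_sel = (q->comp_mask & QUERY_HAS_MTU_SEL) != 0;
    bool has_mtu = (q->comp_mask & QUERY_HAS_MTU) != 0;

    assert(path_max_mtu >= MTU_256 && path_max_mtu <= MTU_4096);

    /* A ULP that asks for the largest MTU gets the largest MTU. */
    if (has_sel && q->mtu_sel == MTU_SEL_LARGEST)
        return path_max_mtu;

    /* A ULP that states an explicit constraint is honoured as stated. */
    if (has_sel && has_mtu) {
        switch (q->mtu_sel) {
        case MTU_SEL_GREATER:
            return path_max_mtu > q->mtu ? path_max_mtu : 0;
        case MTU_SEL_LESS:
            if (q->mtu <= MTU_256)
                return 0;
            return path_max_mtu < q->mtu ? path_max_mtu : (uint8_t)(q->mtu - 1);
        case MTU_SEL_EXACT:
            return (q->mtu >= MTU_256 && q->mtu <= path_max_mtu) ? q->mtu : 0;
        default:
            return 0;
        }
    }

    /* No MTU constraint in the query: the SM has to guess.
     * With quirks enabled, prefer 1K when an MT23108 is an endpoint. */
    if (enable_quirks && endpoint_is_mt23108 && path_max_mtu > MTU_1024)
        return MTU_1024;

    return path_max_mtu;
}
```

With enable_quirks left FALSE the last branch never fires, so an unconstrained query still gets the path maximum as before; a query with selector 3 gets the path maximum either way.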

>Todd Rimmer