[ofa-general] Couple of questions about OpenSM

Nicolas Morey-Chaisemartin devel at morey-chaisemartin.com
Fri Mar 20 09:03:47 PDT 2009


Sasha Khapyorsky wrote:
> Hi Nicolas,
> 
> On 11:12 Thu 19 Mar     , Nicolas Morey Chaisemartin wrote:
>> First question and an easy one:
>> By default, what optimization options is OpenSM compiled with?
> 
> It is defined in autoconf I guess.
> 
>> Building from git without any specific options, there is no -O2 or similar, so none of the inline functions are actually inlined, which hurts performance badly (a few million non-inlined calls to osm_switch_get_least_hops consume over 15% of the computing time).
> 
> In my setup I have in /usr/share/autoconf/autoconf/c.m4:
> 
>   if test "$GCC" = yes; then
>     CFLAGS="-g -O2"
> 
> And -O2 is turned on by default.

This doesn't seem to be the case for me. I'll check where that comes from.
> 
>> Next one and a bit harder:
>> In Fat-Tree routing we have a 2D array for the hop table (destination lid/port num). Why is this table allocated on demand and not all at once?
> 
> Yes, I think it can be preallocated. Just note that this should not have
> port arrays for each lid in a fabric, but only for switches.

Good. There should be one table for each switch, and each of them has one entry per (port, lid) pair, no?
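
To make the preallocation idea concrete, here is a minimal sketch of what I have in mind: one flat table per switch, allocated and initialized once, indexed by (lid, port). All the names here (struct hop_table, hop_table_init, HOPS_UNREACHABLE and so on) are hypothetical and only for illustration; they are not OpenSM's actual structures or API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HOPS_UNREACHABLE 0xFF /* hypothetical "no path" marker */

/* Hypothetical per-switch hop table: one byte per (lid, port) pair. */
struct hop_table {
	uint16_t max_lid;  /* highest LID covered by the table */
	uint8_t num_ports; /* number of ports on this switch */
	uint8_t *hops;     /* (max_lid + 1) * num_ports entries */
};

/* Allocate and initialize the whole table once; returns 0 on success. */
int hop_table_init(struct hop_table *t, uint16_t max_lid, uint8_t num_ports)
{
	size_t n = ((size_t)max_lid + 1) * num_ports;

	t->hops = malloc(n);
	if (!t->hops)
		return -1;
	memset(t->hops, HOPS_UNREACHABLE, n);
	t->max_lid = max_lid;
	t->num_ports = num_ports;
	return 0;
}

/* No allocation on the hot path: just an index computation. */
void hop_table_set(struct hop_table *t, uint16_t lid, uint8_t port,
		   uint8_t hops)
{
	assert(lid <= t->max_lid && port < t->num_ports);
	t->hops[(size_t)lid * t->num_ports + port] = hops;
}

uint8_t hop_table_get(const struct hop_table *t, uint16_t lid, uint8_t port)
{
	assert(lid <= t->max_lid && port < t->num_ports);
	return t->hops[(size_t)lid * t->num_ports + port];
}

/* Smallest hop count over all ports for the given LID. */
uint8_t hop_table_least(const struct hop_table *t, uint16_t lid)
{
	uint8_t best = HOPS_UNREACHABLE;
	uint8_t port;

	assert(lid <= t->max_lid);
	for (port = 0; port < t->num_ports; port++) {
		uint8_t h = t->hops[(size_t)lid * t->num_ports + port];
		if (h < best)
			best = h;
	}
	return best;
}

void hop_table_destroy(struct hop_table *t)
{
	free(t->hops);
	t->hops = NULL;
}
```

With this layout a routing pass would call hop_table_init() once per switch before computing routes, and the set/get paths contain no malloc or memset at all. The range checks are written as assert() here only to show that they could be compiled out once we are sure the LID range cannot change during routing.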

>> And is it really necessary to check each time that the lid we use is not greater than max_lid_ho?
> 
> Maybe not, but I will need to check carefully.
> 
>> The only reason I can see for this is if a new node/switch with a bigger lid were added to the fabric while OpenSM is routing. In such a case, wouldn't a lock protect the variables so that lids can't appear/disappear while it calculates the routes?
> 
> Routing calculation cannot happen in parallel with discovery (it is all
> serialized in do_sweep() function), we should be protected at least in
> this part.
> 
>> If yes, we could allocate all and skip a lot of checks. We have millions of calls to malloc and memset in osm_switch_set_hops plus tests in get_hops/get_least_hops.
> 
> malloc() calls are conditional, there could be many checks, but only
> "needed" amount of malloc() itself.
> 
We don't allocate more than necessary, but making millions of malloc() calls still costs a lot of time.

>> This may cost a bit more memory,
> 
> Min hop's port arrays preallocation should not cost any extra memory (if
> done properly - for switches only) - we are allocating all needed buffers
> in routing calculation time anyway.
> 
>> but easily gain 15% on routing computation time.
> 
> Well, I'm skeptical about 15% :). But it doesn't really matter - even
> 1% performance gain and/or cleaner code would be a nice improvement.
> 
Well, about 15% of the computation time (for Ftree at least) is spent in set_hops, specifically in the malloc/memset part.
If we do it only once at the start, it should be much faster!

By the way, is there a reason you don't use likely/unlikely macros in conditions?
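
For reference, this is the kind of thing I mean: a minimal sketch built on GCC's __builtin_expect, in the style of the Linux kernel macros. The function get_hops_checked and the NO_PATH value are hypothetical and only illustrate a bounds check that is almost never taken; this is not existing OpenSM code.

```c
#include <stddef.h>
#include <stdint.h>

/* Branch hints: use the GCC builtin when available, otherwise do nothing. */
#if defined(__GNUC__)
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
#define likely(x)   (x)
#define unlikely(x) (x)
#endif

#define NO_PATH 0xFF /* hypothetical "no path" marker */

/*
 * Hypothetical lookup: the out-of-range case is marked as unlikely so the
 * compiler can lay out the common path as the fall-through branch.
 */
uint8_t get_hops_checked(const uint8_t *hops, uint16_t max_lid, uint16_t lid)
{
	if (unlikely(hops == NULL || lid > max_lid))
		return NO_PATH;
	return hops[lid];
}
```

The hint does not change the result, only how the compiler arranges the branches, so whether it helps here would have to be measured.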


Regards

Nicolas



