[ofa-general] Re: [PATCH] opensm: osm_state_mgr.c - stop idle queue processing if heavy sweep requested

Yevgeny Kliteynik kliteyn at dev.mellanox.co.il
Tue Dec 18 23:40:08 PST 2007


Sasha Khapyorsky wrote:
> Hi Yevgeny,
> 
> On 15:33 Mon 17 Dec     , Yevgeny Kliteynik wrote:
>> If a heavy sweep is requested during idle queue processing, OSM continues
>> to process the queue to the end and only then notices the heavy sweep request.
>> In some cases this might leave a topology change unhandled for several
>> minutes.
> 
> Could you provide more details about such cases?
> 
> As far as I know the idle queue is used only for multicast re-routing.
> If so, it is interesting by itself why it takes minutes and where. Is
> there an MCG join/leave storm?

Exactly. The problem was discovered on a big cluster with hundreds of mcast groups,
when there was a massive change in the subnet (such as rebooting hundreds of nodes).
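
To illustrate the behavior the patch is after, here is a minimal standalone
sketch. It is not OpenSM code: struct sm_state, struct idle_queue,
process_one_idle_item(), drain_idle_queue() and the sweep_after field are all
hypothetical names made up for this example. The heavy_sweep_requested flag
only stands in for force_immediate_heavy_sweep. The point is that the flag is
checked before each queue item is taken, so a sweep request stops the draining
instead of waiting for the whole queue to empty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not OpenSM types. */
struct idle_queue {
	size_t pending;              /* items still waiting in the queue */
};

struct sm_state {
	struct idle_queue queue;
	bool heavy_sweep_requested;  /* stands in for force_immediate_heavy_sweep */
	size_t sweep_after;          /* simulation only: request arrives after N items */
};

/* Process one idle item. Returns false when the queue is empty. */
static bool process_one_idle_item(struct sm_state *sm)
{
	if (sm->queue.pending == 0)
		return false;
	sm->queue.pending--;         /* e.g. one multicast re-routing request */
	return true;
}

/*
 * Drain the idle queue, but stop as soon as a heavy sweep is requested.
 * The flag is tested before each item is taken, so the remaining items
 * stay queued until after the sweep. Returns the number of items processed.
 */
size_t drain_idle_queue(struct sm_state *sm)
{
	size_t done = 0;

	assert(sm != NULL);

	while (!sm->heavy_sweep_requested && process_one_idle_item(sm)) {
		done++;
		/* Simulated topology change arriving mid-processing. */
		if (done == sm->sweep_after)
			sm->heavy_sweep_requested = true;
	}
	return done;
}
```

With 300 pending items and a request simulated after the 5th item, the loop
processes 5 items and leaves 295 queued. Without the check in the loop
condition it would process all 300 before the sweep request is noticed.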

-- Yevgeny

> Or does a single re-routing cycle take minutes?
> 
> Sasha
> 
>> Signed-off-by:  Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
>> ---
>>  opensm/opensm/osm_state_mgr.c |   31 ++++++++++++++++++++++++-------
>>  1 files changed, 24 insertions(+), 7 deletions(-)
>>
>> diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
>> index 5c39f11..6ee5ee6 100644
>> --- a/opensm/opensm/osm_state_mgr.c
>> +++ b/opensm/opensm/osm_state_mgr.c
>> @@ -1607,13 +1607,30 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
>>  				/* CALL the done function */
>>  				__process_idle_time_queue_done(p_mgr);
>>
>> -				/*
>> -				 * Set the signal to OSM_SIGNAL_IDLE_TIME_PROCESS
>> -				 * so that the next element in the queue gets processed
>> -				 */
>> -
>> -				signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
>> -				p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
>> +				if (p_mgr->p_subn->force_immediate_heavy_sweep) {
>> +					/*
>> +					 * Do not read next item from the idle queue.
>> +					 * Immediate heavy sweep is requested, so it's
>> +					 * more important.
>> +					 * Besides, there is a chance that after the
>> +					 * heavy sweep completion, idle queue processing
>> +					 * that SM would have performed here will be obsolete.
>> +					 */
>> +					if (osm_log_is_active(p_mgr->p_log, OSM_LOG_DEBUG))
>> +						osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
>> +						"osm_state_mgr_process: "
>> +						"interrupting idle time queue processing - heavy sweep requested\n");
>> +					signal = OSM_SIGNAL_NONE;
>> +					p_mgr->state = OSM_SM_STATE_IDLE;
>> +				}
>> +				else {
>> +					/*
>> +					 * Set the signal to OSM_SIGNAL_IDLE_TIME_PROCESS
>> +					 * so that the next element in the queue gets processed
>> +					 */
>> +					signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
>> +					p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
>> +				}
>>  				break;
>>
>>  			default:
>> -- 
>> 1.5.1.4
>>
> 



