[ofa-general] [PATCH] opensm/osm_sa_mcmember_record.c: fix mcgrp cleanup crash

Sasha Khapyorsky sashak at voltaire.com
Fri Sep 18 06:44:33 PDT 2009


When multiple MCMember leave requests are issued, the request processors
may run concurrently. This opens a race between removing a port from a
multicast group and deleting that group (once it becomes empty) in
different processors. As a result we get a double free crash
(reproducible under stress testing).

This patch fixes the race by moving the multicast group cleanup call
into the same lock-protected block as the port removal code.

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
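
Note (not part of the commit): the locking pattern can be sketched
outside OpenSM as below. This is only an illustration under stated
assumptions. struct group, struct subnet, group_cleanup(),
leave_worker() and run_leaves() are hypothetical stand-ins, and a plain
pthread mutex plays the role of sa->p_lock. The point it tries to show
is that the member removal and the "free if empty" check happen in one
critical section, so a group pointer is never used after the lock that
protected it has been dropped.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a multicast group (not osm_mgrp_t). */
struct group {
	int members;		/* ports still joined to the group */
};

/* Hypothetical stand-in for the object that owns the group. */
struct subnet {
	pthread_mutex_t lock;	/* plays the role of sa->p_lock */
	struct group *grp;	/* NULL once the group is deleted */
	int frees;		/* how many times the group was freed */
};

/* Free the group if it has become empty. Caller must hold sn->lock. */
static void group_cleanup(struct subnet *sn)
{
	if (sn->grp && sn->grp->members == 0) {
		free(sn->grp);
		sn->grp = NULL;
		sn->frees++;
	}
}

/* One leave request: removal and cleanup share a single critical section. */
static void *leave_worker(void *arg)
{
	struct subnet *sn = arg;

	pthread_mutex_lock(&sn->lock);
	if (sn->grp && sn->grp->members > 0) {
		sn->grp->members--;	/* "remove port" */
		group_cleanup(sn);	/* still under the same lock */
	}
	pthread_mutex_unlock(&sn->lock);

	/* A response would be sent here, without touching the group. */
	return NULL;
}

/* Run n concurrent leaves on a group of n members; return the free count. */
static int run_leaves(int n)
{
	struct subnet sn = { .grp = NULL, .frees = 0 };
	pthread_t threads[64];
	int i, started = 0;

	if (n < 1 || n > 64)
		return -1;
	sn.grp = malloc(sizeof(*sn.grp));
	if (!sn.grp)
		return -1;
	sn.grp->members = n;
	pthread_mutex_init(&sn.lock, NULL);

	for (i = 0; i < n; i++) {
		if (pthread_create(&threads[started], NULL,
				   leave_worker, &sn) == 0)
			started++;
		else
			leave_worker(&sn);	/* fall back to an inline run */
	}
	for (i = 0; i < started; i++)
		pthread_join(threads[i], NULL);

	assert(sn.grp == NULL);	/* the last leave deleted the group */
	pthread_mutex_destroy(&sn.lock);
	return sn.frees;
}
```

Calling run_leaves(8) from a test program built with -pthread should
report exactly one free. Re-taking the lock later and cleaning up through
a pointer saved before the unlock would reintroduce the window that this
patch closes.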
 opensm/opensm/osm_sa_mcmember_record.c |    4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c
index b580eb7..f291bf0 100644
--- a/opensm/opensm/osm_sa_mcmember_record.c
+++ b/opensm/opensm/osm_sa_mcmember_record.c
@@ -964,13 +964,11 @@ static void mcmr_rcv_leave_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw)
 	/* remove port and/or update join state */
 	osm_mgrp_remove_port(sa->p_subn, sa->p_log, p_mgrp, p_mcm_port,
 			     &mcmember_rec);
+	osm_mgrp_cleanup(sa->p_subn, p_mgrp);
 	CL_PLOCK_RELEASE(sa->p_lock);
 
 	mcmr_rcv_respond(sa, p_madw, &mcmember_rec);
 
-	CL_PLOCK_EXCL_ACQUIRE(sa->p_lock);
-	osm_mgrp_cleanup(sa->p_subn, p_mgrp);
-	CL_PLOCK_RELEASE(sa->p_lock);
 Exit:
 	OSM_LOG_EXIT(sa->p_log);
 }
-- 
1.6.5.rc1



