[openib-general] Re: RFC on SDP checkin

Libor Michalek libor at topspin.com
Tue Feb 22 13:48:20 PST 2005


On Mon, Feb 21, 2005 at 08:12:57PM +0200, Michael S. Tsirkin wrote:
> Quoting r. Libor Michalek <libor at topspin.com>:
> > On Thu, Feb 17, 2005 at 04:03:59PM +0200, Michael S. Tsirkin wrote:
> > > Quoting r. Libor Michalek <libor at topspin.com>:
> > > > Possible Issues
> > > > 
> > > >  - Memory locking for AIO requires a call to do_mlock() which is not a
> > > >    kernel exported function, the method for calling the function is not
> > > >    standard.
> > > 
> > > Libor, in my eyes this is the biggest issue with this gen2 sdp code.
> > > In short, I dont think using do_mlock is a good idea.
> > 
> >   I do agree that this is one of the more, if not most, problematic area
> > of the code, and I like your alternative suggestion below.
> 
> BTW, just noted that sdp_iocb.c gets current->mm directly, and keeps
> this pointer cashed instead of calling get_task_mm each time.
> Thus it seems a race condition may occur if the task
> is exiting while the cashed value is used.

  Actually, the AIO code itself, outside of SDP, is incrementing the 
mm reference count for each submitted IO which has a one-to-one mapping
with sdpc_iocb instances. We cannot increment the reference count using
get_task_mm() otherwise the AIO code will run into problems. The 
destruction of the mm, when the reference count is decremented, triggers
the AIO cleanup code, which will not complete until all outstanding AIOs
have been completed or canceled. If we were to increment the mm reference
count using get_task_mm() then a userspace process which has outstanding
AIOs and is terminated would not attempt to destroy it's mm, which would
not cleanup the oustanding AIOs, which would then hang the process.

> I am not fixing it since we agreed we are changing this to get_user_pages
> anyway.

  OK.

-Libor

To: Tom Duffy <tduffy at sun.com>
Subject: Re: [openib-general] [PATCH][SDP][0/22] Whitespace cleanup in SDP
Message-ID: <20050222141603.G23420 at topspin.com>
References: <11087738623823 at sun.com> <20050222122058.A23420 at topspin.com>
	<1109105820.24909.48.camel at duffman>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <1109105820.24909.48.camel at duffman>;
	from tduffy at sun.com on Tue, Feb 22, 2005 at 12:57:00PM -0800
X-OriginalArrivalTime: 22 Feb 2005 22:16:03.0362 (UTC)
	FILETIME=[18B3D020:01C5192C]
X-Spam-Checker-Version: SpamAssassin 2.64 (2004-01-11) on openib.ca.sandia.gov
X-Spam-Level: 
X-Spam-Status: No, hits=0.0 required=5.0 tests=none autolearn=no version=2.64
Cc: openib-general at openib.org
X-BeenThere: openib-general at openib.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: OpenIB General Mailing List <openib-general.openib.org>
List-Unsubscribe: <http://openib.org/mailman/listinfo/openib-general>,
	<mailto:openib-general-request at openib.org?subject=unsubscribe>
List-Archive: <http://openib.org/pipermail/openib-general>
List-Post: <mailto:openib-general at openib.org>
List-Help: <mailto:openib-general-request at openib.org?subject=help>
List-Subscribe: <http://openib.org/mailman/listinfo/openib-general>,
	<mailto:openib-general-request at openib.org?subject=subscribe>
X-List-Received-Date: Tue, 22 Feb 2005 22:18:29 -0000
Status: O

On Tue, Feb 22, 2005 at 12:57:00PM -0800, Tom Duffy wrote:
> On Tue, 2005-02-22 at 12:20 -0800, Libor Michalek wrote:
> >   Thanks Tom, I've applied and commited the patches.
> 
> Did you choose not to apply these changes, or was this a simple
> oversight?

  Yup, it was an oversight. I had to hand apply some of the patches
that failed, becasue of the connection lock name change patch I had
made. Thanks for catching it, I've checked them in now.

-Libor

> Index: sdp_inet.c
> ===================================================================
> --- sdp_inet.c	(revision 1865)
> +++ sdp_inet.c	(working copy)
> @@ -410,7 +410,6 @@ static int _sdp_inet_release(struct sock
>  
>  			while (0 < timeout &&
>  			       0 == (SDP_ST_MASK_CLOSED & conn->istate)) {
> -
>  				sdp_conn_unlock(conn);
>  				timeout = schedule_timeout(timeout);
>  				sdp_conn_lock(conn);
> @@ -522,7 +521,6 @@ static int _sdp_inet_bind(struct socket 
>  
>  	result = sdp_inet_port_get(conn, bind_port);
>  	if (0 > result) {
> -
>  		sdp_dbg_warn(conn, "Error <%d> getting port during bind",
>  			     result);
>  
> @@ -1174,7 +1172,6 @@ static int _sdp_inet_ioctl(struct socket
>  		sdp_conn_lock(conn);
>  
>  		if (SDP_SOCK_ST_LISTEN != conn->istate) {
> -
>  			value = conn->send_qud;
>  			result = put_user(value, (int __user *) arg);
>  		} else
> Index: sdp_event.c
> ===================================================================
> --- sdp_event.c	(revision 1865)
> +++ sdp_event.c	(working copy)
> @@ -512,44 +512,32 @@ int sdp_cm_event_handler(struct ib_cm_id
>  	 * lookup the connection, on a REQ_RECV the sk will be empty.
>  	 */
>  	conn = sdp_conn_table_lookup(hashent);
> -	if (NULL != conn) {
> -
> +	if (NULL != conn)
>  		sdp_conn_lock(conn);
> -	}
> -	else {
> -
> -		if (IB_CM_REQ_RCVD != cm_id->state) {
> -
> +	else
> +		if (IB_CM_REQ_RCVD != cm_id->state)
>  			sdp_dbg_warn(NULL, 
>  				     "No conn <%d> CM state <%d> event <%d>",
>  				     hashent, cm_id->state, event->event);
> -		}
> -	}
>  
>  	switch (cm_id->state) {
>  	case IB_CM_REQ_RCVD:
> -
>  		result = sdp_cm_req_handler(cm_id, event);
>  		break;
>  	case IB_CM_REP_RCVD:
> -
>  		result = sdp_cm_rep_handler(cm_id, event, conn);
>  		break;
>  	case IB_CM_IDLE:
> -
>  		result = _sdp_cm_idle(cm_id, event, conn);
>  		break;
>  	case IB_CM_ESTABLISHED:
> -
>  		result = _sdp_cm_established(cm_id, event, conn);
>  		break;
>  	case IB_CM_DREQ_RCVD:
>  	case IB_CM_TIMEWAIT:
> -
>  		result = _sdp_cm_timewait(cm_id, event, conn);
>  		break;
>  	default:
> -
>  		sdp_dbg_warn(conn, "Unexpected CM state <%d>", cm_id->state);
>  		result = -EINVAL;
>  	}



More information about the general mailing list