[openib-general] Re: [PATCH][SDP] AIO buffer corruption

Michael S. Tsirkin mst at mellanox.co.il
Wed May 4 12:00:35 PDT 2005


Quoting r. Libor Michalek <libor at topspin.com>:
> Subject: [PATCH][SDP] AIO buffer corruption
> 
> 
>   Patch to fix the problem a few people reported as ttcp.aio.c 
> aborting with an error (-104) on longer AIO runs.
> 
>   The bug is in the calculation of an AIO buffer's starting address. 
> It could cause data to be written past the end of the AIO buffer,
> corrupting whatever happens to be there. In the case of ttcp.aio.c
> this happens to be the iocb array, which, once corrupted, generates
> this error when passed to io_submit.
> 
> -Libor
> 
> Signed-off-by: Libor Michalek <libor at topspin.com>
> 
> Index: sdp_recv.c
> ===================================================================
> --- sdp_recv.c	(revision 2220)
> +++ sdp_recv.c	(working copy)
> @@ -674,14 +674,16 @@
>  #ifndef _SDP_DATA_PATH_NULL
>  		memcpy((addr + offset), buff->data, copy);
>  #endif
> -    
> +
>  		buff->data += copy;
>  		iocb->post += copy;
>  		iocb->len  -= copy;
>  
>  		offset     += copy;
>  		offset     &= (~PAGE_MASK);
> -		
> +
> +		iocb->io_addr += copy;
> +
>  		sdp_kunmap(iocb->page_array[counter++]);
>  	}
>  	/*
> @@ -1443,7 +1445,8 @@
>  			iocb->size = size;
>  			iocb->req  = req;
>  			iocb->key  = req->ki_key;
> -			iocb->addr = (unsigned long)msg->msg_iov->iov_base;
> +			iocb->addr = ((unsigned long)msg->msg_iov->iov_base -
> +				      copied);
>  
>  			req->ki_cancel = sdp_inet_read_cancel;
>  
> Index: sdp_send.c
> ===================================================================
> --- sdp_send.c	(revision 2220)
> +++ sdp_send.c	(working copy)
> @@ -751,6 +751,7 @@
>  		buff->tail      += copy;
>  		iocb->post      += copy;
>  		iocb->len       -= copy;
> +		iocb->io_addr   += copy;
>  
>  		offset += copy;
>  		offset &= (~PAGE_MASK);
> @@ -2195,7 +2196,7 @@
>  		iocb->size = size;
>  		iocb->req  = req;
>  		iocb->key  = req->ki_key;
> -		iocb->addr = (unsigned long)msg->msg_iov->iov_base;
> +		iocb->addr = (unsigned long)msg->msg_iov->iov_base - copied;
>        
>  		req->ki_cancel = sdp_inet_write_cancel;
>  

Unfortunately, I still sometimes see data corruption with this patch applied.
The result for me is the server reporting a verification error and closing
the socket, and the client printing the error (-104).
I'm still debugging, but wanted to ask whether anyone else is seeing this too.
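
For anyone following along, here is a rough user-space sketch of the class of
bug the patch addresses. It is not the SDP code: demo_iovec,
demo_copy_to_iovec and demo_overrun_bytes are hypothetical names, and the
sketch assumes that a synchronous copy advances iov_base by `copied` bytes
and that the queued request then resumes at offset `copied` from its recorded
start address. Under those assumptions, recording the advanced iov_base as
the start pushes the end of the transfer past the buffer by `copied` bytes,
while subtracting `copied` (as the patch does) keeps it in bounds.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an iovec that is consumed as data is copied. */
struct demo_iovec {
	char   *iov_base;
	size_t  iov_len;
};

/* Copy n bytes into the iovec and advance it past the consumed region. */
static void demo_copy_to_iovec(struct demo_iovec *iov, const char *src,
			       size_t n)
{
	memcpy(iov->iov_base, src, n);
	iov->iov_base += n;
	iov->iov_len  -= n;
}

/*
 * Return how many bytes past the end of a buf_size-byte buffer the
 * request would reach, after `copied` bytes were already delivered
 * synchronously. Only offsets are computed; nothing is written out
 * of bounds.
 *
 * patched == 0: start address taken from the advanced iov_base.
 * patched != 0: start address is iov_base - copied, as in the patch.
 */
static size_t demo_overrun_bytes(size_t buf_size, size_t copied, int patched)
{
	char buf[256];
	char src[256] = { 0 };
	struct demo_iovec iov = { buf, buf_size };
	char *start;
	size_t start_off, end_off;

	if (buf_size > sizeof(buf) || copied > buf_size)
		return 0;	/* outside what this sketch models */

	demo_copy_to_iovec(&iov, src, copied);

	start = patched ? iov.iov_base - copied : iov.iov_base;
	start_off = (size_t)(start - buf);

	/*
	 * Assumption: the request resumes at offset `copied` from its
	 * recorded start and still has iov.iov_len bytes to transfer.
	 */
	end_off = start_off + copied + iov.iov_len;

	return end_off > buf_size ? end_off - buf_size : 0;
}
```

With a 64-byte buffer and 16 bytes already copied,
demo_overrun_bytes(64, 16, 0) gives 16 and demo_overrun_bytes(64, 16, 1)
gives 0. When nothing was copied beforehand the two calculations agree, which
would be consistent with the failure only showing up on longer runs.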

Libor, on an unrelated note, could you please generate diffs with the -p flag
to make it easier to see which function was changed?

Thanks,

-- 
MST - Michael S. Tsirkin


