[openib-general] Re: [PATCH][SDP] AIO buffer corruption

Libor Michalek libor at topspin.com
Wed May 11 18:01:08 PDT 2005


On Wed, May 04, 2005 at 10:00:35PM +0300, Michael S. Tsirkin wrote:
> Quoting r. Libor Michalek <libor at topspin.com>:
> > Subject: [PATCH][SDP] AIO buffer corruption
> > 
> >   Patch to fix the problem a few people reported as ttcp.aio.c 
> > aborting with an error (-104) on longer AIO runs.
> > 
> >   The bug is in the calculation of an AIO buffer's starting address.
> > It would cause data to potentially be written past the end of the
> > AIO buffer, corrupting whatever happens to be there. In the case of
> > ttcp.aio.c this happens to be the iocb array, which once corrupted
> > would generate this error when passed to io_submit.
> 
> Unfortunately I still see data corruptions sometimes with this patch applied.
> The result for me is the server reporting verification error, closing the
> socket, and client printing the 104 event.
> I'm still debugging, but wanted to ask if someone else is seeing this too.

Michael,

  I found a couple of issues. The one above is in ttcp.aio.c itself: when
running with verification, the receive side incorrectly calculates
which value it should see in the received buffer. It is effectively
one buffer ahead because the num_bytes variable is updated too early.
The patch is below. I also found a buffer mismatch problem when using
buffers below the zcopy threshold.

Signed-off-by: Libor Michalek <libor at topspin.com>


Index: ttcp.aio.c
===================================================================
--- ttcp.aio.c	(revision 2248)
+++ ttcp.aio.c	(working copy)
@@ -538,9 +538,7 @@
 
 		for (ev = 0; 
 		     ev < result && 0 < (long)events[ev].res;
-		     ev++, ex++, cu--) {
-		    
-			num_bytes += events[ev].res;
+		     num_bytes += events[ev++].res, ex++, cu--) {
 
 			if (!verify)
 				continue;
@@ -563,7 +561,7 @@
 		}
  
 		if (0 > (long)events[ev].res)
-			fprintf(stderr, "ttcp%s: Event error <%ld> <%ld>\n",
+			fprintf(stderr,	"ttcp%s: Event error <%ld> <%ld>\n",
 				trans ? "-t" : "-r",
 				(long)events[ev].res,
 				(long)events[ev].data);



