[openib-general] [PATCH] librdmacm/examples/rping.c

Steve Wise swise at opengridcomputing.com
Fri Jun 16 08:23:31 PDT 2006


On Fri, 2006-06-16 at 11:20 -0400, amith rajith mamidala wrote:
> Hi Steve,
> 
> The rping also doesn't exit after printing these error messages. Is this
> expected?
> 

It should exit!  :-(

Maybe rping is not acking all the CM or Async events?  Or we've got a
bug in our refcnts on the iw_cm_ids in the kernel.  Can you get a gdb
stack trace when it's stalled?   And if you have kdb, a kernel mode stack
trace of the same thread would be nice too...
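
For reference, here is a toy model of why a missed ack would look like a
hang.  As far as I know, rdma_destroy_id() does not return until every
event obtained from rdma_get_cm_event() has been released with
rdma_ack_cm_event(), and ibv_destroy_cq() waits the same way for
ibv_ack_cq_events().  The sketch below does not call librdmacm or
libibverbs at all; struct event_source and its helpers are made-up names
that only mimic that counting, so treat it as an illustration and not as
what rping actually does.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for one event-producing object (a CM id or a CQ).
 * Not a librdmacm/libibverbs type; it only models the get/ack bookkeeping.
 */
struct event_source {
    unsigned int events_delivered; /* events handed to the application */
    unsigned int events_acked;     /* events the application has acked */
};

/* Model of "the application fetched one more event". */
void deliver_event(struct event_source *src)
{
    src->events_delivered++;
}

/* Model of acking one event; returns false if nothing is outstanding. */
bool ack_event(struct event_source *src)
{
    if (src->events_acked == src->events_delivered)
        return false;
    src->events_acked++;
    return true;
}

/* Number of events fetched but not yet acked. */
unsigned int unacked_events(const struct event_source *src)
{
    return src->events_delivered - src->events_acked;
}

/* A destroy call would stall for as long as any event is still unacked. */
bool destroy_would_block(const struct event_source *src)
{
    return unacked_events(src) != 0;
}
```

If the exit path fetches a DISCONNECT or async event and returns without
acking it, destroy_would_block() stays true in this model, which is the
kind of stall a gdb backtrace should show sitting in the destroy call.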

What systems/distros/etc are you running this on?

Thanks,

Stevo.



> Thanks,
> Amith
> 
> On Thu, 15 Jun 2006, Steve Wise wrote:
> 
> > This is the normal output for rping...
> >
> > The status error on the completion is 5 (FLUSHED), which is normal.
> >
> > Steve.
> >
> >
> > On Thu, 2006-06-15 at 17:24 -0400, amith rajith mamidala wrote:
> > > Hi,
> > >
> > > With the latest rping code (Revision: 8055) I am still able to see this
> > > race condition.
> > >
> > > server side:
> > >
> > > [@k62-oib examples]$ ./rping -s -vV -C10 -S26 -a 0.0.0.0 -p 9997
> > > server ping data: rdma-ping-0: ABCDEFGHIJKL
> > > server ping data: rdma-ping-1: BCDEFGHIJKLM
> > > server ping data: rdma-ping-2: CDEFGHIJKLMN
> > > server ping data: rdma-ping-3: DEFGHIJKLMNO
> > > server ping data: rdma-ping-4: EFGHIJKLMNOP
> > > server ping data: rdma-ping-5: FGHIJKLMNOPQ
> > > server ping data: rdma-ping-6: GHIJKLMNOPQR
> > > server ping data: rdma-ping-7: HIJKLMNOPQRS
> > > server ping data: rdma-ping-8: IJKLMNOPQRST
> > > server ping data: rdma-ping-9: JKLMNOPQRSTU
> > > server DISCONNECT EVENT...
> > > wait for RDMA_READ_ADV state 9
> > > cq completion failed status 5
> > >
> > > Client side:
> > >
> > > [@k63-oib examples]$ ./rping -c -vV -C10 -S26 -a 192.168.111.66 -p 9997
> > > ping data: rdma-ping-0: ABCDEFGHIJKL
> > > ping data: rdma-ping-1: BCDEFGHIJKLM
> > > ping data: rdma-ping-2: CDEFGHIJKLMN
> > > ping data: rdma-ping-3: DEFGHIJKLMNO
> > > ping data: rdma-ping-4: EFGHIJKLMNOP
> > > ping data: rdma-ping-5: FGHIJKLMNOPQ
> > > ping data: rdma-ping-6: GHIJKLMNOPQR
> > > ping data: rdma-ping-7: HIJKLMNOPQRS
> > > ping data: rdma-ping-8: IJKLMNOPQRST
> > > ping data: rdma-ping-9: JKLMNOPQRSTU
> > > cq completion failed status 5
> > > client DISCONNECT EVENT...
> > >
> > >
> > > Thanks,
> > > Amith
> > >
> > >
> > > On Tue, 13 Jun 2006, Steve Wise wrote:
> > >
> > > > Thanks, applied.
> > > >
> > > > iwarp branch: r7964
> > > > trunk: r7966
> > > >
> > > >
> > > > On Tue, 2006-06-13 at 11:24 -0500, Boyd R. Faulkner wrote:
> > > > > This patch resolves a race condition between the receipt of
> > > > > a connection established event and a receive completion from
> > > > > the client.  The server no longer goes to connected state but
> > > > > merely waits for the READ_ADV state to begin its looping.  This
> > > > > keeps the server from going back to CONNECTED from the later
> > > > > states if the connection established event comes in after the
> > > > > receive completion (i.e. the loop starts).
> > > > >
> > > > > Signed-off-by: Boyd Faulkner <faulkner at opengridcomputing.com>
> > > >
> > > >
> >




