[openib-general] Re: Problems with SDP on Itanium
Michael S. Tsirkin
mst at mellanox.co.il
Wed Nov 2 14:18:50 PST 2005
Quoting r. Bob Woodruff <robert.j.woodruff at intel.com>:
> Subject: RE: Problems with SDP on Itanium
>
> Michael wrote,
> >No, I don't think I've seen that one, but it's been a while
> >since I last ran anything on Itanium.
> >Can you try to debug it a little? What does it mean that
> >an application "hangs"? Is some data sent from one side not received
> >by another one?
>
> >--
> >MST
>
> Looks like it is stuck in the write() system call.
>
> 103: 1048573 bytes 21 times --> 3853.24 Mbps in 2076.17 usec
> 104: 1048576 bytes 24 times --> 3854.65 Mbps in 2075.42 usec
> 105: 1048579 bytes 24 times --> 3847.86 Mbps in 2079.08 usec
> 106: 1572861 bytes 24 times -->
> Program received signal SIGINT, Interrupt.
> 0xa000000000010641 in ?? ()
> (gdb) bt
> #0 0xa000000000010641 in ?? ()
> #1 0x20000000001bf9c0 in write () from /lib/tls/libc.so.6.1
> #2 0x4000000000004920 in SendData ()
> #3 0x40000000000036e0 in main ()
>
> Here is the gdb traceback from the other side after it hangs.
> It is blocked in a read() system call.
>
> (gdb) run
> Starting program: /home/exports/NetPIPE_3.5-SDP/NPtcp
> Failed to read a valid object file image from memory.
> (no debugging symbols found)
> (no debugging symbols found)
> (no debugging symbols found)
> Send and receive buffers are 135168 and 135168 bytes
> (A bug in Linux doubles the requested buffer sizes)
>
> Program received signal SIGINT, Interrupt.
> 0xa000000000010641 in ?? ()
> (gdb) bt
> #0 0xa000000000010641 in ?? ()
> #1 0x20000000001bf8c0 in read () from /lib/tls/libc.so.6.1
> #2 0x4000000000004a50 in RecvData ()
> #3 0x4000000000003aa0 in main ()
>
Interesting. I'll try to look at this next week - it shouldn't be too hard
to debug if I manage to reproduce it here.
Meanwhile, could you please try enabling SDP data debugging, and post the
resulting log if the problem reproduces there?
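
To tell whether data sent by one side never reaches the other, it can
help to count bytes on both ends with a timeout instead of blocking
forever in write()/read(). Below is a minimal sketch of that idea. It is
an illustration only: it runs over an ordinary local socket pair rather
than SDP, and the names send_counted, recv_counted and run_transfer are
hypothetical helpers, not part of NetPIPE or the SDP code.

```python
import socket
import threading


def send_counted(sock, payload, timeout=2.0):
    """Send payload; return how many bytes were accepted before a stall."""
    sock.settimeout(timeout)
    view = memoryview(payload)
    sent = 0
    try:
        while sent < len(payload):
            # send() may accept only part of the buffer; keep a running total.
            sent += sock.send(view[sent:])
    except socket.timeout:
        pass  # stalled: the peer stopped draining, or the transport is stuck
    return sent


def recv_counted(sock, expected, timeout=2.0):
    """Receive up to expected bytes; return how many arrived before a stall."""
    sock.settimeout(timeout)
    got = 0
    try:
        while got < expected:
            chunk = sock.recv(min(65536, expected - got))
            if not chunk:
                break  # peer closed the connection
            got += len(chunk)
    except socket.timeout:
        pass  # stalled: nothing more arrived within the timeout
    return got


def run_transfer(size, timeout=2.0):
    """Push size bytes through a local socket pair; return (sent, received)."""
    a, b = socket.socketpair()  # stand-in for the real sender/receiver link
    result = {}

    def receiver():
        result["received"] = recv_counted(b, size, timeout)

    t = threading.Thread(target=receiver)
    t.start()
    result["sent"] = send_counted(a, bytes(size), timeout)
    t.join()
    a.close()
    b.close()
    return result["sent"], result["received"]


if __name__ == "__main__":
    # 1572861 is the message size at which the NetPIPE run above stopped.
    print(run_transfer(1572861))
```

If the two counts differ at the point of the hang, that suggests the
bytes were accepted by the sender's stack but never delivered; if the
sender's count itself stops short, the sender ran out of buffer space
while the receiver was not draining.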
--
MST