[ofa-general] using SDP for block device traffic: several problems

Lars Ellenberg lars.ellenberg at linbit.com
Tue Jul 7 10:13:26 PDT 2009


On Thu, Jul 02, 2009 at 03:23:04PM +0200, Lars Ellenberg wrote:
> On Thu, Jul 02, 2009 at 11:30:25AM +0300, Amir Vadai wrote:
> > Please attach the perl script to reproduce and I will check it.
> > As to the second problem - I did notice such behavior but couldn't
> > find a scenario to reproduce it. I guess it happen when the socket is
> > closed due to error.
> > 
> > Tell me if you notice any message in dmesg.
> 
> My former test cluster has been reassigned.
> The new test hardware will be available earliest tomorrow.
> But I don't think there was anything relevant in dmesg.
> 
> I'll try to reproduce on the new test hardware then,
> and will get back to you as soon as possible.

ok.
the new test hardware is finally working.

this is on debian lenny,
userland ofed from http://pkg-ofed.alioth.debian.org/apt/ofed/

kernel git://git.openfabrics.org/ofed_1_4/linux-2.6.git
merged with upstream stable
git://git4.kernel.org/pub/scm/linux/kernel/git/stable/linux-2.6.27.y.git

both as of today:
  ofed_kernel            08acda8 sdp: Fix memory leak in bzcopy
  linux-v2.6.27.y/master 49cbf40 Linux 2.6.27.26


kernel config: very many "kernel debugging" things enabled.
if you want me to try a certain .config, or anything,
this can be arranged.

two very ugly perl scripts attached,
one tcp server,
one tcp client,
adapted from the perlipc man page.

client connects,
sends a packet,
receives a packet
in an endless loop.
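
The send/receive loop described above can be sketched roughly as follows. This is not the attached Perl client, only a minimal Python illustration. It assumes the server echoes back a packet of the same size, and the names `recv_exact` and `ping_pong` are made up for this sketch. The point is that a stream socket may return fewer bytes than requested, so the reader has to loop until the full length has arrived before running any sanity check.

```python
import socket


def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket, or raise EOFError."""
    buf = bytearray()
    while len(buf) < n:
        # recv() may legitimately return fewer bytes than requested.
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise EOFError("expected %d, but received %d byte" % (n, len(buf)))
        buf.extend(chunk)
    return bytes(buf)


def ping_pong(sock, packet, iterations):
    """Send a packet, read the reply, compare; repeat (assumes an echo peer)."""
    for seq in range(iterations):
        sock.sendall(packet)
        reply = recv_exact(sock, len(packet))
        if reply != packet:
            raise ValueError("[%d]: reply differs from what was sent" % seq)
    return iterations
```

With a loop like this, a reply that is one byte short shows up either as an EOFError or as a receive that never completes, rather than as a silently truncated buffer.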

packet format:
4 byte magic, 2 bytes ignored, 2 byte payload length,
then a payload of the indicated length, all identical bytes
except for the trailing 4 bytes, which again are a magic number.
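
To make the layout concrete, here is a small Python sketch of building and checking one such unit. Several details are assumptions, because the mail does not state them: the two magic values below are placeholders, the length field is taken to be big-endian (network order), and the payload length is taken to include the trailing 4 magic bytes. The fill byte 0x55 is the one visible in the sample output further down. `build_packet` and `check_packet` are hypothetical names, not functions from the attached scripts.

```python
import struct

# Placeholder values, NOT taken from the attached scripts.
HDR_MAGIC = b"\x01\x02\x03\x04"
TRAILER_MAGIC = b"\xde\xad\xbe\xef"

# 4 byte magic, 2 bytes ignored, 2 byte payload length (assumed big-endian).
HDR_FMT = "!4sHH"
HDR_LEN = struct.calcsize(HDR_FMT)  # 8 bytes


def build_packet(length, fill=0x55):
    """Header plus `length` payload bytes; the last 4 payload bytes are magic."""
    if not 4 <= length <= 0xFFFF:
        raise ValueError("payload length must fit in 2 bytes and hold the trailer")
    payload = bytes([fill]) * (length - 4) + TRAILER_MAGIC
    return struct.pack(HDR_FMT, HDR_MAGIC, 0, length) + payload


def check_packet(data):
    """Run the sanity checks; return the payload length or raise ValueError."""
    magic, _ignored, length = struct.unpack(HDR_FMT, data[:HDR_LEN])
    if magic != HDR_MAGIC:
        raise ValueError("invalid magic: " + magic.hex())
    payload = data[HDR_LEN:]
    if len(payload) != length:
        raise ValueError(
            "expected %d, but received %d byte" % (length, len(payload)))
    if payload[-4:] != TRAILER_MAGIC:
        raise ValueError("invalid trailing magic: " + payload[-4:].hex())
    body = payload[:-4]
    if body and body != body[:1] * len(body):
        raise ValueError("payload bytes are not all the same")
    return length
```

A payload that arrives one byte short fails the length check before the trailing magic is ever compared.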

with ethernet, or IPoIB: runs endlessly.

with LD_PRELOAD=libsdp.so runs for very few iterations,
and errors out on one of the sanity checks.


sample output:
root at kugel:/home/lars/DRBD/IB_SDP# LD_PRELOAD=libsdp.so perl my_client.pl rum-ib0
2009-07-07 19:07:08 my_client.pl 4300: recv hdr [25]: invalid magic:  32 30 30 39 2d 30 37 2d

which happens to be the hexdump of the string "2009-07-".
where did it copy_user() that from? wtf?
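
The decoding is easy to double-check. The short Python snippet below takes the eight bytes from the "invalid magic" line and turns them back into text; it matches the start of the timestamp that the client prints at the beginning of each log line.

```python
# The bytes reported in the "invalid magic" message above.
dump = "32 30 30 39 2d 30 37 2d"

# bytes.fromhex() skips the spaces between the hex pairs.
text = bytes.fromhex(dump).decode("ascii")
print(text)  # prints 2009-07-
```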

root at kugel:/home/lars/DRBD/IB_SDP# LD_PRELOAD=libsdp.so perl my_client.pl rum-ib0
2009-07-07 19:07:08 my_client.pl 4301: recv payload [35]: expected 12296, but received 12295 byte; last bytes received:  55 e4 e3 e2
                                 ^^^^ pid,           ^^ seq number.
so it worked ok for only 35 ping/pongs.

exactly: one byte too short.
the trailing magic expected is e4 e3 e1 e1

root at kugel:/home/lars/DRBD/IB_SDP# LD_PRELOAD=libsdp.so perl my_client.pl rum-ib0
2009-07-07 19:07:09 my_client.pl 4302: recv payload [33]: expected 4131, but received 4130 byte; last bytes received:  55 e4 e3 e2

root at kugel:/home/lars/DRBD/IB_SDP# LD_PRELOAD=libsdp.so perl my_client.pl rum-ib0
2009-07-07 19:07:10 my_client.pl 4303: recv payload [29]: expected 16401, but received 16400 byte; last bytes received:  55 e4 e3 e2

root at kugel:/home/lars/DRBD/IB_SDP# LD_PRELOAD=libsdp.so perl my_client.pl rum-ib0
2009-07-07 19:07:12 my_client.pl 4304: recv payload [21]: expected 4110, but received 4109 byte; last bytes received:  55 e4 e3 e2

root at kugel:/home/lars/DRBD/IB_SDP# LD_PRELOAD=libsdp.so perl my_client.pl rum-ib0
2009-07-07 19:07:13 my_client.pl 4305: recv payload [4]: expected 20495, but received 20494 byte; last bytes received:  55 e4 e3 e2

root at kugel:/home/lars/DRBD/IB_SDP# LD_PRELOAD=libsdp.so perl my_client.pl rum-ib0
2009-07-07 19:07:14 my_client.pl 4306: recv payload [12]: expected 22530, but received 22529 byte; last bytes received:  55 e4 e3 e2


any suggestions on how to proceed from here?

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: my_client.pl
Type: text/x-perl
Size: 2726 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20090707/345f27e7/attachment.pl>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: my_server.pl
Type: text/x-perl
Size: 4012 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20090707/345f27e7/attachment-0001.pl>