[openib-general] [uDAPL] dtest server never ends when using the dapl provider "OpenIB-scm1"

Dotan Barak dotanb at mellanox.co.il
Tue Apr 11 05:06:18 PDT 2006


Hi.


I'm running dtest from the dapl example folder with the following command lines:
./dtest            (on the server)
./dtest -h IP1     (on the client; IP1 is the IP of the IPoIB I/F on the remote side)

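The output of the test is below. The server prints nothing after "PING DATA with SEND MSG" and never exits, while the client runs to completion: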
server output:
----------------------
1074 CONNECTED!

1074 Send RMR to remote: snd_msg: r_key_ctx=1320439,pad=0,va=50aa40,len=0x40
1074 Waiting for remote to send RMR data
1074 remote RMR data arrived!
1074 Received RMR from remote: r_iov: r_key_ctx=e00436,pad=0,va=50aa40,len=0x40

 1074 RDMA WRITE DATA with SEND MSG

1074 Sending completion message
1074 inbound rdma_write; send message arrived!
1074 Received RMR from remote: r_iov: ctx=e00436,pad=0,va=0x50aa40,len=0x40
1074 SERVER: RDMA write buffer contains: client written data...

 1074 RDMA READ DATA with SEND MSG

1074 Sending completion message
1074 Waiting for inbound message....
1074 inbound rdma_read; send message arrived!
1074 Received RMR from remote: r_iov: ctx=e00436,pad=0,va=0x50aa40,len=0x40
1074 SERVER: RCV RDMA read buffer contains: client read data...

 1074 PING DATA with SEND MSG

client output:
------------------
13210 CONNECTED!

13210 Send RMR to remote: snd_msg: r_key_ctx=e00436,pad=0,va=50aa40,len=0x40
13210 Waiting for remote to send RMR data
13210 remote RMR data arrived!
13210 Received RMR from remote: r_iov: r_key_ctx=1320439,pad=0,va=50aa40,len=0x40

 13210 RDMA WRITE DATA with SEND MSG

13210 Sending completion message
13210 inbound rdma_write; send message arrived!
13210 Received RMR from remote: r_iov: ctx=1320439,pad=0,va=0x50aa40,len=0x40
13210 CLIENT: RDMA write buffer contains: server written data...

 13210 RDMA READ DATA with SEND MSG

13210 Sending completion message
13210 Waiting for inbound message....
13210 inbound rdma_read; send message arrived!
13210 Received RMR from remote: r_iov: ctx=1320439,pad=0,va=0x50aa40,len=0x40
13210 CLIENT: RCV RDMA read buffer contains: server read data...

 13210 PING DATA with SEND MSG

DAT Registry: dat_ia_close () called
dat_get_ia_handle from 1 to 0x5124e0
dat_get_ia_handle from 1 to 0x5124e0
dats_free_ia_handle 1
DAT Registry: IA OpenIB-scm1, unloading library /usr/local//lib64/libdaplscm.so
DAT Registry: dat_registry_remove_provider () called

13210: DAPL Test Complete.

13210: Message RTT: Total=    523.81 usec, 10 bursts, itime=     52.38 usec, pc=0
13210: RDMA write:  Total=     24.08 usec, 10 bursts, itime=      2.41 usec, pc=0
13210: RDMA read:   Total=    138.52 usec,   4 bursts, itime=     48.88 usec, pc=0
13210: RDMA read:   Total=    138.52 usec,   4 bursts, itime=     28.85 usec, pc=0
13210: RDMA read:   Total=    138.52 usec,   4 bursts, itime=     32.90 usec, pc=0
13210: RDMA read:   Total=    138.52 usec,   4 bursts, itime=     27.89 usec, pc=0
13210: open:        54162.98 usec
13210: close:      277697.09 usec
13210: PZ create:      35.05 usec
13210: PZ free:         9.06 usec
13210: LMR create:    215.05 usec
13210: LMR free:       95.13 usec
13210: EVD create:     36.95 usec
13210: EVD free:      152.11 usec
13210: EP create:     494.96 usec
13210: EP free:       298.02 usec
13210: TOTAL:        1292.23 usec
DAT Registry: Stopped (dat_fini)


When I use the dapl provider OpenIB-cma-ip, everything works and both sides of dtest end without any error.
Here is my dat.conf:
OpenIB-cma u1.2 nonthreadsafe default /usr/local//lib64/libdaplcma.so mv_dapl.1.2 "ib0 0" ""
OpenIB-cma-ip u1.2 nonthreadsafe default /usr/local//lib64/libdaplcma.so mv_dapl.1.2 "11.4.3.86 0" ""    <-- the IP of the local IPoIB I/F on each host
OpenIB-cma-name u1.2 nonthreadsafe default /usr/local//lib64/libdaplcma.so mv_dapl.1.2 "svr1-ib0 0" ""
OpenIB-cma-netdev u1.2 nonthreadsafe default /usr/local//lib64/libdaplcma.so mv_dapl.1.2 "ib0 0" ""
OpenIB-scm1 u1.2 nonthreadsafe default /usr/local//lib64/libdaplscm.so mv_dapl.1.2 "mthca0 1" ""
OpenIB-scm2 u1.2 nonthreadsafe default /usr/local//lib64/libdaplscm.so mv_dapl.1.2 "mthca0 2" ""
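
For readers comparing the entries above: each dat.conf line is a static registry record with eight whitespace-separated fields (IA name, API version, thread safety, default flag, library path, provider version, IA params, platform params), where the last two are quoted strings. The following is only a minimal sketch for inspecting such a file. The field names in DatEntry and the helper parse_dat_conf are hypothetical names chosen for illustration, not part of uDAPL, and the sample uses two of the lines above without the inline annotation.

```python
import shlex
from typing import List, NamedTuple


class DatEntry(NamedTuple):
    # Field names are illustrative labels (an assumption), in file order.
    ia_name: str
    api_version: str
    thread_safety: str
    default: str
    lib_path: str
    provider_version: str
    ia_params: str
    platform_params: str


def parse_dat_conf(text: str) -> List[DatEntry]:
    """Split each non-empty line into its eight fields, honoring quotes."""
    entries = []
    for line in text.splitlines():
        # comments=True drops anything after '#'; quoted strings stay whole.
        fields = shlex.split(line, comments=True)
        if not fields:
            continue
        if len(fields) != 8:
            raise ValueError(f"expected 8 fields, got {len(fields)}: {line!r}")
        entries.append(DatEntry(*fields))
    return entries


# Two entries copied from the dat.conf in this message.
SAMPLE = '''
OpenIB-cma-ip u1.2 nonthreadsafe default /usr/local//lib64/libdaplcma.so mv_dapl.1.2 "11.4.3.86 0" ""
OpenIB-scm1 u1.2 nonthreadsafe default /usr/local//lib64/libdaplscm.so mv_dapl.1.2 "mthca0 1" ""
'''

if __name__ == "__main__":
    for entry in parse_dat_conf(SAMPLE):
        print(entry.ia_name, entry.lib_path, repr(entry.ia_params))
```

Read this way, the working and failing providers differ in both the library (libdaplcma.so vs. libdaplscm.so) and the IA params: an IP address for OpenIB-cma-ip, and what appears to be the HCA name plus port number ("mthca0 1") for OpenIB-scm1.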

Here is some info about the hosts I'm using (both hosts are identical):
Host Name         : sw086
Host Architecture : x86_64
Linux Distribution: Fedora Core release 4 (Stentz)
Kernel Version    : 2.6.14.3
Memory size       : 4039344 kB
Driver Version    : IBED-1.0-rc3:
HCA ID(s)         : mthca0
HCA model(s)      : 25208
FW version(s)     : 4.7.600
Board(s)          : MT_00A0010001

Can anyone help me with this issue?

Thanks,
Dotan


