[ofa-general] SDP and iWARP

Craig Prescott prescott at hpc.ufl.edu
Thu Jan 10 07:07:17 PST 2008


Steve Wise wrote:
> 
> First make sure the sdp kernel module uses the rdma cma.  Then I'd add 
> printk hooks in cma.c, addr.c, and iwcm.c to see what's going on and 
> where things are failing.  Also a wire trace is good if we're getting 
> that far (like at least doing arp resolution).
> 

Small update - a little progress.  I sprinkled printk's liberally and
turned on the ib_sdp debug options.  The initial problem was on the
listener during an IW_CM_EVENT_CONNECT_REQUEST event: the SDP hello
header was rejected in sdp_cma.c:sdp_connect_handler() because its
max_adverts field was zero, which is not permissible.  In fact, all
of the sdp_hh fields were zero.
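
For anyone who wants to reproduce the check outside the kernel, here is a
rough user-space sketch of the two conditions described above: a hello
header whose max_adverts is zero gets rejected, and a header that is
entirely zero can be told apart from one with only that field unset.  The
struct hello_hdr_sketch type and both helper names are made up for
illustration; this is not the real struct sdp_hh layout, and it is not the
code in sdp_connect_handler().

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified stand-in for an SDP hello header.
 * Assumption: only max_adverts matters for this illustration; the
 * other fields and the layout are invented and do not match sdp_hh.
 */
struct hello_hdr_sketch {
    uint8_t  version;
    uint8_t  max_adverts;
    uint16_t port;
};

/* Return 1 if every byte of the header is zero, 0 otherwise. */
static int hello_is_all_zero(const struct hello_hdr_sketch *hh)
{
    const unsigned char *p = (const unsigned char *)hh;
    size_t i;

    for (i = 0; i < sizeof(*hh); i++)
        if (p[i] != 0)
            return 0;
    return 1;
}

/* Return 0 if the header is acceptable, -1 if it must be rejected. */
static int hello_check(const struct hello_hdr_sketch *hh)
{
    if (hh == NULL)
        return -1;
    if (hh->max_adverts == 0)   /* zero adverts is not permissible */
        return -1;
    return 0;
}
```

Calling hello_is_all_zero() before hello_check() in a debug hook would
separate "the header never got filled in" from "the peer sent a bad
max_adverts", which is the distinction that mattered here.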

Comparing with the RDMA_TRANSPORT_IB case, I saw that
cma.c:cma_connect_ib() does some work to create the SDP header
via cma_format_hdr().  But cma_connect_iw() does not.

I patched cma_connect_iw() to create the SDP header as
cma_connect_ib() does.  This gets us farther - the SDP header
now looks right when examined on the listener side, and the
listener at least enters rdma_accept(), but iw_cm_accept()
fails because cm_id->device->iwcm->accept(cm_id, iw_param)
returns -104.  That call also now emits a couple of messages
into the listener's syslog:

Jan  9 21:53:54 tebow2 kernel: iwch_ev_dispatch - CQE Err qpid 0x20 opcode 14 status 0x6 type 1 wrid.hi 0x0 wrid.lo 0x80000000
Jan  9 21:53:54 tebow2 kernel: post_qp_event - AE qpid 0x20 opcode 14 status 0x6 type 1 wrid.hi 0x0 wrid.lo 0x80000000
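
A note on the -104 from the accept call: on x86 Linux (and most other
Linux architectures), errno 104 is ECONNRESET.  The little user-space
sketch below decodes such a return value, assuming the provider's accept
hook follows the usual kernel convention of returning a negative errno.
The helper names are made up for illustration.

```c
#include <errno.h>
#include <string.h>

/*
 * Convert a kernel-style return code to a positive errno value.
 * Assumption: negative values are negated errno codes; zero or
 * positive values are treated as "no error".
 */
static int rc_to_errno(int rc)
{
    return rc < 0 ? -rc : 0;
}

/* Return 1 if the return code means "connection reset", else 0. */
static int rc_is_connreset(int rc)
{
    return rc_to_errno(rc) == ECONNRESET;
}

/* Human-readable description of a kernel-style return code. */
static const char *rc_describe(int rc)
{
    return rc < 0 ? strerror(-rc) : "success";
}
```

With that reading, rc_describe(-104) would report a connection reset on
such systems, meaning the connection was torn down while the listener was
still in its accept path.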

In the end, we still end up in rdma_reject().  Will keep digging.

Cheers,
Craig
