[openib-general] ib_send_cm_req fails with error -22
susan
ssbyrn at yahoo.com
Wed Apr 19 11:27:21 PDT 2006
hello,
i'm writing a sample kernel ulp driver to get acquainted
with the openib stack on linux kernel 2.6.16.2 (fedora 5), using
the openib gen 2 stack checked out from the openib.org website.
the setup is two nodes, primary and secondary, with a
point-to-point connection. the secondary node starts in
listen mode and waits until the primary node makes a connection;
the two nodes then start exchanging messages.
the problem that i am running into is that the ib_send_cm_req()
api fails with errno 22 (EINVAL). i'm using the local id (lid) to
make the connection on port 1. ib_send_cm_req() calls
cm_init_av_by_path(), which calls ib_find_cached_gid().
ib_find_cached_gid() fails because it can't locate the
gid in the device's cache table. below is the full control
flow from both the primary and secondary nodes.
primary node                              secondary node
--------------                            -----------------
                                          ib_register_client()
                                          using active port = 1
                                          ib_create_cm_id()
                                          ib_cm_listen()
                                          listening .... (waiting)
ib_register_client()
using active port = 1
ib_create_cm_id()
ib_alloc_pd()
ib_create_cq()
ib_req_notify_cq()
ib_get_dma_mr()
ib_create_qp()
ib_query_gid()
source.lid = 0x1
dest.lid = 0x2
ib_sa_path_rec_get()
sa_path_rec handler returned success
ib_send_cm_req()
ib_send_cm_req() failed with error -22
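for reference, kernel functions conventionally return a negated errno value, so "error -22" corresponds to errno 22, which is EINVAL on linux. the following is a minimal userspace sketch of that decoding, not code from the driver; the helper names errno_from_kernel_ret() and describe_kernel_ret() are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical helper: turn a kernel-style negative return value
 * into the positive errno it encodes. Only failures carry an errno. */
int errno_from_kernel_ret(int ret)
{
    assert(ret < 0);
    return -ret;
}

/* Hypothetical helper: describe a kernel-style return value as text. */
const char *describe_kernel_ret(int ret)
{
    if (ret >= 0)
        return "success";
    return strerror(errno_from_kernel_ret(ret));
}
```

calling describe_kernel_ret(-22) yields the C library's message for EINVAL, which says only that some argument was rejected, not which one.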
how would i update the device's cache so that the gid can be found? am i
missing any steps?
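the lookup described above can be pictured as a search of a per-device table of port gids for the source gid. the sketch below is a userspace simulation under that assumption, not the kernel's ib_find_cached_gid() or cm_init_av_by_path(); struct gid_entry, primary_gids and lookup_port_by_sgid() are hypothetical names, and the table is filled with the two default gids that ibstatus reports for the primary node.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GID_LEN 16

/* Hypothetical stand-in for one entry of a device's cached GID table. */
struct gid_entry {
    uint8_t port;          /* 1-based port number */
    uint8_t raw[GID_LEN];  /* 128-bit GID, most significant byte first */
};

/* Default GIDs of the primary node's two ports, taken from its ibstatus output. */
const struct gid_entry primary_gids[] = {
    { 1, { 0xfe, 0x80, 0, 0, 0, 0, 0, 0,
           0x00, 0x05, 0xad, 0x00, 0x00, 0x03, 0x08, 0x61 } },
    { 2, { 0xfe, 0x80, 0, 0, 0, 0, 0, 0,
           0x00, 0x05, 0xad, 0x00, 0x00, 0x03, 0x08, 0x62 } },
};
const size_t primary_gid_count = sizeof(primary_gids) / sizeof(primary_gids[0]);

/* Return the port that owns sgid, or -EINVAL if no entry matches.
 * The -EINVAL result models the failure reported in this post. */
int lookup_port_by_sgid(const struct gid_entry *table, size_t count,
                        const uint8_t sgid[GID_LEN])
{
    assert(table != NULL && sgid != NULL);
    for (size_t i = 0; i < count; i++) {
        if (memcmp(table[i].raw, sgid, GID_LEN) == 0)
            return table[i].port;
    }
    return -EINVAL;
}
```

in this model, a source gid equal to port 1's default gid resolves to port 1, while any other value (for example an all-zero gid or the remote node's gid) gives -22. whether the driver above passes such a value is not something the post establishes.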
here is output from ib* commands:
from primary node:
root@copa:~:23> sminfo
sminfo: sm lid 0x2 sm guid 0x5ad0000030655, activity count 1707
priority 1 state SMINFO_MASTER 3
root@copa:~:26> ibhosts
Ca : 0x0005ad0000030654 ports 2 "Topspin HCA"
Ca : 0x0005ad0000030860 ports 2 "Topspin HCA"
root@copa:~:28> ibstatus
Infiniband device 'mthca0' port 1 status:
default gid: fe80:0000:0000:0000:0005:ad00:0003:0861
base lid: 0x1
sm lid: 0x2
state: 4: ACTIVE
phys state: 5: LinkUp
rate: 10 Gb/sec (4X)
Infiniband device 'mthca0' port 2 status:
default gid: fe80:0000:0000:0000:0005:ad00:0003:0862
base lid: 0x0
sm lid: 0x0
state: 1: DOWN
phys state: 2: Polling
rate: 2.5 Gb/sec (1X)
root@copa:~:29> ibnetdiscover
#
# Topology file: generated on Tue Apr 18 12:36:39 2006
#
# Max of 1 hops discovered
# Initiated from node 0005ad0000030860 port 0005ad0000030861
vendid=0x5ad
devid=0x5a44
sysimgguid=0x5ad000100d050
caguid=0x5ad0000030654
Ca 2 "H-0005ad0000030654" # Topspin HCA
[1] "H-0005ad0000030860"[1] # lid 2 lmc 0
vendid=0x5ad
devid=0x5a44
sysimgguid=0x5ad000100d050
caguid=0x5ad0000030860
Ca 2 "H-0005ad0000030860" # Topspin HCA
[1] "H-0005ad0000030654"[1] # lid 1 lmc
from secondary node:
root@bana:~:8> sminfo
sminfo: sm lid 0x2 sm guid 0x5ad0000030655, activity count 1738
priority 1 state SMINFO_MASTER 3
root@bana:~:15> ibhosts
Ca : 0x0005ad0000030860 ports 2 "Topspin HCA"
Ca : 0x0005ad0000030654 ports 2 "Topspin HCA"
root@bana:~:16> ibstatus
Infiniband device 'mthca0' port 1 status:
default gid: fe80:0000:0000:0000:0005:ad00:0003:0655
base lid: 0x2
sm lid: 0x2
state: 4: ACTIVE
phys state: 5: LinkUp
rate: 10 Gb/sec (4X)
Infiniband device 'mthca0' port 2 status:
default gid: fe80:0000:0000:0000:0005:ad00:0003:0656
base lid: 0x0
sm lid: 0x0
state: 1: DOWN
phys state: 2: Polling
rate: 2.5 Gb/sec (1X)
root@bana:~:17> ibnetdiscover
#
# Topology file: generated on Tue Apr 18 12:36:15 2006
#
# Max of 1 hops discovered
# Initiated from node 0005ad0000030654 port 0005ad0000030655
vendid=0x5ad
devid=0x5a44
sysimgguid=0x5ad000100d050
caguid=0x5ad0000030860
Ca 2 "H-0005ad0000030860" # Topspin HCA
[1] "H-0005ad0000030654"[1] # lid 1 lmc 0
vendid=0x5ad
devid=0x5a44
sysimgguid=0x5ad000100d050
caguid=0x5ad0000030654
Ca 2 "H-0005ad0000030654" # Topspin HCA
[1] "H-0005ad0000030860"[1] # lid 2 lmc 0
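a note on reading the gids in the ibstatus output above: an infiniband gid is 128 bits, a 64-bit subnet prefix followed by a 64-bit guid, and fe80:0000:0000:0000 is the default link-local prefix. the sketch below splits a raw gid into those two halves; struct gid_parts and split_gid() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the two halves of a GID. */
struct gid_parts {
    uint64_t subnet_prefix;  /* upper 64 bits */
    uint64_t guid;           /* lower 64 bits */
};

/* Split a 16-byte GID (most significant byte first) into prefix and GUID. */
struct gid_parts split_gid(const uint8_t raw[16])
{
    struct gid_parts parts = { 0, 0 };

    assert(raw != NULL);
    for (int i = 0; i < 8; i++) {
        parts.subnet_prefix = (parts.subnet_prefix << 8) | raw[i];
        parts.guid = (parts.guid << 8) | raw[8 + i];
    }
    return parts;
}
```

applied to the primary node's port 1 gid, the guid half is 0x0005ad0000030861, the same port guid that ibnetdiscover prints in its "Initiated from" line on that node.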
do you know what's wrong?
thanks,
susan