<br><font size=2 face="sans-serif">Hal,</font>
<br><font size=2><tt> <br>
> Yes, that is very useful. I had been working on trying to come up with<br>
> what the problem was but this narrows it down to something I was<br>
> thinking might be going on.<br>
> <br>
> It looks like you are running back to back HCAs, right ?</tt></font>
<br>
<br><font size=2><tt>Yes, the HCAs are 4X DDR, connected back to back.</tt></font>
<br><font size=2><tt><br>
> <br>
> It also looks to me like your remote (in terms of OpenSM) CA node is not<br>
> responding to SMA requests like SubnGet NodeInfo yet the link is active.<br>
> Can you describe what state that node is in (what modules are loaded,<br>
> etc.) ? Can you do an ibstat/ibstatus on that node ?</tt></font>
<br>
<br><font size=2><tt>Both systems are booted and the link appears active.
Here is the information you asked for:</tt></font>
<br>
<br><font size=2><tt>>>>>>>>>>>>>>>>>>>></tt></font>
<br>
<br><font size=2><tt>Local System (where OpenSM is attempting to run)</tt></font>
<br>
<br><font size=2><tt>[koa] (ib) ib> ibstat</tt></font>
<br><font size=2><tt>CA 'mthca0'</tt></font>
<br><font size=2><tt> CA type: MT25204</tt></font>
<br><font size=2><tt> Number of ports: 1</tt></font>
<br><font size=2><tt> Firmware version: 1.0.800</tt></font>
<br><font size=2><tt> Hardware version: a0</tt></font>
<br><font size=2><tt> Node GUID: 0x0002c90200216dc4</tt></font>
<br><font size=2><tt> System image GUID: 0x0002c90200216dc7</tt></font>
<br><font size=2><tt> Port 1:</tt></font>
<br><font size=2><tt> State: Initializing</tt></font>
<br><font size=2><tt> Physical state: LinkUp</tt></font>
<br><font size=2><tt> Rate: 20</tt></font>
<br><font size=2><tt> Base lid: 0</tt></font>
<br><font size=2><tt> LMC: 0</tt></font>
<br><font size=2><tt> SM lid: 0</tt></font>
<br><font size=2><tt> Capability mask: 0x02510a68</tt></font>
<br><font size=2><tt> Port GUID: 0x0002c90200216dc5</tt></font>
<br><font size=2><tt>[koa] (ib) ib> ibstatus</tt></font>
<br><font size=2><tt>Infiniband device 'mthca0' port 1 status:</tt></font>
<br><font size=2><tt> default gid: fe80:0000:0000:0000:0002:c902:0021:6dc5</tt></font>
<br><font size=2><tt> base lid: 0x0</tt></font>
<br><font size=2><tt> sm lid: 0x0</tt></font>
<br><font size=2><tt> state: 2: INIT</tt></font>
<br><font size=2><tt> phys state: 5: LinkUp</tt></font>
<br><font size=2><tt> rate: 20 Gb/sec (4X DDR)</tt></font>
<br>
<br><font size=2><tt>[koa] (ib) ib> /sbin/lsmod</tt></font>
<br><font size=2><tt>Module Size Used by</tt></font>
<br><font size=2><tt>parport_pc 28008 0</tt></font>
<br><font size=2><tt>lp 12872 0</tt></font>
<br><font size=2><tt>parport 37260 2 parport_pc,lp</tt></font>
<br><font size=2><tt>ib_ipath 58392 0</tt></font>
<br><font size=2><tt>ipath_core 154596 1 ib_ipath</tt></font>
<br><font size=2><tt>pcmcia 34864 0</tt></font>
<br><font size=2><tt>yenta_socket 25484 0</tt></font>
<br><font size=2><tt>rsrc_nonstatic 12160 1 yenta_socket</tt></font>
<br><font size=2><tt>pcmcia_core 38068 3 pcmcia,yenta_socket,rsrc_nonstatic</tt></font>
<br><font size=2><tt>button 7328 0</tt></font>
<br><font size=2><tt>battery 10120 0</tt></font>
<br><font size=2><tt>ac 5512 0</tt></font>
<br><font size=2><tt>uhci_hcd 31776 0</tt></font>
<br><font size=2><tt>hw_random 6824 0</tt></font>
<br><font size=2><tt>i2c_i801 10260 0</tt></font>
<br><font size=2><tt>i2c_core 20992 1 i2c_i801</tt></font>
<br><font size=2><tt>ib_mthca 109744 0</tt></font>
<br><font size=2><tt>ib_ipoib 48792 0</tt></font>
<br><font size=2><tt>ib_uverbs 34128 0</tt></font>
<br><font size=2><tt>ib_umad 14000 0</tt></font>
<br><font size=2><tt>ib_ucm 16520 0</tt></font>
<br><font size=2><tt>ib_sa 13884 1 ib_ipoib</tt></font>
<br><font size=2><tt>ib_cm 30144 1 ib_ucm</tt></font>
<br><font size=2><tt>ib_mad 35896 4 ib_mthca,ib_umad,ib_sa,ib_cm</tt></font>
<br><font size=2><tt>ib_core 45952 9 ib_ipath,ib_mthca,ib_ipoib,ib_uverbs,ib_umad,ib_ucm,ib_sa,ib_cm,ib_mad</tt></font>
<br><font size=2><tt>floppy 67400 0</tt></font>
<br>
<br><font size=2><tt>>>>>>>>>>>>>>>>>>>></tt></font>
<br>
<br><font size=2><tt>Remote system (no OpenSM instance)</tt></font>
<br>
<br><font size=2><tt>[jatoba] (ib) ib> ibstat</tt></font>
<br><font size=2><tt>CA 'mthca0'</tt></font>
<br><font size=2><tt> CA type: MT25204</tt></font>
<br><font size=2><tt> Number of ports: 1</tt></font>
<br><font size=2><tt> Firmware version: 1.0.800</tt></font>
<br><font size=2><tt> Hardware version: a0</tt></font>
<br><font size=2><tt> Node GUID: 0x0002c90200216e40</tt></font>
<br><font size=2><tt> System image GUID: 0x0002c90200216e43</tt></font>
<br><font size=2><tt> Port 1:</tt></font>
<br><font size=2><tt> State: Initializing</tt></font>
<br><font size=2><tt> Physical state: LinkUp</tt></font>
<br><font size=2><tt> Rate: 20</tt></font>
<br><font size=2><tt> Base lid: 0</tt></font>
<br><font size=2><tt> LMC: 0</tt></font>
<br><font size=2><tt> SM lid: 0</tt></font>
<br><font size=2><tt> Capability mask: 0x02510a68</tt></font>
<br><font size=2><tt> Port GUID: 0x0002c90200216e41</tt></font>
<br><font size=2><tt>[jatoba] (ib) ib> ibstatus</tt></font>
<br><font size=2><tt>Infiniband device 'mthca0' port 1 status:</tt></font>
<br><font size=2><tt> default gid: fe80:0000:0000:0000:0002:c902:0021:6e41</tt></font>
<br><font size=2><tt> base lid: 0x0</tt></font>
<br><font size=2><tt> sm lid: 0x0</tt></font>
<br><font size=2><tt> state: 2: INIT</tt></font>
<br><font size=2><tt> phys state: 5: LinkUp</tt></font>
<br><font size=2><tt> rate: 20 Gb/sec (4X DDR)</tt></font>
<br>
<br><font size=2><tt>[jatoba] (ib) ib> /sbin/lsmod</tt></font>
<br><font size=2><tt>Module Size Used by</tt></font>
<br><font size=2><tt>parport_pc 28008 0</tt></font>
<br><font size=2><tt>lp 12872 0</tt></font>
<br><font size=2><tt>parport 37260 2 parport_pc,lp</tt></font>
<br><font size=2><tt>ib_ipath 58392 0</tt></font>
<br><font size=2><tt>ipath_core 154596 1 ib_ipath</tt></font>
<br><font size=2><tt>pcmcia 34864 0</tt></font>
<br><font size=2><tt>yenta_socket 25484 0</tt></font>
<br><font size=2><tt>rsrc_nonstatic 12160 1 yenta_socket</tt></font>
<br><font size=2><tt>pcmcia_core 38068 3 pcmcia,yenta_socket,rsrc_nonstatic</tt></font>
<br><font size=2><tt>button 7328 0</tt></font>
<br><font size=2><tt>battery 10120 0</tt></font>
<br><font size=2><tt>ac 5512 0</tt></font>
<br><font size=2><tt>uhci_hcd 31776 0</tt></font>
<br><font size=2><tt>hw_random 6824 0</tt></font>
<br><font size=2><tt>i2c_i801 10260 0</tt></font>
<br><font size=2><tt>i2c_core 20992 1 i2c_i801</tt></font>
<br><font size=2><tt>ib_mthca 109744 0</tt></font>
<br><font size=2><tt>ib_ipoib 48792 0</tt></font>
<br><font size=2><tt>ib_uverbs 34128 0</tt></font>
<br><font size=2><tt>ib_umad 14000 2</tt></font>
<br><font size=2><tt>ib_ucm 16520 0</tt></font>
<br><font size=2><tt>ib_sa 13884 1 ib_ipoib</tt></font>
<br><font size=2><tt>ib_cm 30144 1 ib_ucm</tt></font>
<br><font size=2><tt>ib_mad 35896 4 ib_mthca,ib_umad,ib_sa,ib_cm</tt></font>
<br><font size=2><tt>ib_core 45952 9 ib_ipath,ib_mthca,ib_ipoib,ib_uverbs,ib_umad,ib_ucm,ib_sa,ib_cm,ib_mad</tt></font>
<br><font size=2><tt>floppy 67400 0</tt></font>
<br>
<br><font size=2><tt>>>>>>>>>>>>>>>>>>>></tt></font>
<br><font size=2><tt><br>
> <br>
> Can you try this patch to see if it gets you further and let me know ?<br>
> Note that this is just a potential workaround right now.<br>
> </tt></font>
<br><font size=2><tt><br>
I will try rebuilding with the patch and let you know the results.</tt></font>
<br>
<br><font size=2><tt>Thanks,</tt></font>
<br><font size=2><tt> -Don Albert-<br>
</tt></font>