[openib-general] got ipoib up once but not twice :-)
Ronald G. Minnich
rminnich at lanl.gov
Fri Jan 14 12:07:43 PST 2005
OK, I had all of bluesteel up yesterday. It all "just worked":
insmod the right stuff on front end, i.e.
ib_ipoib 53856 0
ib_sa 12564 1 ib_ipoib
ib_umad 12224 5
ib_mthca 90976 9
ib_mad 29872 3 ib_sa,ib_umad,ib_mthca
ib_core 43264 67 ib_ipoib,ib_sa,ib_umad,ib_mthca,ib_mad
insmod the right stuff on the nodes, i.e.
ib_mthca 90976 0
ib_umad 12224 0
ib_ipoib 53856 0
ib_sa 12564 1 ib_ipoib
ib_mad 29872 3 ib_mthca,ib_umad,ib_sa
ib_core 43264 5 ib_mthca,ib_umad,ib_ipoib,ib_sa,ib_mad
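The two module listings above can be checked mechanically. What follows is a minimal sketch, assuming lsmod-style lines of the form "name size use_count [users]" as pasted in this message. The helper names (parse_lsmod, missing_modules) and the EXPECTED set are hypothetical, chosen only for illustration. The set is simply the six modules that appear in the listings, not an authoritative requirement for IPoIB.

```python
# Hypothetical helpers: parse lsmod-style text and report which of the
# modules seen in this message are not loaded.

# Assumption: these are just the six modules shown in the listings above.
EXPECTED = {"ib_core", "ib_mad", "ib_sa", "ib_mthca", "ib_umad", "ib_ipoib"}


def parse_lsmod(text):
    """Return {module: (size, use_count, [users])} from lsmod-style text."""
    modules = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module Size Used by" header, if present.
        if len(fields) < 3 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        users = fields[3].split(",") if len(fields) > 3 else []
        modules[fields[0]] = (int(fields[1]), int(fields[2]), users)
    return modules


def missing_modules(text, expected=EXPECTED):
    """Return the expected modules that do not appear in the listing, sorted."""
    return sorted(expected - set(parse_lsmod(text)))


# The node listing from this message.
NODE_LSMOD = """\
ib_mthca 90976 0
ib_umad 12224 0
ib_ipoib 53856 0
ib_sa 12564 1 ib_ipoib
ib_mad 29872 3 ib_mthca,ib_umad,ib_sa
ib_core 43264 5 ib_mthca,ib_umad,ib_ipoib,ib_sa,ib_mad
"""

print(missing_modules(NODE_LSMOD))  # prints []
```

Running the same check against a fresh lsmod capture on a node that misbehaves would at least rule out a module that silently failed to load.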
run opensm -v
(get lots of messages that look ok)
(you probably don't want to see this ...)
ifconfig front end
ib0 Link encap:UNSPEC HWaddr 00-00-04-04-FE-80-00-00-00-00-00-00-00-00-00-00
inet addr:10.4.0.1 Bcast:10.255.255.255 Mask:255.0.0.0
inet6 addr: fe80::202:c901:8a0:3e61/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:2044 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:73 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:128
RX bytes:0 (0.0 b) TX bytes:4388 (4.2 Kb)
ifconfig bproc slaves
ib0 Link encap:UNSPEC HWaddr 00-00-04-04-FE-80-00-00-00-00-00-00-00-00-00-00
inet addr:10.4.2.10 Bcast:10.255.255.255 Mask:255.0.0.0
UP BROADCAST MULTICAST MTU:2044 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:128
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
All this was fine.
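In the two ifconfig stanzas pasted above, the front end's flags line includes RUNNING and the slave's does not. What follows is a minimal sketch, assuming the net-tools ifconfig layout shown in this message, of pulling out the flag words so the two ends can be compared. The helper name interface_flags is hypothetical, and picking the flags line by its "MTU:" token is an assumption about this output format.

```python
# Hypothetical helper: extract the flag words (UP, RUNNING, ...) from one
# ifconfig stanza in the old net-tools format shown in this message.


def interface_flags(ifconfig_text):
    """Return the set of flag words from the flags line of one stanza."""
    for line in ifconfig_text.splitlines():
        words = line.split()
        # Assumption: the flags line is the one carrying the MTU:<n> token.
        if any(w.startswith("MTU:") for w in words):
            return {w for w in words if w.isupper() and ":" not in w}
    return set()


# Stanzas copied from this message (trimmed to the relevant lines).
FRONT_END = """\
ib0 Link encap:UNSPEC HWaddr 00-00-04-04-FE-80-00-00-00-00-00-00-00-00-00-00
inet addr:10.4.0.1 Bcast:10.255.255.255 Mask:255.0.0.0
UP BROADCAST RUNNING MULTICAST MTU:2044 Metric:1
"""

SLAVE = """\
ib0 Link encap:UNSPEC HWaddr 00-00-04-04-FE-80-00-00-00-00-00-00-00-00-00-00
inet addr:10.4.2.10 Bcast:10.255.255.255 Mask:255.0.0.0
UP BROADCAST MULTICAST MTU:2044 Metric:1
"""

# Flags present on the front end but absent on the slave.
print(sorted(interface_flags(FRONT_END) - interface_flags(SLAVE)))  # prints ['RUNNING']
```

A difference like this is only a hint about where to look. It does not say why the flag is missing.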
As of today, however, I've got opensm up ok, and the messages all look ok;
the kernel messages on slave nodes look fine. But, sadly, no joy on ipoib.
I'm not sure where to start looking given that all the positive indicators
are still positive. What's a sensible thing to do at this point?
many thanks
ron