[ofa-general] IB interfaces occasionally go down & come up for no reason

Sumeet Lahorani Sumeet.Lahorani at oracle.com
Thu Dec 18 00:28:53 PST 2008


Hi,

We sometimes see our IB interfaces go down and come back up within 2 or 
3 seconds for no apparent reason.

Dec 17 14:47:23 dscbax14s kernel: ib0: multicast join failed for ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -11
Dec 17 14:47:23 dscbax14s kernel: ib1: multicast join failed for ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -11
Dec 17 14:47:23 dscbax14s kernel: bonding: bond0: link status down for idle  interface ib0, disabling it in 5000 ms.
Dec 17 14:47:23 dscbax14s kernel: bonding: bond0: link status down for idle  interface ib1, disabling it in 5000 ms.
Dec 17 14:47:25 dscbax14s kernel: bonding: bond0: link status up again after 2000 ms for interface ib0.
Dec 17 14:47:25 dscbax14s kernel: bonding: bond0: link status up again after 2000 ms for interface ib1.

To mask these events we have set the bonding downdelay and updelay to 
5000 ms. Can anybody tell me why these interfaces might be bouncing 
down and up like this? We are not pulling cables, resetting ports, or 
resetting switches when this happens. We are using Voltaire ISR9024 
switches and Mellanox Technologies MT25418 [ConnectX IB DDR] HCAs.
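
For anyone who wants to quantify how often this happens, here is a rough 
sketch (not something we run in production) that scans syslog text for the 
bonding "link status up again" messages and collects the reported recovery 
times per interface. It assumes each message is on a single line, as in the 
raw syslog, and that the wording matches the messages quoted above; the 
function name summarize_flaps and the regular expression are my own 
illustration and may need adjusting for other kernel versions.

```python
import re
from collections import defaultdict

# Assumed message format, taken from the log excerpt in this post:
#   "... bonding: bond0: link status up again after 2000 ms for interface ib0."
UP_RE = re.compile(
    r"bonding: (\S+): link status up again after (\d+) ms "
    r"for interface (\S+?)\.?$"
)


def summarize_flaps(lines):
    """Return {interface: [recovery_ms, ...]} for each 'up again' message."""
    flaps = defaultdict(list)
    for line in lines:
        match = UP_RE.search(line.strip())
        if match:
            _bond, recovery_ms, iface = match.groups()
            flaps[iface].append(int(recovery_ms))
    return dict(flaps)


if __name__ == "__main__":
    sample = [
        "Dec 17 14:47:25 dscbax14s kernel: bonding: bond0: link status up "
        "again after 2000 ms for interface ib0.",
        "Dec 17 14:47:25 dscbax14s kernel: bonding: bond0: link status up "
        "again after 2000 ms for interface ib1.",
    ]
    for iface, times in sorted(summarize_flaps(sample).items()):
        print(iface, len(times), "flap(s), recovery times in ms:", times)
```

Feeding it the lines of /var/log/messages (or wherever the kernel log ends 
up on your system) gives a per-interface count that can be compared against 
switch or subnet manager logs for the same timestamps.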

- Sumeet



