[ofa-general] ibcheckerrors gives error 5691 within OFED 1.3.1

Wen Hao Wang wangwhao at cn.ibm.com
Wed Sep 17 01:25:00 PDT 2008



Hi all:

I had an IB cluster of eight IBM HS21 blades running a mix of RHEL 5.2
Server and SLES 10 SP2, all connected to one IB switch. opensm ran as the
subnet manager on one blade, and ibcheckerrors completed without errors.
Last week I got another eight IBM LS21 blades, connected to a second IB
switch. After I connected the two switches and turned on all the IB
adapters on the new blades, ibcheckerrors reported this error:

[root@gaia-07 ~]# ibcheckerrors
#warn: counter RcvErrors = 5691         (threshold 10) lid 3 port 1
Error check on lid 3 (gaia-07 HCA-1) port 1:  FAILED

## Summary: 19 nodes checked, 0 bad nodes found
##          46 ports checked, 1 ports have errors beyond threshold
[root@gaia-07 ~]# ibv_devinfo
hca_id: mlx4_0
        fw_ver:                         2.3.000
        node_guid:                      0002:c903:0001:3370
        sys_image_guid:                 0002:c903:0001:3373
        vendor_id:                      0x02c9
        vendor_part_id:                 25418
        hw_ver:                         0xA0
        board_id:                       IBM08A0000001
        phys_port_cnt:                  2
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                2048 (4)
                        active_mtu:             2048 (4)
                        sm_lid:                 15
                        port_lid:               3
                        port_lmc:               0x00

                port:   2
                        state:                  PORT_DOWN (1)
                        max_mtu:                2048 (4)
                        active_mtu:             2048 (4)
                        sm_lid:                 0
                        port_lid:               0
                        port_lmc:               0x00
[root@gaia-07 ~]# ibcheckport 3 1
[root@gaia-07 ~]# echo $?
0
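
For anyone comparing repeated runs, the "#warn: counter" line above can be
split into fields with a small awk filter. This is only a sketch: the
function name parse_ibcheck_warnings is hypothetical (it is not part of
infiniband-diags), and it assumes the warning lines keep exactly the layout
shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper, not part of infiniband-diags.
# Reads ibcheckerrors output on stdin and prints one line per
# "#warn: counter" record in the form:
#   <lid> <port> <counter> <value> <threshold>
# Assumes the line layout shown in the transcript above.
parse_ibcheck_warnings() {
    awk '
        # Expected input line, split on whitespace:
        #   #warn: counter RcvErrors = 5691 (threshold 10) lid 3 port 1
        $1 == "#warn:" && $2 == "counter" {
            threshold = $7
            gsub(/[^0-9]/, "", threshold)   # strip the trailing ")"
            print $9, $11, $3, $5, threshold
        }
    '
}
```

It would be used as "ibcheckerrors | parse_ibcheck_warnings". Running it
twice a few minutes apart and comparing the value column is one way to tell
a counter that is still growing from one that only holds an old total.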

I have disabled the embedded subnet managers on both IB switches. The issue
persists even after I move the subnet manager to another machine. ib0 on
gaia-07 and the other machines can communicate with each other. All
installed IB adapters are ConnectX 4x SDR. Both switches are Topspin
switches. Can anyone give some advice on this issue? Thanks in advance!

Wen Hao Wang
Email: wangwhao at cn.ibm.com