[ofa-general] Multiports single HCA uDAPL program problem

Davis, Arlin R arlin.r.davis at intel.com
Fri Jan 30 11:15:12 PST 2009


This looks like an ARP issue across your IPoIB interfaces. 

Please see section 6 of the uDAPL OFED BKM.

http://www.openfabrics.org/downloads/dapl/documentation/uDAPL_ofed_testing_bkm.pdf
 
6. Multi IB port configuration, IPoIB arp reply issues

When two interfaces are running, one interface may reply to an ARP
request directed to the other interface on the system. The following
configuration causes each interface to ignore ARP requests that are
not specifically for its own IP address.

Add the following lines to /etc/sysctl.conf:
net.ipv4.conf.all.arp_ignore=1
net.ipv4.conf.ib0.arp_ignore=1
net.ipv4.conf.ib1.arp_ignore=1

or set them on the running system with sysctl (these do not persist across a reboot):
sysctl -w net.ipv4.conf.all.arp_ignore=1
sysctl -w net.ipv4.conf.ib0.arp_ignore=1
sysctl -w net.ipv4.conf.ib1.arp_ignore=1
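
If you have more IPoIB interfaces than ib0 and ib1, a small script can generate the settings and check which ones are still missing from a sysctl.conf file. This is only a rough sketch: the helper names arp_ignore_lines and missing_settings are made up for illustration, and it assumes the simple "key = value" layout with "#" or ";" comment lines.

```python
def arp_ignore_lines(interfaces, value=1):
    """Build the arp_ignore settings for 'all' plus each named interface."""
    names = ["all"] + list(interfaces)
    return [f"net.ipv4.conf.{name}.arp_ignore={value}" for name in names]


def missing_settings(conf_text, interfaces, value=1):
    """Return the wanted settings that do not appear in conf_text."""
    present = set()
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line[0] in "#;" or "=" not in line:
            continue
        key, val = (part.strip() for part in line.split("=", 1))
        present.add(f"{key}={val}")
    return [s for s in arp_ignore_lines(interfaces, value) if s not in present]


if __name__ == "__main__":
    # Interface names are assumptions; replace with your own IPoIB interfaces.
    sample = "net.ipv4.conf.all.arp_ignore = 1\n"
    for setting in missing_settings(sample, ["ib0", "ib1"]):
        print(setting)
```

To check a real system, pass the contents of /etc/sysctl.conf as conf_text and append whatever lines are reported as missing.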

-arlin

>-----Original Message-----
>From: general-bounces at lists.openfabrics.org 
>[mailto:general-bounces at lists.openfabrics.org] On Behalf Of Jie Cai
>Sent: Thursday, January 29, 2009 10:53 PM
>To: general at lists.openfabrics.org
>Subject: [ofa-general] Multiports single HCA uDAPL program problem
>
>Hi All,
>
>I am kind of a noob at IB and uDAPL programming. Currently, I am trying
>to write a multirail program that uses 2 ports on a single Mellanox
>ConnectX HCA on both nodes.
>
>OFED1.3 has been installed on a SUSE 10.3 linux system.
>
>The current problem is that IB connections via uDAPL are very unstable,
>and sometimes the connection can't be established.
>The error message usually looks like:
>
>20350 Server waiting for connect request on port 45248
> accept: ERR dev(0x61d0e0!=0x61d0e0) or port mismatch(1!=2)
>20350 Error dat_cr_accept: DAT_INTERNAL_ERROR
>20350 Error connect_ep: DAT_INTERNAL_ERROR
>
>Both ports are active:
>hca_id:    mlx4_0
>    fw_ver:                2.3.000
>    node_guid:            0003:ba00:0100:702c
>    sys_image_guid:            0003:ba00:0100:702f
>    vendor_id:            0x02c9
>    vendor_part_id:            25418
>    hw_ver:                0xA0
>    board_id:            SUN0070000001
>    phys_port_cnt:            2
>        port:    1
>            state:            PORT_ACTIVE (4)
>            max_mtu:        2048 (4)
>            active_mtu:        2048 (4)
>            sm_lid:            10
>            port_lid:        8
>            port_lmc:        0x00
>
>        port:    2
>            state:            PORT_ACTIVE (4)
>            max_mtu:        2048 (4)
>            active_mtu:        2048 (4)
>            sm_lid:            10
>            port_lid:        9
>            port_lmc:        0x00
>
>
>I haven't done any specific configuration for multi-port. I assume that
>OFED1.3 can do it automatically.
>
>Would anyone please help me with this?
>
>Regards,
>Jie
>
>--
>Jie Cai
>