[ewg] Soft-RoCE on NetEffect NE020 10Gb RNIC
Richard Croucher
richard at informatix-sol.com
Thu May 5 03:46:04 PDT 2011
As I understand it, Soft-RoCEE exists only to let developers run
InfiniBand verbs over Ethernet cards that do not have native RoCEE drivers.
Native RoCEE requires RDMA and CEE support, whereas Soft-RoCEE has no such
dependencies. The NetEffect and Mellanox cards both have native RDMA drivers
in OFED (iWARP in the NetEffect case), and these should be used in
preference. The maintainers will accept bugs on these.
Soft-RoCEE is not currently included in or maintained by OFED; it is
maintained separately by System Fabric Works.
From: ewg-bounces at lists.openfabrics.org
[mailto:ewg-bounces at lists.openfabrics.org] On Behalf Of Bob Pearson
Sent: 05 May 2011 03:24
To: 'Tanin'; 'OpenFabrics EWG'; 'Dantong Yu'; 'fatfish'; 'Shudong Jin'
Subject: Re: [ewg] Soft-RoCE on NetEffect NE020 10Gb RNIC
Hi Mr Lee,
I will respond off list since rxe is not technically part of OFED.
Bob Pearson
From: ewg-bounces at lists.openfabrics.org
[mailto:ewg-bounces at lists.openfabrics.org] On Behalf Of Tanin
Sent: Wednesday, May 04, 2011 3:52 PM
To: OpenFabrics EWG; Dantong Yu; fatfish; Shudong Jin
Subject: [ewg] Soft-RoCE on NetEffect NE020 10Gb RNIC
Dear all,
I have installed OFED-1.5.2-rxe on our Linux host, which has three
network interfaces: a Broadcom Corporation NetXtreme II BCM5709 Gigabit
Ethernet NIC, a Mellanox Technologies MT26478 [ConnectX EN 40GigE, PCIe 2.0
5GT/s], and a NetEffect NE020 10Gb Accelerated Ethernet Adapter (iWARP RNIC).
Soft-RoCE works on the first two cards, but when I applied Soft-RoCE to the
NetEffect card and used "ibv_devinfo" to view the RDMA devices, I got the
following errors on some of the hosts in our cluster, and the whole OFED
stack does not work.
[root at netqos14 ~]# rxe_cfg status
Name Link Driver Speed MTU IPv4_addr S-RoCE RMTU
eth0 yes bnx2 1500 198.124.220.155
eth1 no bnx2 1500
eth2 no bnx2 1500
eth3 no bnx2 1500
eth4 yes iw_nes 1500 198.124.220.207 rxe0
rxe eth_proto_id: 0x8915
[root at netqos14 ~]# ibv_devinfo
hca_id: mlx4_0
transport: InfiniBand (0)
fw_ver: 2.7.626
node_guid: 0002:c903:000b:f306
sys_image_guid: 0002:c903:000b:f309
vendor_id: 0x02c9
vendor_part_id: 26428
hw_ver: 0xB0
board_id: MT_0D90110009
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 2048 (4)
active_mtu: 2048 (4)
sm_lid: 6
port_lid: 4
port_lmc: 0x00
link_layer: IB
hca_id: nes0
transport: iWARP (1)
fw_ver: 3.16
node_guid: 0012:5502:f6ac:0000
sys_image_guid: 0012:5502:f6ac:0000
vendor_id: 0x1255
vendor_part_id: 256
hw_ver: 0x5
board_id: NES020 Board ID
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 1024 (3)
sm_lid: 0
port_lid: 1
port_lmc: 0x00
link_layer: Ethernet
libnes: nes_ualloc_context: Invalid kernel driver version detected. Detected 0, should be 1
libnes: nes_ualloc_context: Failed to allocate context for device.
Failed to open device
However, on some hosts in our cluster Soft-RoCE works on the iWARP RNIC
with the same configuration. The output is as follows:
[root at netqos13 rftp]# rxe_cfg status
Name Link Driver Speed MTU IPv4_addr S-RoCE RMTU
eth0 yes bnx2 1500 198.124.220.154
eth1 no bnx2 1500
eth2 no bnx2 1500
eth3 no bnx2 1500
eth4 yes iw_nes 1500 198.124.220.206 rxe0 1024 (3)
rxe eth_proto_id: 0x8915
[root at netqos13 rftp]# ibv_devinfo
hca_id: mlx4_0
transport: InfiniBand (0)
fw_ver: 2.7.626
node_guid: 0002:c903:000b:f31e
sys_image_guid: 0002:c903:000b:f321
vendor_id: 0x02c9
vendor_part_id: 26428
hw_ver: 0xB0
board_id: MT_0D90110009
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 2048 (4)
active_mtu: 2048 (4)
sm_lid: 6
port_lid: 1
port_lmc: 0x00
link_layer: IB
hca_id: nes0
transport: iWARP (1)
fw_ver: 3.16
node_guid: 0012:5502:f208:0000
sys_image_guid: 0012:5502:f208:0000
vendor_id: 0x1255
vendor_part_id: 256
hw_ver: 0x5
board_id: NES020 Board ID
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 1024 (3)
sm_lid: 0
port_lid: 1
port_lmc: 0x00
link_layer: Ethernet
hca_id: rxe0
transport: InfiniBand (0)
fw_ver: 0.0.0
node_guid: 0212:55ff:fe02:f208
sys_image_guid: 0000:0000:0000:0000
vendor_id: 0x0000
vendor_part_id: 0
hw_ver: 0x0
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 1024 (3)
sm_lid: 0
port_lid: 0
port_lmc: 0x00
link_layer: Ethernet
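The difference between the two listings above (rxe0 is present on netqos13 but absent on netqos14) can be checked mechanically. The following is a minimal sketch, not part of OFED or rxe: the function names parse_devinfo and missing_devices are hypothetical, and the sample strings are trimmed copies of the output above. It parses saved "ibv_devinfo" text and reports which hca_id entries a reference host lists that another host does not.

```python
# Fields worth comparing between hosts; all other lines are ignored.
FIELDS = ("transport", "link_layer", "state", "active_mtu")


def parse_devinfo(text):
    """Return {hca_id: {field: value}} from saved `ibv_devinfo` text."""
    devices = {}
    current = None
    for raw in text.splitlines():
        key, sep, value = raw.strip().partition(":")
        if not sep:
            continue  # no "key: value" shape, e.g. "Failed to open device"
        key, value = key.strip(), value.strip()
        if key == "hca_id":
            current = devices.setdefault(value, {})
        elif current is not None and key in FIELDS:
            # Keep the first value seen for each field of this device.
            current.setdefault(key, value)
    return devices


def missing_devices(reference, candidate):
    """Device names present in `reference` but absent from `candidate`."""
    return sorted(set(reference) - set(candidate))


# Trimmed samples taken from the two hosts in this message.
WORKING = """hca_id: mlx4_0
transport: InfiniBand (0)
link_layer: IB
hca_id: nes0
transport: iWARP (1)
link_layer: Ethernet
hca_id: rxe0
transport: InfiniBand (0)
link_layer: Ethernet
"""

FAILING = """hca_id: mlx4_0
transport: InfiniBand (0)
link_layer: IB
hca_id: nes0
transport: iWARP (1)
link_layer: Ethernet
libnes: nes_ualloc_context: Failed to allocate context for device.
Failed to open device
"""

if __name__ == "__main__":
    gone = missing_devices(parse_devinfo(WORKING), parse_devinfo(FAILING))
    print("missing on failing host:", gone)
```

In practice the two strings would come from files captured on each host (for example by redirecting "ibv_devinfo" output), which makes it easy to run the same comparison across every node in the cluster.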
The two hosts are identical. The system info is as follows:
[root at netqos13 rftp]# uname -a
Linux netqos13 2.6.18-164.11.1.el5_lustre.1.8.3 #1 SMP Fri Apr 9 18:00:39 MDT 2010 x86_64 x86_64 x86_64 GNU/Linux
[root at netqos13 rftp]# ifconfig
eth0 Link encap:Ethernet HWaddr A4:BA:DB:1E:CC:8D
inet addr:198.124.220.154 Bcast:198.124.220.63 Mask:255.255.255.192
inet6 addr: fe80::a6ba:dbff:fe1e:cc8d/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:105558250 errors:0 dropped:0 overruns:0 frame:0
TX packets:137816731 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:95088704022 (88.5 GiB) TX bytes:156759141516 (145.9 GiB)
Interrupt:98 Memory:d2000000-d2012800
eth4 Link encap:Ethernet HWaddr 00:12:55:02:F2:08
inet addr:198.124.220.206 Bcast:198.124.220.255 Mask:255.255.255.192
inet6 addr: fe80::212:55ff:fe02:f208/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:59487544 errors:0 dropped:0 overruns:0 frame:0
TX packets:55691374 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:82372790409 (76.7 GiB) TX bytes:34462883454 (32.0 GiB)
Interrupt:130
ib0 Link encap:InfiniBand HWaddr 80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
inet addr:192.168.1.13 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::202:c903:b:f31f/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:65520 Metric:1
RX packets:4461 errors:0 dropped:0 overruns:0 frame:0
TX packets:17 errors:0 dropped:9 overruns:0 carrier:0
collisions:0 txqueuelen:256
RX bytes:264959 (258.7 KiB) TX bytes:3267 (3.1 KiB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:19040792 errors:0 dropped:0 overruns:0 frame:0
TX packets:19040792 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:147810608491 (137.6 GiB) TX bytes:147810608491 (137.6 GiB)
So my question is: why does Soft-RoCE fail on some of the NetEffect iWARP
RNICs but work on others? All of the iWARP RNICs are in different hosts of
the same cluster and are connected via a Juniper EX 2500 switch.
Any help will be greatly appreciated.
--
Best regards,
----------------------------------------------------------------------------
Li, Tan
PhD Candidate & Research Assistant,
Electrical Engineering,
Stony Brook University, NY
Personal Web Site: https://sites.google.com/site/homepagelitan/Home
Email: fanqielee at gmail.com