[ofw] Setting up Infiniband over WinXp- Help Needed
Ashwath Narasimhan
an2355 at columbia.edu
Thu Jun 11 18:14:37 PDT 2009
Hi Everyone,
Thank you so much for your replies. Still the same problem: I am able to ping
successfully from one side but not from the other.
Hi Tzachi and Leonid,
a. I followed your steps. I am able to view the infiniband data when I run
the server (ib_send_bw -a) on computer 2 and I connect to this from computer
1 (ib_send_bw -a <ip>). However, I do not view this data when I run server
on computer 1 and connect from computer 2. I get a pp_connect_sock<ip,port>
failed in the latter case.
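
The pp_connect_sock failure suggests that the initial socket connection to the server side never gets established. A minimal sketch for checking plain TCP reachability in each direction, assuming that ib_send_bw first exchanges its connection parameters over an ordinary TCP socket (my reading of the function name, not something confirmed in this thread). The address and port in the usage note are hypothetical placeholders.

```python
import socket

def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers refused connections, timeouts and unreachable hosts
        return False
```

With the server side already listening, something like `can_connect("192.168.0.1", port)` could be run from the other machine, where `port` is whatever port the server is listening on. If this returns False in only one direction, the problem is likely at the IP level rather than in the InfiniBand verbs layer.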
b. I disabled and enabled network interfaces on both ports, but no luck. It
still doesn't work.
c. I know that it's *not* a hardware issue because the same problem persists
when I interchange the infiniband cards i.e. the card that was actually
plugged into computer 2 is now plugged into computer 1 and vice versa. I get
the same issue in this case too.
d. I then installed UltraVNC and ran one end as server and the other as
client. *And it worked perfectly fine!* Both from computer 1 to computer 2
and from computer 2 to computer 1. I then installed Wireshark on both
computers. I could see Computer 2 send the ping requests to Computer 1
in Computer 2's Wireshark window, but for some bizarre reason Computer 1 was
rejecting these ping requests. When I checked the connection status of
Computer 1, I could see the number of received packets increasing, but
Computer 1 did not send back any packets.
e. I suspect this issue is arising because of some Windows XP setting on
Computer 1. The two PCs are otherwise identical: both are brand new PCs
running XP. The only difference is that I have a wifi driver on computer
1. All my firewall settings are disabled. I even uninstalled my wifi driver,
but the same problem still persists.
f. ipconfig and vstat return the correct values. Here is the output of these
commands on computer 1:
Windows IP Configuration
Host Name . . . . . . . . . . . . : LENOVO-CF61BEED
Primary Dns Suffix . . . . . . . :
Node Type . . . . . . . . . . . . : Unknown
IP Routing Enabled. . . . . . . . : No
WINS Proxy Enabled. . . . . . . . : No
DNS Suffix Search List. . . . . . : ee.columbia.edu
Ethernet adapter Local Area Connection:
Connection-specific DNS Suffix . : ee.columbia.edu
Description . . . . . . . . . . . : Marvell Yukon 88E8056 PCI-E Gigabit Ethernet Controller
Physical Address. . . . . . . . . : 00-21-97-CB-64-97
Dhcp Enabled. . . . . . . . . . . : Yes
Autoconfiguration Enabled . . . . : Yes
IP Address. . . . . . . . . . . . : 128.59.65.132
Subnet Mask . . . . . . . . . . . : 255.255.252.0
Default Gateway . . . . . . . . . : 128.59.64.1
DHCP Server . . . . . . . . . . . : 128.59.64.59
DNS Servers . . . . . . . . . . . : 128.59.64.59
128.59.16.20
Lease Obtained. . . . . . . . . . : Thursday, June 11, 2009 5:39:25 PM
Lease Expires . . . . . . . . . . : Saturday, June 13, 2009 12:39:25 PM
Ethernet adapter Local Area Connection 7:
Media State . . . . . . . . . . . : Media disconnected
Description . . . . . . . . . . . : Mellanox IPoIB Adapter #4
Physical Address. . . . . . . . . : 00-05-AD-04-E7-C6
Ethernet adapter Local Area Connection 6:
Connection-specific DNS Suffix . :
Description . . . . . . . . . . . : Mellanox IPoIB Adapter #3
Physical Address. . . . . . . . . : 00-05-AD-04-E7-C5
Dhcp Enabled. . . . . . . . . . . : Yes
Autoconfiguration Enabled . . . . : Yes
Autoconfiguration IP Address. . . : 169.254.53.191
Subnet Mask . . . . . . . . . . . : 255.255.0.0
Default Gateway . . . . . . . . . :
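
The output above lists an autoconfiguration address (169.254.53.191) on the IPoIB adapter, while the rest of the thread refers to 192.168.0.1 and 192.168.0.2. A minimal sketch, using Python's standard ipaddress module, for checking whether an address is link-local and whether a peer falls inside the adapter's subnet. This only illustrates the check. Whether the mismatch is the actual cause of the one-way ping is an open question.

```python
import ipaddress

def same_subnet(addr, mask, peer):
    """Return True if peer lies inside the subnet defined by addr and mask."""
    # strict=False lets us pass a host address instead of the network address
    net = ipaddress.ip_network(f"{addr}/{mask}", strict=False)
    return ipaddress.ip_address(peer) in net

# Values copied from the ipconfig output and the addresses named in the thread
ipoib_addr = "169.254.53.191"
ipoib_mask = "255.255.0.0"
peer_addr = "192.168.0.2"

print(ipaddress.ip_address(ipoib_addr).is_link_local)
print(same_subnet(ipoib_addr, ipoib_mask, peer_addr))
```

If the peer is not in the adapter's subnet, replies to it would normally be routed via the default gateway of another interface, which would be consistent with one side receiving packets but the answers never arriving over the IPoIB link.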
C:\<my directory>vstat
hca_idx=0
uplink={BUS=PCI_E, SPEED=2.5 Gbps,
vendor_id=0x05ad
vendor_part_id=0x6278
hw_ver=0xa0
fw_ver=0x400080395
node_guid=0005:ad00:0004:e7c4
num_phys_ports=2
port=1
port_state=PORT_ACTIVE (4)
link_speed=2.5 Gbps (1)
link_width=4x (2)
rate=10 Gbps
port_phys_state=LINK_UP (5)
active_speed=2.5 Gbps (1)
sm_lid=0x0001
port_lid=0x0002
port_lmc=0x0
max_mtu=2048 (4)
port=2
port_state=PORT_DOWN (1)
link_speed=NA
link_width=NA
rate=NA
port_phys_state=POLLING (2)
active_speed=2.5 Gbps (1)
sm_lid=0x0000
port_lid=0x0000
port_lmc=0x0
max_mtu=2048 (4)
P.S. I am using the first port.
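
When comparing both machines, the per-port state can be pulled out of the vstat text with a few lines of script. This is a minimal sketch that assumes the output has the same `port=N` / `port_state=...` layout as the listing above. The function name is hypothetical and the sample is a shortened copy of that listing.

```python
import re

def port_states(vstat_text):
    """Map port number -> port_state name found in vstat output text."""
    states = {}
    current = None
    for line in vstat_text.splitlines():
        line = line.strip()
        m = re.match(r"port=(\d+)$", line)
        if m:
            # a new "port=N" line starts the section for that port
            current = int(m.group(1))
            continue
        m = re.match(r"port_state=(\w+)", line)
        if m and current is not None:
            states[current] = m.group(1)
    return states

sample = """\
port=1
port_state=PORT_ACTIVE (4)
port_phys_state=LINK_UP (5)
port=2
port_state=PORT_DOWN (1)
port_phys_state=POLLING (2)
"""
print(port_states(sample))
```

Running the same function on the vstat output of both machines makes it easy to confirm that the port actually in use is PORT_ACTIVE on each side.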
regards,
Ashwath
On Thu, Jun 11, 2009 at 7:00 AM, Leonid Keller <leonid at mellanox.co.il> wrote:
> Hi Ashwath,
>
> If you still have problems, send us, please, the output of 'vstat -v' and
> 'ipconfig /all' on both machines.
>
> TIA
> Leonid
>
> ------------------------------
> *From:* ofw-bounces at lists.openfabrics.org [mailto:
> ofw-bounces at lists.openfabrics.org] *On Behalf Of *Tzachi Dar
> *Sent:* Thursday, June 11, 2009 4:43 PM
> *To:* Ashwath Narasimhan; Fab Tillier
> *Cc:* ofw at lists.openfabrics.org
> *Subject:* RE: [ofw] Setting up Infiniband over WinXp- Help Needed
>
> Hi Ashwath,
>
> There are a few things that I would like you to try:
>
> 1) Please run some low level IB test to see that traffic is indeed ok. On
> one computer please run
>
> ib_send_bw -a
>
> and on the other computer please run
> ib_send_bw -a 192.168.0.x (where x is the ip of the remote side.
> Please start this test with the *Ethernet* addresses of the ports).
>
> 2) Assuming all works well please try to disable and enable the network
> interfaces (ipoib) on both ports. Please see if this helps.
>
> 3) If this doesn't help, you will probably need to change the parameter of
> "Guid bitwise mask" to e7. To do this, please open the device manager,
> then go to "network adapters", select the ipoib interfaces and then right
> click properties. Select the "Guid bitwise mask" and change it to e7.
>
> If none of this helps, can you give me remote access to these stations?
>
> Thanks
> Tzachi
>
> ------------------------------
> *From:* ofw-bounces at lists.openfabrics.org [mailto:
> ofw-bounces at lists.openfabrics.org] *On Behalf Of *Ashwath Narasimhan
> *Sent:* Thursday, June 11, 2009 5:04 AM
> *To:* Fab Tillier
> *Cc:* ofw at lists.openfabrics.org
> *Subject:* Re: [ofw] Setting up Infiniband over WinXp- Help Needed
>
> Hi Fab,
>
> I restarted opensm on the other node. I ran both opensm and
> ibdiagnet on the other node (not on the node where opensm is running). The
> logs are similar to the one I attached in my previous mail. (Computer 1
> :-192.168.0.1 logs in my previous mail). I have disabled firewall settings
> on both nodes. However, I still cannot get it to work. I cannot access the
> shared folder of each node from the other. Is there something else I can
> try?
>
> p.s. There is a typo in my previous mail. I had opensm running on computer
> 2 and not computer 1.
>
> regards,
> Ashwath
>
> On Wed, Jun 10, 2009 at 9:42 PM, Fab Tillier <
> ftillier at windows.microsoft.com> wrote:
>
>> Hi Ashwath,
>>
>> >I am new to the world of infiniband and I am trying to set up an
>> >infiniband network between two Lenovo x86 Desktops (Windows Xp).
>>
>> Welcome!
>>
>> >Problem:-
>> >I am able to ping 192.168.0.2 from 192.168.0.1 however ping does not
>> >work the other way around i.e. from 192.168.0.2 to 192.168.0.1. I don't
>> >understand why this is not happening. I see that the "bind" fails but I
>> >don't understand why. Shouldn't it be two way? (I am using one cable to
>> >connect the two adaptors) Please help me. Thanks.
>>
>> Check your firewall settings on the 192.168.0.1 box. Can you access the
>> administrative share on each node from the other (\\192.168.0.1\c$, and
>> \\192.168.0.2\c$?)
>>
>> >========================================================================
>> >Computer No2: 192.168.0.2
>> >when I ran osmtest here:
>> >
>> > C:\<mydirectory>osmtest -f -a
>> >Command Line Arguments
>> >Done with args
>> > Flow = All Validations
>> >Using default guid 0x5ad000004e7c6
>> >[17:59:17:437][0388] -> osm_vendor_bind: Binding to port 0x5ad000004e7c6.
>> >[17:59:17:437][0388] -> osm_vendor_bind: ERR 3B21: Unable to register QP0 MAD service (IB_INSUFFICIENT_MEMORY).
>> >[17:59:17:437][0388] -> osmv_bind_sa: ERR 0506: Fail to bind to vendor SMI.
>> >[17:59:17:437][0388] -> osmtest_bind: ERR 0137: Unable to bind to SA
>>
>> You probably have OpenSM running on this node, yes? You can't run osmtest
>> on the same port where OpenSM is running.
>>
>> >when I ran ibdiagnet here
>> >
>> >C:\<my directory>ibdiagnet
>> >Loading IBIS from: C:/Program Files/Mellanox/MLNX_WinOF/Tools/ibdiagnet.exe/lib/ibis1.0
>> >Loading IBDM from: C:/Program Files/Mellanox/MLNX_WinOF/Tools/ibdiagnet.exe/lib/ibdm1.0
>> >-W- Topology file is not specified.
>> > Reports regarding cluster links will use direct routes.
>> >-I- Using port 2 as the local port.
>> >-E- Fail to ibsac_bind.
>>
>> Don't know the details of this tool, maybe it's running into the same
>> problems as osmtest? Try running OpenSM on the other node and see if the
>> problem follows the SM or not.
>>
>> -Fab
>>
>
>
>
> --
> regards,
> Ashwath
>
>
--
regards,
Ashwath