[ofw] RE: installation/connectivity problems on hpc server

Anatoly Greenblatt anatolyg at voltaire.com
Thu Nov 12 03:43:37 PST 2009


Hi,

 

We had 2 (and now even more) problems on these systems.

 

1st is the installation problem, but we have a workaround.

2nd is the ipoib problem; the system log shows: "Mellanox IPoIB Adapter #3:
Subnet Administrator failed query for broadcast group information."

3rd, two systems had a BSOD (minidump is attached). After the BSOD, one system
stops booting and the text screen shows that the "system failed to boot
because critical system driver is missing: mlx4_hca.sys".

 

Regards,

Anatoly.

 

________________________________

From: Tzachi Dar [mailto:tzachid at mellanox.co.il] 
Sent: Thursday, November 12, 2009 12:19 PM
To: Smith, Stan; Anatoly Greenblatt; ofw at lists.openfabrics.org
Subject: RE: [ofw] RE: installation/connectivity problems on hpc server

 

IMHO, if vstat shows the links as up, this is not an installation
problem but rather an ipoib problem.

 

Can you please run ipoib with trace and send us the logs (also, do you
have anything in the event viewer)?

 

Thanks

Tzachi

	 

	
________________________________


	From: ofw-bounces at lists.openfabrics.org [mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Smith, Stan
	Sent: Thursday, November 12, 2009 12:35 AM
	To: Anatoly Greenblatt; ofw at lists.openfabrics.org
	Subject: [ofw] RE: installation/connectivity problems on hpc server

	Hello,

	  Please see inline comments.

	 

	
________________________________


	From: ofw-bounces at lists.openfabrics.org [mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Anatoly Greenblatt
	Sent: Wednesday, November 11, 2009 1:05 PM
	To: ofw at lists.openfabrics.org
	Subject: [ofw] installation/connectivity problems on hpc server

	Hi,

	 

	Should we have any problems installing WinOF 2.1 on Server 2008
HPC edition SP2? Or am I missing something?

	 

	Have not attempted a WinOF install on Svr 2008 HPC sp2, although
I would not anticipate any problems.

	Try installing via 'start /wait msiexec /I WinOF_2-1_wlh_x64.msi
/Lv msi.log', then grep the log file for errors.
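On a Windows node without grep, the log scan suggested above could be sketched in Python. This is only a rough illustration: the function names are hypothetical, the pattern list is an assumption (Windows Installer logs mark a failed action with "Return value 3"), and the encoding guess assumes the verbose log is either UTF-16 with a byte-order mark or an 8-bit codec.

```python
import re
from pathlib import Path

# Assumed failure markers: "Return value 3" is how Windows Installer logs a
# failed action; the bare word "error" is a broad catch-all and may produce
# false positives.
FAILURE_PATTERNS = re.compile(r"return value 3|error", re.IGNORECASE)


def read_log_text(path):
    """Read a log file, decoding UTF-16 when a BOM is present, else cp1252."""
    raw = Path(path).read_bytes()
    if raw.startswith((b"\xff\xfe", b"\xfe\xff")):
        return raw.decode("utf-16")
    return raw.decode("cp1252", errors="replace")


def find_failures(text, context=2):
    """Return (line_number, surrounding_lines) for each line matching a failure pattern."""
    lines = text.splitlines()
    hits = []
    for i, line in enumerate(lines):
        if FAILURE_PATTERNS.search(line):
            start = max(0, i - context)
            hits.append((i + 1, lines[start:i + context + 1]))
    return hits
```

Calling `find_failures(read_log_text("msi.log"))` and printing each hit shows the failing action together with a few lines of context, which is usually where the installer states why it stopped.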

	 

	The installation ends prematurely claiming that previous
installation was detected. 

	Is this a MSFT installer error message or a WinOF installer
error message?

	 

	I've extracted the drivers from the WinOF HPC x64 msi and installed
them manually. 

	 

	Since you have already installed HCA drivers by hand, you want
to install WinOF with NO devices installed 

	  'start/wait msiexec /I WinOF_2-1_wlh_x64.msi NODEV=1'   # just
take the default install, no devices will be installed. 

	 

	The bus/hca/ipoib drivers were installed successfully, however
the ipoib network adapter shows status "disconnected" 

	A 'cable disconnected' state indicates the SM has not
seen/configured the port, i.e. it has not been moved to the Active port state. 

	 

	Opensm is running on a linux node and shows all ports/nodes as
connected. Vstat on the nodes shows that ports that are physically
connected are up.   UP == Active port state? 
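The question above turns on the difference between a physical link being up and the logical port state being Active. A minimal sketch of that distinction, assuming the standard InfiniBand logical port state numbering (Down=1, Initialize=2, Armed=3, Active=4); the helper names are hypothetical and nothing here parses real vstat output.

```python
from enum import IntEnum


class PortState(IntEnum):
    """Logical InfiniBand port states, numbered as in the IB specification."""
    DOWN = 1
    INIT = 2
    ARMED = 3
    ACTIVE = 4


# Short explanation of what each logical state means for data traffic.
DESCRIPTIONS = {
    PortState.DOWN: "no usable physical link",
    PortState.INIT: "physical link is up, but the subnet manager has not configured the port",
    PortState.ARMED: "configured by the subnet manager, waiting to be moved to Active",
    PortState.ACTIVE: "fully configured, data traffic can flow",
}


def describe(state):
    """Return a human-readable description of a logical port state."""
    return DESCRIPTIONS[PortState(state)]


def ipoib_can_connect(state):
    """IPoIB needs the port in the Active state; a merely 'up' physical link is not enough."""
    return PortState(state) == PortState.ACTIVE
```

For example, `ipoib_can_connect(PortState.INIT)` is false even though the cable is plugged in and the physical link is up, which would match an adapter that reports "disconnected".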

	 

	It is a c-class HP blade with ConnectX gen2, firmware 2.6.1.

	 

	Any ideas how to fix the ipoib connection?

	 

	Thanks,

	Anatoly.

	 

	 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Mini111109-02.dmp
Type: application/octet-stream
Size: 334495 bytes
Desc: Mini111109-02.dmp
URL: <http://lists.openfabrics.org/pipermail/ofw/attachments/20091112/b2b92c8c/attachment.obj>