[ofa-general] Infiniband Card Trouble

Shue, David CTR USAF AFMC AFRL/RITB David.Shue.ctr at rl.af.mil
Tue May 20 08:54:21 PDT 2008


UPDATE

A Mellanox rep gave me a reflash image, and it worked for me!

Thanks to everyone for your efforts to help me.

-Dave

-----Original Message-----
From: Mike Heinz [mailto:michael.heinz at qlogic.com] 
Sent: Thursday, May 01, 2008 11:21 AM
To: Shue, David CTR USAF AFMC AFRL/RITB; general at lists.openfabrics.org
Subject: RE: [ofa-general] Infiniband Card Trouble

#6 makes it sound like it's an OFED installation issue rather than a
problem with the HCA itself.
 
Could you post the relevant /var/log/messages? Messages from ib_mthca
would be especially important. In addition, the output from
 
mstflint -d <mypciaddress> q 
 
could also be useful.
 
--
Michael Heinz
Principal Engineer, Qlogic Corporation
King of Prussia, Pennsylvania
 

________________________________

From: general-bounces at lists.openfabrics.org
[mailto:general-bounces at lists.openfabrics.org] On Behalf Of Shue, David
CTR USAF AFMC AFRL/RITB
Sent: Thursday, May 01, 2008 9:09 AM
To: general at lists.openfabrics.org
Subject: [ofa-general] Infiniband Card Trouble



Hello,

 

I have used the OFED-1.3 software to communicate with the cards I
currently have.  These cards show up as "MT23108" in the logs, and I am
not sure who the manufacturer is.  I was able to program the cards, and
even install MPICH2 and run tests.

 

I have recently obtained new IB cards from HP, the "HP PCI-X 2-port 4X
Fabric (HPC) Adapter":
http://h20000.www2.hp.com/bizsupport/TechSupport/Home.jsp?lang=en&cc=id&prodTypeId=12883&prodSeriesId=460713&lang=en&cc=id
These cards do not work the same way.  The machine boots up fine with
the card in, and it also shows the card as a Mellanox "MT23108", even
though the two cards are visibly different in every way.  Is the MT23108
a common platform for IB cards?  I am new to IB technology.  This is the
history of what I did.


 

1)     Staged the machine with RHEL 5

2)     Installed the IB card

3)     Booted the machine

4)     I can see the card in "lspci" and "dmesg", but nothing in the
network area or under "ifconfig"  (just like with the first cards)

5)     I then installed the OFED-1.3 software to communicate with and
configure the card

6)     When I go to start the card with /etc/init.d/openib start
(instead of rebooting, though I have tried both ways), it all fails.  I
then look in the log file and see a bunch of "unknown symbol..." and
"disagrees..." messages for ib_uverbs, ib_umad, iw_cxgb3, ib_path,
mlx_ib, and so on.

7)     When I reboot, the machine reaches the "UDEV" stage of the boot,
hangs for a little while, then shows many errors and won't boot unless I
take the card out.  If I uninstall the OFED software, it reboots fine
with the card still in.  The HP card that is giving me problems does not
appear to have any drivers for it.  It looks like HP supports it on
Windows and HP-UX.
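
The "unknown symbol..." and "disagrees..." messages from step 6 often
indicate that kernel modules were built for a different kernel than the
one that is running, or that a module they depend on failed to load.  As
a rough sketch of how those log lines could be grouped per module before
posting them, something like the following Python might help.  The
message shapes in the comment are an assumption about the usual kernel
wording, not text quoted from the original logs, and
summarize_symbol_errors is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Assumed kernel message shapes (not quoted from the original logs):
#   <module>: Unknown symbol <name>
#   <module>: disagrees about version of symbol <name>
SYMBOL_ERROR = re.compile(
    r"(?P<module>[\w-]+): "
    r"(?P<kind>Unknown symbol|disagrees about version of symbol) "
    r"(?P<symbol>\w+)",
    re.IGNORECASE,
)


def summarize_symbol_errors(lines):
    """Map each module name to the sorted list of symbols it failed on."""
    failures = defaultdict(set)
    for line in lines:
        match = SYMBOL_ERROR.search(line)
        if match:
            failures[match.group("module")].add(match.group("symbol"))
    return {module: sorted(symbols) for module, symbols in failures.items()}
```

Passing an open handle on /var/log/messages (reading it usually requires
root) to summarize_symbol_errors would return one entry per failing
module, which is shorter to post than the raw log.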

 

I'm looking for any help you can provide.

 

Thanks in advance,

Dave  

 

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 

  David Shue                      

  Systems Specialist        

  Computer Sciences Corporation                                     

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< 

 



