Re: [ofa-general] sminfo report iberror in the first configuration on RHEL5.3

Wen Hao Wang wangwhao at cn.ibm.com
Sun Feb 15 17:29:30 PST 2009



Wen Hao Wang (王文昊)

Software Engineer
IBM China Software Development Laboratory
Email: wangwhao at cn.ibm.com
Tel: 86-10-82451055
Fax: 86-10-82782244 ext. 2312
Address: 1/F, IBM ZGC Campus. Ring Building 28,ZhongGuanCun Software
Park,No.8 Dong Bei Wang West Road, Haidian District Beijing, 100193,
P.R.China


Doug Ledford <dledford at redhat.com> wrote on 2009-02-14 00:13:32:

> On Fri, 2009-02-13 at 08:05 +0800, Wen Hao Wang wrote:
> > Doug Ledford <dledford at redhat.com> wrote on 2009-02-12 21:20:30:
> >
> > > On Thu, 2009-02-12 at 13:20 +0200, Tziporet Koren wrote:
> > > > Wen Hao Wang wrote:
> > > > >
> > > > > Hi all:
> > > > >
> > > > > I changed my blade OS to RHEL5.3 yesterday and installed OFED
> > > > > (shipped in the RHEL5.3 image) with "yum groupinstall". Then I
> > > > > loaded some drivers and wrote the network interface configuration
> > > > > file ifcfg-ib0. "ifup ib0" also succeeded. But the IB utilities
> > > > > report "Connection timed out".
> > > > >
> > > > >
> > > > > [root@xblade06 network-scripts]# sminfo
> > > > > ibwarn: [32593] _do_madrpc: recv failed: Connection timed out
> > > > > ibwarn: [32593] mad_rpc: _do_madrpc failed; dport (Lid 9)
> > > > > sminfo: iberror: failed: query
> > > > >
> > > > > I had to reboot the blade and rerun "openibd start". Then sminfo
> > > > > reported correct contents. I do not think this reboot should be
> > > > > required. Did I miss any configuration step?
> > >
> > > There was an unintentional bug in the rhel5.2 openibd init script in
> > > that it automatically turned itself on during install (generally, most
> > > init scripts should default to *not* turning themselves on during
> > > install of the package, nor should they start themselves during install
> > > of the package...this is for security reasons, imagine if you installed
> > > the bind name server on your box and it automatically started up before
> > > you had a chance to configure it).  In rhel5.3 we fixed that bug.  So,
> >
> > Yeah. I heard of this bug.
> >
> > > you may need to 'chkconfig --level 2345 openibd on' to make sure openibd
> > > starts up each time.  The error you list above is consistent with not
> > > all of the kernel modules being loaded when you tried to use the sminfo
> > > program.
> >
> > Even after reboot, service openibd is not started automatically.
> > [root@xblade06 ~]# chkconfig --list openibd
> > openibd         0:off   1:off   2:off   3:off   4:off   5:off   6:off
>
> That's because you have to run the command I listed in my first email to
> turn it on.
>

I totally agree with this. But I am still confused about why sminfo gave
errors before the reboot, and about which steps I should take when using
OFED for the first time, before any reboot. As far as I can see, whether
the service is registered for the system runlevels is unrelated to the
sminfo error. Please correct me if that is not the case.
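
For reference, one way to narrow this down is to compare the currently loaded kernel modules against the ones the diagnostics are expected to need. The sketch below is only an illustration: the helper name missing_modules is hypothetical, and the module names in the usage comment (ib_core, ib_mad, ib_umad) are an assumption about what sminfo depends on, not a list taken from the openibd script.

```shell
#!/bin/sh
# missing_modules: read `lsmod` output on stdin and print every module
# named in the arguments that is not currently loaded.
# (Hypothetical helper; the module list in the usage line is an assumption.)
missing_modules() {
    # Collect the first column of the lsmod output, skipping the header line.
    loaded=$(awk 'NR > 1 { print $1 }')
    for m in "$@"; do
        # -x: whole-line match, -F: fixed string, -q: no output
        printf '%s\n' "$loaded" | grep -qxF "$m" || printf '%s\n' "$m"
    done
}

# Usage on a live system:
#   lsmod | missing_modules ib_core ib_mad ib_umad
```

If this prints nothing, the listed modules are loaded and the timeout probably has another cause; any name it prints is a candidate for a manual modprobe before retrying sminfo.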

> > I agree with you that maybe some modules were not loaded. But which
> > ones? Before reboot, I ran "/etc/init.d/openibd start" and
> > "/etc/init.d/network restart". No error was reported. "openibd status"
> > also looked good.
>
> Running start on a service does not enable that service at the next
> reboot.  You must specifically enable the service in order for it to
> start automatically.
>
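
The difference between a service that was started by hand and one that is enabled for boot can be read directly from the chkconfig listing. A minimal sketch, assuming the one-line "chkconfig --list <service>" format quoted earlier in this thread; the helper name enabled_runlevels is hypothetical:

```shell
#!/bin/sh
# enabled_runlevels: read one line of `chkconfig --list <service>` output
# on stdin and print the runlevels in which the service is switched on.
# (Hypothetical helper, not part of chkconfig or the openibd package.)
enabled_runlevels() {
    awk '{
        sep = ""
        # Field 1 is the service name; the rest look like "3:on" or "3:off".
        for (i = 2; i <= NF; i++) {
            split($i, kv, ":")
            if (kv[2] == "on") {
                printf "%s%s", sep, kv[1]
                sep = " "
            }
        }
        print ""
    }'
}

# Usage on a live system:
#   chkconfig --list openibd | enabled_runlevels
```

An empty result, as with the all-off listing in this thread, means the service will not start at boot, whether or not "/etc/init.d/openibd start" was run by hand in the current session.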
> > >
> > > > > Moreover, "openibd start" reports one warning message about hwconf.
> > > > > Does anyone have comments about this?
> > > > >
> > > > > [root@xblade07 ~]# /etc/init.d/openibd start
> > > > > Loading OpenIB kernel modules:grep: /etc/sysconfig/hwconf: No such file or directory
> > > > > [ OK ]
> > >
> > > Can you see if the kudzu package is installed on your machine?  The
> > > openib package uses this config file written by kudzu to determine what
> > > hardware drivers to load.  I suppose I should put a specific requires in
> > > the rpm for that.
> >
> > kudzu is installed.
> > [root@xblade06 ~]# rpm -q kudzu
> > kudzu-1.2.57.1.21-1
>
> Make sure kudzu has been run at least once then (it would appear to be
> turned off on your machine or else /etc/sysconfig/hwconf would exist).
> You can run it manually from the command line and that should be
> sufficient for the openibd init script's needs.
>

Yes. After kudzu created the file on my machine, the openibd script reported
no error this time. I would like to know whether "openibd restart" is
needed/required in my scenario.
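
A small pre-flight check along these lines can show whether the file the init script greps is in place before openibd is started. This is only a sketch: the function name hwconf_status is hypothetical, and treating an empty file the same as a missing one is an assumption, not documented behaviour of the openibd script.

```shell
#!/bin/sh
# hwconf_status: report whether the kudzu hardware config file exists and
# is non-empty. The path defaults to /etc/sysconfig/hwconf (the file named
# in the grep warning); a different path can be passed as the first argument.
# (Hypothetical helper for illustration only.)
hwconf_status() {
    conf=${1:-/etc/sysconfig/hwconf}
    if [ -s "$conf" ]; then
        echo "present"
    else
        echo "missing"
    fi
}

# Usage on a live system:
#   hwconf_status
```

If it reports "missing", running kudzu once by hand, as suggested above, should create the file.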

Many thanks!

Wen Hao Wang
Email: wangwhao at cn.ibm.com

> --
> Doug Ledford <dledford at redhat.com>
>               GPG KeyID: CFBFF194
>               http://people.redhat.com/dledford
>
> Infiniband specific RPMs available at
>               http://people.redhat.com/dledford/Infiniband
>
> [Attachment "signature.asc" deleted by Wen Hao Wang/China/IBM]