[openfabrics-ewg] [openib-general] FW: Mstflint - not working on ppc64 and when driver is not loaded on AMD

Moshe Kazir moshek at voltaire.com
Wed Sep 27 22:38:31 PDT 2006


Michael wrote :
> Since I don't consider this a critical fix (there's no reason driver 
> won't go up, and if it does not, there's a simple workaround by 

Michael,
mstflint run in the "classic way" in OFED-1.1 does not work
on PPC64 sles10 !!!

Telling the customer to use a workaround (open /proc...) if their
platform is PPC64 is not nice !!

We need to fix the bug in the code !

Frank wrote :
> The patch can be enabled by defining CONFIG_MOPEN_FALL_BACK to 1.
> CONFIG_MOPEN_FALL_BACK is defined to 1 for ppc64 and x86_64 and 0 for
> others

This define keeps the program from being broken when running on other
platforms.

Can you have a look at the code once more and tell us (me and Frank)
how you want it refined?

It's o.k. for us if the fix goes into OFED-1.2, but we need
it in the code !

Moshe


____________________________________________________________
Moshe Katzir   |  +972-9971-8639 (o)   |   +972-52-860-6042  (m)
 
Voltaire - The Grid Backbone
 
 www.voltaire.com

  


-----Original Message-----
From: Tseng-Hui (Frank) Lin [mailto:thlin at us.ibm.com] 
Sent: Wednesday, September 27, 2006 7:46 PM
To: Michael S. Tsirkin
Cc: Moshe Kazir; Tseng-hui Lin; openib-general at openib.org
Subject: Re: [openib-general] FW: Mstflint - not working on ppc64
and when driver is not loaded on AMD


On Wed, 2006-09-27 at 18:19 +0300, Michael S. Tsirkin wrote:
> Quoting r. Moshe Kazir <moshek at voltaire.com>:
> > Subject: FW: [openib-general] Mstflint - not working on ppc64 and 
> > when driver is not loaded on AMD
> > 
> > Michael,
> >  
> > Frank's new version was tested once more at Voltaire and is working 
> > o.k. I tested  `./mstflint -d <lspci output> q`  when drivers are 
> > loaded and when drivers are not loaded. In all cases it worked o.k.
> 
> Thanks for testing, but I'd like to get a handle on what's going on 
> first.
> 
> First, I'm pretty sure when driver is loaded things work OK on all 
> systems. When driver is not loaded - could you please answer whether 
> using /sys/bus/pci/devices/0000\:03\:00.0/resource0
> works for you (on systems that have resource0)?
> 

It doesn't work.

> >  
> > Test was performed on the following environments :
> >  
> > -    IBM js21 ppc64 sles10 PCI-E
> > -    IBM js21 ppc64 sles9 sp3 PCI-E
> > -    IBM hs21 em64t redhat as 4 u3 PCI-E
> > -    IBM hs21 em64t sles 9 sp3 PCI-E
> > -    x86_64 sles10  PCI-E
> > -    MAC ppc64 sles10 PCI-X
> > -    MAC ppc64 sles10 PCI-E
> >
> > Please consider inserting the patch into OFED.
> >  
> > Moshe
> 
> Since I don't consider this a critical fix (there's no reason driver 
> won't go up, and if it does not, there's a simple workaround by 
> specifying the /proc interface, that is slower but works), I don't 
> think this should go into OFED 1.1.
> 
> Unfortunately, I never got a small bugfix patch against the latest 
> mstflint - the patch I saw posted touches all kinds of things all over 
> the code - so I can't insert it in trunk, either.
> 

I agree this is not critical. The patch changes nothing but the way of
opening the device.

On some ppc64 and x86_64 machines, the I/O memory mapped by mmap() is
not accessible (reads return 0xFFFFFFFF) unless kernel code (usually the
device driver) does an ioremap. This is why mmap of resource0 does not
work on these machines. I am not aware of any way to do an ioremap from
user-space code like mstflint. The only thing I can think of is to fall
back to using the config space file in /proc/bus/pci/.
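
As a rough illustration of the symptom (this is not code from the patch),
a check for "the mapping only reads back 0xFFFFFFFF" could look like the
sketch below. The helper name mapping_looks_dead() and the idea of
sampling several words are assumptions made for this example; a real
device register can legitimately contain 0xFFFFFFFF, so a single read
would not be a reliable test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the mstflint patch.
 * Returns 1 if each of the first nwords 32-bit words reads back as
 * 0xFFFFFFFF (the symptom of a mapping that is not really accessible),
 * and 0 as soon as any word holds something else.
 * With nwords == 0 nothing is read and the result is 1. */
int mapping_looks_dead(const volatile uint32_t *regs, size_t nwords)
{
    size_t i;

    assert(regs != NULL);
    for (i = 0; i < nwords; i++) {
        if (regs[i] != 0xFFFFFFFFu)
            return 0;   /* real data came back, the mapping works */
    }
    return 1;           /* nothing but all-ones: treat as not accessible */
}
```

In a tool like mstflint the regs pointer would be the address returned
by mmap() on resource0; here any array of 32-bit words can stand in for
it.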

The (big) patch I made checks if the faster way (mmap resource0) works.
If it doesn't, the patch tries the other, slower ways and uses the
fastest working way it can find. That's all the patch does. It is not a
big fix. It just saves users the trouble of manually trying all possible
ways of opening a device.
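
The "try the fastest way first, then fall back" idea can be sketched as
follows. This is only an outline under assumptions: the struct, the
function pick_access_method(), and the two stub probes are hypothetical
names invented for the example and are not the functions in the patch.
Real probes would actually open the device (mmap resource0, the
/proc/bus/pci config file, and so on) and check that it answers.

```c
#include <stddef.h>

/* A probe returns nonzero if the device can be used through this method. */
typedef int (*probe_fn)(const char *dev);

struct access_method {
    const char *name;   /* label for messages only */
    probe_fn    probe;  /* hypothetical check that the method works */
};

/* Stub probes for illustration; real ones would touch the hardware. */
int probe_stub_fail(const char *dev) { (void)dev; return 0; }
int probe_stub_ok(const char *dev)   { (void)dev; return 1; }

/* The methods array must be ordered from fastest to slowest.
 * Returns the first method whose probe succeeds, or NULL if none does. */
const struct access_method *
pick_access_method(const struct access_method *methods, size_t n,
                   const char *dev)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (methods[i].probe(dev))
            return &methods[i];
    }
    return NULL;
}
```

Because the list is ordered by speed, the first success is also the
fastest working method, so no further comparison is needed.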

I understand that applying a big patch is risky unless it can be
thoroughly tested. Unfortunately, no one has all the machines needed to
test the patch. Moshe and I have tested the patch on Power MAC,
Squadrons, JS20, and JS21 (almost all living ppc64 machines) as well as
a few x86_64 machines. We believe this patch is safe for these machines.
The patch can be enabled by defining CONFIG_MOPEN_FALL_BACK to 1.
CONFIG_MOPEN_FALL_BACK is defined to 1 for ppc64 and x86_64 and 0 for
others. We can enable this patch on other machines once people who have
those machines have tested it.
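
One possible way to express such a per-platform default is shown below.
Only the name CONFIG_MOPEN_FALL_BACK and its 1/0 values come from the
description above. How the patch itself detects the platform is not
stated here, so the use of the compiler-predefined macros __powerpc64__
and __x86_64__ (as provided by GCC) and the helper
mopen_fall_back_enabled() are assumptions for illustration.

```c
/* Allow an override from the build, e.g. -DCONFIG_MOPEN_FALL_BACK=0. */
#ifndef CONFIG_MOPEN_FALL_BACK
#if defined(__powerpc64__) || defined(__x86_64__)
#define CONFIG_MOPEN_FALL_BACK 1   /* tested platforms: fall back enabled */
#else
#define CONFIG_MOPEN_FALL_BACK 0   /* untested platforms: behavior unchanged */
#endif
#endif

/* Hypothetical helper so callers can test the setting at run time. */
int mopen_fall_back_enabled(void)
{
    return CONFIG_MOPEN_FALL_BACK;
}
```

With the value at 0 the fall-back code can be compiled out entirely,
which is what keeps other platforms unaffected.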

I agree this is not a critical patch, but it is a useful one. Moreover,
it is well tested on the machines where the patch is enabled and changes
nothing on the machines where it is disabled. I believe this is a
safe patch. Please reconsider adding it. Thanks.





