[openfabrics-ewg] FW: Mstflint - not working on ppc64 and when driver is not loaded on AMD

Moshe Kazir moshek at voltaire.com
Thu Sep 28 06:59:14 PDT 2006


Michael,

Frank found the cause of the problem in the implementation of
arch/ppc/kernel/pci.c, and asked the IBM kernel group to send a bug fix
to the Linux kernel group.

The problem is:

1. This bug fix will not enter SLES10, as it is closed.
2. It also will not enter SLES9 :-) or Red Hat AS4 U4.

So we need a bug fix that will enable the use of mstflint on js21 PPC64,
plus a backport to old systems.

Frank's fix is based on two points (if I understand the code
correctly):

1. It opens /proc/bus/pci... and not /sys/bus/pci/...
2. It performs an ioctl(fd, PCIIOC_MMAP_IS_MEM);

Frank - am I right?

Can we add these two small changes to mstflint so that it works
on the PPC64 js21?
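
For illustration only, the two changes described above could look roughly like the sketch below. It is not taken from mstflint or from Frank's patch: the helper name map_pci_mem_procfs and its error reporting are hypothetical, and the offset is assumed to be whatever value the caller has determined is right for the BAR on this platform.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/pci.h>  /* PCIIOC_MMAP_IS_MEM */

/*
 * Hypothetical helper, not mstflint code: map `len` bytes of PCI memory
 * space through a /proc/bus/pci/<bus>/<dev>.<fn> node.
 * Returns the mapping, or NULL with *err set to the errno of the
 * step that failed.
 */
void *map_pci_mem_procfs(const char *path, off_t offset, size_t len, int *err)
{
        void *ptr;
        int fd = open(path, O_RDWR | O_SYNC);

        if (fd < 0) {
                *err = errno;
                return NULL;
        }
        /* Select memory space (not I/O space) for the mmap that follows. */
        if (ioctl(fd, PCIIOC_MMAP_IS_MEM) < 0) {
                *err = errno;
                close(fd);
                return NULL;
        }
        ptr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, offset);
        if (ptr == MAP_FAILED) {
                *err = errno;
                close(fd);
                return NULL;
        }
        close(fd);      /* the mapping stays valid after the fd is closed */
        *err = 0;
        return ptr;
}
```

Returning the errno of the failing step makes it possible to tell an open() failure from an ioctl() or mmap() failure without running strace again.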

Moshe 





____________________________________________________________
Moshe Katzir   |  +972-9971-8639 (o)   |   +972-52-860-6042  (m)
 
Voltaire - The Grid Backbone
 
 www.voltaire.com

  


-----Original Message-----
From: Michael S. Tsirkin [mailto:mst at mellanox.co.il] 
Sent: Thursday, September 28, 2006 4:41 PM
To: Moshe Kazir
Cc: Tseng-Hui (Frank) Lin; openfabrics-ewg at openib.org;
openib-general at openib.org
Subject: Re: FW: Mstflint - not working on ppc64 and when driver is not
loaded on AMD


Quoting r. Moshe Kazir <moshek at voltaire.com>:
> 
> Quoting r. Moshe Kazir <moshek at voltaire.com>:
> > Subject: RE: FW: Mstflint - not working on ppc64 and when driver is
> > not loaded on AMD
> > 
> > 
> >  # ls /sys/class/infiniband/mthca0/device/resource0
> > /sys/class/infiniband/mthca0/device/resource0
> 
> OK, so can you try this please:
> 
> strace -f -v -o log mstflint -d /sys/class/infiniband/mthca0/device/resource0 q
> 
> cat log
> 
> --
> MST



> 30463 open("/sys/class/infiniband/mthca0/device/resource0", O_RDWR|O_SYNC|O_LARGEFILE) = 3
> 30463 mmap2(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = -1 EINVAL (Invalid argument)

So we see that mmap is failing with EINVAL.
But why? We seem to be passing all valid parameters to it.

I'm looking at arch/ppc/kernel/pci.c at the moment.
It seems that EINVAL is returned if __pci_mmap_make_offset
fails, and that seems to be only looking for a valid resource size.

Are you up to finding the root cause of the problem in
arch/ppc/kernel/pci.c?

Maybe the resource offsets are wrong? What does
cat /sys/class/infiniband/mthca0/device/resource
show?
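
As a rough aid for checking those offsets: each line of the sysfs resource file holds three hex values (start, end, flags), so the size of a region is end - start + 1. The sketch below parses one such line; the struct and function names are hypothetical and are not part of mstflint.

```c
#include <stdio.h>

/* One line of a sysfs PCI "resource" file: start, end, flags (all hex). */
struct pci_res {
        unsigned long long start;
        unsigned long long end;
        unsigned long long flags;
};

/* Parse one line; returns 0 on success, -1 if the line is malformed. */
int parse_resource_line(const char *line, struct pci_res *res)
{
        if (sscanf(line, "%llx %llx %llx",
                   &res->start, &res->end, &res->flags) != 3)
                return -1;
        return 0;
}

/* Size of the region; an all-zero line means the resource is unused. */
unsigned long long pci_res_size(const struct pci_res *res)
{
        if (res->start == 0 && res->end == 0)
                return 0;
        return res->end - res->start + 1;
}
```

If the first line gives a size smaller than the 1048576 bytes that mstflint tries to map, that alone could explain the EINVAL.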

Maybe there's some problem mapping a full megabyte?
Here's a test that only maps 4K. Could you strace it please?

>>>>>>>>>>>

#define _XOPEN_SOURCE 500
#define _FILE_OFFSET_BITS 64

#include <stdio.h>

#include <unistd.h>

#include <netinet/in.h>
#include <endian.h>
#include <byteswap.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <stdlib.h>

#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/pci.h>
#include <sys/stat.h>

int main()
{
        int fd;
        unsigned value;
        void *ptr;

        fd = open("/proc/bus/pci/00/00.0", O_RDWR | O_SYNC);
        if (fd < 0) {
                perror("open");
                return 1;
        }

        /* ioctl(fd, PCIIOC_MMAP_IS_MEM); */
        /* Map a single 4K page at offset 0xf0000. */
        ptr = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE, MAP_SHARED, fd,
                   0xf0000);
        if (ptr == MAP_FAILED) {
                perror("mmap");
                return 1;
        }

        /* Read one 32-bit word at byte offset 0x14 of the mapping. */
        memcpy(&value, (char *)ptr + 0x14, sizeof value);
        printf("0x%x\n", value);
        return 0;
}
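
To compare the 1 MB and 4K cases against the same file without reading strace output, something like the sketch below could be used. The helper name try_map is hypothetical; it always maps at offset 0, as in the failing mmap2 call in the log above.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to map `len` bytes at offset 0 of `path`.
 * Returns 0 on success, or the errno of the step that failed.
 */
int try_map(const char *path, size_t len)
{
        void *ptr;
        int saved;
        int fd = open(path, O_RDWR | O_SYNC);

        if (fd < 0)
                return errno;
        ptr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        saved = errno;          /* keep mmap's errno across close() */
        close(fd);
        if (ptr == MAP_FAILED)
                return saved;
        munmap(ptr, len);
        return 0;
}
```

Calling try_map() on /sys/class/infiniband/mthca0/device/resource0 once with 1048576 and once with 0x1000 would show whether the mapping size makes any difference to the EINVAL.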



-- 
MST



