[openib-general] EEH: MMIO Failure on Power5

Pradeep Satyanarayana pradeep at us.ibm.com
Tue Sep 20 11:20:42 PDT 2005






Yes, it should be possible to interpret the EEH logs. I tried that once
(it was my first attempt) and got lost. Once we discovered the
workaround, the impetus to pursue it faded and we did not look into it
further, especially since we were not sure of the extent of the problem.

Now that others are seeing it, it looks like this issue is worth a second
look. Let me talk to some folks within IBM and see if they can help with
the EEH specifics.

Pradeep
pradeep at us.ibm.com


                                                                           
From:    Roland Dreier <rolandd at cisco.com>
Date:    09/20/2005 11:01 AM
To:      Pradeep Satyanarayana/Beaverton/IBM at IBMUS
cc:      openib-general at openib.org,
         openib-general-bounces at openib.org,
         Thaddeus Ternes <tternes at gmail.com>
Subject: Re: [openib-general] EEH: MMIO Failure on Power5




    Pradeep> We did see some similar issues on a p570 (and yes I have
    Pradeep> heard others report no problems on 710 systems). The
    Pradeep> work-around that we discovered was to not load the IB
    Pradeep> modules at boot time. Suspect there could be some
    Pradeep> sequencing issue that Roland points out.

This is really interesting.  Is there any way you can interpret the
EEH information to find out what is going wrong?

From what you're saying, it sounds like the mthca module is being
loaded before the ppc64 PCI core code is done setting up, which sounds
like a bug in the kernel that we would want to fix.

Thanks,
  Roland