[ofa-general] mpi failures on large ia64/ofed/IB clusters
    Roland Dreier 
    rdreier at cisco.com
       
    Fri Oct  5 15:51:21 PDT 2007
    
    
  
 > I don't really see anything racy in the FW command stuff, but it's
 > possible that there's something like an mmiowb() missing somewhere (I
 > have a hard time spotting that type of race for some reason).
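Another possibility (independent of the hack I suggested before) would
be to add an mmiowb() before the mutex_unlock() in mthca_cmd_post(),
so that the MMIO writes done under the mutex are ordered ahead of the
unlock and can't be passed by writes from the next CPU to take the mutex.
I actually have a good feeling about this theory....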
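
To make the ordering concrete, here is a rough userspace sketch, not a patch.
Everything in it is a stand-in I made up for illustration: struct fake_dev,
fake_writel(), cmd_post_sketch(), and the stub mutex/mmiowb() definitions are
assumptions, not the real kernel primitives or the real mthca_cmd_post() body.
The stubs only log the order of operations so the intended sequence
(lock, MMIO write, mmiowb(), unlock) is visible.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins only: none of this is the real kernel or mthca code. */
enum event { EV_LOCK, EV_MMIO_WRITE, EV_MMIOWB, EV_UNLOCK };

#define MAX_EVENTS 16
static enum event trace[MAX_EVENTS];
static int trace_len;

/* Append one event to the trace so the ordering can be inspected later. */
static void record(enum event e)
{
	assert(trace_len < MAX_EVENTS);
	trace[trace_len++] = e;
}

/* Hypothetical stub mutex; the kernel's struct mutex is far more involved. */
struct mutex { int locked; };

static void mutex_lock(struct mutex *m)   { m->locked = 1; record(EV_LOCK); }
static void mutex_unlock(struct mutex *m) { m->locked = 0; record(EV_UNLOCK); }

/*
 * In the kernel, mmiowb() orders earlier MMIO writes ahead of a following
 * lock release. This stub only logs that it was called.
 */
static void mmiowb(void) { record(EV_MMIOWB); }

/* Hypothetical MMIO write helper standing in for a register write. */
static void fake_writel(uint32_t val, volatile uint32_t *reg)
{
	*reg = val;
	record(EV_MMIO_WRITE);
}

/* Hypothetical device with one command register guarded by a mutex. */
struct fake_dev {
	struct mutex hcr_mutex;
	volatile uint32_t hcr_go;
};

/* Sketch of the proposed ordering: barrier goes just before the unlock. */
static int cmd_post_sketch(struct fake_dev *dev, uint32_t cmd_word)
{
	mutex_lock(&dev->hcr_mutex);
	fake_writel(cmd_word, &dev->hcr_go);
	mmiowb();                       /* the proposed addition */
	mutex_unlock(&dev->hcr_mutex);
	return 0;
}
```

Calling cmd_post_sketch() once on a zeroed struct fake_dev leaves the trace as
lock, write, mmiowb, unlock. The only point is where the barrier sits: after
the last register write and before the mutex is released.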
 - R.
    
    