[ofw] RE: bugcheck in mlx4_bus

Tzachi Dar tzachid at mellanox.co.il
Sun Aug 23 05:47:41 PDT 2009


It seems that the verifier is complaining that MmProbeAndLockPages
was called without a matching call to MmUnlockPages.

Can you please count the number of calls to each of these functions?
Also, can you check that the same MDL is passed to both calls?
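
One possible way to do that counting is sketched below. This is only a
user-space illustration, not code from winverbs or mlx4_bus: the helper
names (track_lock, track_unlock, outstanding_locks) and the fixed-size
table are hypothetical, and the idea is just to call one helper next to
each MmProbeAndLockPages and the other next to each MmUnlockPages, then
look at what is left over at cleanup.

```c
#include <stddef.h>

#define MAX_TRACKED 64   /* arbitrary table size, an assumption */

/* Hypothetical tracking table: one slot per MDL that is currently locked. */
static const void *locked_mdls[MAX_TRACKED];
static long lock_calls;         /* total lock calls seen */
static long unlock_calls;       /* total unlock calls seen */
static long unmatched_unlocks;  /* unlocks for an MDL that was not tracked */

/* Call next to each MmProbeAndLockPages(mdl, ...). Returns -1 if the table is full. */
static int track_lock(const void *mdl)
{
    lock_calls++;
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (locked_mdls[i] == NULL) {
            locked_mdls[i] = mdl;
            return 0;
        }
    }
    return -1;
}

/* Call next to each MmUnlockPages(mdl). Returns -1 if this MDL was never locked. */
static int track_unlock(const void *mdl)
{
    unlock_calls++;
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (locked_mdls[i] == mdl) {
            locked_mdls[i] = NULL;
            return 0;
        }
    }
    unmatched_unlocks++;
    return -1;
}

/* Number of MDLs locked but not yet unlocked; should be 0 at cleanup. */
static size_t outstanding_locks(void)
{
    size_t n = 0;
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (locked_mdls[i] != NULL)
            n++;
    }
    return n;
}
```

If outstanding_locks() is non-zero when the file is closed, or
unmatched_unlocks is non-zero, the lock and unlock calls are not being
paired on the same MDL. In a real driver the table and counters would
need to be protected against concurrent access (a spin lock, for
example); the sketch leaves that out.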

Thanks
Tzachi

> -----Original Message-----
> From: ofw-bounces at lists.openfabrics.org 
> [mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Sean Hefty
> Sent: Friday, August 21, 2009 1:18 AM
> To: Hefty, Sean; 'Fab Tillier'; ofw at lists.openfabrics.org
> Subject: RE: [ofw] RE: bugcheck in mlx4_bus
> 
> I added tracking to the winverbs driver to 
> increment/decrement a counter when creating/destroying any of 
> the following: CQ, device, endpoint (CM structure), PD, MW, 
> MR, AH, QP, and SRQ.  All counters end at 0 after cleaning up 
> when the file is closed (done in the WDF file cleanup callback).
> 
> Some rank from MPI PingPong occasionally crashes while 
> starting up a test.  The crash occurs running the DAPL 
> rdma_cm provider, but the kernel bug may or may not be 
> related to the use of the rdma_cm.  The user space code may 
> just crash at the wrong (or right) time with that provider to 
> trigger this error.  The kernel crash doesn't occur every time.
> 
> Anyone have any other ideas to help isolate?


