[ofw] RE: bugcheck in mlx4_bus
Tzachi Dar
tzachid at mellanox.co.il
Sun Aug 23 05:47:41 PDT 2009
It seems that the problem the verifier complains about is that
MmProbeAndLockPages is being called without a matching call to
MmUnlockPages.
Can you please count the number of calls to each of these functions? Also,
can you check that each MmUnlockPages call is passed the same MDL that was
given to MmProbeAndLockPages?
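
One possible way to do the counting is sketched below. This is only a user-space model in plain C, not code from mlx4_bus or winverbs: the names lock_tracker, tracker_on_lock, tracker_on_unlock and tracker_outstanding are hypothetical, and the MDL is represented by an opaque pointer. In a real driver the hooks would sit next to the actual MmProbeAndLockPages and MmUnlockPages calls, and the table would need to be protected (for example with a spin lock), which the sketch leaves out.

```c
#include <stddef.h>

#define MAX_TRACKED 64  /* assumed upper bound, chosen only for this sketch */

/* Hypothetical tracker: remembers which MDL pointers are currently locked. */
typedef struct {
    const void   *locked[MAX_TRACKED]; /* MDLs locked and not yet unlocked */
    size_t        n_locked;            /* number of valid entries in locked[] */
    unsigned long lock_calls;          /* total lock calls observed */
    unsigned long unlock_calls;        /* total unlock calls observed */
    unsigned long mismatched_unlocks;  /* unlocks of an MDL we never saw locked */
} lock_tracker;

static void tracker_init(lock_tracker *t)
{
    t->n_locked = 0;
    t->lock_calls = 0;
    t->unlock_calls = 0;
    t->mismatched_unlocks = 0;
}

/* Intended to be called right after a successful MmProbeAndLockPages(mdl, ...).
 * Returns 0 on success, -1 if the table is full and the MDL was not recorded. */
static int tracker_on_lock(lock_tracker *t, const void *mdl)
{
    t->lock_calls++;
    if (t->n_locked == MAX_TRACKED)
        return -1;
    t->locked[t->n_locked++] = mdl;
    return 0;
}

/* Intended to be called right before MmUnlockPages(mdl).
 * Returns 0 if this MDL was recorded as locked, -1 if it was not. */
static int tracker_on_unlock(lock_tracker *t, const void *mdl)
{
    size_t i;

    t->unlock_calls++;
    for (i = 0; i < t->n_locked; i++) {
        if (t->locked[i] == mdl) {
            /* Remove the entry by moving the last one into its slot. */
            t->locked[i] = t->locked[--t->n_locked];
            return 0;
        }
    }
    t->mismatched_unlocks++;
    return -1;
}

/* Number of MDLs still locked; should be 0 once cleanup has finished. */
static size_t tracker_outstanding(const lock_tracker *t)
{
    return t->n_locked;
}
```

If tracker_outstanding() is non-zero after the file cleanup, or mismatched_unlocks is non-zero, that would point to a lock without a matching unlock, or to an unlock on a different MDL.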
Thanks
Tzachi
> -----Original Message-----
> From: ofw-bounces at lists.openfabrics.org
> [mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Sean Hefty
> Sent: Friday, August 21, 2009 1:18 AM
> To: Hefty, Sean; 'Fab Tillier'; ofw at lists.openfabrics.org
> Subject: RE: [ofw] RE: bugcheck in mlx4_bus
>
> I added tracking to the winverbs driver to
> increment/decrement a counter when creating/destroying any of
> the following: CQ, device, endpoint (CM structure), PD, MW,
> MR, AH, QP, and SRQ. All counters end at 0 after cleaning up
> when the file is closed (done in the WDF file cleanup callback).
>
> Some rank from MPI PingPong occasionally crashes while
> starting up a test. The crash occurs running the DAPL
> rdma_cm provider, but the kernel bug may or may not be
> related to the use of the rdma_cm. The user space code may
> just crash at the wrong (or right) time with that provider to
> trigger this error. The kernel crash doesn't occur every time.
>
> Anyone have any other ideas to help isolate?
>