[ofw] Access Violation while running ND test over winverb

Uri Habusha urih at mellanox.co.il
Mon Nov 8 13:03:14 PST 2010


This has happened to us a few times in the last week, so I guess it is reproducible. I'll try to dig in and let you know if I find something.

Uri

-----Original Message-----
From: Hefty, Sean [mailto:sean.hefty at intel.com] 
Sent: Monday, November 08, 2010 10:00 PM
To: Uri Habusha; ofw at lists.openfabrics.org; Smith, Stan
Subject: RE: Access Violation while running ND test over winverb

> Following is some information I took from the debugger. Hopefully it will
> help.
> 
> Please let me know if a mini dump would help.
> 
> Uri
> 
> WvDeviceEventHandler()
> 
> 4: kd> ??pEvent
> struct _ib_event_rec * 0xfffffa60`01c5baf0
>    +0x000 context          : 0xfffffa80`16f41170
>    +0x000 context_padding  : 0xfffffa80`16f41170
>    +0x008 type             : 16 ( IB_AE_PORT_DOWN )
>    +0x010 vendor_specific  : 0x20206f49
>    +0x018 port_number      : 0x2 ''

I've examined the kernel winverbs code, and I don't see anything that looks like it would be a problem.  Everything looks like it would be initialized okay.  The crash is occurring basically as soon as winverbs tries to dereference the event content that was passed in, but the mlx4 driver looks okay to me as well.  Is this easily repeatable for you?

In WvDeviceEventHandler(), can you dump the WV_DEVICE *dev variable?  The port_number in the event looks okay.  Maybe something corrupted the dev->pPorts array?  (I'm completely guessing at this point.)  I guess you could also dump dev->pPorts[0] through dev->pPorts[dev->PortCount - 1], but I'm not sure that will help.  This is the code in winverbs where the crash occurs:

static void WvDeviceEventHandler(ib_event_rec_t *pEvent)
{
	WV_DEVICE	*dev;
	...
	dev = CONTAINING_RECORD(pEvent->context, WV_DEVICE, EventHandler);
	...
		WvDeviceCompleteRequests(&dev->pPorts[pEvent->port_number - 1],
								 STATUS_SUCCESS, event);
	...
}

static void WvDeviceCompleteRequests(WV_PORT *pPort, NTSTATUS ReqStatus, UINT32 Event)
{
	...
	WdfObjectAcquireLock(pPort->Queue);

	^^^ The crash is here

I'm guessing that dev->PortCount is probably 2, which means I have no idea what the problem may be.  And I'm not sure if there's anything else we can get out of the debugger that might be useful.

Btw, the libibverbs test app ibv_asyncwatch should exercise this same code path, and that app works fine for me when generating port up/down events across multiple ports.

- Sean


