[ofw] RE: Bugfix for IPOIB failure

Tzachi Dar tzachid at mellanox.co.il
Sun Jun 29 00:19:49 PDT 2008


Hi Fab,
 
Please note that the general problem here is harder. When running in a
DPC, there are two main problems: 1) other DPCs won't run; 2) user-mode
threads that were running when the DPC started will not get time to
execute.
 
The first problem blocks the computer from responding to other hardware
events. For example, this can mean that the mouse DPC does not get its
time to run, or that a different NIC stalls. This problem can be
solved by exiting your DPC and queuing another DPC.
 
The second problem is somewhat harder to understand. We have seen it
happen with a TCP benchmark that didn't reach wire speed. While
analyzing this issue we concluded that the application could not post
receive buffers, since the thread that was posting the receives was
stopped by a DPC that was delivering more packets. Please also note
that this machine had 8 cores, so the application did have a place to
run.
 
In any case, in the coming days I'll make a patch to the mlx4 driver
that will allow the different users to create their own EQs and will
give them much better control over what they are doing.
 
I also agree that all ULPs should make sure not to stay in a DPC for a
long time. I believe that IPoIB already does so.
 
Thanks
Tzachi


________________________________

	From: Fab Tillier [mailto:ftillier at windows.microsoft.com] 
	Sent: Thursday, June 26, 2008 7:35 PM
	To: Alex Naslednikov; ofw at lists.openfabrics.org; Tzachi Dar
	Subject: RE: Bugfix for IPOIB failure
	
	

	Note that all kernel ULPs - SRP, IPoIB, VNIC, as well as the
internal IBAL services - should do something along these lines: limit
how much time they spend processing completions and requeue their DPC if
they exceed it.

	 

	The natural progression of this idea is that a ULP would provide
the DPC object to queue in response to a CQ event, rather than
callbacks.

	 

	-Fab

	 

	From: ofw-bounces at lists.openfabrics.org
[mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Alex Naslednikov
	Sent: Thursday, June 26, 2008 5:01 AM
	To: ofw at lists.openfabrics.org; Tzachi Dar
	Subject: [ofw] Bugfix for IPOIB failure


________________________________

	Hi all,

	The original problem was an IPoIB failure (link up/down) during
the operation of 'heavy' applications.

	Our investigation found that when one executes an application
with a heavy load on the HCA, the driver always has non-empty CQs, so it
keeps working within its own DPCs and does not allow other DPCs to be
performed.

	 

	The solution

	Based on an analogous solution in the mthca driver, the driver
has to allow DPCs other than its own to be performed. It is possible to
count the amount of time the driver has spent in DPC handling and then
exit, thus allowing other DPCs to run:

	 

	 

	Index: hw/mlx4/kernel/bus/net/eq.c
	===================================================================
	--- hw/mlx4/kernel/bus/net/eq.c	(revision 2634)
	+++ hw/mlx4/kernel/bus/net/eq.c	(revision 2635)
	@@ -160,6 +160,9 @@
	 	int cqn;
	 	int eqes_found = 0;
	 	int set_ci = 0;
	+	static const uint32_t cDpcMaxTime = 10000; // max time to spend in the while loop
	+
	+	uint64_t start = cl_get_time_stamp();
	 
	 	while ((eqe = next_eqe_sw(eq))) {
	 		/*
	@@ -222,6 +225,7 @@
	 		default:
	 			mlx4_warn(dev, "Unhandled event %02x(%02x) on EQ %d at index %u\n",
	 				  eqe->type, eqe->subtype, eq->eqn, eq->cons_index);
	+
	 			break;
	 		};
	@@ -244,6 +248,10 @@
	 			eq_set_ci(eq, 0);
	 			set_ci = 0;
	 		}
	+
	+		if (cl_get_time_stamp() - start > cDpcMaxTime) {
	+			break; // allow other DPCs to run as well
	+		}
	 	}
	 	eq_set_ci(eq, 1);

