[ofw] MTHCA registry parameters

Tzachi Dar tzachid at mellanox.co.il
Wed Apr 11 13:20:52 PDT 2007


See below:
 
Thanks
Tzachi


________________________________

	From: ofw-bounces at lists.openfabrics.org
[mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Fab Tillier
	Sent: Wednesday, April 11, 2007 10:14 PM
	To: ofw at lists.openfabrics.org
	Cc: Jeff Baxter; Xavier Pillons
	Subject: [ofw] MTHCA registry parameters
	
	

	Hi folks,

	 

	What do the following registry parameters for MTHCA do?

	 

	HKR,"Parameters","TunePci",%REG_DWORD%,0

	- What does PCI tuning do?

	- Why is it off by default?

	- Does turning it on result in higher bandwidth or lower latency?

	- What are the risks in turning it on?

	 

	HKR,"Parameters","ProcessorAffinity",%REG_DWORD%,0

	- Is this ISR or DPC affinity, or both?

	This is for the ISR, but Windows also forces the corresponding DPC onto the same processor. I have noticed that on some benchmarks this gives a 5% improvement, but on others it costs up to 20%, so one has to use it carefully.

	- Does this apply to kernel mode only?

	Yes. It applies only to the ISR and DPC that are created by mthca.
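
	For illustration only, here is a minimal sketch of how a driver could apply such a registry value to its DPC. This is not the actual mthca code: the helper name ApplyProcessorAffinity and the interpretation of the value as a zero-based processor number are assumptions. KeSetTargetProcessorDpc is the standard WDK routine; ISR affinity would be set separately, through the ProcessorEnableMask argument of IoConnectInterrupt.

	#include <ntddk.h>

	/* Sketch only: bind the driver's completion DPC to the processor
	 * given by the "ProcessorAffinity" registry value (assumed here to
	 * be a zero-based processor number; 0 keeps the default behavior,
	 * where the DPC simply follows the ISR's processor). */
	VOID ApplyProcessorAffinity(PKDPC CompletionDpc, ULONG ProcessorAffinity)
	{
	    if (ProcessorAffinity != 0) {
	        KeSetTargetProcessorDpc(CompletionDpc, (CCHAR)ProcessorAffinity);
	    }
	}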

	 

	HKR,"Parameters","MaxDpcTimeUs",%REG_DWORD%,10000

	- What does this do?

	There have been rare cases in which a DPC was running "forever" (well, not really forever, but for 30 seconds). The effect was that the mouse stopped moving, and so did other devices. This registry key limits how long one DPC may run. If the limit is reached, we stop the DPC and queue another one at the end of the DPC queue for that processor.

	- How does it affect performance when the value is increased or lowered?

	- Is this default value tuned for performance or system responsiveness?

	Usually this has no effect at all, since in most normal scenarios DPCs finish much faster than the default value. When they don't, it means that by default, every 10 ms we stop handling the IB tasks and do other things. If those other things are "normal" things that don't take much time compared to 10 ms (such as moving the mouse), there is no problem: the system stays responsive and is also efficient. If the other things take more time, then IB suffers, but I guess one has to do those other things as well. One way to solve the problem is to use ProcessorAffinity to lock the IB work to one processor and leave the rest for the other processor. A better way is to use MSI (LH only), or to make sure that DPCs are created on all processors and not only one.

	[Here is the scenario in which we saw DPCs running for 30 seconds: the system was running an ipoib stress test, both inbound and outbound. Packets arrive from the network, so there is a DPC that keeps polling until there are no more packets on the receive side. Meanwhile there are send completions, so the send CQ creates an EQE. The DPC finds this EQE and goes on to handle the send CQ. By the time all sent packets are handled, more packets have arrived on the receive CQ, so it creates another EQE. The same DPC finds that EQE and handles it too. This goes on forever: there are packets on the send side, and while those are being handled, more packets arrive on the receive side, and so on, all handled by the same DPC. This registry variable is responsible for breaking that chain. Note that from the remote side everything looks fine, since the ipoib packets are handled correctly, but as time passes it becomes clear that other DPCs cannot wait that long.]
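
	To make that mechanism concrete, here is a minimal sketch of an EQ DPC routine with such a time budget. This is not the actual mthca code: EqDpcRoutine, PollAndHandleOneEqe and the MaxDpcTimeUs variable are placeholders; only KeQueryInterruptTime, KeInsertQueueDpc and the 100 ns interrupt-time units are standard WDK facts.

	#include <ntddk.h>

	extern ULONG MaxDpcTimeUs;                    /* value read from the registry */
	BOOLEAN PollAndHandleOneEqe(PVOID EqContext); /* placeholder: FALSE when the EQ is empty */

	VOID EqDpcRoutine(PKDPC Dpc, PVOID Context, PVOID Arg1, PVOID Arg2)
	{
	    /* KeQueryInterruptTime() counts 100 ns ticks; MaxDpcTimeUs is in
	     * microseconds, so the budget is MaxDpcTimeUs * 10 ticks. */
	    ULONGLONG deadline = KeQueryInterruptTime() + (ULONGLONG)MaxDpcTimeUs * 10;

	    UNREFERENCED_PARAMETER(Arg1);
	    UNREFERENCED_PARAMETER(Arg2);

	    while (PollAndHandleOneEqe(Context)) {
	        if (KeQueryInterruptTime() >= deadline) {
	            /* Budget exhausted: stop here and re-queue this DPC at the
	             * end of the processor's DPC queue so that other DPCs
	             * (mouse, other devices) get a chance to run. */
	            KeInsertQueueDpc(Dpc, NULL, NULL);
	            return;
	        }
	    }
	    /* EQ drained within the budget: nothing more to do. */
	}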

	 

	Thanks!

	-Fab
