[ofa-general][PATCH] Re: mlx4: Completion EQ per cpu (MP support, Patch 10)

Yevgeny Petrilin yevgenyp at mellanox.co.il
Mon May 19 08:22:52 PDT 2008


Roland Dreier wrote:
>  > > I would just like to see an approach that is fully thought through and
>  > > gives a way for applications/kernel drivers to choose a CQ vector based
>  > > on some information about what CPU it will go to.
> 
>  > Isn't the decision of which CPU an MSI-X is routed to (and hence, to
>  > which CPU an EQ is bound to) determined by userspace? (either by the irq
>  > balancer process or by manually setting /proc/irq/<vec>/smp_affinity)?
> 
> Yes, but how can anything tell which IRQ number corresponds to a given
> "CQ vector" number?  (And don't be too stuck on MSI-X, since ehca uses
> some completely different GX-bus related thing to get multiple interrupts)
> 
>  > What are we risking in making the default action to spread interrupts?
> 
> There are fairly plausible scenarios like a multi-threaded app where
> each thread creates a send CQ and a receive CQ, which should both be
> bound to the same CPU as the thread.  If we spread all CQs then it's
> impossible to get thread-locality.
> 
> I'm not saying that round-robin is necessarily a bad default policy, but
> I do think there needs to be a complete picture of how that policy can
> be overridden before we go for multiple interrupt vectors.
> 
>  - R.

Hello Roland,
We can add support for multiple interrupt vectors in two stages:
1. The low-level driver creates multiple interrupt vectors. Their names would include a
serial number from 0 to (number of CPUs - 1). The number of completion vectors can
be reported through ib_device.num_comp_vectors. Each ULP can then ask for a specific
completion vector when creating a CQ; for example, passing vector=0 when creating a CQ
will assign it to completion vector #0.

2. As the second stage, we can define a "don't care" value, meaning that the driver
can attach the CQ to any completion vector. In this case the policy need not be
round-robin. We can track the number of "clients" of each completion vector and assign the CQ
to the least busy one.
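
To make the two stages concrete, here is a rough userspace sketch of the selection
logic, not actual driver code. NUM_COMP_VECTORS stands in for ib_device.num_comp_vectors;
COMP_VECTOR_ANY, cq_count, assign_comp_vector() and release_comp_vector() are names made up
for illustration and are not existing mlx4 or ib_core symbols. Locking is left out.

```c
#define NUM_COMP_VECTORS 4     /* hypothetical; stands in for ib_device.num_comp_vectors */
#define COMP_VECTOR_ANY  (-1)  /* hypothetical "don't care" value (stage 2) */

/* Number of CQs ("clients") currently attached to each completion vector. */
static int cq_count[NUM_COMP_VECTORS];

/*
 * Pick a completion vector for a new CQ.
 *
 * Stage 1: an explicit vector in [0, NUM_COMP_VECTORS) is honoured as-is.
 * Stage 2: COMP_VECTOR_ANY selects the vector with the fewest attached CQs
 *          (ties go to the lowest-numbered vector).
 *
 * Returns the chosen vector, or -1 if the request is out of range.
 */
static int assign_comp_vector(int requested)
{
	int vec, best = 0;

	if (requested == COMP_VECTOR_ANY) {
		for (vec = 1; vec < NUM_COMP_VECTORS; vec++)
			if (cq_count[vec] < cq_count[best])
				best = vec;
	} else if (requested >= 0 && requested < NUM_COMP_VECTORS) {
		best = requested;
	} else {
		return -1;
	}

	cq_count[best]++;
	return best;
}

/* Drop the CQ's reference on its vector when the CQ is destroyed. */
static void release_comp_vector(int vec)
{
	if (vec >= 0 && vec < NUM_COMP_VECTORS && cq_count[vec] > 0)
		cq_count[vec]--;
}
```

With this, a thread that wants locality passes the same explicit vector for both its
send CQ and its receive CQ, while consumers that pass COMP_VECTOR_ANY are spread over the
least loaded vectors.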

What is your opinion on this solution?

Thanks,
Yevgeny


