[ofa-general] Re: [PATCH RFC] sharing userspace IB objects

Dror Goldenberg gdror at dev.mellanox.co.il
Mon Jul 2 12:42:22 PDT 2007


Jeff Squyres wrote:
> On Jul 2, 2007, at 5:15 PM, Galen Shipman wrote:
>
>> While I think the SRC design makes sense I also have concerns 
>> regarding SSQ.
>> As Gleb has pointed out, if the hardware doesn't do the demux then 
>> the application has to.  It sounds like there are two proposals to 
>> deal with this hardware limitation in software (sigh).
>>
>> 1) Process A polls CQ, if WQE belongs to Process B, Process A will 
>> drop the WQE in a shared memory region that Process B  will poll. [snip]
>> 2) Process A peeks CQ, if WQE belongs to Process B, it doesn't 
>> process it [snip]
>>
>> In my opinion the demux belongs in the hardware, otherwise we end up 
>> complicating an already complicated code base to support a feature 
>> which unless I am missing something will have no benefit to real 
>> applications.
I agree about this deficiency and unfortunately I don't think we can do 
anything about it with the current generation. As I said before, I don't 
have quantitative data on how this might affect the overall 
performance of the application. If polling the CQ of the SQ is not in 
the critical performance path, the impact may end up being negligible. 
But it might just as well turn out to have some impact on performance.
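
To make option 1 from Galen's list concrete, here is a toy sketch of the
software demux. It is plain Python, not the verbs API: the deque standing
in for the shared CQ, the (owner, wr_id) tuple format, the per-process
"mailboxes" standing in for the shared memory region, and the name
demux_poll are all assumptions made for illustration only.

```python
from collections import deque


def demux_poll(shared_cq, my_rank, mailboxes, max_entries=16):
    """Toy software demux for a CQ shared between processes.

    shared_cq  -- deque of (owner, wr_id) tuples (hypothetical format)
    my_rank    -- id of the process doing the polling
    mailboxes  -- dict: rank -> deque, a stand-in for the shared memory region
    Returns the completions that belong to my_rank.
    """
    mine = []

    # First pick up anything another process already forwarded to us.
    while mailboxes[my_rank]:
        mine.append(mailboxes[my_rank].popleft())

    # Then poll the shared CQ and forward entries we do not own.
    for _ in range(max_entries):
        if not shared_cq:
            break
        owner, wr_id = shared_cq.popleft()
        if owner == my_rank:
            mine.append((owner, wr_id))
        else:
            mailboxes[owner].append((owner, wr_id))
    return mine


# Two processes (ranks 0 and 1) sharing one CQ.
shared_cq = deque([(0, 100), (1, 200), (0, 101), (1, 201)])
mailboxes = {0: deque(), 1: deque()}

completions_a = demux_poll(shared_cq, 0, mailboxes)  # process A polls first
completions_b = demux_poll(shared_cq, 1, mailboxes)  # B finds its entries forwarded
```

Note that in this sketch process B only sees its completions after A has
polled and forwarded them, which is the extra hop people are concerned
about.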
>
> I agree.  I cannot see how SSQ will be useful in Open MPI -- it makes 
> the code *much* more complicated and effectively guarantees to add 
> latency for the common case.  I don't see how to explain it better 
> than Gleb/Galen already did.
>
> If Mellanox wants to implement SSQ for other reasons, fine.  But based 
> on the explanations so far, I don't see us using it in [Open] MPI.
SSQ is intended mainly for MPI, which dominates the large/huge 
clusters. The goal is to help scalability, which may come with some 
impact on performance in some cases. I think that for now we should 
start with SRC, and thus significantly improve the scalability. 
Let us worry about SSQ a bit later.
>
> --Jeff Squyres
> Cisco Systems



