[libfabric-users] 1-n connected endpoints

Maurizio Drocco drocco at di.unito.it
Mon Jun 6 14:00:36 PDT 2016


Thank you Dave,

On 01/06/16 16:32, Dave Goodell (dgoodell) wrote:
> Hi Maurizio,
>
> On Jun 1, 2016, at 4:06 AM, Maurizio Drocco <drocco at di.unito.it> wrote:
>> I am submitting another question raised from integrating libfabric into FastFlow.
>>
>> I need to set up a 1-N communication, in which a single "master" process talks to N "slave" processes in a connection-oriented fashion (the a3cube provider only supports connection-oriented endpoints for message passing).
>>
>> I found a working solution, but I am not sure it is the best one: the master process has N endpoints (created as accepted connections) and only one pair (TX/RX) of completion queues. The master posts receive requests and includes as context the index of the respective endpoint, so when it "consumes" an RX event it knows the source slave and can re-post another RX request on the same endpoint.
> This solution seems fine to me.  You might hit obvious scalability issues at the master depending on the resources available in the a3cube network hardware, your system's memory, and your desired scale.
>
> Note that you could pass a pointer to the endpoint as the context instead of an index into an array, but they both have about the same result in the end.
>
> It's also common to use the context pointer to point to an application-level RX request structure which in turn contains a pointer to the endpoint on which it was posted.  Most applications need the request pointer for other reasons besides simply reposting to the EP, such as signaling completion to a higher layer in the stack.
OK, I was not sure the context was meant to be an arbitrary application
pointer; now I get the point.
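
For my own reference, this is roughly the shape I have in mind now. It
is only a sketch under my assumptions (the N connections are already
accepted and the buffers already registered); struct rx_req, post_rx
and the field names are placeholders of mine, not FastFlow or libfabric
code.

#include <rdma/fabric.h>
#include <rdma/fi_endpoint.h>

/* One outstanding receive per slave; the request itself is the context. */
struct rx_req {
    struct fi_context ctx;  /* first field, in case the provider requires FI_CONTEXT mode */
    struct fid_ep    *ep;   /* accepted endpoint this buffer is posted on */
    void             *buf;  /* receive buffer */
    size_t            len;
    void             *desc; /* fi_mr_desc() of the registered buffer, if needed */
};

/* Post (or re-post) one receive on its own endpoint. */
static int post_rx(struct rx_req *req)
{
    /* src_addr is ignored for connected endpoints */
    ssize_t rc = fi_recv(req->ep, req->buf, req->len, req->desc, 0, req);
    return (int) rc;  /* 0 on success, -FI_EAGAIN or another fi_errno value otherwise */
}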
>> I guess this works since completion queues are not FIFO, right?
> I'm not sure what you mean by this... completion queues should definitely have FIFO ordering.  Earlier completion events should not be perceived after later ones, especially for completions coming from the same EP (or TX/RX context if using shared/scalable EPs).
You are right, I was not precise. I meant that the order in which
completions are perceived (i.e. popped from the TX/RX completion
queues) is not necessarily the same order in which the requests were
posted (i.e. pushed to the TX/RX "data" queues). This is not
surprising, since the completion and data queues should be independent;
I was just asking for confirmation :)
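
Concretely, this is what I meant on the master's RX side (again only a
sketch, reusing the rx_req/post_rx sketch above and assuming the RX
completion queue was opened with FI_CQ_FORMAT_CONTEXT): whatever order
the entries come out in, each op_context points back to its own
request, so the right slave endpoint gets re-armed.

#include <rdma/fi_domain.h>
#include <rdma/fi_errno.h>

static void drain_rx_cq(struct fid_cq *rx_cq)
{
    struct fi_cq_entry comps[16];
    ssize_t n;

    while ((n = fi_cq_read(rx_cq, comps, 16)) > 0) {
        for (ssize_t i = 0; i < n; i++) {
            struct rx_req *req = comps[i].op_context;
            /* hand the message to the upper layer here ... */
            post_rx(req);  /* re-post on the same slave's endpoint */
        }
    }
    if (n < 0 && n != -FI_EAGAIN) {
        /* a real loop would call fi_cq_readerr() here */
    }
}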

Thank you again,
Maurizio
>
> -Dave
>

-- 
Maurizio Drocco
PhD Student
University of Torino, Department of Computer Science
Via Pessinetto 12, 10149 Torino - Italy



