[libfabric-users] 1-n connected endpoints
Dave Goodell (dgoodell)
dgoodell at cisco.com
Wed Jun 1 07:32:32 PDT 2016
On Jun 1, 2016, at 4:06 AM, Maurizio Drocco <drocco at di.unito.it> wrote:
> I am submitting another question raised from integrating libfabric into FastFlow.
> I need to set up a 1-N communication, in which a single "master" process talks to N "slave" processes in a connection-oriented fashion (the a3cube provider only supports connection-oriented endpoints for message passing).
> I found a working solution, but I am not sure it is the best one: the master process has N endpoints (created as accepted connections) and only a pair (TX/RX) of completion queues. The master posts requests to the RX queue and includes as context the index of the respective endpoint, so when it "consumes" an RX event it knows the source slave and it can re-post another RX request on the same endpoint.
This solution seems fine to me. You might hit obvious scalability issues at the master depending on the resources available in the a3cube network hardware, your system's memory, and your desired scale.
Note that you could pass a pointer to the endpoint as the context rather than an index into an array, but both have about the same result in the end.
It's also common to use the context pointer to point to an application-level RX request structure, which in turn contains a pointer to the endpoint on which it was posted. Most applications need the request pointer for other reasons besides simply reposting to the EP, such as signaling completion to a higher layer in the stack.
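As a rough illustration of the request-structure approach, here is a minimal C sketch. It does not call libfabric: `mock_ep`, `mock_cq`, `rx_request`, and `drain` are hypothetical stand-ins invented for this example, with the mock queue only imitating the way a completion entry hands back the context pointer given at post time. In real code the context would be the last argument to `fi_recv()` and would come back in the `op_context` field of the entry returned by `fi_cq_read()`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a connected endpoint (struct fid_ep in real code). */
struct mock_ep {
    int slave_id;
};

/* Application-level RX request. Its address is used as the operation context. */
struct rx_request {
    struct mock_ep *ep;   /* endpoint this receive was posted on */
    char buf[64];         /* receive buffer owned by the request */
    int completions;      /* how many times this request has completed */
};

/* Hypothetical FIFO completion queue that carries only context pointers. */
#define CQ_DEPTH 16
struct mock_cq {
    void *ctx[CQ_DEPTH];
    size_t head, tail;
};

/* Simulates the provider reporting a completion for the given context. */
static void mock_cq_push(struct mock_cq *cq, void *context)
{
    assert(cq->tail - cq->head < CQ_DEPTH);   /* queue must not be full */
    cq->ctx[cq->tail % CQ_DEPTH] = context;
    cq->tail++;
}

/* Pops the oldest completion. Returns 1 on success, 0 if the queue is empty. */
static int mock_cq_read(struct mock_cq *cq, void **context)
{
    if (cq->head == cq->tail)
        return 0;
    *context = cq->ctx[cq->head % CQ_DEPTH];
    cq->head++;
    return 1;
}

/* Drains up to max completions from the single shared RX queue. For each one,
 * the request is recovered from the context, the source slave is recorded in
 * arrival order, and the request would be reposted on the same endpoint. */
static size_t drain(struct mock_cq *cq, int *sources, size_t max)
{
    size_t n = 0;
    void *context;

    while (n < max && mock_cq_read(cq, &context)) {
        struct rx_request *req = context;
        sources[n++] = req->ep->slave_id;
        req->completions++;
        /* Real code would repost here, for example:
         * fi_recv(ep, req->buf, sizeof req->buf, NULL, FI_ADDR_UNSPEC, req); */
    }
    return n;
}
```

With one `rx_request` per accepted endpoint, the master can identify the sending slave through `req->ep` without a separate index table, and the same structure has room for whatever a higher layer needs to be notified.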
> I guess this works since completion queues are not FIFO, right?
I'm not sure what you mean by this... completion queues should definitely have FIFO ordering. Earlier completion events should not be perceived after later ones, especially for completions coming from the same EP (or TX/RX context if using shared/scalable EPs).