[libfabric-users] TX/RX data structures and data processing mode

Михаил Халилов miharulidze at gmail.com
Sat Mar 17 06:50:19 PDT 2018


Hi Sean,

Thanks for the answer!

> Please look at the code in prov/util for help.  The socket code was
> designed around using it as a development tool, so I wouldn't recommend
> trying to copy its implementation.

I decided to analyze the sockets provider because it implements full
tx/rx processing at the provider level, which is what we need as well.

> If you are attempting to implement reliable-datagram semantics, then the
> use of lists may be better than a queue.  Messages may complete out of
> order when targeting different peers.

Yes, I'm working on an FI_EP_RDM implementation. I hadn't thought about
out-of-order completion before your advice, but lists fit this case really
well. Thank you!
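
To make sure I understand the advice, here is a minimal sketch of what I
have in mind: an intrusive doubly-linked list of pending sends, where an
entry can be unlinked in O(1) no matter where it sits, so a message to one
peer can complete before an older message to another peer. All names
(tx_entry, tx_list_*) are hypothetical and hand-rolled for illustration;
the real code would probably reuse the abstractions in ofi_list.h instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pending-send record; field names are illustrative only. */
struct tx_entry {
    struct tx_entry *prev, *next;  /* intrusive list links */
    int peer;                      /* destination peer id */
    int msg_id;                    /* provider-assigned message id */
};

/* Circular doubly-linked list with a sentinel head node. */
struct tx_list {
    struct tx_entry head;
};

static void tx_list_init(struct tx_list *l)
{
    l->head.prev = l->head.next = &l->head;
}

static int tx_list_empty(struct tx_list *l)
{
    return l->head.next == &l->head;
}

/* Append in posting order: O(1). */
static void tx_list_push(struct tx_list *l, struct tx_entry *e)
{
    e->prev = l->head.prev;
    e->next = &l->head;
    l->head.prev->next = e;
    l->head.prev = e;
}

/* Unlink an entry from any position: O(1). This is what a ring
 * buffer cannot do cheaply when completions arrive out of order. */
static void tx_list_remove(struct tx_entry *e)
{
    assert(e->prev != NULL && e->next != NULL);
    e->prev->next = e->next;
    e->next->prev = e->prev;
    e->prev = e->next = NULL;
}

/* Oldest pending entry for a given peer, or NULL: O(n) scan. */
static struct tx_entry *tx_list_find(struct tx_list *l, int peer)
{
    struct tx_entry *e;

    for (e = l->head.next; e != &l->head; e = e->next)
        if (e->peer == peer)
            return e;
    return NULL;
}
```

With entries for peers 1, 2 and 3 posted in that order, completing the
send to peer 2 first is just tx_list_remove() on that entry; the entries
for peers 1 and 3 stay linked in their original order.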

> Depending on your provider, you may also be able to take advantage of the
> utility providers.

I have noticed the util provider and am trying to use its functions
where possible.

> Yes, the app calling cq_read needs to be sufficient to drive progress.
> Note that this is expected by the app in the manual progress mode even if
> no completions are expected.

As far as I understand, calling cq_read is sufficient but not necessary
for the application, since in FI_PROGRESS_MANUAL it can also drive data
progress through other libfabric API functions (for example, fi_cntr_read).
Maybe I should ask this on the MPICH/OpenMPI development mailing lists,
but is it possible to run these MPI implementations over a provider that
supports manual data progress only through the fi_cq part of the API?
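
To illustrate what I mean by "sufficient but not necessary", here is a
rough sketch of how I picture manual progress inside a provider: one
progress routine that every read-style entry point calls before returning.
This is only my assumption about a reasonable structure, not how any
existing provider does it, and every name below (my_domain, my_progress,
my_cq_read, my_cntr_read, my_post_send) is made up for the example and is
not a libfabric function.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OPS 8

/* Hypothetical provider state; none of these names come from libfabric. */
struct my_domain {
    int inflight[MAX_OPS];    /* ids of operations handed to the NIC */
    size_t n_inflight;
    int completed[MAX_OPS];   /* ids waiting to be reported via the CQ */
    size_t n_completed;
    unsigned long cntr;       /* completion counter value */
};

/* Single progress engine. A real provider would poll the NIC here;
 * this sketch simply retires everything that is in flight. */
static void my_progress(struct my_domain *d)
{
    size_t i;

    for (i = 0; i < d->n_inflight; i++) {
        assert(d->n_completed < MAX_OPS);
        d->completed[d->n_completed++] = d->inflight[i];
        d->cntr++;
    }
    d->n_inflight = 0;
}

/* Queue an operation; nothing moves until someone drives progress. */
static int my_post_send(struct my_domain *d, int id)
{
    if (d->n_inflight + d->n_completed >= MAX_OPS)
        return -1;
    d->inflight[d->n_inflight++] = id;
    return 0;
}

/* CQ read: drive progress, then report up to count completions. */
static size_t my_cq_read(struct my_domain *d, int *buf, size_t count)
{
    size_t n, i;

    my_progress(d);
    n = d->n_completed < count ? d->n_completed : count;
    for (i = 0; i < n; i++)
        buf[i] = d->completed[i];
    for (i = n; i < d->n_completed; i++)
        d->completed[i - n] = d->completed[i];
    d->n_completed -= n;
    return n;
}

/* Counter read: also drives progress, so an app that never reads
 * the CQ still moves data forward. */
static unsigned long my_cntr_read(struct my_domain *d)
{
    my_progress(d);
    return d->cntr;
}
```

In this model an application that only polls the counter still makes
progress, and a later CQ read returns the completions that were retired
in the meantime. My question is about the opposite case, where only the
CQ path exists.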

BR,
Mikhail Khalilov

2018-03-16 19:50 GMT+03:00 Hefty, Sean <sean.hefty at intel.com>:

> copying ofiwg -- that mail list is better suited for your questions.
>
> > My group is working on implementing a new libfabric provider for our
> > HPC interconnect. Our current main goal is to run MPICH and OpenMPI
> > over this provider.
>
> welcome!
>
> > The problem is that this NIC doesn't have any software or hardware
> > rx/tx queues for send/recv operations. We decided to implement them at
> > the libfabric provider level, so I'm looking for a data structure for
> > storing and processing the queues.
> >
> > I took a look at the sockets provider code. As far as I understand,
> > tx_ctx stores pointers to all information (flags, data, src_address,
> > etc.) about every message to send in a ring buffer, but rx_ctx stores
> > every rx_entry in a doubly-linked list. What was the motivation for
> > using different data structures for the tx and rx queues?
>
> Please look at the code in prov/util for help.  The socket code was
> designed around using it as a development tool, so I wouldn't recommend
> trying to copy its implementation.
>
> The udp provider is a good place to start for how to construct a very
> simple software provider.  You may also want to scan the include/ofi_xxx.h
> files for helpful abstractions.  There's a slightly out of date document in
> docs/providers that describes what's available.  ofi_list.h and ofi_mem.h
> both have useful abstractions.
>
> > Maybe you can give advice on the implementation of queues or give some
> > useful information on this topic?
>
> If you are attempting to implement reliable-datagram semantics, then the
> use of lists may be better than a queue.  Messages may complete out of
> order when targeting different peers.
>
> Depending on your provider, you may also be able to take advantage of the
> utility providers.  RxM will implement reliable-datagram support over
> reliable-connections.  That is functional today.  RxD targets
> reliable-datagram over unreliable-datagram.  That is a work in progress,
> however.
>
> > The second problem is choosing a suitable progress model. For CPU
> > performance reasons I want to choose FI_PROGRESS_MANUAL as the primary
> > mode for processing asynchronous requests, but I do not quite
> > understand how an application thread provides data progress. For
> > example, is it enough to call fi_cq_read() from the MPI implementation
> > whenever it wants to make progress?
>
> Yes, the app calling cq_read needs to be sufficient to drive progress.
> Note that this is expected by the app in the manual progress mode even if
> no completions are expected.
>
> - Sean
>