[libfabric-users] Detecting errors/flow control with FI_SELECTIVE_COMPLETION

D'Alessandro, Luke K ldalessa at iu.edu
Mon Aug 24 12:38:28 PDT 2020


Hi all,

I’m trying to use libfabric to get some baseline performance numbers for some research that we’re doing.

The functionality that I need is simply to transfer (address, value) pairs via remote completion in an all-to-all setting.

I have sequential ranks, and have co-opted RDM/sockets/fi_inject_writedata with 0-length messages to do this. It implements the required semantics correctly.

The problem is that I can’t figure out how to implement flow control in this setting. I expect to hit resource limits at both the local and remote endpoints, and I’m able to slow down and/or buffer on the TX side if need be.

I am using a TX CQ (bound with FI_SELECTIVE_COMPLETION), an RX CQ, and a TX counter. I definitely see <warn> messages if I spam TX or don’t complete RX fast enough, but I can’t seem to detect any errors/failures on the TX side that would let me slow down (neither the CQ nor the counter ever reports an error).

> libfabric:290798:sockets:cq:_sock_cq_write():181<warn> Not enough space in CQ
> rank 0 posting 4279 to 0
> rank [0] fi error: Resource temporarily unavailable
> libfabric:290798:sockets:ep_data:sock_rx_new_buffered_entry():109<warn> Exceeded buffered recv limit
> [portland:290798] *** Process received signal ***
> libfabric:290798:sockets:ep_data:sock_pe_new_tx_entry():2251<warn> Invalid operation type
> libfabric:290798:sockets:ep_data:sock_pe_progress_tx_ctx():2546<warn> failed to progress TX ctx
> [portland:290798] Signal: Aborted (6)
> [portland:290798] Signal code:  (-6)
> libfabric:290798:sockets:ep_data:sock_pe_progress_thread():2650<warn> failed to progress TX


Any ideas on how I can detect the conditions behind these <warn>s eagerly at the user level?

I can easily set up a hard limit on the number of outstanding TX operations (computed via the TX counter), but I don’t know how to find out what the right number would be. Also, given the all-to-all nature of the communication, I can maintain per-peer accounting of remote RX resources if that’s something I need to do manually.
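
To make that accounting concrete, here is a minimal sketch of the bookkeeping, not a tested libfabric program. All names (flow_ctl, can_post, note_post, note_ack, NPEERS) are hypothetical, and the two limits are placeholders, since the right values are exactly the open question. The `completed` argument is assumed to be the TX completion count, e.g. the value fi_cntr_read() returns for the TX counter. How a peer reports consumed receives back to the sender is also an assumption and is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NPEERS 4 /* hypothetical number of ranks */

/* Hypothetical flow-control state for one sender. */
struct flow_ctl {
    uint64_t tx_window;     /* max outstanding TX operations overall (placeholder) */
    uint64_t rx_credits;    /* max unacknowledged writes per peer (placeholder) */
    uint64_t posted;        /* total TX operations posted so far */
    uint64_t sent[NPEERS];  /* writes posted to each peer */
    uint64_t acked[NPEERS]; /* writes each peer has reported as consumed */
};

/* Returns true if one more write to `peer` fits both limits.
 * `completed` is the TX completion count read by the caller
 * (assumed to come from the TX counter). */
static bool can_post(const struct flow_ctl *fc, int peer, uint64_t completed)
{
    assert(peer >= 0 && peer < NPEERS);
    uint64_t outstanding = fc->posted - completed;          /* local TX pressure */
    uint64_t in_flight = fc->sent[peer] - fc->acked[peer];  /* remote RX pressure */
    return outstanding < fc->tx_window && in_flight < fc->rx_credits;
}

/* Record that one write to `peer` was posted successfully. */
static void note_post(struct flow_ctl *fc, int peer)
{
    assert(peer >= 0 && peer < NPEERS);
    fc->posted++;
    fc->sent[peer]++;
}

/* Record that `peer` reported consuming `n` more writes. */
static void note_ack(struct flow_ctl *fc, int peer, uint64_t n)
{
    assert(peer >= 0 && peer < NPEERS);
    fc->acked[peer] += n;
}
```

The intended use would be to call can_post() before each fi_inject_writedata, call note_post() only when the post succeeds, and otherwise buffer the pair locally and drain the CQs before retrying.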

Thanks,
Luke

