[libfabric-users] FI_PROGRESS_AUTO on GNI?

Biddiscombe, John A. john.biddiscombe at cscs.ch
Mon Mar 1 23:28:21 PST 2021


I've noticed that if I enable FI_PROGRESS_AUTO on the Cray, I get some deadlocks/timeouts in my tests that do not appear when I use FI_PROGRESS_MANUAL. Is there a reason why this might be the case?


The code is identical apart from the flag value, so: if one enables FI_PROGRESS_AUTO, does this change the behaviour of the cq_read functions in any way?


I'm assuming that polling the CQs triggers progress inside the library. Does enabling AUTO mode modify this in some way, so that deadlocks that were not previously present might appear? As far as I can tell, I only get deadlocks when I use multiple endpoints together with FI_PROGRESS_AUTO, so it might be that the internal progress is not happening on all endpoints, and that by enabling it I am somehow affecting what happens when I manually poll the endpoints (?)
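
To make that hypothesis concrete, here is a toy Python model of what I am assuming. It is not libfabric code, and every name in it (ToyEndpoint, toy_cq_read, toy_drain) is made up for illustration. The only thing it encodes is my assumption that work on an endpoint advances when that endpoint's CQ is polled, so an endpoint that is never progressed stalls the whole run.

```python
from dataclasses import dataclass


@dataclass
class ToyEndpoint:
    """Hypothetical stand-in for an endpoint and its completion queue."""
    pending: int = 0    # operations posted but not yet progressed
    completed: int = 0  # completions waiting to be reaped


def toy_cq_read(ep: ToyEndpoint) -> bool:
    """Model a manual-progress CQ read: the poll itself advances pending work.

    Returns True if a completion was reaped, False if none was available.
    """
    if ep.pending > 0:
        ep.pending -= 1
        ep.completed += 1
    if ep.completed > 0:
        ep.completed -= 1
        return True
    return False


def toy_drain(endpoints, polled, max_rounds=100) -> bool:
    """Poll only the endpoints whose indices are in `polled`, round-robin.

    Returns True once no endpoint has outstanding work, False if the
    round limit is reached first (the toy equivalent of a timeout).
    """
    for _ in range(max_rounds):
        for i in polled:
            toy_cq_read(endpoints[i])
        if all(ep.pending == 0 and ep.completed == 0 for ep in endpoints):
            return True
    return False


if __name__ == "__main__":
    every = [ToyEndpoint(pending=3), ToyEndpoint(pending=3)]
    print("poll every endpoint:", "drained" if toy_drain(every, [0, 1]) else "stalled")

    skipped = [ToyEndpoint(pending=3), ToyEndpoint(pending=3)]
    print("skip one endpoint:", "drained" if toy_drain(skipped, [0]) else "stalled")
```

In this model the run only stalls when one endpoint is left out of the polling set. Whether the GNI provider behaves anything like this under FI_PROGRESS_AUTO is exactly what I am unsure about.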


Thanks for any pointers. I don't really need FI_PROGRESS_AUTO, but wanted to see whether it made things faster or slower (on GNI it appears to be slower, in the runs where deadlocks don't happen).


JB