[libfabric-users] FI_PROGRESS_AUTO on GNI?
Biddiscombe, John A.
john.biddiscombe at cscs.ch
Mon Mar 1 23:28:21 PST 2021
I've noticed that if I enable FI_PROGRESS_AUTO on the Cray, I get some deadlocks/timeouts in my tests that do not appear when I use FI_PROGRESS_MANUAL. Is there a reason why this might be the case?
The code is identical apart from the flag value, so if one enables FI_PROGRESS_AUTO, does this change the behaviour of calling the cq_read functions in any way?
I'm assuming that when one polls the CQs, this triggers progress inside the library. Does enabling AUTO mode modify this in some way, so that deadlocks that were not previously present might appear? As far as I can tell, I only get deadlocks when I use multiple endpoints together with FI_PROGRESS_AUTO, so it might be that the internal progress is not happening on all endpoints, and that by enabling it I am somehow affecting what happens when I manually poll the endpoints (?)
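To make that assumption concrete, here is a toy Python model of it. This is not libfabric code and says nothing about how the GNI provider actually behaves: ToyEndpoint, post, cq_read and poll_all are hypothetical names invented for illustration. It only sketches the idea that, under manual progress, pending work on an endpoint moves forward when that endpoint's CQ is polled, so an endpoint that is never polled never completes anything.

```python
from collections import deque


class ToyEndpoint:
    """Hypothetical stand-in for one endpoint plus its completion queue."""

    def __init__(self, name):
        self.name = name
        self.pending = deque()      # operations posted but not yet progressed
        self.completions = deque()  # completions ready to be read

    def post(self, op):
        """Queue an operation; nothing happens until the CQ is polled."""
        self.pending.append(op)

    def cq_read(self):
        """Assumed manual-progress behaviour: polling drives pending work."""
        if self.pending:
            self.completions.append(self.pending.popleft())
        return self.completions.popleft() if self.completions else None


def poll_all(endpoints, max_rounds=100):
    """Round-robin over the given endpoints until a full round yields nothing."""
    done = []
    for _ in range(max_rounds):
        progressed = False
        for ep in endpoints:
            completion = ep.cq_read()
            if completion is not None:
                done.append((ep.name, completion))
                progressed = True
        if not progressed:
            break
    return done


if __name__ == "__main__":
    eps = [ToyEndpoint("ep0"), ToyEndpoint("ep1")]
    eps[0].post("send")
    eps[1].post("recv")
    print(poll_all(eps))
```

In this model, polling only a subset of the endpoints leaves the work on the others stuck in pending, which is the kind of stall I suspect but have not confirmed.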
Thanks for any pointers. I don't really need FI_PROGRESS_AUTO, but wanted to see if it made things faster or slower (on GNI it appears to be slower, in the runs where deadlocks don't happen).