[libfabric-users] [chuck at ece.cmu.edu: utility provider breaks fi_wait()]

Chuck Cranor chuck at ece.cmu.edu
Mon Sep 28 13:31:23 PDT 2020


On Thu, Sep 17, 2020 at 04:10:52PM +0000, Hefty, Sean wrote:
> From your comment below, you are running rxd over psm.

Yes, we are bringing up a re-purposed ~600 node Cray cluster here at CMU.
Each compute node has a QLogic/Intel IB HCA network card in it.
We want to run Mercury RPC using libfabric for networking.


> > Here's the problem: when an application using RXD calls fi_wait()
> > it goes directly to the util_wait_fd_run() function and blocks
> > in epoll_wait() without ever calling the underlying layer's
> > "wait" function from the "fi_ops_wait" structure (i.e. psmx_wait_wait(),
> > psmx2_wait_wait(), gnix_wait_wait(), ... are never called!).

> Util_wait_fd_run() calls the underlying provider's wait_try()
> function.  The intent is that any work the core provider needs to
> do prior to waiting should be done there.

OK, I looked at this code more.  The underlying layer's psmx_wait_trywait()
is called, so it could do work prior to waiting.  But the provider
also needs to do work _after_ the waiting is complete, and trywait/wait_try
is only called before the wait.  It seems like a piece is missing?
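
To make the gap concrete, here is a minimal, self-contained model of the
call order described above.  It is only a sketch: none of these names are
libfabric symbols, the "post_wait" hook is hypothetical (it stands for the
piece that seems to be missing), and poll() is used in place of
epoll_wait() just to keep the example portable.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical stand-in for a core provider's wait hooks.  These names
 * are illustrative only; they are not the libfabric structures. */
struct core_wait_ops {
    int  (*trywait)(void *ctx);    /* runs before blocking */
    void (*post_wait)(void *ctx);  /* hypothetical: nothing ever calls it */
};

/* Counters so a caller can see which hooks actually ran. */
struct core_state {
    int pre_calls;
    int post_calls;
};

static int core_trywait(void *ctx)
{
    ((struct core_state *)ctx)->pre_calls++;
    return 0;                      /* 0 means "ok to block" in this model */
}

static void core_post_wait(void *ctx)
{
    ((struct core_state *)ctx)->post_calls++;
}

/* Model of the utility layer's wait: call trywait, then block on the fd. */
static int util_wait_model(const struct core_wait_ops *ops, void *ctx,
                           int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ret = ops->trywait(ctx);

    if (ret)
        return ret;
    ret = poll(&pfd, 1, timeout_ms);
    /* No call back into the core provider here: this is the gap. */
    return ret;
}

/* Drive the model once using a pipe that already has a byte queued.
 * Returns the poll() result, or -1 if the pipe could not be set up. */
static int run_wait_model(struct core_state *st)
{
    const struct core_wait_ops ops = { core_trywait, core_post_wait };
    int fds[2];
    int ret;

    if (pipe(fds) != 0)
        return -1;
    if (write(fds[1], "x", 1) != 1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    ret = util_wait_model(&ops, st, fds[0], 1000);
    close(fds[0]);
    close(fds[1]);
    return ret;
}
```

Calling run_wait_model() on a zeroed struct core_state leaves pre_calls
at 1 and post_calls at 0: the core provider gets its turn before the
block but never after it.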


> AFAIK, RxD has not been used over psm, psm2, or gni providers.
> I don't believe those providers are setup to handle such layering,
> as they support RDM endpoints directly.

We ended up using RxD with PSM because PSM by itself was rejecting
FI_TAGGED, a feature that Mercury RPC wants.

Further debugging shows that PSM rejects (FI_TAGGED|FI_DIRECTED_RECV)
but allows FI_TAGGED without FI_DIRECTED_RECV.   The Mercury code
can get by without FI_DIRECTED_RECV, so we should be able to use
PSM directly without RxD and that will allow us to avoid the fi_wait()
problem.
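
The fallback can be sketched as "ask for everything, then retry without
the optional bits".  The model below is self-contained and does not call
libfabric: the DEMO_* values are hypothetical stand-ins for FI_TAGGED and
FI_DIRECTED_RECV (the real constants come from <rdma/fabric.h>), and
demo_provider_accepts() is an assumed, simplified view of how a provider
might check requested capabilities.  A real application would do the
retry by calling fi_getinfo() again with different hints.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in bit values, NOT the real libfabric constants. */
#define DEMO_TAGGED        (1ULL << 0)
#define DEMO_DIRECTED_RECV (1ULL << 1)

/* Model provider: accept a request only if every requested bit is
 * in the supported set. */
static int demo_provider_accepts(uint64_t supported, uint64_t requested)
{
    return (requested & ~supported) == 0;
}

/* Try required|optional first, then required alone.
 * Returns the capability set that was accepted, or 0 if none was. */
static uint64_t demo_negotiate_caps(uint64_t supported, uint64_t required,
                                    uint64_t optional)
{
    if (demo_provider_accepts(supported, required | optional))
        return required | optional;
    if (demo_provider_accepts(supported, required))
        return required;
    return 0;
}
```

With a model provider that supports only DEMO_TAGGED, the first attempt
is rejected and the second one succeeds, which matches what we see with
PSM once FI_DIRECTED_RECV is dropped from the request.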


But there is one other problem: PSM rejects setting the domain threading
attribute to FI_THREAD_SAFE, even though the documentation in
man/fi_domain.3.md says

	"All providers are required to support FI_THREAD_SAFE"

Is this an oversight in the PSM code?  I added FI_THREAD_SAFE to the
threading check in psmx_getinfo() in psmx_init.c and it seems to work.
Is the following patch ok, or are there more threading issues I should
be aware of?


diff --git a/prov/psm/src/psmx_init.c b/prov/psm/src/psmx_init.c
index c0a93f44a..6b1263cef 100644
--- a/prov/psm/src/psmx_init.c
+++ b/prov/psm/src/psmx_init.c
@@ -453,6 +453,7 @@ static int psmx_getinfo(uint32_t version, const char *node, const char *service,
 			case FI_THREAD_ENDPOINT:
 			case FI_THREAD_COMPLETION:
 			case FI_THREAD_DOMAIN:
+			case FI_THREAD_SAFE:
 				threading = hints->domain_attr->threading;
 				break;
 			default:



chuck

