[libfabric-users] Address not reachable

Isaac Nuñez isaacnez at outlook.com
Fri May 15 16:46:22 PDT 2020



> On 16.05.2020 at 01:16, Hefty, Sean <sean.hefty at intel.com> wrote:
> 
> 
>> 
>> Is there a reason why an address would not be reachable? This follows the case I
>> described previously. One application runs on node x and is started by srun (from
>> another node). If I then connect to it with a single application (my test case was
>> fi_rdm_rma_simple), it connects. BUT if I connect fi_rdm_rma_simple to my process
>> running on another node, it reports error -61. Does anyone have any idea why that
>> would happen? Keep in mind that my application started by srun is just one process.
> 
> It's hard to say.  If the server wasn’t running by the time the client tried to connect, you might get this failure.  -61 is connection refused.  You can use fi_strerror() to convert errno into strings. 

The server was running. The failure only happened when the server was launched using srun. If the server was launched as ./server and the client with srun, it would connect. I tested both my application and fi_rdm_rma_simple, and they showed the same behavior.
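
On the fi_strerror() suggestion quoted above: libfabric calls return negative errno-style values, and the number assigned to a given error differs between platforms, so it is safer to print the message than to interpret -61 by hand. The sketch below is only an illustration. describe_rc() is a hypothetical helper, not a libfabric function, and it uses the standard C strerror() so that it builds without libfabric installed. In a libfabric program the equivalent call would be fi_strerror(-rc) from <rdma/fi_errno.h>, which also covers libfabric-specific codes.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of libfabric): formats an errno-style
 * return code such as -61 as "rc (message)" into buf.
 * Assumption: standard strerror() stands in for fi_strerror() here so the
 * example compiles without libfabric; swap in fi_strerror(err) when
 * <rdma/fi_errno.h> is available.
 * Returns the number of characters snprintf() would have written. */
int describe_rc(int rc, char *buf, size_t len)
{
    /* libfabric reports errors as negative values; the string lookup
     * expects the positive errno number. */
    int err = rc < 0 ? -rc : rc;
    return snprintf(buf, len, "%d (%s)", rc, strerror(err));
}
```

Calling describe_rc(-61, buf, sizeof buf) on both the server and the client node and logging buf shows what the code means on that particular machine.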

