[ofa-general] Re: [OMPI users] openMPI over uDAPL doesn't work

Boris Bierbaum boris at lfbs.RWTH-Aachen.DE
Wed May 9 00:24:56 PDT 2007


As explained in a different thread on [ofa-general], the problem is a
combination of two things: the OpenIB-cma provider does not set the local
and remote port numbers on endpoints correctly, and Open MPI works around
this by overwriting data in the IA to store the port number, which in turn
confuses the provider.

I commented out line 197 in ompi/mca/btl/udapl/btl_udapl.c (Open MPI
1.2.1 release) and this fixes the problem. As the problem in the
provider is currently being fixed, the whole saving of the port number
in the uDAPL BTL code will be unnecessary in the future.

Steve Wise wrote:
>>> Can the UDAPL OFED wizards shed any light on the error messages that
>>> are listed below?  In particular, these seem to be worrisome:
>>>
>>>>  setup_listener Permission denied
>>>  setup_listener Address already in use
>> These failures are from rdma_cm_bind indicating the port is already 
>> bound to this IA address. How are you creating the service point?
>> dat_psp_create or dat_psp_create_any? If it is psp_create_any then you 
>> will see some failures until it gets to a free port. That is normal.
>> Just make sure your create call returns DAT_SUCCESS.
>>
> 
> Arlin, why doesn't dapl_psp_create_any() just pass a port of zero down
> and let the rdma-cma pick an available port number?
> 
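
To illustrate Steve's "port of zero" suggestion, here is a minimal sketch
using plain TCP sockets as an analogy. It is not the rdma_cm or uDAPL API,
and bind_ephemeral_port() is a hypothetical helper name. Binding to port 0
asks the kernel for a free port, and getsockname() then reports which one
was chosen, so no probing of ports that may already be in use is needed.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, not part of uDAPL or Open MPI: bind a TCP socket to
 * port 0 on the loopback address and report the port the kernel picked.
 * Returns the socket descriptor on success, -1 on failure. */
int bind_ephemeral_port(uint16_t *port_out)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);   /* 0 = let the kernel choose a free port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }

    /* Ask the kernel which port it actually assigned. */
    socklen_t len = sizeof(addr);
    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }

    *port_out = ntohs(addr.sin_port);
    return fd;
}
```

A caller would pass a pointer to a uint16_t, check for a non-negative
return value, and close() the descriptor when done. Whether the provider
can do the equivalent through rdma-cma is the question Steve raises above.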


-- 
|  _  RWTH | Boris Bierbaum
|_|_`_     | Lehrstuhl fuer Betriebssysteme
   | |_) _  | RWTH Aachen D-52056 Aachen
     |_)(_` | Tel: +49-241-80-27805
        ._) | Fax: +49-241-80-22339



