[libfabric-users] sockets provider question

Dave Goodell (dgoodell) dgoodell at cisco.com
Wed Jul 6 10:18:06 PDT 2016


Howard,

You might want to add checks to the sockets and/or GNI providers (at least for debug builds) to ensure that EPs have been enabled before fi_getname, or any other function that requires an enabled EP, is called.  That would help us and users catch these sorts of errors more easily in the future.
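
A minimal sketch of what such a check could look like, written as a self-contained C fragment rather than as actual provider code. The names demo_ep, demo_ep_enable and demo_ep_getname are hypothetical stand-ins for a provider's endpoint state, fi_enable and fi_getname. The error codes are plain errno values chosen for illustration; a real provider would pick its own.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical endpoint state; NOT the sockets provider's real struct. */
struct demo_ep {
    int enabled;                 /* nonzero once demo_ep_enable() has run */
    struct sockaddr_in src_addr; /* only meaningful once enabled */
};

/* Stand-in for fi_enable(): fixes the source address and marks the EP usable. */
int demo_ep_enable(struct demo_ep *ep, unsigned short port)
{
    memset(&ep->src_addr, 0, sizeof(ep->src_addr));
    ep->src_addr.sin_family = AF_INET;
    ep->src_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    ep->src_addr.sin_port = htons(port);
    ep->enabled = 1;
    return 0;
}

/* Stand-in for fi_getname(): refuses to report an address before enable,
 * instead of silently handing back one with port 0. */
int demo_ep_getname(const struct demo_ep *ep, void *addr, size_t *addrlen)
{
    if (!ep->enabled)
        return -EINVAL;  /* assumption: illustrative error code only */
    if (*addrlen < sizeof(ep->src_addr)) {
        *addrlen = sizeof(ep->src_addr);  /* report the size needed */
        return -ENOSPC;  /* assumption: illustrative error code only */
    }
    memcpy(addr, &ep->src_addr, sizeof(ep->src_addr));
    *addrlen = sizeof(ep->src_addr);
    return 0;
}
```

With a guard like this, a test that asks for the endpoint name before enabling the endpoint gets an immediate error at that call, rather than an address with port 0 that only fails later at connect time.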

-Dave

> On Jul 6, 2016, at 11:44 AM, Howard Pritchard <hppritcha at gmail.com> wrote:
> 
> Hi Jithin,
> 
> Thanks for the response.  We just did a little more digging and it turns out that
> the test case wasn't calling fi_enable before calling fi_getname.  For the
> GNI provider that happens to be okay (currently).  The test now works with
> the sockets provider.
> 
> Thanks,
> 
> Howard
>   
> 
> 2016-07-06 10:29 GMT-06:00 Jose, Jithin <jithin.jose at intel.com>:
> Hi Howard,
> 
> >libfabric:sockets:av:sock_check_table_in():228<debug> AV-INSERT:dst_addr: family: 2, IP is 10.128.0.9, port: 0
> 
> From the logs, it looks like the address inserted into the AV table has port = 0. Can you double check this?
> Maybe we should add some error checking here to catch these.
> 
> I think the connect() call is failing because of this:
> 
> >libfabric:sockets:ep_ctrl:sock_ep_connect():419<debug> Connecting to:
> >10.128.0.9:0
> 
> >Are there known problems with the sockets provider and send-to-self style send/recv
> >using FI_EP_RDM?
> 
> I am not aware of any issues with send-to-self send/recv with the sockets provider.
> 
> - Jithin
> 
> -----Original Message-----
> From: Libfabric-users <libfabric-users-bounces at lists.openfabrics.org> on behalf of Howard Pritchard <hppritcha at gmail.com>
> Date: Tuesday, July 5, 2016 at 10:36 AM
> To: "libfabric-users at lists.openfabrics.org" <libfabric-users at lists.openfabrics.org>, Henry Cooney <hacoo36 at gmail.com>
> Subject: [libfabric-users] sockets provider question
> 
> >Hi Folks,
> >
> >
> >We have a simple libfabric test we are trying to get to work.
> >It uses FI_EP_RDM endpoint type.  A single endpoint is
> >created.
> >
> >
> >The test is intended to send/recv messages on this single
> >endpoint.  This test works with GNI provider, but hangs
> >with the sockets provider.
> >
> >
> >We are working from master at 33cad4b
> >
> >
> >If we turn on debug logging (FI_LOG_LEVEL=debug) we see odd things about
> >trying to use port 0, which doesn't sound good:
> >
> >
> >libfabric:sockets:av:sock_check_table_in():228<debug> AV-INSERT:dst_addr: family: 2, IP is 10.128.0.9, port: 0
> >libfabric:sockets:ep_data:sock_pe_add_tx_ctx():2435<debug> TX ctx added to PE
> >libfabric:sockets:ep_data:sock_pe_add_rx_ctx():2453<debug> RX ctx added to PE
> >libfabric:sockets:ep_ctrl:sock_conn_listen():311<debug> Binding listener thread to port: 0
> >libfabric:sockets:ep_ctrl:sock_conn_listen():337<debug> Bound to port: 56165 - 39144
> >
> >and later on, when we try to send a message:
> >
> >Test Fabric object created.
> >Attempting to send a message. You should see some output.
> >
> >
> >Sending...
> >libfabric:sockets:ep_ctrl:sock_ep_connect():419<debug> Connecting to:
> >10.128.0.9:0
> >libfabric:sockets:ep_ctrl:sock_ep_connect():421<debug> Connecting using address:10.128.0.9
> >libfabric:sockets:ep_ctrl:sock_ep_connect():443<debug> Error in connection() 111 - Connection refused - 16
> >libfabric:sockets:ep_ctrl:sock_ep_connect():445<debug> Connecting to:
> >10.128.0.9:0
> >libfabric:sockets:ep_ctrl:sock_ep_connect():447<debug> Connecting using address:10.128.0.9
> >
> >Are there known problems with the sockets provider and send-to-self style send/recv
> >using FI_EP_RDM?
> >
> >Thanks for any help,
> >Howard
> >
> >
> 
> 