[libfabric-users] Thread question about gni provider

Biddiscombe, John A. biddisco at cscs.ch
Wed Mar 15 01:43:50 PDT 2017


Apologies: while stepping through the code I found the FI_PROGRESS_AUTO flag, and I see that I can request manual progress (FI_PROGRESS_MANUAL) instead to prevent the thread creation.

I will experiment with that.
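
For anyone finding this thread later, a minimal sketch of what requesting manual progress might look like. It is untested here, and the provider name "gni" and the API version FI_VERSION(1, 4) are assumptions for illustration; use whatever matches your build. The helper name get_manual_progress_info is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup */
#include <stdio.h>
#include <string.h>
#include <rdma/fabric.h>
#include <rdma/fi_errno.h>

/* Hypothetical helper: ask for provider info with manual progress.
 * Returns NULL on failure; the caller frees the result with fi_freeinfo(). */
static struct fi_info *get_manual_progress_info(void)
{
    struct fi_info *hints = fi_allocinfo();
    struct fi_info *info = NULL;
    int ret;

    if (!hints)
        return NULL;

    /* Assumption: restrict the search to the gni provider.
     * fi_freeinfo() releases this string along with the hints. */
    hints->fabric_attr->prov_name = strdup("gni");

    /* Request that the application, not a provider thread, drives progress. */
    hints->domain_attr->control_progress = FI_PROGRESS_MANUAL;
    hints->domain_attr->data_progress = FI_PROGRESS_MANUAL;

    /* Assumed API version; pick the one you compile against. */
    ret = fi_getinfo(FI_VERSION(1, 4), NULL, NULL, 0, hints, &info);
    fi_freeinfo(hints);
    if (ret) {
        fprintf(stderr, "fi_getinfo failed: %s\n", fi_strerror(-ret));
        return NULL;
    }
    return info;
}

int main(void)
{
    struct fi_info *info = get_manual_progress_info();

    if (!info)
        return 1;

    /* Report the progress modes of the first match rather than assuming
     * the request was honoured exactly as asked. */
    printf("provider: %s\n", info->fabric_attr->prov_name);
    printf("control progress manual: %s\n",
           info->domain_attr->control_progress == FI_PROGRESS_MANUAL ? "yes" : "no");
    printf("data progress manual: %s\n",
           info->domain_attr->data_progress == FI_PROGRESS_MANUAL ? "yes" : "no");

    fi_freeinfo(info);
    return 0;
}
```

This would need to be linked against libfabric (for example with -lfabric). Note that with manual progress the application has to call into the library regularly, for instance by reading its completion queues with fi_cq_read, or outstanding operations will not advance.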

JB


On 15.03.17, 09:36, "Libfabric-users on behalf of Biddiscombe, John A." <libfabric-users-bounces at lists.openfabrics.org on behalf of biddisco at cscs.ch> wrote:

    Dear list
    
    I have a problem with thread creation during initialization – best summed up by the stack trace below.
    
    At startup I create a memory pool of registered blocks (in principle, some of these could be dropped by using newer libfabric features), but the call to fi_mr_reg results in a pthread_create. I’ve hijacked the pthread API with an HPX emulation layer, so the call creates an HPX thread instead. Because this happens at startup, the HPX runtime is not yet ready and I get an error (I’ll have to work around this). Does this thread creation occur only on the very first use of the gnix_ API?
    
    The broader question is: when does libfabric create pthreads, and how many? Is it possible to know when libfabric needs its own threads and when it will use them? I can understand the code taking a lock, but spawning threads without the user’s permission seems a trifle cheeky. Can thread creation be disabled, forcing the library to progress events only when the user explicitly requests it?
    
    Can I control which threads are used? Better still, is there a document I can read that describes this? I presume it is provider dependent. (For example, the verbs API is entirely thread safe, so can I assume that libfabric never takes locks and never spawns new threads when using verbs?)
    
    Thanks
    
    JB
    
    #9  0x00002aaaae2cf0d8 in hpx::applier::register_thread(hpx::util::unique_function<void (hpx::threads::thread_state_ex_enum), false>&&, hpx::util::thread_description const&, hpx::threads::thread_state_enum, bool, hpx::threads::thread_priority, unsigned long, hpx::threads::thread_stacksize, hpx::error_code&) (func=<unknown type in /scratch/snx3000/biddisco/build/hvtkm/lib/libhpxd.so.1, CU 0x1c3ea0f, DIE 0x1c9cf9c>, desc=..., state=hpx::threads::pending, run_now=0x1, priority=hpx::threads::thread_priority_normal, os_thread=0xffffffffffffffff, stacksize=hpx::threads::thread_stacksize_small, ec=...) at /scratch/snx3000/biddisco/src/hvtkm/hpx/src/runtime/applier/applier.cpp:107
    #10 0x00002aaaaeae2db5 in hpxc_thread_create (thread=0x2aaab5ee0068, attr=0x0, thread_function=0x2aaaaeb5c519 <__gnix_nic_prog_thread_fn>, arguments=0x2aaab5ee0000) at /scratch/snx3000/biddisco/src/hvtkm/hpx/plugins/parcelport/libfabric/hpxc/src/threads/thread.cpp:298
    #11 0x00002aaaaeb645cb in gnix_nic_alloc (domain=0x2aaab5e6e000, attr=0x0, nic_ptr=0x7fffffff2708) at /scratch/snx3000/biddisco/src/hvtkm/hpx/plugins/parcelport/libfabric/libfabric/prov/gni/src/gnix_nic.c:1261
    #12 0x00002aaaaeb372f9 in __gnix_generic_register (domain=0x2aaab5e6e000, md=0x2aaab5e64670, address=0x2aaab5ee7000, length=0x2000, dst_cq_hndl=0x0, flags=0x0, vmdh_index=0xffffffff) at /scratch/snx3000/biddisco/src/hvtkm/hpx/plugins/parcelport/libfabric/libfabric/prov/gni/src/gnix_mr.c:371
    #13 0x00002aaaaeb37ffe in __gnix_register_region (handle=0x2aaab5e64670, address=0x2aaab5ee7000, length=0x2000, fi_reg_context=0x7fffffff2b90, context=0x2aaab5e6e000) at /scratch/snx3000/biddisco/src/hvtkm/hpx/plugins/parcelport/libfabric/libfabric/prov/gni/src/gnix_mr.c:436
    #14 0x00002aaaaeb413a8 in __mr_cache_create_registration (cache=0x2aaab5e64580, address=0x2aaab5ee7000, length=0x2000, entry=0x7fffffff2a78, key=0x7fffffff2a80, fi_reg_context=0x7fffffff2b90) at /scratch/snx3000/biddisco/src/hvtkm/hpx/plugins/parcelport/libfabric/libfabric/prov/gni/src/gnix_mr_cache.c:1447
    #15 0x00002aaaaeb41fde in _gnix_mr_cache_register (cache=0x2aaab5e64580, address=0x2aaab5ee7000, length=0x2000, fi_reg_context=0x7fffffff2b90, handle=0x7fffffff2bb8) at /scratch/snx3000/biddisco/src/hvtkm/hpx/plugins/parcelport/libfabric/libfabric/prov/gni/src/gnix_mr_cache.c:1565
    #16 0x00002aaaaeb38715 in __cache_reg_mr (domain=0x2aaab5e6e000, address=0x2aaab5ee7000, length=0x2000, fi_reg_context=0x7fffffff2b90, handle=0x7fffffff2bb8) at /scratch/snx3000/biddisco/src/hvtkm/hpx/plugins/parcelport/libfabric/libfabric/prov/gni/src/gnix_mr.c:714
    #17 0x00002aaaaeb368a7 in __mr_reg (fid=0x2aaab5e6e000, buf=0x2aaab5ee7000, len=0x2000, access=0x3f00, offset=0x0, requested_key=0x0, flags=0x0, mr_o=0x2aaab5e2e8d0, context=0x0) at /scratch/snx3000/biddisco/src/hvtkm/hpx/plugins/parcelport/libfabric/libfabric/prov/gni/src/gnix_mr.c:216
    #18 0x00002aaaaeb36c88 in gnix_mr_regattr (fid=0x2aaab5e6e000, attr=0x7fffffff2cd0, flags=0x0, mr=0x2aaab5e2e8d0) at /scratch/snx3000/biddisco/src/hvtkm/hpx/plugins/parcelport/libfabric/libfabric/prov/gni/src/gnix_mr.c:298
    #19 0x00002aaaaeb36b36 in gnix_mr_reg (fid=0x2aaab5e6e000, buf=0x2aaab5ee7000, len=0x2000, access=0x3f00, offset=0x0, requested_key=0x0, flags=0x0, mr=0x2aaab5e2e8d0, context=0x0) at /scratch/snx3000/biddisco/src/hvtkm/hpx/plugins/parcelport/libfabric/libfabric/prov/gni/src/gnix_mr.c:262
    #20 0x00002aaaaebacb08 in fi_mr_reg (domain=0x2aaab5e6e000, buf=0x2aaab5ee7000, len=0x2000, access=0x3f00, offset=0x0, requested_key=0x0, flags=0x0, mr=0x2aaab5e2e8d0, context=0x0) at /scratch/snx3000/biddisco/src/hvtkm/hpx/plugins/parcelport/libfabric/libfabric/include/rdma/fi_domain.h:274
    #21 0x00002aaaaebaed4f in hpx::parcelset::policies::libfabric::libfabric_memory_region::allocate (this=0x2aaab5e2e8d0, pd=0x2aaab5e6e000, length=0x2000) at /scratch/snx3000/biddisco/src/hvtkm/hpx/plugins/parcelport/libfabric/libfabric_memory_region.hpp:97
    #22 0x00002aaaaebaf352 in hpx::parcelset::policies::libfabric::memory_region_allocator::malloc (pd=0x2aaab5e6e000, bytes=0x2000) at /scratch/snx3000/biddisco/src/hvtkm/hpx/plugins/parcelport/libfabric/rdma_memory_pool.hpp:119
    #23 0x00002aaaaebb7f2e in hpx::parcelset::policies::libfabric::pool_container<hpx::parcelset::policies::libfabric::memory_region_allocator, hpx::parcelset::policies::libfabric::pool_tiny, 1024ul, 8ul>::allocate_pool (this=0x2aaab5ebf818, num_chunks=0x8) at /scratch/snx3000/biddisco/src/hvtkm/hpx/plugins/parcelport/libfabric/rdma_memory_pool.hpp:154
    #24 0x00002aaaaebaf670 in hpx::parcelset::policies::libfabric::rdma_memory_pool::rdma_memory_pool (this=0x2aaab5ebf810, pd=0x2aaab5e6e000) at /scratch/snx3000/biddisco/src/hvtkm/hpx/plugins/parcelport/libfabric/rdma_memory_pool.hpp:285
    
    
    
    
    
    


