[Users] Communication (Send/Recv) error according to message size(65535)

Jens Domke jens.domke at riken.jp
Sun Jan 3 22:35:01 PST 2021


Hello Kihang,

I have seen similar errors when there are potential routing deadlocks in
the topology.

Could you provide more details regarding:

a) the topology (graph type: Clos, dragonfly, torus, etc.) and its size
(tree levels, number of switches, or whatever else applies)

b) the routing engine specified in the (M)OFED config file for the
subnet manager
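
For (b), here is a minimal sketch of reading the routing engine out of an
OpenSM configuration file. It assumes the file uses plain
"option value" lines with "#" comments, that the option is named
routing_engine, and that an unset value is written as "(null)". The
helper name routing_engines and the path in the usage note are
illustrative assumptions, not something taken from this thread.

```python
def routing_engines(conf_text):
    """Return the engines listed by the routing_engine option, in order.

    Assumes "option value" lines; returns [] when the option is absent
    or set to the placeholder "(null)". The last occurrence wins.
    """
    engines = []
    for raw in conf_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] == "routing_engine":
            value = parts[1].strip()
            if value == "(null)":
                engines = []
            else:
                # Several engines may be given as a comma-separated list.
                engines = [e.strip() for e in value.split(",") if e.strip()]
    return engines
```

It could be called as routing_engines(open(path).read()), where path is
wherever the subnet manager's configuration lives on the node running it
(often /etc/opensm/opensm.conf, but that location is an assumption).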

With that information I may be able to rule out deadlocks, so that
people can focus on other parts of the stack.

Best,
  Jens


On 04/01/2021 15:08, Kihang Youn wrote:
> Hello,
> 
> I am testing the newly upgraded OFED (5.1-0.6.6) and corresponding OpenMPI (4.0.2, 4.0.4).
> 
> I get a communication error and do not know the cause. (The combination of OFED 4.6-1.0.1 and OpenMPI 4.0.2 shows no error.)
> 
> When communicating between compute nodes (inter-node), the following error occurs if the size of the send/recv messages exceeds 65535.
> This does not happen when using a single compute node.
> 
> If there is anything worth checking, we would appreciate any pointers, however trivial.
> 
> 
> Best Regards,
> Kihang
> 
> 
> 
> Part of the error message:
> 
> [pduru18:351568:0:351568] ib_mlx5_log.c:143  Transport retry count exceeded on mlx5_2:1/RoCE (synd 0x15 vend 0x81 hw_synd 0/0)
> [pduru18:351568:0:351568] ib_mlx5_log.c:143  RC QP 0x139d4 wqe[0]: RDMA_READ s-- [rva 0x2b9827e90a40 rkey 0x182ab] [va 0x2b270e05ca00 len 219136 lkey 0x3c2b]
> [pduru18:351565:0:351565] ib_mlx5_log.c:143  Transport retry count exceeded on mlx5_2:1/RoCE (synd 0x15 vend 0x81 hw_synd 0/0)
> [pduru18:351565:0:351565] ib_mlx5_log.c:143  RC QP 0x139d3 wqe[0]: RDMA_READ s-- [rva 0x2ac9d73be980 rkey 0x8b395] [va 0x2b464c51bc00 len 223232 lkey 0x5e4b]
> [pduru18:351571:0:351571] ib_mlx5_log.c:143  Transport retry count exceeded on mlx5_2:1/RoCE (synd 0x15 vend 0x81 hw_synd 0/0)
> [pduru18:351571:0:351571] ib_mlx5_log.c:143  RC QP 0x139d2 wqe[0]: RDMA_READ s-- [rva 0x2b0072dd1980 rkey 0x55fea] [va 0x2b70590d8c00 len 223232 lkey 0x715b]
> 
> Error message from the executable:
> 
> ==== backtrace (tid: 351569) ====
>   0 0x000000000004ed85 ucs_debug_print_backtrace()  ???:0
>   1 0x000000000001f9c2 uct_ib_mlx5_completion_with_err()  ???:0
>   2 0x000000000002e736 uct_rc_mlx5_iface_is_reachable()  ???:0
>   3 0x0000000000030481 uct_rc_mlx5_iface_progress()  ???:0
>   4 0x0000000000022f3a ucp_worker_progress()  ???:0
>   5 0x0000000000038574 opal_progress()  /export/home/nwp/OFED_TEST/KMALIB/src/openmpi/openmpi-4.0.4/opal/runtime/opal_progress.c:231
>   6 0x00000000000569f7 ompi_request_wait_completion()  /export/home/nwp/OFED_TEST/KMALIB/src/openmpi/openmpi-4.0.4/ompi/../ompi/request/request.h:415
>   7 0x00000000000569f7 ompi_request_default_wait()  /export/home/nwp/OFED_TEST/KMALIB/src/openmpi/openmpi-4.0.4/ompi/request/req_wait.c:42
>   8 0x0000000000084772 PMPI_Wait()  /export/home/nwp/OFED_TEST/KMALIB/src/openmpi/openmpi-4.0.4/ompi/mpi/c/profile/pwait.c:74
>   9 0x000000000005b26f ompi_wait_f()  /export/home/nwp/OFED_TEST/KMALIB/src/openmpi/openmpi-4.0.4/ompi/mpi/fortran/mpif-h/profile/pwait_f.c:76
> 10 0x00000000005b1642 swap3d_()  ???:0
> 11 0x00000000004a6eb4 hdiff_()  ???:0
> 12 0x000000000046bf81 sciproc_()  ???:0
> 13 0x0000000000462418 MAIN__()  ???:0
> 14 0x000000000040bfde main()  ???:0
> 15 0x00000000000223d5 __libc_start_main()  ???:0
> 16 0x000000000040bee9 _start()  ???:0
> 
> 
>> ucx_info -v
> # UCT version=1.9.0 revision 1d0a420
> # configured with: --build=x86_64-redhat-linux-gnu --host=x86_64-redhat-linux-gnu --program-prefix= --disable-dependency-tracking --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/libexec --localstatedir=/var --sharedstatedir=/var/lib --mandir=/usr/share/man --infodir=/usr/share/info --disable-optimizations --disable-logging --disable-debug --disable-assertions --enable-mt --disable-params-check --without-java --enable-cma --without-cuda --without-gdrcopy --with-verbs --without-cm --with-knem --with-rdmacm --without-rocm --without-xpmem --without-ugni
> 
>> cat /etc/redhat-release
> CentOS Linux release 7.6.1810 (Core)
> 
>> uname -a
> Linux boot2 3.10.0-957.el7.x86_64 #1 SMP Thu Nov 8 23:39:32 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
> 
>> ofed_info -s
> MLNX_OFED_LINUX-5.1-0.6.6.0:
> 
>> ibstat
> CA 'mlx5_0'
>          CA type: MT4123
>          Number of ports: 1
>          Firmware version: 20.28.1002
>          Hardware version: 0
>          Node GUID: 0xb8599f0300b84da6
>          System image GUID: 0xb8599f0300b84da6
>          Port 1:
>                  State: Active
>                  Physical state: LinkUp
>                  Rate: 100
>                  Base lid: 4
>                  LMC: 0
>                  SM lid: 4
>                  Capability mask: 0x2651e84a
>                  Port GUID: 0xb8599f0300b84da6
>                  Link layer: InfiniBand
> 
>> ibv_devinfo -v
> hca_id: mlx5_0
>          transport:               InfiniBand (0)
>          fw_ver:                  20.28.1002
>          node_guid:               b859:9f03:00b8:4da6
>          sys_image_guid:          b859:9f03:00b8:4da6
>          vendor_id:               0x02c9
>          vendor_part_id:          4123
>          hw_ver:                  0x0
>          board_id:                LNV0000000016
>          phys_port_cnt:           1
>          max_mr_size:             0xffffffffffffffff
>          page_size_cap:           0xfffffffffffff000
>          max_qp:                  262144
>          max_qp_wr:               32768
>          device_cap_flags:        0xe97e1c36
>                                   BAD_PKEY_CNTR
>                                   BAD_QKEY_CNTR
>                                   AUTO_PATH_MIG
>                                   CHANGE_PHY_PORT
>                                   PORT_ACTIVE_EVENT
>                                   SYS_IMAGE_GUID
>                                   RC_RNR_NAK_GEN
>                                   MEM_WINDOW
>                                   UD_IP_CSUM
>                                   XRC
>                                   MEM_MGT_EXTENSIONS
>                                   MEM_WINDOW_TYPE_2B
>                                   MANAGED_FLOW_STEERING
>                                   Unknown flags: 0xC8480000
>          max_sge:                 30
>          max_sge_rd:              30
>          max_cq:                  16777216
>          max_cqe:                 4194303
>          max_mr:                  16777216
>          max_pd:                  16777216
>          max_qp_rd_atom:          16
>          max_ee_rd_atom:          0
>          max_res_rd_atom:         4194304
>          max_qp_init_rd_atom:     16
>          max_ee_init_rd_atom:     0
>          atomic_cap:              ATOMIC_HCA (1)
>          max_ee:                  0
>          max_rdd:                 0
>          max_mw:                  16777216
>          max_raw_ipv6_qp:         0
>          max_raw_ethy_qp:         0
>          max_mcast_grp:           2097152
>          max_mcast_qp_attach:     240
>          max_total_mcast_qp_attach:      503316480
>          max_ah:                  2147483647
>          max_fmr:                 0
>          max_srq:                 8388608
>          max_srq_wr:              32767
>          max_srq_sge:             31
>          max_pkeys:               128
>          local_ca_ack_delay:      16
>          general_odp_caps:
>                                   ODP_SUPPORT
>                                   ODP_SUPPORT_IMPLICIT
>          rc_odp_caps:
>                                   SUPPORT_SEND
>                                   SUPPORT_RECV
>                                   SUPPORT_WRITE
>                                   SUPPORT_READ
>                                   SUPPORT_SRQ
>          uc_odp_caps:
>                                   NO SUPPORT
>          ud_odp_caps:
>                                   SUPPORT_SEND
>          xrc_odp_caps:
>                                   SUPPORT_SEND
>                                   SUPPORT_WRITE
>                                   SUPPORT_READ
>                                   SUPPORT_SRQ
>          completion timestamp_mask:                      0x7fffffffffffffff
>          hca_core_clock:          156250kHZ
>          device_cap_flags_ex:     0x30000051E97E1C36
>                                   PCI_WRITE_END_PADDING
>                                   Unknown flags: 0x3000004100000000
>          tso_caps:
>          max_tso:                 0
>          rss_caps:
>                  max_rwq_indirection_tables:                     0
>                  max_rwq_indirection_table_size:                 0
>                  rx_hash_function:                               0x0
>                  rx_hash_fields_mask:                            0x0
>          max_wq_type_rq:          0
>          packet_pacing_caps:
>                  qp_rate_limit_min:      0kbps
>                  qp_rate_limit_max:      0kbps
>          max_rndv_hdr_size:       64
>          max_num_tags:            127
>          max_ops:                 32768
>          max_sge:                 1
>          flags:
>                                   IBV_TM_CAP_RC
> 
>          cq moderation caps:
>                  max_cq_count:   65535
>                  max_cq_period:  4095 us
> 
>          maximum available device memory:        262144Bytes
> 
>                  port:   1
>                          state:          PORT_ACTIVE (4)
>                          max_mtu:        4096 (5)
>                          active_mtu:             4096 (5)
>                          sm_lid:         4
>                          port_lid:               4
>                          port_lmc:               0x00
>                          link_layer:             InfiniBand
>                          max_msg_sz:             0x40000000
>                          port_cap_flags:         0x2251e84a
>                          port_cap_flags2:        0x0032
>                          max_vl_num:             4 (3)
>                          bad_pkey_cntr:          0x0
>                          qkey_viol_cntr:         0x0
>                          sm_sl:          0
>                          pkey_tbl_len:           128
>                          gid_tbl_len:            8
>                          subnet_timeout:         18
>                          init_type_reply:        0
>                          active_width:           2X (16)
>                          active_speed:           50.0 Gbps (64)
>                          phys_state:             LINK_UP (5)
>                          GID[  0]:               fe80:0000:0000:0000:b859:9f03:00b8:4da6
> 
>> ompi_info
>                   Package: Open MPI root at boot2 Distribution
>                  Open MPI: 4.0.2
>    Open MPI repo revision: v4.0.2
>     Open MPI release date: Oct 07, 2019
>                  Open RTE: 4.0.2
>    Open RTE repo revision: v4.0.2
>     Open RTE release date: Oct 07, 2019
>                      OPAL: 4.0.2
>        OPAL repo revision: v4.0.2
>         OPAL release date: Oct 07, 2019
>                   MPI API: 3.1.0
>              Ident string: 4.0.2
>                    Prefix: /d1/home/nwp/OFED_TEST/KMALIB/apps/openmpi/4.0.2_ofed
>   Configured architecture: x86_64-unknown-linux-gnu
>            Configure host: boot2
>             Configured by: root
>             Configured on: Wed Dec 16 06:21:49 UTC 2020
>            Configure host: boot2
>    Configure command line: 'CC=icc' 'CFLAGS=-m64' 'FC=ifort' 'FCFLAGS=-m64'
>                            '--prefix=/d1/home/nwp/OFED_TEST/KMALIB/apps/openmpi/4.0.2_ofed'
>                            '--with-platform=mellanox/optimized'
>                            '--with-mxm=/opt/mellanox/mxm'
>                            '--with-knem=/opt/knem-1.1.4.90mlnx1/'
>                            '--with-zlib=/opt/kma/kma_lib/apps/zlib/1.2.11/'
>                            '--with-zlib-libdir=/opt/kma/kma_lib/apps/zlib/1.2.11/lib'
>                            '--with-lsf=/opt/ibm/lsfsuite/dcomp/lsf/10.1'
>                            '--with-lsf-libdir=/opt/ibm/lsfsuite/lsf/10.1/linux2.6-glibc2.3-x86_64/lib/'
>                  Built by: root
>                  Built on: Wed Dec 16 06:29:41 UTC 2020
>                Built host: boot2
>                C bindings: yes
>              C++ bindings: no
>               Fort mpif.h: yes (all)
>              Fort use mpi: yes (full: ignore TKR)
>         Fort use mpi size: deprecated-ompi-info-value
>          Fort use mpi_f08: yes
>   Fort mpi_f08 compliance: The mpi_f08 module is available, but due to
>                            limitations in the ifort compiler and/or Open MPI,
>                            does not support the following: array subsections,
>                            direct passthru (where possible) to underlying Open
>                            MPI's C functionality
>    Fort mpi_f08 subarrays: no
>             Java bindings: no
>    Wrapper compiler rpath: runpath
>                C compiler: icc
>       C compiler absolute: /d1/home/nwp/OFED_TEST/KMALIB/apps/intel/20.2_ofed/compilers_and_libraries_2020/linux/bin/intel64/icc
>    C compiler family name: INTEL
>        C compiler version: 1910.20200623
>              C++ compiler: g++
>     C++ compiler absolute: /opt/kma/kma_lib/apps/gcc/7.5.0/bin/g++
>             Fort compiler: ifort
>         Fort compiler abs: /d1/home/nwp/OFED_TEST/KMALIB/apps/intel/20.2_ofed/compilers_and_libraries_2020/linux/bin/intel64/ifort
>           Fort ignore TKR: yes (!DEC$ ATTRIBUTES NO_ARG_CHECK ::)
>     Fort 08 assumed shape: yes
>        Fort optional args: yes
>            Fort INTERFACE: yes
>      Fort ISO_FORTRAN_ENV: yes
>         Fort STORAGE_SIZE: yes
>        Fort BIND(C) (all): yes
>        Fort ISO_C_BINDING: yes
>   Fort SUBROUTINE BIND(C): yes
>         Fort TYPE,BIND(C): yes
>   Fort T,BIND(C,name="a"): yes
>              Fort PRIVATE: yes
>            Fort PROTECTED: yes
>             Fort ABSTRACT: yes
>         Fort ASYNCHRONOUS: yes
>            Fort PROCEDURE: yes
>           Fort USE...ONLY: yes
>             Fort C_FUNLOC: yes
>   Fort f08 using wrappers: yes
>           Fort MPI_SIZEOF: yes
>               C profiling: yes
>             C++ profiling: no
>     Fort mpif.h profiling: yes
>    Fort use mpi profiling: yes
>     Fort use mpi_f08 prof: yes
>            C++ exceptions: no
>            Thread support: posix (MPI_THREAD_MULTIPLE: yes, OPAL support: yes,
>                            OMPI progress: no, ORTE progress: yes, Event lib:
>                            yes)
>             Sparse Groups: no
>    Internal debug support: no
>    MPI interface warnings: yes
>       MPI parameter check: never
> Memory profiling support: no
> Memory debugging support: no
>                dl support: yes
>     Heterogeneous support: no
>   mpirun default --prefix: yes
>         MPI_WTIME support: native
>       Symbol vis. support: yes
>     Host topology support: yes
>              IPv6 support: no
>        MPI1 compatibility: no
>            MPI extensions: affinity, cuda, pcollreq
>     FT Checkpoint support: no (checkpoint thread: no)
>     C/R Enabled Debugging: no
>    MPI_MAX_PROCESSOR_NAME: 256
>      MPI_MAX_ERROR_STRING: 256
>       MPI_MAX_OBJECT_NAME: 64
>          MPI_MAX_INFO_KEY: 36
>          MPI_MAX_INFO_VAL: 256
>         MPI_MAX_PORT_NAME: 1024
>    MPI_MAX_DATAREP_STRING: 128
> 
>> ofed_info
> MLNX_OFED_LINUX-5.1-0.6.6.0 (OFED-5.1-0.6.6):
> ar_mgr:
> osm_plugins/ar_mgr/ar_mgr-1.0-0.2.MLNX20200630.g8577618.tar.gz
> 
> dpcp:
> /sw/release/sw_acceleration/dpcp/dpcp-1.0.0-1.src.rpm
> 
> dump_pr:
> osm_plugins/dump_pr//dump_pr-1.0-0.2.MLNX20200630.g8577618.tar.gz
> 
> fabric-collector:
> fabric_collector//fabric-collector-1.1.0.MLNX20170103.89bb2aa.tar.gz
> 
> hcoll:
> mlnx_ofed_hcol/hcoll-4.6.3125-1.src.rpm
> 
> ibdump:
> https://github.com/Mellanox/ibdump master
> commit 6355ebbd664cafb629edeadecd4096ac2a0304c3
> ibsim:
> mlnx_ofed_ibsim/ibsim-0.9.tar.gz
> 
> ibutils2:
> ibutils2/ibutils2-2.1.1-0.126.MLNX20200721.gf95236b.tar.gz
> 
> iser:
> mlnx_ofed/mlnx-ofa_kernel-4.0.git mlnx_ofed_5_1
> commit c72091bb7f69243219dda60946342385c9766aa3
> 
> isert:
> mlnx_ofed/mlnx-ofa_kernel-4.0.git mlnx_ofed_5_1
> commit c72091bb7f69243219dda60946342385c9766aa3
> 
> kernel-mft:
> mlnx_ofed_mft/kernel-mft-4.15.0-104.src.rpm
> 
> knem:
> knem.git mellanox-master
> commit 299ba51259c0947b71b762567bccf660513f8643
> libpka:
> mlnx_ofed_soc/libpka-1.0-1.gcc98895.src.rpm
> 
> libvma:
> vma/source_rpms/libvma-9.1.1-0.src.rpm
> 
> mlnx-dpdk:
> https://github.com/Mellanox/dpdk.org mlnx_dpdk_19.11_last_stable
> commit c8732df963abf855edf2447a0b8d8543e7924ba9
> mlnx-en:
> mlnx_ofed/mlnx-ofa_kernel-4.0.git mlnx_ofed_5_1
> commit c72091bb7f69243219dda60946342385c9766aa3
> 
> mlnx-ethtool:
> mlnx_ofed/ethtool.git mlnx_ofed_5_1
> commit a1f6f627af80b76b013b68ff57a3ae41ac7517f9
> mlnx-iproute2:
> mlnx_ofed/iproute2.git mlnx_ofed_5_1
> commit 9a007c2d912ce52ad5e3e9c6a9bc9fb4d20fd52c
> mlnx-nfsrdma:
> mlnx_ofed/mlnx-ofa_kernel-4.0.git mlnx_ofed_5_1
> commit c72091bb7f69243219dda60946342385c9766aa3
> 
> mlnx-nvme:
> mlnx_ofed/mlnx-ofa_kernel-4.0.git mlnx_ofed_5_1
> commit c72091bb7f69243219dda60946342385c9766aa3
> 
> mlnx-ofa_kernel:
> mlnx_ofed/mlnx-ofa_kernel-4.0.git mlnx_ofed_5_1
> commit c72091bb7f69243219dda60946342385c9766aa3
> 
> mlxbf-bootctl:
> https://github.com/Mellanox/mlxbf-bootctl bluefield-rel/3.0
> commit fda69b62ac4f2707a82da18f894b40120f686010
> mpi-selector:
> ofed-1.5.3-rpms/mpi-selector/mpi-selector-1.0.3-1.src.rpm
> 
> mpitests:
> mlnx_ofed_mpitest/mpitests-3.2.20-5d20b49.src.rpm
> 
> mstflint:
> mlnx_ofed_mstflint/mstflint-4.14.0-3.tar.gz
> 
> multiperf:
> mlnx_ofed_multiperf/multiperf-3.0-0.14.g5f0fd0e.tar.gz
> 
> mxm:
> /sw/release/mlnx_ofed/IBHPC/MLNX_OFED_LINUX-5.0-0.3.7/SRPMS/mxm-3.7.3112-1.50037.src.rpm
> 
> ofed-docs:
> docs.git mlnx_ofed-4.0
> commit 3d1b0afb7bc190ae5f362223043f76b2b45971cc
> 
> openmpi:
> mlnx_ofed_ompi_1.8/openmpi-4.0.4rc3-1.src.rpm
> 
> opensm:
> mlnx_ofed_opensm/opensm-5.7.0.MLNX20200721.7ccc6f6.tar.gz
> 
> openvswitch:
> openvswitch.git mlnx_ofed_5_1
> commit e8a86012636e058cfd48486c39afa8cbac9ed597
> perftest:
> mlnx_ofed_perftest/perftest-4.4-0.30.g9c50960.tar.gz
> 
> rdma-core:
> mlnx_ofed/rdma-core.git mlnx_ofed_5_1
> commit 77e7f704897a3bf94464d3c12ec508f1e26336fd
> 
> rshim:
> https://github.com/Mellanox/rshim-user-space master
> commit a70d84655d6e248141124bce1805f2c9b0426fe9
> 
> sharp:
> mlnx_ofed_sharp/sharp-2.2.0.MLNX20200721.2fd570a.tar.gz
> 
> sockperf:
> sockperf/sockperf-3.7-0.gita1e8e835a689.src.rpm
> 
> srp:
> mlnx_ofed/mlnx-ofa_kernel-4.0.git mlnx_ofed_5_1
> commit c72091bb7f69243219dda60946342385c9766aa3
> 
> ucx:
> mlnx_ofed_ucx/ucx-1.9.0-1.src.rpm
> 
> 
> 
> 
> 
> 
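
The error lines quoted above all name mlx5_2:1/RoCE, while the ibstat and
ibv_devinfo output in the same message covers only mlx5_0 with an
InfiniBand link layer. The following is a small sketch for pulling the
device, port, link layer, opcode and length out of such log lines, so
that one can see which device the failing transfers went through. The
regular expressions are written against the lines quoted in this thread
only and are an assumption about the log format, not a UCX interface.
The function name summarize is hypothetical.

```python
import re

# Matches e.g. "Transport retry count exceeded on mlx5_2:1/RoCE (...)".
RETRY_RE = re.compile(
    r"Transport retry count exceeded on (?P<dev>\w+):(?P<port>\d+)/(?P<link>\w+)"
)
# Matches e.g. "wqe[0]: RDMA_READ s-- [rva ...] [va ... len 219136 lkey ...]".
WQE_RE = re.compile(r"wqe\[\d+\]: (?P<op>\w+) .*? len (?P<len>\d+) ")


def summarize(log_text):
    """Return (devices, ops) found in log_text.

    devices: set of (device, port, link layer) tuples from retry errors.
    ops:     list of (opcode, length) pairs from the failed work requests.
    """
    devices = set()
    ops = []
    for line in log_text.splitlines():
        m = RETRY_RE.search(line)
        if m:
            devices.add((m["dev"], int(m["port"]), m["link"]))
        m = WQE_RE.search(line)
        if m:
            ops.append((m["op"], int(m["len"])))
    return devices, ops


# Two lines copied from the error message quoted in this thread.
SAMPLE = """\
[pduru18:351568:0:351568] ib_mlx5_log.c:143  Transport retry count exceeded on mlx5_2:1/RoCE (synd 0x15 vend 0x81 hw_synd 0/0)
[pduru18:351568:0:351568] ib_mlx5_log.c:143  RC QP 0x139d4 wqe[0]: RDMA_READ s-- [rva 0x2b9827e90a40 rkey 0x182ab] [va 0x2b270e05ca00 len 219136 lkey 0x3c2b]
"""
```

Calling summarize(SAMPLE) gives the device tuples and the failed
operations separately. If the reported device differs from the one that
was meant to carry MPI traffic, it may be worth checking which devices
UCX is allowed to use (for example via the UCX_NET_DEVICES environment
variable) before looking further into routing.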

