[openib-general] <NOOB> initial setup problems

Mark Salisbury msalisbury at interactivesupercomputing.com
Wed Feb 21 09:08:05 PST 2007


I am trying to set up OFED 1.1 on Mellanox hardware using Intel MPI.

When I run an MPI hello-world equivalent, it gets most of the way 
through startup and then bombs out.

I am unable to find any information about "unexpected DAPL event 4008".

Here is the output of an example run:
running mpdallexit on raki1
LAUNCHED mpd on raki1  via
RUNNING: mpd on raki1
LAUNCHED mpd on raki2  via  raki1
LAUNCHED mpd on raki4  via  raki1
RUNNING: mpd on raki4
RUNNING: mpd on raki2
I_MPI: [0] check_one_device(): attributes for device:
I_MPI: [0] check_one_device(): NEEDS_LDAT                                MAYBE
I_MPI: [0] check_one_device(): HAS_COLLECTIVES                           (null)
I_MPI: [0] check_one_device(): I_MPI_LIBRARY_VERSION                     3.0
I_MPI: [0] check_one_device(): I_MPI_VERSION_DATE_OF_BUILD               Fri Sep 15 14:32:24 MSD 2006
I_MPI: [0] check_one_device(): I_MPI_VERSION_PKGNAME_UNTARRED            mpi_src.32.svsmpi004.20060915
I_MPI: [0] check_one_device(): I_MPI_VERSION_MY_CMD_NAME_CVS_ID          ./BUILD_MPI.sh version: BUILD_MPI.sh,v 1.62 2006/09/15 08:43:15 Exp $
I_MPI: [0] check_one_device(): I_MPI_VERSION_MY_CMD_LINE                 ./BUILD_MPI.sh -pkg_name mpi_src.32.svsmpi004.20060915.tar.gz -explode -explode_dirname mpi2.32e.svsmpi020.20060915 -all -copyout -noinstall
I_MPI: [0] check_one_device(): I_MPI_VERSION_MACHINENAME                 svsmpi020
I_MPI: [0] check_one_device(): I_MPI_DEVICE_VERSION                      3.0.20060915
I_MPI: [0] check_one_device(): I_MPI_GCC_VERSION                         3.4.4 20050721 (Red Hat 3.4.4-2)
I_MPI: [0] set_up_devices(): I_MPI_DAPL_PROVIDER    = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_HOST_SUFFIX = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_HOST        = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_IP_ADDR     = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_PORT        = NULL
I_MPI: [0] check_one_device(): attributes for device:
I_MPI: [0] check_one_device(): NEEDS_LDAT                                MAYBE
I_MPI: [0] check_one_device(): HAS_COLLECTIVES                           (null)
I_MPI: [0] check_one_device(): I_MPI_LIBRARY_VERSION                     3.0
I_MPI: [0] check_one_device(): I_MPI_VERSION_DATE_OF_BUILD               Fri Sep 15 14:32:24 MSD 2006
I_MPI: [0] check_one_device(): I_MPI_VERSION_PKGNAME_UNTARRED            mpi_src.32.svsmpi004.20060915
I_MPI: [0] check_one_device(): I_MPI_VERSION_MY_CMD_NAME_CVS_ID          ./BUILD_MPI.sh version: BUILD_MPI.sh,v 1.62 2006/09/15 08:43:15 Exp $
I_MPI: [0] check_one_device(): I_MPI_VERSION_MY_CMD_LINE                 ./BUILD_MPI.sh -pkg_name mpi_src.32.svsmpi004.20060915.tar.gz -explode -explode_dirname mpi2.32e.svsmpi020.20060915 -all -copyout -noinstall
I_MPI: [0] check_one_device(): I_MPI_VERSION_MACHINENAME                 svsmpi020
I_MPI: [0] check_one_device(): I_MPI_DEVICE_VERSION                      3.0.20060915
I_MPI: [0] check_one_device(): I_MPI_GCC_VERSION                         3.4.4 20050721 (Red Hat 3.4.4-2)
I_MPI: [0] set_up_devices(): I_MPI_DAPL_PROVIDER    = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_HOST_SUFFIX = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_HOST        = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_IP_ADDR     = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_PORT        = NULL
I_MPI: [0] check_one_device(): attributes for device:
I_MPI: [0] check_one_device(): NEEDS_LDAT                                MAYBE
I_MPI: [0] check_one_device(): HAS_COLLECTIVES                           (null)
I_MPI: [0] check_one_device(): I_MPI_LIBRARY_VERSION                     3.0
I_MPI: [0] check_one_device(): I_MPI_VERSION_DATE_OF_BUILD               Fri Sep 15 14:32:24 MSD 2006
I_MPI: [0] check_one_device(): I_MPI_VERSION_PKGNAME_UNTARRED            mpi_src.32.svsmpi004.20060915
I_MPI: [0] check_one_device(): I_MPI_VERSION_MY_CMD_NAME_CVS_ID          ./BUILD_MPI.sh version: BUILD_MPI.sh,v 1.62 2006/09/15 08:43:15 Exp $
I_MPI: [0] check_one_device(): I_MPI_VERSION_MY_CMD_LINE                 ./BUILD_MPI.sh -pkg_name mpi_src.32.svsmpi004.20060915.tar.gz -explode -explode_dirname mpi2.32e.svsmpi020.20060915 -all -copyout -noinstall
I_MPI: [0] check_one_device(): I_MPI_VERSION_MACHINENAME                 svsmpi020
I_MPI: [0] check_one_device(): I_MPI_DEVICE_VERSION                      3.0.20060915
I_MPI: [0] check_one_device(): I_MPI_GCC_VERSION                         3.4.4 20050721 (Red Hat 3.4.4-2)
I_MPI: [0] set_up_devices(): I_MPI_DAPL_PROVIDER    = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_HOST_SUFFIX = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_HOST        = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_IP_ADDR     = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_PORT        = NULL
I_MPI: [0] I_MPI_dlopen_dat(): trying to dlopen default -ldat: libdat.so
I_MPI: [2] I_MPI_dlopen_dat(): trying to dlopen default -ldat: libdat.so
I_MPI: [0] my_dlopen(): trying to dlopen: libdat.so
I_MPI: [2] my_dlopen(): trying to dlopen: libdat.so
I_MPI: [1] I_MPI_dlopen_dat(): trying to dlopen default -ldat: libdat.so
I_MPI: [1] my_dlopen(): trying to dlopen: libdat.so
I_MPI: [0] MPIDI_CH3I_RDMA_init(): will use DAPL provider from registry: OpenIB-cma
I_MPI: [1] MPIDI_CH3I_RDMA_init(): will use DAPL provider from registry: OpenIB-cma
I_MPI: [2] MPIDI_CH3I_RDMA_init(): will use DAPL provider from registry: OpenIB-cma
I_MPI: [0] MPIDI_CH3_Init(): will use rdma configuration
I_MPI: [0] MPI_Init: The process (pid=17898) started on raki1
Greetings from process 17898(0)
I_MPI: [1] MPIDI_CH3_Init(): will use rdma configuration
I_MPI: [1] MPI_Init: The process (pid=16216) started on raki2
I_MPI: [2] MPIDI_CH3_Init(): will use rdma configuration
I_MPI: [2] MPI_Init: The process (pid=16330) started on raki4
[2:raki4] unexpected DAPL event 4008 from 0:raki1
[1:raki2] unexpected DAPL event 4008 from 0:raki1
rank 2 in job 1  raki1_37392   caused collective abort of all ranks
  exit status of rank 2: return code 254
rank 1 in job 1  raki1_37392   caused collective abort of all ranks
  exit status of rank 1: return code 254
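
For sifting through output like the above, here is a minimal sketch (not part of Intel MPI or OFED) that groups the "unexpected DAPL event" lines by event code and reporting peer. The function name summarize_dapl_events and the regular expression are my own assumptions, based only on the two message lines shown in this run; other versions may format the message differently.

```python
import re
from collections import defaultdict

# Assumed line shape, taken from the run above:
#   [2:raki4] unexpected DAPL event 4008 from 0:raki1
EVENT_RE = re.compile(
    r"\[(?P<rank>\d+):(?P<host>[^\]]+)\] unexpected DAPL event "
    r"(?P<event>[0-9A-Fa-f]+) from (?P<peer_rank>\d+):(?P<peer_host>\S+)"
)

def summarize_dapl_events(log_text):
    """Map (event code, peer) to the list of rank:host pairs that reported it."""
    summary = defaultdict(list)
    for line in log_text.splitlines():
        m = EVENT_RE.search(line)
        if m:
            key = (m["event"], f'{m["peer_rank"]}:{m["peer_host"]}')
            summary[key].append(f'{m["rank"]}:{m["host"]}')
    return dict(summary)

# The two event lines from the run above, used as sample input.
sample = """\
[2:raki4] unexpected DAPL event 4008 from 0:raki1
[1:raki2] unexpected DAPL event 4008 from 0:raki1
"""

for (event, peer), reporters in summarize_dapl_events(sample).items():
    print(f"event {event} from {peer} reported by {', '.join(reporters)}")
```

On this run it shows that both non-zero ranks report the same event, and both report it as coming from rank 0 on raki1.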