[openib-general] <NOOB> initial setup problems
Mark Salisbury
msalisbury at interactivesupercomputing.com
Wed Feb 21 09:08:05 PST 2007
I am trying to set up OFED 1.1 on Mellanox hardware using Intel MPI.
When I run an MPI hello world equivalent, it gets most of the way
through startup and then bombs out.
I have been unable to find any information about "unexpected DAPL event 4008".
Here is the output of an example run:
running mpdallexit on raki1
LAUNCHED mpd on raki1 via
RUNNING: mpd on raki1
LAUNCHED mpd on raki2 via raki1
LAUNCHED mpd on raki4 via raki1
RUNNING: mpd on raki4
RUNNING: mpd on raki2
I_MPI: [0] check_one_device(): attributes for device:
I_MPI: [0] check_one_device(): NEEDS_LDAT MAYBE
I_MPI: [0] check_one_device(): HAS_COLLECTIVES (null)
I_MPI: [0] check_one_device(): I_MPI_LIBRARY_VERSION 3.0
I_MPI: [0] check_one_device(): I_MPI_VERSION_DATE_OF_BUILD Fri Sep 15 14:32:24 MSD 2006
I_MPI: [0] check_one_device(): I_MPI_VERSION_PKGNAME_UNTARRED mpi_src.32.svsmpi004.20060915
I_MPI: [0] check_one_device(): I_MPI_VERSION_MY_CMD_NAME_CVS_ID ./BUILD_MPI.sh version: BUILD_MPI.sh,v 1.62 2006/09/15 08:43:15 Exp $
I_MPI: [0] check_one_device(): I_MPI_VERSION_MY_CMD_LINE ./BUILD_MPI.sh -pkg_name mpi_src.32.svsmpi004.20060915.tar.gz -explode -explode_dirname mpi2.32e.svsmpi020.20060915 -all -copyout -noinstall
I_MPI: [0] check_one_device(): I_MPI_VERSION_MACHINENAME svsmpi020
I_MPI: [0] check_one_device(): I_MPI_DEVICE_VERSION 3.0.20060915
I_MPI: [0] check_one_device(): I_MPI_GCC_VERSION 3.4.4 20050721 (Red Hat 3.4.4-2)
I_MPI: [0] set_up_devices(): I_MPI_DAPL_PROVIDER = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_HOST_SUFFIX = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_HOST = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_IP_ADDR = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_PORT = NULL
I_MPI: [0] check_one_device(): attributes for device:
I_MPI: [0] check_one_device(): NEEDS_LDAT MAYBE
I_MPI: [0] check_one_device(): HAS_COLLECTIVES (null)
I_MPI: [0] check_one_device(): I_MPI_LIBRARY_VERSION 3.0
I_MPI: [0] check_one_device(): I_MPI_VERSION_DATE_OF_BUILD Fri Sep 15 14:32:24 MSD 2006
I_MPI: [0] check_one_device(): I_MPI_VERSION_PKGNAME_UNTARRED mpi_src.32.svsmpi004.20060915
I_MPI: [0] check_one_device(): I_MPI_VERSION_MY_CMD_NAME_CVS_ID ./BUILD_MPI.sh version: BUILD_MPI.sh,v 1.62 2006/09/15 08:43:15 Exp $
I_MPI: [0] check_one_device(): I_MPI_VERSION_MY_CMD_LINE ./BUILD_MPI.sh -pkg_name mpi_src.32.svsmpi004.20060915.tar.gz -explode -explode_dirname mpi2.32e.svsmpi020.20060915 -all -copyout -noinstall
I_MPI: [0] check_one_device(): I_MPI_VERSION_MACHINENAME svsmpi020
I_MPI: [0] check_one_device(): I_MPI_DEVICE_VERSION 3.0.20060915
I_MPI: [0] check_one_device(): I_MPI_GCC_VERSION 3.4.4 20050721 (Red Hat 3.4.4-2)
I_MPI: [0] set_up_devices(): I_MPI_DAPL_PROVIDER = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_HOST_SUFFIX = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_HOST = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_IP_ADDR = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_PORT = NULL
I_MPI: [0] check_one_device(): attributes for device:
I_MPI: [0] check_one_device(): NEEDS_LDAT MAYBE
I_MPI: [0] check_one_device(): HAS_COLLECTIVES (null)
I_MPI: [0] check_one_device(): I_MPI_LIBRARY_VERSION 3.0
I_MPI: [0] check_one_device(): I_MPI_VERSION_DATE_OF_BUILD Fri Sep 15 14:32:24 MSD 2006
I_MPI: [0] check_one_device(): I_MPI_VERSION_PKGNAME_UNTARRED mpi_src.32.svsmpi004.20060915
I_MPI: [0] check_one_device(): I_MPI_VERSION_MY_CMD_NAME_CVS_ID ./BUILD_MPI.sh version: BUILD_MPI.sh,v 1.62 2006/09/15 08:43:15 Exp $
I_MPI: [0] check_one_device(): I_MPI_VERSION_MY_CMD_LINE ./BUILD_MPI.sh -pkg_name mpi_src.32.svsmpi004.20060915.tar.gz -explode -explode_dirname mpi2.32e.svsmpi020.20060915 -all -copyout -noinstall
I_MPI: [0] check_one_device(): I_MPI_VERSION_MACHINENAME svsmpi020
I_MPI: [0] check_one_device(): I_MPI_DEVICE_VERSION 3.0.20060915
I_MPI: [0] check_one_device(): I_MPI_GCC_VERSION 3.4.4 20050721 (Red Hat 3.4.4-2)
I_MPI: [0] set_up_devices(): I_MPI_DAPL_PROVIDER = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_HOST_SUFFIX = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_HOST = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_IP_ADDR = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_PORT = NULL
I_MPI: [0] I_MPI_dlopen_dat(): trying to dlopen default -ldat: libdat.so
I_MPI: [0] my_dlopen(): trying to dlopen: libdat.so
I_MPI: [2] I_MPI_dlopen_dat(): trying to dlopen default -ldat: libdat.so
I_MPI: [2] my_dlopen(): trying to dlopen: libdat.so
I_MPI: [1] I_MPI_dlopen_dat(): trying to dlopen default -ldat: libdat.so
I_MPI: [1] my_dlopen(): trying to dlopen: libdat.so
I_MPI: [0] MPIDI_CH3I_RDMA_init(): will use DAPL provider from registry: OpenIB-cma
I_MPI: [1] MPIDI_CH3I_RDMA_init(): will use DAPL provider from registry: OpenIB-cma
I_MPI: [2] MPIDI_CH3I_RDMA_init(): will use DAPL provider from registry: OpenIB-cma
I_MPI: [0] MPIDI_CH3_Init(): will use rdma configuration
I_MPI: [0] MPI_Init: The process (pid=17898) started on raki1
Greetings from process 17898(0)
I_MPI: [1] MPIDI_CH3_Init(): will use rdma configuration
I_MPI: [1] MPI_Init: The process (pid=16216) started on raki2
I_MPI: [2] MPIDI_CH3_Init(): will use rdma configuration
I_MPI: [2] MPI_Init: The process (pid=16330) started on raki4
[2:raki4] unexpected DAPL event 4008 from 0:raki1
[1:raki2] unexpected DAPL event 4008 from 0:raki1
rank 2 in job 1 raki1_37392 caused collective abort of all ranks
exit status of rank 2: return code 254
rank 1 in job 1 raki1_37392 caused collective abort of all ranks
exit status of rank 1: return code 254
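
In case it helps anyone compare runs, here is a rough sketch of how the failing ranks could be pulled out of output like the above. It is only an illustration: the function name find_dapl_events and the regular expression are hypothetical, and the line format is assumed from the two "unexpected DAPL event" lines in this one run, not from any Intel MPI or DAPL documentation.

```python
import re

# Assumed line format, copied from the run above:
#   [2:raki4] unexpected DAPL event 4008 from 0:raki1
DAPL_EVENT_RE = re.compile(
    r"^\[(?P<rank>\d+):(?P<host>[^\]]+)\] unexpected DAPL event "
    r"(?P<event>\S+) from (?P<peer_rank>\d+):(?P<peer_host>\S+)$"
)


def find_dapl_events(log_text):
    """Return (rank, host, event, peer_rank, peer_host) for each matching line."""
    events = []
    for line in log_text.splitlines():
        match = DAPL_EVENT_RE.match(line.strip())
        if match:
            events.append((
                int(match.group("rank")),
                match.group("host"),
                match.group("event"),  # kept as text; its numeric base is not known here
                int(match.group("peer_rank")),
                match.group("peer_host"),
            ))
    return events


# Three lines taken verbatim from the run above.
sample = """\
I_MPI: [2] MPI_Init: The process (pid=16330) started on raki4
[2:raki4] unexpected DAPL event 4008 from 0:raki1
[1:raki2] unexpected DAPL event 4008 from 0:raki1
"""

for rank, host, event, peer_rank, peer_host in find_dapl_events(sample):
    print(f"rank {rank} on {host}: event {event} from rank {peer_rank} on {peer_host}")
```

On this run it reports ranks 1 and 2, both pointing at rank 0 on raki1, which is the only pattern visible in the log.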