[ofiwg] [libfabric-users] libfabric hangs on QEMU/KVM virtual cluster

Wilkes, John John.Wilkes at amd.com
Tue Feb 6 15:15:27 PST 2018


MPICH 3.3b1 with the CH4 device works. My MPI programs now run to completion on my four-node QEMU/KVM cluster.

Thanks for the help!

John

-----Original Message-----
From: Bland, Wesley [mailto:wesley.bland at intel.com] 
Sent: Tuesday, February 06, 2018 7:35 AM
To: Wilkes, John <John.Wilkes at amd.com>
Cc: Hefty, Sean <sean.hefty at intel.com>; libfabric-users at lists.openfabrics.org; ofiwg at lists.openfabrics.org
Subject: Re: [libfabric-users] libfabric hangs on QEMU/KVM virtual cluster

Your device configuration string should be `--with-device=ch4:ofi`. There is no nemesis in CH4.

Also, if you're using 3.2.1, there won't be a CH4 device at all. You'll need to pick up 3.3b1.
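
For reference, here is a rough sketch of how the quoted configure line might look with the device string corrected. It only assembles and prints the command so it can be checked before running. The paths and remaining flags are copied from the quoted command and are assumptions about that setup. In particular, `--with-ofi` is carried over unchanged and should be checked against `./configure --help` in the 3.3b1 tree. The variable names are just illustrative.

```shell
#!/bin/sh
# Paths copied from the quoted command; adjust for your own install.
PREFIX=/nfs/mpich3
OFI_DIR=/nfs/libfabric

# CH4 device string is device:netmod. There is no "nemesis" component.
DEVICE=ch4:ofi

# Flags other than --with-device are carried over as-is (assumption:
# they are still accepted by the 3.3b1 configure script).
CONFIGURE_CMD="./configure --prefix=$PREFIX --enable-g=dbg,log --with-ofi=$OFI_DIR --with-device=$DEVICE LDFLAGS=\"-Wl,-rpath -Wl,$OFI_DIR/lib\""

# Print only; run the printed line from the MPICH 3.3b1 source tree.
echo "$CONFIGURE_CMD"
```

The rpath flags are quoted together here so that both `-Wl` options end up in LDFLAGS rather than being split across arguments.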

Thanks,
Wesley

> On Feb 6, 2018, at 9:26 AM, Wilkes, John <John.Wilkes at amd.com> wrote:
> 
> ./configure --prefix=/nfs/mpich3 --enable-g=dbg,log 
> --with-ofi=/nfs/libfabric --with-device=ch4:nemesis:ofi 
> LD_LIBRARY_PATH=/nfs/libfabric/lib: LDFLAGS=-Wl,-rpath 
> -Wl,/nfs/libfabric/lib
> 
> configure: error: Device ch4 is unknown
> 
> John
> 
> -----Original Message-----
> From: Hefty, Sean [mailto:sean.hefty at intel.com]
> Sent: Monday, February 05, 2018 5:04 PM
> To: Wilkes, John <John.Wilkes at amd.com>; 
> libfabric-users at lists.openfabrics.org; ofiwg at lists.openfabrics.org
> Subject: RE: libfabric hangs on QEMU/KVM virtual cluster
> 
>> I've run it with libfabric-1.5.3, which I think is the latest, and 
>> mpich-3.2. There is a mpich-3.2.1 now. I also tried OpenMPI-3.0.0, 
>> and I see there's OpenMPI-3.0.0-1 and 3.0.1rc2 available.
>> 
>> I'll grab the very latest MPICH and give it a try.
> 
> Can you also see if it's possible to use CH4 instead of CH3?



