[openib-general] mvapich2 pmi scalability problems
Matthew Koop
koop at cse.ohio-state.edu
Fri Jul 21 11:51:31 PDT 2006
Don,
Are you using the USE_MPD_RING flag when compiling? If not, can you give
that a try? It should significantly decrease the number of PMI calls
that are made.
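A rough build sketch (not from the thread) of how the flag might be passed; the exact mechanism depends on the 0.9.3 build scripts shipped with your tarball, and the install prefix here is hypothetical, so check your version's build documentation for the real variable name:

```shell
# Sketch only: assumes USE_MPD_RING can be injected as a preprocessor
# define through CFLAGS when configuring MVAPICH2 0.9.3.
export CFLAGS="$CFLAGS -DUSE_MPD_RING"
./configure --prefix=/usr/local/mvapich2   # hypothetical install prefix
make && make install
```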
Thanks,
Matthew Koop
On Fri, 21 Jul 2006 Don.Dhondt at Bull.com wrote:
> We have been working with LLNL trying to debug a problem using slurm as
> our resource manager,
> mvapich2 as our MPI choice and OFED 1.0 as our infiniband stack. The
> mvapich2 version is mvapich2-0.9.3.
> The problem arises when we try to scale a simple MPI job. We cannot go
> much above 128 tasks
> before we start timing out socket connections on the PMI exchanges.
> Can anyone at OSU comment?
>
> Processes  PMI_KVS_Put  PMI_KVS_Get  PMI_KVS_Commit  Num Procs ratio  Calls ratio
> n32        1024         1248         1024            1                1
> n64        4096         4544         4096            2                4
> n96        9216         9888         9216            3                8
> n128       16384        17280        16384           4                16
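The counts in the table grow quadratically with the task count: for these four data points the Put and Commit totals are exactly n², and the Get totals are n² plus a small linear term, so total PMI traffic is O(n²). A minimal sketch (Python, not part of the original thread) that checks this against the measured values:

```python
# Sketch (not from the original thread): verify that the measured PMI
# call counts grow quadratically with the number of tasks n.
measurements = {
    # tasks: (PMI_KVS_Put, PMI_KVS_Get, PMI_KVS_Commit)
    32:  (1024,  1248,  1024),
    64:  (4096,  4544,  4096),
    96:  (9216,  9888,  9216),
    128: (16384, 17280, 16384),
}

for n, (puts, gets, commits) in measurements.items():
    assert puts == n * n        # total Puts = n^2 (n Puts per task)
    assert commits == n * n     # Commits match the Puts exactly
    extra = gets - n * n        # Gets carry a small extra linear term
    print(f"n={n:4d}  puts={puts:6d}  gets=n^2+{extra:4d}  "
          f"PMI calls/task={(puts + gets + commits) / n:.0f}")
```

The per-task call count itself grows linearly with job size, which is consistent with the doubling "Calls ratio" column and explains why socket timeouts appear as the job scales.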
>
> Comment from LLNL:
> ------------------
> That is interesting! The ratio for MPICH2 is constant, so clearly
> MVAPICH2 is doing something unusual (and unexpected, to me anyway).
>
> What will MVAPICH2 do with really large parallel jobs? We regularly
> run jobs with thousands to tens of thousands of tasks. If you have
> located an MVAPICH2 expert, this would definitely be worth asking
> about. Its use of PMI appears to be non-scalable.
> ----------------------
>
> Any help is appreciated.
>
> Regards,
> Don Dhondt
> GCOS 8 Communications Solutions Project Manager
> Bull HN Information Systems Inc.
> 13430 N. Black Canyon Hwy., Phoenix, AZ 85029
> Work (602) 862-5245 Fax (602) 862-4290