[ofa-general] problem with mvapich2 over iwarp
Steve Wise
swise at opengridcomputing.com
Fri Jun 1 09:00:46 PDT 2007
Sundeep/Sean,
I'm helping a customer who is trying to run mvapich2 over chelsio's
rnic. They're running a simple program that does an mpi init, 1000
barriers, then a finalize. They're using ofed-1.2-rc3, mpiexec-0.82,
and mvapich2-0.9.8-p2 (not the mvapich2 from the ofed kit). Also they
aren't using mpd to start things up; they're using pmi, I guess (I'm
not sure what pmi is, but their mpiexec is run with -comm=pmi). BTW: I
can run the same program fine on my 8-node cluster using mpd and the
ofa mvapich2 code.
On their cluster a 4 node/4 process job hangs in finalize almost always.
When it hangs, one process is always stuck in rdma_destroy_id().
Here's the stack:
(gdb) bt
#0 0x0000003c7cf0ae2b in __lll_mutex_lock_wait () from
/lib64/tls/libpthread.so.0
#1 0x000000000068db20 in ?? ()
#2 0x0000000060040a0a in ?? ()
#3 0x0000003c7cf08800 in pthread_cond_destroy@@GLIBC_2.3.2 () from
/lib64/tls/libpthread.so.0
#4 0x0000002a9579a09c in ucma_destroy_kern_id (fd=0, handle=6871424) at
src/cma.c:403
#5 0x0000002a9579a163 in rdma_destroy_id (id=0x68d980) at src/cma.c:425
#6 0x0000000000423ef9 in ib_finalize_rdma_cm ()
#7 0x00000000004183f6 in MPIDI_CH3I_CM_Finalize ()
#8 0x000000000044b03b in MPIDI_CH3_Finalize ()
#9 0x000000000043169e in MPID_Finalize ()
#10 0x000000000040c3ef in PMPI_Finalize ()
#11 0x0000000000403af4 in main ()
(gdb)
I'm not sure I believe this stack trace fully, because
ucma_destroy_kern_id() doesn't call pthread_cond_destroy(). However,
rdma_destroy_id() does. So I'm thinking that ucma_destroy_kern_id() has
already returned, rdma_destroy_id() is tearing down the cm_id, and we
get stuck in pthread_cond_destroy() while destroying the cm_id's
pthread condition object.
I'm wondering if y'all have ever seen this kind of hang? I can kill the
process and it exits, so I don't think we're stuck down in the
kernel IWCM or anything.
Any thoughts?
Thanks,
Steve.