Hi all,

I'm new to the list, and I hope this is the correct place to post this. I am running an MPI application that uses CUDA and NVIDIA GPUs to accelerate computation. I am using MVAPICH2 to get thread-safe MPI over InfiniBand.

When MVAPICH2 is configured for TCP, or when I run the application on a single node without MPI, it runs fine. When MVAPICH2 is configured for InfiniBand, however, the application fails during GPU initialization about 80% of the time; the other 20% of the time it runs to completion.

I'm not sure what could be happening. Could one of the InfiniBand drivers somehow be interacting with the NVIDIA driver?

Thanks for any help,
Brian