[openib-general] MVAPICH viadev_post_recv issue

Sayantan Sur surs at cse.ohio-state.edu
Fri Mar 10 11:45:31 PST 2006


Ira,

Thanks for trying out MVAPICH-gen2. I have the 1.0 branch installed on
our machines (EM64T), pulled about four days ago, and I am not able to
reproduce this error. Could you please tell us the following:

1) Which platform are you using?
2) Is your copy of MVAPICH synced with the OpenIB svn?
3) Does this error come from the OSU benchmarks, or only when you run a
specific MPI application?

I am CCing this email to the mvapich-discuss mailing list so that other
members of the MVAPICH community can say whether they have seen this
error.

Could you also modify the error print to call perror (patch below), so
that we can see whether it gives any useful debug information?

Thanks,
Sayantan.

Index: viapriv.c
===================================================================
--- viapriv.c   (revision 5745)
+++ viapriv.c   (working copy)
@@ -27,6 +27,7 @@
 #include "mpid.h"
 #include "ibverbs_header.h"
 #include "viapriv.h"
+#include <errno.h>

 /* Global structure containing information about ADI2 device */
 ibv_info_t ibv_dev;
@@ -169,6 +170,7 @@

     v->grank = c->global_rank;
     if(ibv_post_recv(c->qp_hndl, &(v->desc.u.rr), &bad_wr)) {
+        perror("Recv Post error");
         error_abort_all(IBV_RETURN_ERR,
                 "Error posting recv\n");
     }
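
A note on reading the result: the ibv_post_recv man page says the call
returns 0 on success, or the value of errno on failure. The return code
can therefore be decoded directly with strerror(), which does not rely
on the global errno still being intact when perror runs. On Linux, the
22 in the report quoted below is EINVAL ("Invalid argument"). A minimal
sketch of that decoding follows; format_post_error is a hypothetical
helper for illustration, not part of MVAPICH, and it only formats the
code without calling into the verbs library.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of MVAPICH): turn a verbs-style return
 * code, where 0 means success and any other value is an errno number,
 * into a readable message stored in buf.
 * Returns the value snprintf returned.
 */
static int format_post_error(int rc, char *buf, size_t len)
{
    if (rc == 0)
        return snprintf(buf, len, "recv posted");

    /* strerror() maps the errno-style code to its text description. */
    return snprintf(buf, len, "Error posting recv: rc=%d (%s)",
                    rc, strerror(rc));
}
```

In viadev_post_recv this would mean saving the return value of
ibv_post_recv in a local variable (as Ira's change already does) and
passing strerror(rc) to the error print.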


* On Mar 1, Ira Weiny <weiny2 at llnl.gov> wrote:
> With the new 1.0 branch (pulled from svn about a week ago), I have been getting the following error from mvapich.
> 
> 
> [0] Abort: Error posting recv 22 (0x735f88 ?= 0x735804)
>  at line 176 in file viapriv.c
> [1] Abort: Error posting recv 22 (0x735f88 ?= 0x735804)
>  at line 176 in file viapriv.c
> 
> The number there is the return code from ibv_post_recv.  I don't understand where that value is coming from.  ibv_rc_pingpong is working just fine.
> 
> What am I doing wrong?
> Ira
> 
> 
> I changed the print this way:
> 
> 16:42:58 > svn diff
> Index: mvapich-gen2/mpid/ch_gen2/viapriv.c
> ===================================================================
> --- mvapich-gen2/mpid/ch_gen2/viapriv.c (revision 51)
> +++ mvapich-gen2/mpid/ch_gen2/viapriv.c (working copy)
> @@ -166,11 +166,14 @@
>  void viadev_post_recv(ibv_connection_t * c, vbuf * v)
>  {   
>      struct ibv_recv_wr *bad_wr;
> +    int    rc = 0;
> 
>      v->grank = c->global_rank;
> -    if(ibv_post_recv(c->qp_hndl, &(v->desc.u.rr), &bad_wr)) {
> +    if((rc = ibv_post_recv(c->qp_hndl, &(v->desc.u.rr), &bad_wr)) != 0) {
>          error_abort_all(IBV_RETURN_ERR,
> -                "Error posting recv\n");
> +                "Error posting recv %d (%p ?= %p)\n",
> +                rc,
> +                &(v->desc.u.rr), bad_wr);
>      }
>  }
> 

-- 
http://www.cse.ohio-state.edu/~surs


