[libfabric-users] connection-less send/recv with verbs

Maurizio Drocco drocco at di.unito.it
Fri Jul 14 02:24:03 PDT 2017


> My guess is that the crash is coming from this line:
> 
> request->context->internal[0] = (void *)request;
> 
> Since context is null.
> 
> The verbs RDM code requires the use of FI_CONTEXT.  So all context parameters must point to a struct fi_context.  The provider should have indicated this through the fi_info->mode bits, and the app should have set it.  I don't know if this is actually the case.

Indeed, this was causing the crash: I now pass a context to every send/recv call and the crash is gone.

Now the problem is that, with the following code:
========================================
ret = fi_send(ep, tx_buf, 32, NULL, to, ctx);
assert(!ret);
while(true) {
	ret = fi_cq_read(cq, &comp, 1);
	/*…*/
}
========================================
no completion event is ever found in the queue (fi_cq_read always returns -FI_EAGAIN).
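
To make the symptom easier to report, the polling loop could be bounded so that a missing completion shows up as a timeout instead of an endless spin. The following is only a sketch under stated assumptions: cq_read_fn, toy_cq, toy_cq_read and wait_one_completion are hypothetical stand-ins and not libfabric API. In the real code the callback would wrap fi_cq_read, and a negative return other than -FI_EAGAIN (for example -FI_EAVAIL) would normally be followed up with fi_cq_readerr.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for a completion-queue read such as fi_cq_read:
 * returns the number of completions read (> 0), -EAGAIN when the queue is
 * empty, or another negative errno value on failure. */
typedef ssize_t (*cq_read_fn)(void *cq, void *buf, size_t count);

/* Toy queue used only for illustration: it reports "empty" for
 * empty_reads polls and then yields one completion per read. */
struct toy_cq {
    unsigned long empty_reads;
};

static ssize_t toy_cq_read(void *cq, void *buf, size_t count)
{
    struct toy_cq *q = cq;
    (void)buf;
    (void)count;
    if (q->empty_reads > 0) {
        q->empty_reads--;
        return -EAGAIN;
    }
    return 1;
}

/* Poll for one completion, giving up after max_polls reads.
 * Returns 1 on completion, -ETIMEDOUT if the queue stayed empty,
 * or the negative error code reported by the read function. */
static ssize_t wait_one_completion(cq_read_fn read_cq, void *cq, void *buf,
                                   unsigned long max_polls)
{
    for (unsigned long i = 0; i < max_polls; i++) {
        ssize_t ret = read_cq(cq, buf, 1);
        if (ret > 0)
            return 1;               /* a completion was written to buf */
        if (ret < 0 && ret != -EAGAIN)
            return ret;             /* real error: report it, do not spin */
        /* queue still empty: poll again */
    }
    return -ETIMEDOUT;              /* no completion within max_polls reads */
}
```

With a bound like this, the behaviour described above would surface as -ETIMEDOUT, which separates "the completion never arrives" from "the read itself fails".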

With gdb, I noticed that the functions used by fi_send and fi_cq_read are, respectively:
- fi_ibv_rdm_send
- fi_ibv_rdm_tagged_cq_read

Do you think there can be some other issue related to modes/capabilities?

Thank you again.

Maurizio

---
Maurizio Drocco
PhD Candidate
University of Torino, department of Computer Science
Via Pessinetto 12, 10149 Torino - Italy

