[libfabric-users] Verbs send message slow down

Supun Kamburugamuve skamburugamuve at gmail.com
Tue Mar 21 10:38:58 PDT 2017


Thanks Sean for your quick response.

I found a workaround for the problem. Previously I was calling cq_read to
read multiple completions at once. If I read a single completion per call,
the problem goes away. Is this the expected behavior, is something wrong
with my program, or is this a bug in libfabric?

The problem was on the receiver side. Once I started doing single cq_reads
on the receive side, the transmissions started to complete.

The slowdown was pretty significant in my case.
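
One thing worth ruling out with batched reads (an assumption; nothing in
this thread confirms it as the cause) is how the return value is handled.
fi_cq_read returns the number of entries it copied out, or -FI_EAGAIN when
the queue is empty, so every returned entry has to be processed and its
receive buffer re-posted. The sketch below uses a mock queue rather than
libfabric itself: mock_cq, mock_cq_read, repost and drain_demo are
hypothetical names that only mimic that return convention.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_EAGAIN 11   /* stand-in for FI_EAGAIN, not the libfabric macro */
#define BATCH 8          /* entries requested per read (arbitrary choice) */
#define MOCK_CQ_DEPTH 32

/* Hypothetical completion entry: identifies which receive buffer finished. */
struct mock_entry { int buf_id; };

/* Hypothetical completion queue: a fixed array of pending completions. */
struct mock_cq {
    struct mock_entry pending[MOCK_CQ_DEPTH];
    size_t head, tail;
};

/* Mimics the fi_cq_read return convention: the number of entries copied
 * into out, or a negative EAGAIN-style value when nothing is pending. */
static long mock_cq_read(struct mock_cq *cq, struct mock_entry *out,
                         size_t count)
{
    size_t n = 0;
    while (n < count && cq->head < cq->tail)
        out[n++] = cq->pending[cq->head++];
    return n ? (long)n : -MOCK_EAGAIN;
}

/* Stand-in for re-posting a receive buffer (fi_recv in a real program). */
static void repost(int *reposted, int buf_id)
{
    reposted[buf_id] = 1;
}

/* Drain the queue in batches. The inner loop runs over all n returned
 * entries; handling only batch[0] would leave buffers un-reposted. */
static size_t drain(struct mock_cq *cq, int *reposted)
{
    struct mock_entry batch[BATCH];
    size_t total = 0;
    long n;

    while ((n = mock_cq_read(cq, batch, BATCH)) > 0) {
        for (long i = 0; i < n; i++)
            repost(reposted, batch[i].buf_id);
        total += (size_t)n;
    }
    return total;
}

/* Queue nbufs completions, drain them, and return how many were handled. */
static size_t drain_demo(int *reposted, size_t nbufs)
{
    struct mock_cq cq = { .head = 0, .tail = 0 };

    assert(nbufs <= MOCK_CQ_DEPTH);
    for (size_t i = 0; i < nbufs; i++)
        cq.pending[cq.tail++].buf_id = (int)i;
    return drain(&cq, reposted);
}
```

Calling drain_demo with a zeroed array of 12 flags handles all 12
completions across two reads (8, then 4) and marks every buffer as
re-posted. If a real receive loop handles fewer entries than the read
returned, the sender can stall waiting for receive buffers.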

Thanks,
Supun..



On Tue, Mar 21, 2017 at 12:50 PM, Hefty, Sean <sean.hefty at intel.com> wrote:

> > It seems that when the number of nodes increases, some of the QPs
> > randomly become slow. I noticed this with the CQs for transmitting. It
> > seems the CQs don't deliver the completion events in a reasonable time.
> > Some of them take an abnormally long time to complete. The application
> > uses flow control etc., and those aspects seem to be fine.
>
> How many nodes does it take before you see the slowdown?  Eventually the
> number of active connections will swamp the caching capabilities on the
> NIC/HCA, which will result in QP state being swapped between the card
> and memory.  You can also see slowdowns if receive-side buffers are not
> being re-posted quickly enough.
>
> Other than those guesses, we'd need more information about the
> setup/application to know if this is something in the verbs provider or
> underlying hardware/software.
>
> - Sean
>