[libfabric-users] Verbs send message slow down

Ilango, Arun arun.ilango at intel.com
Wed Mar 22 10:53:36 PDT 2017


Okay. I’ll try to reproduce the issue with multiple endpoints, each with its own CQ. If you have a code snippet or reproducer, that would be helpful.

Thanks,
Arun.

From: Supun Kamburugamuve [mailto:skamburugamuve at gmail.com]
Sent: Tuesday, March 21, 2017 6:58 PM
To: Ilango, Arun <arun.ilango at intel.com>
Cc: Hefty, Sean <sean.hefty at intel.com>; libfabric-users at lists.openfabrics.org
Subject: Re: [libfabric-users] Verbs send message slow down

I use a separate CQ for each endpoint.

Thanks,
Supun..

On Tue, Mar 21, 2017 at 5:03 PM, Ilango, Arun <arun.ilango at intel.com> wrote:
I see. Do you use a common CQ or a CQ per endpoint? We do have a lock in the CQ to serialize access to it.

-Arun.

From: Supun Kamburugamuve [mailto:skamburugamuve at gmail.com]
Sent: Tuesday, March 21, 2017 1:07 PM
To: Ilango, Arun <arun.ilango at intel.com>
Cc: Hefty, Sean <sean.hefty at intel.com>; libfabric-users at lists.openfabrics.org
Subject: Re: [libfabric-users] Verbs send message slow down

Thanks, Arun, I will try this. This happens with multiple connections, though. The program works when only a small number of nodes send data. For example, it works with 4 nodes. With 8 nodes, one or two QPs become slow.

Supun..

On Tue, Mar 21, 2017 at 3:26 PM, Ilango, Arun <arun.ilango at intel.com> wrote:
Hi Supun,

I tried to reproduce the issue but I'm not hitting it. I modified the fabtests bandwidth test to read multiple completions (= window_size) at a time on the receive side. The patch is on the following branch:
https://github.com/a-ilango/fabtests/tree/comp_debug

I'm using fi_cq_read. Please try out fi_msg_bw and see if you hit the issue.

Thanks,
Arun.

-----Original Message-----
From: Libfabric-users [mailto:libfabric-users-bounces at lists.openfabrics.org] On Behalf Of Ilango, Arun
Sent: Tuesday, March 21, 2017 11:07 AM
To: Hefty, Sean <sean.hefty at intel.com>; Supun Kamburugamuve <skamburugamuve at gmail.com>
Cc: libfabric-users at lists.openfabrics.org
Subject: Re: [libfabric-users] Verbs send message slow down

This might need some debugging. In the verbs provider, we call ibv_poll_cq with num_entries set to 1, and call it repeatedly when the app has requested multiple completions. I'm not sure whether this is the issue, but let me check.

-Arun.
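
To make the pattern described above concrete, here is a minimal, self-contained sketch of reading several completions by polling one entry per call. It is a toy model only, not the verbs provider's code: `mock_cq`, `mock_poll_one`, and `read_many_one_at_a_time` are hypothetical names, and the mock queue stands in for a real completion queue polled with num_entries set to 1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a completion queue: a FIFO of completion IDs. */
struct mock_cq {
    const int *entries;  /* next pending completion */
    size_t     pending;  /* number of completions still queued */
};

/* Hypothetical stand-in for a poll call limited to a single entry:
 * returns 1 and stores one completion, or 0 when the queue is empty. */
int mock_poll_one(struct mock_cq *cq, int *out)
{
    if (cq->pending == 0)
        return 0;
    *out = *cq->entries++;
    cq->pending--;
    return 1;
}

/* Read up to `count` completions by polling one entry per call.
 * Stops early when the queue runs dry; returns the number read. */
size_t read_many_one_at_a_time(struct mock_cq *cq, int *buf, size_t count)
{
    size_t n = 0;

    assert(cq != NULL && buf != NULL);
    while (n < count && mock_poll_one(cq, &buf[n]) == 1)
        n++;
    return n;
}
```

In this model a request for N completions costs up to N poll calls and may return fewer than N entries, so the caller has to look at the returned count rather than assume the buffer is full.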

-----Original Message-----
From: Libfabric-users [mailto:libfabric-users-bounces at lists.openfabrics.org] On Behalf Of Hefty, Sean
Sent: Tuesday, March 21, 2017 10:53 AM
To: Supun Kamburugamuve <skamburugamuve at gmail.com>
Cc: libfabric-users at lists.openfabrics.org
Subject: Re: [libfabric-users] Verbs send message slow down

> I found a workaround to the problem. Previously I was doing cq_read
> for multiple completions. If I do cq_read for a single completion the
> problem goes away. Is this the expected behavior or something wrong
> with my program or a bug in libfabric?

This is not expected behavior.  Something else is going goofy here.
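
For comparing the two cases from the quoted question (reading one completion per call versus several), the following sketch drains completions in batches and tolerates short reads. It is an illustration under stated assumptions, not libfabric code: `cq_read_fn` is a hypothetical callback only loosely shaped like fi_cq_read (a count in, the number of completions or a negative errno out), and `fake_cq` / `fake_read` are mock helpers that exist only so the example runs without a fabric.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>   /* ssize_t */

/* Hypothetical read callback: returns the number of completions written to
 * buf (possibly fewer than count), or a negative errno value such as
 * -EAGAIN when nothing is ready. */
typedef ssize_t (*cq_read_fn)(void *cq, int *buf, size_t count);

/* Drain up to `want` completions, asking for at most `batch` per call.
 * Returns the number drained, or a negative error other than -EAGAIN. */
ssize_t drain_completions(cq_read_fn read_fn, void *cq, int *out,
                          size_t want, size_t batch)
{
    size_t got = 0;

    assert(batch > 0);
    while (got < want) {
        size_t ask = (want - got < batch) ? want - got : batch;
        ssize_t rc = read_fn(cq, out + got, ask);

        if (rc == 0 || rc == -EAGAIN)
            break;               /* nothing ready now; caller may retry later */
        if (rc < 0)
            return rc;           /* real error: report it to the caller */
        got += (size_t)rc;       /* rc may be less than ask (a short read) */
    }
    return (ssize_t)got;
}

/* Mock queue (hypothetical): hands out consecutive IDs, at most 2 per call,
 * to emulate a read that returns fewer entries than requested. */
struct fake_cq { int next; int remaining; };

ssize_t fake_read(void *ctx, int *buf, size_t count)
{
    struct fake_cq *cq = ctx;
    size_t n = 0;

    if (cq->remaining == 0)
        return -EAGAIN;
    while (n < count && n < 2 && cq->remaining > 0) {
        buf[n++] = cq->next++;
        cq->remaining--;
    }
    return (ssize_t)n;
}
```

Calling `drain_completions` with `batch` set to 1 corresponds to the single-completion workaround, and a larger `batch` corresponds to the multi-completion read. In this model both drain the same entries, which matches the expectation that the two should behave equivalently.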
_______________________________________________
Libfabric-users mailing list
Libfabric-users at lists.openfabrics.org
http://lists.openfabrics.org/mailman/listinfo/libfabric-users
