[ofiwg] RFC on error handling in fi_getinfo call
Mccormick, Patrick M
patrick.m.mccormick at intel.com
Thu Jan 15 15:02:45 PST 2015
The big con to 2) that Sean brought up in earlier discussions is that:
The sockets provider supports everything, so any error from another provider will result in the application silently choosing the sockets provider and getting poor performance.
From: ofiwg-bounces at lists.openfabrics.org [mailto:ofiwg-bounces at lists.openfabrics.org] On Behalf Of Coulter, Susan K
Sent: Thursday, January 15, 2015 2:26 PM
To: Hefty, Sean
Cc: ofiwg at lists.openfabrics.org
Subject: Re: [ofiwg] RFC on error handling in fi_getinfo call
On Jan 15, 2015, at 2:49 PM, "Hefty, Sean" <sean.hefty at intel.com> wrote:
> OFI has an fi_getinfo call, which is similar to rdma_getaddrinfo and getaddrinfo. It's used to query which endpoints are supported by the underlying providers. There's been discussion on github threads on how the call should behave in the presence of errors. Without changing the API, there are 2 basic choices.
> 1. If any provider fails unexpectedly (i.e. any error other than ENODATA), the entire call fails. The error is returned to the application.
> 2. If a provider fails, the failure is skipped. Any attributes gathered from other providers are returned.
> There are pros/cons to both, and wider community feedback is needed.
I would prefer #2 (admitting that I am not fully aware of all the pros/cons); otherwise, one provider's bug can bring down the whole shebang.
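To make the trade-off concrete, here is a minimal sketch of the two policies in C. It is not libfabric code. The names provider, collect_info, fast_query and sockets_query are hypothetical stand-ins invented for illustration, and the only convention assumed is that a provider reports "nothing matched" as -ENODATA and any other failure as a different negative errno value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical provider query hook (not a libfabric type): returns 0 when
 * the provider can satisfy the request, -ENODATA when it has nothing to
 * offer, or another negative errno value on an unexpected failure. */
typedef int (*query_fn)(void);

struct provider {
    const char *name;
    query_fn query;
};

enum error_policy {
    POLICY_FAIL_ALL,    /* choice 1: any unexpected error fails the call */
    POLICY_SKIP_FAILED  /* choice 2: skip the failing provider */
};

/* Simulated providers, for illustration only. */
static int fast_query(void)    { return -EIO; } /* unexpected failure */
static int sockets_query(void) { return 0; }    /* always succeeds */

/* Queries every provider in order. On success, stores the names of the
 * providers that matched in 'matched' (which must have room for n entries)
 * and returns how many there are. Returns -ENODATA if none matched, or the
 * first unexpected error when the policy is POLICY_FAIL_ALL. */
static int collect_info(const struct provider *provs, size_t n,
                        enum error_policy policy, const char **matched)
{
    int found = 0;

    assert(provs != NULL && matched != NULL);

    for (size_t i = 0; i < n; i++) {
        int ret = provs[i].query();

        if (ret == 0)
            matched[found++] = provs[i].name;
        else if (ret != -ENODATA && policy == POLICY_FAIL_ALL)
            return ret; /* choice 1: report the error to the caller */
        /* otherwise: no data, or choice 2 skipping the failure */
    }

    return found > 0 ? found : -ENODATA;
}
```

With the provider list { fast, sockets }, POLICY_FAIL_ALL returns -EIO, so the application sees the failure. POLICY_SKIP_FAILED returns a single match, "sockets", and nothing in the result tells the application that the faster provider was dropped along the way.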
> - Sean
> ofiwg mailing list
> ofiwg at lists.openfabrics.org
Increase the Peace...
An eye for an eye leaves the whole world blind