[ofiwg] RFC on error handling in fi_getinfo call

Coulter, Susan K skc at lanl.gov
Thu Jan 15 15:34:42 PST 2015


On Jan 15, 2015, at 4:02 PM, "Mccormick, Patrick M" <patrick.m.mccormick at intel.com> wrote:

> The big con to 2) that Sean brought up in earlier discussions is that:
> 
> The sockets provider supports everything, so any error from another provider will result in the application silently choosing the sockets provider and getting poor performance.
> 

Thank you.  And I guess the operative word here is "silently".
Is that the bit that would require an API change?  (Not being silent about the sockets choice?)
If so, this seems like a choice between okra and cauliflower.  Blech.
Curious to hear others' opinions ...
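
For anyone who wants the trade-off in concrete form, here is a toy model of the two choices quoted below.  It is only a sketch: `query_all`, `ProviderError`, the `skip_failures` flag and the two fake providers are hypothetical names made up for illustration, not the libfabric API.  ENODATA is treated as "nothing to offer" in both modes, as in Sean's description.

```python
# Toy model only: every name here is hypothetical, not the libfabric API.
import errno


class ProviderError(Exception):
    """A provider failed while being queried."""

    def __init__(self, provider, err):
        super().__init__(f"{provider}: {errno.errorcode.get(err, err)}")
        self.provider = provider
        self.err = err


def query_all(providers, skip_failures):
    """Gather info entries from (name, query) pairs.

    skip_failures=False models choice 1: the first unexpected error
    aborts the whole call.
    skip_failures=True models choice 2: failing providers are skipped.
    Returns (infos, skipped), where skipped lists (name, errno) pairs.
    """
    infos, skipped = [], []
    for name, query in providers:
        try:
            infos.extend(query())
        except ProviderError as e:
            if e.err == errno.ENODATA:
                continue  # not a failure: provider has nothing to offer
            if not skip_failures:
                raise  # choice 1: surface the error to the caller
            skipped.append((name, e.err))  # choice 2: remember and move on
    return infos, skipped


def broken_fast():
    # Stand-in for a fast provider that hits an unexpected error.
    raise ProviderError("fast", errno.EIO)


def sockets():
    # Stand-in for the sockets provider, which can always offer something.
    return ["sockets:endpoint"]


PROVIDERS = [("fast", broken_fast), ("sockets", sockets)]
```

With `skip_failures=True` the caller gets only the sockets entry, which is the fallback Patrick describes.  The `skipped` list is the "not silent" part; whether the real call could report something like it without an API change is the question I am asking above.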

> Patrick
> 
> -----Original Message-----
> From: ofiwg-bounces at lists.openfabrics.org [mailto:ofiwg-bounces at lists.openfabrics.org] On Behalf Of Coulter, Susan K
> Sent: Thursday, January 15, 2015 2:26 PM
> To: Hefty, Sean
> Cc: ofiwg at lists.openfabrics.org
> Subject: Re: [ofiwg] RFC on error handling in fi_getinfo call
> 
> 
> On Jan 15, 2015, at 2:49 PM, "Hefty, Sean" <sean.hefty at intel.com>
> wrote:
> 
>> OFI has an fi_getinfo call, which is similar to rdma_getaddrinfo and getaddrinfo.  It's used to query which endpoints are supported by the underlying providers.  There's been discussion on github threads on how the call should behave in the presence of errors.  Without changing the API, there are 2 basic choices.
>> 
>> 1. If any provider fails unexpectedly (i.e. any error other than ENODATA), the entire call fails.  The error is returned to the application.
>> 
>> 2. If a provider fails, the failure is skipped.  Any attributes gathered from other providers are returned.
>> 
>> There are pros/cons to both, and wider community feedback is needed.
> 
> I would prefer #2 (admitting that I am not fully aware of all the pros/cons); otherwise, one provider's bug can bring down the whole shebang.
> 
>> 
>> - Sean
>> _______________________________________________
>> ofiwg mailing list
>> ofiwg at lists.openfabrics.org
>> http://lists.openfabrics.org/mailman/listinfo/ofiwg
> 
> ====================================
> 
> Susan Coulter
> HPC-3 Network/Infrastructure
> 505-667-8425
> Increase the Peace...
> An eye for an eye leaves the whole world blind
> ====================================
> 

====================================

Susan Coulter
HPC-3 Network/Infrastructure
505-667-8425
Increase the Peace...
An eye for an eye leaves the whole world blind
====================================



