[ofa-general] Re: [PATCH] libsdp: enable fallback to TCP for nonblocking sockets

Yossi Etigin yossi.openib at gmail.com
Fri Sep 5 04:04:58 PDT 2008


Thanks,
Unfortunately the signal solution does not look so good,
mainly because it creates a race with 'select', and interrupts
system calls. Looks like doing the fallback inside the signal handler
is not valid/compatible behaviour.


Amir Vadai wrote:
> Yossi Hi,
> 
> Because you need things fixed immediately I applied your "enable
> fallback to TCP..." patch.
> 
> And I will fix it ASAP, so as not to break the non-blocking semantics.
> 
> If your IO signals solution looks good I'll be happy to use it instead.
> 
> - Amir. 
> 
> On Thu, 2008-08-28 at 20:54 +0300, Yossi Etigin wrote:
>> Hi,
>>
>> I'm attempting to do this with IO signals - install a signal handler that
>> will be called when the connect fails, and it will do the fallback.
>>
>> --Yossi
>>
>> Amir Vadai wrote:
>>> Yossi Hi,
>>>
>>> I'm on vacation till Monday.
>>> I'll check when we can have the full fix - and if it is not in the near
>>> future we'll put your patch in until the full fix is prepared.
>>>
>>> - Amir
>>>
>>> -----Original Message-----
>>> From: Yossi Etigin [mailto:yossi.openib at gmail.com]
>>> Sent: Mon 8/25/2008 6:18 PM
>>> To: Amir Vadai
>>> Cc: general list; Oren Duer; Olga Shern
>>> Subject: Re: [PATCH] libsdp: enable fallback to TCP for nonblocking sockets
>>> Hi Amir,
>>>
>>> The single case in which we block connect() here (and only on SDP, which
>>> is rather fast) is the case that is currently not supported anyway. It can
>>> also be configurable.
>>> Anyway, we have a client which uses non-blocking sockets and really needs
>>> that feature. How about putting this into OFED now and writing something
>>> better later on?
>>>
>>> --Yossi
>>>
>>>
>>> Amir Vadai wrote:
>>>  > See below
>>>  >
>>>  > On Thu, 2008-08-21 at 19:49 +0300, Yossi Etigin wrote:
>>>  >> Hi Amir,
>>>  >>
>>>  >> What you are suggesting is to replace almost all socket functions, and I
>>>  >> don't think that this is good either.
>>>  > I agree - but to break the non-blocking semantics is worse.
>>>  >
>>>  >> It would be write(), send(), recv(), sendto(), recvfrom(), sendmsg(),
>>>  >> recvmsg(), and we would also need to change select() (to not return when
>>>  >> fallback happens if SDP fails), and maybe also poll(). libsdp tries to
>>>  >> avoid touching the fast path.
>>>  > I don't see another option. We could have a #ifdef to enable the user
>>>  > to choose - non blocking support or cleaner fast-path.
>>>  >> Besides, how do we know when to do fallback - can we safely assume
>>>  >> that if some socket operation fails, then it happened because
>>>  >> connect() failed?
>>>  > From a brief look at the connect man page, they say we should use select
>>>  > for writing on the socket. After select indicates writability, use
>>>  > getsockopt to determine whether connect() completed successfully or not.
>>>  >> Anyway, if I understand correctly, you suggest something like:
>>>  >>
>>>  >> int connect(fd, ...)
>>>  >> {
>>>  >>         ...
>>>  >>         set_state(fd, SDP)
>>>  >>         ...
>>>  >> }
>>>  >>
>>>  >>
>>>  >> int read(int fd, ...)
>>>  >> {
>>>  >>         int res = socket_funcs.read(shadow_fd(fd), ...);
>>>  >>         if (res < 0 && errno != EAGAIN && sock_state(fd) == SDP) {
>>>  >>                 set_state(fd, TCP);
>>>  >>                 socket_funcs.connect(fd,...);
>>>  >>                 close(shadow_fd(fd));
>>>  >>                 errno = EAGAIN;
>>>  >>         }
>>>  >>         return res;
>>>  >> }
>>>  >>
>>>  >>
>>>  > ... again, I don't like it either - but I don't think we should block
>>>  > connect when the user asks not to.
>>>  > - Amir.
>>>  >> --Yossi
>>>  >>
>>>  >> Amir Vadai wrote:
>>>  >>> Yossi Hi,
>>>  >>>
>>>  >>> I think that breaking the semantics of non-blocking sockets is a bad
>>>  >>> idea.
>>>  >>> There is a solution that won't break these semantics:
>>>  >>>
>>>  >>> 1. User app calls connect().
>>>  >>>       - libsdp tries to connect through sdp.
>>>  >>> 2. User app tries another operation on the socket (e.g. read/write)
>>>  >>>       - if sdp connection established successfully - great
>>>  >>>       - if sdp still not established - return -EAGAIN. This is the
>>>  >>>         same behaviour as if the tcp connection wasn't connected yet.
>>>  >>>       - if sdp timed out - return -EAGAIN and initiate TCP connect.
>>>  >>>       - if tcp connection established - use it
>>>  >>>       - if tcp connection timed out - return error.
>>>  >>>
>>>  >>> Maybe we could optimize it and initiate a tcp connection in parallel
>>>  >>> with the sdp connection and use it only when the sdp connect
>>>  >>> times out.
>>>  >>>
>>>  >>> I will add only the second patch (the debug print fix).
>>>  >>>
>>>  >>> - Amir
>>>  >>>
>>>  >>>
>>>  >>
>>>  >>
>>>  >
>>>
>>
> 


