[ofa-general] Re: [PATCH] libsdp: enable fallback to TCP for nonblocking sockets
Yossi Etigin
yossi.openib at gmail.com
Fri Sep 5 04:04:58 PDT 2008
Thanks.
Unfortunately, the signal solution does not look so good,
mainly because it creates a race with 'select' and interrupts
system calls. It looks like doing the fallback inside the signal handler
is not valid/compatible behaviour.
Amir Vadai wrote:
> Yossi Hi,
>
> Because you need things fixed immediately I applied your "enable
> fallback to TCP..." patch.
>
> I will fix it ASAP so that it does not break the non-blocking semantics.
>
> If your IO signals solution looks good I'll be happy to use it instead.
>
> - Amir.
>
> On Thu, 2008-08-28 at 20:54 +0300, Yossi Etigin wrote:
>> Hi,
>>
>> I'm attempting to do this with IO signals - install a signal handler that
>> will be called when the connect fails, and it will do the fallback.
>>
>> --Yossi
>>
>> Amir Vadai wrote:
>>> Yossi Hi,
>>>
>>> I'm on vacation till Monday.
>>> I'll check when we can have the full fix - and if it is not in the near
>>> future, we'll put your patch in until the full fix is prepared.
>>>
>>> - Amir
>>>
>>> -----Original Message-----
>>> From: Yossi Etigin [mailto:yossi.openib at gmail.com]
>>> Sent: Mon 8/25/2008 6:18 PM
>>> To: Amir Vadai
>>> Cc: general list; Oren Duer; Olga Shern
>>> Subject: Re: [PATCH] libsdp: enable fallback to TCP for nonblocking sockets
>>> Hi Amir,
>>>
>>> The single case in which we block connect() here (and only on SDP, which
>>> is rather fast) is the case that is currently not supported anyway. It can
>>> also be made configurable.
>>> Anyway, we have a client which uses non-blocking sockets and really needs
>>> that feature. How about putting this into OFED now and writing something
>>> better later on?
>>>
>>> --Yossi
>>>
>>>
>>> Amir Vadai wrote:
>>> > See below
>>> >
>>> > On Thu, 2008-08-21 at 19:49 +0300, Yossi Etigin wrote:
>>> >> Hi Amir,
>>> >>
>>> >> What you are suggesting is to replace almost all socket functions, and I
>>> >> don't think that this is good either.
>>> > I agree - but to break the non-blocking semantics is worse.
>>> >
>>> >> It would be write(), send(), recv(), sendto(), recvfrom(), sendmsg(),
>>> >> recvmsg(), and we would also need to change select() (to not return when
>>> >> fallback happens if SDP fails), and maybe also poll(). libsdp tries to
>>> >> stay out of the fast path.
>>> > I don't see another option. We could have an #ifdef to let the user
>>> > choose - non-blocking support or a cleaner fast path.
>>> >> Besides, how do we know when to do fallback - can we safely assume
>>> >> that if some socket operation fails, then it happened because
>>> >> connect() failed?
>>> > From a brief look at the connect man page, it says we should use select
>>> > for writing on the socket. After select indicates writability, use
>>> > getsockopt to determine whether connect() completed successfully or not.
>>> >> Anyway, if I understand correctly, you suggest something like:
>>> >>
>>> >> int connect(fd, ...)
>>> >> {
>>> >>     ...
>>> >>     set_state(fd, SDP);
>>> >>     ...
>>> >> }
>>> >>
>>> >>
>>> >> int read(int fd, ...)
>>> >> {
>>> >>     int res = socket_funcs.read(shadow_fd(fd), ...);
>>> >>     if (res < 0 && errno != EAGAIN && sock_state(fd) == SDP) {
>>> >>         set_state(fd, TCP);
>>> >>         socket_funcs.connect(fd, ...);
>>> >>         close(shadow_fd(fd));
>>> >>         errno = EAGAIN;
>>> >>     }
>>> >>     return res;
>>> >> }
>>> >>
>>> >>
>>> > ... again, I don't like it either - but I don't think we should block
>>> > connect when the user asks us not to.
>>> > - Amir.
>>> >> --Yossi
>>> >>
>>> >> Amir Vadai wrote:
>>> >>> Yossi Hi,
>>> >>>
>>> >>> I think that breaking the semantics of non-blocking sockets is a bad
>>> >>> idea.
>>> >>> There is a solution that won't break these semantics:
>>> >>>
>>> >>> 1. User app calls connect().
>>> >>>    - libsdp tries to connect through SDP.
>>> >>> 2. User app tries another operation on the socket (e.g. read/write)
>>> >>>    - if the SDP connection was established successfully - great
>>> >>>    - if SDP is still not established - return -EAGAIN. This is the
>>> >>>      same behaviour as if the TCP connection wasn't connected yet.
>>> >>>    - if SDP timed out - return -EAGAIN and initiate a TCP connect.
>>> >>>    - if the TCP connection is established - use it
>>> >>>    - if the TCP connection timed out - return an error.
>>> >>>
>>> >>> Maybe we could optimize it and initiate a TCP connection in parallel
>>> >>> with the SDP connection and use it only when the SDP connect
>>> >>> times out.
>>> >>>
>>> >>> I will add only the second patch (the debug print fix).
>>> >>>
>>> >>> - Amir
>>> >>>
>>> >>>
>>> >>
>>> >>
>>> >
>>>
>>
>
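
The fallback sequence Amir proposes in the oldest message above
(SDP first, -EAGAIN while pending, TCP after an SDP timeout) can be
sketched as a small state machine. All names here (conn_state,
conn_event, conn_step) are hypothetical and are not libsdp identifiers.
The thread only says "return error" for a TCP timeout; using ETIMEDOUT
for it is an assumption.

```c
#include <errno.h>

/* Hypothetical per-socket connection states (illustrative names). */
enum conn_state { SDP_PENDING, SDP_READY, TCP_PENDING, TCP_READY, CONN_FAILED };

/* Hypothetical status of the transport currently being tried. */
enum conn_event { EV_IN_PROGRESS, EV_CONNECTED, EV_TIMED_OUT };

/*
 * Advance the state when the application performs a socket operation.
 * Returns 0 if the socket is usable, or -1 with *err set to EAGAIN
 * (try again later) or ETIMEDOUT (assumed final error code).
 */
int conn_step(enum conn_state *st, enum conn_event ev, int *err)
{
    switch (*st) {
    case SDP_PENDING:
        if (ev == EV_CONNECTED) {
            *st = SDP_READY;
            return 0;
        }
        if (ev == EV_TIMED_OUT)
            *st = TCP_PENDING;  /* a real wrapper would start the TCP connect here */
        *err = EAGAIN;          /* looks like a connection that is not ready yet */
        return -1;
    case TCP_PENDING:
        if (ev == EV_CONNECTED) {
            *st = TCP_READY;
            return 0;
        }
        if (ev == EV_TIMED_OUT) {
            *st = CONN_FAILED;
            *err = ETIMEDOUT;
            return -1;
        }
        *err = EAGAIN;
        return -1;
    case SDP_READY:
    case TCP_READY:
        return 0;
    case CONN_FAILED:
    default:
        *err = ETIMEDOUT;
        return -1;
    }
}
```

The sketch leaves open the question Yossi raises in the thread: how the
wrapper learns that the SDP attempt timed out (the EV_TIMED_OUT input)
without hooking the fast-path calls.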