[openib-general] Re: SDP: BUG 2034 workaround

Libor Michalek libor at topspin.com
Wed Feb 23 11:26:24 PST 2005


On Wed, Feb 23, 2005 at 11:49:25AM +0200, Michael S. Tsirkin wrote:
> Quoting r. Libor Michalek <libor at topspin.com>:
> > Subject: Re: SDP: BUG 2034 workaround
> > 
> > On Tue, Feb 22, 2005 at 12:14:45PM +0200, Michael S. Tsirkin wrote:
> > > 
> > > sdp_inet.c, inside _sdp_inet_listen, we have:
> > > 
> > > #if 0                           /* BUG 2034 workaround. */
> > >         conn->backlog_max = backlog;
> > > #else
> > >         conn->backlog_max = 1024;
> > > #endif
> > > 
> > > What gives? What would be the proper fix as opposed to a work-around?
> > 
> >   The Linux TCP listen uses two separate values to control the listener
> > backlog, so I ignored the backlog parameter to get closer to the default
> > Linux behavior, until a real solution is devised.
> >  
> >   Basically TCP uses a two-stage backlog to defend against SYN attacks.
> > When a SYN is received a small amount of state is kept until the full
> > handshake is completed, at which point a full socket is created and
> > queued onto the listen socket's accept queue. The second stage uses the
> > listen() backlog parameter to manage the accept queue. The first stage
> > queue size is managed using a sysctl (net.ipv4.tcp_max_syn_backlog),
> > which on a lot of systems defaults to 1024.
> > 
> >   SDP, on the other hand, creates the full socket on the REQ request and
> > places it directly into the listen socket's accept queue. So the full
> > depth of the queue is governed by the listen backlog parameter. Worse
> > still, the socket layer listen, above TCP and SDP, limits the listen()
> > backlog that is passed to the protocol to 128, and only recently did this
> > become adjustable using net.core.somaxconn.
> > 
> >   Using just the backlog parameter to manage this queue means that a
> > few programs with a high connection volume get connections rejected
> > because the backlog queue is full, even when the backlog is raised to
> > the full 128 that the socket layer allows by default.
> 
> I see. A protocol bug.
> I wonder if some application could get confused if it
> gets more connections than it set the backlog up for. What do you think?

  I've not seen an application that has a hard reliance on the backlog
limit, while I have seen applications that complain when the limit is
smaller than what they're expecting. So I think as long as the behavior
is an approximation of the TCP behavior, it's better to make the queue
too big rather than too small.
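
  To put rough numbers on that, here is a deliberately crude model of
the two behaviors described above. It is only a sketch: the helper names
and the "burst" scenario are hypothetical, the two-stage capacity is
approximated as a simple sum, and the constants are just the defaults
mentioned in this thread (128 and 1024).

```c
/* Illustrative defaults taken from the discussion above. */
#define SOMAXCONN_DEFAULT   128   /* socket-layer cap on the listen() backlog */
#define SYN_BACKLOG_DEFAULT 1024  /* common net.ipv4.tcp_max_syn_backlog value */

/* Hypothetical helper: model the socket layer clamping the backlog
 * before the protocol's listen handler ever sees it. */
static int clamp_backlog(int backlog)
{
    if ((unsigned int)backlog > SOMAXCONN_DEFAULT)
        return SOMAXCONN_DEFAULT;
    return backlog;
}

/* Hypothetical model: how many requests from a burst are turned away
 * when they all arrive before the application calls accept(). */
static int rejected(int burst, int capacity)
{
    return burst > capacity ? burst - capacity : 0;
}

/* Two-stage (TCP-like) capacity, crudely approximated as the sum of
 * the SYN backlog and the clamped accept backlog. */
static int two_stage_capacity(int backlog)
{
    return SYN_BACKLOG_DEFAULT + clamp_backlog(backlog);
}

/* Single-stage (SDP-like) capacity: only the clamped accept backlog. */
static int single_stage_capacity(int backlog)
{
    return clamp_backlog(backlog);
}
```

  In this model an application that asks for a backlog of 4096 and then
sees a burst of 500 requests loses 372 of them with the single queue
(500 - 128) and none with the two-stage capacity (1024 + 128).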

> For now, a somewhat cleaner work-around would be:
> 1. Replace the /* BUG 2034 workaround. */ comment with the excellent
>    explanation above.
> 2. Take the max value between the socket backlog and sysctl_max_syn_backlog.

  That would be a better workaround, or you could use the sum of the two
values. At this point it would be overkill to use two separate accept
queues for passive connections, one for established connections and one
for partial connections.
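
  As a minimal sketch of those two options (the helper names are
hypothetical and not taken from sdp_inet.c; "syn_backlog" stands in for
sysctl_max_syn_backlog, and I'm assuming that value is reachable from
the SDP module):

```c
/* Hypothetical helpers sketching the two workarounds discussed above.
 * "syn_backlog" stands in for the kernel's sysctl_max_syn_backlog. */

/* Option 1: the larger of the listen() backlog and the SYN backlog. */
static int backlog_max_of(int backlog, int syn_backlog)
{
    return backlog > syn_backlog ? backlog : syn_backlog;
}

/* Option 2: the sum, approximating TCP's two queues with a single one. */
static int backlog_sum_of(int backlog, int syn_backlog)
{
    return backlog + syn_backlog;
}
```

  With the defaults mentioned earlier (128 and 1024), the first option
gives 1024, the same as the current hardcoded value, and the second
gives 1152. Either result would then be assigned to conn->backlog_max
in _sdp_inet_listen.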

> Want a patch like that?

  Yes, that would be good.

Thanks.

-Libor


