[openib-general] Re: RDS RX buf allocation why on RX callback flow?

Ranjit Pandit rpandit at silverstorm.com
Thu May 4 21:37:11 PDT 2006


On 5/4/06, Or Gerlitz <or.gerlitz at gmail.com> wrote:
> On 4/27/06, Ranjit Pandit <rpandit at silverstorm.com> wrote:
> > On 4/27/06, Leonid Arsh <leonida at voltaire.com> wrote:
>
> >> During the run we get error messages in dmesg on the server side.
> >> Have you seen anything like this?
> >> Please see the dmesg output below:
>
> > I will see if I can reproduce it.
>
> I think the issue here is not to reproduce it (easy) but to understand/discuss
> the design. You are doing a GFP_ATOMIC allocation in the rx callback flow,
> which can of course fail, but since this is hard irq context you can't use
> GFP_KERNEL.

That is correct. The allocations have to be GFP_ATOMIC since they are
happening in interrupt context.

>
> Since RDS is meant to offload Oracle IPC, which is somewhat transactional
> by nature (at least the cache fusions), does it make sense to you for
> the server side to post initial rx buffers and then, before (or after)
> posting each send, post another rx buffer? Same for the client side:
> before (or after) posting a tx, post an rx. This would move the
> rx posting to thread (process) context, where you can use GFP_KERNEL.

There is no notion of client/server in RDS, only passive/active, based
on which node first initiated the connection.
Once the connection is established, it's more peer-to-peer in terms of
data movement.

We can't depend on the Tx path to refill the Rx queue, as the application
could decide not to send anything and only receive.

We attempt to allocate a new Rx buffer in the interrupt context. If
the allocations fail, and we go below a certain low water mark, we
should wake up a thread to refill the Rx queue.
Currently I'm re-posting an Rx buffer when it's done being read by recv_msg().
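
To make the low water mark idea concrete, here is a rough userspace sketch.
It is not the RDS code: the names (rx_queue, rx_completion, rx_refill), the
queue depth and the low water mark are all made up for illustration, and
malloc() stands in for both the GFP_ATOMIC and the GFP_KERNEL allocation.
The sketch only counts buffers. A real driver would wake a kernel thread or
queue work where the flag is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define RXQ_DEPTH     8     /* hypothetical number of Rx buffers kept posted */
#define RXQ_LOW_WATER 4     /* hypothetical low water mark */
#define RX_BUF_SIZE   4096  /* hypothetical buffer size */

struct rx_queue {
    int  posted;          /* buffers currently posted to the receive queue */
    bool refill_pending;  /* set when the refill thread should run */
};

/* Stand-in for a GFP_ATOMIC allocation: never sleeps, may fail. */
static void *alloc_atomic(bool simulate_failure)
{
    return simulate_failure ? NULL : malloc(RX_BUF_SIZE);
}

/* Stand-in for a GFP_KERNEL allocation made from process context. */
static void *alloc_blocking(void)
{
    return malloc(RX_BUF_SIZE);
}

/* Stand-in for posting a receive; the sketch only tracks the count. */
static void post_recv(struct rx_queue *q, void *buf)
{
    free(buf);
    q->posted++;
}

/* Completion path (interrupt context in the real driver). */
static void rx_completion(struct rx_queue *q, bool simulate_failure)
{
    assert(q->posted > 0);
    q->posted--;                     /* one posted buffer was consumed */

    void *buf = alloc_atomic(simulate_failure);
    if (buf) {
        post_recv(q, buf);           /* common case: replace it right away */
        return;
    }
    if (q->posted < RXQ_LOW_WATER)
        q->refill_pending = true;    /* real code would wake the thread here */
}

/* Refill thread body (process context, so it may block). */
static int rx_refill(struct rx_queue *q)
{
    int added = 0;

    while (q->posted < RXQ_DEPTH) {
        void *buf = alloc_blocking();
        if (!buf)
            break;
        post_recv(q, buf);
        added++;
    }
    if (q->posted >= RXQ_LOW_WATER)
        q->refill_pending = false;
    return added;
}
```

With these numbers, a queue that starts at 8 posted buffers tolerates four
failed atomic allocations without asking for help. The fifth failure takes it
to 3, below the mark, and sets refill_pending. A later rx_refill() call then
tops the queue back up to 8.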

In this particular case, though, any ideas why the kernel should
complain when a GFP_ATOMIC allocation fails?

>
> What about the other types of Oracle IPC, do they also have a
> transactional (req/resp) nature?
>
> Or.
>
> > > swapper: page allocation failure. order:1, mode:0x20
> > >
> > > Call Trace: <IRQ> <ffffffff801572ae>{__alloc_pages+662}
> > > <ffffffff801184c7>{smp_apic_timer_interrupt+54}
> > >       <ffffffff8010e63c>{apic_timer_interrupt+132}
> > > <ffffffff8015a0ff>{cache_grow+288}
> > >       <ffffffff8015a4ef>{cache_alloc_refill+419}
> > > <ffffffff80159fb2>{kmem_cache_alloc+87}
> > >       <ffffffff8824c01c>{:ib_rds:rds_alloc_buf+16}
> > > <ffffffff8824c0f1>{:ib_rds:rds_alloc_recv_buffer+12}
> > >       <ffffffff8824b377>{:ib_rds:rds_post_new_recv+23}
> > > <ffffffff8824bfc3>{:ib_rds:rds_recv_completion+85}
> > >       <ffffffff88249b6f>{:ib_rds:rds_cq_callback+87}
> > > <ffffffff8814882b>{:ib_mthca:mthca_eq_int+119}
> > >       <ffffffff801102d8>{do_IRQ+50} <ffffffff8010de1e>{ret_from_intr+0}
> > >       <ffffffff88148b45>{:ib_mthca:mthca_tavor_interrupt+91}
> > >       <ffffffff80151bd5>{handle_IRQ_event+41}
> > > <ffffffff80151ca2>{__do_IRQ+156}
> > >       <ffffffff801102d3>{do_IRQ+45} <ffffffff8010de1e>{ret_from_intr+0}
> > >        <EOI> <ffffffff8010be87>{mwait_idle+54}
> > > <ffffffff8010be37>{cpu_idle+93}
> > >       <ffffffff8052733d>{start_secondary+1131}