[ofa-general] IPoIB post_send failed

Hal Rosenstock hal.rosenstock at gmail.com
Wed Jul 29 11:31:41 PDT 2009


Hi Pradeep,

On Wed, Jul 29, 2009 at 2:14 PM, Pradeep Satyanarayana <pradeeps at linux.vnet.ibm.com> wrote:

> Hal Rosenstock wrote:
> > Hi,
> >
> > I'm seeing the following messages from IPoIB:
> > ib0: post_send failed
> > ib0: post_send failed
> > ib0: post_send failed
> > ib0: post_send failed
> > ib0: post_send failed
> > ib0: post_send failed
> > NETDEV WATCHDOG: ib0: transmit timed out
> > ib0: transmit timeout: latency 1374 msecs
> > ib0: queue stopped 1, tx_head 140245691, tx_tail 140245565
> >
> > What are the possible (and most likely) causes of post_send failures? I
> > went through the code for all the errors (some at the driver level) but
> > none stood out to me.
> >
>
> Is it possible that the receiver is overwhelmed and hence the tx_ring is
> full?


It's possible but from the message you can't tell whether the tx_ring is
full.
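
That said, the watchdog line quoted above does print tx_head and tx_tail, and
their difference is presumably the number of sends posted but not yet
completed. Here is a minimal sketch of that arithmetic, assuming the message
format shown above and assuming both counters are unsigned 32-bit values (the
log itself does not say so). The helper name outstanding_sends is made up for
illustration.

```python
import re

# Matches the "tx_head N, tx_tail M" part of the watchdog message quoted above.
WATCHDOG_RE = re.compile(r"tx_head (\d+), tx_tail (\d+)")


def outstanding_sends(line, counter_bits=32):
    """Return tx_head - tx_tail from a watchdog line, or None if absent.

    counter_bits is an assumption: the counters are treated as unsigned
    values that wrap at 2**counter_bits.
    """
    match = WATCHDOG_RE.search(line)
    if match is None:
        return None
    head, tail = int(match.group(1)), int(match.group(2))
    return (head - tail) % (1 << counter_bits)


line = "ib0: queue stopped 1, tx_head 140245691, tx_tail 140245565"
print(outstanding_sends(line))  # prints 126
```

Comparing that number with the configured ring size would show how close the
ring was to full when the watchdog fired.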

Does it make sense to increase the transmit ring size via the send_queue_size
module parameter?
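
Before changing it, it would help to know the current value. Here is a small
sketch, assuming the parameter is exported read-only under sysfs at the path
below (the path is my assumption and may differ by kernel or OFED build). The
function name read_send_queue_size is hypothetical.

```python
from pathlib import Path

# Assumed sysfs location of the ib_ipoib module parameter; not verified here.
DEFAULT_PARAM_PATH = "/sys/module/ib_ipoib/parameters/send_queue_size"


def read_send_queue_size(path=DEFAULT_PARAM_PATH):
    """Return the current send_queue_size as an int, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # File missing (module not loaded, different path) or not a number.
        return None


print(read_send_queue_size())
```

This prints None when the file is missing, for example when ib_ipoib is not
loaded.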


>
> Is this a UDP application?


There is at least some UDP and there are many concurrent clients.


>
>
> > Once the transmit queue is stopped, does the interface need to be taken
> > down and then back up to restart it?
>
> One does not need to take down the interface. It should be able to recover
> on its own. There is a timer that kicks in and checks whether the tx_ring
> is still full; once it is not, the transmits should start again. Thanks!
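
To check my understanding of the recovery described above, here is a toy model
of a transmit ring that stops when full and is restarted by a periodic check.
It is only a sketch of the idea, not the IPoIB driver code. The class and
method names are invented, and the wake condition (any free slot) is an
assumption.

```python
class TxRing:
    """Toy transmit ring illustrating stop-when-full and timer-based wake."""

    def __init__(self, size):
        self.size = size
        self.head = 0          # count of sends posted so far
        self.tail = 0          # count of sends completed so far
        self.stopped = False   # True while the queue refuses new sends

    def outstanding(self):
        return self.head - self.tail

    def post_send(self):
        """Post one send; return False if the queue is stopped."""
        if self.stopped:
            return False
        self.head += 1
        if self.outstanding() == self.size:
            self.stopped = True  # ring is full, stop the queue
        return True

    def complete(self, n=1):
        """Mark up to n outstanding sends as completed."""
        self.tail += min(n, self.outstanding())

    def timer_check(self):
        """Periodic check: wake the queue if the ring is no longer full."""
        if self.stopped and self.outstanding() < self.size:
            self.stopped = False
        return not self.stopped


ring = TxRing(4)
for _ in range(4):
    ring.post_send()
print(ring.stopped, ring.post_send())  # True False
ring.complete(2)
print(ring.timer_check(), ring.post_send())  # True True
```

In this model the queue stays stopped for as long as nothing completes, which
would match a receiver that cannot keep up.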


Thanks for the help!

-- Hal


>
>
> Pradeep
>
>