[openib-general] Re: ib0: ipoib_ib_post_receive failed for buf 111 ib0: failed to allocate receive buffer
Roland Dreier
rolandd at cisco.com
Thu Oct 13 15:12:54 PDT 2005
Helen> It doesn't seem like shrinking the TCP window had helped.
Helen> I captured the Dmesg log from Lustre server and associated
Helen> client reporting IOZONE error.
What is the state of the system after you start seeing the ib0
transmit timeout messages? Does IPoIB work at all? Is the HCA
responsive at all -- for example, what do you see if you do

    cat /sys/class/infiniband/mthca0/ports/1/state

or

    cat /sys/class/infiniband/mthca0/ports/1/counters/*

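
If it is easier to capture this repeatedly while the problem is happening, the two commands above can be wrapped in a small shell function. This is only a sketch: the function name dump_ib_port and its optional third argument (an alternate sysfs root, there so the function can be tried without an HCA present) are made up for illustration. The device name mthca0 and port 1 are simply the ones used above.

```shell
#!/bin/sh
# Print the state and every counter of one HCA port, one value per line.
# Assumptions: dump_ib_port is a hypothetical helper name; the defaults
# (mthca0, port 1) come from the commands quoted in this message.

dump_ib_port() {
    dev=${1:-mthca0}
    port=${2:-1}
    root=${3:-/sys/class/infiniband}   # override only for testing
    dir="$root/$dev/ports/$port"

    # If the port directory is missing, report it and give up.
    if [ ! -d "$dir" ]; then
        echo "no such port directory: $dir" >&2
        return 1
    fi

    echo "state: $(cat "$dir/state")"

    # Label each counter with its file name so the values can be told apart,
    # which a bare "cat counters/*" does not do.
    for f in "$dir"/counters/*; do
        [ -f "$f" ] || continue
        echo "$(basename "$f"): $(cat "$f")"
    done
}
```

Calling `dump_ib_port mthca0 1` before and after the timeouts start would show whether the port state changes and which counters move. If the reads themselves hang or fail, that already says something about whether the HCA is responsive.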
Helen> BTW, this problem is a moving target so it is hard to
Helen> believe that it is hardware related(?) BTW, I am using the
Helen> mellanox DDR switch and HCA.
Not sure what you mean by a moving target... the symptoms really look
like a crash of the HCA firmware to me.
Thanks,
Roland