[ofa-general] OOM problem with ib_ipoib?

John Marshall John.Marshall at ec.gc.ca
Wed Oct 29 07:44:47 PDT 2008


Roland Dreier wrote:
>  > MemTotal:     33274492 kB
>  ...
>  > LowTotal:       638684 kB
>
> It looks as if you have a box with 32G of RAM running a 32-bit kernel,
> which means low (direct kernel-mapped) memory is extremely tight.  IPoIB
> connected mode ties up a significant amount of memory in the receive
> queue -- perhaps around 64M, which is 10% of low memory for you.  So
> loading IPoIB may push you past the tipping point where things really
> break easily.
>   
The curious thing is that the OOM occurs even when the ib interfaces
are _not even UP_, although the ib_ipoib module is loaded. So I cannot
see how it can be an allocation issue tied to actual usage in that
case. Am I missing something here?
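
For reference, the proportions Roland describes can be checked against the
quoted /proc/meminfo figures. The following is only a rough sketch: the
64M receive-queue figure is his estimate ("perhaps around 64M"), not a
measurement, and parse_meminfo and share_percent are hypothetical helper
names introduced here for illustration.

```python
def parse_meminfo(text):
    """Return {field: integer value} for lines like 'MemTotal:  33274492 kB'.

    Values are taken as-is from the first number after the colon; for the
    fields used here that number is in kB.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def share_percent(part_kb, whole_kb):
    """Percentage of whole_kb that part_kb represents."""
    return 100.0 * part_kb / whole_kb


# The two values quoted in this thread; on a live system the text would
# come from open("/proc/meminfo").read() instead.
SAMPLE = """MemTotal:     33274492 kB
LowTotal:       638684 kB
"""

info = parse_meminfo(SAMPLE)

# Assumption: roughly 64M tied up by the IPoIB connected-mode receive queue.
rx_queue_kb = 64 * 1024

print("receive queue as %% of low memory: %.1f"
      % share_percent(rx_queue_kb, info["LowTotal"]))
print("low memory as %% of total memory:  %.1f"
      % share_percent(info["LowTotal"], info["MemTotal"]))
```

With these numbers the estimated receive queue comes to about a tenth of low
memory, and low memory itself is only about 2% of total RAM.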

Also, shouldn't the OS handle this transparently via pdflush, which
writes out dirty data and frees up memory? Or does pdflush not
distinguish between total memory and low memory, so that a problem
occurs (yet the OOM happens even when the interfaces are not UP!)?

> I'm not surprised that you run into memory management problems with such
> a system -- 32-bit kernels really have a hard time coping with such an
> inbalance between total memory and low memory.  The simplest solution
> would probably be to switch to a 64-bit kernel -- note that you don't
> have to change any userspace, just use a 64-bit kernel.
>   
I will give it a shot.
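
A quick way to confirm which kernel is running after the switch, while
userspace stays 32-bit, might look like the sketch below. It rests on
assumptions: the set of machine names is illustrative rather than
exhaustive, kernel_is_64bit and userspace_bits are hypothetical helper
names, and a process started under a 32-bit personality (for example via
linux32) can see a 32-bit machine name even on a 64-bit kernel.

```python
import os
import struct

# Assumption: a non-exhaustive set of machine names that 64-bit kernels
# commonly report through uname.
SIXTY_FOUR_BIT_MACHINES = {"x86_64", "amd64", "aarch64", "ppc64", "ppc64le", "s390x"}


def kernel_is_64bit(machine):
    """Guess from the uname machine string whether the kernel is 64-bit."""
    return machine in SIXTY_FOUR_BIT_MACHINES


def userspace_bits():
    """Pointer width, in bits, of the running Python interpreter."""
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    machine = os.uname().machine
    print("kernel machine type:", machine)
    print("64-bit kernel (by name):", kernel_is_64bit(machine))
    print("userspace pointer width:", userspace_bits(), "bits")
```

On a box with a 64-bit kernel and unchanged 32-bit userspace, the machine
type and the pointer width would be expected to disagree, which is the
configuration Roland suggests.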

Thanks,
John


