[openib-general] ibv_reg_mr temporary vs permanent errors
Dotan Barak
dotanb at dev.mellanox.co.il
Thu Oct 19 00:16:12 PDT 2006
Hi Troy.
Troy Benjegerdes wrote:
> If ibv_reg_mr fails, can an application (or library, such as pvfs)
> assume that this is just a temporary error, and try to deregister
> some memory, then try again?
>
I believe that the answer is: not always. There may be several reasons for
a memory registration to fail:
* bad parameters (the memory type and the requested permissions don't match)
* the requested permissions are not legal (Remote Write is enabled but
Local Write isn't)
* the process cannot lock any more memory (ulimit configuration)
* the process cannot register any more memory regions (other processes
may have registered all of the MRs the HCA supports)
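
Two of the causes above can be checked by the application before it calls
ibv_reg_mr at all. The following is only a minimal sketch of such a
pre-check, not something the library provides. The ACC_* constants are
hypothetical stand-ins for IBV_ACCESS_LOCAL_WRITE and
IBV_ACCESS_REMOTE_WRITE (real code would include <infiniband/verbs.h>),
and already_locked is assumed to be a byte count the application tracks
itself. The kernel accounts locked memory in pages, so the limit check is
approximate.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical stand-ins for the libibverbs access flags. */
enum {
    ACC_LOCAL_WRITE  = 1 << 0,
    ACC_REMOTE_WRITE = 1 << 1
};

/* Result of the pre-check (names are illustrative only). */
enum reg_precheck {
    PRECHECK_OK,
    PRECHECK_BAD_ACCESS,    /* permanent: retrying will not help */
    PRECHECK_MEMLOCK_LIMIT  /* possibly temporary: deregister and retry */
};

/* Remote Write without Local Write is not a legal combination. */
static bool access_flags_legal(int access)
{
    if ((access & ACC_REMOTE_WRITE) && !(access & ACC_LOCAL_WRITE))
        return false;
    return true;
}

/* Would locking `length` more bytes stay within `limit`? */
static bool fits_memlock(size_t already_locked, size_t length, rlim_t limit)
{
    if (limit == RLIM_INFINITY)
        return true;
    if (already_locked > limit)
        return false;
    return length <= limit - already_locked;
}

/* Combine both checks against the current soft RLIMIT_MEMLOCK. */
static enum reg_precheck precheck_registration(int access,
                                               size_t already_locked,
                                               size_t length)
{
    struct rlimit rl;

    if (!access_flags_legal(access))
        return PRECHECK_BAD_ACCESS;
    /* If the limit cannot be read, skip the check instead of guessing. */
    if (getrlimit(RLIMIT_MEMLOCK, &rl) == 0 &&
        !fits_memlock(already_locked, length, rl.rlim_cur))
        return PRECHECK_MEMLOCK_LIMIT;
    return PRECHECK_OK;
}
```

A caller could treat PRECHECK_BAD_ACCESS as a bug to fix and
PRECHECK_MEMLOCK_LIMIT as a hint to deregister some memory first. The
last cause in the list (the HCA running out of MRs) cannot be detected
this way.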
> How can we differentiate between the case where the hardware (such as
> ehca) actually has more information about why the memory registration
> failed, and the application can act on that information (by
> coalescing memory regions, for example), vs cases where something is
> just plain broken and the application should give up and exit.
>
For now, the gen2 driver doesn't give the user any reason for the
failure of the operation.
I hope that this will be changed ...
Dotan