[openib-general] ibv_reg_mr temporary vs permanent errors

Dotan Barak dotanb at dev.mellanox.co.il
Thu Oct 19 00:16:12 PDT 2006


Hi Troy.

Troy Benjegerdes wrote:
> If ibv_reg_mr fails, can an application (or library, such as pvfs)  
> assume that this is just a temporary error, and try to deregister  
> some memory, then try again?
>   
I believe that the answer is: not always. There may be several reasons for 
a memory registration to fail:
* bad parameters (memory type and requested permission don't match)
* if the permission is not legal (Remote Write is enabled but Local 
Write isn't)
* the process cannot lock any more memory (ulimit configuration)
* the process cannot register any more memory regions (maybe other 
processes registered all of the available MRs supported by the HCA)
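
Two of the causes above (an illegal permission combination and the locked
memory ulimit) can be checked by the application itself before it calls
ibv_reg_mr. The following is only a rough sketch of such pre-checks, not
something the verbs library provides. The function names and the ACCESS_*
constants are made up for illustration (real code would use the
IBV_ACCESS_* flags from <infiniband/verbs.h>), and the limit check only
compares against the soft RLIMIT_MEMLOCK value, so it ignores memory the
process has already locked.

```c
#include <stddef.h>
#include <sys/resource.h>

/* Illustrative stand-ins so the sketch compiles without
 * <infiniband/verbs.h>; real code would use IBV_ACCESS_LOCAL_WRITE
 * and IBV_ACCESS_REMOTE_WRITE instead. */
enum {
    ACCESS_LOCAL_WRITE  = 1 << 0,
    ACCESS_REMOTE_WRITE = 1 << 1
};

/* Returns 1 if the permission combination is legal, 0 otherwise.
 * Only the rule mentioned above is checked: Remote Write requires
 * Local Write. */
int access_flags_legal(int access)
{
    if ((access & ACCESS_REMOTE_WRITE) && !(access & ACCESS_LOCAL_WRITE))
        return 0;
    return 1;
}

/* Returns 1 if len bytes fit under the soft locked-memory limit,
 * 0 if they do not, and -1 if the limit cannot be read.
 * This is a necessary condition only: memory that is already locked
 * by the process is not taken into account. */
int fits_memlock_limit(size_t len)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return 1;
    return (rlim_t)len <= rl.rlim_cur ? 1 : 0;
}
```

If either check fails, retrying the registration later will not help
until the flags or the ulimit configuration are changed.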
> How can we differentiate between the case where the hardware (such as  
> ehca) actually has more information about why the memory registration  
> failed, and the application can act on that information (by  
> coalescing memory regions, for example), vs cases where something is  
> just plain broken and the application should give up and exit.
>   
For now, the gen2 driver doesn't give the user any reason for the 
failure of the operation.
I hope that this will change ...
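
Since no failure reason is available, about the best an application can
do is a bounded retry: release one of its own registrations, try again,
and give up after a few attempts. Below is a minimal sketch of that
policy. Everything in it is hypothetical: register_with_retry, the two
callback types and the toy_* helpers are not part of libibverbs. In a
real program the callbacks would wrap ibv_reg_mr and ibv_dereg_mr; the
toy "HCA" with a fixed number of MR slots is there only so that the loop
can be exercised without hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callbacks supplied by the application.
 * register_fn returns NULL on failure (as ibv_reg_mr does).
 * evict_fn returns 1 if it released a region, 0 if none are left. */
typedef void *(*register_fn)(void *ctx, void *buf, size_t len);
typedef int (*evict_fn)(void *ctx);

/* Try to register; on failure release one region and retry, at most
 * max_retries times. Returns NULL if the failure looks permanent. */
void *register_with_retry(register_fn try_register, evict_fn evict_one,
                          void *ctx, void *buf, size_t len,
                          int max_retries)
{
    assert(try_register != NULL && evict_one != NULL);

    for (int attempt = 0; ; attempt++) {
        void *mr = try_register(ctx, buf, len);
        if (mr != NULL)
            return mr;
        /* Stop when out of retries or when nothing can be released. */
        if (attempt >= max_retries || !evict_one(ctx))
            return NULL;
    }
}

/* Toy stand-in for an HCA with a limited number of MR slots. */
struct toy_hca {
    int free_slots; /* registrations still available */
    int cached;     /* regions the application could release */
};

void *toy_register(void *ctx, void *buf, size_t len)
{
    struct toy_hca *h = ctx;
    (void)len;
    if (h->free_slots == 0)
        return NULL;
    h->free_slots--;
    return buf; /* the buffer pointer doubles as the "MR handle" */
}

int toy_evict(void *ctx)
{
    struct toy_hca *h = ctx;
    if (h->cached == 0)
        return 0;
    h->cached--;
    h->free_slots++;
    return 1;
}
```

The retry limit matters: if the failure is caused by bad parameters or
by the ulimit, releasing regions never helps, and without the limit the
loop would throw away every cached registration before giving up.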

Dotan



