[openib-general][PATCH][RFC]: CMA IB implementation

Tue Sep 20 05:04:45 PDT 2005

Hi Sean,

Thanks a lot for all your comments.

I think that adding cma_create and force use of cma_destroy, as you suggested, 
would take care of many of the comments below.

Sean Hefty wrote:
>> #define CMA_TARGET_MAX        4
>> #define CMA_INITIATOR_DEPTH    4
>> #define CMA_RC_RETRY_COUNT    7
>> #define CMA_RNR_RETRY_COUNT    6
>> #define CMA_CM_RESPONSE_TIMEOUT    20 /* 4 sec */
>> #define CMA_MAX_CM_RETRIES    0
> 
> Are these values hard-coded just for the initial implementation?  How 
> would these change?
I thought the consumer would pass some of them in the qp_attr that in struct 
ib_cma_conn, other parameters can be added to the struct, if needed.

>> enum cma_close_flags {
>>     CMA_CLOSE_ABRUPT = 0,
>>     CMA_CLOSE_GRACEFUL
>> };
> 
> Not sure what these are for.  Why not have the user always destroy the 
> cma_id? If it hasn't yet been destroyed when a disconnect comes in, 
> callback the user. If a connection hasn't been disconnected when it is 
> destroyed, automatically send a disconnect message.

OK.

>> struct accept_callback {
>>     ib_cma_ac_handler func;
>>     void *context;
>> };
>>
>> struct listen_callback {
>>     ib_cma_listen_handler func;
>>     void *context;
>> };
> 
> 
> These could be eliminated if we just associated a context with a cma_id 
> and left it at that.  I think we're asking for race conditions if we try 
> to update the context with every callback.

I don't think I update the context more then once (unless the consumer calls 
cma_accept twice or something..)
It is there for convenience - to distinguish between a context held for an 
accept cb or listen cb.

>> static int cma_modify_qp_state(struct ib_cm_id *cm_id, struct ib_qp *qp,
>>                       enum ib_qp_state qp_state,                       
>> int qp_attr_mask)
>> {
>>     struct ib_qp_attr qp_attr;
>>     int status = 0;
>>     printk(KERN_DEBUG PFX "%s: enter >>> modify to %d\n",
>>            __func__, qp_state);
>>
>>     if (qp == NULL)
>>         return -EINVAL;
> We shouldn't need checks like this in the kernel.

OK

>>     memset(&qp_attr, 0, sizeof qp_attr);
>>     qp_attr.qp_state = qp_state;
>>
>>     if (cm_id && !qp_attr_mask) 
> Or this check...

This check we do need, because:
- when we call modify qp state to RTR or RTS cm_id is valid and 
qp_attr_mask==0, so we need to call ib_cm_init_qp_attr
- when we call modify qp state to ERROR cm_id==0 and  qp_attr_mask is valid

>> static int destroy_cma_ctx(struct cma_context *cma_ctx)
>> {
>>     if(!IS_ERR(cma_ctx->cm_id))
>>         ib_destroy_cm_id(cma_ctx->cm_id);
>>     if (cma_ctx->cma_param.private_data)
>>         kfree(cma_ctx->cma_param.private_data);
> 
> 
> Is this the outbound private data or inbound?  Why not tie the private 
> data to an event and avoid storing it with the cma_ctx?

It is the private data passed by the consumer in the connection request.
I stored it in cma_ctx to retrieve it back in the cma_path_handler.
For some reason I saw that the private data was garbaged when I passed the 
consumer's pointer of private_data (maybe it was my improvised test module)...

> 
>>     if (cma_ctx)
>>         kfree(cma_ctx);
>>     
>>     cma_ctx = NULL;
> 
> 
> Is there a reason to set this to NULL?
> 
>>     return 0;
>> }

The reason was to check cma_ctx in cma_disconnect - as to not free it twice. 
After the change of forcing cma_destroy call before disconnect - this would no 
longer be needed.

>> static int cma_disconnect(struct ib_qp *qp,
>>               struct cma_context *cma_ctx,               enum 
>> cma_close_flags cflags)
>> {
>>     int status;
>>
>>     if (cma_ctx == NULL)
>>         goto modqp;
> 
> 
> We shouldn't need this check.

Right. see above.

>>     if (cflags == CMA_CLOSE_ABRUPT)
>>         status = destroy_cma_ctx(cma_ctx);
> 
> 
> Why would this call fail?  What would the user do if it does?

I will change it to void.

> 
>>     else if (cflags == CMA_CLOSE_GRACEFUL){
>>         status = ib_send_cm_dreq(cma_ctx->cm_id, NULL, 0);
>>     }
> 
> 
> See comments above.  Eliminate the GRACEFUL/ABRUPT flags and just let 
> the user either issue the disconnect or just destroy the cma_ctx.

Agreed.

>> modqp:
>>     status = cma_modify_qp_state(0, qp, IB_QPS_ERR, IB_QP_STATE);
>>     return status;
>> }
>>
>> void cma_connection_callback(struct cma_context *cma_ctx,
>>                 const enum ib_cma_event event,
>>                 const void *private_data)
>> {
>>     ib_cma_event_handler conn_cb;
>>     struct ib_qp *qp = cma_ctx->cma_conn.qp;
>>     int status;
>>
>>     conn_cb = cma_ctx->cma_conn.cma_event_handler;
>>     
>>     switch (event) {
>>     case IB_CMA_EVENT_ESTABLISHED:
>>         break;
>>     case IB_CMA_EVENT_DISCONNECTED:
>>     case IB_CMA_EVENT_REJECTED:
>>     case IB_CMA_EVENT_UNREACHABLE:
>>     case IB_CMA_EVENT_NON_PEER_REJECTED:
>>         status = cma_disconnect(qp, cma_ctx, CMA_CLOSE_ABRUPT);
> 
> 
> This is destroying the cma_ctx without the user knowing it.  The 
> dereference to cma_ctx below will crash.  We shouldn't take any action 
> on behalf of the user. Simply report the error and let the user destroy 
> the cma_id.

This is the same of create/destroy issue. I will remove the call.

>> int cma_active_cb_handler(struct ib_cm_id *cm_id, struct ib_cm_event 
>> *event)
>> {
>>     int status = 0;
>>     enum ib_cma_event cma_event = 0;
>>     struct cma_context *cma_ctx = cm_id->context;
>>
>>     printk(KERN_DEBUG PFX "%s: enter >>> cm_id=%p 
>> cma_ctx=%p\n",__func__,         cm_id, cma_ctx);
>>
>>     switch (event->event) {
>>     case IB_CM_REQ_ERROR:
>>         cma_event = IB_CMA_EVENT_UNREACHABLE;
>>         break;
>>     case IB_CM_REJ_RECEIVED:
>>         cma_event = IB_CMA_EVENT_NON_PEER_REJECTED;
>>         break;
>>     case IB_CM_DREP_RECEIVED:
>>     case IB_CM_TIMEWAIT_EXIT:
>>         cma_event = IB_CMA_EVENT_DISCONNECTED;
>>         break;
>>     case IB_CM_REP_RECEIVED:
>>         status = cma_rep_recv(cma_ctx, event);
>>         if (!status)
>>             cma_event = IB_CMA_EVENT_ESTABLISHED;
>>         else
>>             cma_event = IB_CMA_EVENT_DISCONNECTED;
>>         break;
>>     case IB_CM_DREQ_RECEIVED:
>>         ib_send_cm_drep(cm_id, NULL, 0);
>>         cma_event = IB_CMA_EVENT_DISCONNECTED;
>>         break;
>>     case IB_CM_DREQ_ERROR:
>>         break;
>>     default:
>>         printk(KERN_WARNING PFX "%s: cm event (%d) not handled\n",
>>             __func__, event->event);
>>         break;
>>     }
>>
>>     printk(KERN_WARNING PFX "%s: cm_event=%d cma_event=%d\n",
>>         __func__, event->event, cma_event);
>>
>>     if (cma_event)
> 
> This check isn't needed.

OK

>>         cma_connection_callback(cma_ctx, cma_event, 
>>                     event->private_data);
>>
>>     return status;
> 
> Returning non-zero will destroy the underlying cm_id.  

Interesting. I didn't know that.

> We can avoid some 
> synchronization by letting it exist until the user destroys the 
> corresponding cma_id.  Otherwise, there's the potential of the user 
> trying to destroy it twice.  Once from the cma_connection_callback 
> reporting an error, and then again here.

So you suggests this function will always return 0, then ?

>> }
>>
>> static struct cma_context *get_cma_ctx(struct ib_cm_id *cm_id,
>>                        struct ib_cm_event *event)
>> {
>>     struct cma_context *new_cma_ctx;
>>     int status;
>>
>>     
>>     if (event->event != IB_CM_REQ_RECEIVED)
>>         return cm_id->context;
>>
>>     if (((struct cma_context *)cm_id->context)->cm_id != cm_id) 
>>         printk(KERN_DEBUG PFX "%s: old_cm_id=%p new_cm_id=%p\n", 
>>                __func__, ((struct cma_context *)cm_id->context)->cm_id,
>>                cm_id);
> 
> 
> This check shouldn't be needed.

It is just for debug purposes - I will remove it later on.

>>     new_cma_ctx = kmalloc(sizeof *new_cma_ctx, GFP_KERNEL);
>>     if (!new_cma_ctx) {
>>         status = ib_send_cm_rej(cm_id,
>>                     IB_CM_REJ_CONSUMER_DEFINED,
>>                     NULL, 0, NULL, 0);
>>         return NULL;
>>     }
>>
>>     memset(new_cma_ctx, 0, sizeof *new_cma_ctx);
>>     new_cma_ctx->cm_id = cm_id;
>>     new_cma_ctx->creq_cma_ctx = cm_id->context;
>>     cm_id->context = new_cma_ctx;
>>
>>     return new_cma_ctx;
>> }
>>
>> static void cma_path_handler(u64 req_id, void *context, int rec_num)
>> {
>>     struct cma_context *cma_ctx = context;
>>     enum ib_cma_event event;
>>     int status = 0;
>>
>>     if (!cma_ctx) {
> 
> This check isn't needed.

What if the consumer destroyed the cma_id, before the path handler cb returned ?

>>         printk(KERN_ERR PFX "%s: context received null\n",__func__);
>>         return;

>> int ib_cma_create_qp(struct ib_pd *pd, u8 port, struct ib_qp **qp_in, 
>>              struct ib_qp_init_attr *init_attr)
>> {
> 
> Why not return struct ib_qp* similar to how the other APIs operate?

I thought of returning ib_modify_qp status, but I agree that destroying the qp 
if failed and just returning the qp if not, is better.

>>     struct ib_qp_attr qp_attr;
>>     int qp_attr_mask;
>>     struct ib_qp *qp;
>>
>>     qp = ib_create_qp(pd, init_attr);
>>     if (IS_ERR(qp))
>>         return IS_ERR(qp);
>>     *qp_in = qp;
>>     printk(KERN_DEBUG PFX "%s: qp created (%p)\n",__func__, qp);
>>     memset(&qp_attr, 0, sizeof qp_attr);
>>
>>     qp_attr_mask = IB_QP_STATE | IB_QP_ACCESS_FLAGS |                
>> IB_QP_PKEY_INDEX | IB_QP_PORT;
>>
>>     qp_attr.qp_access_flags = IB_ACCESS_REMOTE_READ | 
>>                   IB_ACCESS_REMOTE_WRITE;
>>     qp_attr.qp_state = IB_QPS_INIT;
>>     qp_attr.pkey_index = 0;
>>     qp_attr.port_num = port;
>>     printk(KERN_DEBUG PFX "%s: call ib_modify_qp (qp_attr_mask=%x)\n",
>>         __func__, qp_attr_mask);
>>     
>>     return ib_modify_qp(qp, &qp_attr, qp_attr_mask);
> 
> 
> If modify fails, the QP should be destroyed.

OK. see above.

>> int ib_cma_reject(void *cma_id, const void *private_data,
>>           u8 private_data_len)
>> {
>>     struct cma_context *cma_ctx = cma_id;
>>     int status;
>>
>>     if (cma_ctx == NULL)
> 
> This check isn't needed.

OK

  > As a general statement, can we reduce the number of debug print statements?

Sure.
This is mainly for bring up purposes, they can be removed after the code stabilizes.

Thanks again for your detailed feedback
Guy.