[ofa-general] crash in cm_init_qp_rts_attr() - any ideas?

Sean Hefty sean.hefty at intel.com
Wed Aug 12 16:20:41 PDT 2009


>Call Trace: <ffffffff882fb6d5>{:rdma_cm:rdma_init_qp_attr+209}
>       <ffffffff88309285>{:rdma_ucm:ucma_init_qp_attr+160}
>       <ffffffff802ea55a>{thread_return+0}
>       <ffffffff8830832e>{:rdma_ucm:ucma_write+115}
>       <ffffffff80186662>{vfs_write+215}
>       <ffffffff80186c2b>{sys_write+69}
>       <ffffffff8010adba>{system_call+126}

The rdma_cm is being used, so alternate path information is not used.

>static int cm_init_qp_rts_attr(struct cm_id_private *cm_id_priv,
>                               struct ib_qp_attr *qp_attr,
>                               int *qp_attr_mask)
>{
>        ........
>        if (cm_id_priv->id.lap_state == IB_CM_LAP_UNINIT) {
>                .....
>        } else {
>               *qp_attr_mask = IB_QP_ALT_PATH | IB_QP_PATH_MIG_STATE;
>               qp_attr->alt_port_num = cm_id_priv->alt_av.port->port_num; <-die

The rdma_cm should always send us through the if portion, and I would expect
alt_av.port to be NULL, so the dereference in the else branch would crash.
Maybe the cm_id is corrupted?  Is there any chance that the remote side is
trying to load an alternate path?  Printing the value of lap_state may help,
to see whether it is at least a valid lap_state value.
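
As a rough illustration of that check, here is a minimal userspace sketch
that classifies a raw lap_state value and reports whether the alternate
port pointer is set.  The enum below is assumed to mirror the order of
enum ib_cm_lap_state in include/rdma/ib_cm.h; verify it against your tree.
The names lap_state_name(), describe_lap_state() and LAP_STATE_COUNT are
hypothetical helpers, not part of the ib_cm code.

```c
#include <stddef.h>
#include <stdio.h>

/* Userspace mirror of enum ib_cm_lap_state.  The ordering is an
 * assumption; check include/rdma/ib_cm.h in your kernel tree. */
enum lap_state {
	LAP_UNINIT,
	LAP_IDLE,
	LAP_SENT,
	LAP_RCVD,
	MRA_LAP_SENT,
	MRA_LAP_RCVD,
	LAP_STATE_COUNT	/* hypothetical sentinel, not in the kernel enum */
};

/* Return a printable name, or NULL if the value is outside the enum. */
static const char *lap_state_name(int state)
{
	static const char *const names[LAP_STATE_COUNT] = {
		"IB_CM_LAP_UNINIT",
		"IB_CM_LAP_IDLE",
		"IB_CM_LAP_SENT",
		"IB_CM_LAP_RCVD",
		"IB_CM_MRA_LAP_SENT",
		"IB_CM_MRA_LAP_RCVD",
	};

	if (state < 0 || state >= LAP_STATE_COUNT)
		return NULL;
	return names[state];
}

/* Format a one-line diagnostic: the raw lap_state, its name (or INVALID),
 * and whether the alternate port pointer is set.  Returns what snprintf()
 * returns. */
static int describe_lap_state(char *buf, size_t len, int state,
			      const void *alt_port)
{
	const char *name = lap_state_name(state);

	return snprintf(buf, len, "lap_state=%d (%s) alt_av.port=%s",
			state, name ? name : "INVALID",
			alt_port ? "set" : "NULL");
}
```

In the kernel the same idea would be a printk() of cm_id_priv->id.lap_state
and cm_id_priv->alt_av.port placed just before the dereference.  A value
outside the enum range would point toward a corrupted cm_id, while a valid
non-UNINIT value would suggest that a LAP exchange really did take place.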

- Sean



