[openib-general] RE: calling to ibv_create_qp with big number in qp_init_attr.cap. max_ inline_data never return

Dotan Barak dotanb at mellanox.co.il
Mon Jul 18 05:36:46 PDT 2005


Here is a test that reproduces the problem:


#include <stdint.h>                 /* uint8_t, uint32_t */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>                 /* strcmp */
#include <infiniband/verbs.h>

#define PORT_NUM 1                  /* IB port number to work with */
#define MR_SIZE (1024)              /* size of the MR */
#define CQ_SIZE 10                  /* size of the CQ */
#define QP_CAP_SG_WR 1 /* s/g and w/r of the QP */


/* structure of test parameters */
struct config_t {
    uint32_t                tcp_port;
    const char*             ip;
    const char               *dev_name;
    uint8_t                  ib_port;

    uint8_t                  is_daemon;

};

struct config_t config = {
    20000,          /* tcp_port */
    "0",            /* ip */
    "mthca0",       /* dev_name */
    PORT_NUM,       /* ib_port */
};


int main(
    int argc,
    char *argv[])
{
    int test_result = 1;
    struct dlist *dev_list;
    struct ibv_device *ib_dev = NULL;
    struct ibv_context *ctx = NULL;
    struct ibv_pd *pd = NULL;
    struct ibv_cq *rcq = NULL;
    struct ibv_cq *scq = NULL;
    struct ibv_qp *qp = NULL;
    
    printf("Finding IB devices\n");
    /* get device names in the system */
    dev_list = ibv_get_devices();
    if (dev_list == NULL) {
        perror("Error, failed to get IB devices list");
        goto cleanup;
    }
    
    dlist_for_each_data(dev_list, ib_dev, struct ibv_device)
        if (!strcmp(ibv_get_device_name(ib_dev), config.dev_name)) {
            break;
        }

    if (ib_dev == NULL) {
        printf("Error, IB device %s wasn't found\n", config.dev_name);
        goto cleanup;
    }
    printf("Device %s was found\n", config.dev_name);
    
    ctx = ibv_open_device(ib_dev);
    if (ctx == NULL) {
        perror("Error, failed to open device");
        goto cleanup;
    }
    
    pd = ibv_alloc_pd(ctx);
    if (pd == NULL) {
        printf("Error, failed to allocate PD\n");
        goto cleanup;
    }
    printf("PD was allocated\n");

    rcq = ibv_create_cq(ctx, CQ_SIZE, NULL);
    if (rcq == NULL) {
        perror("Error, failed to create receive CQ");
        goto cleanup;
    }
    printf("Receive CQ was created with %u entries\n", rcq->cqe);

    scq = ibv_create_cq(ctx, CQ_SIZE, NULL);
    if (scq == NULL) {
        perror("Error, failed to create send CQ");
        goto cleanup;
    }
    printf("Send CQ was created with %u entries\n", scq->cqe);

    {
        struct ibv_qp_init_attr qp_init_attr = {
            .qp_type = IBV_QPT_RC,
            .recv_cq = rcq,
            .send_cq = scq,
            .sq_sig_all = 0,
            .cap.max_send_wr = QP_CAP_SG_WR,
            .cap.max_send_sge = QP_CAP_SG_WR,
            .cap.max_recv_wr = QP_CAP_SG_WR,
            .cap.max_recv_sge = QP_CAP_SG_WR
        };

        qp_init_attr.cap.max_inline_data = 1075060724;
        
        printf("before calling create QP\n");
        qp = ibv_create_qp(pd, &qp_init_attr);
        printf("after calling create QP\n");
        if (qp == NULL) {
            perror("Error, failed to create QP");
            goto cleanup;
        }
    }
    printf("QP with number 0x%x was created\n", qp->qp_num);

    test_result = 0;
cleanup:
    return test_result;
}



Dotan

-----Original Message-----
From: Michael S. Tsirkin [mailto:mst at mellanox.co.il]
Sent: Monday, July 18, 2005 3:36 PM
To: Roland Dreier
Cc: Dotan Barak; openib-general at openib.org
Subject: Re: calling to ibv_create_qp with big number in
qp_init_attr.cap.max_ inline_data never return


Hi, Roland!
Quoting r. Roland Dreier <rolandd at cisco.com>:
> Subject: Re: calling to ibv_create_qp with big number in
> qp_init_attr.cap.max_ inline_data never return
> 
>     Dotan> the create_qp function never ends.
> 
> Where does it hang?  Can you do strace on the process?  If it's stuck
> sleeping, what does /proc/<pid>/wchan say?

Here:

        size = sizeof (struct mthca_next_seg) +
                qp->sq.max_gs * sizeof (struct mthca_data_seg);
        switch (qp->qpt) {
        case IBV_QPT_UD:
                if (mthca_is_memfree(pd->context))
                        size += sizeof (struct mthca_arbel_ud_seg);
                else
                        size += sizeof (struct mthca_tavor_ud_seg);
                break;
        default:
                /* bind seg is as big as atomic + raddr segs */
                size += sizeof (struct mthca_bind_seg);
        }

---->

        for (qp->sq.wqe_shift = 6; 1 << qp->sq.wqe_shift < size;
             qp->sq.wqe_shift++)
                ; /* nothing */


The problem here is that size is bigger than 0x40000000.
As a result 1 << qp->sq.wqe_shift reaches 0x80000000, which is negative as a
signed int, so it's still less than size, and everything starts all over again.
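
To make the failure mode concrete, here is a minimal sketch of a bounded
version of that loop. The helper name wqe_shift_for is hypothetical and not
part of libmthca, and the sketch assumes a 32-bit int. It stops at shift 30
instead of letting 1 << shift overflow, and reports failure when no power of
two that fits in an int can cover size.

```c
/* Hypothetical helper (not part of libmthca): return the smallest
 * shift >= 6 such that (1 << shift) >= size, or -1 if no such shift
 * fits in a 32-bit signed int. */
static int wqe_shift_for(int size)
{
    int shift;

    if (size < 0)
        return -1;              /* size itself has already overflowed */

    /* Stop at 30: 1 << 31 does not fit in a 32-bit signed int. */
    for (shift = 6; shift <= 30; shift++)
        if ((1 << shift) >= size)
            return shift;

    return -1;                  /* size is bigger than 0x40000000 */
}
```

With this bound, a size of 0x40000000 still yields shift 30, while anything
larger is rejected rather than looping.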

Looking at the code, passing insanely huge values in qp params
will cause all kinds of overflows (e.g. size could become negative).

I think the best way is to check qp parameters for sanity in
mthca_create_qp.
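
A rough sketch of what such a check could look like follows. Everything in it
is an assumption made for illustration: struct qp_cap is a local stand-in
with the same field names as the capability fields set in the test above,
the SKETCH_* limits are made-up numbers (a real driver would take its limits
from the device), and qp_cap_check is a hypothetical name, not an existing
mthca function.

```c
#include <errno.h>
#include <stdint.h>

/* Local stand-in for the QP capability fields, so the sketch compiles
 * without <infiniband/verbs.h>. */
struct qp_cap {
    uint32_t max_send_wr;
    uint32_t max_recv_wr;
    uint32_t max_send_sge;
    uint32_t max_recv_sge;
    uint32_t max_inline_data;
};

/* Made-up limits, for illustration only. */
enum {
    SKETCH_MAX_WR     = 65536,
    SKETCH_MAX_SGE    = 64,
    SKETCH_MAX_INLINE = 1024
};

/* Hypothetical check: return 0 if every field is within the limits,
 * -EINVAL otherwise, before any size arithmetic is done. */
static int qp_cap_check(const struct qp_cap *cap)
{
    if (cap->max_send_wr     > SKETCH_MAX_WR  ||
        cap->max_recv_wr     > SKETCH_MAX_WR  ||
        cap->max_send_sge    > SKETCH_MAX_SGE ||
        cap->max_recv_sge    > SKETCH_MAX_SGE ||
        cap->max_inline_data > SKETCH_MAX_INLINE)
        return -EINVAL;

    return 0;
}
```

Rejecting the request up front keeps every later size computation within a
range where it cannot overflow.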

-- 
MST