[ofa-general] mthca max_sge value... ugh.

Roland Dreier rdreier at cisco.com
Thu May 15 15:22:12 PDT 2008


 > We've been hit by this twice this week on two NFS/RDMA servers, so I'm
 > glad to see this! But, for us it happens with memless ConnectX - our mthca
 > devices are ok (but OTOH they're memfull not memfree)

Strange... as I said before, something seems to have changed to affect
this, though I have no idea what.  I'm including the test program I use
to check whether QP creation succeeds; you can run it on any suspect
systems and see what it prints.

 > I'll be happy to test it with our misbehaving cards, but I can't do it until
 > next week since they just went into a box for shipping. In the meantime,
 > dare I ask - what's different about memfree cards that limits the sge
 > attributes like this? And, what values result from the new code? The
 > ConnectX ones I have report 32, and fail when trying to set that.

The patch doesn't change ConnectX -- creating a QP with max send/recv
sge of 32 works fine for me here with mlx4 from 2.6.26-rc2.  For
mem-free the new max_sge reported is 27 SGE entries, and for mem-full it
is 59 (and creating such QPs succeeds, of course).  The difference
between mem-free and mem-full that matters here is just that on
mem-free, max_sge runs into the maximum WQE size, and the code didn't
handle that correctly without the patch.
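
Roughly, the constraint is just arithmetic: a WQE has a fixed amount of
header/control overhead, each SGE costs one 16-byte data segment, and
the driver has to report the smaller of what fits in a WQE and what the
firmware advertises.  Here's a back-of-the-envelope sketch of that
calculation (the constants are made-up illustrations, not the real
mthca limits, and the real driver uses per-transport overheads):

#include <stdio.h>

/* Illustrative numbers only -- not the actual mthca constants. */
#define MAX_DESC_SZ	512	/* assumed maximum WQE size in bytes */
#define WQE_OVERHEAD	64	/* assumed fixed per-WQE segment overhead */
#define DATA_SEG_SZ	16	/* one SGE: 4-byte length + 4-byte lkey + 8-byte address */

int main(void)
{
	int hca_max_sge = 32;	/* what the firmware itself advertises */
	int wqe_max_sge = (MAX_DESC_SZ - WQE_OVERHEAD) / DATA_SEG_SZ;

	/* The usable limit is whichever constraint bites first. */
	int max_sge = hca_max_sge < wqe_max_sge ? hca_max_sge : wqe_max_sge;

	printf("usable max_sge = %d\n", max_sge);
	return 0;
}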

Here's the test program to check QP creation vs reported max_sge:

#include <stdio.h>
#include <string.h>

#include <infiniband/verbs.h>

int main(int argc, char *argv[])
{
	struct ibv_device      **dev_list;
	struct ibv_device_attr	 dev_attr;
	struct ibv_context	*context;
	struct ibv_pd		*pd;
	struct ibv_cq		*cq;
	struct ibv_qp_init_attr  qp_attr;
	int			 t;
	static const struct {
		enum ibv_qp_type type;
		char		*name;
	}			 type_tab[] = {
		{ IBV_QPT_RC, "RC" },
		{ IBV_QPT_UC, "UC" },
		{ IBV_QPT_UD, "UD" },
	};

	dev_list = ibv_get_device_list(NULL);
	if (!dev_list) {
		printf("No RDMA devices found\n");
		return 1;
	}

	for (; *dev_list; ++dev_list) {
		printf("%s:\n", ibv_get_device_name(*dev_list));

		context = ibv_open_device(*dev_list);
		if (!context) {
			printf("  ibv_open_device failed\n");
			continue;
		}

		if (ibv_query_device(context, &dev_attr)) {
			printf("  ibv_query_device failed\n");
			continue;
		}

		cq = ibv_create_cq(context, 1, NULL, NULL, 0);
		if (!cq) {
			printf("  ibv_create_cq failed\n");
			continue;
		}

		pd = ibv_alloc_pd(context);
		if (!pd) {
			printf("  ibv_alloc_pd failed\n");
			continue;
		}

		for (t = 0; t < sizeof type_tab / sizeof type_tab[0]; ++t) {
			memset(&qp_attr, 0, sizeof qp_attr);

			qp_attr.send_cq = cq;
			qp_attr.recv_cq = cq;
			qp_attr.cap.max_send_wr = 1;
			qp_attr.cap.max_recv_wr = 1;
			qp_attr.cap.max_send_sge = dev_attr.max_sge;
			qp_attr.cap.max_recv_sge = dev_attr.max_sge;
			qp_attr.qp_type = type_tab[t].type;

			printf("  %s: SGE %d ", type_tab[t].name, dev_attr.max_sge);

			if (ibv_create_qp(pd, &qp_attr))
				printf("ok (got %d/%d)\n",
				       qp_attr.cap.max_send_sge,
				       qp_attr.cap.max_recv_sge);
			else
				printf("FAILED\n");
		}
	}

	return 0;
}
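
(To build it you need the libibverbs development headers; something
like "gcc -o qp_sge_test qp_sge_test.c -libverbs" should do, then run
the binary on the suspect machine.)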
