[ewg] Interop test failure using OFED-3.5 RC4
Marciniszyn, Mike
mike.marciniszyn at intel.com
Fri Jan 11 11:36:38 PST 2013
I've opened OFED bz 2410 for this issue.
Mike
> -----Original Message-----
> From: Woodruff, Robert J
> Sent: Friday, January 11, 2013 1:30 PM
> To: Marciniszyn, Mike; Elken, Tom; ewg at lists.openfabrics.org; Ido Shamai
> Subject: RE: Interop test failure using OFED-3.5 RC4
>
>
> Adding Shamai from Mellanox to this thread.
>
> Woody
>
> -----Original Message-----
> From: ewg-bounces at lists.openfabrics.org [mailto:ewg-bounces at lists.openfabrics.org] On Behalf Of Marciniszyn, Mike
> Sent: Friday, January 11, 2013 7:51 AM
> To: Elken, Tom; ewg at lists.openfabrics.org
> Subject: Re: [ewg] Interop test failure using OFED-3.5 RC4
>
> This is definitely a perftest bug.
>
> The 2.0 release is a significant rewrite of these utilities, and this bug is a
> regression in the routine ctx_set_out_reads().
>
> In 1.4 the code is this:
> /******************************************************************************
>  *
>  ******************************************************************************/
> static int ctx_set_out_reads(struct ibv_context *context,int num_user_reads) {
>
>     int max_reads;
>
>     max_reads = (is_dev_hermon(context) == HERMON) ?
>                 MAX_OUT_READ_HERMON : MAX_OUT_READ;  <---------------
>
>     if (num_user_reads > max_reads) {
>         fprintf(stderr," Number of outstanding reads is above max = %d\n",max_reads);
>         fprintf(stderr," Changing to that max value\n");
>         num_user_reads = max_reads;
>     }
>     else if (num_user_reads <= 0) {
>         num_user_reads = max_reads;
>     }
>
>     return num_user_reads;
> }
>
> The new 2.0 code is:
> /******************************************************************************
>  *
>  ******************************************************************************/
> static int ctx_set_out_reads(struct ibv_context *context,int num_user_reads) {
>
>     int max_reads;
>
>     Device ib_fdev = ib_dev_name(context);
>
>     switch (ib_fdev) {
>         case CONNECTIB : ;
>         case CONNECTX3 : ;
>         case CONNECTX2 : ;
>         case CONNECTX  : max_reads = MAX_OUT_READ_HERMON; break;
>         case LEGACY    : max_reads = MAX_OUT_READ; break;
>         default        : max_reads = 0;  <--------------------
>     }
>
>     if (num_user_reads > max_reads) {
>         printf(RESULT_LINE);
>         fprintf(stderr," Number of outstanding reads is above max = %d\n",max_reads);
>         fprintf(stderr," Changing to that max value\n");
>         num_user_reads = max_reads;
>     }
>     else if (num_user_reads <= 0) {
>         num_user_reads = max_reads;
>     }
>
>     return num_user_reads;
> }
>
> For any HCA not in that list (qib and probably others), the old code falls
> back to MAX_OUT_READ as the limit, while the new code sets the limit to 0 and
> therefore always returns 0.
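
To make the difference concrete, here is a minimal standalone sketch of the two
code paths. It is not the perftest source: the enum sim_device, the helper
clamp_reads() and the function names out_reads_v14()/out_reads_v20() are made
up for illustration, the device lookup is replaced by a plain enum argument,
and the two limit values are placeholders assumed here rather than taken from
the perftest headers.

```c
#include <assert.h>

/* Placeholder limits (assumed values; the real ones live in the perftest headers). */
#define MAX_OUT_READ_HERMON 16
#define MAX_OUT_READ        4

/* Hypothetical stand-in for the device type that perftest derives from the context. */
typedef enum { DEV_UNKNOWN, DEV_CONNECTX, DEV_LEGACY } sim_device;

/* Clamp logic shared by both versions: out-of-range or unset requests become the max. */
static int clamp_reads(int num_user_reads, int max_reads)
{
    assert(max_reads >= 0);  /* a negative limit would be a programming error */
    if (num_user_reads > max_reads || num_user_reads <= 0)
        return max_reads;
    return num_user_reads;
}

/* 1.4-style selection: everything that is not Hermon gets MAX_OUT_READ. */
int out_reads_v14(sim_device dev, int num_user_reads)
{
    int max_reads = (dev == DEV_CONNECTX) ? MAX_OUT_READ_HERMON : MAX_OUT_READ;
    return clamp_reads(num_user_reads, max_reads);
}

/* 2.0-style selection: an unrecognized device falls into default and gets 0. */
int out_reads_v20(sim_device dev, int num_user_reads)
{
    int max_reads;

    switch (dev) {
    case DEV_CONNECTX: max_reads = MAX_OUT_READ_HERMON; break;
    case DEV_LEGACY:   max_reads = MAX_OUT_READ;        break;
    default:           max_reads = 0;
    }
    return clamp_reads(num_user_reads, max_reads);
}
```

With an unrecognized device, out_reads_v14() still yields a usable value, while
out_reads_v20() yields 0 regardless of what the user asked for.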
>
> I have a patch that works, while preserving the desired hardcoded values for
> "known/legacy" devices:
> +
> +/******************************************************************************
> + *
> + ******************************************************************************/
> +static int device_max_reads(struct ibv_context *context) {
> +    struct ibv_device_attr attr;
> +    int ret = 0;
> +
> +    if (!ibv_query_device(context,&attr)) {
> +        ret = attr.max_qp_rd_atom;
> +    }
> +    return ret;
> +}
> +
>
>  /******************************************************************************
>   *
>   ******************************************************************************/
> @@ -496,7 +510,7 @@ static int ctx_set_out_reads(struct ibv_
>          case CONNECTX2 : ;
>          case CONNECTX  : max_reads = MAX_OUT_READ_HERMON; break;
>          case LEGACY    : max_reads = MAX_OUT_READ; break;
> -        default        : max_reads = 0;
> +        default        : max_reads = device_max_reads(context);
>      }
>
>      if (num_user_reads > max_reads) {
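
The effect of the patch can be tried without hardware using a rough sketch like
the one below. It is only a model under stated assumptions: struct sim_attr,
sim_query_fn, the two stub query functions and out_reads_patched() are
hypothetical names standing in for struct ibv_device_attr, ibv_query_device()
and the patched ctx_set_out_reads(). The limit values and the value reported by
the stub are placeholders. The one property carried over from libibverbs is
that the query returns 0 on success.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder limits (assumed values, not taken from the perftest headers). */
#define MAX_OUT_READ_HERMON 16
#define MAX_OUT_READ        4

typedef enum { SIM_UNKNOWN, SIM_CONNECTX, SIM_LEGACY } sim_dev;

/* Hypothetical stand-in for struct ibv_device_attr: only the field the patch reads. */
struct sim_attr { int max_qp_rd_atom; };

/* Hypothetical query callback; like ibv_query_device(), 0 means success. */
typedef int (*sim_query_fn)(struct sim_attr *attr);

/* Stub device that reports 16 outstanding reads (arbitrary example value). */
int sim_query_ok(struct sim_attr *attr) { attr->max_qp_rd_atom = 16; return 0; }

/* Stub device whose query fails. */
int sim_query_fail(struct sim_attr *attr) { (void)attr; return -1; }

/* Mirrors device_max_reads() from the patch: 0 if the query fails. */
static int sim_device_max_reads(sim_query_fn query)
{
    struct sim_attr attr;
    int ret = 0;

    if (!query(&attr))
        ret = attr.max_qp_rd_atom;
    return ret;
}

/* Patched selection: known devices keep hardcoded limits, others ask the device. */
int out_reads_patched(sim_dev dev, sim_query_fn query, int num_user_reads)
{
    int max_reads;

    assert(query != NULL);
    switch (dev) {
    case SIM_CONNECTX: max_reads = MAX_OUT_READ_HERMON; break;
    case SIM_LEGACY:   max_reads = MAX_OUT_READ;        break;
    default:           max_reads = sim_device_max_reads(query);
    }
    if (num_user_reads > max_reads || num_user_reads <= 0)
        num_user_reads = max_reads;
    return num_user_reads;
}
```

Note that a failed query still leaves the limit at 0 in this model, just as in
the patch, so the fallback only helps when the device answers the query.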
>
> I'm curious why both the old and new code use hardcoded values.
>
> Mike