[ewg] Interop test failure using OFED-3.5 RC4
Woodruff, Robert J
robert.j.woodruff at intel.com
Mon Jan 14 09:52:10 PST 2013
Were you able to get the new package posted yet?
We need this ASAP so we can do another OFED-3.5 RC.
Woody
-----Original Message-----
From: Ido Shamai [mailto:idos at dev.mellanox.co.il]
Sent: Friday, January 11, 2013 12:32 PM
To: Marciniszyn, Mike
Cc: Woodruff, Robert J; Elken, Tom; ewg at lists.openfabrics.org; Hefty, Sean; Mascarenhas, Edward
Subject: Re: Interop test failure using OFED-3.5 RC4
On 1/11/2013 9:36 PM, Marciniszyn, Mike wrote:
> I've opened OFED bz 2410 for this issue.
>
> Mike
Great, thanks.
I will apply the patch and release a new version to the OFED website
tomorrow morning.
Ido
>> -----Original Message-----
>> From: Woodruff, Robert J
>> Sent: Friday, January 11, 2013 1:30 PM
>> To: Marciniszyn, Mike; Elken, Tom; ewg at lists.openfabrics.org; Ido Shamai
>> Subject: RE: Interop test failure using OFED-3.5 RC4
>>
>>
>> Adding Shamai from Mellanox to this thread.
>>
>> Woody
>>
>> -----Original Message-----
>> From: ewg-bounces at lists.openfabrics.org [mailto:ewg-
>> bounces at lists.openfabrics.org] On Behalf Of Marciniszyn, Mike
>> Sent: Friday, January 11, 2013 7:51 AM
>> To: Elken, Tom; ewg at lists.openfabrics.org
>> Subject: Re: [ewg] Interop test failure using OFED-3.5 RC4
>>
>> This is definitely a perftest bug.
>>
>> Perftest 2.0 is a significant rewrite of these utilities, and this bug is a
>> regression in the routine ctx_set_out_reads().
>>
>> In 1.4 the code is this:
>> /******************************************************************************
>>  *
>>  ******************************************************************************/
>> static int ctx_set_out_reads(struct ibv_context *context,int num_user_reads) {
>>
>>     int max_reads;
>>
>>     max_reads = (is_dev_hermon(context) == HERMON) ?
>>                 MAX_OUT_READ_HERMON : MAX_OUT_READ;   <---------------
>>
>>     if (num_user_reads > max_reads) {
>>         fprintf(stderr," Number of outstanding reads is above max = %d\n",max_reads);
>>         fprintf(stderr," Changing to that max value\n");
>>         num_user_reads = max_reads;
>>     }
>>     else if (num_user_reads <= 0) {
>>         num_user_reads = max_reads;
>>     }
>>
>>     return num_user_reads;
>> }
>>
>> The new 2.0 code is:
>> /******************************************************************************
>>  *
>>  ******************************************************************************/
>> static int ctx_set_out_reads(struct ibv_context *context,int num_user_reads) {
>>
>>     int max_reads;
>>
>>     Device ib_fdev = ib_dev_name(context);
>>
>>     switch (ib_fdev) {
>>         case CONNECTIB : ;
>>         case CONNECTX3 : ;
>>         case CONNECTX2 : ;
>>         case CONNECTX  : max_reads = MAX_OUT_READ_HERMON; break;
>>         case LEGACY    : max_reads = MAX_OUT_READ; break;
>>         default        : max_reads = 0;   <--------------------
>>     }
>>
>>     if (num_user_reads > max_reads) {
>>         printf(RESULT_LINE);
>>         fprintf(stderr," Number of outstanding reads is above max = %d\n",max_reads);
>>         fprintf(stderr," Changing to that max value\n");
>>         num_user_reads = max_reads;
>>     }
>>     else if (num_user_reads <= 0) {
>>         num_user_reads = max_reads;
>>     }
>>
>>     return num_user_reads;
>> }
>>
>> For any other HCA (qib and probably others), the old code returns
>> MAX_OUT_READ, while the new code returns 0.
>>
>> I have a patch that works, while preserving the desired hardcoded values for
>> "known/legacy" devices:
>> +
>> +/******************************************************************************
>> + *
>> + ******************************************************************************/
>> +static int device_max_reads(struct ibv_context *context) {
>> +    struct ibv_device_attr attr;
>> +    int ret = 0;
>> +
>> +    if (!ibv_query_device(context,&attr)) {
>> +        ret = attr.max_qp_rd_atom;
>> +    }
>> +    return ret;
>> +}
>> +
>>  /******************************************************************************
>>   *
>>   ******************************************************************************/
>> @@ -496,7 +510,7 @@ static int ctx_set_out_reads(struct ibv_
>>          case CONNECTX2 : ;
>>          case CONNECTX : max_reads = MAX_OUT_READ_HERMON; break;
>>          case LEGACY : max_reads = MAX_OUT_READ; break;
>> -        default : max_reads = 0;
>> +        default : max_reads = device_max_reads(context);
>>      }
>>
>>      if (num_user_reads > max_reads) {
>>
>> I'm curious why both the old and the new code use hardcoded values.
>>
>> Mike
>> _______________________________________________
>> ewg mailing list
>> ewg at lists.openfabrics.org
>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
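
The regression and the proposed fix quoted above can be reproduced without any hardware. What follows is only a minimal sketch under stated assumptions: the constant values, the reduced Device enum, and the helper names clamp_out_reads(), max_reads_v20() and max_reads_patched() are hypothetical stand-ins and are not perftest code, and the queried_max parameter stands in for the attr.max_qp_rd_atom value that ibv_query_device() would report.

```c
#include <stdio.h>

/* Assumed, illustrative values; the thread does not state the real ones. */
#define MAX_OUT_READ_HERMON 16
#define MAX_OUT_READ        4

/* Hypothetical, reduced stand-in for perftest's Device type. */
typedef enum { DEV_CONNECTX, DEV_LEGACY, DEV_OTHER } Device;

/* Clamp step shared by the 1.4 and 2.0 versions of ctx_set_out_reads(). */
static int clamp_out_reads(int max_reads, int num_user_reads)
{
    if (num_user_reads > max_reads) {
        fprintf(stderr, " Number of outstanding reads is above max = %d\n",
                max_reads);
        fprintf(stderr, " Changing to that max value\n");
        return max_reads;
    }
    if (num_user_reads <= 0)
        return max_reads;
    return num_user_reads;
}

/* 2.0 behaviour: a device that is not recognised gets a limit of 0. */
static int max_reads_v20(Device dev)
{
    switch (dev) {
    case DEV_CONNECTX: return MAX_OUT_READ_HERMON;
    case DEV_LEGACY:   return MAX_OUT_READ;
    default:           return 0;
    }
}

/* Patched behaviour: fall back to the limit the device itself reports.
 * queried_max is an assumption standing in for attr.max_qp_rd_atom. */
static int max_reads_patched(Device dev, int queried_max)
{
    switch (dev) {
    case DEV_CONNECTX: return MAX_OUT_READ_HERMON;
    case DEV_LEGACY:   return MAX_OUT_READ;
    default:           return queried_max;
    }
}
```

With these stand-ins, clamp_out_reads(max_reads_v20(DEV_OTHER), n) yields 0 for every n, which is the failure described for qib, while clamp_out_reads(max_reads_patched(DEV_OTHER, 16), 4) keeps the requested 4 and the hardcoded limits for recognised devices are unchanged.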