[ofa-general] Re: CM Enable SRQ for less than 16 s/g - bug

Pradeep Satyanarayana pradeeps at linux.vnet.ibm.com
Sun Jan 27 15:17:01 PST 2008


Eli Cohen wrote:
> I meant that the test hangs but not the system. You can still ping hosts on the ipoib interface, it is just that the test never ends. You can press Ctrl C and restart the test again.
> 

If one can Ctrl-C the test, that means it is not hung in the kernel. Several
things strike me:
a) Is this a new version of the test?
b) Was the system left in an "unclean" state from the previous test in the
regression suite?
c) Can this test hang be reproduced by just running this test on a freshly
booted system?
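
As a rough way to check the Ctrl-C point, one could look at the process state
the kernel reports while the test appears stuck. The sketch below is Python and
is not part of the test or the patch. The helper names are hypothetical, and it
assumes a Linux /proc filesystem. A task in uninterruptible sleep (state "D")
does not act on SIGINT until it wakes, whereas a task in interruptible sleep
(state "S") does.

```python
def parse_state(status_text):
    """Return the one-letter state code from the text of /proc/<pid>/status.

    Returns None if no "State:" line is present.
    """
    for line in status_text.splitlines():
        if line.startswith("State:"):
            fields = line.split()
            # A typical line looks like: "State:\tS (sleeping)"
            return fields[1] if len(fields) > 1 else None
    return None


def read_state(pid):
    """Read the state code of a live process (assumes Linux /proc)."""
    with open("/proc/%d/status" % pid) as f:
        return parse_state(f.read())


def is_uninterruptible(state):
    """True for "D" (uninterruptible sleep), which Ctrl-C cannot break."""
    return state == "D"
```

Calling read_state() with the PID of the stuck ttcpv process would show which
case applies. A state of "S" or "R" would be consistent with the test
responding to Ctrl-C.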

Right now I do not have access to the machines to run a test. I will try and
do it next week.

Pradeep
> -----Original Message-----
> From: Pradeep Satyanarayana [mailto:pradeeps at linux.vnet.ibm.com] 
> Sent: Sunday, 27 January 2008 20:06
> To: Eli Cohen
> Cc: Shirley Ma; openfabrics; Dotan Barak
> Subject: Re: CM Enable SRQ for less than 16 s/g - bug
> 
> Eli Cohen wrote:
>> This commit b150c30c28976f0dcf96bb28780ae62897264c54 introduces a 
>> problem in IPOIB CM.
>>
>> failure description:
>> test hangs.
>>
>> bug was found by Mellanox regression.
>>
>> test info:
>> server:
>> ttcpv -s -r -p 19033 -l 100000
>>
>> client:
>> ttcpv -s -t 11.4.3.112 -p 19033 -l 100000 -n 8192
>>
> ...
>>
>> Can you take a look at this?
>>
> Sure, can you provide some more details about this hang, such as a stack trace?
> The bulk of the changes this patch introduces are in ipoib_cm_dev_init(), so it should not affect the send and receive paths. I am not sure why the hang occurs.
> 
> Pradeep
> 
> 




