[ofa-general] Bug with SDP on IA64
Dotan Barak
dotanba at gmail.com
Mon Oct 27 05:08:25 PDT 2008
On Mon, Oct 27, 2008 at 11:09 AM, Nicolas Morey Chaisemartin
<nicolas.morey-chaisemartin at ext.bull.net> wrote:
> Amir Vadai wrote:
>>
>> I asked our IB expert Jack for hints and he told me this:
>>
>>
>> From Section 11.6.2 (COMPLETION RETURN STATUS) of the IB Spec volume 1,
>> revision 1.2.1
>> * Local Length Error - ... Generated for a
>> Work Request posted to the local Receive Queue when the sum of
>> the Data Segment lengths is too small to receive a valid incoming
>> message or the length of the incoming message is greater than the
>> maximum message size supported by the HCA port that received the
>> message.
>>
>>
>> There seem to be 2 possibilities:
>> 1. The receiver did not post enough/large-enough scatter gather entries in
>> the receive queue.
>>
>>
>> or 2. The sender sent a 0-length packet, but did so incorrectly.
>> (if any of the s/g entries (i.e., data segment entries) have a zero
>> byte count, this results in 2 GigaBytes of data being sent over the
>> wire).
>>
>>
>> I note that SDP does not check for this (see sdp_post_send() in file
>> sdp_bcopy.c:
>> the sge->length field is not checked for zero length).
>>
>>
>> Regarding how to debug this, you need to talk with an sdp expert to see if
>> sdp may try
>> to send 0-length packets under stress ([Amir]: I can help you with this).
>>
>>
>
> I've just run a few more tests.
> I added a test in sdp_post_send to check the sge->length field:
> if(sge->length == 0){printk(KERN_ERR "SDP sending 0bytes packet\n");}
Please pay attention: an sge->length of 0 means that you send 2GB, not 0 bytes.
If you want to send 0 bytes, the sg_list should be empty (0 entries).
This is why you get a length violation ...
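
To make the rule concrete, here is a minimal user-space sketch of how the
message size follows from the s/g list. It is only an illustration: struct
demo_sge and the demo_* helpers are hypothetical names made up for this
example, not the real verbs structures or anything from sdp_bcopy.c, and the
2^31 value assumes that "2GB" above means exactly 2^31 bytes.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-in for a scatter/gather entry (not the real ib_sge). */
struct demo_sge {
    uint64_t addr;
    uint32_t length;
    uint32_t lkey;
};

/* Bytes one entry contributes: a length of 0 is treated as 2^31 bytes. */
static uint64_t demo_sge_wire_bytes(const struct demo_sge *sge)
{
    return sge->length == 0 ? (UINT64_C(1) << 31) : sge->length;
}

/* Total message size for a send with num_sge entries in sg_list. */
static uint64_t demo_message_bytes(const struct demo_sge *sg_list, int num_sge)
{
    uint64_t total = 0;

    for (int i = 0; i < num_sge; i++)
        total += demo_sge_wire_bytes(&sg_list[i]);
    return total;
}

/* Returns 1 if any entry has length 0, i.e. would go out as 2 GB. */
static int demo_has_zero_length_sge(const struct demo_sge *sg_list, int num_sge)
{
    for (int i = 0; i < num_sge; i++)
        if (sg_list[i].length == 0)
            return 1;
    return 0;
}
```

With these helpers, a list holding one entry of length 0 yields a 2^31-byte
message, while an empty list (num_sge == 0) yields a true 0-byte message. A
debug check in the send path would therefore report a zero-length entry as
an oversized send, not as a "0 bytes" packet.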
Dotan