[ofa-general] Bug with SDP on IA64

Amir Vadai amirv at mellanox.co.il
Mon Oct 27 06:32:55 PDT 2008


I opened a bug in bugzilla with your research:
https://bugs.openfabrics.org/show_bug.cgi?id=1311

Nicolas Morey Chaisemartin wrote:
> Amir Vadai wrote:
>> I asked our IB expert Jack for hints and he told me this:
>>
>>
>> From Section 11.6.2 (COMPLETION RETURN STATUS) of the IB Spec, volume
>> 1, revision 1.2.1:
>> * Local Length Error - ... Generated for a
>>   Work Request posted to the local Receive Queue when the sum of
>>   the Data Segment lengths is too small to receive a valid incoming
>>   message or the length of the incoming message is greater than the
>>   maximum message size supported by the HCA port that received the
>>   message.
>>
>>
>> There seem to be 2 possibilities:
>> 1. The receiver did not post enough/large-enough scatter/gather
>>    entries in the receive queue.
>>
>>
>> or 2. The sender sent a 0-length packet, but did so incorrectly
>>    (if any of the s/g entries (i.e., data segment entries) has a zero
>>    byte count, this results in 2 gigabytes of data being sent over
>>    the wire).
>>
>>
>>    I note that SDP does not check for this (see sdp_post_send() in
>>    file sdp_bcopy.c: the sge->length field is not checked for zero
>>    length).
>>
>>   
>
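
To make the two possibilities above concrete, here is a minimal sketch
of the corresponding checks. It is not taken from the SDP sources:
struct demo_sge and both function names are hypothetical, and the
struct only mirrors the addr/length/lkey fields of the kernel's
struct ib_sge.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a scatter/gather entry (assumption: same
 * three fields as the kernel's struct ib_sge). */
struct demo_sge {
    uint64_t addr;
    uint32_t length;
    uint32_t lkey;
};

/* Possibility 1: the posted receive entries must add up to at least
 * the length of the incoming message. Returns 1 if they do, else 0. */
int recv_sges_can_hold(const struct demo_sge *sge, size_t n,
                       uint64_t msg_len)
{
    uint64_t total = 0;

    assert(sge != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += sge[i].length;
    return total >= msg_len;
}

/* Possibility 2: find a zero-length entry before posting a send.
 * Returns the index of the first such entry, or -1 if there is none. */
int first_zero_length_sge(const struct demo_sge *sge, size_t n)
{
    assert(sge != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (sge[i].length == 0)
            return (int)i;
    return -1;
}
```

A caller could run first_zero_length_sge() on the send list and refuse
to post when the result is not -1.
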
> I think I got it.
> In sdp_cma.c/sdp_response_handler,
> the fragment size is retrieved through
>        sdp_sk(sk)->xmit_size_goal = ntohl(h->actrcvsz) -
>                sizeof(struct sdp_bsdh);
> The dmesg output shows:
> sdp_sock(41820:0): sdp_response_handler bufs 64 xmit_size_goal 34816 send trigger 16
>
> I forced this value to 2048 and then it works.
> On Xeon this size is 2048 by default.
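
For reference, the computation quoted above can be sketched in user
space as follows. This is only an illustration: struct demo_bsdh is a
hypothetical 16-byte stand-in for struct sdp_bsdh (the real layout is
in the SDP sources), and demo_cap_goal() merely mimics the experiment
of forcing the value to 2048; it is not a proposed fix.

```c
#include <assert.h>
#include <arpa/inet.h>  /* ntohl */
#include <stdint.h>

/* Hypothetical stand-in for struct sdp_bsdh (assumption: 16 bytes). */
struct demo_bsdh {
    uint8_t  mid;
    uint8_t  flags;
    uint16_t bufs;
    uint32_t len;
    uint32_t mseq;
    uint32_t mseq_ack;
};

/* Payload size per send: the peer's advertised receive size (network
 * byte order on the wire) minus the header size. */
uint32_t demo_xmit_size_goal(uint32_t actrcvsz_be)
{
    uint32_t actrcvsz = ntohl(actrcvsz_be);

    assert(actrcvsz >= sizeof(struct demo_bsdh));
    return actrcvsz - (uint32_t)sizeof(struct demo_bsdh);
}

/* Mimics the experiment above: never use a goal larger than cap. */
uint32_t demo_cap_goal(uint32_t goal, uint32_t cap)
{
    return goal < cap ? goal : cap;
}
```

With these definitions, demo_cap_goal(34816, 2048) yields 2048, the
value that was reported to work.
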
>
> In my understanding, xmit_size_goal is the size of the receiving
> buffer for buffered copies, isn't it?
> So it shouldn't really matter as long as the packet is properly split
> at the MTU size to be sent over the network, right?
> Could it be working on x86/x86_64 only because the buffer size is
> smaller than the MTU?
>
> Nicolas
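
As a rough illustration of the MTU question above, the sketch below
counts how many MTU-sized packets one message needs. The function name
is hypothetical and the 2048-byte MTU used in the note is an assumption
for the example, not a value taken from the logs.

```c
#include <assert.h>
#include <stdint.h>

/* Number of packets needed to carry msg_len payload bytes when each
 * packet holds at most mtu bytes. */
uint32_t demo_packets_for_message(uint32_t msg_len, uint32_t mtu)
{
    assert(mtu > 0);
    if (msg_len == 0)
        return 1;  /* a zero-length message still takes one packet */
    /* Ceiling division without risking overflow of msg_len + mtu. */
    return msg_len / mtu + (msg_len % mtu != 0);
}
```

Assuming a 2048-byte MTU, a 34816-byte message needs 17 packets and a
2048-byte message needs exactly one, so only the larger size exercises
multi-packet segmentation.
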



