[Openib-windows] [PATCH v2] Handle RMPP send payload < MAD bu ffer length

Tzachi Dar tzachid at mellanox.co.il
Thu Oct 6 08:42:50 PDT 2005


I have looked at my suggested fix, and it obviously only makes things worse
than before.
 
It seems that the following fix might work however:
 
 
        CL_ASSERT( h_send->p_send_mad->size != offset );
        if( (h_send->p_send_mad->size - offset) < max_len)
            max_len = h_send->p_send_mad->size - offset ;
 
        cl_memcpy(
            p_rmpp_dst + hdr_len, p_rmpp_src + offset, max_len );
 
I hope this helps.
 
Thanks
Tzachi
 
-----Original Message-----
From: Tzachi Dar [mailto:tzachid at mellanox.co.il] 
Sent: Thursday, October 06, 2005 4:45 PM
To: 'Fab Tillier'; openib-windows at openib.org
Subject: RE: [Openib-windows] [PATCH v2] Handle RMPP send payload < MAD bu
ffer length
 
Hi Fab, 
Please add the patch also as an attachment to your mail, since Exchange
seems to break the text no meter what we do. 
In any case I have manually applied the patch and received a blue screen
after a few hours. (Actually I was receiving this even without your change
from time to time). 
I'm not so familiar with rmpp, but I think that I know where the problem is:

Replace the line: 
        max_len = h_send->p_send_mad->size - offset; 
with 
        max_len = h_send->p_send_mad->size - offset - hdr_len; 
(of course I might be wrong here). 
I'll try to run with this change and let you know if there are still
problems. Please note that as the problem is very hard to reproduce it will
be hard to tell that this is indeed a fix.
Thanks 
Tzachi 
 
More info: 
The line that caused the problem was 
       cl_memcpy( 
            p_rmpp_dst + hdr_len, p_rmpp_src + hdr_len + offset, max_len ); 
        } 
The immediate problem was in the source part - that is: 
p_rmpp_src + hdr_len + offset + max_len was crossing a page boundary and
this page was not found. This immediately caused a blue screen.
 
Some more info: 
p_rmpp_hdr->seg_num == 1 
I believe that if you change max_len to be h_send->p_send_mad->size - offset
- hdr_len 
it will solve the problem (at least in this example): 
=> max_len = 90. And the copy operation will not access places behind this
page. 
The local variables are: 
      h_mad_reg = 0x813e733c 
         h_send = 0xfef74570 
        hdr_len = 0x38 
         offset = 0x38 
     p_rmpp_hdr = 0xfecf3ee8 
        max_len = 0xc8 
     p_rmpp_dst = 0x813beabc "???" 
      p_send_wr = 0xfef745a8 
   p_al_element = 0x813be9c0 
     p_rmpp_src = 0xfecf3ee8 "???" 
 
1: kd> dt -r2 h_send 
Local var @ 0xf50f0b2c Type _al_mad_send* 
0xfef74570 
   +0x000 pool_item        : _cl_pool_item 
      +0x000 list_item        : _cl_list_item 
         +0x000 p_next           : 0xfef3af10 
         +0x004 p_prev           : 0xfef3af10 
         +0x008 p_list           : 0xfef3af10 
      +0x00c pad              : 0x46200000 
      +0x010 p_pool           : 0x45484644 
         +0x000 num_components   : ?? 
         +0x004 component_sizes  : ???? 
         +0x008 p_components     : ???? 
         +0x00c num_objects      : ?? 
         +0x010 max_objects      : ?? 
         +0x014 grow_size        : ?? 
         +0x018 pfn_init         : ???? 
         +0x01c pfn_dtor         : ???? 
         +0x020 context          : ???? 
         +0x024 free_list        : _cl_qlist 
         +0x038 alloc_list       : _cl_qlist 
         +0x04c state            : ?? 
 (No matching name) 
   +0x014 p_send_mad       : 0x813be9e8 
      +0x000 p_next           : (null) 
      +0x008 context1         : 0x001505e8 
      +0x010 context2         : (null) 
      +0x018 p_mad_buf        : 0xfecf3ee8 
         +0x000 base_ver         : 0x1 '' 
         +0x001 mgmt_class       : 0x3 '' 
         +0x002 class_ver        : 0x2 '' 
         +0x003 method           : 0x92 '' 
         +0x004 status           : 0 
         +0x006 class_spec       : 0 
         +0x008 trans_id         : 0x9b000000`01000000 
         +0x010 attr_id          : 0x1100 
         +0x012 resv             : 0 
         +0x014 attr_mod         : 0 
      +0x020 size             : 0x118 
      +0x024 immediate_data   : 0 
      +0x028 remote_qp        : 0x1000000 
      +0x030 h_av             : 0x820f0854 
         +0x000 obj              : _al_obj 
         +0x0b8 h_ci_av          : 0x813bf248 
         +0x0c0 av_attr          : _ib_av_attr 
         +0x0f8 list_item        : _cl_list_item 
      +0x038 send_opt         : 4 
      +0x03c remote_qkey      : 0x180 
      +0x040 resp_expected    : 0 
      +0x044 timeout_ms       : 0x64 
      +0x048 retry_cnt        : 3 
      +0x04c rmpp_version     : 0 '' 
      +0x050 status           : d ( IB_WCS_UNKNOWN ) 
      +0x054 grh_valid        : 0 
      +0x058 p_grh            : 0x813bea94 
         +0x000 ver_class_flow   : 0 
         +0x004 resv1            : 0 
         +0x006 resv2            : 0 '' 
         +0x007 hop_limit        : 0 '' 
         +0x008 src_gid          : _ib_gid 
         +0x018 dest_gid         : _ib_gid 
      +0x060 recv_opt         : 0 
      +0x064 remote_lid       : 0 
      +0x066 remote_sl        : 0 '' 
      +0x068 pkey_index       : 0 
      +0x06a path_bits        : 0 '' 
      +0x070 send_context1    : (null) 
      +0x078 send_context2    : (null) 
   +0x018 h_av             : (null) 
   +0x020 mad_wr           : _al_mad_wr 
      +0x000 list_item        : _cl_list_item 
         +0x000 p_next           : (null) 
         +0x004 p_prev           : (null) 
         +0x008 p_list           : (null) 
      +0x00c client_id        : 1 
      +0x010 client_tid       : 0xd3510000`00000000 
      +0x018 send_wr          : _ib_send_wr 
         +0x000 p_next           : (null) 
         +0x008 wr_id            : 0 
         +0x010 wr_type          : 1 ( WR_SEND ) 
         +0x014 send_opt         : 4 
         +0x018 num_ds           : 0 
         +0x020 ds_array         : (null) 
         +0x028 immediate_data   : 0 
         +0x030 dgrm             : _send_dgrm 
         +0x050 remote_ops       : _send_remote_ops 
   +0x0a8 p_resp_mad       : (null) 
   +0x0b0 retry_time       : 0xffffffff`ffffffff 
   +0x0b8 delay            : 0 
   +0x0bc retry_cnt        : 3 
   +0x0c0 canceled         : 0 
   +0x0c4 uses_rmpp        : 1 
   +0x0c8 ack_seg          : 0 
   +0x0cc cur_seg          : 2 
   +0x0d0 seg_limit        : 1 
   +0x0d4 total_seg        : 2 
Memory read error 45484690 




>-----Original Message----- 
>From: Fab Tillier [mailto:ftillier at silverstorm.com
<mailto:ftillier at silverstorm.com> ] 
>Sent: Tuesday, October 04, 2005 2:42 AM 
>To: openib-windows at openib.org 
>Subject: RE: [Openib-windows] [PATCH v2] Handle RMPP send payload < MAD 
>buffer length 
> 
>> From: Tillier, Fabian 
>> Sent: Monday, October 03, 2005 4:12 PM 
>> 
>> Folks, 
>> 
>> I found a bug in sending MADs where the last segment of an RMPP send 
>would try 
>> to send a full payload's worth of data in the MAD, which could result in 
>> copying data beyond the end of the source buffer and the corresponding 
>BSOD. 
>> 
>> Here's a patch that corrects this.  It does not clear the remaining bytes

>of 
>> the MAD, as I wasn't sure it was needed.  Please take a look and confirm 
>that 
>> what I'm doing is sane, and I'll check it in. 
> 
>Here's an updated version that adds a line to clear the unused portion of 
>the 
>MAD payload.  This is probably being overly cautious, but I didn't want to 
>let a 
>previous MAD's potentially sensitive information be partially 
>retransmitted. 
> 
>- Fab 
> 
>Signed-off-by: Fab Tillier (ftillier at silverstorm.com) 
> 
>Index: core/al/al_mad.c 
>=================================================================== 
>--- core/al/al_mad.c    (revision 100) 
>+++ core/al/al_mad.c    (working copy) 
>@@ -1681,6 +1681,7 @@ 
>    al_mad_element_t    *p_al_element; 
>    ib_rmpp_mad_t       *p_rmpp_hdr; 
>    uint8_t             *p_rmpp_src, *p_rmpp_dst; 
>+   uintn_t             hdr_len, offset, max_len; 
> 
>    CL_ENTER( AL_DBG_MAD_SVC, g_al_dbg_lvl ); 
>    p_send_wr = &h_send->mad_wr.send_wr; 
>@@ -1702,28 +1703,32 @@ 
>        p_rmpp_dst = (uint8_t*)(uintn_t)p_al_element->mad_ds.vaddr; 
> #endif 
>        p_rmpp_src = (uint8_t* __ptr64)h_send->p_send_mad->p_mad_buf; 
>-       p_rmpp_hdr = (ib_rmpp_mad_t* __ptr64)h_send->p_send_mad->p_mad_buf;

>+       p_rmpp_hdr = (ib_rmpp_mad_t*)p_rmpp_src; 
> 
>        if( h_send->p_send_mad->p_mad_buf->mgmt_class == IB_MCLASS_SUBN_ADM

>) 
>-       { 
>-           /* Copy the header into the registered send buffer. */ 
>-           cl_memcpy( p_rmpp_dst, p_rmpp_src, IB_SA_MAD_HDR_SIZE ); 
>-           /* Copy this segment's payload into the registered send buffer.

>*/ 
>-           p_rmpp_dst = p_rmpp_dst + IB_SA_MAD_HDR_SIZE; 
>-           p_rmpp_src = p_rmpp_src + IB_SA_MAD_HDR_SIZE + 
>-               ( (cl_ntoh32( p_rmpp_hdr->seg_num ) - 1) * IB_SA_DATA_SIZE 
>); 
>-           cl_memcpy( p_rmpp_dst, p_rmpp_src, IB_SA_DATA_SIZE ); 
>-       } 
>+           hdr_len = IB_SA_MAD_HDR_SIZE; 
>        else 
>+           hdr_len = MAD_RMPP_HDR_SIZE; 
>+ 
>+       max_len = MAD_BLOCK_SIZE - hdr_len; 
>+ 
>+       offset = hdr_len + (max_len * (cl_ntoh32( p_rmpp_hdr->seg_num ) - 
>1)); 
>+ 
>+       /* Copy the header into the registered send buffer. */ 
>+       cl_memcpy( p_rmpp_dst, p_rmpp_src, hdr_len ); 
>+ 
>+       /* Copy this segment's payload into the registered send buffer. */ 
>+       CL_ASSERT( h_send->p_send_mad->size != offset ); 
>+       if( (h_send->p_send_mad->size - offset) < max_len ) 
>        { 
>-           /* Copy the header into the registered send buffer. */ 
>-           cl_memcpy( p_rmpp_dst, p_rmpp_src, MAD_RMPP_HDR_SIZE ); 
>-           /* Copy this segment's payload into the registered send buffer.

>*/ 
>-           p_rmpp_dst = p_rmpp_dst + MAD_RMPP_HDR_SIZE; 
>-           p_rmpp_src = p_rmpp_src + MAD_RMPP_HDR_SIZE + 
>-               ( (cl_ntoh32( p_rmpp_hdr->seg_num ) - 1) * 
>MAD_RMPP_DATA_SIZE ); 
>-           cl_memcpy( p_rmpp_dst, p_rmpp_src, MAD_RMPP_DATA_SIZE ); 
>+           max_len = h_send->p_send_mad->size - offset; 
>+           /* Clear unused payload. */ 
>+           cl_memclr( p_rmpp_dst + hdr_len + max_len, 
>+               MAD_BLOCK_SIZE - hdr_len - max_len ); 
>        } 
>+ 
>+       cl_memcpy( 
>+           p_rmpp_dst + hdr_len, p_rmpp_src + hdr_len + offset, max_len );

>    } 
> 
>    p_send_wr->num_ds = 1; 
> 
>_______________________________________________ 
>openib-windows mailing list 
>openib-windows at openib.org 
>http://openib.org/mailman/listinfo/openib-windows
<http://openib.org/mailman/listinfo/openib-windows>  
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/ofw/attachments/20051006/124f6bb8/attachment.html>


More information about the ofw mailing list