[openib-general] Re: RMPP

Hal Rosenstock halr at voltaire.com
Tue Jun 28 11:46:39 PDT 2005


On Tue, 2005-06-28 at 14:07, Hal Rosenstock wrote: 
> On Tue, 2005-06-28 at 13:48, Hal Rosenstock wrote:
> > On Tue, 2005-06-28 at 13:44, Sean Hefty wrote:
> > > Hal Rosenstock wrote:
> > > > Hi Sean,
> > > > 
> > > > I'm in the process of enabling the receive side RMPP from user space and
> > > > this is what I'm seeing in terms of RMPP right now. I have a question
> > > > about the OpenSM side.
> > > > 
> > > > SA client OpenSM
> > > > SA GetTable (PortInfoRecord) -->
> > > >                              <--  SA GetTableResp (PortInfoRecord)
> > > > RMPP active, first
> > > > payload length 0x44C
> > > > 
> > > > retries is set to 4 so I see 4 responses (at 2 sec intervals) as the
> > > > client is not currently ACKing. All is fine up to that point.
> > > > 
> > > > At that point, OpenSM sees a large receive which appears to be that send
> > > > timing out (nothing was sent nor observed on the IB wire).
> > > > 
> > > > Could a timed out RMPP send end up as a receive somehow ?
> > > 
> > > On the side that sent the MAD?
> > 
> > The side that sent the RMPP MAD response (e.g. OpenSM).
> > 
> > >   That should be no.
> > 
> > That's what I thought. I'm not sure where the problem is but will start
> > to try to narrow it down.
> 
> I do get EINVAL from user_mad.c::ib_umad_read as follows:
> 
> if (count < packet->length + sizeof (struct ib_user_mad))
> 	ret = -EINVAL;
> 
> as the packet->length is larger than a single MAD (and looks like the
> user MAD that was sent by OpenSM).

I think I see what is going on here...

In user_mad.c::send_handler


        if (send_wc->status == IB_WC_RESP_TIMEOUT_ERR) {
                packet->mad.hdr.status = ETIMEDOUT;

                if (!queue_packet(file, agent, packet))
                        return;
        }

That is what is causing the problem. I think the send side queues the
packet on a timeout and simulates a receive so that the transaction can
be terminated. RMPP sends appear to be a little different in that even
sends that are not requests (such as this GetTableResp) can time out.
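
To make the sequence concrete, here is a rough userspace sketch of what I
think is happening. It is not the user_mad.c code: every sketch_* name and
the one-slot queue are made up for illustration, and leaving the packet
queued after -EINVAL is an assumption of the sketch. Only the timeout
branch and the length check mirror the snippets above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_MAD_SIZE 256            /* size of a single IB MAD in bytes */

/* Hypothetical stand-in for the user MAD header. */
struct sketch_hdr {
	int status;                    /* 0, or ETIMEDOUT for a timed-out send */
};

/* Hypothetical stand-in for the queued packet. */
struct sketch_packet {
	struct sketch_hdr hdr;
	size_t length;                 /* payload length in bytes */
};

/* Hypothetical per-file state: a one-slot receive queue. */
struct sketch_file {
	struct sketch_packet *recv_head;
};

/* Returns 0 on success, nonzero if the slot is already taken. */
int sketch_queue_packet(struct sketch_file *file, struct sketch_packet *p)
{
	if (file->recv_head)
		return -1;
	file->recv_head = p;
	return 0;
}

/*
 * Send completion: on a timeout, mark the sent packet and put it on the
 * receive queue so the reader can see that the send failed.
 * Returns 1 if the packet was queued as a simulated receive, else 0.
 */
int sketch_send_handler(struct sketch_file *file, struct sketch_packet *p,
			int timed_out)
{
	assert(file && p);
	if (timed_out) {
		p->hdr.status = ETIMEDOUT;
		if (!sketch_queue_packet(file, p))
			return 1;
	}
	return 0;
}

/*
 * Read: same shape of length check as the ib_umad_read snippet.
 * Returns the number of bytes that would be copied, or a negative errno.
 */
long sketch_read(struct sketch_file *file, size_t count)
{
	struct sketch_packet *p = file->recv_head;

	if (!p)
		return -EAGAIN;
	if (count < p->length + sizeof(struct sketch_hdr))
		return -EINVAL;        /* sketch assumption: packet stays queued */
	file->recv_head = NULL;
	return (long)(p->length + sizeof(struct sketch_hdr));
}
```

With a timed-out send of length 0x44C, a reader that supplies a buffer for
the header plus one 256-byte MAD gets -EINVAL from sketch_read, which
matches what I see from OpenSM; a buffer large enough for the whole RMPP
payload would get the packet back with status ETIMEDOUT.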

-- Hal



