[openib-general] Re: RMPP
Hal Rosenstock
halr at voltaire.com
Tue Jun 28 11:46:39 PDT 2005
On Tue, 2005-06-28 at 14:07, Hal Rosenstock wrote:
> On Tue, 2005-06-28 at 13:48, Hal Rosenstock wrote:
> > On Tue, 2005-06-28 at 13:44, Sean Hefty wrote:
> > > Hal Rosenstock wrote:
> > > > Hi Sean,
> > > >
> > > > I'm in the process of enabling the receive side RMPP from user space and
> > > > this is what I'm seeing in terms of RMPP right now. I have a question
> > > > about the OpenSM side.
> > > >
> > > > SA client                          OpenSM
> > > > SA GetTable (PortInfoRecord) -->
> > > >                               <-- SA GetTableResp (PortInfoRecord)
> > > >                                   RMPP active, first
> > > >                                   payload length 0x44C
> > > >
> > > > retries is set to 4 so I see 4 responses (at 2 sec intervals) as the
> > > > client is not currently ACKing. All is fine up to that point.
> > > >
> > > > At that point, OpenSM sees a large receive which appears to be that send
> > > > timing out (nothing was sent nor observed on the IB wire).
> > > >
> > > > Could a timed out RMPP send end up as a receive somehow ?
> > >
> > > On the side that sent the MAD?
> >
> > The side that sent the RMPP MAD response (e.g. OpenSM).
> >
> > > That should be no.
> >
> > That's what I thought. I'm not sure where the problem is but will start
> > to try to narrow it down.
>
> I do get EINVAL from user_mad.c::ib_umad_read as follows:
>
> 	if (count < packet->length + sizeof (struct ib_user_mad))
> 		ret = -EINVAL;
>
> as the packet->length is larger than a single MAD (and looks like the
> user MAD that was sent by OpenSM).
I think I see what is going on here...
In user_mad.c::send_handler:

	if (send_wc->status == IB_WC_RESP_TIMEOUT_ERR) {
		packet->mad.hdr.status = ETIMEDOUT;

		if (!queue_packet(file, agent, packet))
			return;
	}

That is what is causing the problem. I think the send side queues the
packet on a timeout and simulates a receive so that the outstanding
transaction can be terminated. RMPP sends appear to be a little different
in that even sends that are not the request side of a transaction (such
as this response, whose segments are never ACKed) get timeouts.
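
To make the sequence concrete, here is a minimal standalone model of the
two paths involved. It is only a sketch and not the kernel code: the
model_* names, the simplified header, and the 2048-byte data area are
hypothetical stand-ins for the real user_mad.c structures. It shows a
timed-out send being queued as a pseudo-receive with status ETIMEDOUT,
and a reader sized for a single MAD getting -EINVAL because the queued
packet still carries the full multi-segment payload length.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the kernel definitions. */
#define MAD_SIZE 256	/* one MAD on the wire is 256 bytes */

struct model_hdr {
	int    status;	/* 0 for a real receive, ETIMEDOUT for a timed-out send */
	size_t length;	/* payload length in bytes */
};

struct model_packet {
	struct model_hdr hdr;
	unsigned char    data[2048];
};

/*
 * Model of the send completion path: on a response timeout the sent
 * packet is marked ETIMEDOUT and placed on the receive queue (a single
 * slot here). Returns 1 if the packet was queued, 0 otherwise.
 */
static int model_send_done(struct model_packet *pkt, int timed_out,
			   struct model_packet **queue_slot)
{
	if (!timed_out)
		return 0;

	pkt->hdr.status = ETIMEDOUT;
	*queue_slot = pkt;
	return 1;
}

/*
 * Model of the read path: the caller's buffer must hold the header plus
 * the whole payload, otherwise the read fails with -EINVAL.
 */
static long model_read(const struct model_packet *pkt, void *buf, size_t count)
{
	size_t need = sizeof(struct model_hdr) + pkt->hdr.length;

	assert(pkt->hdr.length <= sizeof pkt->data);

	if (count < need)
		return -EINVAL;

	memcpy(buf, &pkt->hdr, sizeof pkt->hdr);
	memcpy((char *)buf + sizeof pkt->hdr, pkt->data, pkt->hdr.length);
	return (long)need;
}
```

With a payload length of 0x44C, as in the trace above, a read buffer of
sizeof(struct model_hdr) + MAD_SIZE bytes is too small, so model_read()
returns -EINVAL. A reader that sizes its buffer for the full payload
gets the packet back and can tell it apart from a real receive by the
ETIMEDOUT status.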
-- Hal