[ofa-general] Receiving Unknown packets at regular interval

Hal Rosenstock hrosenstock at xsigo.com
Mon May 19 06:25:14 PDT 2008


Hi Sumit,

On Mon, 2008-05-19 at 17:20 +0530, Sumit Gaur - Sun Microsystem wrote:
> Hi Hal,
> 
> 
> Hal Rosenstock wrote:
> > Sumit,
> > 
> > On Mon, 2008-05-19 at 15:25 +0530, Sumit Gaur - Sun Microsystem wrote:
> > 
> >>Hi
> >>I have an issue when my program interacts with the OFED umad library.
> > 
> > 
> > Are you referring to libibumad ?
> yes, I am using mad_receive(0, -1) function to get my response back.

OK.
 
> >>I have two 
> >>separate threads, one for sending SMP and GMP packets and another to receive 
> >>responses. Things work fine, but throughout the process I keep receiving 
> >>packets with an unknown TID in addition to the correct responses.
> > 
> > 
> > What's the exact message ?
> The responses arrive as proper MAD packets, but with a "tid" that I never sent, so 
> my logic for tracking send/response packets fails.
> > 
> > 
> >> Is this correct behavior?
> > 
> > 
> > It could be, but there's not enough info yet to tell what is going on. Some
> > unsolicited message (e.g. from the SM) may be coming in during your
> > transactions. Can you see what MADs are incoming ? One way to do that
> > would be to run madeye.
> Yes, I can see the complete MAD; its MAD header has the following fields:
> 
> Response TID2 = 0x000000006701869b , BaseVersion = 1, MgmtClass=129, 
> ClassVersion=1, R_Method=129, ClassSpecific=1, Status=128, AttributeID=435

Class 129 is a Subn directed route packet. Some of the other info (like
attribute ID) doesn't look right to me but maybe that's something
"special" to your environment.
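For reference, the decimal values quoted above can be checked against the common MAD header layout. This is only a minimal sketch in Python, not something built on libibumad: the names MAD_HDR and decode_mad_header are made up for illustration, and the lookup tables list just a few well-known class and method values.

```python
import struct

# Common MAD header, 24 bytes, network (big-endian) byte order:
# BaseVersion, MgmtClass, ClassVersion, R|Method, Status, ClassSpecific,
# TransactionID, AttributeID, Reserved, AttributeModifier
MAD_HDR = struct.Struct(">BBBBHHQHHI")

# Partial lookup tables; only a few well-known values are listed.
MGMT_CLASSES = {
    0x01: "Subn (LID routed)",
    0x81: "Subn (directed route)",
    0x03: "SubnAdm",
    0x04: "Perf",
}
METHODS = {0x01: "Get", 0x02: "Set", 0x05: "Trap", 0x81: "GetResp"}


def decode_mad_header(buf):
    """Decode the first 24 bytes of a raw MAD into a dict (hypothetical helper)."""
    (base, mclass, cver, method, status, cspec,
     tid, attr_id, _reserved, attr_mod) = MAD_HDR.unpack_from(buf)
    return {
        "base_version": base,
        "mgmt_class": MGMT_CLASSES.get(mclass, hex(mclass)),
        "class_version": cver,
        "method": METHODS.get(method, hex(method)),
        "is_response": bool(method & 0x80),  # top bit of the method byte is R
        "status": status,
        "class_specific": cspec,
        "tid": tid,
        "attr_id": attr_id,
        "attr_mod": attr_mod,
    }


if __name__ == "__main__":
    # The decimal values from the report, packed back into a header.
    raw = MAD_HDR.pack(1, 129, 1, 129, 128, 1, 0x6701869B, 435, 0, 0)
    for name, value in decode_mad_header(raw).items():
        print(name, value)
```

With those values, MgmtClass 129 (0x81) decodes as Subn directed route and R_Method 129 (0x81) as a GetResp, i.e. a response. AttributeID 435 (0x1b3) is passed through as a number, since the sketch makes no attempt to name attributes.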

> 	If these are unsolicited packets, is there any way to filter them?

Yes. How do you register ?
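Independent of how the registration turns out, the receive thread can be made tolerant of MADs it did not ask for, so that an unknown TID is set aside instead of breaking the bookkeeping. A minimal sketch of that idea, in Python for brevity and not tied to libibumad; the class PendingTids and its methods are hypothetical names:

```python
import threading


class PendingTids:
    """Table of outstanding transaction IDs shared by a sender and a receiver thread."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = {}  # tid -> caller-supplied context

    def sent(self, tid, ctx=None):
        """Record a TID just before the request is handed to the send path."""
        with self._lock:
            self._pending[tid] = ctx

    def received(self, tid):
        """Return (True, ctx) for a pending TID, (False, None) for anything else."""
        with self._lock:
            if tid in self._pending:
                return True, self._pending.pop(tid)
        return False, None


if __name__ == "__main__":
    table = PendingTids()
    table.sent(0x1001, "query A")
    for tid in (0x1001, 0x6701869B):
        matched, ctx = table.received(tid)
        print(hex(tid), "matched" if matched else "unsolicited, ignored", ctx)
```

The receiver calls received() for every incoming MAD and simply logs or drops the ones that do not match, which leaves the send/response tracking intact even when unsolicited traffic shows up.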

> Any reference to madeye ?

There's only the code itself: madeye is a kernel module that OFED adds
(it is not upstream) under drivers/infiniband/util, but it's pretty
straightforward to use.

-- Hal
 
> >>If yes, how can I avoid them?
> > 
> > 
> > Not sure what you are seeing yet.
> > 
> > -- Hal
> > 
> > 
> >>Thanks and Regards
> >>sumit
> >>
> >>general-request at lists.openfabrics.org wrote:
> >>
> >>>
> >>>Today's Topics:
> >>>
> >>>   1. Re:  [PATCH] IB/core: handle race between elements in	work
> >>>      queues after event (Roland Dreier)
> >>>   2. Re:  RDS flow control (Steve Wise)
> >>>   3. Re:  RDS flow control (Olaf Kirch)
> >>>   4. Re:  RDS flow control (Steve Wise)
> >>>   5. Re:  RDS flow control (Olaf Kirch)
> >>>   6. Re:  [PATCH 3/3] IB/ipath - fix RDMA read response	sequence
> >>>      checking (Roland Dreier)
> >>>   7.  Re: [PATCH][INFINIBAND]: Make ipath_portdata work with
> >>>      struct pid * not pid_t. (Roland Dreier)
> >>>   8. Re:  bitops take an unsigned long * (Roland Dreier)
> >>>
> >>>
> >>>----------------------------------------------------------------------
> >>>
> >>>Message: 1
> >>>Date: Tue, 13 May 2008 10:41:39 -0700
> >>>From: Roland Dreier <rdreier at cisco.com>
> >>>Subject: Re: [ofa-general] [PATCH] IB/core: handle race between
> >>>	elements in	work queues after event
> >>>To: Moni Shoua <monis at Voltaire.COM>
> >>>Cc: Olga Stern <olgas at voltaire.com>,	OpenFabrics General
> >>>	<general at lists.openfabrics.org>
> >>>Message-ID: <adatzh2ksoc.fsf at cisco.com>
> >>>Content-Type: text/plain; charset=us-ascii
> >>>
> >>> > Can we please go on with this patch? We would like to see it in the next kernel.
> >>>
> >>>I still don't get why this is important to you.  Is there a concrete
> >>>example of a situation where this actually makes a measurable difference?
> >>>
> >>>We need some justification for adding this locking complexity beyond "it
> >>>doesn't hurt."  (And also of course we need it fixed so there aren't races)
> >>>
> >>> - R.
> >>>
> >>>
> >>>------------------------------
> >>>
> >>>Message: 2
> >>>Date: Tue, 13 May 2008 12:58:11 -0500
> >>>From: Steve Wise <swise at opengridcomputing.com>
> >>>Subject: Re: [ofa-general] RDS flow control
> >>>To: Richard Frank <richard.frank at oracle.com>
> >>>Cc: rds-devel at oss.oracle.com, general at lists.openfabrics.org
> >>>Message-ID: <4829D6B3.5080900 at opengridcomputing.com>
> >>>Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> >>>
> >>>Richard Frank wrote:
> >>>
> >>>
> >>>>Steve Wise wrote:
> >>>>
> >>>>
> >>>>>Olaf Kirch wrote:
> >>>>>
> >>>>>
> >>>>>>On Monday 12 May 2008 18:57:38 Jon Mason wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>>As part of my effort to get RDS working for iWARP, I will be 
> >>>>>>>working on the RDS flow control.  Flow control is needed for iWARP 
> >>>>>>>due to the fact that iWARP connections terminate if there is no 
> >>>>>>>posted recv for an incoming packet.  IB connections do not have 
> >>>>>>>this limitation if setup in a certain way.  In its current 
> >>>>>>>implementation, RDS sets the connection attribute rnr_retry to 7.  
> >>>>>>>This causes IB to retransmit until there is a posted recv buffer.     
> >>>>>>
> >>>>>>I think for the initial implementation, it is fine for iWARP to just
> >>>>>>fail the connect when that happens, and re-establish the connection.
> >>>>>>
> >>>>>>If you use reasonable defaults for the send and recv queues, receiver
> >>>>>>overruns should be relatively rare.
> >>>>>>
> >>>>>>Once everything else works, let's revisit the flow control part.
> >>>>>>
> >>>>>> 
> >>>>>
> >>>>>I _think_ you'll hit this quickly with one-way flows.  Send 
> >>>>>completions for iWARP only mean the user's buffer can be reused.  Not 
> >>>>>that it's placed at the remote peer or in the remote user's buffer.
> >>>>>
> >>>>
> >>>>Let's see what happens - anyway - this could be solved in an IWARP 
> >>>>extension to RDS  - right ?
> >>>
> >>>
> >>>
> >>>Yes, by adding flow control.  And it could be iwarp-specific if you 
> >>>want.    I would not suggest relying on connection termination and 
> >>>re-establishment as the way to handle this :).
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>>>But perhaps I'm wrong.  Jon, maybe you should try to hit this with IB 
> >>>>>and rnr_retry == 0 using the rds perf tools?
> >>>>>Also "the everything else" part depends on removing fmr usage.  I'm 
> >>>>>working on the new RDMA memory verbs allowing fast registration of 
> >>>>>physical memory via a send WR.  To support iWARP we need to remove 
> >>>>>the fmr usage from RDS.   The idea was to replace fmrs with the new 
> >>>>>fastreg verbs.   Thoughts?
> >>>>>
> >>>>
> >>>>What does "fast" imply here - how does this compare to the performance 
> >>>>of FMRs ?
> >>>
> >>>
> >>>
> >>>Don't know yet, but probably as fast. 
> >>>
> >>>
> >>>
> >>>>Why not push memory window creation into the RDS transport 
> >>>>specific implementations ?
> >>>
> >>>
> >>>Isn't it already transport-specific?  IE you don't need FMRs for TCP.  
> >>>(I'm ignorant on the specifics of the implementation at this point, so 
> >>>please excuse any dumb statements :)
> >>>
> >>>
> >>>
> >>>
> >>>>Changing the API may be OK - if we retain the performance we have with 
> >>>>IB.
> >>>
> >>>
> >>>
> >>>I assume nothing would fly that regresses IB performance.  Worst case, 
> >>>you have an iwarp-specific RDS transport like you do for TCP, I guess.  
> >>>Hopefully though, IB + iWARP will be a common transport.
> >>>
> >>>
> >>>
> >>>
> >>>>>Stay tuned for the new verbs API RFC...
> >>>>>
> >>>>>Steve.
> >>>>>_______________________________________________
> >>>>>general mailing list
> >>>>>general at lists.openfabrics.org
> >>>>>http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> >>>>>
> >>>>>To unsubscribe, please visit 
> >>>>>http://openib.org/mailman/listinfo/openib-general
> >>>
> >>>
> >>>
> >>>
> >>>------------------------------
> >>>
> >>>Message: 3
> >>>Date: Tue, 13 May 2008 20:04:00 +0200
> >>>From: Olaf Kirch <okir at lst.de>
> >>>Subject: Re: [ofa-general] RDS flow control
> >>>To: Steve Wise <swise at opengridcomputing.com>
> >>>Cc: rds-devel at oss.oracle.com, general at lists.openfabrics.org
> >>>Message-ID: <200805132004.01371.okir at lst.de>
> >>>Content-Type: text/plain;  charset="iso-8859-1"
> >>>
> >>>On Tuesday 13 May 2008 19:58:11 Steve Wise wrote:
> >>>
> >>>
> >>>>Yes, by adding flow control.  And it could be iwarp-specific if you 
> >>>>want.    I would not suggest relying on connection termination and 
> >>>>re-establishment as the way to handle this :).
> >>>
> >>>
> >>>No, not in the long term. But let's hold off on the flow control stuff
> >>>for a little - I would first like to finish my patch set and hand it
> >>>out for you folks to bang on it, rather than the other way round.
> >>>Okay with you guys?
> >>>
> >>>
> >>>
> >>>>I assume nothing would fly that regresses IB performance.  Worst case, 
> >>>>you have an iwarp-specific RDS transport like you do for TCP, I guess.  
> >>>>Hopefully though, IB + iWARP will be a common transport.
> >>>
> >>>
> >>>If it turns out that way, fine. If iWARP ends up sharing 80% of the
> >>>code with IB except the RDMA specific functions, I think that's
> >>>very much acceptable, too.
> >>>
> >>>Olaf
> >>
> > 
> > 



