[ofa-general] Manipulating Credits in Infiniband

Ashwath Narasimhan nashwath at gmail.com
Wed Aug 12 23:41:37 PDT 2009


Dear Tom/all
  I understand the credit-based flow control at the link layer, where a
32-bit flow control packet (with FCCL and FCTBS fields) is sent for each
VL, but I fail to understand where this scheme is implemented in the
driver (OFED Linux 1.4 stack, hw: mthca). I can see a file with a credit
table mapping indices to credit counts, and another that computes the
AETH based on this credit table.
1. Is that file where the flow control packets are formulated?
2. If yes, I don't see them computing this per VL; why not? If no, is it
a mid-layer flow control mechanism?
3. That is why I have this basic question: is the link layer implemented
as part of the OFED stack at all, or does it live in the HCA as
firmware? As I understand it, the hardware vendor only provides verbs to
communicate with the HCA.

Pardon me if I am burdening you all with a lot of questions. I am new to
all this and I am trying my best to understand the stack.
Thank you,
Ashwath


On Tue, Aug 11, 2009 at 10:37 PM, Nifty Tom Mitchell <niftyompi at niftyegg.com> wrote:

> On Mon, Aug 10, 2009 at 12:11:22PM -0400, Ashwath Narasimhan wrote:
> >
> >    I looked into the InfiniBand driver files. As I understand it, in
> >    order to limit the data rate we manipulate the credits on either
> >    end. Since the number of credits available depends on the
> >    receiver's receive work queue size, I decided to limit the queue
> >    size to, say, 5 instead of 8192 (reference: ipoib.h,
> >    IPOIB_MAX_QUEUE_SIZE to, say, 3, since my higher-layer protocol is
> >    IPoIB). I just want to confirm whether I am doing the right thing.
>
> Data rate is not manipulated by credits.
> Credits and queue sizes are different and have different purposes.
>
> Visit the Infiniband Trade Association web site and grab the IB
> specifications to understand some of the hardware level parts.
>
>        http://www.infinibandta.org/
>
> InfiniBand offers credit-based flow control, and given the nature of
> modern IB switches and processors, a very small credit count can still
> result in full data rate.    Having said that, flow control is the
> lowest level throttle in the system.   Reducing the credit count forces
> the higher levels in the protocol stack to source or sink the data
> through the hardware before any more can be delivered.   Thus flow
> control can simplify the implementation of higher-level protocols.
> It can also be used to cost-reduce or simplify hardware designs
> (smaller hardware buffers).
>
> The IB specifications are way too long.  Start with this FAQ.
>
>       http://www.mellanox.com/pdf/whitepapers/InfiniBandFAQ_FQ_100.pdf
>
> The IB specification is way too full of optional features.  A vendor
> may have XYZ working fine and dandy on one card and, since it is
> optional, not at all on another.
>
> The various queue sizes for the various protocols built on top of IB
> establish transfer behavior in keeping with system interrupt, process
> time slice, and kernel activity loads and needs.  It is
> counterintuitive, but in some cases small queues result in more
> responsive and agile systems, especially in the presence of errors.
>
> Since there are often multiple protocols on the IB stack, all of them
> will be impacted by credit tinkering.  Most vendors know their
> hardware, so most drivers already have near-optimal credit-related
> code.
>
> In the case of TCP/IP, the interaction between IB bandwidth & MTU
> (IPoIB), Ethernet bandwidth & MTU, and even localhost (127.0.0.1)
> bandwidth & MTU can be "interesting" depending on host names, subnets,
> routing, etc.   TCP/IP has lots of tuning flags well above the IB
> driver.   I see 500+ net.* sysctl knobs on this system.
>
> As you change things, do make the changes on all the moving parts,
> benchmark, and keep a log.   Since there are multiple IB hardware
> vendors, it is important to track hardware specifics.  "lspci" is a
> good tool to gather chip info.   With some cards you also need
> specifics about the active firmware.
>
> So go forth (RPN forever) and conquer.
>
>
> --
>        T o m  M i t c h e l l
>        Found me a new hat, now what?
>
>


-- 
regards,
Ashwath