[Ofa_boardplus] Draft Mission Statement

Atchley, Scott atchleyes at ornl.gov
Tue Mar 20 14:40:13 PDT 2018


> On Mar 20, 2018, at 5:08 PM, Paul Grun <grun at cray.com> wrote:
> 
> Part 1:  What do we call these things we work on?  And are they networks or fabrics?
> 
> From Doug L:
>> 
>> So, I had originally suggested we use "RDMA enabled networks" and that got shot
>> down because there is an existing assumption, right or wrong, that RDMA
>> equates to IB.  The term advanced networks was really meant to encompass any
>> network setup that enables what we would traditionally think of as common
>> RDMA functionality.  So, for instance, IB and OPA are certainly in there, but we
>> can also run RDMA functionality over standard Ethernet using iWARP, or RoCE
>> with a lossless Ethernet, or RoCE with ECN enabled on the Ethernet.  But any of
>> these Ethernet solutions require additional software/hardware to make them
>> work (a soft RoCE or soft iWARP driver, or actual RoCE/iWARP hardware, and in
>> some cases specific abilities in the switches), so that's what earns them the
>> moniker "advanced networks", because they *are* more advanced than a garden
>> variety Ethernet deployment.  But it was also meant to encompass any new
>> fabrics that might come around in the future assuming they are a good fit for the
>> OFA stacks.
>> 
>> This then brings up the point, "What is a good fit for the OFA stacks?"
>> 
>> I've tried to pin down the essence of this.  After a lot of thought, I think the core,
>> defining element of what is a good fit for the OFA stacks is any technology that
>> enables application direct network I/O (ADNIO I guess?).  You can't say "it must
>> be RDMA capable" because not everything we support is truly RDMA (usNIC for
>> one, but even just the verbs API has non-RDMA components in that
>> post_send/recv are not truly RDMA operations, but instead are merely
>> application direct I/O).  (snip)
>> 
> (snipped for brevity) 
> 
>> Whether the pipes are queue pairs, or fabric endpoints, or a virtual NIC with our
>> own MAC address so that all packets to our MAC come directly to us, that's all
>> implementation details.  The core feature is Application Direct Network I/O.
>> 
> 
> From Eddie Wai:
> 
> Although I understand the use of the broader term "Application Direct Network I/O enabled networks", it seems to be invading the DPDK territory.
> 
> From Sean Hefty:
> 
> I too prefer something like application direct I/O over the term RDMA.  
> 
> Paul's Comment:
> Doug has the essence of why the words 'Advanced Networks' were chosen - clearly, our charter does not (currently) include garden variety sockets.
> 
> One issue I have with Doug's statement is this: "...assuming they are a good fit for the OFA stacks."  I want to caution us not to limit our thinking to current verbs-based stacks, but to also include other stacks, e.g. libfabric, and whatever may come in the future.  I'm not sure if that's what Doug had in mind or not. 
> 
> Personally, I strongly prefer that we not invent a new term in the midst of the mission statement, so I would advocate against Application Direct Network I/O enabled networks, or even application direct I/O.  Can we agree on 'accelerated' with no loss of generality?
> 
> As for the word 'Networks' versus 'Fabrics' - no matter what, we are bound to please roughly 1/2 the people at any given time.  Early on in the process we had settled on 'fabrics' because it differentiates us from various networking standards bodies, and because 'fabrics' seems to suggest a smaller-diameter fabric than a classical 'network' (e.g. IP is a network) might encompass.  Note that there is an implication here that the OFA doesn't focus on WANs, even though we have done a lot of work in the past on extending IB over the WAN.
> 
> I suggest we zero in on 'Accelerated Fabrics'.

We used to use the term “kernel-bypass” to indicate user-space application direct access of hardware. :-)

The “bad” thing about Berkeley sockets is the file/pipe semantics, which require buffering on send and again on receive. Data must be copied out of the application buffer before sending and copied into the user buffer after receiving. OFA’s umbrella covers non-socket interfaces (Verbs, PSM[2], libfabric) that allow direct access from/placement in tagged/untagged user buffers as well as memory operations (i.e. atomics).
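
To make the contrast concrete, here is a rough Python sketch. It is a toy under loud assumptions, not a real fabric: socket_transfer uses an ordinary local socket pair to show the two copies, while RegisteredBuffer, direct_put and fetch_add are hypothetical, in-process stand-ins that only mimic the shape of direct placement and a memory operation. They are not the Verbs, PSM, or libfabric APIs, and fetch_add here is not actually atomic.

```python
import socket


def socket_transfer(payload: bytes) -> bytearray:
    """Send a small payload over a local socket pair.

    Keep the payload small: this single-threaded demo would block if the
    payload exceeded the kernel socket buffer.
    """
    sender, receiver = socket.socketpair()
    try:
        # Copy 1: application buffer -> kernel socket buffer.
        sender.sendall(payload)
        sender.shutdown(socket.SHUT_WR)

        dest = bytearray(len(payload))
        view = memoryview(dest)
        received = 0
        while received < len(payload):
            # Copy 2: kernel socket buffer -> user buffer.
            n = receiver.recv_into(view[received:])
            if n == 0:
                break
            received += n
        return dest
    finally:
        sender.close()
        receiver.close()


class RegisteredBuffer:
    """Hypothetical stand-in for a user buffer exposed to the fabric."""

    def __init__(self, size: int) -> None:
        self.mem = bytearray(size)


def direct_put(target: RegisteredBuffer, offset: int, payload: bytes) -> None:
    """Toy model of direct placement: the data lands in the target's own
    buffer at a chosen offset, with no receive call on the target side."""
    target.mem[offset:offset + len(payload)] = payload


def fetch_add(target: RegisteredBuffer, offset: int, delta: int) -> int:
    """Toy model of a fetch-and-add on an 8-byte little-endian counter.

    Returns the previous value. Single-threaded illustration only.
    """
    old = int.from_bytes(target.mem[offset:offset + 8], "little")
    new = (old + delta) % 2**64
    target.mem[offset:offset + 8] = new.to_bytes(8, "little")
    return old
```

The point is only the difference in shape: the socket path always stages the data twice, whereas the put and fetch-and-add calls name the destination memory directly.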

It is these additional semantics that make these interfaces so beneficial (i.e. lower latency, higher bandwidth, better semantic match to the application’s needs). OFA supports more than simply accelerating communication. We support more semantically rich communication.

Unfortunately, I do not have a catchy phrase that captures that.

Scott


