[OFIWG-MPI] [ofiwg] Call today
Hefty, Sean
sean.hefty at intel.com
Thu May 22 12:29:22 PDT 2014
Thanks, Howard, this is helpful.
Regarding the 'tag match class' that you mention, would you create an 'rma class' as a peer, with the RMA operations defined in a similar fashion? If not, why not? Would this also extend to all other data transfer operations? I.e. message queue (send/receive) and atomics, plus any others defined in the future?
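
To make the 'tag match class' idea concrete, here is a minimal Python sketch of a matching context that is a peer of the endpoint rather than part of it. Everything in it (TagMatcher, Endpoint, bind, post_recv, deliver, ANY_SOURCE) is a hypothetical name chosen for illustration and not part of any proposed interface. It only shows how one matching context could serve several endpoints and still honor a wildcard receive.

```python
from dataclasses import dataclass, field
from typing import Optional

ANY_SOURCE = None  # hypothetical wildcard, modelled on MPI's any-source receive


@dataclass
class PostedRecv:
    tag: int
    source: Optional[str]          # None means "match any source"
    buffer: list = field(default_factory=list)


class TagMatcher:
    """Hypothetical matching context, created independently of any endpoint."""

    def __init__(self):
        self.endpoints = []        # endpoints bound to this context
        self.posted = []           # posted receives, in posting order

    def bind(self, endpoint):
        # Associate an endpoint with this matching context.
        self.endpoints.append(endpoint)
        endpoint.matcher = self

    def post_recv(self, tag, source=ANY_SOURCE):
        recv = PostedRecv(tag, source)
        self.posted.append(recv)
        return recv

    def deliver(self, source, tag, payload):
        # First posted receive whose tag and source both match wins.
        for i, recv in enumerate(self.posted):
            if recv.tag == tag and recv.source in (ANY_SOURCE, source):
                recv.buffer.append(payload)
                del self.posted[i]
                return recv
        return None                # unexpected message; a real design would queue it


class Endpoint:
    """Hypothetical endpoint: moves data only, then hands it to its matcher."""

    def __init__(self, name):
        self.name = name
        self.matcher = None

    def on_message(self, source, tag, payload):
        return self.matcher.deliver(source, tag, payload)


# Two endpoints share one matching context.
matcher = TagMatcher()
ep_a, ep_b = Endpoint("a"), Endpoint("b")
matcher.bind(ep_a)
matcher.bind(ep_b)
wild = matcher.post_recv(tag=7)                     # any source
exact = matcher.post_recv(tag=7, source="peer1")    # one specific source
```

In this sketch a message arriving on either endpoint is matched against the same list of posted receives, which is the property the wildcard receive needs. An 'rma class' peer would presumably follow the same bind pattern, but that is an assumption.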
> -----Original Message-----
> From: Howard Pritchard [mailto:hppritcha at gmail.com]
> Sent: Thursday, May 22, 2014 11:53 AM
> To: Richard Graham
> Cc: Hefty, Sean; ofiwg at lists.openfabrics.org; ofiwg-mpi at lists.openfabrics.org
> Subject: Re: [ofiwg] Call today
>
> Hi Folks,
>
> here is a diagram of a concept that was discussed in a side conversation at
> the last OFA workshop. I'd thought that a msgq (aka tag matcher class) object
> should be instantiated via a method of the fabric class.
>
> red lines in the diagram indicate that the pointee can be associated with the
> class being pointed to by the arrow, using the bind method of the class being
> pointed to.
>
> the search_by_addr method of the msgq is for use with FID_RDM endpoints,
> while the search_by_ep method is for when the msgq is associated with multiple
> FID_MSG type endpoints.
>
> Note the slide is a little old, since the EC class has now been divided into
> EQ and counter type completion notification mechanisms.
>
> Hoping this will maybe help a little here.
>
> Howard
>
>
>
> On Thu, May 22, 2014 at 11:59 AM, Richard Graham <richardg at mellanox.com>
> wrote:
>
>
> Please see inline
>
> -----Original Message-----
> From: Hefty, Sean [mailto:sean.hefty at intel.com]
> Sent: Thursday, May 22, 2014 12:43 PM
> To: Richard Graham; ofiwg at lists.openfabrics.org; ofiwg-mpi at lists.openfabrics.org
> Cc: Paul Grun (grun at cray.com); Liran Liss
> Subject: RE: Call today
>
> With permission, copying mailing list on side thread that popped up.
>
> I understand MPI has wild card receives. But tagged semantics are
> useful even when associated with a generic endpoint concept, or a specific
> address. Note the proposed endpoint concept is not necessarily bound to a
> specific piece of hardware, though it may be based on the provider
> implementation. The tagged operations themselves may be implemented by
> hardware and are not restricted to being purely a software construct.
> [rich] If the attempt here is to provide a building block that will
> map to different use-case scenarios, then need to have an architecture that
> will map well onto the areas of interest. MPI is just one such upper level
> service, one that has been called out specifically in the context of the
> proposal you have been presenting. So, following on this (the precise
> definition of end point is still rather fuzzy at this stage) in general,
> there is no such one-to-one mapping of an endpoint to an MPI matching
> context, but there can be an association of a matching context with one or
> more endpoints. What I am suggesting here is that we keep notions
> around data transfer orthogonal to what is done with the data (tag
> matching, in this case). How the functionality is implemented (hardware
> or not) is separate from how the stack is architected.
>
> Tagged interfaces, as well as other interfaces such as message
> queues, may still exist above the endpoint. But that layering of
> interfaces seems better suited above the fabric interfaces (e.g. MPI),
> rather than included with it. This seems more debatable to me though, and
> we could examine whether a domain or fabric object should have send/receive
> capabilities.
> [rich] Need to keep separate how data is transferred (perhaps with
> functions that we may call send/recv) from the ULP's use of this data
> (perhaps also using a similar naming scheme of send/recv).
>
> - Sean
>
> > -----Original Message-----
> > From: Richard Graham [mailto:richardg at mellanox.com]
> > Sent: Wednesday, May 21, 2014 11:09 AM
> > To: Hefty, Sean
> > Cc: Paul Grun (grun at cray.com); Liran Liss
> > Subject: RE: Call today
> >
> > Tag matching as it comes to MPI semantics is not local to a given pair
> > of processes, e.g. MPI has a wild card receive that can take data from
> > any source, and therefore the matching context is broader than just a
> > single pair of source and destination.
> >
> > Rich
> >
> > -----Original Message-----
> > From: Hefty, Sean [mailto:sean.hefty at intel.com]
> > Sent: Wednesday, May 21, 2014 1:13 PM
> > To: Richard Graham
> > Cc: Paul Grun (grun at cray.com); Liran Liss
> > Subject: RE: Call today
> >
> > Tag matching, RMA, atomics, and message operations are currently
> > associated with an endpoint, but the functions are independent of the
> > communication protocol in use. Conceptually, it seems reasonable to
> > think of tag matching as a merging of message and RMA write operations.
> >
> > I agree that an endpoint is associated with the data source/sink.
> > There is no implied mapping between a process and an endpoint.
> >
> >
> > > -----Original Message-----
> > > From: Richard Graham [mailto:richardg at mellanox.com]
> > > Sent: Tuesday, May 20, 2014 9:22 PM
> > > To: Hefty, Sean
> > > Cc: Paul Grun (grun at cray.com); Liran Liss
> > > Subject: RE: Call today
> > >
> > > I suppose that you could consider tag-matching as part of transport.
> > > However, I would argue that such protocols should be independent of
> > > whether or not a reliable or unreliable communication protocol is
> > > used (at least when it comes to the tag support needed for MPI).
> > > Also, I associate an end-point with the source and/or the sink of
> > > data. In MPI, tag matching is associated with an MPI-level
> > > (process, communicator) pair, and therefore the tag-matching context
> > > may be associated with many end-points.
> > > I would therefore keep tag-matching as a separate concept.
> > >
> > > Rich
> > >
> > > -----Original Message-----
> > > From: Hefty, Sean [mailto:sean.hefty at intel.com]
> > > Sent: Tuesday, May 20, 2014 1:26 PM
> > > To: Richard Graham
> > > Cc: Paul Grun (grun at cray.com); Liran Liss
> > > Subject: RE: Call today
> > >
> > > Tag-matching is a transport object (protocol), so I do think it
> > > makes sense being associated with a transport level object (i.e.
> > > endpoint).
> > >
> > > I thought you were referring to the SRQ, which may or may not be a
> > > transport level object. If the sharing of data buffer(s) among
> > > multiple connections is not considered a transport object, then I
> > > agree, it may make sense to have it be a separate object with its
> > > own interfaces.
> > > Alternatively, it could also be a property of endpoints to share
> > > receive buffers.
> > >
> > > When the SRQ appears in the transport object (protocol), it may get
> > > more complex.
> > >
> > > For initial thoughts, sharing receive buffers could be handled by:
> > >
> > > 1. Creating an explicit SRQ object as a 'peer' to an endpoint. The
> > > SRQ would have the ability to associate receive buffers with it.
> > > Endpoints would need to be associated with an SRQ to make use of it.
> > > 2. Creating an SRQ 'endpoint' object. A send-receive endpoint could
> > > be created from and inherit the SRQ interfaces.
> > > 3. Adding an endpoint property to allow sharing data buffers. Shared
> > > buffers could be posted to a domain object, or, alternatively, any
> > > endpoint.
> > >
> > > Ultimately, the question becomes a matter of where the 'post receive
> > > buffer' operation resides, and the behavior of any 'post receive
> > > buffer' call which may reside elsewhere. E.g. SRQ::PostRecv() versus
> > > EP::PostRecv(), what is the behavior of EP::PostRecv() if buffer
> > > sharing is enabled?
> > >
> > > These assume SRQ as a non-transport object, or at least one that is
> > > not visible to the application.
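
The SRQ::PostRecv() versus EP::PostRecv() question above can be made concrete with a small Python sketch of option 1, where the SRQ is an explicit peer object that endpoints are associated with. The class and method names (SharedRecvQueue, Endpoint, bind_srq, post_recv, receive) are hypothetical, and the policy shown for the endpoint's post_recv when sharing is enabled (forward the buffer to the shared pool) is just one assumed answer, not something the thread settles.

```python
from collections import deque


class SharedRecvQueue:
    """Hypothetical SRQ created as a peer of the endpoint (option 1)."""

    def __init__(self):
        self.buffers = deque()     # receive buffers shared by bound endpoints

    def post_recv(self, buf):
        self.buffers.append(buf)


class Endpoint:
    """Hypothetical endpoint with an optional association to an SRQ."""

    def __init__(self, name):
        self.name = name
        self.private = deque()     # buffers used when no SRQ is bound
        self.srq = None

    def bind_srq(self, srq):
        self.srq = srq

    def post_recv(self, buf):
        # Assumed policy: with sharing enabled, the buffer joins the shared
        # pool. Rejecting the call instead would be an equally valid choice.
        if self.srq is not None:
            self.srq.post_recv(buf)
        else:
            self.private.append(buf)

    def receive(self, payload):
        # Consume the oldest available buffer and copy the payload into it.
        pool = self.srq.buffers if self.srq is not None else self.private
        if not pool:
            raise RuntimeError("no receive buffer posted")
        buf = pool.popleft()
        if len(payload) > len(buf):
            raise ValueError("payload larger than posted buffer")
        buf[: len(payload)] = payload
        return buf


# A buffer posted through one endpoint can satisfy a receive on another.
srq = SharedRecvQueue()
ep1, ep2 = Endpoint("ep1"), Endpoint("ep2")
ep1.bind_srq(srq)
ep2.bind_srq(srq)
ep1.post_recv(bytearray(8))
filled = ep2.receive(b"data")
```

Under this assumed policy a buffer posted on ep1 may be consumed by traffic arriving on ep2, which is exactly the behavior an application would need to be told about if EP::PostRecv() stays available while sharing is on.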
> > >
> > >
> > >
> > > > Liran mentioned that you wanted me to repeat what I said - my only
> > > > comment was that we not couple transport (connection based
> > > > transport) with tag-matching (or any other object supported by
> > > > the library).
> > > > These are two different concepts, and should be kept separate.
> > > >
> > > >
> > > >
> > > > Rich
>
> _______________________________________________
> ofiwg mailing list
> ofiwg at lists.openfabrics.org
> http://lists.openfabrics.org/mailman/listinfo/ofiwg
>
>