[ofa-general] Verbs: IB vs. iWARP
Tang, Changqing
changquing.tang at hp.com
Thu May 8 15:02:13 PDT 2008
Great, thanks, Jeff.
--CQ
> -----Original Message-----
> From: general-bounces at lists.openfabrics.org
> [mailto:general-bounces at lists.openfabrics.org] On Behalf Of
> Jeff Squyres
> Sent: Thursday, May 08, 2008 4:29 PM
> To: OpenFabrics General
> Subject: [ofa-general] Verbs: IB vs. iWARP
>
> Over the past 24 hours, we assembled a list of differences
> between IB and iWARP usage of verbs. I got a few comments on
> the text we assembled, and figured it was time to turn this
> text over to OpenFabrics to make it fully
> correct/complete/whatever, and then publish it however you see fit.
>
> I hope this starter text is helpful to you; enjoy.
>
> -----
> * struct ibv_device.transport_type will be
> IBV_TRANSPORT_IWARP for iWARP devices and IBV_TRANSPORT_IB
> for IB devices.
>
> * ibv_query_gid():
> * When invoked on an IB HCA, will return the IB subnet
> prefix in subnet_prefix and GUID of the port in the interface_id.
> * When invoked on an iWARP NIC, will return the NIC's MAC
> address in subnet_prefix and 0 in the interface_id.
>
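
A minimal sketch of how an application might inspect the 16 raw GID bytes along the lines described above. It assumes the bytes were already obtained (for example from the raw member of union ibv_gid after a successful ibv_query_gid() call), and it assumes the MAC occupies the first six bytes of the subnet_prefix half. Both helper names are hypothetical and are not part of libibverbs.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: true if the interface_id half of the GID
 * (bytes 8..15) is all zero, which is what the text above says an
 * iWARP NIC reports. */
bool gid_interface_id_is_zero(const uint8_t raw[16])
{
    static const uint8_t zero[8] = {0};
    return memcmp(raw + 8, zero, sizeof zero) == 0;
}

/* Hypothetical helper: copy a MAC address out of the subnet_prefix half.
 * ASSUMPTION: the MAC sits in the leading six bytes (bytes 0..5). */
void gid_extract_mac(const uint8_t raw[16], uint8_t mac[6])
{
    memcpy(mac, raw, 6);
}
```

An application would probably check the transport type first and only then interpret the bytes this way; a zero interface_id on its own is not proof that the device is an iWARP NIC.
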
> * iWARP QPs ''must'' be made with the RDMA CM; IB QPs can
> be made using the IB CM, RDMA CM, or some other (presumably
> out-of-band) mechanism.
>
> * When making QPs, some versions of iWARP drivers require
> the initiator of the connection to send the first message
> (having the non-initiator send the first message will
> terminate the connection).
> Newer versions of iWARP firmware/drivers handle this
> requirement inside the driver, so the ULP doesn't have to
> ensure that the initiator sends the first message.
>
> * When terminating connections via the RDMA CM (via the
> rdma_disconnect() call or by simply destroying the QP without
> disconnecting first), iWARP transports will automatically
> create a CQE for any pending send or receive WRs with the
> status set to IBV_WC_WR_FLUSH_ERR. Note that IB HCAs do the
> same thing, but the iWARP RDMA CM disconnection progresses
> independently of the ULP, meaning that when one side issues
> the disconnect, the other side will automatically be
> disconnected (even if the ULP doesn't realize it).
> IB HCAs may not process the disconnect until later (via RDMA
> CM or otherwise), perhaps not until the ULP realizes that the
> disconnect has occurred. In short: device-independent
> verbs-based applications must expect flushed WRs (completions
> with status IBV_WC_WR_FLUSH_ERR) during disconnection and
> must not treat them as errors.
>
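
The disconnection advice above could be sketched as a small classifier that a completion-polling loop calls for each work completion. The enum and function names here are hypothetical stand-ins so that the sketch compiles without libibverbs; real code would include <infiniband/verbs.h> and compare the status field of struct ibv_wc against IBV_WC_SUCCESS and IBV_WC_WR_FLUSH_ERR.

```c
/* Stand-ins for the enum ibv_wc_status values this sketch cares about.
 * ASSUMPTION: these are placeholders, not the libibverbs definitions,
 * and their numeric values carry no meaning. */
enum sketch_wc_status {
    SKETCH_WC_SUCCESS,
    SKETCH_WC_WR_FLUSH_ERR,
    SKETCH_WC_OTHER_ERR
};

/* What the (hypothetical) application does with a polled completion. */
enum sketch_action {
    SKETCH_DELIVER,  /* normal completion: hand the result to the ULP */
    SKETCH_RECLAIM,  /* flushed WR: reclaim the buffer, continue teardown */
    SKETCH_FAIL      /* anything else: report a genuine error */
};

/* Treat a flushed WR as part of connection teardown, not as a failure. */
enum sketch_action classify_completion(enum sketch_wc_status status)
{
    switch (status) {
    case SKETCH_WC_SUCCESS:
        return SKETCH_DELIVER;
    case SKETCH_WC_WR_FLUSH_ERR:
        return SKETCH_RECLAIM;
    default:
        return SKETCH_FAIL;
    }
}
```

Because the iWARP side may already be disconnected before the ULP notices, a flushed completion can be the first sign that the peer went away, so the reclaim path is a reasonable place to start the application's own teardown.
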
> * LIDs are always 0 in iWARP.
>
> * LMC is always 0 for iWARP.
>
> * On iWARP, memory regions used to receive RDMA read responses
> must have "remote write" permission (since in the iWARP protocol,
> RDMA read responses are basically the same as incoming RDMA
> write requests).
>
> * Atomics and immediate data are not available in iWARP.
>
> * The sink scatter-gather list for an RDMA read can only
> have one element for iWARP (which is reported accurately in
> struct ibv_device_attr.max_sge).
>
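
One possible way to live with a one-element sink list is to issue one RDMA read per local segment, advancing the remote address as you go. This is only a sketch of that workaround, not something the text above prescribes. The struct and function names are hypothetical, the structs are simplified stand-ins (a real struct ibv_sge also carries an lkey), and building and posting the actual work requests is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified local segment (hypothetical; lkey omitted). */
struct sketch_seg {
    uint64_t addr;
    uint32_t length;
};

/* One planned RDMA read with a single-element sink. */
struct sketch_read {
    uint64_t local_addr;
    uint64_t remote_addr;
    uint32_t length;
};

/* Plan one read per local segment from a contiguous remote region that
 * starts at remote_addr. Returns the number of reads written to out,
 * or 0 if out cannot hold them all. */
size_t split_read_sink(const struct sketch_seg *segs, size_t nsegs,
                       uint64_t remote_addr,
                       struct sketch_read *out, size_t out_cap)
{
    uint64_t offset = 0;
    size_t i;

    if (nsegs > out_cap)
        return 0;

    for (i = 0; i < nsegs; i++) {
        out[i].local_addr = segs[i].addr;
        out[i].remote_addr = remote_addr + offset;
        out[i].length = segs[i].length;
        offset += segs[i].length;  /* next read continues where this ends */
    }
    return nsegs;
}
```

For example, two local segments of 100 and 50 bytes reading from remote address 0x9000 would be planned as one read from 0x9000 and one from 0x9000 + 100. On IB the same transfer could instead be a single read with a two-element list.
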
> * Send completions provide a slightly different guarantee:
> * iWARP: indicates that the resources in the
> corresponding WR can be reused; it does ''not'' indicate that
> the data is in the peer's memory, or even that it has been
> transmitted yet.
> * IB: indicates that the data has been transmitted and
> has arrived at the remote HCA (but is not necessarily in the
> remote target buffer yet).
>
> * No currently available RNIC (as of May 2008) supports
> RNR retry. Specifically: current RNICs will terminate a QP
> connection if a SEND arrives with no corresponding pre-posted receive.
>
> --
> Jeff Squyres
> Cisco Systems