[ofa-general] Verbs: IB vs. iWARP
Jeff Squyres
jsquyres at cisco.com
Thu May 8 14:28:46 PDT 2008
Over the past 24 hours, we assembled a list of differences between IB
and iWARP usage of verbs. I got a few comments on the text we
assembled, and figured it was time to turn this text over to
OpenFabrics to make it fully correct/complete/whatever, and then
publish it however you see fit.
I hope this starter text is helpful to you; enjoy.
-----
* struct ibv_device.transport_type will be IBV_TRANSPORT_IWARP for
iWARP devices and IBV_TRANSPORT_IB for IB devices.
* ibv_query_gid():
  * When invoked on an IB HCA, will return the IB subnet prefix in
    subnet_prefix and GUID of the port in the interface_id.
  * When invoked on an iWARP NIC, will return the NIC's MAC address
    in subnet_prefix and 0 in the interface_id.
* iWARP QPs ''must'' be made with the RDMA CM; IB QPs can be made
using the IB CM, RDMA CM, or some other (presumably out-of-band)
mechanism.
* When making QPs, some versions of iWARP drivers require the
initiator of the connection to send the first message (if the
non-initiator sends the first message, the connection is terminated).
Newer versions of iWARP firmware/drivers handle this requirement
inside the driver, so the ULP doesn't have to ensure that the
initiator sends the first message.
* When terminating connections via the RDMA CM (via the
rdma_disconnect() call or by simply destroying the QP without
disconnecting first), iWARP transports will automatically create a CQE
for any pending send or receive WRs with the status set to
IBV_WC_WR_FLUSH_ERR. Note that IB HCAs do the same thing, but the
iWARP RDMA CM disconnection progresses independently of the ULP,
meaning that when one side issues the disconnect, the other side will
automatically be disconnected (even if the ULP doesn't realize it).
IB HCAs may not process the disconnect until later (via RDMA CM or
otherwise), perhaps not until the ULP realizes that the disconnect has
occurred. In short: device-independent verbs-based applications need
to be able to handle flushed WRs (CQEs with status
IBV_WC_WR_FLUSH_ERR) during disconnection and not treat them as an
error.
* LIDs are always 0 in iWARP.
* LMC is always 0 for iWARP.
* Memory regions used to receive RDMA read responses must have
"remote write" permission (since in the iWARP protocol, RDMA read
responses are basically the same as incoming RDMA write requests).
* Atomics and immediate data are not available in iWARP.
* The sink scatter-gather list for an RDMA read can only have one
element for iWARP (which is reported accurately in struct
ibv_device_attr.max_sge, as returned by ibv_query_device()).
* Send completions provide a slightly different guarantee:
  * iWARP: indicates that the resources in the corresponding WR can
    be reused; it does ''not'' indicate that the data is in the peer's
    memory, or even that it has been transmitted yet.
  * IB: indicates that the data has been transmitted and has arrived
    at the remote HCA (but is not necessarily in the remote target
    buffer yet).
* No currently available RNIC (as of May 2008) supports RNR retry.
Specifically: current RNICs will terminate a QP connection if a SEND
arrives with no corresponding pre-posted receive.
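
As a rough illustration of two of the points above (the "remote write"
requirement for RDMA read sink buffers, and flushed WRs during
disconnection), here is a minimal sketch in C. The helper names
read_sink_access() and wc_is_real_error() are made up for this example
and are not part of libibverbs. So that the sketch compiles even where
libibverbs is not installed, it falls back to a few stand-in
definitions that are assumed to mirror the names and values in
<infiniband/verbs.h>; real code should simply include that header.

```c
#include <stdbool.h>

/* Use the real libibverbs definitions when the header is present.
 * Otherwise fall back to stand-ins (an assumption of this sketch)
 * that mirror the names and values in <infiniband/verbs.h>. */
#if defined(__has_include)
#  if __has_include(<infiniband/verbs.h>)
#    include <infiniband/verbs.h>
#    define SKETCH_HAVE_VERBS 1
#  endif
#endif
#ifndef SKETCH_HAVE_VERBS
enum ibv_transport_type { IBV_TRANSPORT_IB = 0, IBV_TRANSPORT_IWARP = 1 };
enum ibv_wc_status { IBV_WC_SUCCESS = 0, IBV_WC_WR_FLUSH_ERR = 5 };
enum ibv_access_flags {
    IBV_ACCESS_LOCAL_WRITE  = 1,
    IBV_ACCESS_REMOTE_WRITE = 1 << 1
};
#endif

/* Hypothetical helper: access flags for a local buffer that will be
 * the sink of an RDMA read.  On iWARP the buffer also needs "remote
 * write", because read responses arrive much like RDMA writes. */
int read_sink_access(enum ibv_transport_type transport)
{
    int access = IBV_ACCESS_LOCAL_WRITE;

    if (transport == IBV_TRANSPORT_IWARP)
        access |= IBV_ACCESS_REMOTE_WRITE;
    return access;
}

/* Hypothetical helper: decide whether a completion status is a real
 * failure.  A flush error seen while the ULP knows the connection is
 * being torn down is expected on both IB and iWARP, so it is not
 * reported as an error. */
bool wc_is_real_error(enum ibv_wc_status status, bool disconnecting)
{
    if (status == IBV_WC_SUCCESS)
        return false;
    if (status == IBV_WC_WR_FLUSH_ERR && disconnecting)
        return false;
    return true;
}
```

In a real ULP, the transport would presumably come from
ctx->device->transport_type, the flags would be passed as the access
argument to ibv_reg_mr(), and the status would be the status field of
a struct ibv_wc filled in by ibv_poll_cq(). How the ULP tracks the
"disconnecting" state is left open here.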
--
Jeff Squyres
Cisco Systems