[ofw][ipoib] Connected Mode changes for review.
Tzachi Dar
tzachid at mellanox.co.il
Fri Oct 17 07:30:29 PDT 2008
How does the code handle multicast messages that are bigger than the UD
MTU?
Thanks
Tzachi
> -----Original Message-----
> From: ofw-bounces at lists.openfabrics.org
> [mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Alex Estrin
> Sent: Friday, October 17, 2008 3:29 PM
> To: ofw at lists.openfabrics.org
> Subject: [ofw][ipoib] Connected Mode changes for review.
>
> Hello,
>
> Here are the changes introducing ipoib connected mode
> implementation (as per RFC 4755).
> Source code that is quite close to these changes can be
> pulled from
> svn://openib.tc.cornell.edu/gen1/branches/ipoib_cm to your
> working copy ULP folder along with regular ipoib. Something
> like this will do:
> svn co svn://openib.tc.cornell.edu/gen1/branches/ipoib_cm
> <your-sandbox-trunk>\ulp\ipoib_cm
> Build is straightforward. Code was tested on Windows Server 2003 x86 and
> x64 against Linux OFED 1.3.1.
> Some implementation points worth mentioning:
>
> Connection:
> Listening CEP associated with local endpoint.
> Connection established per endpoint. On the CM active side, QP
> creation/destruction is offloaded to a system thread (PASSIVE_LEVEL).
> Conn REQ is sent along with the unicast ARP reply to the
> endpoint if that endpoint reports its CM capabilities in the ARP request.
> Host will also accept conn REQ from the same endpoint.
> As a result one or two RC QPs will be created per connection
> (to match Linux behavior).
> Both QPs are bidirectional, so the first
> established connection will be used for transmission.
> Max payload MTU is set to 65520 (to match Linux ipoib
> cm MTU size).
> Since there is no HW support for RC QP yet, checksum
> offload flags are forced to: send - disabled, receive - bypass.
>
> Send path:
> So far it implements a simple rule: if the endpoint is
> connected, all unicasts go through the RC QP; everything else goes
> through the UD QP (please see the minor issues note below).
>
> Recv path:
> endpoint recv queue attached to SRQ.
> SRQ is created per port; SRQ queue size is calculated
> using data from the CA attributes query (might need to come up
> with a more scalable value).
> introduces new descriptor type that extends layout of
> UD receive descriptor.
> implemented a simple ARP filter (probably should go away;
> please see the minor issues note)
> reused existing filter for UDP/DHCP packets.
>
> Common code changes:
> added new file ipoib_cm.c (most IB CM related code was
> put there).
> fixed and put to use ipoib hw addr fields manipulation routines
> (ipoib_xfr_mgr.h)
> cm recv buffer management implemented in ipoib_endpoint.c
> recv statistics update was optimized a bit for CM path.
> most global definitions moved to ipoib_driver.h.
> CA attributes query packaged into a function; the result is
> saved for the life of the port.
> added MiniportCancelSendPackets routine.
> added ErrorLog messages for success/failed CM initialization.
> reduced some debug print noise by moving statistics OIDs
> and a few others to a higher debug level.
> some minor code formatting, while trying to maintain
> a consistent project coding style.
>
> Known major issues:
> code hasn't been tested yet to work properly along with LSO,
> so far CM is forced to stay disabled if LSO is enabled.
> code hasn't been tested with the "avoid the CM" patch yet.
> Just figured out it is possible to receive a DREQ from Linux
> (on its ARP entry expiration) while the Windows host ARP entry
> is still there. So if an application sends Linux a message
> larger than the UD payload MTU, Windows won't reconnect (the first
> packet is not ARP) and will try to send the large message through
> the UD QP, which obviously fails. This is not the case for Windows
> hosts though, where the connection stays alive until the endpoint goes away.
>
> minor issues:
> SID is misformatted (IETF bit) to match Linux implementation.
> Linux PR is open.
> ARP reply (as well as neighbor discovery) goes through the RC QP.
> (The RFC says it must go through UD. I'm not sure, though,
> what the reason behind this requirement was.)
>
> Please review.
>
> Thanks,
> Alex.
>
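
The send-path rule quoted above (connected unicasts go through the RC QP, everything else through the UD QP) and the DREQ issue listed under the known major issues can be illustrated with a minimal C sketch. This is not code from the ipoib_cm branch: every identifier below (select_tx_path, endpoint_state, TX_PATH_*) is hypothetical. The 65520 limit is the CM payload MTU from the post; the 2044-byte UD payload MTU is an assumption (a 2048-byte IB MTU minus the 4-byte IPoIB header). TX_PATH_DROP is only a way to make the failing case visible; per the post, the patch itself attempts the UD send, which then fails.

```c
#include <stdbool.h>
#include <stddef.h>

/* All names are hypothetical; none are taken from the ipoib_cm sources. */
enum tx_path { TX_PATH_UD, TX_PATH_RC, TX_PATH_DROP };

#define UD_PAYLOAD_MTU 2044u   /* assumption: 2048-byte IB MTU minus 4-byte IPoIB header */
#define CM_PAYLOAD_MTU 65520u  /* connected-mode payload MTU stated in the post */

/* Minimal stand-in for per-endpoint connection state. */
struct endpoint_state {
    bool is_connected;         /* true while an RC connection is established */
};

/*
 * Pick a transmit path for one packet:
 *   - connected endpoint + unicast  -> RC QP
 *   - everything else               -> UD QP, if the payload fits the UD MTU
 * A payload that fits neither path is reported as TX_PATH_DROP.
 */
enum tx_path select_tx_path(const struct endpoint_state *ep,
                            bool is_unicast, size_t payload_len)
{
    if (payload_len > CM_PAYLOAD_MTU)
        return TX_PATH_DROP;            /* too large for either path */

    if (is_unicast && ep != NULL && ep->is_connected)
        return TX_PATH_RC;              /* rule from the post */

    if (payload_len > UD_PAYLOAD_MTU)
        return TX_PATH_DROP;            /* the case hit after a Linux DREQ */

    return TX_PATH_UD;
}
```

With a disconnected endpoint and a 60000-byte unicast payload, this sketch returns TX_PATH_DROP, which is the situation described after Linux sends a DREQ while the Windows ARP entry is still valid. A multicast payload above the UD payload MTU lands in the same branch, since multicasts never take the RC path under the quoted rule.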