[openib-general] IBM eHCA testing..

Roland Dreier rolandd at cisco.com
Mon Oct 10 14:44:21 PDT 2005


    IBMEHCA> So you need some kind of signal from the operating system
    IBMEHCA> to system firmware, which in the eHCA case is the
    IBMEHCA> H_DEFINE_AQP1 triggered by ib_create_qp with IB_QPT_GSI
    IBMEHCA> parameter.  AFTER that call handshaking between system
    IBMEHCA> firmware and the SM will start, here's a new adapter
    IBMEHCA> active on a switch port... what's your guid? here's your
    IBMEHCA> LID, p_key, SM lid....  ...and after all that it's
    IBMEHCA> possible to send and receive packets from the fabric.
    IBMEHCA> The openib stack expects that a port is fully functional
    IBMEHCA> after this create_qp returns, and starts to do all sorts
    IBMEHCA> of modify QP and post send.  So the only choice we have
    IBMEHCA> there is to delay create_qp until the complete
    IBMEHCA> handshaking between system firmware and the SM has
    IBMEHCA> finished (until we see an IB_PORT_ACTIVE in hcad_mod). If
    IBMEHCA> we don't see that until EHCA_PORT_ACTIVE_TIMEOUT we have
    IBMEHCA> to return an error code to openib, otherwise we're
    IBMEHCA> seriously in trouble (tried that).

I think this scheme breaks the IB model.  When consumers get access to
an HCA, they expect to be able to use it even if an SM has not
configured it (and even if no cable is connected).  As an example of
why this is useful: if the link won't come up, it's nice to be able to
query the port's PMA counters to see whether there are excessive
errors or something like that.

I understand that you don't want to have all HCAs always visible to
the SM, but the scheme you've chosen creates an unneeded dependency
between driver initialization and the external SM.  It would be fine
if creating QP1 triggered the transition of the port from DOWN to INIT
so that it is discoverable by the SM, but there's no reason for the
creation of QP1 to block until the SM has brought the port up.
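
To make the ordering concrete, here is a rough user-space sketch of
what I have in mind.  It is not eHCA or openib code: every name in it
(toy_port, toy_create_qp1, toy_query_counters, toy_sm_sweep) is made up
for illustration, and the whole firmware/SM handshake is reduced to a
single function call.  It only shows that QP1 creation can return
immediately while the port becomes ACTIVE later.

```c
/* Hypothetical port states, loosely modelled on the IB port states
 * mentioned above (DOWN, INIT, ACTIVE). */
enum toy_port_state { TOY_PORT_DOWN, TOY_PORT_INIT, TOY_PORT_ACTIVE };

/* Hypothetical per-port bookkeeping; not a real driver structure. */
struct toy_port {
    enum toy_port_state state;
    int qp1_created;          /* nonzero once QP1 exists */
    unsigned lid;             /* 0 until an SM assigns one */
    unsigned long rx_errors;  /* stand-in for a PMA counter */
};

/* Creating QP1 only makes the port discoverable (DOWN -> INIT).
 * It returns right away and never waits for an SM.
 * Returns 0 on success, -1 if QP1 already exists. */
static int toy_create_qp1(struct toy_port *p)
{
    if (p->qp1_created)
        return -1;
    p->qp1_created = 1;
    if (p->state == TOY_PORT_DOWN)
        p->state = TOY_PORT_INIT;
    return 0;
}

/* Local queries work in any port state, including DOWN. */
static unsigned long toy_query_counters(const struct toy_port *p)
{
    return p->rx_errors;
}

/* Stand-in for the SM finding the port at some later time and
 * bringing it up.  It can only act on a port that is in INIT.
 * Returns 0 on success, -1 if the port is not discoverable. */
static int toy_sm_sweep(struct toy_port *p, unsigned lid)
{
    if (p->state != TOY_PORT_INIT)
        return -1;
    p->lid = lid;
    p->state = TOY_PORT_ACTIVE;
    return 0;
}
```

In this sketch toy_create_qp1() leaves the port in INIT and returns,
toy_query_counters() is usable before and after that call, and the port
only goes ACTIVE whenever toy_sm_sweep() happens to run.  A real driver
would presumably report that last step to consumers as an asynchronous
port event rather than by blocking inside QP creation.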

(As a side note, Mellanox HCAs don't bring a port to INIT until the
host driver has transitioned QP0 to the RTR state, which seems more
sensible than waiting for QP1 to be created.)

I hope this can be fixed in firmware with your current HCA hardware.

 - R.


