[ofw] A problem in the ai_ioc_pnp.c file

Fab Tillier ftillier at microsoft.com
Tue Feb 8 11:46:19 PST 2011


Hi Tzachi,

You're right that if the SA query for node records completes before the SA query for path records then __process_query will be called twice with disastrous effects.  The fix here is to take an extra reference on query_cnt while the queries are being initiated (for the duration of __ioc_query_sa.)  Note that when you decrement this extra reference in __ioc_query_sa, you will need to call __process_query if you released the last reference.

The same logic needs to be applied in other functions (__query_iou, __query_ioc_profile, and __query_svc_entries) to prevent the reference count from bouncing, though that's less likely to occur (the logic that sends all the DM MADs.)  Note that some of the error handling looks broken (__ioc_pnp_send_cb of IO_UNIT_INFO timeout is one where it just returns without decrementing the count - it should probably just 'break' instead.)

The ref counting looks right other than that.  The behavior should be straight forward in any case: take a reference before sending out a MAD (whether via ib_query_sa or by explicitly sending a DM MAD.)  When that MAD completes, release the reference.  You can either use locks to serialize initiating requests and their callbacks, or you can add an extra reference when initiating to prevent the count from bouncing around 0 and minimize serialization.  The code attempts to do the latter, but doesn't account for fast responses.

-Fab

From: Tzachi Dar [mailto:tzachid at mellanox.co.il]
Sent: Tuesday, February 08, 2011 9:41 AM
To: Fab Tillier; ofw at lists.openfabrics.org
Subject: A problem in the ai_ioc_pnp.c file

Hello Fab,
Recently, we got the following assert  at __ioc_async_cb:            CL_ASSERT( !gp_ioc_pnp->query_cnt ); (==-1)
We suspect the problem appeared because of:

1.       From __ioc_pnp_timer_cb(
                /* Pre-charge the ref count so that we don't toggle between 0 and 1. */
                cl_atomic_inc( &p_mgr->query_cnt );
                /* Take a reference on the object for the duration of the sweep process. */
                ref_al_obj( &p_mgr->obj );
                for( p_item = cl_qlist_head( &p_mgr->obj.obj_list );
                                p_item != cl_qlist_end( &p_mgr->obj.obj_list );
                                p_item = cl_qlist_next( p_item ) )
                {
                                p_svc = PARENT_STRUCT( PARENT_STRUCT( p_item, al_obj_t, pool_item ),
                                                ioc_pnp_svc_t, obj );
                                cl_atomic_inc( &p_mgr->query_cnt );
                                status = __ioc_query_sa( p_svc );
                                if( status != IB_SUCCESS )
                                                cl_atomic_dec( &p_mgr->query_cnt );
                }
                /* Release the reference we took and see if we're done sweeping. */
                if( !cl_atomic_dec( &p_mgr->query_cnt ) )
                                cl_async_proc_queue( gp_async_pnp_mgr, &p_mgr->async_item )

2.       Here one assumes that either one of 2 situations is true:

·         __ioc_query_sa - returns success and later the reference will be removed by the appropriate callback exactly one time

·         It fails and the reference will be removed immediately
Occasionally, this is not the case

3.       __ioc_query_sa calls twice to ib_query: for Node and Path queries. Thus, 2 callbacks will be invoked thereafter: __node_rec_cb and __path_rec_cb

4.       Each of these callback may (and will) call to __process_query()

5.       If process_query will be called twice, that query_cnt will be decreased twice and we are in trouble.

6.       Theoretically, there is another counter that should avoid this case.

>From __ioc_query_sa():

........

cl_atomic_inc( &p_svc->query_cnt );<---Raise reference to ensure that __process_query will be called only once

status = ib_query( gh_al, &query, &p_svc->h_node_query );

        if( status != IB_SUCCESS )

        {

                        cl_atomic_dec( &p_svc->query_cnt );

                        deref_al_obj( &p_svc->obj );

                        AL_PRINT_EXIT( TRACE_LEVEL_WARNING, AL_DBG_PNP,

                                        ("ib_query returned %s\n", ib_get_err_str( status )) );

                        return status;

        }

..............

        /* Reference the service for the node record query. */

        ref_al_obj( &p_svc->obj );

        cl_atomic_inc( &p_svc->query_cnt );



        status = ib_query( gh_al, &query, &p_svc->h_path_query );

        if( status != IB_SUCCESS )

        {

                        cl_atomic_dec( &p_svc->query_cnt );

                        deref_al_obj( &p_svc->obj );

                        AL_PRINT_EXIT( TRACE_LEVEL_WARNING, AL_DBG_PNP,

                                        ("ib_query returned %s\n", ib_get_err_str( status )) );

        }

>From __path_rec_cb and node_rec_cb:



if( !cl_atomic_dec( &p_svc->query_cnt ) ) //should avoid calling __process_query twice

        {

                        /* The node query has already completed.  Process the results. */

                        __process_query( p_svc );

        }



The issues that we see with the following situation are this:

1)       When __ioc_query_sa is running, the first query will be executed and it's call back will be called before the second query starts. In this case, the __node_rec_cb will decrease the p_svc->query_cnt and bring it to 0. Thus, __process_query will be called twice !

2)       When looking at the function __process_query we expect it to call cl_atomic_dec( &gp_ioc_pnp->query_cnt)exactly one time. However, if the code executes till the end of the function, than we reach a case statement in which in the case of status == success the reference will not be removed.

3)       p_svc->query_cnt seems to be like a reference count that is protecting gp_ioc_pnp->query_cnt. However this only works if p_svc->query_cnt was 0 when __ioc_pnp_timer_cb was called and no one has touched it on the entire time. However, we see that p_svc->query_cnt is being touched in many places in the code, and we don't seem to understand why this assumptions are right.

4)       What is the consequences of calling __process_query more than once (this is an issue that we believe that might currently happen in the code).

We are still trying to understand the assumptions that this code is based on.

Thanks
Tzachi & Alexander (XaleX) Naslednikov

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/ofw/attachments/20110208/20ed4be4/attachment.html>


More information about the ofw mailing list