<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">

<HTML>

<HEAD>

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=US-ASCII">

<META NAME="Generator" CONTENT="MS Exchange Server version 5.5.2654.45">

<TITLE>RE: Reference count as a solution to the problem of an object life time</TITLE>

</HEAD>

<BODY>


<P><FONT SIZE=2>Hi Fab,</FONT>

</P>


<P><FONT SIZE=2>It seems the only point where I don't follow you is why a "destroy complete" callback is needed. With the method I described there is no need for one: I just call destroy and forget about the object. I haven't yet figured out why the destroy callback is needed in your approach.</FONT></P>


<P><FONT SIZE=2>A few more small comments inline.</FONT>

</P>


<P><FONT SIZE=2>>-----Original Message-----</FONT>

<BR><FONT SIZE=2>>From: Fab Tillier [<A HREF="mailto:ftillier@silverstorm.com">mailto:ftillier@silverstorm.com</A>]</FONT>

<BR><FONT SIZE=2>>Sent: Friday, September 16, 2005 8:07 PM</FONT>

<BR><FONT SIZE=2>>To: 'Tzachi Dar'</FONT>

<BR><FONT SIZE=2>>Subject: RE: Reference count as a solution to the problem of an object life</FONT>

<BR><FONT SIZE=2>>time</FONT>

<BR><FONT SIZE=2>></FONT>

<BR><FONT SIZE=2>>> From: Tzachi Dar [<A HREF="mailto:tzachid@mellanox.co.il">mailto:tzachid@mellanox.co.il</A>]</FONT>

<BR><FONT SIZE=2>>> Sent: Friday, September 16, 2005 6:51 AM</FONT>

<BR><FONT SIZE=2>>></FONT>

<BR><FONT SIZE=2>>> The way that I would like to see, is an improvement of one of the</FONT>

<BR><FONT SIZE=2>>suggestions</FONT>

<BR><FONT SIZE=2>>> that you have mentioned. I'm adding it below.</FONT>

<BR><FONT SIZE=2>>></FONT>

<BR><FONT SIZE=2>>> "Yet another solution is to have callback driven objects take an</FONT>

<BR><FONT SIZE=2>>interface</FONT>

<BR><FONT SIZE=2>>> representing the client's context, providing the object with the ability</FONT>

<BR><FONT SIZE=2>>to</FONT>

<BR><FONT SIZE=2>>> take and release references on the client's context.  This would allow a</FONT>

<BR><FONT SIZE=2>>CQ,</FONT>

<BR><FONT SIZE=2>>> for example, to hold a reference on its user's context as long as a</FONT>

<BR><FONT SIZE=2>>callback</FONT>

<BR><FONT SIZE=2>>> could be outstanding.  This isn't much different than the current</FONT>

<BR><FONT SIZE=2>>> implementation, however, aside from providing the ability to take</FONT>

<BR><FONT SIZE=2>>additional</FONT>

<BR><FONT SIZE=2>>> references on the client's context."</FONT>

<BR><FONT SIZE=2>>></FONT>

<BR><FONT SIZE=2>>> The change that I would like to see, is not "providing the object with</FONT>

<BR><FONT SIZE=2>>the</FONT>

<BR><FONT SIZE=2>>> ability to take reference" but rather *forcing* the object to take the</FONT>

<BR><FONT SIZE=2>>> reference.</FONT>

<BR><FONT SIZE=2>></FONT>

<BR><FONT SIZE=2>>Just like the access layer must take a reference on an object before</FONT>

<BR><FONT SIZE=2>>returning</FONT>

<BR><FONT SIZE=2>>that object to a user, the user must take a reference on its context before</FONT>

<BR><FONT SIZE=2>>supplying that context to an object.  Putting code in the callee to take</FONT>

<BR><FONT SIZE=2>>the</FONT>

<BR><FONT SIZE=2>>reference leaves a timing window that shouldn't be there.  It is up to the</FONT>

<BR><FONT SIZE=2>>provider of the object to take the reference on its object.  This is</FONT>

<BR><FONT SIZE=2>>similar to</FONT>

<BR><FONT SIZE=2>>how the IRP_MN_QUERY_INTERFACE process works, where the provider of an</FONT>

<BR><FONT SIZE=2>>interface</FONT>

<BR><FONT SIZE=2>>takes a reference on that interface on behalf of the caller.</FONT>

<BR><FONT SIZE=2>></FONT>

<BR><FONT SIZE=2>There is a difference between "returning an object" and "supplying an object". When I return an object, I might not want to hold it any longer; if I drop my reference, it might be the last one, and the object might be destroyed. When I supply an object to someone (by calling their function with a pointer to the object), I must already hold at least one reference, and I cannot drop it until the function returns, so there is no "window" in which the count can fall to zero.</FONT></P>
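<P><FONT SIZE=2>A minimal sketch of the "supplying" case, assuming an intrusive reference count. The names Context, CallbackSource, Attach and Detach are hypothetical and are not taken from IBAL; the point is only that the callee can safely take its own reference inside the call, because the caller still holds one.</FONT></P>

<PRE>
```cpp
#include <atomic>
#include <cassert>

// Hypothetical client context with an intrusive reference count.
class Context {
public:
    void AddRef() { refs_.fetch_add(1, std::memory_order_relaxed); }

    // Returns true when this call dropped the last reference.
    bool Release() {
        if (refs_.fetch_sub(1, std::memory_order_acq_rel) == 1) {
            delete this;
            return true;
        }
        return false;
    }

    long RefCount() const { return refs_.load(); }

private:
    ~Context() = default;        // destroyed only through Release()
    std::atomic<long> refs_{1};  // the creator owns the first reference
};

// Hypothetical callback-driven object (a CQ or a timer, for example).
class CallbackSource {
public:
    // The caller holds a reference for the whole call, so the count
    // cannot reach zero before AddRef() runs here.
    void Attach(Context* ctx) {
        ctx->AddRef();
        ctx_ = ctx;
    }

    // Drops the reference once no callback can be outstanding.
    bool Detach() {
        Context* ctx = ctx_;
        ctx_ = nullptr;
        return ctx ? ctx->Release() : false;
    }

private:
    Context* ctx_ = nullptr;
};
```
</PRE>

<P><FONT SIZE=2>With this shape the caller can release its own reference right after Attach() returns and forget the context; the object is destroyed only when the callback source lets go of it.</FONT></P>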

<BR>


<P><FONT SIZE=2>>So with the destroy callback we currently have (which can be used to</FONT>

<BR><FONT SIZE=2>>dereference</FONT>

<BR><FONT SIZE=2>>the object), I believe we have exactly the behavior you are looking for.</FONT>

<BR><FONT SIZE=2>>The</FONT>

<BR><FONT SIZE=2>>only difference is when the destroy callback is supplied - in the destroy</FONT>

<BR><FONT SIZE=2>>function rather than at creation time.</FONT>

<BR><FONT SIZE=2>></FONT>

<BR><FONT SIZE=2>As I said, I don't understand why I need a destroy callback and, more than that, why I should wait for one.</FONT>

</P>


<P><FONT SIZE=2>>> As my original mail said, everyone who is holding a pointer to the</FONT>

<BR><FONT SIZE=2>>> object and would like to use it *must* increase the reference count.</FONT>

<BR><FONT SIZE=2>></FONT>

<BR><FONT SIZE=2>>The reference count must be increased before the object pointer is given to</FONT>

<BR><FONT SIZE=2>>anyone.  The user therefore doesn't need to increase the reference count at</FONT>

<BR><FONT SIZE=2>>all.</FONT>

<BR><FONT SIZE=2>></FONT>

<BR><FONT SIZE=2>>> There are</FONT>

<BR><FONT SIZE=2>>> other two things that should also apply (those are probably obvious but I</FONT>

<BR><FONT SIZE=2>>> would like to make sure that we have a common language): 1) Each object</FONT>

<BR><FONT SIZE=2>>should</FONT>

<BR><FONT SIZE=2>>> have a Shutdown() function (probably something like a destroy today) 2)</FONT>

<BR><FONT SIZE=2>>Each</FONT>

<BR><FONT SIZE=2>>> object must have a lock. One of the things that shutdown will do will be</FONT>

<BR><FONT SIZE=2>>to</FONT>

<BR><FONT SIZE=2>>> flag that the object is in shutdown state. So every thing that will be</FONT>

<BR><FONT SIZE=2>>done on</FONT>

<BR><FONT SIZE=2>>> the object will be: 1) take the lock (nothing new). 2) Check that the</FONT>

<BR><FONT SIZE=2>>object</FONT>

<BR><FONT SIZE=2>>> wasn't destroyed yet (again I believe that this is already the case. I'm</FONT>

<BR><FONT SIZE=2>>not</FONT>

<BR><FONT SIZE=2>>> sure although that this is done when the lock is taken - and if not this</FONT>

<BR><FONT SIZE=2>>is a</FONT>

<BR><FONT SIZE=2>>> mistake.) The last thing that each object should have is a function that</FONT>

<BR><FONT SIZE=2>>will</FONT>

<BR><FONT SIZE=2>>> do the entire cleaning, once the reference count reaches 0 ( in the C++</FONT>

<BR><FONT SIZE=2>>> methodology, this is the destructor).</FONT>

<BR><FONT SIZE=2>></FONT>

<BR><FONT SIZE=2>>I don't think you need to check the state all the time.  If you have a</FONT>

<BR><FONT SIZE=2>>reference</FONT>

<BR><FONT SIZE=2>>on the object, then it is safe to manipulate it.  The functions</FONT>

<BR><FONT SIZE=2>>manipulating the</FONT>

<BR><FONT SIZE=2>>object can check the state if appropriate.</FONT>

<BR><FONT SIZE=2>></FONT>

<BR><FONT SIZE=2>>For example, there is no need to check the state for calls to ref_al_obj</FONT>

<BR><FONT SIZE=2>>and</FONT>

<BR><FONT SIZE=2>>deref_al_obj.  Assertions here are sufficient - anyone calling ref_al_obj</FONT>

<BR><FONT SIZE=2>>*must*</FONT>

<BR><FONT SIZE=2>>already be holding a reference, otherwise the object pointer could be</FONT>

<BR><FONT SIZE=2>>stale.</FONT>

<BR><FONT SIZE=2>></FONT>

<BR><FONT SIZE=2>I agree that ref_al_obj and deref_al_obj need no check for whether the object has been destroyed, but all other calls probably do. That is because the model I have in mind allows anyone who holds a reference to the object to call destroy when needed (on error, on shutdown, or for any other reason).</FONT></P>
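<P><FONT SIZE=2>A rough sketch of what such a check could look like, assuming one lock per object. The Connection class, its Status enum and its Send() operation are hypothetical and only illustrate the pattern: take the lock, check the shutdown flag, and only then touch the object.</FONT></P>

<PRE>
```cpp
#include <cassert>
#include <mutex>

// Hypothetical object whose operations check a shutdown flag under its lock.
class Connection {
public:
    enum class Status { Ok, ShutDown };

    // Any reference holder may call this, at any time and more than once.
    void Shutdown() {
        std::lock_guard<std::mutex> guard(lock_);
        shutdown_ = true;
    }

    // A normal operation: take the lock, then check the state first.
    Status Send(int bytes) {
        std::lock_guard<std::mutex> guard(lock_);
        if (shutdown_) return Status::ShutDown;
        bytes_sent_ += bytes;
        return Status::Ok;
    }

    long BytesSent() {
        std::lock_guard<std::mutex> guard(lock_);
        return bytes_sent_;
    }

private:
    std::mutex lock_;
    bool shutdown_ = false;
    long bytes_sent_ = 0;
};
```
</PRE>

<P><FONT SIZE=2>Because the flag is read under the same lock that Shutdown() takes, an operation either completes before the shutdown or sees the flag and backs out; the memory itself stays valid as long as the caller holds its reference.</FONT></P>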


<P><FONT SIZE=2>>As an optimization, we should try to use atomic operations for the state</FONT>

<BR><FONT SIZE=2>>variables, allowing the code to eliminate the locking around state checks</FONT>

<BR><FONT SIZE=2>>and</FONT>

<BR><FONT SIZE=2>>changes.  Note that not all state machines can work properly with atomics,</FONT>

<BR><FONT SIZE=2>>though, so some cases will still need locks.</FONT>

<BR><FONT SIZE=2>></FONT>

<BR><FONT SIZE=2>I agree, wherever it is possible.</FONT>

</P>


<P><FONT SIZE=2>>> Here is an example for a timer object: this is the interface that I think</FONT>

<BR><FONT SIZE=2>>that</FONT>

<BR><FONT SIZE=2>>> a timer object should have. I'm writing this in C++, as I believe that</FONT>

<BR><FONT SIZE=2>>this is</FONT>

<BR><FONT SIZE=2>>> a better language, if we agree on the concepts we can always stay in C</FONT>

<BR><FONT SIZE=2>>and</FONT>

<BR><FONT SIZE=2>>> have the same effects.</FONT>

<BR><FONT SIZE=2>>></FONT>

<BR><FONT SIZE=2>>> So here it is:</FONT>

<BR><FONT SIZE=2>>></FONT>

<BR><FONT SIZE=2>>> Enum Reason {</FONT>

<BR><FONT SIZE=2>>>         TimeArrived,</FONT>

<BR><FONT SIZE=2>>>         Cancelled,</FONT>

<BR><FONT SIZE=2>>>         Shutdown</FONT>

<BR><FONT SIZE=2>>></FONT>

<BR><FONT SIZE=2>>> } // There are three reasons why a shutdown might happen: The time</FONT>

<BR><FONT SIZE=2>>arrived,</FONT>

<BR><FONT SIZE=2>>> the operation was cancelled, or the object (driver) is shutting down.</FONT>

<BR><FONT SIZE=2>>></FONT>

<BR><FONT SIZE=2>>> Interface TimerCallback {</FONT>

<BR><FONT SIZE=2>>>         Void AddRef();</FONT>

<BR><FONT SIZE=2>>>         Void Release();</FONT>

<BR><FONT SIZE=2>>>         Void TimerAction(Reason reason);</FONT>

<BR><FONT SIZE=2>>> }</FONT>

<BR><FONT SIZE=2>>></FONT>

<BR><FONT SIZE=2>>> What I mean here is that if one wants to use the timer he should pass</FONT>

<BR><FONT SIZE=2>>three</FONT>

<BR><FONT SIZE=2>>> functions: AddRef(), Release() and TimerAction that will be called when</FONT>

<BR><FONT SIZE=2>>the</FONT>

<BR><FONT SIZE=2>>> time comes. The object itself should probably also be passed, if we stay</FONT>

<BR><FONT SIZE=2>>in C.</FONT>

<BR><FONT SIZE=2>>></FONT>

<BR><FONT SIZE=2>>> Class Timer {</FONT>

<BR><FONT SIZE=2>>>         HRESULT Init();</FONT>

<BR><FONT SIZE=2>>>         HRESULT Schedule(DWORD time, TimerCallback * pTimer);</FONT>

<BR><FONT SIZE=2>>>         HRESULT Cancel();</FONT>

<BR><FONT SIZE=2>>>         VOID Shutdown();</FONT>

<BR><FONT SIZE=2>>>         /// And very likely also addref, release</FONT>

<BR><FONT SIZE=2>>> }</FONT>

<BR><FONT SIZE=2>>></FONT>

<BR><FONT SIZE=2>>> And so if I have an object for example an sdp connection, and I want to</FONT>

<BR><FONT SIZE=2>>make</FONT>

<BR><FONT SIZE=2>>> sure that a function will be called, then all I have to do is this:</FONT>

<BR><FONT SIZE=2>>></FONT>

<BR><FONT SIZE=2>>> Connection pConnection; // This is the connection that I have</FONT>

<BR><FONT SIZE=2>>></FONT>

<BR><FONT SIZE=2>>> Timer.Schedule(50000, pConnection);</FONT>

<BR><FONT SIZE=2>>></FONT>

<BR><FONT SIZE=2>>> What will happen is this:</FONT>

<BR><FONT SIZE=2>>> 1) If I don't want to use pConnection any more I can release its</FONT>

<BR><FONT SIZE=2>>reference</FONT>

<BR><FONT SIZE=2>>> and forget it.</FONT>

<BR><FONT SIZE=2>>> 2) The call back function will always be called. If the timer is</FONT>

<BR><FONT SIZE=2>>cancelled,</FONT>

<BR><FONT SIZE=2>>> then the flag will tell that it was cancelled and the object can decide</FONT>

<BR><FONT SIZE=2>>to</FONT>

<BR><FONT SIZE=2>>> behave in a different way (for example do nothing). The same thing is</FONT>

<BR><FONT SIZE=2>>true</FONT>

<BR><FONT SIZE=2>>> about shutdown of the system.</FONT>

<BR><FONT SIZE=2>>></FONT>

<BR><FONT SIZE=2>>> What I like about this system is that one can do an Async or non Async</FONT>

<BR><FONT SIZE=2>>> operation and just forget about it. If you want an object destroyed you</FONT>

<BR><FONT SIZE=2>>can</FONT>

<BR><FONT SIZE=2>>> always do it, and you don't have to think about the influence of this on</FONT>

<BR><FONT SIZE=2>>other</FONT>

<BR><FONT SIZE=2>>> parts of the program.</FONT>

<BR><FONT SIZE=2>></FONT>

<BR><FONT SIZE=2>>Your Timer example is identical in behavior to how SA queries and other</FONT>

<BR><FONT SIZE=2>>asynchronous operations behave in IBAL today - the callback is always</FONT>

<BR><FONT SIZE=2>>invoked,</FONT>

<BR><FONT SIZE=2>>regardless of the reason.  This gives a consistent usage model for users,</FONT>

<BR><FONT SIZE=2>>rather</FONT>

<BR><FONT SIZE=2>>than forcing them to handle callback notifications for some cases, and not</FONT>

<BR><FONT SIZE=2>>for</FONT>

<BR><FONT SIZE=2>>others.  The difference lies on what object gets referenced.  For the SA</FONT>

<BR><FONT SIZE=2>>queries, a reference is taken on the AL instance.  The AL instance can</FONT>

<BR><FONT SIZE=2>>therefore</FONT>

<BR><FONT SIZE=2>>not complete destruction until the SA query callback is invoked and</FONT>

<BR><FONT SIZE=2>>unwinds.</FONT>

<BR><FONT SIZE=2>></FONT>

<BR><FONT SIZE=2>>The complib objects don't support asynchronous destruction, and I don't</FONT>

<BR><FONT SIZE=2>>know if</FONT>

<BR><FONT SIZE=2>>they should.  As we develop new drivers, I think we should use the native</FONT>

<BR><FONT SIZE=2>>Windows APIs rather than the complib abstraction.  For example, in user-</FONT>

<BR><FONT SIZE=2>>mode,</FONT>

<BR><FONT SIZE=2>>the timer objects used by the cl_timer abstraction don't support a callback</FONT>

<BR><FONT SIZE=2>>driven asynchronous model, but rather an event driven one (see</FONT>

<BR><FONT SIZE=2>>DeleteTimerQueueTimer).  Changing this object model to support callback</FONT>

<BR><FONT SIZE=2>>driven</FONT>

<BR><FONT SIZE=2>>destruction notifications will require an internal thread, and the destroy</FONT>

<BR><FONT SIZE=2>>call</FONT>

<BR><FONT SIZE=2>>to create an event that this thread waits on.  The thread would then invoke</FONT>

<BR><FONT SIZE=2>>the</FONT>

<BR><FONT SIZE=2>>callback when that event is set by the Win32 subsystem.  This adds a lot of</FONT>

<BR><FONT SIZE=2>>extra complexity that I don't think is necessary, as well as introducing</FONT>

<BR><FONT SIZE=2>>more</FONT>

<BR><FONT SIZE=2>>error paths that must be handled (what if event creation fails due to</FONT>

<BR><FONT SIZE=2>>limited</FONT>

<BR><FONT SIZE=2>>resources?)  Rather than creating more abstractions, I would like to see us</FONT>

<BR><FONT SIZE=2>>interface to the native API.</FONT>

<BR><FONT SIZE=2>></FONT>

<BR><FONT SIZE=2>I agree with this. The "sad" thing is that we are not creating new drivers but changing existing ones, so we have to face this more complicated situation.</FONT></P>
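<P><FONT SIZE=2>For reference, the timer interface quoted above could be sketched in compilable C++ roughly as follows. This is a simplified illustration and not the complib or Win32 timer: no real clock is involved, Expire() is a hypothetical stand-in for the time arriving, and CountingClient exists only to observe the behaviour. It shows the one property under discussion, that the callback is invoked exactly once whatever the reason.</FONT></P>

<PRE>
```cpp
#include <cassert>
#include <mutex>

enum class Reason { TimeArrived, Cancelled, Shutdown };

struct TimerCallback {
    virtual void AddRef() = 0;
    virtual void Release() = 0;
    virtual void TimerAction(Reason reason) = 0;
    virtual ~TimerCallback() = default;
};

class Timer {
public:
    // Takes its own reference on the callback; fails if one is pending.
    bool Schedule(TimerCallback* cb) {
        std::lock_guard<std::mutex> guard(lock_);
        if (pending_ != nullptr) return false;
        cb->AddRef();
        pending_ = cb;
        return true;
    }

    void Expire()   { Complete(Reason::TimeArrived); }  // stand-in for the timer firing
    void Cancel()   { Complete(Reason::Cancelled); }
    void Shutdown() { Complete(Reason::Shutdown); }

private:
    // Whoever claims the pending callback under the lock delivers it,
    // so it is delivered exactly once.
    void Complete(Reason reason) {
        TimerCallback* cb = nullptr;
        {
            std::lock_guard<std::mutex> guard(lock_);
            cb = pending_;
            pending_ = nullptr;
        }
        if (cb == nullptr) return;
        cb->TimerAction(reason);  // invoked outside the lock
        cb->Release();            // drop the timer's reference
    }

    std::mutex lock_;
    TimerCallback* pending_ = nullptr;
};

// Hypothetical client used only to observe the behaviour.
struct CountingClient : TimerCallback {
    int refs = 1;
    int calls = 0;
    Reason last = Reason::TimeArrived;
    void AddRef() override { ++refs; }
    void Release() override { --refs; }
    void TimerAction(Reason reason) override { ++calls; last = reason; }
};
```
</PRE>

<P><FONT SIZE=2>The client never waits for a separate destroy notification in this sketch: the timer's reference is released right after TimerAction() returns, whichever path triggered it.</FONT></P>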


<P><FONT SIZE=2>>Lastly, the callback model has a few limitations, in that the callbacks</FONT>

<BR><FONT SIZE=2>>invoke</FONT>

<BR><FONT SIZE=2>>client code.  The reason ib_close_al is currently a synchronous call is</FONT>

<BR><FONT SIZE=2>>that</FONT>

<BR><FONT SIZE=2>>callback driven completion notifications don't work properly for the last</FONT>

<BR><FONT SIZE=2>>call</FONT>

<BR><FONT SIZE=2>>the user makes (before unloading).  If the callback releases the last</FONT>

<BR><FONT SIZE=2>>reference</FONT>

<BR><FONT SIZE=2>>on the driver object, the driver object can be unloaded before the callback</FONT>

<BR><FONT SIZE=2>>unwinds (I have seen this happen).  The callback then creates an access</FONT>

<BR><FONT SIZE=2>>violation and BSOD due to referencing freed memory when it unwinds.  I plan</FONT>

<BR><FONT SIZE=2>>on</FONT>

<BR><FONT SIZE=2>>extending ib_close_al to be asynchronous.  In user-mode, this would be done</FONT>

<BR><FONT SIZE=2>>similarly to DeleteTimerQueueTimer, with the user passing in an event that</FONT>

<BR><FONT SIZE=2>>would</FONT>

<BR><FONT SIZE=2>>be signaled when the AL instance is destroyed.  In kernel-mode, ib_open_al</FONT>

<BR><FONT SIZE=2>>would</FONT>

<BR><FONT SIZE=2>>take as input parameter the client's DEVICE_OBJECT.  Because the AL</FONT>

<BR><FONT SIZE=2>>instance</FONT>

<BR><FONT SIZE=2>>itself does not perform any callbacks (only child objects), the ib_close_al</FONT>

<BR><FONT SIZE=2>>function would call ObDereferenceObject once the instance is destroyed.</FONT>

<BR><FONT SIZE=2>>Both</FONT>

<BR><FONT SIZE=2>>these solutions solve the unwind problem in the best possible manner for</FONT>

<BR><FONT SIZE=2>>their</FONT>

<BR><FONT SIZE=2>>environment.</FONT>

<BR><FONT SIZE=2>></FONT>

<BR><FONT SIZE=2>If we agree that destroy callbacks are needed only for shutdown and for nothing else, I'll send a description of a mechanism that solves this problem as well (without interfering with other code).</FONT></P>

<BR>


<P><FONT SIZE=2>>It sounds to me like we are in agreement.  Are there specific instances</FONT>

<BR><FONT SIZE=2>>where</FONT>

<BR><FONT SIZE=2>>reference counting is lacking?  Can we focus the discussion on the actual</FONT>

<BR><FONT SIZE=2>>areas</FONT>

<BR><FONT SIZE=2>>that need fixing?</FONT>

<BR><FONT SIZE=2>></FONT>

<BR><FONT SIZE=2>>Thanks,</FONT>

<BR><FONT SIZE=2>></FONT>

<BR><FONT SIZE=2>>- Fab</FONT>

</P>


</BODY>

</HTML>