<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=US-ASCII">
<META NAME="Generator" CONTENT="MS Exchange Server version 5.5.2654.45">
<TITLE>RE: Reference count as a solution to the problem of an object life time</TITLE>
</HEAD>
<BODY>
<P><FONT SIZE=2>See also my answer to Sean.</FONT>
</P>
<P><FONT SIZE=2>One more thing to notice is that I see a difference between "destroying" the object and freeing the memory that it is using.</FONT></P>
<BR>
<P><FONT SIZE=2>>-----Original Message-----</FONT>
<BR><FONT SIZE=2>>From: Fab Tillier [<A HREF="mailto:ftillier@silverstorm.com">mailto:ftillier@silverstorm.com</A>]</FONT>
<BR><FONT SIZE=2>>Sent: Tuesday, September 20, 2005 12:51 AM</FONT>
<BR><FONT SIZE=2>>To: 'Tzachi Dar'; openib-windows@openib.org</FONT>
<BR><FONT SIZE=2>>Subject: RE: Reference count as a solution to the problem of an object life</FONT>
<BR><FONT SIZE=2>>time</FONT>
<BR><FONT SIZE=2>></FONT>
<BR><FONT SIZE=2>>> From: Tzachi Dar [<A HREF="mailto:tzachid@mellanox.co.il">mailto:tzachid@mellanox.co.il</A>]</FONT>
<BR><FONT SIZE=2>>> Sent: Monday, September 19, 2005 1:45 PM</FONT>
<BR><FONT SIZE=2>>></FONT>
<BR><FONT SIZE=2>>> Hi Fab,</FONT>
<BR><FONT SIZE=2>>></FONT>
<BR><FONT SIZE=2>>> Perhaps I'm wrong about this, but there is one difference between the two</FONT>
<BR><FONT SIZE=2>>> models.</FONT>
<BR><FONT SIZE=2>>></FONT>
<BR><FONT SIZE=2>>> I believe that this difference is what forces one to wait for the destroy</FONT>
<BR><FONT SIZE=2>>> completion. This difference is when the number of call backs is not known</FONT>
<BR><FONT SIZE=2>>to</FONT>
<BR><FONT SIZE=2>>> the user. In my approach the library can play with the reference count</FONT>
<BR><FONT SIZE=2>>and</FONT>
<BR><FONT SIZE=2>>> make sure that it is increased even when there are more than one</FONT>
<BR><FONT SIZE=2>>callback,</FONT>
<BR><FONT SIZE=2>>> while on your model (one removes the reference in the callback) one can</FONT>
<BR><FONT SIZE=2>>not</FONT>
<BR><FONT SIZE=2>>> know when to remove the reference.</FONT>
<BR><FONT SIZE=2>></FONT>
<BR><FONT SIZE=2>>You do know when to remove the reference - in the destroy callback. Once</FONT>
<BR><FONT SIZE=2>>the</FONT>
<BR><FONT SIZE=2>>deref call is invoked, no further callbacks will ever occur for that</FONT>
<BR><FONT SIZE=2>>object.</FONT>
<BR><FONT SIZE=2>></FONT>
<BR><FONT SIZE=2>>It's a slight change in logic. Your model takes a reference when it needs</FONT>
<BR><FONT SIZE=2>>to</FONT>
<BR><FONT SIZE=2>>invoke a callback, and releases it after the callback returns. The IBAL</FONT>
<BR><FONT SIZE=2>>object</FONT>
<BR><FONT SIZE=2>>model holds a reference for the lifetime of the object, and releases it</FONT>
<BR><FONT SIZE=2>>when</FONT>
<BR><FONT SIZE=2>>that object is destroyed. Obviously, this doesn't support sharing objects</FONT>
<BR><FONT SIZE=2>>between multiple users well, but this hasn't been a problem yet.</FONT>
<BR><FONT SIZE=2>></FONT>
<BR><FONT SIZE=2>>If you're sharing an object, waiting until that object is destroyed does</FONT>
<BR><FONT SIZE=2>>keep a</FONT>
<BR><FONT SIZE=2>>user's context around longer than might be necessary. But objects aren't</FONT>
<BR><FONT SIZE=2>>shared</FONT>
<BR><FONT SIZE=2>>in any of the public APIs, so this isn't a problem as far as I know.</FONT>
<BR><FONT SIZE=2>This is probably not a problem now, but I want to allow a more general model in which our internal and external APIs are consistent and easier to use. Please note that we had a problem in which objects were destroyed while they were busy. In my approach there is no problem in destroying an object while it is busy.</FONT></P>
<P><FONT SIZE=2>>> A good example for this is the CM API's. If</FONT>
<BR><FONT SIZE=2>>> I understand correctly, one calls CM_REQ and doesn't know how many call</FONT>
<BR><FONT SIZE=2>>backs</FONT>
<BR><FONT SIZE=2>>> there will be. For example I have received a REP, and if I was too busy</FONT>
<BR><FONT SIZE=2>>to</FONT>
<BR><FONT SIZE=2>>> answer I have received a DREQ. As I understood it, the only way to know</FONT>
<BR><FONT SIZE=2>>that</FONT>
<BR><FONT SIZE=2>>> the last callback was sent is to wait for the completion of the destroy</FONT>
<BR><FONT SIZE=2>>of the</FONT>
<BR><FONT SIZE=2>>> QP.</FONT>
<BR><FONT SIZE=2>></FONT>
<BR><FONT SIZE=2>>The current CM design invokes the CM callbacks on a QP, and you can't</FONT>
<BR><FONT SIZE=2>>destroy</FONT>
<BR><FONT SIZE=2>>the QP until all its associated callbacks have been either delivered or</FONT>
<BR><FONT SIZE=2>>cancelled. I don't see how your model would change that. The user's QP</FONT>
<BR><FONT SIZE=2>>context</FONT>
<BR><FONT SIZE=2>>can't be freed until the access layer notifies the user that it is safe to</FONT>
<BR><FONT SIZE=2>>do</FONT>
<BR><FONT SIZE=2>>so. I don't see how your model helps this.</FONT>
<BR><FONT SIZE=2>In my model, the QP can be destroyed whenever you want. If there are CM callbacks pending, then the QP is destroyed but its memory isn't freed. Once the CM has finished with its callbacks (whether it delivered them or not), it will release the (last) reference and the memory will be freed. This is why I believe that this method is so good.</FONT></P>
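<P><FONT SIZE=2>A minimal sketch of the destroy-versus-free distinction described above. All names here (qp_obj, obj_ref, obj_deref, obj_destroy) are hypothetical and are not IBAL APIs, and the sketch is single-threaded; real code would need interlocked operations on the count.</FONT></P>

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted object; not an IBAL type. */
typedef struct qp_obj {
    int  ref_cnt;     /* plain int: this sketch is single-threaded */
    int  destroyed;   /* set by obj_destroy(); no new callbacks after this */
    int *freed_flag;  /* demo hook: set to 1 when the memory is released */
} qp_obj;

/* Create the object with one reference, owned by the user. */
static qp_obj *obj_create(int *freed_flag)
{
    qp_obj *o = malloc(sizeof(*o));
    if (o == NULL)
        return NULL;
    o->ref_cnt = 1;
    o->destroyed = 0;
    o->freed_flag = freed_flag;
    *freed_flag = 0;
    return o;
}

/* Taken by a component (e.g. the CM) that still has work on the object. */
static void obj_ref(qp_obj *o)
{
    o->ref_cnt++;
}

/* Drop one reference; the memory is freed only when the last one goes. */
static void obj_deref(qp_obj *o)
{
    if (--o->ref_cnt == 0) {
        *o->freed_flag = 1;
        free(o);
    }
}

/* "Destroy" marks the object dead and drops the user's reference.
 * It never waits: if someone else holds a reference, the memory
 * simply outlives this call. */
static void obj_destroy(qp_obj *o)
{
    o->destroyed = 1;
    obj_deref(o);
}
```

<P><FONT SIZE=2>If the CM holds a reference when obj_destroy() is called, the object is marked destroyed at once, but the memory is released only by the CM's later obj_deref().</FONT></P>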
<P><FONT SIZE=2>>I believe that in the current code, all pending (not in flight) events are</FONT>
<BR><FONT SIZE=2>>flushed as soon as you initiate destruction of the QP. If you initiate QP</FONT>
<BR><FONT SIZE=2>>destruction from the REP callback, the REJ gets flushed, and when the REP</FONT>
<BR><FONT SIZE=2>>callback unwinds, reference counts get released and the queue pair's</FONT>
<BR><FONT SIZE=2>>reference</FONT>
<BR><FONT SIZE=2>>count goes to zero and your destroy callback gets invoked.</FONT>
<BR><FONT SIZE=2>></FONT>
<BR><FONT SIZE=2>Please note that when I start the destroy of the QP I don't know its state (its state might have changed a microsecond ago), so I don't know which callbacks should happen. As a result I have to wait, and that complicates the code.</FONT></P>
<BR>
<P><FONT SIZE=2>>> A similar example is a timer object that works automatically, that is you</FONT>
<BR><FONT SIZE=2>>set</FONT>
<BR><FONT SIZE=2>>> it once and every second, you get a call back. A new callback, is started</FONT>
<BR><FONT SIZE=2>>even</FONT>
<BR><FONT SIZE=2>>> if the previous wasn't. In this model, when I want to stop things, I just</FONT>
<BR><FONT SIZE=2>>> can't know when the last call back will happen. The only way to solve</FONT>
<BR><FONT SIZE=2>>this is</FONT>
<BR><FONT SIZE=2>>> to call stop on timer, wait (event, callback or whatever) for the timer</FONT>
<BR><FONT SIZE=2>>to</FONT>
<BR><FONT SIZE=2>>> stop and then remove my reference (I want to make this clear there might</FONT>
<BR><FONT SIZE=2>>still</FONT>
<BR><FONT SIZE=2>>> be others using this object!!!).</FONT>
<BR><FONT SIZE=2>></FONT>
<BR><FONT SIZE=2>>So you want to have a timer that can invoke multiple user's callbacks?</FONT>
<BR><FONT SIZE=2>>This</FONT>
<BR><FONT SIZE=2>>introduces a new object model - currently, there is only one recipient of</FONT>
<BR><FONT SIZE=2>>callbacks. For a multi-client object, having the ability to take</FONT>
<BR><FONT SIZE=2>>references</FONT>
<BR><FONT SIZE=2>>would help let clients deregister without having to wait until all clients</FONT>
<BR><FONT SIZE=2>>have</FONT>
<BR><FONT SIZE=2>>deregistered. Shutdown/destroy/whatever wouldn't really destroy the timer,</FONT>
<BR><FONT SIZE=2>>it</FONT>
<BR><FONT SIZE=2>>would just deregister the callback, and when there are no references left,</FONT>
<BR><FONT SIZE=2>>would</FONT>
<BR><FONT SIZE=2>>implicitly destroy the timer. This isn't a timer object anymore, as it</FONT>
<BR><FONT SIZE=2>>introduces the notion of event dispatching to allow multiplexing to</FONT>
<BR><FONT SIZE=2>>multiple</FONT>
<BR><FONT SIZE=2>>clients. For this, I agree that allowing the dispatcher to take references</FONT>
<BR><FONT SIZE=2>>on</FONT>
<BR><FONT SIZE=2>>each client can be helpful.</FONT>
<BR><FONT SIZE=2>></FONT>
<BR><FONT SIZE=2>>This is only an issue for multi-client objects - single user objects don't</FONT>
<BR><FONT SIZE=2>>need</FONT>
<BR><FONT SIZE=2>>the ability to take extra references. I don't see any need for multi-</FONT>
<BR><FONT SIZE=2>>client</FONT>
<BR><FONT SIZE=2>>objects being exposed in the API - maybe if you could explain what you're</FONT>
<BR><FONT SIZE=2>>trying</FONT>
<BR><FONT SIZE=2>>to do it might make more sense to me.</FONT>
<BR><FONT SIZE=2>></FONT>
<BR><FONT SIZE=2>The problem is not multi-client objects but rather one object that has more than one callback. If you look at the CM state diagram on page 687 of the IB spec you will see that the number of callbacks possible in any state is large. As a result I don't know when I have received the last one and I have to wait (see also below).</FONT></P>
<BR>
<P><FONT SIZE=2>>> On the model that I propose, the timer will increase the reference count</FONT>
<BR><FONT SIZE=2>>> before each callback and will decrease the reference after the callback.</FONT>
<BR><FONT SIZE=2>>> As a result, after I call stop on the timer, I can safely remove my</FONT>
<BR><FONT SIZE=2>>reference,</FONT>
<BR><FONT SIZE=2>>> and I DON'T HAVE TO WAIT.</FONT>
<BR><FONT SIZE=2>></FONT>
<BR><FONT SIZE=2>>When you say you don't have to wait, do you mean wait in the</FONT>
<BR><FONT SIZE=2>>WaitForSingleObject</FONT>
<BR><FONT SIZE=2>>sense, or do you wait for the reference count to reach zero (i.e. your</FONT>
<BR><FONT SIZE=2>>object</FONT>
<BR><FONT SIZE=2>>can immediately be freed)? I think at a minimum, you must wait for the</FONT>
<BR><FONT SIZE=2>>reference count to reach zero, whether through a call to</FONT>
<BR><FONT SIZE=2>>WaitForSingleObject or</FONT>
<BR><FONT SIZE=2>>letting the dereference handler work asynchronously. If you are going to</FONT>
<BR><FONT SIZE=2>>wait</FONT>
<BR><FONT SIZE=2>>for the reference count to reach zero, this isn't any different than</FONT>
<BR><FONT SIZE=2>>waiting for</FONT>
<BR><FONT SIZE=2>>the destroy callback.</FONT>
<BR><FONT SIZE=2>I mean wait in both senses. In my model I don't have to wait for anything: the object will destroy itself once its reference count has reached zero.</FONT></P>
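<P><FONT SIZE=2>To illustrate the timer case, here is a rough sketch in which the timer takes a reference on the client context around each callback. The names (client_ctx, demo_timer, timer_fire, timer_stop, on_tick) are made up for the example, and timer_fire() stands in for the real timer expiring. It is single-threaded; in real code the stopped check and the reference would have to be taken under a lock.</FONT></P>

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical client context protected by a reference count. */
struct client_ctx {
    int  ref_cnt;
    int  ticks;       /* number of callbacks seen so far */
    int *freed_flag;  /* demo hook: set to 1 when the memory is released */
};

struct demo_timer;
typedef void (*tick_cb)(struct demo_timer *t, struct client_ctx *c);

struct demo_timer {
    struct client_ctx *ctx;
    tick_cb            cb;
    int                stopped;
};

static struct client_ctx *ctx_create(int *freed_flag)
{
    struct client_ctx *c = malloc(sizeof(*c));
    if (c == NULL)
        return NULL;
    c->ref_cnt = 1;   /* the user's reference */
    c->ticks = 0;
    c->freed_flag = freed_flag;
    *freed_flag = 0;
    return c;
}

static void ctx_ref(struct client_ctx *c) { c->ref_cnt++; }

static void ctx_deref(struct client_ctx *c)
{
    if (--c->ref_cnt == 0) {
        *c->freed_flag = 1;
        free(c);
    }
}

static void timer_start(struct demo_timer *t, struct client_ctx *c, tick_cb cb)
{
    t->ctx = c;
    t->cb = cb;
    t->stopped = 0;
}

/* After stop, no new callback is started. */
static void timer_stop(struct demo_timer *t)
{
    t->stopped = 1;
    t->ctx = NULL;
}

/* One expiry: hold a reference for the duration of the callback. */
static void timer_fire(struct demo_timer *t)
{
    struct client_ctx *c;

    if (t->stopped)
        return;
    c = t->ctx;
    ctx_ref(c);
    t->cb(t, c);
    ctx_deref(c);     /* may free the context if the user already let go */
}

/* Example callback: on the second tick the user stops the timer and
 * drops its own reference immediately, without waiting for anything. */
static void on_tick(struct demo_timer *t, struct client_ctx *c)
{
    c->ticks++;
    if (c->ticks == 2) {
        timer_stop(t);
        ctx_deref(c); /* context stays valid until timer_fire() returns */
    }
}
```

<P><FONT SIZE=2>Here the user releases its reference while a callback is still running, and the context is freed only when that callback returns and the timer drops its temporary reference.</FONT></P>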
<BR>
<P><FONT SIZE=2>>> Is there someway to solve this problem in your model?</FONT>
<BR><FONT SIZE=2>></FONT>
<BR><FONT SIZE=2>>I don't see what the problem is. Can you be more specific about what it is</FONT>
<BR><FONT SIZE=2>>you</FONT>
<BR><FONT SIZE=2>>want to change in the code - like APIs etc? Your object model can work</FONT>
<BR><FONT SIZE=2>>fine, I</FONT>
<BR><FONT SIZE=2>>just don't see why we should change the model throughout the API when there</FONT>
<BR><FONT SIZE=2>>are</FONT>
<BR><FONT SIZE=2>>more important problems to solve.</FONT>
<BR><FONT SIZE=2>As was described above and in my answer to Sean Hefty, my model makes life simpler. In particular, I don't like the fact that I have to wait on a QP destroy. More than this, the problem is probably that I don't know what to wait for; if I knew exactly how many callbacks there would be, it would probably be fine.</FONT></P>
<P><FONT SIZE=2>I believe that if we are doing a revolution in our API (and our implementation) then we should consider doing this change as well.</FONT></P>
<P><FONT SIZE=2>></FONT>
<BR><FONT SIZE=2>>- Fab</FONT>
</P>
</BODY>
</HTML>