[Openib-windows] RE: Reference count as a solution to the problem of an object life time

Tzachi Dar tzachid at mellanox.co.il
Wed Sep 21 06:33:12 PDT 2005


[nice to know that someone else is reading]

Hi Sean,

You asked, "What problem is being solved here?". I'm not sure how long an
answer you are expecting, so I'll try to answer at several different
lengths.

Shortest answer:
I want to have a model that allows one to act on objects without knowing
what other people are doing with these objects. (This is a multi-threaded
environment.) I want a model in which one can use an object, give it to
someone else and wait for some callback, destroy it, or just give it to
someone else and ***forget*** it (someone else will use it).
(Not that short after all - sorry)


A little longer:
[Before I begin: "wait" can mean blocking in WaitForSingleObject, or keeping
something around until a callback arrives.]

There is no argument that we should use a reference count to maintain the
lifetime of an object. The difference is in the model that is being used. In
the model implemented today, functions often receive the object as a
void * and therefore they can only use it a "fixed" number of times to do
the callback. Often this number is 1. Sometimes you get a callback,
and one of the parameters that is passed tells you whether there will be
another callback. An example of this is the MADs that are sent by the open-SM.
This model implies that whoever made the query either has to wait
for the reply (the normal case) or cancel it and wait for the cancel to end.
In either case, whoever started the query has to wait for a reply
before he himself can shut down.
The drawbacks of this are:
1) One cannot use a non-fixed number of callbacks (the case of a
multi-timer).
2) If there is logic that says there is more than one callback, this logic
has to be agreed on by both sides.
3) One has to wait for all of his objects to shut down before he himself can
shut down. We have seen bugs that are actually "an object being destroyed
while it is busy".
4) (Probably just another way to look at 2) When there is a bug in object
lifetime, one can't simply say where it is, since the other side's logic is
also involved.
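
To make the "void * plus a fixed number of callbacks" model concrete, here is a minimal sketch in C. Every name in it (query_cb, query_ctx, fake_query) is hypothetical and not taken from the actual stack, and the provider here delivers its replies synchronously, which a real one would not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical callback type: the provider only sees an opaque void *,
 * plus a flag telling the receiver whether another callback will follow. */
typedef void (*query_cb)(void *context, bool more_coming);

typedef struct query_ctx {
    int  replies; /* callbacks received so far */
    bool done;    /* set when the provider says no more will come */
} query_ctx;

static void on_reply(void *context, bool more_coming)
{
    /* The provider cannot keep the context alive; it has to trust that
     * the issuer has not freed it yet. */
    query_ctx *ctx = context;
    ctx->replies++;
    if (!more_coming)
        ctx->done = true;
}

/* Hypothetical provider: delivers n replies and flags the last one. */
static void fake_query(query_cb cb, void *context, int n)
{
    for (int i = 0; i < n; i++)
        cb(context, i + 1 < n);
}

/* Returns the number of replies seen, or -1 on failure. */
int run_void_ptr_example(void)
{
    query_ctx *ctx = calloc(1, sizeof *ctx);
    if (ctx == NULL)
        return -1;

    fake_query(on_reply, ctx, 3);

    /* The issuer may free ctx only after "done" is set. With an
     * asynchronous provider this is the point where it would have to
     * block, or cancel and wait for the cancel to complete. */
    int replies = ctx->done ? ctx->replies : -1;
    free(ctx);
    return replies;
}
```

Both sides have to agree on what the more_coming flag means, and the issuer cannot go away before the last callback. Those are drawbacks 2 and 3 above.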

A different approach, which I have been using for years, says something simple:
you don't pass just a VOID *, but rather you also pass pointers to 2 functions.
These are Addref() and Release(). This is the logic that is used: If one has
received an object, that object has a reference. If he wants to keep the
object, he increases the reference count. When he no longer needs the object,
he releases the reference. When the reference count of an object reaches 0,
the object is destroyed and the memory is freed. An object can
also be destroyed by a specific function. In this case the memory is not freed
until the last reference is released.
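
The rules above could be sketched in C roughly as follows. This is only an illustration under my own assumptions: the names (ref_obj, obj_create, obj_destroy, g_freed_count) are hypothetical, C11 atomics stand in for whatever interlocked primitives the real code would use, and I assume the explicit destroy only marks the object and leaves the caller's reference in place.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical object header. The two function pointers travel with the
 * object, so a receiver does not need to know its concrete type. */
typedef struct ref_obj {
    atomic_int  refs;       /* outstanding references */
    atomic_bool destroyed;  /* set by an explicit destroy */
    void (*addref)(struct ref_obj *self);
    void (*release)(struct ref_obj *self);
} ref_obj;

int g_freed_count; /* number of objects freed, for demonstration only */

static void obj_addref(ref_obj *self)
{
    atomic_fetch_add(&self->refs, 1);
}

static void obj_release(ref_obj *self)
{
    /* atomic_fetch_sub returns the previous value, so 1 means the
     * caller held the last reference. */
    int prev = atomic_fetch_sub(&self->refs, 1);
    assert(prev > 0);
    if (prev == 1) {
        g_freed_count++;
        free(self);
    }
}

/* Creates an object holding one reference, owned by the caller. */
ref_obj *obj_create(void)
{
    ref_obj *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    atomic_init(&o->refs, 1);
    atomic_init(&o->destroyed, false);
    o->addref  = obj_addref;
    o->release = obj_release;
    return o;
}

/* Explicit destroy (assumption: it only marks the object as dead).
 * The memory stays valid until the last reference is released. */
void obj_destroy(ref_obj *o)
{
    atomic_store(&o->destroyed, true);
}
```

A holder that sees the destroyed flag can stop using the object and release its reference. The memory goes away only when the last holder has done so.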

I want to give an example of this: "I" create an object (and set its
reference count to 1). I give this object to the CM (IB_CM_LISTEN(obj)). The
CM wants to do a callback in the future, so he increases the reference count
to 2. Now I forget about the object and release my reference (back to 1).
The CM receives a request and wants to call my callback. He does this
without increasing the reference count. I use the object during the callback.
If I want to keep it, I increase the reference count. Now let's say that after
50 callbacks he has some error. He calls the error callback and releases his
reference, since he doesn't want to use the object any more. The
reference count goes to 0 and the object is destroyed (Release() sees
that the count has reached 0 and calls the destroy automatically).
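
The sequence above can be traced with a small single-process simulation. This is a sketch only: fake_cm and its functions are hypothetical stand-ins for the CM (fake_cm_listen plays the role of IB_CM_LISTEN), and the listener type is invented for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

typedef struct listener listener;
struct listener {
    atomic_int refs;
    int  requests_seen;
    int  errors_seen;
    int *freed_flag;                    /* lets the demo observe the free */
    void (*on_request)(listener *self); /* "my" callbacks */
    void (*on_error)(listener *self);
};

static void my_on_request(listener *self) { self->requests_seen++; }
static void my_on_error(listener *self)   { self->errors_seen++; }

static void listener_addref(listener *l) { atomic_fetch_add(&l->refs, 1); }

static void listener_release(listener *l)
{
    int prev = atomic_fetch_sub(&l->refs, 1);
    assert(prev > 0);
    if (prev == 1) {          /* last reference: destroy automatically */
        *l->freed_flag = 1;
        free(l);
    }
}

static listener *listener_create(int *freed_flag)
{
    listener *l = calloc(1, sizeof *l);
    if (l == NULL)
        return NULL;
    atomic_init(&l->refs, 1); /* the creator's reference */
    l->freed_flag = freed_flag;
    l->on_request = my_on_request;
    l->on_error   = my_on_error;
    return l;
}

/* Hypothetical stand-in for the CM. */
typedef struct fake_cm { listener *l; } fake_cm;

static void fake_cm_listen(fake_cm *cm, listener *l)
{
    listener_addref(l);       /* 1 -> 2: the CM will call back later */
    cm->l = l;
}

static void fake_cm_request(fake_cm *cm)
{
    cm->l->on_request(cm->l); /* no addref: the CM's reference covers it */
}

static void fake_cm_error(fake_cm *cm)
{
    listener *l = cm->l;
    cm->l = NULL;
    l->on_error(l);
    listener_release(l);      /* the CM is done with the object */
}

/* Returns 1 if the object survived 50 callbacks and was freed by the
 * CM's final release, 0 otherwise. */
int run_listen_example(void)
{
    int freed = 0;
    fake_cm cm = { NULL };
    listener *me = listener_create(&freed);
    if (me == NULL)
        return 0;

    fake_cm_listen(&cm, me);  /* refs: 2 */
    listener_release(me);     /* refs: 1, "I" forget the object */

    for (int i = 0; i < 50; i++)
        fake_cm_request(&cm);

    int freed_early = freed;
    int seen = cm.l->requests_seen; /* still valid: the CM holds a ref */

    fake_cm_error(&cm);       /* refs: 0, object freed */
    return !freed_early && seen == 50 && freed;
}
```

Note that nothing in "my" code waits for the CM: the creator releases right after the listen call, and the object goes away when the CM drops the last reference.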

It was possible to do all this even in the previous model, but now let's
suppose that I want to make a change and the CM wants to pass the object to
someone else (for example a timer). He can do it without anything being
changed, either in his code or in my code. In the previous model this was
not possible, since my logic would have to be changed in order for me to know
when I can delete the CM.
(Now the short answer seems short enough.)
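
Here is a minimal sketch of that hand-off, again with hypothetical names (obj, holder, run_handoff_example). The same "holder" type plays both the CM and the timer, because from the object's point of view they are just two parties that take and drop a reference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

typedef struct obj {
    atomic_int refs;
    int *freed_flag; /* lets the demo observe the free */
} obj;

static obj *obj_create(int *freed_flag)
{
    obj *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    atomic_init(&o->refs, 1); /* the creator's reference */
    o->freed_flag = freed_flag;
    return o;
}

static void obj_addref(obj *o) { atomic_fetch_add(&o->refs, 1); }

static void obj_release(obj *o)
{
    int prev = atomic_fetch_sub(&o->refs, 1);
    assert(prev > 0);
    if (prev == 1) {
        *o->freed_flag = 1;
        free(o);
    }
}

/* Hypothetical holder: anything that keeps the object for later use,
 * such as the CM or a timer. */
typedef struct holder { obj *o; } holder;

static void holder_take(holder *h, obj *o) { obj_addref(o); h->o = o; }

static void holder_drop(holder *h)
{
    obj *o = h->o;
    h->o = NULL;
    obj_release(o);
}

/* Returns 1 if the object outlived the CM and was freed when the timer
 * dropped it, 0 if not, -1 on allocation failure. */
int run_handoff_example(void)
{
    int freed = 0;
    holder cm = { NULL }, timer = { NULL };

    obj *o = obj_create(&freed);
    if (o == NULL)
        return -1;

    holder_take(&cm, o);       /* refs: 2 */
    obj_release(o);            /* refs: 1, the creator forgets it */

    holder_take(&timer, cm.o); /* refs: 2, the CM hands it to the timer */
    holder_drop(&cm);          /* refs: 1, the CM is finished */
    int alive_after_cm = !freed;

    holder_drop(&timer);       /* refs: 0, object freed */
    return alive_after_cm && freed;
}
```

The creator's part (create, hand over, release) is the same whether or not the timer exists, which is the point of the argument above.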

I hope this has made things clearer. 

Thanks
Tzachi





>-----Original Message-----
>From: Sean Hefty [mailto:sean.hefty at intel.com]
>Sent: Tuesday, September 20, 2005 12:55 AM
>To: 'Fab Tillier'; 'Tzachi Dar'; openib-windows at openib.org
>Subject: RE: [Openib-windows] RE: Reference count as a solution to the
>problem of an object life time
>
>>> I believe that this difference is what forces one to wait for the destroy
>>> completion. This difference is when the number of call backs is not known
>>> to the user. In my approach the library can play with the reference count
>>> and make sure that it is increased even when there are more than one
>>> callback, while on your model (one removes the reference in the callback)
>>> one can not know when to remove the reference.
>>
>>You do know when to remove the reference - in the destroy callback.  Once
>>the deref call is invoked, no further callbacks will ever occur for that
>>object.
>
>
>What problem is being solved here?
>
>- Sean