[openib-general] [RFC] [PATCH 2/7] ib_multicast 2.6.20: add ib_multicast module to track join requests from the same port
Eitan Zahavi
eitan at mellanox.co.il
Sun Oct 15 01:28:26 PDT 2006
Sean Hefty wrote:
> Eitan Zahavi wrote:
>
>> I disagree. If you sniff at the MAD level you can simply react to the
>> lower level messages.
>>
>
> First, when designing this, I did consider using the MAD snooping ability, and
> changing what could be done with snooping. However, the multicast handling is
> not simply sniffing MADs going out on the wire and incrementing / decrementing
> some count. It can change or prevent a MAD from being sent. This is a
> fundamental change to the behavior of the ib_mad APIs.
>
I am sorry I was not involved at that early stage. My bad.
I need to look deeper into the code. As long as a response is generated
even though the MAD was not sent, this is not an API change but a bug fix.
At this stage it seems that only a patch would convince you otherwise. I
will try to work on one this week.
What I had in mind was to hand back a MAD response locally in the case of
a delete when the client is not the last one in the group.
All other MADs, including a duplicate "join", go on the wire.
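As a rough sketch of that idea, assuming a per-group count of local
members is kept somewhere below the clients (every name here is
hypothetical; none of it is taken from ib_mad or from the patch):

```c
#include <assert.h>

/* Hypothetical per-port, per-MGID bookkeeping; not an ib_mad structure. */
struct mcast_group {
	unsigned int members;	/* local clients whose join the SA confirmed */
};

enum mad_action {
	MAD_SEND_ON_WIRE,	/* forward the request to the SA */
	MAD_LOCAL_RESPONSE,	/* swallow it and synthesize a response */
};

/* Joins, including duplicates, always go to the SA. */
enum mad_action mcast_filter_join(struct mcast_group *grp)
{
	(void)grp;
	return MAD_SEND_ON_WIRE;
}

/* Called when the SA response to a join arrives. */
void mcast_join_confirmed(struct mcast_group *grp)
{
	grp->members++;
}

/* Only the last leave reaches the SA; earlier ones are answered locally. */
enum mad_action mcast_filter_leave(struct mcast_group *grp)
{
	assert(grp->members > 0);
	grp->members--;
	return grp->members ? MAD_LOCAL_RESPONSE : MAD_SEND_ON_WIRE;
}
```

With two confirmed joins, the first leave would be answered locally and
the second would go out to the SA.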
> MADs are sent and tracked by their respective registered ib_mad clients.
Exactly, and the agent ID is part of the MAD trans_id, so we know which
agent is sending which MAD.
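For illustration, assuming the layout where the upper 32 bits of the
64-bit transaction ID carry the agent ID and the lower 32 bits are left
to the client (the helper names are made up, and the values are in host
byte order, whereas the TID in the MAD header is big-endian):

```c
#include <stdint.h>

/* Assumed layout: agent ID in bits 63..32, client sequence in bits 31..0. */
uint64_t tid_make(uint32_t agent_id, uint32_t seq)
{
	return ((uint64_t)agent_id << 32) | seq;
}

/* Recover which agent issued the MAD. */
uint32_t tid_agent_id(uint64_t tid)
{
	return (uint32_t)(tid >> 32);
}

/* Recover the client's own sequence number. */
uint32_t tid_seq(uint64_t tid)
{
	return (uint32_t)(tid & 0xffffffffu);
}
```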
> Trying
> to push this down into the MAD layer means that the send request from one client
> may now occur on some other client's registration.
Not sure I am following you here.
If you are referring to the race where one client sends a "join" while
another sends a "leave", you should make sure to:
1. Mark a client as "joined" only after receiving the SA response.
2. Count a "leave" as soon as the client's MAD is sent out.
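The two rules above could be modeled roughly as follows. This is only a
sketch: the state names, structures and functions are hypothetical and
do not exist in ib_mad.

```c
#include <stdbool.h>

enum member_state { MEMBER_IDLE, MEMBER_JOIN_SENT, MEMBER_JOINED };

struct member {
	enum member_state state;
};

struct group {
	unsigned int joined;	/* clients currently counted as members */
};

/* A join MAD went out; the client is not counted yet. */
void member_join_sent(struct member *m)
{
	if (m->state == MEMBER_IDLE)
		m->state = MEMBER_JOIN_SENT;
}

/* Rule 1: count the client only once the SA has answered the join. */
void member_join_response(struct group *g, struct member *m, bool ok)
{
	if (m->state != MEMBER_JOIN_SENT)
		return;
	if (ok) {
		m->state = MEMBER_JOINED;
		g->joined++;
	} else {
		m->state = MEMBER_IDLE;
	}
}

/* Rule 2: stop counting the client as soon as its leave MAD is sent. */
void member_leave_sent(struct group *g, struct member *m)
{
	if (m->state == MEMBER_JOINED)
		g->joined--;
	m->state = MEMBER_IDLE;
}
```

Counting late on join and early on leave means the count never claims
more members than the SA has actually confirmed.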
> If that client decides to
> unregister in the middle of their send, the operation is canceled, and now needs
> to be restarted on some other registration. And even though the operation was
> canceled, we still need to know whether it was seen by the SA. This requires
> sniffing all MADs, and quickly gets extremely complex.
>
Cancel does not really revert a post_send, does it?
So if we catch the MAD just before it is posted we should be fine.
> In order to avoid these issues with which registered client is actually
> performing the operation, the solution is to filter multicast requests through a
> single registration.
If each client uses its own agent ID then it is available in the
trans_id of the MAD.
> The ib_mad layer is complex enough as it is. (Have you
> tried tracing a MAD through the send path?) We don't need to push even more
> functionality down into it.
>
I agree that layering on top is easier. But does it really solve the
bug? I think not. If you REPLACED the API instead of providing both options
(with refcount enforcement above and without it below), it would make sense to me.
> - Sean
>