[openib-general] Re: User MAD support for cancel MAD
shaharf at voltaire.com
Sun Dec 12 02:43:04 PST 2004
> I guess it would be another ioctl. However I'm not sure how useful
> this is for userspace... in the kernel modules like IPoIB want to
> cancel pending sends to avoid a callback when the module is unloaded.
> However in userspace, an application can just close the file on exit.
> - R.
OpenSM attempts to cancel mads in several scenarios. For example, the SM
issues SwitchInfo to all switches. One of them returns with a port
change. This means that all the other SwitchInfo queries are no longer
relevant - a full "heavy" sweep must be performed anyway. While it is
not critical to be able to cancel these mads, it may help free kernel
resources in the case where these mads would otherwise time out, for
example when some switches are behind a switch that is disconnected.
Another scenario is when a discovery is done and an error is received
and another sweep is forced. The pending mads should be canceled.
Again, this is not critical, but if the instrumentation is already
there, it would be nice to use it.
The interface issue is not so trivial. An IOCTL may do, but the problem
is the parameter of the IOCTL. The most straightforward way is to
specify a TID to cancel. This requires the usermode to avoid sending the
same mad again until it times out. While this is reasonable, I would not
force such a limitation; after all, there are plenty of retry
mechanisms and we should not force applications to use only one
sort. For example, if you have a retry mechanism that works over IB_MGT
and just retries the mad after X msecs, I think you want to let it work
above openib as well. This means that either you allow no timeout/no
kernel matching semantics (do we have one?) or let the user cancel *all*
mads with the same TID. I think that both should be implemented.
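
To make the "cancel *all* mads with the same TID" semantics concrete,
here is a minimal sketch in plain C. It is not the openib user MAD
interface and involves no real ioctl; the names pending_table,
pending_add, pending_cancel_tid and pending_count are hypothetical and
exist only for illustration. The table deliberately accepts duplicate
TIDs, so a usermode retry of the same mad is not rejected, and one
cancel call drops every outstanding entry that carries the TID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 16

/* Hypothetical record for one send awaiting a response or a timeout. */
struct pending_mad {
    uint64_t tid;    /* transaction ID chosen by the sender */
    int      in_use; /* nonzero while the send is outstanding */
};

/* Hypothetical per-file table of outstanding sends. */
struct pending_table {
    struct pending_mad slot[MAX_PENDING];
};

/*
 * Record a send. Duplicate TIDs are accepted on purpose, so a retry
 * of the same mad does not have to wait for the first copy to time out.
 * Returns the slot index used, or -1 if the table is full.
 */
int pending_add(struct pending_table *t, uint64_t tid)
{
    assert(t != NULL);
    for (int i = 0; i < MAX_PENDING; i++) {
        if (!t->slot[i].in_use) {
            t->slot[i].tid = tid;
            t->slot[i].in_use = 1;
            return i;
        }
    }
    return -1;
}

/*
 * Cancel every outstanding send that carries this TID.
 * Returns the number of entries cancelled (0 if none matched).
 */
int pending_cancel_tid(struct pending_table *t, uint64_t tid)
{
    int cancelled = 0;

    assert(t != NULL);
    for (int i = 0; i < MAX_PENDING; i++) {
        if (t->slot[i].in_use && t->slot[i].tid == tid) {
            t->slot[i].in_use = 0;
            cancelled++;
        }
    }
    return cancelled;
}

/* Count the sends that are still outstanding. */
int pending_count(const struct pending_table *t)
{
    int n = 0;

    assert(t != NULL);
    for (int i = 0; i < MAX_PENDING; i++)
        n += t->slot[i].in_use ? 1 : 0;
    return n;
}
```

With this shape, sending a mad with TID 0x10 twice (an original and a
retry) and then cancelling 0x10 removes both entries in one call, while
sends with other TIDs stay pending. Returning the count lets the caller
tell "nothing was pending" apart from "something was cancelled".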