***SPAM*** Re: [ofa-general] [PATCH] libibmad: remove functions which use pthread

Ira Weiny weiny2 at llnl.gov
Tue Feb 17 16:52:26 PST 2009


On Tue, 17 Feb 2009 18:21:02 -0500
Hal Rosenstock <hal.rosenstock at gmail.com> wrote:

> On 2/17/09, Ira Weiny <weiny2 at llnl.gov> wrote:
> > On Tue, 17 Feb 2009 16:12:12 -0500
> > Hal Rosenstock <hal.rosenstock at gmail.com> wrote:
> >
> >> On Tue, Feb 17, 2009 at 12:19 PM,  <weiny2 at llnl.gov> wrote:
> >> > Quoting Hal Rosenstock <hal.rosenstock at gmail.com>:
> >> >
> >> >> Sasha,
> >> >>
> >> >> On Wed, Dec 31, 2008 at 12:04 PM, Sasha Khapyorsky
> >> >> <sashak at voltaire.com>
> >> >> wrote:
> >> >>>
> >> >>> I looked at implementation of safe_*() functions (safe_smp_query,
> >> >>> safe_smp_set and safe_ca_call) and found that they are not actually
> >> >>> "safe" as declared by its names. The only thread-unsafe thing which
> >> >>> is used there is static 'mad_portid' structure (from rpc.c),
> >> >>
> >> >> I'm not sure that the only thread unsafe thing in the mad rpc
> >> >> mechanism is the portid.
> >> >>
> >> >>> but modification of this structure is not protected by same mutex
> >> >>> (actually
> >> >>> not protected at all).
> >> >>
> >> >> A first step would be removing the portid as static. If so, portid
> >> >> would need to be a supplied parameter to various mad routines and the
> >> >> existing ones relying on madrpc_portid would be deprecated. Does this
> >> >> make sense to do ? Would you accept such a patch ?
> >> >>
> >>
> >> > Don't we already have an interface like this with mad_rpc_open_port?
> >>
> >> I'm not sure this was carried all the way through (The basic building
> >> blocks are there but I think some additional routines are needed).
> >>
> >> Shouldn't the in tree clients be converted over and the old routines
> >> deprecated ?
> >
> > For utilities which run once through I think the old functions work just
> > fine.
> 
> Well, sort of... Aren't mad_portid "collisions" possible when multiple
> programs are run concurrently ?

I was only thinking of threading but I guess you are right.

> 
> > However, it is pretty confusing which interface to use...  [or even that
> > there
> > are 2 interfaces, but I digress] (see below)
> 
> I don't think the newer improved interfaces were ever documented.
> 
> >> > I don't like the void * return but it is "struct ibmadb_port" under the
> >> > hood.
> >>
> >> Is access into that currently opaque struct needed for something by
> >> the clients of the library ?
> >
> > There is nothing the clients need to access but it would be much better to
> > return some named data type.  This along with some documentation would
> > clarify
> > what the difference between madrpc and mad_rpc really is.  Furthermore, a
> > named type will help to "self document" other functions like "mad_rpc".  For
> > example:
> >
> >    void *mad_rpc(const ibmad_port_t *ibmad_port, ib_rpc_t * rpc, ib_portid_t
> > * dport,
> > 	      void *payload, void *rcvdata);
> >
> > Oh now I found it...  Check out smp_[query|set]_via...  Here the interface
> > changes the parameter name and one has no idea what the type is (without
> > looking at the code that is! ;-)
> >
> >    uint8_t *smp_query_via(void *buf, ib_portid_t * id, unsigned attrid,
> > 		       unsigned mod, unsigned timeout, const void *srcport);
> >                                                    ^^^^
> >
> >    uint8_t *smp_set_via(void *buf, ib_portid_t * id, unsigned attrid,
> > unsigned mod,
> > 		     unsigned timeout, const void *srcport);
> >                                    ^^^^
> > And here is one more...
> >    int ib_path_query_via(const void *srcport, ibmad_gid_t srcgid,
> > 		      ibmad_gid_t destgid, ib_portid_t * sm_id, void *buf);
> 
> Are you referring to how srcport is used to call either the old madrpc
> or the newer mad_rpc API and if the newer one srcport is really a
> pointer to a struct ibmad_port ?

Ok, I did not catch that srcport could be NULL to use the old interface, but
that could just be documented...

Currently mad_rpc takes a void *ibmad_port.  But ib_path_query_via takes a
void *srcport.  If you look under the covers they are the same type "struct
ibmad_port", if you need them.  mad_rpc names it ibmad_port which gives you
some clue about the type however srcport is generic altogether.

Whoa!  Then you look in rpc.c and mad_rpc takes void *port_id.

mad.h:
   void *mad_rpc(const void *ibmad_port, ib_rpc_t * rpc, ib_portid_t * dport,
	      void *payload, void *rcvdata);
rpc.c:
   void *mad_rpc(const void *port_id, ib_rpc_t * rpc, ib_portid_t * dport,
	      void *payload, void *rcvdata)

<sigh>  I figured this all out for libibnetdisc since I am using the "mad_rpc"
interface but I could see where someone could get very confused, or at best
waste a lot of time looking at the code to figure out how to use the
interface.

Ira

> -- Hal
> 
> >> > Are those calls which use it not thread safe?
> >>
> >> They look OK but I'm not 100% sure yet.
> >
> > Yea, they look thread safe but I am not sure either.  :-(
> >
> > I would be in favor of making all the utils use mad_rpc_open_port but it is
> > up
> > to Shasha if we go down this path.
> >
> > Ira
> >
> >>
> >> -- Hal
> >>
> >> > Ira
> >> >
> >> >
> >> >> -- Hal
> >> >>
> >> >>> As far as I know nothing uses those safe_*() primitives right now
> >> >>> outside
> >> >>> libibmad, so I think it is better to remove this confused functions
> >> >>> from
> >> >>> API (with changing library version, etc.).
> >> >>>
> >> >>> The primitives madrpc_lock() and madrpc_unlock() are just wrappers to
> >> >>> hidden static pthread mutex which is not controlled by caller
> >> >>> application. I think that it will be more robust for multithreaded
> >> >>> application to use its own synchronization methods (pthread mutex or
> >> >>> any
> >> >>> other) for better control. So let's remove madrpc_lock/unlock() too.
> >> >>>
> >> >>> Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
> >> >>> ---
> >> >>>  libibmad/include/infiniband/mad.h |   41
> >> >>>  -------------------------------------
> >> >>>  libibmad/libibmad.ver             |    2 +-
> >> >>>  libibmad/src/libibmad.map         |    2 -
> >> >>>  libibmad/src/rpc.c                |   15 -------------
> >> >>>  libibmad/src/sa.c                 |    5 ++-
> >> >>>  5 files changed, 4 insertions(+), 61 deletions(-)
> >> >>>
> >> >>> diff --git a/libibmad/include/infiniband/mad.h
> >> >>>  b/libibmad/include/infiniband/mad.h
> >> >>> index eff6738..89b4be5 100644
> >> >>> --- a/libibmad/include/infiniband/mad.h
> >> >>> +++ b/libibmad/include/infiniband/mad.h
> >> >>> @@ -703,8 +703,6 @@ void *  madrpc_rmpp(ib_rpc_t *rpc, ib_portid_t
> >> >>>  *dport, ib_rmpp_hdr_t *rmpp,
> >> >>>  void   madrpc_init(char *dev_name, int dev_port, int *mgmt_classes,
> >> >>>                   int num_classes);
> >> >>>  void   madrpc_save_mad(void *madbuf, int len);
> >> >>> -void   madrpc_lock(void);
> >> >>> -void   madrpc_unlock(void);
> >> >>>  void   madrpc_show_errors(int set);
> >> >>>
> >> >>>  void * mad_rpc_open_port(char *dev_name, int dev_port, int
> >> >>> *mgmt_classes,
> >> >>> @@ -725,32 +723,6 @@ uint8_t * smp_query_via(void *buf, ib_portid_t
> >> >>> *id,
> >> >>> unsigned attrid,
> >> >>>  uint8_t * smp_set_via(void *buf, ib_portid_t *id, unsigned attrid,
> >> >>>  unsigned mod,
> >> >>>                     unsigned timeout, const void *srcport);
> >> >>>
> >> >>> -inline static uint8_t *
> >> >>> -safe_smp_query(void *rcvbuf, ib_portid_t *portid, unsigned attrid,
> >> >>>  unsigned mod,
> >> >>> -              unsigned timeout)
> >> >>> -{
> >> >>> -       uint8_t *p;
> >> >>> -
> >> >>> -       madrpc_lock();
> >> >>> -       p = smp_query(rcvbuf, portid, attrid, mod, timeout);
> >> >>> -       madrpc_unlock();
> >> >>> -
> >> >>> -       return p;
> >> >>> -}
> >> >>> -
> >> >>> -inline static uint8_t *
> >> >>> -safe_smp_set(void *rcvbuf, ib_portid_t *portid, unsigned attrid,
> >> >>>  unsigned mod,
> >> >>> -            unsigned timeout)
> >> >>> -{
> >> >>> -       uint8_t *p;
> >> >>> -
> >> >>> -       madrpc_lock();
> >> >>> -       p = smp_set(rcvbuf, portid, attrid, mod, timeout);
> >> >>> -       madrpc_unlock();
> >> >>> -
> >> >>> -       return p;
> >> >>> -}
> >> >>> -
> >> >>>  /* sa.c */
> >> >>>  uint8_t * sa_call(void *rcvbuf, ib_portid_t *portid, ib_sa_call_t
> >> >>> *sa,
> >> >>>                 unsigned timeout);
> >> >>> @@ -761,19 +733,6 @@ int        ib_path_query(ibmad_gid_t srcgid,
> >> >>>  ibmad_gid_t destgid, ib_portid_t *sm_id,
> >> >>>  int    ib_path_query_via(const void *srcport, ibmad_gid_t srcgid,
> >> >>>                         ibmad_gid_t destgid, ib_portid_t *sm_id,  void
> >> >>> *buf);
> >> >>>
> >> >>> -inline static uint8_t *
> >> >>> -safe_sa_call(void *rcvbuf, ib_portid_t *portid, ib_sa_call_t *sa,
> >> >>> -            unsigned timeout)
> >> >>> -{
> >> >>> -       uint8_t *p;
> >> >>> -
> >> >>> -       madrpc_lock();
> >> >>> -       p = sa_call(rcvbuf, portid, sa, timeout);
> >> >>> -       madrpc_unlock();
> >> >>> -
> >> >>> -       return p;
> >> >>> -}
> >> >>> -
> >> >>>  /* resolve.c */
> >> >>>  int    ib_resolve_smlid(ib_portid_t *sm_id, int timeout);
> >> >>>  int    ib_resolve_guid(ib_portid_t *portid, uint64_t *guid,
> >> >>> diff --git a/libibmad/libibmad.ver b/libibmad/libibmad.ver
> >> >>> index 7e93c16..23d2dc2 100644
> >> >>> --- a/libibmad/libibmad.ver
> >> >>> +++ b/libibmad/libibmad.ver
> >> >>> @@ -6,4 +6,4 @@
> >> >>>  # API_REV - advance on any added API
> >> >>>  # RUNNING_REV - advance any change to the vendor files
> >> >>>  # AGE - number of backward versions the API still supports
> >> >>> -LIBVERSION=5:0:4
> >> >>> +LIBVERSION=2:0:0
> >> >>> diff --git a/libibmad/src/libibmad.map b/libibmad/src/libibmad.map
> >> >>> index 927e51c..f944d86 100644
> >> >>> --- a/libibmad/src/libibmad.map
> >> >>> +++ b/libibmad/src/libibmad.map
> >> >>> @@ -72,14 +72,12 @@ IBMAD_1.3 {
> >> >>>               madrpc;
> >> >>>               madrpc_def_timeout;
> >> >>>               madrpc_init;
> >> >>> -               madrpc_lock;
> >> >>>               madrpc_portid;
> >> >>>               madrpc_rmpp;
> >> >>>               madrpc_save_mad;
> >> >>>               madrpc_set_retries;
> >> >>>               madrpc_set_timeout;
> >> >>>               madrpc_show_errors;
> >> >>> -               madrpc_unlock;
> >> >>>               ib_path_query;
> >> >>>               sa_call;
> >> >>>               sa_rpc_call;
> >> >>> diff --git a/libibmad/src/rpc.c b/libibmad/src/rpc.c
> >> >>> index 5226540..670a936 100644
> >> >>> --- a/libibmad/src/rpc.c
> >> >>> +++ b/libibmad/src/rpc.c
> >> >>> @@ -38,7 +38,6 @@
> >> >>>  #include <stdio.h>
> >> >>>  #include <stdlib.h>
> >> >>>  #include <unistd.h>
> >> >>> -#include <pthread.h>
> >> >>>  #include <string.h>
> >> >>>  #include <errno.h>
> >> >>>
> >> >>> @@ -286,20 +285,6 @@ madrpc_rmpp(ib_rpc_t *rpc, ib_portid_t *dport,
> >> >>>  ib_rmpp_hdr_t *rmpp, void *data)
> >> >>>       return mad_rpc_rmpp(&port, rpc, dport, rmpp, data);
> >> >>>  }
> >> >>>
> >> >>> -static pthread_mutex_t rpclock = PTHREAD_MUTEX_INITIALIZER;
> >> >>> -
> >> >>> -void
> >> >>> -madrpc_lock(void)
> >> >>> -{
> >> >>> -       pthread_mutex_lock(&rpclock);
> >> >>> -}
> >> >>> -
> >> >>> -void
> >> >>> -madrpc_unlock(void)
> >> >>> -{
> >> >>> -       pthread_mutex_unlock(&rpclock);
> >> >>> -}
> >> >>> -
> >> >>>  void
> >> >>>  madrpc_init(char *dev_name, int dev_port, int *mgmt_classes, int
> >> >>>  num_classes)
> >> >>>  {
> >> >>> diff --git a/libibmad/src/sa.c b/libibmad/src/sa.c
> >> >>> index 27b9d52..c601254 100644
> >> >>> --- a/libibmad/src/sa.c
> >> >>> +++ b/libibmad/src/sa.c
> >> >>> @@ -132,7 +132,7 @@ ib_path_query_via(const void *srcport,
> >> >>> ibmad_gid_t
> >> >>> srcgid, ibmad_gid_t destgid,
> >> >>>       if (srcport) {
> >> >>>               p = sa_rpc_call (srcport, buf, sm_id, &sa, 0);
> >> >>>       } else {
> >> >>> -               p = safe_sa_call(buf, sm_id, &sa, 0);
> >> >>> +               p = sa_call(buf, sm_id, &sa, 0);
> >> >>>       }
> >> >>>       if (!p) {
> >> >>>               IBWARN("sa call path_query failed");
> >> >>> @@ -142,8 +142,9 @@ ib_path_query_via(const void *srcport,
> >> >>> ibmad_gid_t
> >> >>> srcgid, ibmad_gid_t destgid,
> >> >>>       mad_decode_field(p, IB_SA_PR_DLID_F, &dlid);
> >> >>>       return dlid;
> >> >>>  }
> >> >>> +
> >> >>>  int
> >> >>>  ib_path_query(ibmad_gid_t srcgid, ibmad_gid_t destgid, ib_portid_t
> >> >>>  *sm_id, void *buf)
> >> >>>  {
> >> >>> -       return ib_path_query_via (NULL, srcgid, destgid, sm_id, buf);
> >> >>> +       return ib_path_query_via(NULL, srcgid, destgid, sm_id, buf);
> >> >>>  }
> >> >>> --
> >> >>> 1.6.0.4.766.g6fc4a
> >> >>>
> >> >>> _______________________________________________
> >> >>> general mailing list
> >> >>> general at lists.openfabrics.org
> >> >>> http://   lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> >> >>>
> >> >>> To unsubscribe, please visit http:// 
> >> >>>  openib.org/mailman/listinfo/openib-general
> >> >>>
> >> >> _______________________________________________
> >> >> general mailing list
> >> >> general at lists.openfabrics.org
> >> >> http://   lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> >> >>
> >> >> To unsubscribe, please visit http:// 
> >> >>  openib.org/mailman/listinfo/openib-general
> >> >>
> >> >>
> >> >
> >> >
> >> >
> >> >
> >>
> >
> >
> > --
> > Ira Weiny <weiny2 at llnl.gov>
> >
> 


-- 
Ira Weiny <weiny2 at llnl.gov>



More information about the general mailing list