[openib-general] Performance Degradation with OFED v. Voltaire(lustre)

Michael S. Tsirkin mst at mellanox.co.il
Tue Dec 19 07:48:00 PST 2006


Interesting. So, does lustre actually work on top of rdma_cm?

Quoting r. Bernadat, Philippe <philippe_bernadat at hp.com>:
Subject: RE: Performance Degradation with OFED v. Voltaire(lustre)

I checked. We apparently never go through this path (with lustre).

> -----Original Message-----
> From: Bernadat, Philippe 
> Sent: Tuesday, December 19, 2006 4:20 PM
> To: Michael S. Tsirkin
> Cc: Or Gerlitz; Roland Dreier; openib-general at openib.org
> Subject: RE: Performance Degradation with OFED v. Voltaire(lustre)
> 
> Sorry to say that this still doesn't do it.
> Are we sure we go through this path?
> 
> I double checked the code I compiled and tried was:
> 
> static int cma_query_ib_route(struct rdma_id_private *id_priv, int timeout_ms,
>                               struct cma_work *work)
> {
>         struct rdma_dev_addr *addr = &id_priv->id.route.addr.dev_addr;
>         struct ib_sa_path_rec path_rec;
>         ib_sa_comp_mask mask;
> 
>         memset(&path_rec, 0, sizeof path_rec);
>         ib_addr_get_sgid(addr, &path_rec.sgid);
>         ib_addr_get_dgid(addr, &path_rec.dgid);
>         path_rec.pkey = cpu_to_be16(ib_addr_get_pkey(addr));
>         path_rec.numb_path = 1;
> 
>         if (tavor_quirk) {
>                 path_rec.mtu_selector = IB_SA_LT;
>                 path_rec.mtu = IB_MTU_2048;
>                 mask = IB_SA_PATH_REC_MTU_SELECTOR | IB_SA_PATH_REC_MTU;
>         } else
>                 mask = 0;
> 
>         id_priv->query_id = ib_sa_path_rec_get(id_priv->id.device,
>                                 id_priv->id.port_num, &path_rec, mask |
>                                 IB_SA_PATH_REC_DGID | IB_SA_PATH_REC_SGID |
>                                 IB_SA_PATH_REC_PKEY | IB_SA_PATH_REC_NUMB_PATH,
>                                 timeout_ms, GFP_KERNEL,
>                                 cma_query_handler, work, &id_priv->query);
> 
>         return (id_priv->query_id < 0) ? id_priv->query_id : 0;
> }
> 
> Philippe 
> 
> > -----Original Message-----
> > From: Michael S. Tsirkin [mailto:mst at mellanox.co.il] 
> > Sent: Tuesday, December 19, 2006 1:25 PM
> > To: Bernadat, Philippe
> > Cc: Or Gerlitz; Roland Dreier; openib-general at openib.org
> > Subject: Re: Performance Degradation with OFED v. Voltaire(lustre)
> > 
> > > So after a bit more testing, setting the route path mtu to 1024 before
> > > the qp creation (rdma_create_qp()) seems sufficient.
> > 
> > OK, so the following fixes the tavor_quirk flag in cma to actually do
> > something. Could you please replace the patch cma_tavor_quirk.patch with
> > this one, set tavor_quirk option for cma module, and see if this works
> > as expected?
> > 
> > Unpack OFED 1.1, copy the following to
> > OFED-1.1/openib-1.1/kernel_patches/fixes/cma_tavor_quirk.patch
> > removing the patch by the same name that is in OFED
> > (also remove xxx_cma_tavor_quirk.txt or other patches if you put them there)
> > and then pack OFED 1.1 and rebuild.
> > 
> > 
> > Thanks,
> > 
> > -----------------
> > 
> > Tavor systems get better performance with 1K MTU. Since there does
> > not seem to be any way to find out whether the remote system uses Tavor,
> > add an option to limit the MTU globally.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst at mellanox.co.il>
> > 
> > ---
> > 
> > diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> > index 50150c8..261bf45 100644
> > --- a/drivers/infiniband/core/cma.c
> > +++ b/drivers/infiniband/core/cma.c
> > @@ -48,6 +48,10 @@ MODULE_AUTHOR("Sean Hefty");
> >  MODULE_DESCRIPTION("Generic RDMA CM Agent");
> >  MODULE_LICENSE("Dual BSD/GPL");
> >  
> > +static int tavor_quirk = 0;
> > +module_param_named(tavor_quirk, tavor_quirk, int, 0644);
> > +MODULE_PARM_DESC(tavor_quirk, "Tavor performance quirk: limit MTU to 1K if > 0");
> > +
> >  #define CMA_CM_RESPONSE_TIMEOUT 20
> >  #define CMA_MAX_CM_RETRIES 3
> >  
> > @@ -1138,6 +1142,7 @@ static int cma_query_ib_route(struct rdma_id_private *id_priv, int timeout_ms,
> >  {
> >  	struct rdma_dev_addr *addr = &id_priv->id.route.addr.dev_addr;
> >  	struct ib_sa_path_rec path_rec;
> > +	ib_sa_comp_mask mask;
> >  
> >  	memset(&path_rec, 0, sizeof path_rec);
> >  	ib_addr_get_sgid(addr, &path_rec.sgid);
> > @@ -1145,8 +1150,15 @@ static int cma_query_ib_route(struct rdma_id_private *id_priv, int timeout_ms,
> >  	path_rec.pkey = cpu_to_be16(ib_addr_get_pkey(addr));
> >  	path_rec.numb_path = 1;
> >  
> > +	if (tavor_quirk) {
> > +		path_rec.mtu_selector = IB_SA_LT;
> > +		path_rec.mtu = IB_MTU_2048;
> > +		mask = IB_SA_PATH_REC_MTU_SELECTOR | IB_SA_PATH_REC_MTU;
> > +	} else
> > +		mask = 0;
> > +
> >  	id_priv->query_id = ib_sa_path_rec_get(id_priv->id.device,
> > -				id_priv->id.port_num, &path_rec,
> > +				id_priv->id.port_num, &path_rec, mask |
> >  				IB_SA_PATH_REC_DGID | IB_SA_PATH_REC_SGID |
> >  				IB_SA_PATH_REC_PKEY | IB_SA_PATH_REC_NUMB_PATH,
> >  				timeout_ms, GFP_KERNEL,
> > 
> > -- 
> > MST
> > 

-- 
MST



