From jenos at ncsa.uiuc.edu Tue Sep 1 01:11:13 2009
From: jenos at ncsa.uiuc.edu (Jeremy Enos)
Date: Tue, 01 Sep 2009 03:11:13 -0500
Subject: [ofa-general] Fedora 10 OFED support plans
In-Reply-To: <200908311217.43954.jackm@dev.mellanox.co.il>
References: <4A8E4854.2060909@ncsa.uiuc.edu>
	<200908301856.33259.jackm@dev.mellanox.co.il>
	<4A9AB9AD.80803@ncsa.uiuc.edu>
	<200908311217.43954.jackm@dev.mellanox.co.il>
Message-ID: <4A9CD721.2080104@ncsa.uiuc.edu>

Though bad news, it's good news to get news. :) Now I have a
direction to pursue. FC11 next week.
thx-

	Jeremy

Jack Morgenstein wrote:
>>>>> I think OFED 1.5 might work on it but not sure. Which kernel version
>>>>> FC10 use?
>>>>> In general OFED 1.5 supports FC11
>>>>>
>>>>>
>>>> Actually, it supports FC12 (kernel 2.6.29).
>>>>
>>>>
>>> We had originally planned to support FC11 -- however, in the interim, FC12 was
>>> released -- based on kernel 2.6.29, which is supported -- so we decided to support
>>> FC12 instead.
>>>
>>> -Jack
>>>
> Actually, Tziporet is correct. FC11 is built on kernel 2.6.29.4-167.
> OFED 1.5 supports FC11 (I confused this with OpenSuse) -- No FC12 as yet.
>
> There is no support for FC10.
>
> sorry about the mistake.
> -Jack
>
>

From vlad at lists.openfabrics.org Tue Sep 1 03:06:56 2009
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Tue, 1 Sep 2009 03:06:56 -0700 (PDT)
Subject: [ofa-general] ofa_1_5_kernel 20090901-0200 daily build status
Message-ID: <20090901100656.98DBAE61E41@openfabrics.org>

This email was generated automatically, please do not reply

git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git
git_branch: ofed_kernel_1_5

Common build parameters:

Passed:
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.24
Passed on i686 with linux-2.6.26
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.27
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.18-128.el5
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.18-93.el5
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.24
Passed on x86_64 with linux-2.6.26
Passed on x86_64 with linux-2.6.25
Passed on x86_64 with linux-2.6.27
Passed on ia64 with linux-2.6.18
Passed on ia64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.22
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.24
Passed on ia64 with linux-2.6.25
Passed on ia64 with linux-2.6.26
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.19

Failed:
Build failed on x86_64 with linux-2.6.16.60-0.21-smp
Log:
/home/vlad/tmp/ofa_1_5_kernel-20090901-0200_linux-2.6.16.60-0.21-smp_x86_64_check/net/rds/cong.c: In function 'rds_cong_clear_bit':
/home/vlad/tmp/ofa_1_5_kernel-20090901-0200_linux-2.6.16.60-0.21-smp_x86_64_check/net/rds/cong.c:301: error: implicit declaration of function 'generic___clear_le_bit'
/home/vlad/tmp/ofa_1_5_kernel-20090901-0200_linux-2.6.16.60-0.21-smp_x86_64_check/net/rds/cong.c: In function 'rds_cong_test_bit':
/home/vlad/tmp/ofa_1_5_kernel-20090901-0200_linux-2.6.16.60-0.21-smp_x86_64_check/net/rds/cong.c:312: error: implicit declaration of function 'generic_test_le_bit' make[3]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090901-0200_linux-2.6.16.60-0.21-smp_x86_64_check/net/rds/cong.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090901-0200_linux-2.6.16.60-0.21-smp_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_5_kernel-20090901-0200_linux-2.6.16.60-0.21-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.60-0.21-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-78.ELsmp Log: /home/vlad/tmp/ofa_1_5_kernel-20090901-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/cong.c: In function 'rds_cong_clear_bit': /home/vlad/tmp/ofa_1_5_kernel-20090901-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/cong.c:301: error: implicit declaration of function 'generic___clear_le_bit' /home/vlad/tmp/ofa_1_5_kernel-20090901-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/cong.c: In function 'rds_cong_test_bit': /home/vlad/tmp/ofa_1_5_kernel-20090901-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/cong.c:312: error: implicit declaration of function 'generic_test_le_bit' make[3]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090901-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/cong.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090901-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_5_kernel-20090901-0200_linux-2.6.9-78.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-78.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-67.ELsmp Log: 
/home/vlad/tmp/ofa_1_5_kernel-20090901-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/cong.c: In function 'rds_cong_clear_bit': /home/vlad/tmp/ofa_1_5_kernel-20090901-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/cong.c:301: error: implicit declaration of function 'generic___clear_le_bit' /home/vlad/tmp/ofa_1_5_kernel-20090901-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/cong.c: In function 'rds_cong_test_bit': /home/vlad/tmp/ofa_1_5_kernel-20090901-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/cong.c:312: error: implicit declaration of function 'generic_test_le_bit' make[3]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090901-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/cong.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090901-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_5_kernel-20090901-0200_linux-2.6.9-67.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-67.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- From Robert at saq.co.uk Tue Sep 1 03:44:42 2009 From: Robert at saq.co.uk (Robert Dunkley) Date: Tue, 1 Sep 2009 11:44:42 +0100 Subject: [ofa-general] Installing SDP on existing OFED 1.3.1 install - DRBD SDP/Infiniband Support Message-ID: Hi everyone, A DRBD release candidate with specific SDP/Infiniband support was released last week. I have an existing OFED 1.3.1 install without the SDP protocol loaded, I need to add it. I still have the original source I installed with and found what looked like a suitable SRPM, I built the SRPM, installed the resulting RPM and then restarted OpenSM and OpenIBD but OpenIBD does not seem to have loaded "ib_sdp". I don't want to reboot this server. Does anyone know where I am going wrong? Thanks, Rob The SAQ Group Registered Office: 18 Chapel Street, Petersfield, Hampshire GU32 3DZ SAQ is the trading name of SEMTEC Limited. 
Registered in England & Wales Company Number: 06481952 http://www.saqnet.co.uk AS29219 SAQ Group Delivers high quality, honestly priced communication and I.T. services to UK Business. Broadband : Domains : Email : Hosting : CoLo : Servers : Racks : Transit : Backups : Managed Networks : Remote Support. ISPA Member From jackm at dev.mellanox.co.il Tue Sep 1 04:16:44 2009 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Tue, 1 Sep 2009 14:16:44 +0300 Subject: [ofa-general] Installing SDP on existing OFED 1.3.1 install - DRBD SDP/Infiniband Support In-Reply-To: References: Message-ID: <200909011416.45120.jackm@dev.mellanox.co.il> On Tuesday 01 September 2009 13:44, Robert Dunkley wrote: > Hi everyone, > > A DRBD release candidate with specific SDP/Infiniband support was > released last week. > > I have an existing OFED 1.3.1 install without the SDP protocol loaded, I > need to add it. I still have the original source I installed with and > found what looked like a suitable SRPM, I built the SRPM, installed the > resulting RPM and then restarted OpenSM and OpenIBD but OpenIBD does not > seem to have loaded "ib_sdp". I don't want to reboot this server. Does > anyone know where I am going wrong? > Try adding the lines: # Load SDP module SDP_LOAD=yes to file /etc/infiniband/openib.conf and then restart the driver (if SDP_LOAD is already in the file, and set to "no", just change it to "yes"). -Jack > Thanks, > > Rob > > The SAQ Group > > Registered Office: 18 Chapel Street, Petersfield, Hampshire GU32 3DZ > SAQ is the trading name of SEMTEC Limited. Registered in England & Wales > Company Number: 06481952 > > http://www.saqnet.co.uk AS29219 > > SAQ Group Delivers high quality, honestly priced communication and I.T. services to UK Business. > > Broadband : Domains : Email : Hosting : CoLo : Servers : Racks : Transit : Backups : Managed Networks : Remote Support. 
> > ISPA Member > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > From Robert at saq.co.uk Tue Sep 1 04:52:41 2009 From: Robert at saq.co.uk (Robert Dunkley) Date: Tue, 1 Sep 2009 12:52:41 +0100 Subject: [ofa-general] Installing SDP on existing OFED 1.3.1 install - DRBD SDP/Infiniband Support References: <200909011416.45120.jackm@dev.mellanox.co.il> Message-ID: Hi Jack, Thanks for the reply, it now tries to load the ib_sdp module but fails: # /etc/rc.d/init.d/openibd restart Unloading HCA driver: [ OK ] Loading HCA driver and Access Layer: [ OK ] Setting up InfiniBand network interfaces: Bringing up interface ib0: [ OK ] Setting up service network . . . [ done ] Loading ib_sdp [FAILED] Where does the full log for this go? Am I missing some sort of dependency? (Loaded modules shown below) # /etc/rc.d/init.d/openibd status HCA driver loaded Configured devices: ib0 Currently active devices: ib0 The following OFED modules are loaded: rdma_ucm rdma_cm ib_addr ib_ipoib mlx4_core mlx4_ib ib_mthca ib_uverbs ib_umad ib_sa ib_cm ib_mad ib_core iw_cxgb3 Thanks, Rob -----Original Message----- From: Jack Morgenstein [mailto:jackm at dev.mellanox.co.il] Sent: 01 September 2009 12:17 To: general at lists.openfabrics.org Cc: Robert Dunkley Subject: Re: [ofa-general] Installing SDP on existing OFED 1.3.1 install - DRBD SDP/Infiniband Support On Tuesday 01 September 2009 13:44, Robert Dunkley wrote: > Hi everyone, > > A DRBD release candidate with specific SDP/Infiniband support was > released last week. > > I have an existing OFED 1.3.1 install without the SDP protocol loaded, I > need to add it. 
I still have the original source I installed with and > found what looked like a suitable SRPM, I built the SRPM, installed the > resulting RPM and then restarted OpenSM and OpenIBD but OpenIBD does not > seem to have loaded "ib_sdp". I don't want to reboot this server. Does > anyone know where I am going wrong? > Try adding the lines: # Load SDP module SDP_LOAD=yes to file /etc/infiniband/openib.conf and then restart the driver (if SDP_LOAD is already in the file, and set to "no", just change it to "yes"). -Jack > Thanks, > > Rob > > The SAQ Group > > Registered Office: 18 Chapel Street, Petersfield, Hampshire GU32 3DZ > SAQ is the trading name of SEMTEC Limited. Registered in England & Wales > Company Number: 06481952 > > http://www.saqnet.co.uk AS29219 > > SAQ Group Delivers high quality, honestly priced communication and I.T. services to UK Business. > > Broadband : Domains : Email : Hosting : CoLo : Servers : Racks : Transit : Backups : Managed Networks : Remote Support. > > ISPA Member > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > From fenkes at de.ibm.com Tue Sep 1 04:55:33 2009 From: fenkes at de.ibm.com (Joachim Fenkes) Date: Tue, 1 Sep 2009 13:55:33 +0200 Subject: [ofa-general] [PATCH] IB/ehca: Fix CQE flags reporting In-Reply-To: <48499C11.7030504@gmail.com> References: <200806061835.43802.fenkes@de.ibm.com> <48499C11.7030504@gmail.com> Message-ID: <200909011355.34319.fenkes@de.ibm.com> Was reporting CQE flags in the wrong bit positions, causing consumers to miss incoming immediate data. Signed-off-by: Joachim Fenkes --- Please review and queue for 2.6.32 if you think it's okay. Thanks! 
	Joachim

 drivers/infiniband/hw/ehca/ehca_reqs.c | 6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_reqs.c b/drivers/infiniband/hw/ehca/ehca_reqs.c
index 5a3d96f..8fd88cd 100644
--- a/drivers/infiniband/hw/ehca/ehca_reqs.c
+++ b/drivers/infiniband/hw/ehca/ehca_reqs.c
@@ -786,7 +786,11 @@ repoll:
 	wc->slid = cqe->rlid;
 	wc->dlid_path_bits = cqe->dlid;
 	wc->src_qp = cqe->remote_qp_number;
-	wc->wc_flags = cqe->w_completion_flags;
+	/*
+	 * HW has "Immed data present" and "GRH present" in bits 6 and 5.
+	 * SW defines those in bits 1 and 0, so we can just shift and mask.
+	 */
+	wc->wc_flags = (cqe->w_completion_flags >> 5) & 3;
 	wc->ex.imm_data = cpu_to_be32(cqe->immediate_data);
 	wc->sl = cqe->service_level;
 
-- 
1.6.0.4

From sashak at voltaire.com Tue Sep 1 05:05:58 2009
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 1 Sep 2009 15:05:58 +0300
Subject: [ofa-general] Re: [PATCH] infiniband-diags/ibroute: Add support for MulticastFDBTop
In-Reply-To: 
References: <20090826140350.GB19158@comcast.net>
	<20090830115316.GF21909@me>
	<20090830153619.GB15546@me>
	<20090831164456.GA24631@me>
Message-ID: <20090901120558.GE24631@me>

On 13:42 Mon 31 Aug , Hal Rosenstock wrote:
>
> Wouldn't endlid be set to top for this case (since top < endlid) ? It
> ignores endlid and not top in this case.

Ok, it is clear for me now (sorry, it took a long time :)). Thanks for
the explanations.
Sasha

From hnrose at comcast.net Tue Sep 1 05:55:05 2009
From: hnrose at comcast.net (Hal Rosenstock)
Date: Tue, 1 Sep 2009 08:55:05 -0400
Subject: [ofa-general] [PATCH] opensm: Add infrastructure support for more newly allocated PortInfo CapabilityMask bits
Message-ID: <20090901125505.GA9455@comcast.net>

Per published MgtWG errata:
RefID 4484 - vendor specific MADs
RefID 4575 - multicast PKey trap suppression
RefID 4641 - hierarchy info

Signed-off-by: Hal Rosenstock
---
diff --git a/opensm/include/iba/ib_types.h b/opensm/include/iba/ib_types.h
index c9d81cb..25ed35f 100644
--- a/opensm/include/iba/ib_types.h
+++ b/opensm/include/iba/ib_types.h
@@ -4490,10 +4490,10 @@ typedef struct _ib_port_info {
 #define IB_PORT_CAP_HAS_CLIENT_REREG (CL_HTON32(0x02000000))
 #define IB_PORT_CAP_HAS_OTHER_LOCAL_CHANGES_NTC (CL_HTON32(0x04000000))
 #define IB_PORT_CAP_HAS_LINK_SPEED_WIDTH_PAIRS_TBL (CL_HTON32(0x08000000))
-#define IB_PORT_CAP_RESV28 (CL_HTON32(0x10000000))
-#define IB_PORT_CAP_RESV29 (CL_HTON32(0x20000000))
+#define IB_PORT_CAP_HAS_VEND_MADS (CL_HTON32(0x10000000))
+#define IB_PORT_CAP_HAS_MCAST_PKEY_TRAP_SUPPRESS (CL_HTON32(0x20000000))
 #define IB_PORT_CAP_HAS_MCAST_FDB_TOP (CL_HTON32(0x40000000))
-#define IB_PORT_CAP_RESV31 (CL_HTON32(0x80000000))
+#define IB_PORT_CAP_HAS_HIER_INFO (CL_HTON32(0x80000000))
 
 /****f* IBA Base: Types/ib_port_info_get_port_state
 * NAME
diff --git a/opensm/opensm/osm_helper.c b/opensm/opensm/osm_helper.c
index 341d778..4b4e320 100644
--- a/opensm/opensm/osm_helper.c
+++ b/opensm/opensm/osm_helper.c
@@ -752,15 +752,15 @@ static void dbg_get_capabilities_str(IN char *p_buf, IN const uint32_t buf_size,
 				&total_len) != IB_SUCCESS)
 			return;
 	}
-	if (p_pi->capability_mask & IB_PORT_CAP_RESV28) {
+	if (p_pi->capability_mask & IB_PORT_CAP_HAS_VEND_MADS) {
 		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
-				"IB_PORT_CAP_RESV28\n",
+				"IB_PORT_CAP_HAS_VEND_MADS\n",
 				&total_len) != IB_SUCCESS)
 			return;
 	}
-	if (p_pi->capability_mask & IB_PORT_CAP_RESV29) {
+	if (p_pi->capability_mask & IB_PORT_CAP_HAS_MCAST_PKEY_TRAP_SUPPRESS) {
 		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
-				"IB_PORT_CAP_RESV29\n",
+				"IB_PORT_CAP_HAS_MCAST_PKEY_TRAP_SUPPRESS\n",
 				&total_len) != IB_SUCCESS)
 			return;
 	}
@@ -770,9 +770,9 @@ static void dbg_get_capabilities_str(IN char *p_buf, IN const uint32_t buf_size,
 				&total_len) != IB_SUCCESS)
 			return;
 	}
-	if (p_pi->capability_mask & IB_PORT_CAP_RESV31) {
+	if (p_pi->capability_mask & IB_PORT_CAP_HAS_HIER_INFO) {
 		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
-				"IB_PORT_CAP_RESV31\n",
+				"IB_PORT_CAP_HAS_HIER_INFO\n",
 				&total_len) != IB_SUCCESS)
 			return;
 	}

From jackm at dev.mellanox.co.il Tue Sep 1 06:05:25 2009
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Tue, 1 Sep 2009 16:05:25 +0300
Subject: [ofa-general] Installing SDP on existing OFED 1.3.1 install - DRBD SDP/Infiniband Support
In-Reply-To: 
References: <200909011416.45120.jackm@dev.mellanox.co.il>
Message-ID: <200909011605.25698.jackm@dev.mellanox.co.il>

On Tuesday 01 September 2009 14:52, Robert Dunkley wrote:
> Hi Jack,
>
> Thanks for the reply, it now tries to load the ib_sdp module but fails:
>
> # /etc/rc.d/init.d/openibd restart
> Unloading HCA driver: [ OK ]
> Loading HCA driver and Access Layer: [ OK ]
> Setting up InfiniBand network interfaces:
> Bringing up interface ib0: [ OK ]
> Setting up service network . . . [ done ]
> Loading ib_sdp [FAILED]
>
> Where does the full log for this go? Am I missing some sort of
> dependency? (Loaded modules shown below)
>
do "dmesg" from a console window to see what the failure is.
- Jack > # /etc/rc.d/init.d/openibd status > > HCA driver loaded > > Configured devices: > ib0 > > Currently active devices: > ib0 > > The following OFED modules are loaded: > > rdma_ucm > rdma_cm > ib_addr > ib_ipoib > mlx4_core > mlx4_ib > ib_mthca > ib_uverbs > ib_umad > ib_sa > ib_cm > ib_mad > ib_core > iw_cxgb3 > > > Thanks, > > Rob > > > -----Original Message----- > From: Jack Morgenstein [mailto:jackm at dev.mellanox.co.il] > Sent: 01 September 2009 12:17 > To: general at lists.openfabrics.org > Cc: Robert Dunkley > Subject: Re: [ofa-general] Installing SDP on existing OFED 1.3.1 install > - DRBD SDP/Infiniband Support > > On Tuesday 01 September 2009 13:44, Robert Dunkley wrote: > > Hi everyone, > > > > A DRBD release candidate with specific SDP/Infiniband support was > > released last week. > > > > I have an existing OFED 1.3.1 install without the SDP protocol loaded, > I > > need to add it. I still have the original source I installed with and > > found what looked like a suitable SRPM, I built the SRPM, installed > the > > resulting RPM and then restarted OpenSM and OpenIBD but OpenIBD does > not > > seem to have loaded "ib_sdp". I don't want to reboot this server. Does > > anyone know where I am going wrong? > > > Try adding the lines: > > # Load SDP module > SDP_LOAD=yes > > to file /etc/infiniband/openib.conf > > and then restart the driver > > (if SDP_LOAD is already in the file, and set to "no", just change it to > "yes"). > > -Jack > > > Thanks, > > > > Rob > > > > The SAQ Group > > > > Registered Office: 18 Chapel Street, Petersfield, Hampshire GU32 3DZ > > SAQ is the trading name of SEMTEC Limited. Registered in England & > Wales > > Company Number: 06481952 > > > > http://www.saqnet.co.uk AS29219 > > > > SAQ Group Delivers high quality, honestly priced communication and > I.T. services to UK Business. > > > > Broadband : Domains : Email : Hosting : CoLo : Servers : Racks : > Transit : Backups : Managed Networks : Remote Support. 
> > > > ISPA Member > > > > _______________________________________________ > > general mailing list > > general at lists.openfabrics.org > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > To unsubscribe, please visit > http://openib.org/mailman/listinfo/openib-general > > > From roel.kluin at gmail.com Tue Sep 1 07:03:08 2009 From: roel.kluin at gmail.com (Roel Kluin) Date: Tue, 01 Sep 2009 16:03:08 +0200 Subject: [ofa-general] [PATCH] IB: dereference of dev->ibdev.iwcm in c2_register_device() In-Reply-To: References: <4A998EC2.70500@gmail.com> Message-ID: <4A9D299C.3030104@gmail.com> dev->ibdev.iwcm allocation may fail, prevent a dereference. Signed-off-by: Roel Kluin --- > Looks like a real fix to me -- but then don't we need to kfree() this > memory if any of the later initialization fails (to avoid a leak)? Ok, how about this? diff --git a/drivers/infiniband/hw/amso1100/c2_provider.c b/drivers/infiniband/hw/amso1100/c2_provider.c index f1948fa..3f2ee64 100644 --- a/drivers/infiniband/hw/amso1100/c2_provider.c +++ b/drivers/infiniband/hw/amso1100/c2_provider.c @@ -780,11 +780,11 @@ int c2_register_device(struct c2_dev *dev) /* Register pseudo network device */ dev->pseudo_netdev = c2_pseudo_netdev_init(dev); if (!dev->pseudo_netdev) - goto out3; + goto out; ret = register_netdev(dev->pseudo_netdev); if (ret) - goto out2; + goto out_free_netdev; pr_debug("%s:%u\n", __func__, __LINE__); strlcpy(dev->ibdev.name, "amso%d", IB_DEVICE_NAME_MAX); @@ -851,6 +851,10 @@ int c2_register_device(struct c2_dev *dev) dev->ibdev.post_recv = c2_post_receive; dev->ibdev.iwcm = kmalloc(sizeof(*dev->ibdev.iwcm), GFP_KERNEL); + if (dev->ibdev.iwcm == NULL) { + ret = -ENOMEM; + goto out_unregister_netdev; + } dev->ibdev.iwcm->add_ref = c2_add_ref; dev->ibdev.iwcm->rem_ref = c2_rem_ref; dev->ibdev.iwcm->get_qp = c2_get_qp; @@ -862,23 +866,25 @@ int c2_register_device(struct c2_dev *dev) ret = ib_register_device(&dev->ibdev); if (ret) - goto out1; + goto 
out_free_iwcm; for (i = 0; i < ARRAY_SIZE(c2_dev_attributes); ++i) { ret = device_create_file(&dev->ibdev.dev, c2_dev_attributes[i]); if (ret) - goto out0; + goto out_unregister_ibdev; } goto out3; -out0: +out_unregister_ibdev: ib_unregister_device(&dev->ibdev); -out1: +out_free_iwcm: + kfree(dev->ibdev.iwcm); +out_unregister_netdev: unregister_netdev(dev->pseudo_netdev); -out2: +out_free_netdev: free_netdev(dev->pseudo_netdev); -out3: +out: pr_debug("%s:%u ret=%d\n", __func__, __LINE__, ret); return ret; } From swise at opengridcomputing.com Tue Sep 1 07:10:45 2009 From: swise at opengridcomputing.com (Steve Wise) Date: Tue, 01 Sep 2009 09:10:45 -0500 Subject: [ofa-general] Re: [PATCH] iw_cxgb3: no free of iwcm memory In-Reply-To: <1251591878.7353.6.camel@lappy> References: <1251591878.7353.6.camel@lappy> Message-ID: <4A9D2B65.3080302@opengridcomputing.com> Thanks jon. I just recently pushed this upstream and into ofed-1.5. Steve. From hnrose at comcast.net Tue Sep 1 07:42:30 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Tue, 1 Sep 2009 10:42:30 -0400 Subject: [ofa-general] [PATCH] opensm/osm_base.h: Add new SA ClassPortInfo:CapabilityMask2 bit allocations Message-ID: <20090901144230.GA19717@comcast.net> Per published MgtWG errata: RefID 4626 - reverse path PKey support in PathRecord responses RefID 4635 - multicast FDB top support RefID 4644 - hierarchy support Signed-off-by: Hal Rosenstock --- diff --git a/opensm/include/opensm/osm_base.h b/opensm/include/opensm/osm_base.h index 0537002..06223ce 100644 --- a/opensm/include/opensm/osm_base.h +++ b/opensm/include/opensm/osm_base.h @@ -1,6 +1,6 @@ /* * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2006 Mellanox Technologies LTD. All rights reserved. + * Copyright (c) 2002-2009 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * Copyright (c) 2009 Sun Microsystems, Inc. All rights reserved. 
* @@ -776,6 +776,41 @@ typedef enum _osm_thread_state { #define OSM_CAP2_IS_QOS_SUPPORTED (1 << 1) /***********/ +/****d* OpenSM: Base/OSM_CAP2_IS_REVERSE_PATH_PKEY_SUPPPORTED +* Name +* OSM_CAP2_IS_REVERSE_PATH_PKEY_SUPPPORTED +* +* DESCRIPTION +* Reverse path PKeys indicate in PathRecord responses +* +* SYNOPSIS +*/ +#define OSM_CAP2_IS_REVERSE_PATH_PKEY_SUPPPORTED (1 << 2) +/***********/ + +/****d* OpenSM: Base/OSM_CAP2_IS_MCAST_TOP_SUPPORTED +* Name +* OSM_CAP2_IS_MCAST_TOP_SUPPORTED +* +* DESCRIPTION +* SwitchInfo.MulticastFDBTop is supported +* +* SYNOPSIS +*/ +#define OSM_CAP2_IS_MCAST_TOP_SUPPORTED (1 << 3) +/***********/ + +/****d* OpenSM: Base/OSM_CAP2_IS_HIERARCHY_SUPPORTED +* Name +* +* DESCRIPTION +* Hierarchy info suppported +* +* SYNOPSIS +*/ +#define OSM_CAP2_IS_HIERARCHY_SUPPORTED (1 << 4) +/***********/ + /****d* OpenSM: Base/osm_signal_t * NAME * osm_signal_t From vst at vlnb.net Tue Sep 1 12:02:17 2009 From: vst at vlnb.net (Vladislav Bolkhovitin) Date: Tue, 01 Sep 2009 23:02:17 +0400 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: Message-ID: <4A9D6FB9.1010509@vlnb.net> I'd suggest you to enable lockdep on the target. Google for more details how to do it. Also you should additional enable "mgmt_minor" SCST core trace level and only it. Don't enable "all", its output useful only in very special circumstances. Usually to investigate a problem like yours, the default flags in the debug build + "mgmt_minor" are sufficient. Vlad Chris Worley, on 09/01/2009 03:04 AM wrote: > On Wed, Aug 12, 2009 at 12:15 AM, Bart Van > Assche wrote: >> On Tue, Aug 11, 2009 at 11:52 PM, Chris Worley wrote: >>> I setup my target exactly as you prescribe... but my initiator is >>> still Windows (version of WInOF at top): performance as relayed by >>> IOMeter starts high and the average slowly decreases. 
Watching the >>> instantaneous throughput, there seem to be longer and longer lags of >>> poor performance. between moments of good performance. I need to run >>> this against a Linux initiator to see if the problems are w/ WinOF. >>> >>> Using OFED 1.4.1 (w/ the stock RHEL kernel) on the target, the >>> performance was steady and getting close to acceptable. In a 15 hour >>> test that cycles through sequential and random LBA's and R/W mixes >>> from block sizes from 1MB to 512B, it worked well and got decent >>> performance until it hit 1KB sequential reads which hung IOMeter; no >>> messages on the Linux side (all looked okay). IBSRP on the Windows >>> side just said "a reset to device was issued" every 15 to 30 seconds >>> after the problem started. I reloaded the IB stack on the Linux side, >>> and was able to get it restarted. >>> >>> Still a lot of combinations to test. >> Which trace settings are you using on the target ? Enabling the proper >> trace settings via /proc/scsi_tgt/trace_level might reveal whether you >> are e.g. hitting the QUEUE_FULL condition. See also scst/README. > > I've found a good kernel/scst mix to easily repeat this; I can get it > to repeatedly hang w/ 8K block transfers running Ubuntu 9.04 w/ the > 2.6.27-14-server kernel on _both_ target and initiator (i.e. no WinOF > or OFED at all) and SCST rev 1062 on the target using one drive > (performance is >600MB/s, >80K IOPS, on the 8KB block sizes being > used). > > Although the problem doesn't occur in Windows until blocks are <2KB > and the RHEL5.2/OFED configuration does not repeat the issue using a > Linux initiator, it seems like a very similar hang, so I'm hoping it's > the same issue. 
> > To repeat the issue, I run 8KB block random reads w/ 64 threads, > running AIO calls w/ a depth of 64 (using "fio" on the initiator): > > # fio --rw=randrw --bs=8k --rwmixread=100 --numjobs=64 --iodepth=64 > --sync=0 --direct=1 --randrepeat=0 --ioengine=libaio > --filename=/dev/sdn --name=test --loops=10000 --size=16091503001 > > The "size" represents 10% of the drive. It doesn't seem to ever > happen on writes, but I've seen it happen on mixed reads/writes. > > With tracing set to "default", there was still nothing in the target > logs at the time of the hang. > > With tracing set thusly on the target: > > echo "all" >/proc/scsi_tgt/trace_level > echo "all" >/proc/scsi_tgt/vdisk/trace_level > > The last few lines of dmesg look like: > > [255354.313411] 0: 28 00 01 84 54 90 00 00 10 00 00 00 00 00 00 00 > (...T........... > [255354.313420] [0]: scst: scst_cmd_init_done:214:tag=62, lun=0, CDB > len=16, queue_type=1 (cmd ffff880102b4a568) > [255354.313443] [26358]: scst: scst_pre_parse:417:op_name > (cmd ffff880102b4a3a0), direction=2 (expected 2, set yes), > transfer_len=16 (expected len 8192), flags=1 > [255354.313420] [0]: scst_cmd_init_done:216:Recieving CDB: > [255354.313452] [8602]: scst: scst_xmit_response:3004:Xmitting data > for cmd ffff880102b49e48 (sg_cnt 0, sg ffff880132579f60, sg[0].page > ffffe200042b7180) > [255354.313457] [8604]: scst: scst_xmit_response:3004:Xmitting data > for cmd ffff880102b4a010 (sg_cnt 0, sg ffff8802e9806f60, sg[0].page > ffffe2000bc129c0) > [255354.313426] (h)___0__1__2__3__4__5__6__7__8__9__A__B__C__D__E__F > [255354.313426] 0: 28 00 01 bc 5d 10 00 00 10 00 00 00 00 00 00 00 > (...]........... 
> [255354.313468] [26358]: scst: scst_pre_parse:417:op_name > (cmd ffff880102b4a568), direction=2 (expected 2, set yes), > transfer_len=16 (expected len 8192), flags=1 > [255354.313484] [8602]: scst: scst_xmit_response:3004:Xmitting data > for cmd ffff880102b4a1d8 (sg_cnt 0, sg ffff8802e98064c0, sg[0].page > ffffe2000bc633c0) > [255354.313551] [8604]: scst: scst_xmit_response:3004:Xmitting data > for cmd ffff880102b4a3a0 (sg_cnt 0, sg ffff88018a877060, sg[0].page > ffffe20004300200) > [255354.313556] [8602]: scst: scst_xmit_response:3004:Xmitting data > for cmd ffff880102b4a568 (sg_cnt 0, sg ffff880142581100, sg[0].page > ffffe20004066d40) > > ... and there's a section like: > > [255354.310177] 0: 28 00 01 25 df 50 00 00 10 00 00 00 00 00 00 00 > (..%.P.......... > [255354.310177] [0]: scst: scst_cmd_init_done:214:tag=57, lun=0, CDB > len=16, queue_type=1 (cmd ffff8801642e2730) > [255354.310177] [0]: scst_cmd_init_done:216:Recieving CDB: > [255354.310177] (h)___0__1__2__3__4__5__6__7__8__9__A__B__C__D__E__F > [255354.310177] 0: 28 00 01 5e 22 c0 00 00 10 00 00 00 00 00 00 00 > (..^"........... > [255354.310966] [26369]: scst: scst_pre_parse:417:op_name > (cmd ffff880168a9e3a0), direction=2 (expected 2, set yes), > transfer_len=16 (expected len 8192), flags=1 > [255354.310973] [26361]: scst: scst_pre_parse:417:op_name > (cmd ffff880168a9e010), direction=2 (expected 2, set yes), > transfer_len=16 (expected len 8192), flags=1 > [255354.310980] [26365]: scst: scst_pre_parse:417:op_name > (cmd ffff880168a9e1d8), direction=2 (expected 2, set yes), > transfer_len=16 (expected len 8192), flags=1 > [255354.310986] [26359]: scst: scst_pre_parse:417:op_name > (cmd ffff880168a9de48), direction=2 (expected 2, set yes), > transfer_len=16 (expected len 8192), flags=1 > ... 
> [255354.311221] [8604]: scst: scst_xmit_response:3004:Xmitting data > for cmd ffff880168a9e1d8 (sg_cnt 0, sg ffff880173ca8060, sg[0].page > ffffe20004325d00) > [255354.311226] [8602]: scst: scst_xmit_response:3004:Xmitting data > for cmd ffff880168a9ee50 (sg_cnt 0, sg ffff880173ca8c40, sg[0].page > ffffe20005847ec0) > [255354.311233] [8604]: scst: scst_xmit_response:3004:Xmitting data > for cmd ffff880168a9dc80 (sg_cnt 0, sg ffff8802f0143c40, sg[0].page > ffffe2000bc04880) > [255354.311238] [8602]: scst: scst_xmit_response:3004:Xmitting data > for cmd ffff880168a9e568 (sg_cnt 0, sg ffff8802f08361a0, sg[0].page > ffffe2000bbf2400) > [255354.311242] [8604]: scst: scst_xmit_response:3004:Xmitting data > for cmd ffff880168a9d560 (sg_cnt 0, sg ffff88010acd74c0, sg[0].page > ffffe200047e7280) > > ... but, prior to that, messages are unreadably garbled, as in: > > Aug 31 22:37:00 nameme kernel: t]9l ft48 r(09 ,83_5p s20 sg:303 > _00s3]c_=cs _00ad0000e_003a6_0031_4(ea5 9arg )_2As_05s_8[7:c8[f3 _178 > 087gff0 .R nt]9i0tmpd1:ft st06s68 5i9[301602_106)o6 _001e4 0)0 > .3E3_28a9102 pft0>e_o[.eo[<_2n05 98_0f8_i xpe1f0 D<98s np8one:21_0 > 30f3006=e_ ax R8gs=h62]= 2.pd_ pad555mlf > 1_]f8=.05lf i7gxs_ac3 m_0c0:]5i3087[_ 5e sg,00[dc3e,_ 0[ ( 1<[t]F] > ..eb 4t_ ah1,_1_]10.h45_]2,5__12C5o 37 d_.)b_g4f850s, t1e c80.ite.8pE > ue2.4f[.ft0 5c5_1effft 5530 f len=16, 5v03,em_cs4e 05fc78.5r5. 
n > ,45ft45ff .t)m9.8)9.8077=s _C 3 i8 .tlsf5_[0s0 (2u fu 4 > 5fco5fnr.n0a05_34f__4fd_4n Bs60fn4pB.tor7=s > _i8s7=0_.tl:c>l3e0.51_654.30350en.m C30 C3 e f.dtm0=2_1e0n]6qe d.>_ > 76 d=f _esr_tp 9_50.tnf50[cs., > Aug 31 22:37:00 nameme kernel: e .0 5 B , 45 0 Aug 31 22:37:00 nameme kernel: c2< s0< cm38cf58.[f10 002< c3De > _)088m8 9c5299pected__F > Aug 31 22:37:00 nameme kernel: tran50 pt48)=8]=s59etl5pe4e6d)0c6 > ei_2(e_<3cc_ ea51es_0_sras A >cmdtesafe4 3[m 3.rer7:[ 1b00s5 > Aug 31 22:37:00 nameme kernel: ] 2a015ffs.35fff B__ a > 6cmd9spre3se9_2e3806(3_csA_ 1 ns38ge0sre0 > Aug 31 22:37:00 nameme kernel: .,76.90330B005]08s3 __ r40r._5x, :2ec_ :06cs1_0ti1d l:253064enfe7]0 abd5 0f>196.t b 7.(008ni] > 0s09.r650t, <24]__ s1=in03 s0p c2>>[4ein.1:ooD..ps210a>[25534_r6,:t > n4.]4(8 e2 .r c 2n1g9360]10>( 00 00 00 00[fd[2 > [2g_re53 le_6c_md8t_ftc883tf03c m_0 :8r8fmd63m3:0] 25 c6>[2n_e:fa2e84_0 > Aug 31 22:37:00 nameme kernel: c, > Aug 31 22:37:00 nameme kernel: .=0>5f=1s5=1d6_(de:d > 2l_25:0edg25fm>ff40 l440 e,AFg l)AF0 0o[1088. 1aggB > 0n=d9(16a.5oeX6csf00s0: ._, (=10es_(1 7 5c___oR5st_42p3d 7 > C9d=5_:(3__7mD4_ 0m4_ed > 04,5.,[s55.d4c,,25=,c8__q,[(meet9303_mr0ue9m0u_032__fy2se > Aug 31 22:37:00 nameme kernel: > y>i > > ... so other suggestions on trace settings would be appreciated. > > Thanks, > > Chris >> Bart. >> > > ------------------------------------------------------------------------------ > Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day > trial. Simplify your report design, integration and deployment - and focus on > what you do best, core application coding. Discover what's new with > Crystal Reports now. 
http://p.sf.net/sfu/bobj-july > _______________________________________________ > Scst-devel mailing list > Scst-devel at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/scst-devel > From worleys at gmail.com Tue Sep 1 12:24:20 2009 From: worleys at gmail.com (Chris Worley) Date: Tue, 1 Sep 2009 13:24:20 -0600 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: <4A9D6FB9.1010509@vlnb.net> References: <4A9D6FB9.1010509@vlnb.net> Message-ID: On Tue, Sep 1, 2009 at 1:02 PM, Vladislav Bolkhovitin wrote: > I'd suggest you to enable lockdep on the target. Google for more details how > to do it. > > Also you should additional enable "mgmt_minor" SCST core trace level and > only it. Don't enable "all", its output useful only in very special > circumstances. Could you be more explicit in how to enable specific trace levels? For example, "all" causes the following: # cat /proc/scsi_tgt/vdisk/trace_level out_of_mem | minor | sg | mem | buff | entryexit | pid | line | function | debug | special | scsi | mgmt | mgmt_minor | mgmt_dbg | order # cat /proc/scsi_tgt/trace_level out_of_mem | minor | sg | mem | buff | entryexit | pid | line | function | debug | special | scsi | mgmt | mgmt_minor | mgmt_dbg | retry | scsi_serializing | recv_bot | send_bot | recv_top | send_top I tried echoing just some of those flags to cut down on excess verbosity, but would get errors like: # echo "minor | sg | mem" >/proc/scsi_tgt/vdisk/trace_level bash: echo: write error: Invalid argument root at fusion-io:/boot# dmesg | tail -1l [330010.019198] scst: ***ERROR***: Unknown action "minor | sg | mem" > Usually to investigate a problem like yours, the default > flags in the debug build + "mgmt_minor" are sufficient. I tried "default" and didn't get any messages on the hang. 
Thanks, Chris > > Vlad > > Chris Worley, on 09/01/2009 03:04 AM wrote: >> >> On Wed, Aug 12, 2009 at 12:15 AM, Bart Van >> Assche wrote: >>> >>> On Tue, Aug 11, 2009 at 11:52 PM, Chris Worley wrote: >>>> >>>> I setup my target exactly as you prescribe... but my initiator is >>>> still Windows (version of WInOF at top): performance as relayed by >>>> IOMeter starts high and the average slowly decreases.  Watching the >>>> instantaneous throughput, there seem to be longer and longer lags of >>>> poor performance. between moments of good performance.  I need to run >>>> this against a Linux initiator to see if the problems are w/ WinOF. >>>> >>>> Using OFED 1.4.1 (w/ the stock RHEL kernel) on the target, the >>>> performance was steady and getting close to acceptable.  In a 15 hour >>>> test that cycles through sequential and random LBA's and R/W mixes >>>> from block sizes from 1MB to 512B, it worked well and got decent >>>> performance until it hit 1KB sequential reads which hung IOMeter; no >>>> messages on the Linux side (all looked okay).  IBSRP on the Windows >>>> side just said "a reset to device was issued" every 15 to 30 seconds >>>> after the problem started. I reloaded the IB stack on the Linux side, >>>> and was able to get it restarted. >>>> >>>> Still a lot of combinations to test. >>> >>> Which trace settings are you using on the target ? Enabling the proper >>> trace settings via /proc/scsi_tgt/trace_level might reveal whether you >>> are e.g. hitting the QUEUE_FULL condition. See also scst/README. >> >> I've found a good kernel/scst mix to easily repeat this; I can get it >> to repeatedly hang w/ 8K block transfers running Ubuntu 9.04 w/ the >> 2.6.27-14-server kernel on _both_ target and initiator (i.e. no WinOF >> or OFED at all) and SCST rev 1062 on the target using one drive >> (performance is >600MB/s, >80K IOPS, on the 8KB block sizes being >> used). 
>> >> Although the problem doesn't occur in Windows until blocks are <2KB >> and the RHEL5.2/OFED configuration does not repeat the issue using a >> Linux initiator, it seems like a very similar hang, so I'm hoping it's >> the same issue. >> >> To repeat the issue, I run 8KB block random reads w/ 64 threads, >> running AIO calls w/ a depth of 64 (using "fio" on the initiator): >> >> # fio --rw=randrw --bs=8k --rwmixread=100 --numjobs=64 --iodepth=64 >> --sync=0 --direct=1 --randrepeat=0 --ioengine=libaio >> --filename=/dev/sdn --name=test --loops=10000 --size=16091503001 >> >> The "size" represents 10% of the drive.  It doesn't seem to ever >> happen on writes, but I've seen it happen on mixed reads/writes. >> >> With tracing set to "default", there was still nothing in the target >> logs at the time of the hang. >> >> With tracing set thusly on the target: >> >> echo "all" >/proc/scsi_tgt/trace_level >> echo "all" >/proc/scsi_tgt/vdisk/trace_level >> >> The last few lines of dmesg look like: >> >> [255354.313411]    0: 28 00 01 84 54 90 00 00 10 00 00 00 00 00 00 00 >>  (...T........... >> [255354.313420] [0]: scst: scst_cmd_init_done:214:tag=62, lun=0, CDB >> len=16, queue_type=1 (cmd ffff880102b4a568) >> [255354.313443] [26358]: scst: scst_pre_parse:417:op_name >> (cmd ffff880102b4a3a0), direction=2 (expected 2, set yes), >> transfer_len=16 (expected len 8192), flags=1 >> [255354.313420] [0]: scst_cmd_init_done:216:Recieving CDB: >> [255354.313452] [8602]: scst: scst_xmit_response:3004:Xmitting data >> for cmd ffff880102b49e48 (sg_cnt 0, sg ffff880132579f60, sg[0].page >> ffffe200042b7180) >> [255354.313457] [8604]: scst: scst_xmit_response:3004:Xmitting data >> for cmd ffff880102b4a010 (sg_cnt 0, sg ffff8802e9806f60, sg[0].page >> ffffe2000bc129c0) >> [255354.313426]  (h)___0__1__2__3__4__5__6__7__8__9__A__B__C__D__E__F >> [255354.313426]    0: 28 00 01 bc 5d 10 00 00 10 00 00 00 00 00 00 00 >>  (...]........... 
>> [255354.313468] [26358]: scst: scst_pre_parse:417:op_name >> (cmd ffff880102b4a568), direction=2 (expected 2, set yes), >> transfer_len=16 (expected len 8192), flags=1 >> [255354.313484] [8602]: scst: scst_xmit_response:3004:Xmitting data >> for cmd ffff880102b4a1d8 (sg_cnt 0, sg ffff8802e98064c0, sg[0].page >> ffffe2000bc633c0) >> [255354.313551] [8604]: scst: scst_xmit_response:3004:Xmitting data >> for cmd ffff880102b4a3a0 (sg_cnt 0, sg ffff88018a877060, sg[0].page >> ffffe20004300200) >> [255354.313556] [8602]: scst: scst_xmit_response:3004:Xmitting data >> for cmd ffff880102b4a568 (sg_cnt 0, sg ffff880142581100, sg[0].page >> ffffe20004066d40) >> >> ... and there's a section like: >> >> [255354.310177]    0: 28 00 01 25 df 50 00 00 10 00 00 00 00 00 00 00 >>  (..%.P.......... >> [255354.310177] [0]: scst: scst_cmd_init_done:214:tag=57, lun=0, CDB >> len=16, queue_type=1 (cmd ffff8801642e2730) >> [255354.310177] [0]: scst_cmd_init_done:216:Recieving CDB: >> [255354.310177]  (h)___0__1__2__3__4__5__6__7__8__9__A__B__C__D__E__F >> [255354.310177]    0: 28 00 01 5e 22 c0 00 00 10 00 00 00 00 00 00 00 >>  (..^"........... >> [255354.310966] [26369]: scst: scst_pre_parse:417:op_name >> (cmd ffff880168a9e3a0), direction=2 (expected 2, set yes), >> transfer_len=16 (expected len 8192), flags=1 >> [255354.310973] [26361]: scst: scst_pre_parse:417:op_name >> (cmd ffff880168a9e010), direction=2 (expected 2, set yes), >> transfer_len=16 (expected len 8192), flags=1 >> [255354.310980] [26365]: scst: scst_pre_parse:417:op_name >> (cmd ffff880168a9e1d8), direction=2 (expected 2, set yes), >> transfer_len=16 (expected len 8192), flags=1 >> [255354.310986] [26359]: scst: scst_pre_parse:417:op_name >> (cmd ffff880168a9de48), direction=2 (expected 2, set yes), >> transfer_len=16 (expected len 8192), flags=1 >> ... 
>> [255354.311221] [8604]: scst: scst_xmit_response:3004:Xmitting data >> for cmd ffff880168a9e1d8 (sg_cnt 0, sg ffff880173ca8060, sg[0].page >> ffffe20004325d00) >> [255354.311226] [8602]: scst: scst_xmit_response:3004:Xmitting data >> for cmd ffff880168a9ee50 (sg_cnt 0, sg ffff880173ca8c40, sg[0].page >> ffffe20005847ec0) >> [255354.311233] [8604]: scst: scst_xmit_response:3004:Xmitting data >> for cmd ffff880168a9dc80 (sg_cnt 0, sg ffff8802f0143c40, sg[0].page >> ffffe2000bc04880) >> [255354.311238] [8602]: scst: scst_xmit_response:3004:Xmitting data >> for cmd ffff880168a9e568 (sg_cnt 0, sg ffff8802f08361a0, sg[0].page >> ffffe2000bbf2400) >> [255354.311242] [8604]: scst: scst_xmit_response:3004:Xmitting data >> for cmd ffff880168a9d560 (sg_cnt 0, sg ffff88010acd74c0, sg[0].page >> ffffe200047e7280) >> >> ... but, prior to that, messages are unreadably garbled, as in: >> >> Aug 31 22:37:00 nameme kernel: t]9l ft48 r(09 ,83_5p  s20 sg:303 >> _00s3]c_=cs  _00ad0000e_003a6_0031_4(ea5 9arg )_2As_05s_8[7:c8[f3 _178 >> 087gff0 .R nt]9i0tmpd1:ft st06s68 5i9[301602_106)o6 _001e4 0)0 >> .3E3_28a9102 pft0>e_o[.eo[<_2n05 98_0f8_i xpe1f0 D<98s np8one:21_0 >> 30f3006=e_ ax R8gs=h62]= 2.pd_ pad555mlf >> 1_]f8=.05lf i7gxs_ac3 m_0c0:]5i3087[_ 5e sg,00[dc3e,_ 0[ ( 1<[t]F] >> ..eb 4t_ ah1,_1_]10.h45_]2,5__12C5o 37 d_.)b_g4f850s, t1e c80.ite.8pE >> ue2.4f[.ft0 5c5_1effft 5530 f len=16, 5v03,em_cs4e 05fc78.5r5. 
n >> ,45ft45ff> .t)m9.8)9.8077=s  _C 3 i8 .tlsf5_[0s0 (2u fu 4 >> 5fco5fnr.n0a05_34f__4fd_4n Bs60fn4pB.tor7=s >> _i8s7=0_.tl:c>l3e0.51_654.30350en.m C30 C3 e f.dtm0=2_1e0n]6qe  d.>_ >> 76 d=f _esr_tp 9_50.tnf50[cs., >> Aug 31 22:37:00 nameme kernel: e .0 5 B , 45 0> Aug 31 22:37:00 nameme kernel:  c2< s0< cm38cf58.[f10 002< c3De >> _)088m8 9c5299pected__F >> Aug 31 22:37:00 nameme kernel: tran50 pt48)=8]=s59etl5pe4e6d)0c6 >> ei_2(e_<3cc_ ea51es_0_sras A >cmdtesafe4 3[m 3.rer7:[ 1b00s5 >> Aug 31 22:37:00 nameme kernel: ] 2a015ffs.35fff  B__ a >> 6cmd9spre3se9_2e3806(3_csA_  1 ns38ge0sre0 >> Aug 31 22:37:00 nameme kernel: > .,76.90330B005]08s3 __ r40r._5x,> :2ec_ :06cs1_0ti1d l:253064enfe7]0 abd5 0f>196.t b 7.(008ni] >> 0s09.r650t, <24]__ s1=in03 s0p c2>>[4ein.1:ooD..ps210a>[25534_r6,:t >> n4.]4(8 e2 .r c 2n1g9360]10>(  00 00 00 00[fd[2 >> [2g_re53  le_6c_md8t_ftc883tf03c  m_0 :8r8fmd63m3:0] 25 c6>[2n_e:fa2e84_0 >> Aug 31 22:37:00 nameme kernel: c, >> Aug 31 22:37:00 nameme kernel: .=0>5f=1s5=1d6_(de:d >> 2l_25:0edg25fm>ff40 l440 e,AFg l)AF0 0o[1088. 1aggB >> 0n=d9(16a.5oeX6csf00s0: ._, (=10es_(1 7 5c___oR5st_42p3d 7 >> C9d=5_:(3__7mD4_ 0m4_ed >> 04,5.,[s55.d4c,,25=,c8__q,[(meet9303_mr0ue9m0u_032__fy2se >> Aug 31 22:37:00 nameme kernel: >  y>i >> >> ... so other suggestions on trace settings would be appreciated. >> >> Thanks, >> >> Chris >>> >>> Bart. >>> >> >> >> ------------------------------------------------------------------------------ >> Let Crystal Reports handle the reporting - Free Crystal Reports 2008 >> 30-Day trial. Simplify your report design, integration and deployment - and >> focus on what you do best, core application coding. Discover what's new with >> Crystal Reports now.  
http://p.sf.net/sfu/bobj-july >> _______________________________________________ >> Scst-devel mailing list >> Scst-devel at lists.sourceforge.net >> https://lists.sourceforge.net/lists/listinfo/scst-devel >> > > From rdreier at cisco.com Tue Sep 1 12:45:28 2009 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 01 Sep 2009 12:45:28 -0700 Subject: [ofa-general] [PATCH] IPoIB: check multicast address format In-Reply-To: <20090821000431.GA5713@obsidianresearch.com> (Jason Gunthorpe's message of "Thu, 20 Aug 2009 18:04:31 -0600") References: <20090821000431.GA5713@obsidianresearch.com> Message-ID: The idea seems sound but checkpatch.pl gives 6 errors for this small patch! Also: > +static int check_mcast(const u8 *addr,unsigned int addrlen, > + const u8 *broadcast) name of the function could make it clearer what the expected return value is ... eg mcast_addr_is_valid() or something like that. > + if (addrlen != 20) We have INFINIBAND_ALEN defined, seems better than a magic # here. > + if (memcmp(addr,broadcast,6) != 0) Personal taste here, but "if (foo != 0)" always seems silly to me when we could just do "if (foo)" -- haven't looked at what usage of memcmp() is more idiomatic in the kernel tho. - R. 
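For illustration, a user-space sketch of the check with the three suggestions above applied (a name that says what the return value means, INFINIBAND_ALEN instead of 20, and no "!= 0"). This is not the patch itself: the function name is only the one floated above, u8 is replaced by unsigned char, and INFINIBAND_ALEN is redefined locally as 20 so the snippet builds outside the kernel. Only the two checks quoted above are shown.

```c
#include <string.h>

/* Assumption: 20 is the value quoted in the review ("addrlen != 20");
 * redefined here only so the sketch compiles outside the kernel tree. */
#define INFINIBAND_ALEN 20

/* Hypothetical name taken from the review suggestion.
 * Returns 1 if the address looks valid, 0 otherwise. */
int mcast_addr_is_valid(const unsigned char *addr, unsigned int addrlen,
			const unsigned char *broadcast)
{
	/* Named constant instead of a magic number. */
	if (addrlen != INFINIBAND_ALEN)
		return 0;
	/* The first 6 bytes must match the broadcast address;
	 * memcmp() returns nonzero when they differ. */
	if (memcmp(addr, broadcast, 6))
		return 0;
	return 1;
}
```

A caller would then skip any entry for which mcast_addr_is_valid() returns 0 before handing the address on.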
From rdreier at cisco.com Tue Sep 1 12:55:15 2009 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 01 Sep 2009 12:55:15 -0700 Subject: [ofa-general] Re: [PATCH] IB/ehca: Fix CQE flags reporting In-Reply-To: <200909011355.34319.fenkes@de.ibm.com> (Joachim Fenkes's message of "Tue, 1 Sep 2009 13:55:33 +0200") References: <200806061835.43802.fenkes@de.ibm.com> <48499C11.7030504@gmail.com> <200909011355.34319.fenkes@de.ibm.com> Message-ID: applied, thanks From rdreier at cisco.com Tue Sep 1 13:06:44 2009 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 01 Sep 2009 13:06:44 -0700 Subject: [ofa-general] [PATCHv2 RESEND] IB/IPoIB: Don't let a bad muticast address in the join list stop subsequent joins In-Reply-To: <4A929AC7.4060402@Voltaire.COM> (Moni Shoua's message of "Mon, 24 Aug 2009 16:51:03 +0300") References: <4A929AC7.4060402@Voltaire.COM> Message-ID: > Illegal multicast address can be handed for IPoIB from userspace. For example > the command ip maddr add 33:33:00:00:00:01 dev ib0 injects an illegal muticast > address to IPoIB that will start a join task for this address. However, whenever > an illegal multicast address is passed to IPoIB it stops all subsequent > requests from join attempts. That happens because IPoIB joins to multicast > addresses in the order they arrived and doesn't handle the next address until the > current address join finishes with success. > > This patch moves the multicast address to the end of the list after a join attempt. > Even if the join fails the next attempt will be with a different address. > > Signed-off-by: Moni Shoua Was a consensus ever reached on this patch? 
> - if (test_bit(IPOIB_MCAST_RUN, &priv->flags)) > - queue_delayed_work(ipoib_workqueue, &priv->mcast_task, > - mcast->backoff * HZ); > + if (test_bit(IPOIB_MCAST_RUN, &priv->flags)) { > + list_for_each_entry(next_mcast, &priv->multicast_list, list) { > + if (!test_bit(IPOIB_MCAST_FLAG_SENDONLY, &next_mcast->flags) > + && !test_bit(IPOIB_MCAST_FLAG_BUSY, &next_mcast->flags) > + && !test_bit(IPOIB_MCAST_FLAG_ATTACHED, &next_mcast->flags)) > + break; > + } > + if (&next_mcast->list != &priv->multicast_list) > + queue_delayed_work(ipoib_workqueue, &priv->mcast_task, > + next_mcast->backoff * HZ); > + } I have to admit this duplicated loop doesn't look that attractive to me... maybe factor it out into a helper or something? - R. From jgunthorpe at obsidianresearch.com Tue Sep 1 13:27:47 2009 From: jgunthorpe at obsidianresearch.com (Jason Gunthorpe) Date: Tue, 1 Sep 2009 14:27:47 -0600 Subject: [ofa-general] [PATCH] IPoIB: check multicast address format (V2) In-Reply-To: References: <20090821000431.GA5713@obsidianresearch.com> Message-ID: <20090901202747.GP406@obsidianresearch.com> Check that the format of the multicast link address is correct before taking it from dev->mc_list to priv->multicast_list. This way we never try to send a bogus address to the SA, and prevents badness from erronous 'ip maddr addr add', broken bonding drivers, or whatever. Signed-off-by: Jason Gunthorpe --- drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 19 +++++++++++++++++++ 1 files changed, 19 insertions(+), 0 deletions(-) > The idea seems sound but checkpatch.pl gives 6 errors for this small > patch! Also: Indeed, sorry, I don't do this very often, forgot that step. I added the 6 missing spaces. > name of the function could make it clearer what the expected return > value is ... eg mcast_addr_is_valid() or something like that. Yes, done > > + if (addrlen != 20) > We have INFINIBAND_ALEN defined, seems better than a magic # here. Great, I looked for that for a few mins.. 
> > + if (memcmp(addr,broadcast,6) != 0) > Personal taste here, but "if (foo != 0)" always seems silly to me when > we could just do "if (foo)" -- haven't looked at what usage of memcmp() > is more idiomatic in the kernel tho. OK, much of the rest of the kernel is without the operator. I do it this way because the discordance of: if (cmp(a, b)) means a is not b, if (!cmp(a, b)) means a is b if (is_something(a)) means a is something hurts my brain, even though I know that is how it all works.. diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c index 425e311..a6485c4 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c @@ -758,6 +758,20 @@ void ipoib_mcast_dev_flush(struct net_device *dev) } } +static int ipoib_mcast_addr_is_valid(const u8 *addr, unsigned int addrlen, + const u8 *broadcast) +{ + if (addrlen != INFINIBAND_ALEN) + return 0; + /* reserved QPN, prefix, scope */ + if (memcmp(addr, broadcast, 6)) + return 0; + /* signature lower, pkey */ + if (memcmp(addr + 7, broadcast + 7, 3)) + return 0; + return 1; +} + void ipoib_mcast_restart_task(struct work_struct *work) { struct ipoib_dev_priv *priv = @@ -791,6 +805,11 @@ void ipoib_mcast_restart_task(struct work_struct *work) for (mclist = dev->mc_list; mclist; mclist = mclist->next) { union ib_gid mgid; + if (!ipoib_mcast_addr_is_valid(mclist->dmi_addr, + mclist->dmi_addrlen, + dev->broadcast)) + continue; + memcpy(mgid.raw, mclist->dmi_addr + 4, sizeof mgid); mcast = __ipoib_mcast_find(dev, &mgid); -- 1.5.4.2 From rdreier at cisco.com Tue Sep 1 13:57:30 2009 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 01 Sep 2009 13:57:30 -0700 Subject: [ofa-general] [PATCH] IB: dereference of dev->ibdev.iwcm in c2_register_device() In-Reply-To: <4A9D299C.3030104@gmail.com> (Roel Kluin's message of "Tue, 01 Sep 2009 16:03:08 +0200") References: <4A998EC2.70500@gmail.com> <4A9D299C.3030104@gmail.com> 
Message-ID: I tend to prefer patches that compile :) -- diff --git a/drivers/infiniband/hw/amso1100/c2_provider.c b/drivers/infiniband/hw/amso1100/c2_provider.c index 3f2ee64..ad723bd 100644 --- a/drivers/infiniband/hw/amso1100/c2_provider.c +++ b/drivers/infiniband/hw/amso1100/c2_provider.c @@ -874,7 +874,7 @@ int c2_register_device(struct c2_dev *dev) if (ret) goto out_unregister_ibdev; } - goto out3; + goto out; out_unregister_ibdev: ib_unregister_device(&dev->ibdev); but anyway, applied with that fix, thanks. From rdreier at cisco.com Tue Sep 1 14:07:30 2009 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 01 Sep 2009 14:07:30 -0700 Subject: [ofa-general] Re: Opinions on moving Linux InfiniBand/RDMA mailing list to vger? In-Reply-To: <20090820.160800.50693597.davem@davemloft.net> (David Miller's message of "Thu, 20 Aug 2009 16:08:00 -0700 (PDT)") References: <20090820.160800.50693597.davem@davemloft.net> Message-ID: > n> linux-rdma at vger.kernel.org > It's there, ready and waiting, should you choose to use it :-) Thanks again... how do we get archive links added -- is it manually? Right now we have http://www.spinics.net/lists/linux-rdma/ http://www.mail-archive.com/linux-rdma at vger.kernel.org/ should be up once the archive at mail-archive.com subscription is approved (probably by you too :) Thanks, Roland From worleys at gmail.com Tue Sep 1 15:09:32 2009 From: worleys at gmail.com (Chris Worley) Date: Tue, 1 Sep 2009 16:09:32 -0600 Subject: [ofa-general] Proper method for SRP failover when using two IB ports redundantly Message-ID: For example, with both ports active, LVM sees each drive twice: # vgscan Found duplicate PV jBN52kfgOF5ypZauFtnxtgBeSegOkLAu: using /dev/sdc not /dev/sdb Is there a proper way to set one IB port dormant waiting for the other to fail? 
Thanks, Chris From davem at davemloft.net Tue Sep 1 15:37:51 2009 From: davem at davemloft.net (David Miller) Date: Tue, 01 Sep 2009 15:37:51 -0700 (PDT) Subject: [ofa-general] Re: Opinions on moving Linux InfiniBand/RDMA mailing list to vger? In-Reply-To: References: <20090820.160800.50693597.davem@davemloft.net> Message-ID: <20090901.153751.86971232.davem@davemloft.net> From: Roland Dreier Date: Tue, 01 Sep 2009 14:07:30 -0700 > > > n> linux-rdma at vger.kernel.org > > > It's there, ready and waiting, should you choose to use it :-) > > Thanks again... how do we get archive links added -- is it manually? Yes. > Right now we have http://www.spinics.net/lists/linux-rdma/ > > http://www.mail-archive.com/linux-rdma at vger.kernel.org/ should be up > once the archive at mail-archive.com subscription is approved (probably by > you too :) Both added, thanks. From worleys at gmail.com Tue Sep 1 16:44:53 2009 From: worleys at gmail.com (Chris Worley) Date: Tue, 1 Sep 2009 17:44:53 -0600 Subject: [ofa-general] Re: Proper method for SRP failover when using two IB ports redundantly In-Reply-To: References: Message-ID: Never mind, this question has already been answered by Scott Weitzenkamp: You need to configure Device Mapper Multipath or some other multipathing software to get HA. What OS are you running? Steps for RHEL are: 1) Edit /etc/multipath.conf and comment out devnode_blacklist (RHEL4) or blacklist (RHEL5) entry. 2) Run "chkconfig multipathd on". 3) Reboot. 4) After reboot, /dev/mapper should be populated with mutipath block device entries. 5) You can run "multipath -l" to view the multipath status. Steps for SLES10 are similar: 1) Run "chkconfig boot.multipath on". 2) Run "chkconfig multipathd on". 3) Reboot. 4) After reboot, /dev/mapper should be populated with mutipath block device entries. 5) You can run "multipath -l" to view the multipath status. You use the /dev/mapper block devices, not /dev/sd* block devices. 
On Tue, Sep 1, 2009 at 4:09 PM, Chris Worley wrote: > For example, with both ports active, LVM sees each drive twice: > > # vgscan >  Found duplicate PV jBN52kfgOF5ypZauFtnxtgBeSegOkLAu: using /dev/sdc > not /dev/sdb > > Is there a proper way to set one IB port dormant waiting for the other to fail? > > Thanks, > > Chris > From jackm at dev.mellanox.co.il Tue Sep 1 23:46:24 2009 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Wed, 2 Sep 2009 09:46:24 +0300 Subject: [ofa-general] Installing SDP on existing OFED 1.3.1 install - DRBD SDP/Infiniband Support In-Reply-To: References: Message-ID: <200909020946.25044.jackm@dev.mellanox.co.il> On Tuesday 01 September 2009 16:24, Robert Dunkley wrote: > Hi Jack, > > Thanks for the tip. (*Embarrassed*) > > I think a Kernel upgrade since might have broken the source RPM. Is > there any way for me to fix this? (I have an identical hardware server > that was originally installed with SDP, details of that below, it seems > the system without sdp that I'm trying to add it to got a minor kernel > upgrade) > > Thanks again, > > Rob You need to see where the kernel is taking ib_sdp from. Do "modinfo ib_sdp" to see the ib_sdp.ko file which gets loaded. Note that on the working system, you have: /lib/modules/2.6.18-92.1.6.el5xen/updates/kernel/drivers/infiniband/ulp/sdp/ib_sdp.ko Note **updates** in the above directory path -- indicating that this sdp was installed as part of your OFED install. You have no such path for ib_sdp.ko in your broken system. 1. Was OFED installed on the broken system with your xen kernel? (check this by seeing if directory /lib/modules/2.6.18-92.1.13.el5xen/updates exists). 2. Where did you place the ib_sdp.ko module that you built? 3. The version disagreement below indicates that the sdp module expects different versions of ib_core.ko and rdma_cm.ko than are currently loaded. 
-Jack > > Error on system I'm trying to add to ("Broken System"): > ib_sdp: disagrees about version of symbol ib_unregister_client > ib_sdp: Unknown symbol ib_unregister_client > ib_sdp: disagrees about version of symbol ib_create_cq > ib_sdp: Unknown symbol ib_create_cq > ib_sdp: disagrees about version of symbol rdma_resolve_addr > ib_sdp: Unknown symbol rdma_resolve_addr > ib_sdp: disagrees about version of symbol ib_dereg_mr > ib_sdp: Unknown symbol ib_dereg_mr > ib_sdp: disagrees about version of symbol rdma_reject > ib_sdp: Unknown symbol rdma_reject > ib_sdp: disagrees about version of symbol rdma_disconnect > ib_sdp: Unknown symbol rdma_disconnect > ib_sdp: disagrees about version of symbol rdma_resolve_route > ib_sdp: Unknown symbol rdma_resolve_route > ib_sdp: disagrees about version of symbol rdma_bind_addr > ib_sdp: Unknown symbol rdma_bind_addr > ib_sdp: disagrees about version of symbol ib_register_client > ib_sdp: Unknown symbol ib_register_client > ib_sdp: disagrees about version of symbol rdma_create_qp > ib_sdp: Unknown symbol rdma_create_qp > ib_sdp: disagrees about version of symbol ib_destroy_cq > ib_sdp: Unknown symbol ib_destroy_cq > ib_sdp: disagrees about version of symbol rdma_create_id > ib_sdp: Unknown symbol rdma_create_id > ib_sdp: disagrees about version of symbol rdma_notify > ib_sdp: Unknown symbol rdma_notify > ib_sdp: disagrees about version of symbol rdma_listen > ib_sdp: Unknown symbol rdma_listen > ib_sdp: disagrees about version of symbol ib_get_dma_mr > ib_sdp: Unknown symbol ib_get_dma_mr > ib_sdp: disagrees about version of symbol ib_alloc_pd > ib_sdp: Unknown symbol ib_alloc_pd > ib_sdp: disagrees about version of symbol rdma_connect > ib_sdp: Unknown symbol rdma_connect > ib_sdp: disagrees about version of symbol rdma_destroy_id > ib_sdp: Unknown symbol rdma_destroy_id > ib_sdp: disagrees about version of symbol rdma_accept > ib_sdp: Unknown symbol rdma_accept > ib_sdp: disagrees about version of symbol ib_destroy_qp > 
ib_sdp: Unknown symbol ib_destroy_qp > ib_sdp: disagrees about version of symbol ib_dealloc_pd > ib_sdp: Unknown symbol ib_dealloc_pd > > Working system files: > /lib/modules/2.6.18-92.1.6.el5xen/kernel/drivers/infiniband/ulp/sdp > /lib/modules/2.6.18-92.1.6.el5xen/kernel/drivers/infiniband/ulp/sdp/ib_s > dp.ko > /lib/modules/2.6.18-92.1.6.el5xen/updates/kernel/drivers/infiniband/ulp/ > sdp > /lib/modules/2.6.18-92.1.6.el5xen/updates/kernel/drivers/infiniband/ulp/ > sdp/ib_s > dp.ko > /lib/modules/2.6.18-92.el5/kernel/drivers/infiniband/ulp/sdp > /lib/modules/2.6.18-92.el5/kernel/drivers/infiniband/ulp/sdp/ib_sdp.ko > > Broken System Files: > # locate sdp | more > /etc/libsdp.conf > /lib/modules/2.6.18-92.1.13.el5xen/kernel/drivers/infiniband/ulp/sdp > /lib/modules/2.6.18-92.1.13.el5xen/kernel/drivers/infiniband/ulp/sdp/ib_ > sdp.ko > /lib/modules/2.6.18-92.el5/kernel/drivers/infiniband/ulp/sdp > /lib/modules/2.6.18-92.el5/kernel/drivers/infiniband/ulp/sdp/ib_sdp.ko > > Uname -r on Broken: > 2.6.18-92.1.13.el5xen > > Uname -r on Works: > 2.6.18-92.1.6.el5xen > > > -----Original Message----- > From: Jack Morgenstein [mailto:jackm at dev.mellanox.co.il] > Sent: 01 September 2009 14:05 > To: Robert Dunkley > Cc: general at lists.openfabrics.org > Subject: Re: [ofa-general] Installing SDP on existing OFED 1.3.1 install > - DRBD SDP/Infiniband Support > > On Tuesday 01 September 2009 14:52, Robert Dunkley wrote: > > Hi Jack, > > > > Thanks for the reply, it now tries to load the ib_sdp module but > fails: > > > > # /etc/rc.d/init.d/openibd restart > > Unloading HCA driver: [ OK ] > > Loading HCA driver and Access Layer: [ OK ] > > Setting up InfiniBand network interfaces: > > Bringing up interface ib0: [ OK ] > > Setting up service network . . . [ done ] > > Loading ib_sdp [FAILED] > > > > Where does the full log for this go? Am I missing some sort of > > dependency? (Loaded modules shown below) > > > do "dmesg" from a console window to see what the failure is. 
> - Jack > > # /etc/rc.d/init.d/openibd status > > > > HCA driver loaded > > > > Configured devices: > > ib0 > > > > Currently active devices: > > ib0 > > > > The following OFED modules are loaded: > > > > rdma_ucm > > rdma_cm > > ib_addr > > ib_ipoib > > mlx4_core > > mlx4_ib > > ib_mthca > > ib_uverbs > > ib_umad > > ib_sa > > ib_cm > > ib_mad > > ib_core > > iw_cxgb3 > > > > > > Thanks, > > > > Rob > > > > > > -----Original Message----- > > From: Jack Morgenstein [mailto:jackm at dev.mellanox.co.il] > > Sent: 01 September 2009 12:17 > > To: general at lists.openfabrics.org > > Cc: Robert Dunkley > > Subject: Re: [ofa-general] Installing SDP on existing OFED 1.3.1 > install > > - DRBD SDP/Infiniband Support > > > > On Tuesday 01 September 2009 13:44, Robert Dunkley wrote: > > > Hi everyone, > > > > > > A DRBD release candidate with specific SDP/Infiniband support was > > > released last week. > > > > > > I have an existing OFED 1.3.1 install without the SDP protocol > loaded, > > I > > > need to add it. I still have the original source I installed with > and > > > found what looked like a suitable SRPM, I built the SRPM, installed > > the > > > resulting RPM and then restarted OpenSM and OpenIBD but OpenIBD does > > not > > > seem to have loaded "ib_sdp". I don't want to reboot this server. > Does > > > anyone know where I am going wrong? > > > > > Try adding the lines: > > > > # Load SDP module > > SDP_LOAD=yes > > > > to file /etc/infiniband/openib.conf > > > > and then restart the driver > > > > (if SDP_LOAD is already in the file, and set to "no", just change it > to > > "yes"). > > > > -Jack > > > > > Thanks, > > > > > > Rob > > > > > > The SAQ Group > > > > > > Registered Office: 18 Chapel Street, Petersfield, Hampshire GU32 3DZ > > > SAQ is the trading name of SEMTEC Limited. 
Registered in England & > > Wales > > > Company Number: 06481952 > > > > > > http://www.saqnet.co.uk AS29219 > > > > > > SAQ Group Delivers high quality, honestly priced communication and > > I.T. services to UK Business. > > > > > > Broadband : Domains : Email : Hosting : CoLo : Servers : Racks : > > Transit : Backups : Managed Networks : Remote Support. > > > > > > ISPA Member > > > > > > _______________________________________________ > > > general mailing list > > > general at lists.openfabrics.org > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > To unsubscribe, please visit > > http://openib.org/mailman/listinfo/openib-general > > > > > > From sashak at voltaire.com Wed Sep 2 02:51:49 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Wed, 2 Sep 2009 12:51:49 +0300 Subject: [ofa-general] Re: [PATCHv2] opensm: Parallelize (Stripe) MFT sets across switches In-Reply-To: <20090831133934.GA10155@comcast.net> References: <20090831133934.GA10155@comcast.net> Message-ID: <20090902095149.GF24631@me> On 09:39 Mon 31 Aug , Hal Rosenstock wrote: > > Similar to previous patch to "Parallelize (Stripe) LFT sets across switches". > Currently, MADs are pipelined to a single switch first which effectively > serializes these requests. This patch pipelines the MFT set MADs across > switches first (before cycling to the next MFT block) so that multiple > switches can be responding concurrently. Speedup is dependent on number > of MFT blocks in use (number of MLIDs) which is dependent on the number > of multicast groups. > > Signed-off-by: Hal Rosenstock Applied. Thanks. 
Sasha From vlad at lists.openfabrics.org Wed Sep 2 03:05:50 2009 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Wed, 2 Sep 2009 03:05:50 -0700 (PDT) Subject: [ofa-general] ofa_1_5_kernel 20090902-0200 daily build status Message-ID: <20090902100550.AAAADE60FD6@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git git_branch: ofed_kernel_1_5 Common build parameters: Passed: Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.26 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.27 Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18-128.el5 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.27 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.25 Passed on ia64 with linux-2.6.26 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Failed: Build failed on x86_64 with linux-2.6.16.60-0.21-smp Log: \ -I/home/vlad/kernel.org/x86_64/linux-2.6.16.60-0.21-smp/arch//include \ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -Werror-implicit-function-declaration -fno-strict-aliasing -fno-common -ffreestanding -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DMODULE 
-D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(cong)" -D"KBUILD_MODNAME=KBUILD_STR(rds)" -c -o /home/vlad/tmp/ofa_1_5_kernel-20090902-0200_linux-2.6.16.60-0.21-smp_x86_64_check/net/rds/.tmp_cong.o /home/vlad/tmp/ofa_1_5_kernel-20090902-0200_linux-2.6.16.60-0.21-smp_x86_64_check/net/rds/cong.c /home/vlad/tmp/ofa_1_5_kernel-20090902-0200_linux-2.6.16.60-0.21-smp_x86_64_check/net/rds/cong.c:36:35: error: asm-generic/bitops/le.h: No such file or directory make[3]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090902-0200_linux-2.6.16.60-0.21-smp_x86_64_check/net/rds/cong.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090902-0200_linux-2.6.16.60-0.21-smp_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_5_kernel-20090902-0200_linux-2.6.16.60-0.21-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.60-0.21-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-78.ELsmp Log: from /home/vlad/tmp/ofa_1_5_kernel-20090902-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/rds.h:4, from /home/vlad/tmp/ofa_1_5_kernel-20090902-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/cong.c:38: include/linux/skbuff.h: In function 'skb_add_data': include/linux/skbuff.h:1041: warning: pointer targets in passing argument 2 of 'csum_partial_copy_from_user' differ in signedness make[3]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090902-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/cong.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090902-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_5_kernel-20090902-0200_linux-2.6.9-78.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-78.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with 
linux-2.6.9-67.ELsmp Log: from /home/vlad/tmp/ofa_1_5_kernel-20090902-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/rds.h:4, from /home/vlad/tmp/ofa_1_5_kernel-20090902-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/cong.c:38: include/linux/skbuff.h: In function 'skb_add_data': include/linux/skbuff.h:1041: warning: pointer targets in passing argument 2 of 'csum_partial_copy_from_user' differ in signedness make[3]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090902-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/cong.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090902-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_5_kernel-20090902-0200_linux-2.6.9-67.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-67.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- From sashak at voltaire.com Wed Sep 2 03:00:31 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Wed, 2 Sep 2009 13:00:31 +0300 Subject: [ofa-general] Re: [PATCH] opensm/osm_base.h: Add new SA ClassPortInfo:CapabilityMask2 bit allocations In-Reply-To: <20090901144230.GA19717@comcast.net> References: <20090901144230.GA19717@comcast.net> Message-ID: <20090902100031.GG24631@me> On 10:42 Tue 01 Sep , Hal Rosenstock wrote: > > Per published MgtWG errata: > RefID 4626 - reverse path PKey support in PathRecord responses > RefID 4635 - multicast FDB top support > RefID 4644 - hierarchy support > > Signed-off-by: Hal Rosenstock Applied. Thanks. 
Sasha

From vst at vlnb.net Wed Sep 2 03:01:26 2009
From: vst at vlnb.net (Vladislav Bolkhovitin)
Date: Wed, 02 Sep 2009 14:01:26 +0400
Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs
In-Reply-To:
References: <4A9D6FB9.1010509@vlnb.net>
Message-ID: <4A9E4276.7040108@vlnb.net>

Chris Worley, on 09/01/2009 11:24 PM wrote:
> On Tue, Sep 1, 2009 at 1:02 PM, Vladislav Bolkhovitin wrote:
>> I'd suggest you to enable lockdep on the target. Google for more details how
>> to do it.
>>
>> Also you should additional enable "mgmt_minor" SCST core trace level and
>> only it. Don't enable "all", its output useful only in very special
>> circumstances.
>
> Could you be more explicit in how to enable specific trace levels?
>
> For example, "all" causes the following:
>
> # cat /proc/scsi_tgt/vdisk/trace_level
> out_of_mem | minor | sg | mem | buff | entryexit | pid | line |
> function | debug | special | scsi | mgmt | mgmt_minor | mgmt_dbg |
> order
> # cat /proc/scsi_tgt/trace_level
> out_of_mem | minor | sg | mem | buff | entryexit | pid | line |
> function | debug | special | scsi | mgmt | mgmt_minor | mgmt_dbg |
> retry | scsi_serializing | recv_bot | send_bot | recv_top | send_top
>
> I tried echoing just some of those flags to cut down on excess
> verbosity, but would get errors like:
>
> # echo "minor | sg | mem" >/proc/scsi_tgt/vdisk/trace_level
> bash: echo: write error: Invalid argument
> root at fusion-io:/boot# dmesg | tail -1l
> [330010.019198] scst: ***ERROR***: Unknown action "minor | sg | mem"
>
>> Usually to investigate a problem like yours, the default
>> flags in the debug build + "mgmt_minor" are sufficient.
>
> I tried "default" and didn't get any messages on the hang.

See /proc/scsi_tgt/help for help about all SCST proc commands. (The latest commits have some cleanups in this area.)
> Thanks, > > Chris >> Vlad >> >> Chris Worley, on 09/01/2009 03:04 AM wrote: >>> On Wed, Aug 12, 2009 at 12:15 AM, Bart Van >>> Assche wrote: >>>> On Tue, Aug 11, 2009 at 11:52 PM, Chris Worley wrote: >>>>> I setup my target exactly as you prescribe... but my initiator is >>>>> still Windows (version of WInOF at top): performance as relayed by >>>>> IOMeter starts high and the average slowly decreases. Watching the >>>>> instantaneous throughput, there seem to be longer and longer lags of >>>>> poor performance. between moments of good performance. I need to run >>>>> this against a Linux initiator to see if the problems are w/ WinOF. >>>>> >>>>> Using OFED 1.4.1 (w/ the stock RHEL kernel) on the target, the >>>>> performance was steady and getting close to acceptable. In a 15 hour >>>>> test that cycles through sequential and random LBA's and R/W mixes >>>>> from block sizes from 1MB to 512B, it worked well and got decent >>>>> performance until it hit 1KB sequential reads which hung IOMeter; no >>>>> messages on the Linux side (all looked okay). IBSRP on the Windows >>>>> side just said "a reset to device was issued" every 15 to 30 seconds >>>>> after the problem started. I reloaded the IB stack on the Linux side, >>>>> and was able to get it restarted. >>>>> >>>>> Still a lot of combinations to test. >>>> Which trace settings are you using on the target ? Enabling the proper >>>> trace settings via /proc/scsi_tgt/trace_level might reveal whether you >>>> are e.g. hitting the QUEUE_FULL condition. See also scst/README. >>> I've found a good kernel/scst mix to easily repeat this; I can get it >>> to repeatedly hang w/ 8K block transfers running Ubuntu 9.04 w/ the >>> 2.6.27-14-server kernel on _both_ target and initiator (i.e. no WinOF >>> or OFED at all) and SCST rev 1062 on the target using one drive >>> (performance is >600MB/s, >80K IOPS, on the 8KB block sizes being >>> used). 
>>> >>> Although the problem doesn't occur in Windows until blocks are <2KB >>> and the RHEL5.2/OFED configuration does not repeat the issue using a >>> Linux initiator, it seems like a very similar hang, so I'm hoping it's >>> the same issue. >>> >>> To repeat the issue, I run 8KB block random reads w/ 64 threads, >>> running AIO calls w/ a depth of 64 (using "fio" on the initiator): >>> >>> # fio --rw=randrw --bs=8k --rwmixread=100 --numjobs=64 --iodepth=64 >>> --sync=0 --direct=1 --randrepeat=0 --ioengine=libaio >>> --filename=/dev/sdn --name=test --loops=10000 --size=16091503001 >>> >>> The "size" represents 10% of the drive. It doesn't seem to ever >>> happen on writes, but I've seen it happen on mixed reads/writes. >>> >>> With tracing set to "default", there was still nothing in the target >>> logs at the time of the hang. >>> >>> With tracing set thusly on the target: >>> >>> echo "all" >/proc/scsi_tgt/trace_level >>> echo "all" >/proc/scsi_tgt/vdisk/trace_level >>> >>> The last few lines of dmesg look like: >>> >>> [255354.313411] 0: 28 00 01 84 54 90 00 00 10 00 00 00 00 00 00 00 >>> (...T........... >>> [255354.313420] [0]: scst: scst_cmd_init_done:214:tag=62, lun=0, CDB >>> len=16, queue_type=1 (cmd ffff880102b4a568) >>> [255354.313443] [26358]: scst: scst_pre_parse:417:op_name >>> (cmd ffff880102b4a3a0), direction=2 (expected 2, set yes), >>> transfer_len=16 (expected len 8192), flags=1 >>> [255354.313420] [0]: scst_cmd_init_done:216:Recieving CDB: >>> [255354.313452] [8602]: scst: scst_xmit_response:3004:Xmitting data >>> for cmd ffff880102b49e48 (sg_cnt 0, sg ffff880132579f60, sg[0].page >>> ffffe200042b7180) >>> [255354.313457] [8604]: scst: scst_xmit_response:3004:Xmitting data >>> for cmd ffff880102b4a010 (sg_cnt 0, sg ffff8802e9806f60, sg[0].page >>> ffffe2000bc129c0) >>> [255354.313426] (h)___0__1__2__3__4__5__6__7__8__9__A__B__C__D__E__F >>> [255354.313426] 0: 28 00 01 bc 5d 10 00 00 10 00 00 00 00 00 00 00 >>> (...]........... 
>>> [255354.313468] [26358]: scst: scst_pre_parse:417:op_name >>> (cmd ffff880102b4a568), direction=2 (expected 2, set yes), >>> transfer_len=16 (expected len 8192), flags=1 >>> [255354.313484] [8602]: scst: scst_xmit_response:3004:Xmitting data >>> for cmd ffff880102b4a1d8 (sg_cnt 0, sg ffff8802e98064c0, sg[0].page >>> ffffe2000bc633c0) >>> [255354.313551] [8604]: scst: scst_xmit_response:3004:Xmitting data >>> for cmd ffff880102b4a3a0 (sg_cnt 0, sg ffff88018a877060, sg[0].page >>> ffffe20004300200) >>> [255354.313556] [8602]: scst: scst_xmit_response:3004:Xmitting data >>> for cmd ffff880102b4a568 (sg_cnt 0, sg ffff880142581100, sg[0].page >>> ffffe20004066d40) >>> >>> ... and there's a section like: >>> >>> [255354.310177] 0: 28 00 01 25 df 50 00 00 10 00 00 00 00 00 00 00 >>> (..%.P.......... >>> [255354.310177] [0]: scst: scst_cmd_init_done:214:tag=57, lun=0, CDB >>> len=16, queue_type=1 (cmd ffff8801642e2730) >>> [255354.310177] [0]: scst_cmd_init_done:216:Recieving CDB: >>> [255354.310177] (h)___0__1__2__3__4__5__6__7__8__9__A__B__C__D__E__F >>> [255354.310177] 0: 28 00 01 5e 22 c0 00 00 10 00 00 00 00 00 00 00 >>> (..^"........... >>> [255354.310966] [26369]: scst: scst_pre_parse:417:op_name >>> (cmd ffff880168a9e3a0), direction=2 (expected 2, set yes), >>> transfer_len=16 (expected len 8192), flags=1 >>> [255354.310973] [26361]: scst: scst_pre_parse:417:op_name >>> (cmd ffff880168a9e010), direction=2 (expected 2, set yes), >>> transfer_len=16 (expected len 8192), flags=1 >>> [255354.310980] [26365]: scst: scst_pre_parse:417:op_name >>> (cmd ffff880168a9e1d8), direction=2 (expected 2, set yes), >>> transfer_len=16 (expected len 8192), flags=1 >>> [255354.310986] [26359]: scst: scst_pre_parse:417:op_name >>> (cmd ffff880168a9de48), direction=2 (expected 2, set yes), >>> transfer_len=16 (expected len 8192), flags=1 >>> ... 
>>> [255354.311221] [8604]: scst: scst_xmit_response:3004:Xmitting data >>> for cmd ffff880168a9e1d8 (sg_cnt 0, sg ffff880173ca8060, sg[0].page >>> ffffe20004325d00) >>> [255354.311226] [8602]: scst: scst_xmit_response:3004:Xmitting data >>> for cmd ffff880168a9ee50 (sg_cnt 0, sg ffff880173ca8c40, sg[0].page >>> ffffe20005847ec0) >>> [255354.311233] [8604]: scst: scst_xmit_response:3004:Xmitting data >>> for cmd ffff880168a9dc80 (sg_cnt 0, sg ffff8802f0143c40, sg[0].page >>> ffffe2000bc04880) >>> [255354.311238] [8602]: scst: scst_xmit_response:3004:Xmitting data >>> for cmd ffff880168a9e568 (sg_cnt 0, sg ffff8802f08361a0, sg[0].page >>> ffffe2000bbf2400) >>> [255354.311242] [8604]: scst: scst_xmit_response:3004:Xmitting data >>> for cmd ffff880168a9d560 (sg_cnt 0, sg ffff88010acd74c0, sg[0].page >>> ffffe200047e7280) >>> >>> ... but, prior to that, messages are unreadably garbled, as in: >>> >>> Aug 31 22:37:00 nameme kernel: t]9l ft48 r(09 ,83_5p s20 sg:303 >>> _00s3]c_=cs _00ad0000e_003a6_0031_4(ea5 9arg )_2As_05s_8[7:c8[f3 _178 >>> 087gff0 .R nt]9i0tmpd1:ft st06s68 5i9[301602_106)o6 _001e4 0)0 >>> .3E3_28a9102 pft0>e_o[.eo[<_2n05 98_0f8_i xpe1f0 D<98s np8one:21_0 >>> 30f3006=e_ ax R8gs=h62]= 2.pd_ pad555mlf >>> 1_]f8=.05lf i7gxs_ac3 m_0c0:]5i3087[_ 5e sg,00[dc3e,_ 0[ ( 1<[t]F] >>> ..eb 4t_ ah1,_1_]10.h45_]2,5__12C5o 37 d_.)b_g4f850s, t1e c80.ite.8pE >>> ue2.4f[.ft0 5c5_1effft 5530 f len=16, 5v03,em_cs4e 05fc78.5r5. 
n >>> ,45ft45ff>> .t)m9.8)9.8077=s _C 3 i8 .tlsf5_[0s0 (2u fu 4 >>> 5fco5fnr.n0a05_34f__4fd_4n Bs60fn4pB.tor7=s >>> _i8s7=0_.tl:c>l3e0.51_654.30350en.m C30 C3 e f.dtm0=2_1e0n]6qe d.>_ >>> 76 d=f _esr_tp 9_50.tnf50[cs., >>> Aug 31 22:37:00 nameme kernel: e .0 5 B , 45 0>> Aug 31 22:37:00 nameme kernel: c2< s0< cm38cf58.[f10 002< c3De >>> _)088m8 9c5299pected__F >>> Aug 31 22:37:00 nameme kernel: tran50 pt48)=8]=s59etl5pe4e6d)0c6 >>> ei_2(e_<3cc_ ea51es_0_sras A >cmdtesafe4 3[m 3.rer7:[ 1b00s5 >>> Aug 31 22:37:00 nameme kernel: ] 2a015ffs.35fff B__ a >>> 6cmd9spre3se9_2e3806(3_csA_ 1 ns38ge0sre0 >>> Aug 31 22:37:00 nameme kernel: >> .,76.90330B005]08s3 __ r40r._5x,>> :2ec_ :06cs1_0ti1d l:253064enfe7]0 abd5 0f>196.t b 7.(008ni] >>> 0s09.r650t, <24]__ s1=in03 s0p c2>>[4ein.1:ooD..ps210a>[25534_r6,:t >>> n4.]4(8 e2 .r c 2n1g9360]10>( 00 00 00 00[fd[2 >>> [2g_re53 le_6c_md8t_ftc883tf03c m_0 :8r8fmd63m3:0] 25 c6>[2n_e:fa2e84_0 >>> Aug 31 22:37:00 nameme kernel: c, >>> Aug 31 22:37:00 nameme kernel: .=0>5f=1s5=1d6_(de:d >>> 2l_25:0edg25fm>ff40 l440 e,AFg l)AF0 0o[1088. 1aggB >>> 0n=d9(16a.5oeX6csf00s0: ._, (=10es_(1 7 5c___oR5st_42p3d 7 >>> C9d=5_:(3__7mD4_ 0m4_ed >>> 04,5.,[s55.d4c,,25=,c8__q,[(meet9303_mr0ue9m0u_032__fy2se >>> Aug 31 22:37:00 nameme kernel: > y>i >>> >>> ... so other suggestions on trace settings would be appreciated. >>> >>> Thanks, >>> >>> Chris >>>> Bart. From sebastien.dugue at bull.net Wed Sep 2 03:05:16 2009 From: sebastien.dugue at bull.net (sebastien dugue) Date: Wed, 2 Sep 2009 12:05:16 +0200 Subject: [ofa-general] [PATCH 1/6] ibutils/ibdm: Fix 'invalid conversion from const char* to char*' build error In-Reply-To: <20090902120353.3ee1a8e2@frecb007965> References: <20090902120353.3ee1a8e2@frecb007965> Message-ID: <20090902120516.57ca4e2b@frecb007965> This occurs under FC11 with gcc 4.4.0-4. 
Signed-off-by: Sebastien Dugue --- ibdm/ibdm/SysDef.cpp | 2 +- ibdm/ibdm/TopoMatch.cpp | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/ibdm/ibdm/SysDef.cpp b/ibdm/ibdm/SysDef.cpp index 4d103cb..5616c64 100644 --- a/ibdm/ibdm/SysDef.cpp +++ b/ibdm/ibdm/SysDef.cpp @@ -79,7 +79,7 @@ IBSystemsCollection::makeSysNodes( // the device number should be embedded in the master name of // the node: MT23108 ... - char *p_digit; + const char *p_digit; if ((p_digit = strpbrk(p_inst->master.c_str(), "0123456789")) != NULL) sscanf(p_digit,"%u", &p_node->devId); diff --git a/ibdm/ibdm/TopoMatch.cpp b/ibdm/ibdm/TopoMatch.cpp index 11c9fdc..cbbb346 100644 --- a/ibdm/ibdm/TopoMatch.cpp +++ b/ibdm/ibdm/TopoMatch.cpp @@ -676,7 +676,7 @@ TopoReportMismatchedNode( IBSystem *p_system = p_node->p_system; // we always mark the board of the node by examining all but the "UXXX" - char *p_lastSlash = rindex(p_node->name.c_str(), '/'); + const char *p_lastSlash = rindex(p_node->name.c_str(), '/'); char nodeBoardName[512]; int boardNameLength; if (!p_lastSlash) { -- 1.6.3.1 From sebastien.dugue at bull.net Wed Sep 2 03:05:51 2009 From: sebastien.dugue at bull.net (sebastien dugue) Date: Wed, 2 Sep 2009 12:05:51 +0200 Subject: [ofa-general] [PATCH 2/6] ibutils/ibdm: Add -fPIC to libreplace build In-Reply-To: <20090902120353.3ee1a8e2@frecb007965> References: <20090902120353.3ee1a8e2@frecb007965> Message-ID: <20090902120551.5bfd3fe0@frecb007965> This allows to build under FC11. 
Otherwise, building shared libraries using libreplace results in the following error:

.../ibutils/ibdm/replace/libreplace.a(regex.o): relocation R_X86_64_32S against `a local symbol' can not be used when making a shared object; recompile with -fPIC
.../ibutils/ibdm/replace/libreplace.a: could not read symbols: Bad value

Signed-off-by: Sebastien Dugue
---
 ibdm/replace/Makefile.am | 1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/ibdm/replace/Makefile.am b/ibdm/replace/Makefile.am
index a878772..f03bb30 100644
--- a/ibdm/replace/Makefile.am
+++ b/ibdm/replace/Makefile.am
@@ -5,3 +5,4 @@ INCLUDES = -I$(top_builddir) -I$(top_srcdir)
 noinst_LIBRARIES = libreplace.a
 libreplace_a_SOURCES =
 libreplace_a_LIBADD = @LIBOBJS@
+AM_CFLAGS = -fPIC
--
1.6.3.1

From sebastien.dugue at bull.net Wed Sep 2 03:03:53 2009
From: sebastien.dugue at bull.net (sebastien dugue)
Date: Wed, 2 Sep 2009 12:03:53 +0200
Subject: [ofa-general] [PATCH 0/6] ibutils: Build fixes for FC11
Message-ID: <20090902120353.3ee1a8e2@frecb007965>

Hi,

here are some fixes I had to apply in order to be able to build under FC11 due to some changes in the toolchain.

Sebastien.
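
For readers who hit the same "invalid conversion from const char* to char*" error that patch 1/6 addresses, the following is a minimal sketch of the underlying issue, not code from ibdm. It assumes the newer toolchain's C++ headers declare strpbrk() so that a const char * argument (such as the result of std::string::c_str()) yields a const char * result, which is why the receiving pointer has to be const. The helper name device_id_from_name() is hypothetical.

```cpp
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper (not part of ibdm): return the first decimal number
// embedded in a name such as "MT23108", or 0 if the name has no digits.
unsigned device_id_from_name(const std::string &name)
{
    unsigned id = 0;

    // c_str() returns const char *, so the const-correct overload of
    // strpbrk() returns const char * as well. Declaring p_digit as plain
    // "char *" is what triggers the conversion error on such toolchains.
    const char *p_digit = std::strpbrk(name.c_str(), "0123456789");

    if (p_digit != NULL)
        std::sscanf(p_digit, "%u", &id);

    return id;
}
```

Calling device_id_from_name("MT23108") scans from the first digit onward, in the same way the patched makeSysNodes() code fills in devId.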
From sebastien.dugue at bull.net Wed Sep 2 03:09:11 2009 From: sebastien.dugue at bull.net (sebastien dugue) Date: Wed, 2 Sep 2009 12:09:11 +0200 Subject: [ofa-general] [PATCH 5/6] ibutils: Allow parallel build In-Reply-To: <20090902120353.3ee1a8e2@frecb007965> References: <20090902120353.3ee1a8e2@frecb007965> Message-ID: <20090902120911.28e49f38@frecb007965> Signed-off-by: Sebastien Dugue --- ibdm/src/Makefile.am | 4 ++-- ibmgtsim/src/Makefile.am | 4 +++- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/ibdm/src/Makefile.am b/ibdm/src/Makefile.am index b763387..682fb80 100644 --- a/ibdm/src/Makefile.am +++ b/ibdm/src/Makefile.am @@ -42,7 +42,7 @@ pkginclude_HEADERS = ibsysapi.h AM_CPPFLAGS = -I$(top_srcdir)/ibdm bin_PROGRAMS = ibdmchk ibdmtr ibtopodiff ibnlparse -LDADD = -L../ibdm -libdmcom +LDADD = -L../ibdm -libdmcom -L../replace -lreplace AM_LDFLAGS = -static # Why do we need to provide extra dep? the LDADD should have worked isnt it? @@ -61,7 +61,7 @@ ibnlparse_SOURCES = test_ibnl_parser.cpp lib_LTLIBRARIES = libibsysapi.la libibsysapi_la_SOURCES = ibsysapi.cpp libibsysapi_la_LDFLAGS = -version-info 1:0:0 -libibsysapi_la_LIBADD = -L../ibdm -libdmcom +libibsysapi_la_LIBADD = -L../ibdm -libdmcom -L../replace -lreplace #regexp_test_SOURCES = regexp_test.cpp diff --git a/ibmgtsim/src/Makefile.am b/ibmgtsim/src/Makefile.am index f23f2d8..f379a2a 100644 --- a/ibmgtsim/src/Makefile.am +++ b/ibmgtsim/src/Makefile.am @@ -55,7 +55,7 @@ bin_PROGRAMS = ibmssh ibmsquit #ibmgtsim test_msgmgr test_client test_server ibm if IBDM_REF_IS_USED # we assume we are building a parallel tree IBDM_PREFIX=$prefix -IBDM_LIB=-L../../ibdm/ibdm -libdmcom +IBDM_LIB=-L../../ibdm/ibdm -libdmcom -L../../ibdm/replace -lreplace IBDM_INC=-I$(srcdir)/../../ibdm/ibdm IBDM_IFC=$(srcdir)/../../ibdm/ibdm/ibdm.i else @@ -83,6 +83,8 @@ ibmssh_LDFLAGS = -static -Wl,-rpath -Wl,$(TCL_PREFIX)/lib \ ibmssh_LDADD = -libmscli $(IBDM_LIB) $(TCL_LIBS) -lpthread +ibmssh_DEPENDENCIES = libibmscli.la + 
# SWIG FILES: SWIG_IFC_FILES= $(srcdir)/sim.i $(srcdir)/ib_types.i $(IBDM_IFC) \ $(srcdir)/inttypes.i $(srcdir)/mads.i -- 1.6.3.1 From sebastien.dugue at bull.net Wed Sep 2 03:09:58 2009 From: sebastien.dugue at bull.net (sebastien dugue) Date: Wed, 2 Sep 2009 12:09:58 +0200 Subject: [ofa-general] [PATCH 6/6] ibutils: Fix libibsysapi build for old autotools In-Reply-To: <20090902120353.3ee1a8e2@frecb007965> References: <20090902120353.3ee1a8e2@frecb007965> Message-ID: <20090902120958.27fe75c5@frecb007965> Signed-off-by: Sebastien Dugue --- ibdm/src/Makefile.am | 1 - 1 files changed, 0 insertions(+), 1 deletions(-) diff --git a/ibdm/src/Makefile.am b/ibdm/src/Makefile.am index 682fb80..ec9cfbd 100644 --- a/ibdm/src/Makefile.am +++ b/ibdm/src/Makefile.am @@ -43,7 +43,6 @@ AM_CPPFLAGS = -I$(top_srcdir)/ibdm bin_PROGRAMS = ibdmchk ibdmtr ibtopodiff ibnlparse LDADD = -L../ibdm -libdmcom -L../replace -lreplace -AM_LDFLAGS = -static # Why do we need to provide extra dep? the LDADD should have worked isnt it? 
 # Deprecated : ibdmsim_DEPENDENCIES=../ibdm/libibdmcom.la
--
1.6.3.1

From sebastien.dugue at bull.net Wed Sep 2 03:07:55 2009
From: sebastien.dugue at bull.net (sebastien dugue)
Date: Wed, 2 Sep 2009 12:07:55 +0200
Subject: [ofa-general] [PATCH 4/6] ibutils: Add libibsysapi.so to the spec file
In-Reply-To: <20090902120353.3ee1a8e2@frecb007965>
References: <20090902120353.3ee1a8e2@frecb007965>
Message-ID: <20090902120755.6a6ad964@frecb007965>

Signed-off-by: Sebastien Dugue
---
 ibutils.spec.in | 1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/ibutils.spec.in b/ibutils.spec.in
index abc54dd..e27dbaf 100644
--- a/ibutils.spec.in
+++ b/ibutils.spec.in
@@ -130,6 +130,7 @@ rm -rf $RPM_BUILD_DIR/%{name}-%{version}
 %{_prefix}/bin/IBMgtSim
 %{_libdir}/libibmscli.so*
 %{_libdir}/libibmscli.a
+%{_libdir}/libibsysapi.so*
 %{_libdir}/libibsysapi.a
 %{_prefix}/include/ibmgtsim
 %{_prefix}/share/ibmgtsim
--
1.6.3.1

From sebastien.dugue at bull.net Wed Sep 2 03:06:46 2009
From: sebastien.dugue at bull.net (sebastien dugue)
Date: Wed, 2 Sep 2009 12:06:46 +0200
Subject: [ofa-general] [PATCH 3/6] ibutils/ibdm: Fix libibsysapi build
In-Reply-To: <20090902120353.3ee1a8e2@frecb007965>
References: <20090902120353.3ee1a8e2@frecb007965>
Message-ID: <20090902120646.0bc3db4b@frecb007965>

Add libibdmcom linker path to allow build under FC11.
Signed-off-by: Sebastien Dugue --- ibdm/src/Makefile.am | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/ibdm/src/Makefile.am b/ibdm/src/Makefile.am index 8b2f9ba..b763387 100644 --- a/ibdm/src/Makefile.am +++ b/ibdm/src/Makefile.am @@ -61,7 +61,7 @@ ibnlparse_SOURCES = test_ibnl_parser.cpp lib_LTLIBRARIES = libibsysapi.la libibsysapi_la_SOURCES = ibsysapi.cpp libibsysapi_la_LDFLAGS = -version-info 1:0:0 -libibsysapi_la_LIBADD = -libdmcom +libibsysapi_la_LIBADD = -L../ibdm -libdmcom #regexp_test_SOURCES = regexp_test.cpp -- 1.6.3.1 From hnrose at comcast.net Wed Sep 2 07:36:27 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Wed, 2 Sep 2009 10:36:27 -0400 Subject: [ofa-general] [PATCH] libibmad/dump.c: In mad_dump_portcapmask, decode new capabilities Message-ID: <20090902143627.GB10980@comcast.net> Per published MgtWG errata RefID 4484 - vendor specific MADs table support RefID 4626 - reverse path PKey support in PathRecord responses RefID 4635 - multicast FDB top support RefID 4644 - hierarchy support Signed-off-by: Hal Rosenstock --- diff --git a/libibmad/src/dump.c b/libibmad/src/dump.c index d97d359..1b287c0 100644 --- a/libibmad/src/dump.c +++ b/libibmad/src/dump.c @@ -2,6 +2,7 @@ * Copyright (c) 2004-2008 Voltaire Inc. All rights reserved. * Copyright (c) 2007 Xsigo Systems Inc. All rights reserved. * Copyright (c) 2009 Mellanox Technologies LTD. All rights reserved. + * Copyright (c) 2009 HNR Consulting. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU @@ -519,8 +520,14 @@ void mad_dump_portcapmask(char *buf, int bufsz, void *val, int valsz) if (mask & (1 << 27)) s += sprintf(s, "\t\t\t\tIsLinkSpeedWidthPairsTableSupported\n"); + if (mask & (1 << 28)) + s += sprintf(s, "\t\t\t\tIsVendorSpecificMadsTableSupported\n"); + if (mask & (1 << 29)) + s += sprintf(s, "\t\t\t\tIsiMcastPkeyTrapSuppressionSupported\n"); if (mask & (1 << 30)) s += sprintf(s, "\t\t\t\tIsMulticastFDBTopSupported\n"); + if (mask & (1 << 31)) + s += sprintf(s, "\t\t\t\tIsHierarchyInfoSupported\n"); if (s != buf) *(--s) = 0; From hnrose at comcast.net Wed Sep 2 07:31:33 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Wed, 2 Sep 2009 10:31:33 -0400 Subject: [ofa-general] [PATCH] opensm: Add support for MulticastFDBTop Message-ID: <20090902143133.GA10980@comcast.net> Add support for SwitchInfo:MulticastFDBTop Added by MgtWG errata #4505-4508 Also, per MgtWG RefID #4640, MulticastFDBTop value of 0xbfff means no entries In osm_mcast_mgr.c:mcast_mgr_set_mftables call new routine mcast_mgr_set_mfttop to set MulticastFDBTop in SwitchInfo based on max_block_in_use when switch port 0 indicates IsMulticastFDBTop is supported. 
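
As a worked check of the MulticastFDBTop arithmetic described above, here is a small sketch. The two constants are assumed to mirror OpenSM's IB_LID_MCAST_START_HO (0xC000) and IB_MCAST_BLOCK_SIZE (32); the function name mft_top_host_order() is hypothetical, and the value is returned in host byte order (the patch itself converts with cl_hton16 before comparing against SwitchInfo).

```cpp
#include <stdint.h>

// Assumed to match OpenSM's IB_LID_MCAST_START_HO and IB_MCAST_BLOCK_SIZE.
static const uint16_t MCAST_START_HO = 0xC000;
static const int MCAST_BLOCK_SIZE = 32;

// Hypothetical helper: highest MLID covered by the MFT blocks in use,
// in host byte order. max_block_in_use == -1 means no block is in use.
uint16_t mft_top_host_order(int max_block_in_use)
{
    if (max_block_in_use == -1)
        return static_cast<uint16_t>(MCAST_START_HO - 1);   // 0xBFFF: no entries

    return static_cast<uint16_t>(MCAST_START_HO +
                                 (max_block_in_use + 1) * MCAST_BLOCK_SIZE - 1);
}
```

With no blocks in use the result is 0xBFFF, the "no entries" value from RefID #4640; with only block 0 in use it is 0xC01F, the last MLID of the first 32-entry block.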
Signed-off-by: Hal Rosenstock --- diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c index d7c5ce1..3671e08 100644 --- a/opensm/opensm/osm_mcast_mgr.c +++ b/opensm/opensm/osm_mcast_mgr.c @@ -1066,6 +1066,83 @@ Exit: /********************************************************************** **********************************************************************/ +static void mcast_mgr_set_mfttop(IN osm_sm_t * sm, IN osm_switch_t * p_sw) +{ + osm_node_t *p_node; + osm_dr_path_t *p_path; + osm_physp_t *p_physp; + osm_mcast_tbl_t *p_tbl; + osm_madw_context_t context; + ib_api_status_t status; + ib_switch_info_t si; + boolean_t set_swinfo_require = FALSE; + uint16_t mcast_top; + uint8_t life_state; + + OSM_LOG_ENTER(sm->p_log); + + CL_ASSERT(p_sw); + + p_node = p_sw->p_node; + + CL_ASSERT(p_node); + + p_physp = osm_node_get_physp_ptr(p_node, 0); + p_path = osm_physp_get_dr_path_ptr(p_physp); + p_tbl = osm_switch_get_mcast_tbl_ptr(p_sw); + + if (p_physp->port_info.capability_mask & IB_PORT_CAP_HAS_MCAST_FDB_TOP) { + /* + Set the top of the multicast forwarding table. + */ + si = p_sw->switch_info; + if (p_tbl->max_block_in_use == -1) + mcast_top = cl_hton16(IB_LID_MCAST_START_HO - 1); + else + mcast_top = cl_hton16(IB_LID_MCAST_START_HO + + (p_tbl->max_block_in_use + 1) * IB_MCAST_BLOCK_SIZE - 1); + if (mcast_top != si.mcast_top) { + set_swinfo_require = TRUE; + si.mcast_top = mcast_top; + } + + /* check to see if the change state bit is on. If it is - then + we need to clear it. 
*/ + if (ib_switch_info_get_state_change(&si)) + life_state = ((sm->p_subn->opt.packet_life_time << 3) + | (si.life_state & IB_SWITCH_PSC)) & 0xfc; + else + life_state = (sm->p_subn->opt.packet_life_time << 3) & 0xf8; + + if (life_state != si.life_state || + ib_switch_info_get_state_change(&si)) { + set_swinfo_require = TRUE; + si.life_state = life_state; + } + + if (set_swinfo_require) { + OSM_LOG(sm->p_log, OSM_LOG_DEBUG, + "Setting switch MFT top to MLID 0x%x\n", + cl_ntoh16(si.mcast_top)); + + context.si_context.light_sweep = FALSE; + context.si_context.node_guid = osm_node_get_node_guid(p_node); + context.si_context.set_method = TRUE; + + status = osm_req_set(sm, p_path, (uint8_t *) & si, + sizeof(si), IB_MAD_ATTR_SWITCH_INFO, + 0, CL_DISP_MSGID_NONE, &context); + + if (status != IB_SUCCESS) + OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A1B: " + "Sending SwitchInfo attribute failed (%s)\n", + ib_get_err_str(status)); + } + } +} + +/********************************************************************** + **********************************************************************/ static int mcast_mgr_set_mftables(osm_sm_t * sm) { cl_qmap_t *p_sw_tbl = &sm->p_subn->sw_guid_tbl; @@ -1081,6 +1158,7 @@ static int mcast_mgr_set_mftables(osm_sm_t * sm) p_tbl = osm_switch_get_mcast_tbl_ptr(p_sw); if (osm_mcast_tbl_get_max_block_in_use(p_tbl) > max_block) max_block = osm_mcast_tbl_get_max_block_in_use(p_tbl); + mcast_mgr_set_mfttop(sm, p_sw); p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item); } diff --git a/opensm/opensm/osm_sa_class_port_info.c b/opensm/opensm/osm_sa_class_port_info.c index d2ab96a..fb58fe5 100644 --- a/opensm/opensm/osm_sa_class_port_info.c +++ b/opensm/opensm/osm_sa_class_port_info.c @@ -1,6 +1,6 @@ /* * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. + * Copyright (c) 2002-2009 Mellanox Technologies LTD. All rights reserved. 
* Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two @@ -159,8 +159,10 @@ static void cpi_rcv_respond(IN osm_sa_t * sa, IN const osm_madw_t * p_madw) OSM_CAP_IS_PORT_INFO_CAPMASK_MATCH_SUPPORTED; #endif if (sa->p_subn->opt.qos) - ib_class_set_cap_mask2(p_resp_cpi, OSM_CAP2_IS_QOS_SUPPORTED); - + ib_class_set_cap_mask2(p_resp_cpi, OSM_CAP2_IS_QOS_SUPPORTED | + OSM_CAP2_IS_MCAST_TOP_SUPPORTED); + else + ib_class_set_cap_mask2(p_resp_cpi, OSM_CAP2_IS_MCAST_TOP_SUPPORTED); if (!sa->p_subn->opt.disable_multicast) p_resp_cpi->cap_mask |= OSM_CAP_IS_UD_MCAST_SUP; p_resp_cpi->cap_mask = cl_hton16(p_resp_cpi->cap_mask); From hnrose at comcast.net Wed Sep 2 07:42:50 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Wed, 2 Sep 2009 10:42:50 -0400 Subject: [ofa-general] [PATCH] libibmad/mad.h: Add a couple of SM class attribute IDs Message-ID: <20090902144250.GC10980@comcast.net> VendorSpecificMadsTable added by MgtWG errata RefID 4482 Signed-off-by: Hal Rosenstock --- diff --git a/libibmad/include/infiniband/mad.h b/libibmad/include/infiniband/mad.h index 5f3b52b..94b64cf 100644 --- a/libibmad/include/infiniband/mad.h +++ b/libibmad/include/infiniband/mad.h @@ -133,6 +133,8 @@ enum SMI_ATTR_ID { IB_ATTR_VL_ARBITRATION = 0x18, IB_ATTR_LINEARFORWTBL = 0x19, IB_ATTR_MULTICASTFORWTBL = 0x1b, + IB_ATTR_LINKSPEEDWIDTHPAIRSTBL = 0x1c, + IB_ATTR_VENDORMADSTBL = 0x1d, IB_ATTR_SMINFO = 0x20, IB_ATTR_LAST From monis at Voltaire.COM Wed Sep 2 08:43:07 2009 From: monis at Voltaire.COM (Moni Shoua) Date: Wed, 02 Sep 2009 18:43:07 +0300 Subject: [ofa-general] [PATCHv2 RESEND] IB/IPoIB: Don't let a bad muticast address in the join list stop subsequent joins In-Reply-To: References: <4A929AC7.4060402@Voltaire.COM> Message-ID: <4A9E928B.6000507@Voltaire.COM> Roland Dreier wrote: > > Illegal multicast address can be handed for IPoIB from userspace. 
For example > > the command ip maddr add 33:33:00:00:00:01 dev ib0 injects an illegal muticast > > address to IPoIB that will start a join task for this address. However, whenever > > an illegal multicast address is passed to IPoIB it stops all subsequent > > requests from join attempts. That happens because IPoIB joins to multicast > > addresses in the order they arrived and doesn't handle the next address until the > > current address join finishes with success. > > > > This patch moves the multicast address to the end of the list after a join attempt. > > Even if the join fails the next attempt will be with a different address. > > > > Signed-off-by: Moni Shoua > > Was a consensus ever reached on this patch? Not yet. I got a comment from Jason which I intend to refer to soon. > > > - if (test_bit(IPOIB_MCAST_RUN, &priv->flags)) > > - queue_delayed_work(ipoib_workqueue, &priv->mcast_task, > > - mcast->backoff * HZ); > > + if (test_bit(IPOIB_MCAST_RUN, &priv->flags)) { > > + list_for_each_entry(next_mcast, &priv->multicast_list, list) { > > + if (!test_bit(IPOIB_MCAST_FLAG_SENDONLY, &next_mcast->flags) > > + && !test_bit(IPOIB_MCAST_FLAG_BUSY, &next_mcast->flags) > > + && !test_bit(IPOIB_MCAST_FLAG_ATTACHED, &next_mcast->flags)) > > + break; > > + } > > + if (&next_mcast->list != &priv->multicast_list) > > + queue_delayed_work(ipoib_workqueue, &priv->mcast_task, > > + next_mcast->backoff * HZ); > > + } > > I have to admit this duplicated loop doesn't look that attractive to > me... maybe factor it out into a helper or something? I'll take this comment to into the next version of the patch. thanks > > - R. 
From Fabrice.Boyrie at univ-montp2.fr Wed Sep 2 09:39:14 2009 From: Fabrice.Boyrie at univ-montp2.fr (BOYRIE Fabrice) Date: Wed, 2 Sep 2009 18:39:14 +0200 Subject: [ofa-general] Bad interaction between ofed, NFS and Gaussian Message-ID: <20090902163914.GA23358@lapinou.lsd.univ-montp2.fr> Hello, I hope this is the right mailing list. I have a problem with OFED 1.4.2 on CentOS 5.3. We have a new cluster with QDR InfiniBand. I installed OFED from source using the install.pl script with the default values, on the default CentOS kernel (2.6.18-128.7.1.el5). When a node starts, the openibd and opensmd services are launched. InfiniBand is working: ibv_devinfo hca_id: mlx4_0 fw_ver: 2.6.000 node_guid: 0002:c903:0004:3efc sys_image_guid: 0002:c903:0004:3eff vendor_id: 0x02c9 vendor_part_id: 26428 hw_ver: 0xA0 board_id: MT_0C40110009 phys_port_cnt: 1 port: 1 state: PORT_ACTIVE (4) max_mtu: 2048 (4) active_mtu: 2048 (4) sm_lid: 9 port_lid: 17 port_lmc: 0x00 If I launch an MPI program, e.g. vasp, it runs over the InfiniBand transport and the performance is good. So there is no problem until I launch a program that does not use InfiniBand: Gaussian. With some large calculations and with %ncpu=8, Gaussian aborts with the following message: ntrbks: Input/output error I launched Gaussian several times and it always aborted at the same point. If I run the same Gaussian on the same input file on our old cluster (same CentOS 5.3, same kernel, but without InfiniBand), it works. Searching the source code for ntrbks shows a call to fstatfs, so I straced Gaussian on the two clusters. Here is the relevant part.
New cluster: [pid 5715] execve("/opt/Gaussian/g03_e01-pgf//g03/l1002.exe", ["/opt/Gaussian/g03_e01-pgf//g03/l"..., "1258291200", "CpRh_H_Ph_EneTS1.chk", "1", "/tmp/CpRh_H_Ph_EneTS1/Gau-5715.i"...,"0", "/tmp/CpRh_H_Ph_EneTS1/Gau-5715.r"..., "0", "/tmp/CpRh_H_Ph_EneTS1/Gau-5715.d"..., "0", "/tmp/CpRh_H_Ph_EneTS1/Gau-5715.s"..., "0", "/tmp/CpRh_H_Ph_EneTS1/Gau-5714.i"..., "0", "junk.out", "0", ...], [/* 65 vars */] PANIC: attached pid 5816 exited with 0 [pid 5715] open("CpRh_H_Ph_EneTS1.chk", O_RDWR) = 5 [pid 5715] fstatfs(5, {f_type="NFS_SUPER_MAGIC", f_bsize=32768, f_blocks=13562292, f_bfree=12353558, f_bavail=12353558, f_files=434124416, f_ffree=434090957, f_fsid={0, 0}, f_namelen=255, f_frsize =32768}) = 0 [pid 5715] read(5, "\10\0\0\0\0\0\0\0", 8) = 8 [pid 5715] read(5, "\10\0\0\0\0\0\0\0\0\320\10\0\0\0\0\0\0\240\0\0\0\0\0\0\0\0\0\0\0\0 \0\0"..., 320032) = 320032 [pid 5715] fstatfs(5, 0x7fff553be6d0) = -1 EIO (Input/output error) Old cluster: [pid 8605] execve("/opt/Gaussian/g03_e01-pgf//g03/l1002.exe", ["/opt/Gaussian/g03_e01-pgf//g03/l"..., "1258291200", "CpRh_H_Ph_EneTS1.chk", "1", "/tmp/CpRh_H_Ph_EneTS1-8-pgf-2/Ga"..., "0", "/tmp/CpRh_H_Ph_EneTS1-8-pgf-2/Ga"..., "0", "/tmp/CpRh_H_Ph_EneTS1-8-pgf-2/Ga"..., "0", "/tmp/CpRh_H_Ph_EneTS1-8-pgf-2/Ga"..., "0", "/tmp/CpRh_H_Ph_EneTS1-8-pgf-2/Ga"..., "0", "junk.out", "0", ...], [/* 8 2 vars */]PANIC: attached pid 8701 exited with 0 [pid 8605] open("CpRh_H_Ph_EneTS1.chk", O_RDWR) = 5 [pid 8605] fstatfs(5, {f_type="NFS_SUPER_MAGIC", f_bsize=32768, f_blocks=9150944, f_bfree=686850, f_bavail=686850, f_files=88123232, f_ffree=87917195, f_fsid={0, 0}, f_namelen=255, f_frsize=32768}) = 0 [pid 8605] read(5, "'\0\0\0\0\0\0\0", 8) = 8 [pid 8605] read(5, "'\0\0\0\0\0\0\0\0\360$\0\0\0\0\0\0\240\0\0\0\0\0\0\0\0\0\0\0\0\0 \0"..., 320032) = 320032 [pid 8605] fstatfs(5, {f_type="NFS_SUPER_MAGIC", f_bsize=32768, f_blocks=9150944, f_bfree=683022, f_bavail=683022, f_files=87633232, f_ffree=87427156, f_fsid={0, 0}, f_namelen=255, 
f_frsize=32768}) = 0 [pid 8605] fstatfs(5, {f_type="NFS_SUPER_MAGIC", f_bsize=32768, f_blocks=9150944, f_bfree=683022, f_bavail=683022, f_files=87633232, f_ffree=87427156, f_fsid={0, 0}, f_namelen=255, f_frsize=32768}) = 0 [pid 8605] write(5, "'\0\0\0\0\0\0\0\0\360$\0\0\0\0\0\0\240\0\0\0\0\0\0\0\0\0\0\0\0\0 \0"..., 320032) = 320032 [pid 8605] close(5) = 0 If I put the input file in a local directory instead of an NFS one, Gaussian works. There are no messages in dmesg or in the /var/log directory, either on the node or on the NFS server. On the node, /home is mounted as 192.168.1.100:/home on /home type nfs (rw,nosuid,rsize=32768,proto=tcp,addr=192.168.1.100) 192.168.1.xxx is the Ethernet network (the NFS server has no InfiniBand card). On the node, it is enough to run «ifconfig ib0 down; service opensmd stop» to get Gaussian working on the NFS directory. («ifconfig ib0 down» or «service opensmd stop» alone is not enough.) So it seems there is an interaction between NFS access and OpenFabrics. But why? And how can it be solved? Thanks in advance, Fabrice BOYRIE From chris.tilt at rnanetworks.com Wed Sep 2 09:36:33 2009 From: chris.tilt at rnanetworks.com (Chris Tilt) Date: Wed, 2 Sep 2009 12:36:33 -0400 Subject: [ofa-general] Connection timeout on localhost (using libsdp) Message-ID: Hello, I am very hopeful of getting libsdp working with an existing application. Specifically, I am trying to port Erlang to use SDP for its "distributed Erlang" mechanism. With LD_PRELOAD, this may be very easy. However, I am having trouble with one of its daemon processes (a port map daemon called "epmd"). Conceptually, it's a very simple C program that runs either as a daemon or as a client that connects to the daemon to do queries. To be sure that I was getting SDP, I changed the source to use AF_INET_SDP by actually looking up that value in the include files (it was 27) and substituting that in place of AF_INET.
I started the process as a daemon with debugging on and it reported as normal "listening on port...". When I tried running it as a client, it attempted to connect to the known port on "localhost" and got a "Connection timeout" as reported by perror(2). Ideas? Cheers, Chris P.S. There are many users of Infiniband that would dearly love to see distributed Erlang running FAST on their systems, so this would potentially help a lot of customers! Thanks for your help. From worleys at gmail.com Wed Sep 2 10:27:44 2009 From: worleys at gmail.com (Chris Worley) Date: Wed, 2 Sep 2009 11:27:44 -0600 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: <4A9E4276.7040108@vlnb.net> References: <4A9D6FB9.1010509@vlnb.net> <4A9E4276.7040108@vlnb.net> Message-ID: On Wed, Sep 2, 2009 at 4:01 AM, Vladislav Bolkhovitin wrote: > Chris Worley, on 09/01/2009 11:24 PM wrote: >> >> On Tue, Sep 1, 2009 at 1:02 PM, Vladislav Bolkhovitin wrote: >>> >>> I'd suggest you to enable lockdep on the target. Google for more details >>> how >>> to do it. >>> >>> Also you should additional enable "mgmt_minor" SCST core trace level and >>> only it. Don't enable "all", its output useful only in very special >>> circumstances. >> >> Could you be more explicit in how to enable specific trace levels?
>> >> For example, "all" causes the following: >> >> # cat /proc/scsi_tgt/vdisk/trace_level >> out_of_mem | minor | sg | mem | buff | entryexit | pid | line | >> function | debug | special | scsi | mgmt | mgmt_minor | mgmt_dbg | >> order >> # cat /proc/scsi_tgt/trace_level >> out_of_mem | minor | sg | mem | buff | entryexit | pid | line | >> function | debug | special | scsi | mgmt | mgmt_minor | mgmt_dbg | >> retry | scsi_serializing | recv_bot | send_bot | recv_top | send_top >> >> I tried echoing just some of those flags to cut down on excess >> verbosity, but would get errors like: >> >> # echo "minor | sg | mem" >/proc/scsi_tgt/vdisk/trace_level >> bash: echo: write error: Invalid argument >> root at fusion-io:/boot# dmesg | tail -1l >> [330010.019198] scst: ***ERROR***: Unknown action "minor | sg | mem" >> >>> Usually to investigate a problem like yours, the default >>> flags in the debug build + "mgmt_minor" are sufficient. >> >> I tried "default" and didn't get any messages on the hang. > > See /proc/scsi_tgt/help for help about all SCST proc commands. (The latest > commits have some cleanups in this area.) 
I found three "trace tokens" that were causing the majority of the messages overflow: echo "all" >/proc/scsi_tgt/vdisk/trace_level (echo "del scsi" ;echo "del recv_bot";echo "del send_bot") >/proc/scsi_tgt/vdisk/trace_level echo "all" >/proc/scsi_tgt/trace_level (echo "del scsi" ;echo "del recv_bot";echo "del send_bot") >/proc/scsi_tgt/trace_level Then, the majority of messages were the pair: Sep 2 17:12:22 nameme kernel: [408676.552666] [0]: scst: __scst_init_cmd:3361:Too many pending commands (50) in session, returning BUSY to initiator "0x0002c903000260470002c90300026047" Sep 2 17:12:22 nameme kernel: [408676.552670] [0]: scst: scst_set_busy:366:Sending QUEUE_FULL status to initiator 0x0002c903000260470002c90300026047 (cmds count 79, queue_type 1, sess->init_phase 3) # grep -e "returning BUSY to initiator" /var/log/messages | wc -l 221834 # grep -e "Sending QUEUE_FULL status to initiator" /var/log/messages | wc -l 167948 Maybe the initiator has stopped due to its "busy" tracking... deciding the target cannot continue? Is the queue settable to larger than 50 outstanding commands? Thanks, Chris > >> Thanks, >> >> Chris >>> >>> Vlad >>> >>> Chris Worley, on 09/01/2009 03:04 AM wrote: >>>> >>>> On Wed, Aug 12, 2009 at 12:15 AM, Bart Van >>>> Assche wrote: >>>>> >>>>> On Tue, Aug 11, 2009 at 11:52 PM, Chris Worley >>>>> wrote: >>>>>> >>>>>> I setup my target exactly as you prescribe... but my initiator is >>>>>> still Windows (version of WInOF at top): performance as relayed by >>>>>> IOMeter starts high and the average slowly decreases.  Watching the >>>>>> instantaneous throughput, there seem to be longer and longer lags of >>>>>> poor performance. between moments of good performance.  I need to run >>>>>> this against a Linux initiator to see if the problems are w/ WinOF. >>>>>> >>>>>> Using OFED 1.4.1 (w/ the stock RHEL kernel) on the target, the >>>>>> performance was steady and getting close to acceptable.  
In a 15 hour >>>>>> test that cycles through sequential and random LBA's and R/W mixes >>>>>> from block sizes from 1MB to 512B, it worked well and got decent >>>>>> performance until it hit 1KB sequential reads which hung IOMeter; no >>>>>> messages on the Linux side (all looked okay).  IBSRP on the Windows >>>>>> side just said "a reset to device was issued" every 15 to 30 seconds >>>>>> after the problem started. I reloaded the IB stack on the Linux side, >>>>>> and was able to get it restarted. >>>>>> >>>>>> Still a lot of combinations to test. >>>>> >>>>> Which trace settings are you using on the target ? Enabling the proper >>>>> trace settings via /proc/scsi_tgt/trace_level might reveal whether you >>>>> are e.g. hitting the QUEUE_FULL condition. See also scst/README. >>>> >>>> I've found a good kernel/scst mix to easily repeat this; I can get it >>>> to repeatedly hang w/ 8K block transfers running Ubuntu 9.04 w/ the >>>> 2.6.27-14-server kernel on _both_ target and initiator (i.e. no WinOF >>>> or OFED at all) and SCST rev 1062 on the target using one drive >>>> (performance is >600MB/s, >80K IOPS, on the 8KB block sizes being >>>> used). >>>> >>>> Although the problem doesn't occur in Windows until blocks are <2KB >>>> and the RHEL5.2/OFED configuration does not repeat the issue using a >>>> Linux initiator, it seems like a very similar hang, so I'm hoping it's >>>> the same issue. >>>> >>>> To repeat the issue, I run 8KB block random reads w/ 64 threads, >>>> running AIO calls w/ a depth of 64 (using "fio" on the initiator): >>>> >>>> # fio --rw=randrw --bs=8k --rwmixread=100 --numjobs=64 --iodepth=64 >>>> --sync=0 --direct=1 --randrepeat=0 --ioengine=libaio >>>> --filename=/dev/sdn --name=test --loops=10000 --size=16091503001 >>>> >>>> The "size" represents 10% of the drive.  It doesn't seem to ever >>>> happen on writes, but I've seen it happen on mixed reads/writes. 
>>>> >>>> With tracing set to "default", there was still nothing in the target >>>> logs at the time of the hang. >>>> >>>> With tracing set thusly on the target: >>>> >>>> echo "all" >/proc/scsi_tgt/trace_level >>>> echo "all" >/proc/scsi_tgt/vdisk/trace_level >>>> >>>> The last few lines of dmesg look like: >>>> >>>> [255354.313411]    0: 28 00 01 84 54 90 00 00 10 00 00 00 00 00 00 00 >>>>  (...T........... >>>> [255354.313420] [0]: scst: scst_cmd_init_done:214:tag=62, lun=0, CDB >>>> len=16, queue_type=1 (cmd ffff880102b4a568) >>>> [255354.313443] [26358]: scst: scst_pre_parse:417:op_name >>>> (cmd ffff880102b4a3a0), direction=2 (expected 2, set yes), >>>> transfer_len=16 (expected len 8192), flags=1 >>>> [255354.313420] [0]: scst_cmd_init_done:216:Recieving CDB: >>>> [255354.313452] [8602]: scst: scst_xmit_response:3004:Xmitting data >>>> for cmd ffff880102b49e48 (sg_cnt 0, sg ffff880132579f60, sg[0].page >>>> ffffe200042b7180) >>>> [255354.313457] [8604]: scst: scst_xmit_response:3004:Xmitting data >>>> for cmd ffff880102b4a010 (sg_cnt 0, sg ffff8802e9806f60, sg[0].page >>>> ffffe2000bc129c0) >>>> [255354.313426]  (h)___0__1__2__3__4__5__6__7__8__9__A__B__C__D__E__F >>>> [255354.313426]    0: 28 00 01 bc 5d 10 00 00 10 00 00 00 00 00 00 00 >>>>  (...]........... >>>> [255354.313468] [26358]: scst: scst_pre_parse:417:op_name >>>> (cmd ffff880102b4a568), direction=2 (expected 2, set yes), >>>> transfer_len=16 (expected len 8192), flags=1 >>>> [255354.313484] [8602]: scst: scst_xmit_response:3004:Xmitting data >>>> for cmd ffff880102b4a1d8 (sg_cnt 0, sg ffff8802e98064c0, sg[0].page >>>> ffffe2000bc633c0) >>>> [255354.313551] [8604]: scst: scst_xmit_response:3004:Xmitting data >>>> for cmd ffff880102b4a3a0 (sg_cnt 0, sg ffff88018a877060, sg[0].page >>>> ffffe20004300200) >>>> [255354.313556] [8602]: scst: scst_xmit_response:3004:Xmitting data >>>> for cmd ffff880102b4a568 (sg_cnt 0, sg ffff880142581100, sg[0].page >>>> ffffe20004066d40) >>>> >>>> ... 
and there's a section like: >>>> >>>> [255354.310177]    0: 28 00 01 25 df 50 00 00 10 00 00 00 00 00 00 00 >>>>  (..%.P.......... >>>> [255354.310177] [0]: scst: scst_cmd_init_done:214:tag=57, lun=0, CDB >>>> len=16, queue_type=1 (cmd ffff8801642e2730) >>>> [255354.310177] [0]: scst_cmd_init_done:216:Recieving CDB: >>>> [255354.310177]  (h)___0__1__2__3__4__5__6__7__8__9__A__B__C__D__E__F >>>> [255354.310177]    0: 28 00 01 5e 22 c0 00 00 10 00 00 00 00 00 00 00 >>>>  (..^"........... >>>> [255354.310966] [26369]: scst: scst_pre_parse:417:op_name >>>> (cmd ffff880168a9e3a0), direction=2 (expected 2, set yes), >>>> transfer_len=16 (expected len 8192), flags=1 >>>> [255354.310973] [26361]: scst: scst_pre_parse:417:op_name >>>> (cmd ffff880168a9e010), direction=2 (expected 2, set yes), >>>> transfer_len=16 (expected len 8192), flags=1 >>>> [255354.310980] [26365]: scst: scst_pre_parse:417:op_name >>>> (cmd ffff880168a9e1d8), direction=2 (expected 2, set yes), >>>> transfer_len=16 (expected len 8192), flags=1 >>>> [255354.310986] [26359]: scst: scst_pre_parse:417:op_name >>>> (cmd ffff880168a9de48), direction=2 (expected 2, set yes), >>>> transfer_len=16 (expected len 8192), flags=1 >>>> ... 
>>>> [255354.311221] [8604]: scst: scst_xmit_response:3004:Xmitting data >>>> for cmd ffff880168a9e1d8 (sg_cnt 0, sg ffff880173ca8060, sg[0].page >>>> ffffe20004325d00) >>>> [255354.311226] [8602]: scst: scst_xmit_response:3004:Xmitting data >>>> for cmd ffff880168a9ee50 (sg_cnt 0, sg ffff880173ca8c40, sg[0].page >>>> ffffe20005847ec0) >>>> [255354.311233] [8604]: scst: scst_xmit_response:3004:Xmitting data >>>> for cmd ffff880168a9dc80 (sg_cnt 0, sg ffff8802f0143c40, sg[0].page >>>> ffffe2000bc04880) >>>> [255354.311238] [8602]: scst: scst_xmit_response:3004:Xmitting data >>>> for cmd ffff880168a9e568 (sg_cnt 0, sg ffff8802f08361a0, sg[0].page >>>> ffffe2000bbf2400) >>>> [255354.311242] [8604]: scst: scst_xmit_response:3004:Xmitting data >>>> for cmd ffff880168a9d560 (sg_cnt 0, sg ffff88010acd74c0, sg[0].page >>>> ffffe200047e7280) >>>> >>>> ... but, prior to that, messages are unreadably garbled, as in: >>>> >>>> Aug 31 22:37:00 nameme kernel: t]9l ft48 r(09 ,83_5p  s20 sg:303 >>>> _00s3]c_=cs  _00ad0000e_003a6_0031_4(ea5 9arg )_2As_05s_8[7:c8[f3 _178 >>>> 087gff0 .R nt]9i0tmpd1:ft st06s68 5i9[301602_106)o6 _001e4 0)0 >>>> .3E3_28a9102 pft0>e_o[.eo[<_2n05 98_0f8_i xpe1f0 D<98s np8one:21_0 >>>> 30f3006=e_ ax R8gs=h62]= 2.pd_ pad555mlf >>>> 1_]f8=.05lf i7gxs_ac3 m_0c0:]5i3087[_ 5e sg,00[dc3e,_ 0[ ( 1<[t]F] >>>> ..eb 4t_ ah1,_1_]10.h45_]2,5__12C5o 37 d_.)b_g4f850s, t1e c80.ite.8pE >>>> ue2.4f[.ft0 5c5_1effft 5530 f len=16, 5v03,em_cs4e 05fc78.5r5. 
n >>>> ,45ft45ff>>> .t)m9.8)9.8077=s  _C 3 i8 .tlsf5_[0s0 (2u fu 4 >>>> 5fco5fnr.n0a05_34f__4fd_4n Bs60fn4pB.tor7=s >>>> _i8s7=0_.tl:c>l3e0.51_654.30350en.m C30 C3 e f.dtm0=2_1e0n]6qe  d.>_ >>>> 76 d=f _esr_tp 9_50.tnf50[cs., >>>> Aug 31 22:37:00 nameme kernel: e .0 5 B , 45 0>>> Aug 31 22:37:00 nameme kernel:  c2< s0< cm38cf58.[f10 002< c3De >>>> _)088m8 9c5299pected__F >>>> Aug 31 22:37:00 nameme kernel: tran50 pt48)=8]=s59etl5pe4e6d)0c6 >>>> ei_2(e_<3cc_ ea51es_0_sras A >cmdtesafe4 3[m 3.rer7:[ 1b00s5 >>>> Aug 31 22:37:00 nameme kernel: ] 2a015ffs.35fff  B__ a >>>> 6cmd9spre3se9_2e3806(3_csA_  1 ns38ge0sre0 >>>> Aug 31 22:37:00 nameme kernel: >>> .,76.90330B005]08s3 __ r40r._5x,>>> :2ec_ :06cs1_0ti1d l:253064enfe7]0 abd5 0f>196.t b 7.(008ni] >>>> 0s09.r650t, <24]__ s1=in03 s0p c2>>[4ein.1:ooD..ps210a>[25534_r6,:t >>>> n4.]4(8 e2 .r c 2n1g9360]10>(  00 00 00 00[fd[2 >>>> [2g_re53  le_6c_md8t_ftc883tf03c  m_0 :8r8fmd63m3:0] 25 >>>> c6>[2n_e:fa2e84_0 >>>> Aug 31 22:37:00 nameme kernel: c, >>>> Aug 31 22:37:00 nameme kernel: .=0>5f=1s5=1d6_(de:d >>>> 2l_25:0edg25fm>ff40 l440 e,AFg l)AF0 0o[1088. 1aggB >>>> 0n=d9(16a.5oeX6csf00s0: ._, (=10es_(1 7 5c___oR5st_42p3d 7 >>>> C9d=5_:(3__7mD4_ 0m4_ed >>>> 04,5.,[s55.d4c,,25=,c8__q,[(meet9303_mr0ue9m0u_032__fy2se >>>> Aug 31 22:37:00 nameme kernel: >  y>i >>>> >>>> ... so other suggestions on trace settings would be appreciated. >>>> >>>> Thanks, >>>> >>>> Chris >>>>> >>>>> Bart. > > From rdreier at cisco.com Wed Sep 2 10:42:44 2009 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 02 Sep 2009 10:42:44 -0700 Subject: [ofa-general] [PATCH] IPoIB: check multicast address format (V2) In-Reply-To: <20090901202747.GP406@obsidianresearch.com> (Jason Gunthorpe's message of "Tue, 1 Sep 2009 14:27:47 -0600") References: <20090821000431.GA5713@obsidianresearch.com> <20090901202747.GP406@obsidianresearch.com> Message-ID: thanks for updating, applied. 
From rdreier at cisco.com Wed Sep 2 10:47:17 2009 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 02 Sep 2009 10:47:17 -0700 Subject: [ofa-general] Re: [PATCHv4] IB/mad: Allow tuning of QP0 and QP1 sizes In-Reply-To: <20090814113143.GA18401@comcast.net> (Hal Rosenstock's message of "Fri, 14 Aug 2009 07:31:43 -0400") References: <20090814113143.GA18401@comcast.net> Message-ID: applied -- would be nice to have a way to do this automatically instead of yet another tunable to sysadmins to worry about, but oh well. From worleys at gmail.com Wed Sep 2 11:01:54 2009 From: worleys at gmail.com (Chris Worley) Date: Wed, 2 Sep 2009 12:01:54 -0600 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4A9D6FB9.1010509@vlnb.net> <4A9E4276.7040108@vlnb.net> Message-ID: On Wed, Sep 2, 2009 at 11:27 AM, Chris Worley wrote: > On Wed, Sep 2, 2009 at 4:01 AM, Vladislav Bolkhovitin wrote: >> Chris Worley, on 09/01/2009 11:24 PM wrote: >>> >>> On Tue, Sep 1, 2009 at 1:02 PM, Vladislav Bolkhovitin wrote: >>>> >>>> I'd suggest you to enable lockdep on the target. Google for more details >>>> how >>>> to do it. >>>> >>>> Also you should additional enable "mgmt_minor" SCST core trace level and >>>> only it. Don't enable "all", its output useful only in very special >>>> circumstances. >>> >>> Could you be more explicit in how to enable specific trace levels? 
>>> >>> For example, "all" causes the following: >>> >>> # cat /proc/scsi_tgt/vdisk/trace_level >>> out_of_mem | minor | sg | mem | buff | entryexit | pid | line | >>> function | debug | special | scsi | mgmt | mgmt_minor | mgmt_dbg | >>> order >>> # cat /proc/scsi_tgt/trace_level >>> out_of_mem | minor | sg | mem | buff | entryexit | pid | line | >>> function | debug | special | scsi | mgmt | mgmt_minor | mgmt_dbg | >>> retry | scsi_serializing | recv_bot | send_bot | recv_top | send_top >>> >>> I tried echoing just some of those flags to cut down on excess >>> verbosity, but would get errors like: >>> >>> # echo "minor | sg | mem" >/proc/scsi_tgt/vdisk/trace_level >>> bash: echo: write error: Invalid argument >>> # dmesg | tail -1l >>> [330010.019198] scst: ***ERROR***: Unknown action "minor | sg | mem" >>> >>>> Usually to investigate a problem like yours, the default >>>> flags in the debug build + "mgmt_minor" are sufficient. >>> >>> I tried "default" and didn't get any messages on the hang. >> >> See /proc/scsi_tgt/help for help about all SCST proc commands. (The latest >> commits have some cleanups in this area.) 
> > I found three "trace tokens" that were causing the majority of the > messages overflow: > > echo "all" >/proc/scsi_tgt/vdisk/trace_level > (echo "del scsi" ;echo "del recv_bot";echo "del send_bot") >>/proc/scsi_tgt/vdisk/trace_level > echo "all" >/proc/scsi_tgt/trace_level > (echo "del scsi" ;echo "del recv_bot";echo "del send_bot") >>/proc/scsi_tgt/trace_level > > Then, the majority of messages were the pair: > > Sep  2 17:12:22 nameme kernel: [408676.552666] [0]: scst: > __scst_init_cmd:3361:Too many pending commands (50) in session, > returning BUSY to initiator "0x0002c903000260470002c90300026047" > Sep  2 17:12:22 nameme kernel: [408676.552670] [0]: scst: > scst_set_busy:366:Sending QUEUE_FULL status to initiator > 0x0002c903000260470002c90300026047 (cmds count 79, queue_type 1, > sess->init_phase 3) > > # grep -e "returning BUSY to initiator" /var/log/messages | wc -l > 221834 > # grep -e "Sending QUEUE_FULL status to initiator" /var/log/messages | wc -l > 167948 I got rid of those two messages, along with: "Xmitting data for cmd" from scst_targ.c, and I couldn't enable the "scsi" trace token, but could re-enable the "recv_bot", and the logs were filled with: [410785.218625] [0]: scst_cmd_init_done:216:Recieving CDB: [410785.218625] (h)___0__1__2__3__4__5__6__7__8__9__A__B__C__D__E__F [410785.218628] 0: 28 00 00 b8 33 90 00 00 10 00 00 00 00 00 00 00 (...3........... > > Maybe the initiator has stopped due to its "busy" tracking... deciding > the target cannot continue? Note that once this occurs, I don't need to restart the target, I just need to add the target drives on the initiator again... adding more entries to /proc/partitions... I can't use the partitions that are hung on the initiator, but I can use the newly added partitions. This makes me think this is an initiator issue. Is there any debugging to enable for ib_srp? But the contradictory results show the problem seems to follow the target used, and should therefore be a target issue (i.e.
the 8KB problem w/ the Ubuntu target happens w/ Ubuntu or WinOF initiator, and the <2KB problem w/ the OFED/RHEL stack works fine w/ an RHEL initiator, but not WinOF). Thanks, Chris > > Is the queue settable to larger than 50 outstanding commands? > > Thanks, > > Chris >> >>> Thanks, >>> >>> Chris >>>> >>>> Vlad >>>> >>>> Chris Worley, on 09/01/2009 03:04 AM wrote: >>>>> >>>>> On Wed, Aug 12, 2009 at 12:15 AM, Bart Van >>>>> Assche wrote: >>>>>> >>>>>> On Tue, Aug 11, 2009 at 11:52 PM, Chris Worley >>>>>> wrote: >>>>>>> >>>>>>> I setup my target exactly as you prescribe... but my initiator is >>>>>>> still Windows (version of WInOF at top): performance as relayed by >>>>>>> IOMeter starts high and the average slowly decreases.  Watching the >>>>>>> instantaneous throughput, there seem to be longer and longer lags of >>>>>>> poor performance. between moments of good performance.  I need to run >>>>>>> this against a Linux initiator to see if the problems are w/ WinOF. >>>>>>> >>>>>>> Using OFED 1.4.1 (w/ the stock RHEL kernel) on the target, the >>>>>>> performance was steady and getting close to acceptable.  In a 15 hour >>>>>>> test that cycles through sequential and random LBA's and R/W mixes >>>>>>> from block sizes from 1MB to 512B, it worked well and got decent >>>>>>> performance until it hit 1KB sequential reads which hung IOMeter; no >>>>>>> messages on the Linux side (all looked okay).  IBSRP on the Windows >>>>>>> side just said "a reset to device was issued" every 15 to 30 seconds >>>>>>> after the problem started. I reloaded the IB stack on the Linux side, >>>>>>> and was able to get it restarted. >>>>>>> >>>>>>> Still a lot of combinations to test. >>>>>> >>>>>> Which trace settings are you using on the target ? Enabling the proper >>>>>> trace settings via /proc/scsi_tgt/trace_level might reveal whether you >>>>>> are e.g. hitting the QUEUE_FULL condition. See also scst/README. 
>>>>> >>>>> I've found a good kernel/scst mix to easily repeat this; I can get it >>>>> to repeatedly hang w/ 8K block transfers running Ubuntu 9.04 w/ the >>>>> 2.6.27-14-server kernel on _both_ target and initiator (i.e. no WinOF >>>>> or OFED at all) and SCST rev 1062 on the target using one drive >>>>> (performance is >600MB/s, >80K IOPS, on the 8KB block sizes being >>>>> used). >>>>> >>>>> Although the problem doesn't occur in Windows until blocks are <2KB >>>>> and the RHEL5.2/OFED configuration does not repeat the issue using a >>>>> Linux initiator, it seems like a very similar hang, so I'm hoping it's >>>>> the same issue. >>>>> >>>>> To repeat the issue, I run 8KB block random reads w/ 64 threads, >>>>> running AIO calls w/ a depth of 64 (using "fio" on the initiator): >>>>> >>>>> # fio --rw=randrw --bs=8k --rwmixread=100 --numjobs=64 --iodepth=64 >>>>> --sync=0 --direct=1 --randrepeat=0 --ioengine=libaio >>>>> --filename=/dev/sdn --name=test --loops=10000 --size=16091503001 >>>>> >>>>> The "size" represents 10% of the drive.  It doesn't seem to ever >>>>> happen on writes, but I've seen it happen on mixed reads/writes. >>>>> >>>>> With tracing set to "default", there was still nothing in the target >>>>> logs at the time of the hang. >>>>> >>>>> With tracing set thusly on the target: >>>>> >>>>> echo "all" >/proc/scsi_tgt/trace_level >>>>> echo "all" >/proc/scsi_tgt/vdisk/trace_level >>>>> >>>>> The last few lines of dmesg look like: >>>>> >>>>> [255354.313411]    0: 28 00 01 84 54 90 00 00 10 00 00 00 00 00 00 00 >>>>>  (...T........... 
>>>>> [255354.313420] [0]: scst: scst_cmd_init_done:214:tag=62, lun=0, CDB >>>>> len=16, queue_type=1 (cmd ffff880102b4a568) >>>>> [255354.313443] [26358]: scst: scst_pre_parse:417:op_name >>>>> (cmd ffff880102b4a3a0), direction=2 (expected 2, set yes), >>>>> transfer_len=16 (expected len 8192), flags=1 >>>>> [255354.313420] [0]: scst_cmd_init_done:216:Recieving CDB: >>>>> [255354.313452] [8602]: scst: scst_xmit_response:3004:Xmitting data >>>>> for cmd ffff880102b49e48 (sg_cnt 0, sg ffff880132579f60, sg[0].page >>>>> ffffe200042b7180) >>>>> [255354.313457] [8604]: scst: scst_xmit_response:3004:Xmitting data >>>>> for cmd ffff880102b4a010 (sg_cnt 0, sg ffff8802e9806f60, sg[0].page >>>>> ffffe2000bc129c0) >>>>> [255354.313426]  (h)___0__1__2__3__4__5__6__7__8__9__A__B__C__D__E__F >>>>> [255354.313426]    0: 28 00 01 bc 5d 10 00 00 10 00 00 00 00 00 00 00 >>>>>  (...]........... >>>>> [255354.313468] [26358]: scst: scst_pre_parse:417:op_name >>>>> (cmd ffff880102b4a568), direction=2 (expected 2, set yes), >>>>> transfer_len=16 (expected len 8192), flags=1 >>>>> [255354.313484] [8602]: scst: scst_xmit_response:3004:Xmitting data >>>>> for cmd ffff880102b4a1d8 (sg_cnt 0, sg ffff8802e98064c0, sg[0].page >>>>> ffffe2000bc633c0) >>>>> [255354.313551] [8604]: scst: scst_xmit_response:3004:Xmitting data >>>>> for cmd ffff880102b4a3a0 (sg_cnt 0, sg ffff88018a877060, sg[0].page >>>>> ffffe20004300200) >>>>> [255354.313556] [8602]: scst: scst_xmit_response:3004:Xmitting data >>>>> for cmd ffff880102b4a568 (sg_cnt 0, sg ffff880142581100, sg[0].page >>>>> ffffe20004066d40) >>>>> >>>>> ... and there's a section like: >>>>> >>>>> [255354.310177]    0: 28 00 01 25 df 50 00 00 10 00 00 00 00 00 00 00 >>>>>  (..%.P.......... 
>>>>> [255354.310177] [0]: scst: scst_cmd_init_done:214:tag=57, lun=0, CDB >>>>> len=16, queue_type=1 (cmd ffff8801642e2730) >>>>> [255354.310177] [0]: scst_cmd_init_done:216:Recieving CDB: >>>>> [255354.310177]  (h)___0__1__2__3__4__5__6__7__8__9__A__B__C__D__E__F >>>>> [255354.310177]    0: 28 00 01 5e 22 c0 00 00 10 00 00 00 00 00 00 00 >>>>>  (..^"........... >>>>> [255354.310966] [26369]: scst: scst_pre_parse:417:op_name >>>>> (cmd ffff880168a9e3a0), direction=2 (expected 2, set yes), >>>>> transfer_len=16 (expected len 8192), flags=1 >>>>> [255354.310973] [26361]: scst: scst_pre_parse:417:op_name >>>>> (cmd ffff880168a9e010), direction=2 (expected 2, set yes), >>>>> transfer_len=16 (expected len 8192), flags=1 >>>>> [255354.310980] [26365]: scst: scst_pre_parse:417:op_name >>>>> (cmd ffff880168a9e1d8), direction=2 (expected 2, set yes), >>>>> transfer_len=16 (expected len 8192), flags=1 >>>>> [255354.310986] [26359]: scst: scst_pre_parse:417:op_name >>>>> (cmd ffff880168a9de48), direction=2 (expected 2, set yes), >>>>> transfer_len=16 (expected len 8192), flags=1 >>>>> ... >>>>> [255354.311221] [8604]: scst: scst_xmit_response:3004:Xmitting data >>>>> for cmd ffff880168a9e1d8 (sg_cnt 0, sg ffff880173ca8060, sg[0].page >>>>> ffffe20004325d00) >>>>> [255354.311226] [8602]: scst: scst_xmit_response:3004:Xmitting data >>>>> for cmd ffff880168a9ee50 (sg_cnt 0, sg ffff880173ca8c40, sg[0].page >>>>> ffffe20005847ec0) >>>>> [255354.311233] [8604]: scst: scst_xmit_response:3004:Xmitting data >>>>> for cmd ffff880168a9dc80 (sg_cnt 0, sg ffff8802f0143c40, sg[0].page >>>>> ffffe2000bc04880) >>>>> [255354.311238] [8602]: scst: scst_xmit_response:3004:Xmitting data >>>>> for cmd ffff880168a9e568 (sg_cnt 0, sg ffff8802f08361a0, sg[0].page >>>>> ffffe2000bbf2400) >>>>> [255354.311242] [8604]: scst: scst_xmit_response:3004:Xmitting data >>>>> for cmd ffff880168a9d560 (sg_cnt 0, sg ffff88010acd74c0, sg[0].page >>>>> ffffe200047e7280) >>>>> >>>>> ... 
but, prior to that, messages are unreadably garbled, as in: >>>>> >>>>> Aug 31 22:37:00 nameme kernel: t]9l ft48 r(09 ,83_5p  s20 sg:303 >>>>> _00s3]c_=cs  _00ad0000e_003a6_0031_4(ea5 9arg )_2As_05s_8[7:c8[f3 _178 >>>>> 087gff0 .R nt]9i0tmpd1:ft st06s68 5i9[301602_106)o6 _001e4 0)0 >>>>> .3E3_28a9102 pft0>e_o[.eo[<_2n05 98_0f8_i xpe1f0 D<98s np8one:21_0 >>>>> 30f3006=e_ ax R8gs=h62]= 2.pd_ pad555mlf >>>>> 1_]f8=.05lf i7gxs_ac3 m_0c0:]5i3087[_ 5e sg,00[dc3e,_ 0[ ( 1<[t]F] >>>>> ..eb 4t_ ah1,_1_]10.h45_]2,5__12C5o 37 d_.)b_g4f850s, t1e c80.ite.8pE >>>>> ue2.4f[.ft0 5c5_1effft 5530 f len=16, 5v03,em_cs4e 05fc78.5r5. n >>>>> ,45ft45ff>>>> .t)m9.8)9.8077=s  _C 3 i8 .tlsf5_[0s0 (2u fu 4 >>>>> 5fco5fnr.n0a05_34f__4fd_4n Bs60fn4pB.tor7=s >>>>> _i8s7=0_.tl:c>l3e0.51_654.30350en.m C30 C3 e f.dtm0=2_1e0n]6qe  d.>_ >>>>> 76 d=f _esr_tp 9_50.tnf50[cs., >>>>> Aug 31 22:37:00 nameme kernel: e .0 5 B , 45 0>>>> Aug 31 22:37:00 nameme kernel:  c2< s0< cm38cf58.[f10 002< c3De >>>>> _)088m8 9c5299pected__F >>>>> Aug 31 22:37:00 nameme kernel: tran50 pt48)=8]=s59etl5pe4e6d)0c6 >>>>> ei_2(e_<3cc_ ea51es_0_sras A >cmdtesafe4 3[m 3.rer7:[ 1b00s5 >>>>> Aug 31 22:37:00 nameme kernel: ] 2a015ffs.35fff  B__ a >>>>> 6cmd9spre3se9_2e3806(3_csA_  1 ns38ge0sre0 >>>>> Aug 31 22:37:00 nameme kernel: >>>> .,76.90330B005]08s3 __ r40r._5x,>>>> :2ec_ :06cs1_0ti1d l:253064enfe7]0 abd5 0f>196.t b 7.(008ni] >>>>> 0s09.r650t, <24]__ s1=in03 s0p c2>>[4ein.1:ooD..ps210a>[25534_r6,:t >>>>> n4.]4(8 e2 .r c 2n1g9360]10>(  00 00 00 00[fd[2 >>>>> [2g_re53  le_6c_md8t_ftc883tf03c  m_0 :8r8fmd63m3:0] 25 >>>>> c6>[2n_e:fa2e84_0 >>>>> Aug 31 22:37:00 nameme kernel: c, >>>>> Aug 31 22:37:00 nameme kernel: .=0>5f=1s5=1d6_(de:d >>>>> 2l_25:0edg25fm>ff40 l440 e,AFg l)AF0 0o[1088. 1aggB >>>>> 0n=d9(16a.5oeX6csf00s0: ._, (=10es_(1 7 5c___oR5st_42p3d 7 >>>>> C9d=5_:(3__7mD4_ 0m4_ed >>>>> 04,5.,[s55.d4c,,25=,c8__q,[(meet9303_mr0ue9m0u_032__fy2se >>>>> Aug 31 22:37:00 nameme kernel: >  y>i >>>>> >>>>> ... 
so other suggestions on trace settings would be appreciated. >>>>> >>>>> Thanks, >>>>> >>>>> Chris >>>>>> >>>>>> Bart. >> >> > From bart.vanassche at gmail.com Wed Sep 2 11:05:22 2009 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Wed, 2 Sep 2009 20:05:22 +0200 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4A9D6FB9.1010509@vlnb.net> <4A9E4276.7040108@vlnb.net> Message-ID: On Wed, Sep 2, 2009 at 7:27 PM, Chris Worley wrote: > > Then, the majority of messages were the pair: > > Sep  2 17:12:22 nameme kernel: [408676.552666] [0]: scst: > __scst_init_cmd:3361:Too many pending commands (50) in session, > returning BUSY to initiator "0x0002c903000260470002c90300026047" > Sep  2 17:12:22 nameme kernel: [408676.552670] [0]: scst: > scst_set_busy:366:Sending QUEUE_FULL status to initiator > 0x0002c903000260470002c90300026047 (cmds count 79, queue_type 1, > sess->init_phase 3) > > # grep -e "returning BUSY to initiator" /var/log/messages | wc -l > 221834 > # grep -e "Sending QUEUE_FULL status to initiator" /var/log/messages | wc -l > 167948 > > Maybe the initiator has stopped due to its "busy" tracking... deciding > the target cannot continue? > > Is the queue settable to larger than 50 outstanding commands? Definitely. The details are explained in the document srpt/README. Bart. 
From worleys at gmail.com Wed Sep 2 11:17:03 2009 From: worleys at gmail.com (Chris Worley) Date: Wed, 2 Sep 2009 12:17:03 -0600 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4A9D6FB9.1010509@vlnb.net> <4A9E4276.7040108@vlnb.net> Message-ID: On Wed, Sep 2, 2009 at 12:05 PM, Bart Van Assche wrote: > On Wed, Sep 2, 2009 at 7:27 PM, Chris Worley wrote: >> >> Then, the majority of messages were the pair: >> >> Sep  2 17:12:22 nameme kernel: [408676.552666] [0]: scst: >> __scst_init_cmd:3361:Too many pending commands (50) in session, >> returning BUSY to initiator "0x0002c903000260470002c90300026047" >> Sep  2 17:12:22 nameme kernel: [408676.552670] [0]: scst: >> scst_set_busy:366:Sending QUEUE_FULL status to initiator >> 0x0002c903000260470002c90300026047 (cmds count 79, queue_type 1, >> sess->init_phase 3) >> >> # grep -e "returning BUSY to initiator" /var/log/messages | wc -l >> 221834 >> # grep -e "Sending QUEUE_FULL status to initiator" /var/log/messages | wc -l >> 167948 >> >> Maybe the initiator has stopped due to its "busy" tracking... deciding >> the target cannot continue? >> >> Is the queue settable to larger than 50 outstanding commands? > > Definitely. The details are explained in the document srpt/README. This is looking good so far; I've set it to 256, and am not seeing any queue-full messages. This may be the work-around. Thanks, Chris > > Bart. 
> From worleys at gmail.com Wed Sep 2 11:52:22 2009 From: worleys at gmail.com (Chris Worley) Date: Wed, 2 Sep 2009 12:52:22 -0600 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4A9D6FB9.1010509@vlnb.net> <4A9E4276.7040108@vlnb.net> Message-ID: On Wed, Sep 2, 2009 at 12:17 PM, Chris Worley wrote: > On Wed, Sep 2, 2009 at 12:05 PM, Bart Van > Assche wrote: >> On Wed, Sep 2, 2009 at 7:27 PM, Chris Worley wrote: >>> >>> Then, the majority of messages were the pair: >>> >>> Sep  2 17:12:22 nameme kernel: [408676.552666] [0]: scst: >>> __scst_init_cmd:3361:Too many pending commands (50) in session, >>> returning BUSY to initiator "0x0002c903000260470002c90300026047" >>> Sep  2 17:12:22 nameme kernel: [408676.552670] [0]: scst: >>> scst_set_busy:366:Sending QUEUE_FULL status to initiator >>> 0x0002c903000260470002c90300026047 (cmds count 79, queue_type 1, >>> sess->init_phase 3) >>> >>> # grep -e "returning BUSY to initiator" /var/log/messages | wc -l >>> 221834 >>> # grep -e "Sending QUEUE_FULL status to initiator" /var/log/messages | wc -l >>> 167948 >>> >>> Maybe the initiator has stopped due to its "busy" tracking... deciding >>> the target cannot continue? >>> >>> Is the queue settable to larger than 50 outstanding commands? >> >> Definitely. The details are explained in the document srpt/README. > > This is looking god so far; I've set it to 256, and am not seeing any > queue-full messages.  This may be the work-around. No such luck; it takes much longer for the hang to occur, but it eventually occurs :( Thanks, Chris > > Thanks, > > Chris >> >> Bart. >> > From bart.vanassche at gmail.com Wed Sep 2 12:31:14 2009 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Wed, 2 Sep 2009 21:31:14 +0200 Subject: [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: Message-ID: On Tue, Sep 1, 2009 at 1:04 AM, Chris Worley wrote: > [ ... 
] > I've found a good kernel/scst mix to easily repeat this; I can get it > to repeatedly hang w/ 8K block transfers running Ubuntu 9.04 w/ the > 2.6.27-14-server kernel on _both_ target and initiator (i.e. no WinOF > or OFED at all) and SCST rev 1062 on the target using one drive > (performance is >600MB/s, >80K IOPS, on the 8KB block sizes being > used). > [ ... ] Is there a special reason why you are using the 2.6.27-14-server kernel ? AFAIK the latest Ubuntu 9.04 kernel is 2.6.28-15-server. Bart. From tziporet at dev.mellanox.co.il Wed Sep 2 12:37:22 2009 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Wed, 02 Sep 2009 22:37:22 +0300 Subject: [ofa-general] Bad interaction between ofed, NFS and Gaussian In-Reply-To: <20090902163914.GA23358@lapinou.lsd.univ-montp2.fr> References: <20090902163914.GA23358@lapinou.lsd.univ-montp2.fr> Message-ID: <4A9EC972.9050106@mellanox.co.il> BOYRIE Fabrice wrote: > Hello > > Hoping I'm in the good mailing list. > I've a problem with ofed 1.4.2 on Centos 5.3. > > > We have a new cluster with QDR infiniband. > I've installed ofed from source using the install.pl script with the > default values. > I've used default kernel from Centos (2.6.18-128.7.1.el5) > When a node starts, openibd and opensmd services are launched. > > > Infiniband is working > > ibv_devinfo > hca_id: mlx4_0 > fw_ver: 2.6.000 > node_guid: 0002:c903:0004:3efc > sys_image_guid: 0002:c903:0004:3eff > vendor_id: 0x02c9 > vendor_part_id: 26428 > hw_ver: 0xA0 > board_id: MT_0C40110009 > phys_port_cnt: 1 > port: 1 > state: PORT_ACTIVE (4) > max_mtu: 2048 (4) > active_mtu: 2048 (4) > sm_lid: 9 > port_lid: 17 > port_lmc: 0x00 > > If I launch MPI program, eg vasp, it works using infiniband transport > and the performance is good. > > So no problem until I want to launch a program not using infiniband: > Gaussian. 
> > With some big calculus and with %ncpu=8, Gaussian abort with the > following message > ntrbks: Input/output error > I launched several times Gaussian and it always aborted at the same > point. > > If I launch the same gaussian on the same input file on our old > cluster (same Centos 5.3, same kernel, but without infiniband), it works. > > Searching the source code for ntrbks shows a call to fstatfs. > > So I've straced Gaussian on the two clusters. Here is the relevant > part. > > New cluster: > > [pid 5715] execve("/opt/Gaussian/g03_e01-pgf//g03/l1002.exe", > ["/opt/Gaussian/g03_e01-pgf//g03/l"..., "1258291200", > "CpRh_H_Ph_EneTS1.chk", "1", "/tmp/CpRh_H_Ph_EneTS1/Gau-5715.i"...,"0", > "/tmp/CpRh_H_Ph_EneTS1/Gau-5715.r"..., "0", > "/tmp/CpRh_H_Ph_EneTS1/Gau-5715.d"..., "0", > "/tmp/CpRh_H_Ph_EneTS1/Gau-5715.s"..., "0", > "/tmp/CpRh_H_Ph_EneTS1/Gau-5714.i"..., "0", "junk.out", "0", ...], > [/* 65 vars */] PANIC: attached pid 5816 exited with 0 > [pid 5715] open("CpRh_H_Ph_EneTS1.chk", O_RDWR) = 5 > [pid 5715] fstatfs(5, {f_type="NFS_SUPER_MAGIC", f_bsize=32768, > f_blocks=13562292, f_bfree=12353558, f_bavail=12353558, > f_files=434124416, f_ffree=434090957, f_fsid={0, 0}, f_namelen=255, > f_frsize =32768}) = 0 > [pid 5715] read(5, "\10\0\0\0\0\0\0\0", 8) = 8 > [pid 5715] read(5, > "\10\0\0\0\0\0\0\0\0\320\10\0\0\0\0\0\0\240\0\0\0\0\0\0\0\0\0\0\0\0 > \0\0"..., 320032) = 320032 > [pid 5715] fstatfs(5, 0x7fff553be6d0) = -1 EIO (Input/output error) > > > Old cluster: > > [pid 8605] execve("/opt/Gaussian/g03_e01-pgf//g03/l1002.exe", > ["/opt/Gaussian/g03_e01-pgf//g03/l"..., "1258291200", > "CpRh_H_Ph_EneTS1.chk", "1", "/tmp/CpRh_H_Ph_EneTS1-8-pgf-2/Ga"..., "0", > "/tmp/CpRh_H_Ph_EneTS1-8-pgf-2/Ga"..., "0", > "/tmp/CpRh_H_Ph_EneTS1-8-pgf-2/Ga"..., "0", > "/tmp/CpRh_H_Ph_EneTS1-8-pgf-2/Ga"..., "0", > "/tmp/CpRh_H_Ph_EneTS1-8-pgf-2/Ga"..., "0", "junk.out", "0", ...], [/* 8 > 2 vars */]PANIC: attached pid 8701 exited with 0 > [pid 8605] open("CpRh_H_Ph_EneTS1.chk", 
O_RDWR) = 5 > > [pid 8605] fstatfs(5, {f_type="NFS_SUPER_MAGIC", f_bsize=32768, > f_blocks=9150944, f_bfree=686850, f_bavail=686850, f_files=88123232, > f_ffree=87917195, f_fsid={0, 0}, f_namelen=255, f_frsize=32768}) = 0 > [pid 8605] read(5, "'\0\0\0\0\0\0\0", 8) = 8 > [pid 8605] read(5, > "'\0\0\0\0\0\0\0\0\360$\0\0\0\0\0\0\240\0\0\0\0\0\0\0\0\0\0\0\0\0 > \0"..., 320032) = 320032 > [pid 8605] fstatfs(5, {f_type="NFS_SUPER_MAGIC", f_bsize=32768, > f_blocks=9150944, f_bfree=683022, f_bavail=683022, f_files=87633232, > f_ffree=87427156, f_fsid={0, 0}, f_namelen=255, f_frsize=32768}) = 0 > [pid 8605] fstatfs(5, {f_type="NFS_SUPER_MAGIC", f_bsize=32768, > f_blocks=9150944, f_bfree=683022, f_bavail=683022, f_files=87633232, > f_ffree=87427156, f_fsid={0, 0}, f_namelen=255, f_frsize=32768}) = 0 > [pid 8605] write(5, > "'\0\0\0\0\0\0\0\0\360$\0\0\0\0\0\0\240\0\0\0\0\0\0\0\0\0\0\0\0\0 > \0"..., > 320032) = 320032 > [pid 8605] close(5) = 0 > > > > > If I put the input file on a local directory instead of a nfs one, > Gaussian works. > There is no messages in dmesg or in /var/log directory on the node or on > the nfs server. > > On the node, /home is mounted as > 192.168.1.100:/home on /home type nfs > (rw,nosuid,rsize=32768,proto=tcp,addr=192.168.1.100) > > 192.168.1.xxx is the ethernet network (the nfs server has not > infiniband card). > On the node, it is enough to do > «ifconfig ib0 down > service opensmd stop > » to have Gaussian working on the nfs directory. > > («ifconfig ib0 down» or «service opensmd stop» alone is not enough) > > > So it seems there is an interaction between nfs access and openfabric. > But why ? And how to solve it ? > > > > It seems issues of NFS/RDMA backports. Can you install OFED without NFS/RDMA? You can change the conf file for this Jon/Steve/Jeff - are you familiar with this issue? 
Tziporet From worleys at gmail.com Wed Sep 2 12:53:02 2009 From: worleys at gmail.com (Chris Worley) Date: Wed, 2 Sep 2009 13:53:02 -0600 Subject: [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: Message-ID: On Wed, Sep 2, 2009 at 1:31 PM, Bart Van Assche wrote: > On Tue, Sep 1, 2009 at 1:04 AM, Chris Worley wrote: >> [ ... ] >> I've found a good kernel/scst mix to easily repeat this; I can get it >> to repeatedly hang w/ 8K block transfers running Ubuntu 9.04 w/ the >> 2.6.27-14-server kernel on _both_ target and initiator (i.e. no WinOF >> or OFED at all) and SCST rev 1062 on the target using one drive >> (performance is >600MB/s, >80K IOPS, on the 8KB block sizes being >> used). >> [ ... ] > > Is there a special reason why you are using the 2.6.27-14-server > kernel ? AFAIK the latest Ubuntu 9.04 kernel is 2.6.28-15-server. No special reason other than it didn't get upgraded w/ the rest of the distro... started w/ 8.10. Do you think that kernel is better? Thanks, Chris > > Bart. > From changquing.tang at hp.com Wed Sep 2 12:53:13 2009 From: changquing.tang at hp.com (Tang, Changqing) Date: Wed, 2 Sep 2009 19:53:13 +0000 Subject: [ofa-general] performance to call ibv_poll_cq() vs. call select() on completion channel Message-ID: <58C6777539C300489D145B0F8E29C32816985D2D4C@GVW0673EXC.americas.hpqcorp.net> Roland or Mellanox Engineers: We setup completion channel for a completion queue. We want to check if there is any event available, and suppose there is NO event on both completion channel and completion queue. What we can do is: 1. call select() on completion channel with zero timeout and return 0. 2. call ibv_poll_cq() directly and return 0. Question: Which way has lower overhead ? We know select() has to switch to kernel mode, does ibv_poll_cq() switch to kernel mode and may cause a context switch ? Thanks. 
--CQ Tang From bart.vanassche at gmail.com Wed Sep 2 13:00:18 2009 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Wed, 2 Sep 2009 22:00:18 +0200 Subject: [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: Message-ID: On Wed, Sep 2, 2009 at 9:53 PM, Chris Worley wrote: > On Wed, Sep 2, 2009 at 1:31 PM, Bart Van Assche wrote: >> On Tue, Sep 1, 2009 at 1:04 AM, Chris Worley wrote: >>> [ ... ] >>> I've found a good kernel/scst mix to easily repeat this; I can get it >>> to repeatedly hang w/ 8K block transfers running Ubuntu 9.04 w/ the >>> 2.6.27-14-server kernel on _both_ target and initiator (i.e. no WinOF >>> or OFED at all) and SCST rev 1062 on the target using one drive >>> (performance is >600MB/s, >80K IOPS, on the 8KB block sizes being >>> used). >>> [ ... ] >> >> Is there a special reason why you are using the 2.6.27-14-server >> kernel ? AFAIK the latest Ubuntu 9.04 kernel is 2.6.28-15-server. > > No special reason other than it didn't get upgraded w/ the rest of the > distro... started w/ 8.10. > > Do you think that kernel is better? I noticed this while trying to reproduce this issue. I have no opinion yet about which of these two kernels is better. I'll downgrade the Ubuntu kernel in my setup. Bart. From worleys at gmail.com Wed Sep 2 13:03:58 2009 From: worleys at gmail.com (Chris Worley) Date: Wed, 2 Sep 2009 14:03:58 -0600 Subject: [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: Message-ID: On Wed, Sep 2, 2009 at 1:53 PM, Chris Worley wrote: > On Wed, Sep 2, 2009 at 1:31 PM, Bart Van Assche wrote: >> On Tue, Sep 1, 2009 at 1:04 AM, Chris Worley wrote: >>> [ ... ] >>> I've found a good kernel/scst mix to easily repeat this; I can get it >>> to repeatedly hang w/ 8K block transfers running Ubuntu 9.04 w/ the >>> 2.6.27-14-server kernel on _both_ target and initiator (i.e. 
no WinOF >>> or OFED at all) and SCST rev 1062 on the target using one drive >>> (performance is >600MB/s, >80K IOPS, on the 8KB block sizes being >>> used). >>> [ ... ] >> >> Is there a special reason why you are using the 2.6.27-14-server >> kernel ? AFAIK the latest Ubuntu 9.04 kernel is 2.6.28-15-server. > > No special reason other than it didn't get upgraded w/ the rest of the > distro... started w/ 8.10. > > Do you think that kernel is better? You are correct... the system is 8.10, not 9.04. Chris > > Thanks, > > Chris >> >> Bart. >> > From tziporet at dev.mellanox.co.il Wed Sep 2 13:10:09 2009 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Wed, 02 Sep 2009 23:10:09 +0300 Subject: [ofa-general] performance to call ibv_poll_cq() vs. call select() on completion channel In-Reply-To: <58C6777539C300489D145B0F8E29C32816985D2D4C@GVW0673EXC.americas.hpqcorp.net> References: <58C6777539C300489D145B0F8E29C32816985D2D4C@GVW0673EXC.americas.hpqcorp.net> Message-ID: <4A9ED121.5020703@mellanox.co.il> Tang, Changqing wrote: > Roland or Mellanox Engineers: > > We setup completion channel for a completion queue. We want to check if there is any > event available, and suppose there is NO event on both completion channel and completion queue. > > What we can do is: > > 1. call select() on completion channel with zero timeout and return 0. > 2. call ibv_poll_cq() directly and return 0. > > Question: > > Which way has lower overhead ? ibv_poll_cq > We know select() has to switch to kernel mode, does ibv_poll_cq() > switch to kernel mode and may cause a context switch ? 
> No - it's pure user space function Tziporet From Jeffrey.C.Becker at nasa.gov Wed Sep 2 14:15:49 2009 From: Jeffrey.C.Becker at nasa.gov (Jeff Becker) Date: Wed, 02 Sep 2009 14:15:49 -0700 Subject: [ofa-general] Bad interaction between ofed, NFS and Gaussian In-Reply-To: <4A9EC972.9050106@mellanox.co.il> References: <20090902163914.GA23358@lapinou.lsd.univ-montp2.fr> <4A9EC972.9050106@mellanox.co.il> Message-ID: <4A9EE085.9050102@nasa.gov> Tziporet Koren wrote: > BOYRIE Fabrice wrote: > >> Hello >> >> Hoping I'm in the good mailing list. >> I've a problem with ofed 1.4.2 on Centos 5.3. >> Salut Fabrice! Does it also happen with OFED 1.5 alpha? Thanks. -jeff >> >> We have a new cluster with QDR infiniband. >> I've installed ofed from source using the install.pl script with the >> default values. >> I've used default kernel from Centos (2.6.18-128.7.1.el5) >> When a node starts, openibd and opensmd services are launched. >> >> >> Infiniband is working >> >> ibv_devinfo >> hca_id: mlx4_0 >> fw_ver: 2.6.000 >> node_guid: 0002:c903:0004:3efc >> sys_image_guid: 0002:c903:0004:3eff >> vendor_id: 0x02c9 >> vendor_part_id: 26428 >> hw_ver: 0xA0 >> board_id: MT_0C40110009 >> phys_port_cnt: 1 >> port: 1 >> state: PORT_ACTIVE (4) >> max_mtu: 2048 (4) >> active_mtu: 2048 (4) >> sm_lid: 9 >> port_lid: 17 >> port_lmc: 0x00 >> >> If I launch MPI program, eg vasp, it works using infiniband transport >> and the performance is good. >> >> So no problem until I want to launch a program not using infiniband: >> Gaussian. >> >> With some big calculus and with %ncpu=8, Gaussian abort with the >> following message >> ntrbks: Input/output error >> I launched several times Gaussian and it always aborted at the same >> point. >> >> If I launch the same gaussian on the same input file on our old >> cluster (same Centos 5.3, same kernel, but without infiniband), it works. >> >> Searching the source code for ntrbks shows a call to fstatfs. >> >> So I've straced Gaussian on the two clusters. 
Here is the relevant >> part. >> >> New cluster: >> >> [pid 5715] execve("/opt/Gaussian/g03_e01-pgf//g03/l1002.exe", >> ["/opt/Gaussian/g03_e01-pgf//g03/l"..., "1258291200", >> "CpRh_H_Ph_EneTS1.chk", "1", "/tmp/CpRh_H_Ph_EneTS1/Gau-5715.i"...,"0", >> "/tmp/CpRh_H_Ph_EneTS1/Gau-5715.r"..., "0", >> "/tmp/CpRh_H_Ph_EneTS1/Gau-5715.d"..., "0", >> "/tmp/CpRh_H_Ph_EneTS1/Gau-5715.s"..., "0", >> "/tmp/CpRh_H_Ph_EneTS1/Gau-5714.i"..., "0", "junk.out", "0", ...], >> [/* 65 vars */] PANIC: attached pid 5816 exited with 0 >> [pid 5715] open("CpRh_H_Ph_EneTS1.chk", O_RDWR) = 5 >> [pid 5715] fstatfs(5, {f_type="NFS_SUPER_MAGIC", f_bsize=32768, >> f_blocks=13562292, f_bfree=12353558, f_bavail=12353558, >> f_files=434124416, f_ffree=434090957, f_fsid={0, 0}, f_namelen=255, >> f_frsize =32768}) = 0 >> [pid 5715] read(5, "\10\0\0\0\0\0\0\0", 8) = 8 >> [pid 5715] read(5, >> "\10\0\0\0\0\0\0\0\0\320\10\0\0\0\0\0\0\240\0\0\0\0\0\0\0\0\0\0\0\0 >> \0\0"..., 320032) = 320032 >> [pid 5715] fstatfs(5, 0x7fff553be6d0) = -1 EIO (Input/output error) >> >> >> Old cluster: >> >> [pid 8605] execve("/opt/Gaussian/g03_e01-pgf//g03/l1002.exe", >> ["/opt/Gaussian/g03_e01-pgf//g03/l"..., "1258291200", >> "CpRh_H_Ph_EneTS1.chk", "1", "/tmp/CpRh_H_Ph_EneTS1-8-pgf-2/Ga"..., "0", >> "/tmp/CpRh_H_Ph_EneTS1-8-pgf-2/Ga"..., "0", >> "/tmp/CpRh_H_Ph_EneTS1-8-pgf-2/Ga"..., "0", >> "/tmp/CpRh_H_Ph_EneTS1-8-pgf-2/Ga"..., "0", >> "/tmp/CpRh_H_Ph_EneTS1-8-pgf-2/Ga"..., "0", "junk.out", "0", ...], [/* 8 >> 2 vars */]PANIC: attached pid 8701 exited with 0 >> [pid 8605] open("CpRh_H_Ph_EneTS1.chk", O_RDWR) = 5 >> >> [pid 8605] fstatfs(5, {f_type="NFS_SUPER_MAGIC", f_bsize=32768, >> f_blocks=9150944, f_bfree=686850, f_bavail=686850, f_files=88123232, >> f_ffree=87917195, f_fsid={0, 0}, f_namelen=255, f_frsize=32768}) = 0 >> [pid 8605] read(5, "'\0\0\0\0\0\0\0", 8) = 8 >> [pid 8605] read(5, >> "'\0\0\0\0\0\0\0\0\360$\0\0\0\0\0\0\240\0\0\0\0\0\0\0\0\0\0\0\0\0 >> \0"..., 320032) = 320032 >> [pid 8605] fstatfs(5, 
{f_type="NFS_SUPER_MAGIC", f_bsize=32768, >> f_blocks=9150944, f_bfree=683022, f_bavail=683022, f_files=87633232, >> f_ffree=87427156, f_fsid={0, 0}, f_namelen=255, f_frsize=32768}) = 0 >> [pid 8605] fstatfs(5, {f_type="NFS_SUPER_MAGIC", f_bsize=32768, >> f_blocks=9150944, f_bfree=683022, f_bavail=683022, f_files=87633232, >> f_ffree=87427156, f_fsid={0, 0}, f_namelen=255, f_frsize=32768}) = 0 >> [pid 8605] write(5, >> "'\0\0\0\0\0\0\0\0\360$\0\0\0\0\0\0\240\0\0\0\0\0\0\0\0\0\0\0\0\0 >> \0"..., >> 320032) = 320032 >> [pid 8605] close(5) = 0 >> >> >> >> >> If I put the input file on a local directory instead of a nfs one, >> Gaussian works. >> There is no messages in dmesg or in /var/log directory on the node or on >> the nfs server. >> >> On the node, /home is mounted as >> 192.168.1.100:/home on /home type nfs >> (rw,nosuid,rsize=32768,proto=tcp,addr=192.168.1.100) >> >> 192.168.1.xxx is the ethernet network (the nfs server has not >> infiniband card). >> On the node, it is enough to do >> «ifconfig ib0 down >> service opensmd stop >> » to have Gaussian working on the nfs directory. >> >> («ifconfig ib0 down» or «service opensmd stop» alone is not enough) >> >> >> So it seems there is an interaction between nfs access and openfabric. >> But why ? And how to solve it ? >> >> >> >> >> > It seems issues of NFS/RDMA backports. > Can you install OFED without NFS/RDMA? > You can change the conf file for this > > Jon/Steve/Jeff - are you familiar with this issue? > > Tziporet > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > From changquing.tang at hp.com Wed Sep 2 15:17:37 2009 From: changquing.tang at hp.com (Tang, Changqing) Date: Wed, 2 Sep 2009 22:17:37 +0000 Subject: [ofa-general] performance to call ibv_poll_cq() vs. 
call select() on completion channel In-Reply-To: <4A9ED121.5020703@mellanox.co.il> References: <58C6777539C300489D145B0F8E29C32816985D2D4C@GVW0673EXC.americas.hpqcorp.net> <4A9ED121.5020703@mellanox.co.il> Message-ID: <58C6777539C300489D145B0F8E29C328169862C50D@GVW0673EXC.americas.hpqcorp.net> > > > > 1. call select() on completion channel with zero > timeout and return 0. > > 2. call ibv_poll_cq() directly and return 0. > > > > Question: > > > > Which way has lower overhead ? > ibv_poll_cq > > We know select() has to switch to kernel mode, does ibv_poll_cq() > > switch to kernel mode and may cause a context switch ? > > > No - it's pure user space function But I just checked the source code, ibv_poll_cq() is actually ibv_cmd_poll_cq(), and ibv_cmd_poll_cq() calls write() system call on the IB device. Doesn't this write() system call switch to kernel mode and possibly cause a context switch ? --CQ > > Tziporet From sean.hefty at intel.com Wed Sep 2 15:25:08 2009 From: sean.hefty at intel.com (Sean Hefty) Date: Wed, 2 Sep 2009 15:25:08 -0700 Subject: [ofa-general] performance to call ibv_poll_cq() vs. call select() on completion channel In-Reply-To: <58C6777539C300489D145B0F8E29C328169862C50D@GVW0673EXC.americas.hpqcorp.net> References: <58C6777539C300489D145B0F8E29C32816985D2D4C@GVW0673EXC.americas.hpqcorp.net> <4A9ED121.5020703@mellanox.co.il> <58C6777539C300489D145B0F8E29C328169862C50D@GVW0673EXC.americas.hpqcorp.net> Message-ID: >But I just checked the source code, ibv_poll_cq() is actually ibv_cmd_poll_cq(), >and ibv_cmd_poll_cq() calls write() system call on the IB device. > >Doesn't this write() system call switch to kernel mode and possibly cause >a context switch ? See verbs.h: static inline int ibv_poll_cq(struct ibv_cq *cq, int num_entries, struct ibv_wc *wc) { return cq->context->ops.poll_cq(cq, num_entries, wc); } The userspace provider library sets poll_cq to an internal call.
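To make the dispatch through ops.poll_cq concrete, here is a minimal, self-contained sketch. It is not libibverbs code: every mock_* name below is hypothetical and only mirrors the shape of the verbs.h wrapper quoted above, and the ring buffer merely stands in for a completion queue that a provider library has mapped into user memory. The point it illustrates is that the inline wrapper calls whatever function the provider installed, and that function can answer "no completions" by reading memory, with no system call.

```c
#include <assert.h>

/* All mock_* names are hypothetical stand-ins, not the real verbs API. */
struct mock_wc { unsigned long wr_id; };
struct mock_cq;

struct mock_context_ops {
    int (*poll_cq)(struct mock_cq *cq, int num_entries, struct mock_wc *wc);
};

struct mock_context { struct mock_context_ops ops; };

#define MOCK_CQ_DEPTH 8

struct mock_cq {
    struct mock_context *context;
    struct mock_wc ring[MOCK_CQ_DEPTH]; /* stands in for the user-mapped CQ buffer */
    unsigned head;                      /* next entry the consumer reads */
    unsigned tail;                      /* next entry the "hardware" writes */
};

/* Provider-side poll: copies completions straight out of memory.
 * Nothing here enters the kernel. */
static int mock_provider_poll_cq(struct mock_cq *cq, int num_entries,
                                 struct mock_wc *wc)
{
    int n = 0;
    while (n < num_entries && cq->head != cq->tail) {
        wc[n++] = cq->ring[cq->head % MOCK_CQ_DEPTH];
        cq->head++;
    }
    return n; /* 0 means the queue was empty */
}

/* Same shape as the inline wrapper quoted from verbs.h: pure dispatch. */
static inline int mock_poll_cq(struct mock_cq *cq, int num_entries,
                               struct mock_wc *wc)
{
    return cq->context->ops.poll_cq(cq, num_entries, wc);
}

/* Simulates the device writing one completion into the ring. */
static void mock_hw_complete(struct mock_cq *cq, unsigned long wr_id)
{
    assert(cq->tail - cq->head < MOCK_CQ_DEPTH); /* ring must not overflow */
    cq->ring[cq->tail % MOCK_CQ_DEPTH].wr_id = wr_id;
    cq->tail++;
}
```

A caller would fill a mock_context with ops.poll_cq = mock_provider_poll_cq, point a zero-initialised mock_cq at it, and call mock_poll_cq(); on an empty queue the call returns 0 after comparing two integers. How a real provider lays out its queue is device specific and is not modelled here.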
- Sean From changquing.tang at hp.com Wed Sep 2 16:00:11 2009 From: changquing.tang at hp.com (Tang, Changqing) Date: Wed, 2 Sep 2009 23:00:11 +0000 Subject: [ofa-general] performance to call ibv_poll_cq() vs. call select() on completion channel In-Reply-To: References: <58C6777539C300489D145B0F8E29C32816985D2D4C@GVW0673EXC.americas.hpqcorp.net> <4A9ED121.5020703@mellanox.co.il> <58C6777539C300489D145B0F8E29C328169862C50D@GVW0673EXC.americas.hpqcorp.net> Message-ID: <58C6777539C300489D145B0F8E29C328169862C54D@GVW0673EXC.americas.hpqcorp.net> Sean: I understand that ops.poll_cq is actually ibv_cmd_poll_cq(), right ? Do you mean during ibv_poll_cq() call, there is no system call involved ? --CQ > -----Original Message----- > From: Sean Hefty [mailto:sean.hefty at intel.com] > Sent: Wednesday, September 02, 2009 5:25 PM > To: Tang, Changqing; tziporet at dev.mellanox.co.il > Cc: general at lists.openfabrics.org > Subject: RE: [ofa-general] performance to call ibv_poll_cq() > vs. call select() on completion channel > > >But I just check the source code, ibv_poll_cq() is actually > >ibv_cmd_poll_cq(), and ibv_cmd_poll_cq() calls write() > system call on the IB device. > > > >Doesn't this write() system call switch to kernel mode and > possiblely > >casuse a context switch ? > > See verbs.h: > > static inline int ibv_poll_cq(struct ibv_cq *cq, int > num_entries, struct ibv_wc > *wc) > { > return cq->context->ops.poll_cq(cq, num_entries, wc); } > > The userspace provider library sets poll_cq to an internal call. > > - Sean > > From rdreier at cisco.com Wed Sep 2 16:08:39 2009 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 02 Sep 2009 16:08:39 -0700 Subject: [ofa-general] performance to call ibv_poll_cq() vs. 
call select() on completion channel In-Reply-To: <58C6777539C300489D145B0F8E29C328169862C54D@GVW0673EXC.americas.hpqcorp.net> (Changqing Tang's message of "Wed, 2 Sep 2009 23:00:11 +0000") References: <58C6777539C300489D145B0F8E29C32816985D2D4C@GVW0673EXC.americas.hpqcorp.net> <4A9ED121.5020703@mellanox.co.il> <58C6777539C300489D145B0F8E29C328169862C50D@GVW0673EXC.americas.hpqcorp.net> <58C6777539C300489D145B0F8E29C328169862C54D@GVW0673EXC.americas.hpqcorp.net> Message-ID: > I understand that ops.poll_cq is actually ibv_cmd_poll_cq(), right ? No, not for most devices. Look at libmthca, etc to see what the poll_cq method is set to. > Do you mean during ibv_poll_cq() call, there is no system call involved ? Right, for most devices poll_cq can be done just by looking at memory in userspace without involving the kernel at all. From rdreier at cisco.com Wed Sep 2 16:12:24 2009 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 02 Sep 2009 16:12:24 -0700 Subject: [ofa-general] InfiniBand/RDMA merge plans for 2.6.32 Message-ID: Since 2.6.31-rc8 has been out more than a week already, it's probably a good time to talk about 2.6.32 merge plans. All the pending things that I'm aware of are listed below. Boilerplate: If something isn't already in my tree and it isn't listed below, I probably missed it or dropped it unintentionally. Please remind me. As usual, when submitting a patch: - Give a good changelog that explains what issue your patch addresses, how you address the issue, how serious the issue is, and any other information that would be useful to someone evaluating your patch or reading it years from now. - Please make sure that you include a "Signed-off-by:" line, and put any extra junk that should not go into the final kernel log *after* the "---" line so that git tools strip it off automatically. Make the subject line be appropriate for inclusion in the kernel log as well once the leading "[PATCH ...]" stuff is stripped off. 
I waste a lot of time fixing patches by hand that could otherwise be spent doing something productive like watching youtube. - Run your patch through checkpatch.pl so I don't have to nag you to fix trivial issues (or spend time fixing them myself). - Read your patch over so I don't see a memory leak or deadlock as soon as I look at it. - Build your patch with sparse checking ("C=2 CF=-D__CHECK_ENDIAN__") and make sure it doesn't introduce new warnings. (A big bonus in goodwill for sending patches that fix old warnings) - Test your patch on a kernel with things like slab debugging and lockdep turned on. And while you're waiting for me to get to your patch, I sure wouldn't mind if you read and commented on someone else's patch. None of this means you shouldn't remind me about pending patches, since I often lose track of things and drop them accidentally. Core: - Userspace MMU notifiers ("ummunotify") -- my code looks great to me, seems to have passed review on linux-kernel, and has spent time in -mm and -next. Jeff Squyres has put together a proof-of-concept implementation of Open MPI support, so I think all things are go for asking that this be pulled into 2.6.32. I'll send Linus a separate pull request for this once the merge window opens. - Make queue sizes tunable in the MAD module. - Various fixes and cleanups, including a fix for a lockdep issue in the MAD module. This fix may evolve if the core kernel API grows __cancel_delayed_work() to help with this. ULPs: - For IPoIB, I merged Jason's patch to check multicast addresses, and further fixes in this area may be forthcoming. HW specific: - A couple of minor amso1100 cleanups/fixes. - A bunch of fixes to the cxgb3 and nes iWARP drivers. - A few ehca fixes. - Some mlx4 and mthca fixes, including using the PCI device name in interrupt names so multiple devices can be managed. 
Here are a few topics that I believe will not be ready in time for the 2.6.32 window and will need to wait for 2.6.33 at least: - Jack's XRC patch set. I still need time to work through and clean up the code. (I actually did make some progress on this during this cycle -- if you look closely you can see my xrc branch has changed -- but not enough to actually finish unfortunately) Here all the patches I already have in my for-next branch: Alexander Schmidt (1): IB/ehca: Make port autodetect mode the default Arputham Benjamin (2): mlx4_core: Distinguish multiple devices in /proc/interrupts IB/mthca: Distinguish multiple devices in /proc/interrupts Chien Tung (1): RDMA/nes: Map MTU to IB_MTU_* and correctly report link state Don Wood (10): RDMA/nes: Update refcnt during disconnect RDMA/nes: Allocate work item for disconnect event handling RDMA/nes: Change memory allocation for cqp request to GFP_ATOMIC RDMA/nes: Clean out CQ completions when QP is destroyed RDMA/nes: Add CQ error handling RDMA/nes: Implement Terminate Packet RDMA/nes: Use flush mechanism to set status for wqe in error RDMA/nes: Make poll_cq return correct number of wqes during flush RDMA/nes: Use the flush code to fill in cqe error RDMA/nes: Rework the disconn routine for terminate and flushing Hal Rosenstock (1): IB/mad: Allow tuning of QP0 and QP1 sizes Jack Morgenstein (3): IB/mthca: Don't allow userspace open while recovering from catastrophic error IB/mlx4: Don't allow userspace open while recovering from catastrophic error IB/uverbs: Return ENOSYS for unimplemented commands (not EINVAL) Jason Gunthorpe (1): IPoIB: Check multicast address format Joachim Fenkes (2): IB/ehca: Construct MAD redirect replies from request MAD IB/ehca: Fix CQE flags reporting Marcin Slusarz (1): IB: Use printk_once() for driver versions Roel Kluin (2): IB/ipath: strncpy() doesn't always NUL-terminate RDMA/amso1100: Check kmalloc() result in c2_register_device() Roland Dreier (13): IPoIB: Remove unused includes IB: Use 
DEFINE_SPINLOCK() for static spinlocks mlx4_core: Use pci_request_regions() mlx4_core: Remove unnecessary includes of IB/mthca: Remove unnecessary include of IB/mthca: Remove unnecessary include of ummunotify: Userspace support for MMU notifications IB/mad: Check hop count field in directed route MAD to avoid array overflow IPoIB: Drop priv->lock before calling ipoib_send() IB/mlx4: Annotate CQ locking IB/mthca: Annotate CQ locking IB/mad: Fix possible lock-lock-timer deadlock mlx4_core: Allocate and map sufficient ICM memory for EQ context Steve Wise (5): RDMA/cxgb3: iwch_unregister_device leaks memory RDMA/cxgb3: Set the appropriate IO channel in rdma_init work requests RDMA/cxgb3: Handle port events properly RDMA/cxgb3: Don't free endpoints early RDMA/cxgb3: Wake up any waiters on peer close/abort Tobias Klauser (1): RDMA/amso1100: Use %pM conversion specifier Yevgeny Petrilin (1): mlx4_core: Avoid double free_icms Yossi Etigin (1): IB/core: Fix send multicast group leave retry From changquing.tang at hp.com Wed Sep 2 16:13:20 2009 From: changquing.tang at hp.com (Tang, Changqing) Date: Wed, 2 Sep 2009 23:13:20 +0000 Subject: [ofa-general] performance to call ibv_poll_cq() vs. call select() on completion channel In-Reply-To: References: <58C6777539C300489D145B0F8E29C32816985D2D4C@GVW0673EXC.americas.hpqcorp.net> <4A9ED121.5020703@mellanox.co.il> <58C6777539C300489D145B0F8E29C328169862C50D@GVW0673EXC.americas.hpqcorp.net> <58C6777539C300489D145B0F8E29C328169862C54D@GVW0673EXC.americas.hpqcorp.net> Message-ID: <58C6777539C300489D145B0F8E29C328169862C557@GVW0673EXC.americas.hpqcorp.net> I did not understand the relation between ops.poll_cq() and ibv_cmd_poll_cq() correctly. It is clear now. Thank you. 
--CQ > -----Original Message----- > From: Roland Dreier [mailto:rdreier at cisco.com] > Sent: Wednesday, September 02, 2009 6:09 PM > To: Tang, Changqing > Cc: Sean Hefty; tziporet at dev.mellanox.co.il; > general at lists.openfabrics.org > Subject: Re: [ofa-general] performance to call ibv_poll_cq() > vs. call select() on completion channel > > > > I understand that ops.poll_cq is actually > ibv_cmd_poll_cq(), right ? > > No, not for most devices. Look at libmthca, etc to see what > the poll_cq method is set to. > > > Do you mean during ibv_poll_cq() call, there is no > system call involved ? > > Right, for most devices poll_cq can be done just by looking > at memory in userspace without involving the kernel at all. > From worleys at gmail.com Wed Sep 2 21:08:25 2009 From: worleys at gmail.com (Chris Worley) Date: Wed, 2 Sep 2009 22:08:25 -0600 Subject: [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: Message-ID: On Wed, Sep 2, 2009 at 2:58 PM, Chris Worley wrote: > On Wed, Sep 2, 2009 at 2:00 PM, Bart Van Assche wrote: >> On Wed, Sep 2, 2009 at 9:53 PM, Chris Worley wrote: >>> On Wed, Sep 2, 2009 at 1:31 PM, Bart Van Assche wrote: >>>> On Tue, Sep 1, 2009 at 1:04 AM, Chris Worley wrote: >>>>> [ ... ] >>>>> I've found a good kernel/scst mix to easily repeat this; I can get it >>>>> to repeatedly hang w/ 8K block transfers running Ubuntu 9.04 w/ the >>>>> 2.6.27-14-server kernel on _both_ target and initiator (i.e. no WinOF >>>>> or OFED at all) and SCST rev 1062 on the target using one drive >>>>> (performance is >600MB/s, >80K IOPS, on the 8KB block sizes being >>>>> used). >>>>> [ ... ] >>>> >>>> Is there a special reason why you are using the 2.6.27-14-server >>>> kernel ? AFAIK the latest Ubuntu 9.04 kernel is 2.6.28-15-server. >>> >>> No special reason other than it didn't get upgraded w/ the rest of the >>> distro... started w/ 8.10. > > I'm upgrading too, to 9.04. 
I tried the 2.6.28-15-server kernel (along w/ the 9.04 upgrade), and it does repeat the issue. In trying to build a kernel w/ lockdep support as Vlad requested, my lack of Debian knowledge shone through, and, although I believe I followed all the instructions correctly, I'm not sure if I have a 2.6.28-15 or 2.6.28-10 kernel. Anyway, the issue is still repeatable. Whatever kernel that is, I have SRP hung currently. What should I look for in /proc/lockd*? I don't think it's a kernel lock... I think it's a protocol lock, as I can rmmod the target kernel modules (scst_vdisk, scst, and ib_srpt) when the initiator gets in this state. Thanks, Chris > > Chris >>> >>> Do you think that kernel is better? >> >> I noticed this while trying to reproduce this issue. I have no opinion >> yet about which of these two kernels is better. I'll downgrade the >> Ubuntu kernel in my setup. >> >> Bart. >> > From bart.vanassche at gmail.com Wed Sep 2 22:59:21 2009 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Thu, 3 Sep 2009 07:59:21 +0200 Subject: [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: Message-ID: On Thu, Sep 3, 2009 at 6:08 AM, Chris Worley wrote: > In trying to build a kernel w/ lockdep support as Vlad requested, my > lack of Debian knowledge shone through, and, although I believe I > followed all the instructions correctly, I'm not sure if I have a > 2.6.28-15 or 2.6.28-10 kernel.  Anyway, the issue is still repeatable. > > Whatever kernel that is, I have SRP hung currently.  What should I > look for in /proc/lockd*? Lockdep sends its error messages to the kernel ring buffer (dmesg / /var/log/messages). An example of an error message generated by lockdep can be found here: http://lkml.org/lkml/2007/5/22/443. Bart. 
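[Editorial sketch, not part of the thread.] As a rough illustration of what to look for in the kernel log: a lockdep report is normally introduced by a banner line such as "possible circular locking dependency detected". The C helper below checks a single log line against a few such banners. The marker list is an assumption drawn from commonly seen lockdep output; it is not exhaustive and the exact wording can vary between kernel versions. The name `is_lockdep_banner` is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Substrings that commonly introduce a lockdep report.
 * Assumption: illustrative only, not exhaustive; wording may
 * differ between kernel versions. */
static const char *const lockdep_markers[] = {
    "possible circular locking dependency detected",
    "possible recursive locking detected",
    "inconsistent lock state",
    "possible irq lock inversion dependency detected",
};

/* Return 1 if the log line contains one of the markers, else 0. */
int is_lockdep_banner(const char *line)
{
    size_t i;
    size_t n = sizeof(lockdep_markers) / sizeof(lockdep_markers[0]);

    for (i = 0; i < n; i++) {
        if (strstr(line, lockdep_markers[i]) != NULL)
            return 1;
    }
    return 0;
}
```

One way to use it would be to read the output of dmesg (or /var/log/messages) line by line and print every line for which the helper returns 1, together with the lines that follow it.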
From amirv at mellanox.co.il Wed Sep 2 23:33:17 2009 From: amirv at mellanox.co.il (Amir Vadai) Date: Thu, 03 Sep 2009 09:33:17 +0300 Subject: [ofa-general] Re: SDP connection to local host In-Reply-To: References: Message-ID: <4A9F632D.5050100@mellanox.co.il> Hi, You can't connect to localhost/127.0.0.1/0.0.0.0/lo interface with SDP. These ip's are not of IB interfaces so SDP (actually CM) is having trouble connecting to it. If you would have used libsdp instead of changing your source code - you could configure it to use TCP to these addresses instead of SDP - and use SDP to external IP's. - Amir On 09/02/2009 07:23 PM, Chris Tilt wrote: > Hello, > > I was reading the bug > https://bugs.openfabrics.org/show_bug.cgi?id=1279 '"already connected > successful" very slow' and I was hoping to understand my own issue. I > am porting an application to use SDP and it is getting a timeout when > trying to connect to a port on local host. Perror printed "Connection > timed out". The application is known to work with AF_INET and the only > change I made in the program was to make it AF_INET_SDP, which I > confirmed is defined as 27 on my linux system. Any hints would be great. > > Cheers, Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: From amirv at mellanox.co.il Wed Sep 2 23:39:17 2009 From: amirv at mellanox.co.il (Amir Vadai) Date: Thu, 03 Sep 2009 09:39:17 +0300 Subject: [ofa-general] Connection timeout on localhost (using libsdp) In-Reply-To: References: Message-ID: <4A9F6495.3000305@mellanox.co.il> Hi, I noticed this mail after the private mail... As I said before - you should run epmd using LD_PRELOAD and make sure that in libsdp.conf calls to 127.0.0.1 are done using TCP and not SDP. -- Amir Vadai Software Eng. Mellanox Technologies mailto: amirv at mellanox.co.il Tel +972-3-6259539 On 09/02/2009 07:36 PM, Chris Tilt wrote: > Hello, > > I am very hopeful of getting libsdp working with an existing > application. 
Specifically, I am trying to port Erlang to use SDP for > it’s “distributed Erlang” mechanism. With LD_PRELOAD, this may be very > easy. However, I am having trouble with one of it’s daemon processes (a > port map deamon called “epmd”). Conceptually, it’s a very simple C > program that runs either as a daemon or as a client that connects to the > daemon to do queries. To be sure that I was getting SDP, I changed the > source to use AF_INET_SDP by actually looking up that value in the > include files (it was 27) and substituting that in place of AF_INET. I > started the process as a daemon with debugging on and it reported as > normal “listening on port...”. When I tried running it as a client, it > attempted to connect to the known port on “localhost” and got a > “Connection timeout” as reported by perror(2). > > Ideas? > > Cheers, Chris > > P.S. There are many users of Infiniband that would dearly love to see > distributed Erlang running FAST on their systems, so this would > potentially help a lot of customers! Thanks for your help. 
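[Editorial sketch, not part of the thread.] For a program that hard-codes the address family instead of relying on libsdp, the advice above (TCP for local addresses, SDP for external ones) could be approximated in the application itself. The C sketch below is only an illustration under stated assumptions: the value 27 for AF_INET_SDP is the one reported in this thread and should be checked against the headers of the target system, and `choose_family` and `AF_INET_SDP_ASSUMED` are hypothetical names. It is not how libsdp makes this decision.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Assumption: 27 is the AF_INET_SDP value reported in the thread;
 * verify it against the headers on the target system. */
#define AF_INET_SDP_ASSUMED 27

/* Pick an address family for an IPv4 destination in dotted-quad form.
 * Loopback (127.0.0.0/8) and 0.0.0.0 stay on TCP (AF_INET);
 * every other address is sent over SDP.
 * Returns -1 if the string is not a valid IPv4 address. */
int choose_family(const char *ipv4)
{
    struct in_addr addr;
    unsigned long host;

    if (inet_pton(AF_INET, ipv4, &addr) != 1)
        return -1;

    host = ntohl(addr.s_addr);            /* host byte order */
    if ((host >> 24) == 127 || host == 0)
        return AF_INET;
    return AF_INET_SDP_ASSUMED;
}
```

The returned value would be passed as the first argument to socket(). A host name such as "localhost" has to be resolved to an address first, since the helper only accepts numeric IPv4 strings.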
> > > ------------------------------------------------------------------------ > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From Fabrice.Boyrie at univ-montp2.fr Thu Sep 3 00:42:23 2009 From: Fabrice.Boyrie at univ-montp2.fr (BOYRIE Fabrice) Date: Thu, 3 Sep 2009 09:42:23 +0200 Subject: [ofa-general] Bad interaction between ofed, NFS and Gaussian In-Reply-To: <4A9EE085.9050102@nasa.gov> References: <20090902163914.GA23358@lapinou.lsd.univ-montp2.fr> <4A9EC972.9050106@mellanox.co.il> <4A9EE085.9050102@nasa.gov> Message-ID: <20090903074223.GD15063@lapinou.lsd.univ-montp2.fr> On Wed, Sep 02, 2009 at 02:15:49PM -0700, Jeff Becker wrote: > Tziporet Koren wrote: > > BOYRIE Fabrice wrote: > > > >> Hello > >> > >> Hoping I'm in the good mailing list. > >> I've a problem with ofed 1.4.2 on Centos 5.3. > >> > > Salut Fabrice! > > Does it also happen with OFED 1.5 alpha? Thanks. Hello OK, I tried with OFED-1.5-20090902-0600.tgz, Gaussian works. The problem semmes solved. I'd prefer to avoid alpha software on our cluster, but if there is no other possibility. Fabrice BOYRIE From Fabrice.Boyrie at univ-montp2.fr Thu Sep 3 00:51:18 2009 From: Fabrice.Boyrie at univ-montp2.fr (BOYRIE Fabrice) Date: Thu, 3 Sep 2009 09:51:18 +0200 Subject: [ofa-general] Bad interaction between ofed, NFS and Gaussian In-Reply-To: <4A9EC972.9050106@mellanox.co.il> References: <20090902163914.GA23358@lapinou.lsd.univ-montp2.fr> <4A9EC972.9050106@mellanox.co.il> Message-ID: <20090903075118.GE15063@lapinou.lsd.univ-montp2.fr> On Wed, Sep 02, 2009 at 10:37:22PM +0300, Tziporet Koren wrote: > It seems issues of NFS/RDMA backports. > Can you install OFED without NFS/RDMA? > You can change the conf file for this I've tried the following configuration and rebooted the node. The bug is always here. 
Did I forgot to disable something ? Fabrice BOYRIE cat ofed.conf kernel-ib=y core=y mthca=y mlx4=y mlx4_en=y cxgb3=y nes=y ipath=y ipoib=y sdp=y srp=y srpt=y rds=y qlgc_vnic=y kernel-ib-devel=n ib-bonding=n ib-bonding-debuginfo=n libibverbs=y libibverbs-devel=y libibverbs-devel-static=y libibverbs-utils=y libibverbs-debuginfo=y libmthca=y libmthca-devel-static=y libmthca-debuginfo=y libmlx4=y libmlx4-devel=y libmlx4-debuginfo=n libcxgb3=n libcxgb3-devel=n libcxgb3-debuginfo=n libnes=y libnes-devel-static=y libnes-debuginfo=y libipathverbs=y libipathverbs-devel=y libipathverbs-debuginfo=n libibcm=y libibcm-devel=y libibcm-debuginfo=n libibcommon=y libibcommon-devel=y libibcommon-static=y libibcommon-debuginfo=n libibumad=y libibumad-devel=y libibumad-static=y libibumad-debuginfo=n libibmad=y libibmad-devel=y libibmad-static=y libibmad-debuginfo=y ibsim=y ibsim-debuginfo=n librdmacm=n librdmacm-utils=y librdmacm-devel=y librdmacm-debuginfo=n libsdp=y libsdp-devel=y libsdp-debuginfo=y opensm=y opensm-libs=y opensm-devel=y opensm-debuginfo=n opensm-static=y compat-dapl=y compat-dapl-devel=y dapl=y dapl-devel=y dapl-devel-static=y dapl-utils=y dapl-debuginfo=n perftest=y mstflint=y tvflash=y qlvnictools=y sdpnetstat=y srptools=y rds-tools=y rnfs-utils=n ibutils=y infiniband-diags=y qperf=y qperf-debuginfo=y ofed-docs=y ofed-scripts=y tgt-generic=y mpi-selector=y mvapich_gcc=y mvapich2_gcc=y openmpi_gcc=y mpitests_mvapich_gcc=y mpitests_mvapich2_gcc=y mpitests_openmpi_gcc=y build32=0 prefix=/usr mvapich2_conf_impl=ofa mvapich2_conf_romio=1 mvapich2_conf_shared_libs=1 mvapich2_conf_ckpt=0 mvapich2_conf_vcluster=small From vlad at lists.openfabrics.org Thu Sep 3 03:06:49 2009 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Thu, 3 Sep 2009 03:06:49 -0700 (PDT) Subject: [ofa-general] ofa_1_5_kernel 20090903-0200 daily build status Message-ID: <20090903100649.72079E61DE1@openfabrics.org> This email was generated automatically, please do not reply 
git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git git_branch: ofed_kernel_1_5 Common build parameters: Passed: Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.26 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.27 Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18-128.el5 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.27 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.25 Passed on ia64 with linux-2.6.26 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Failed: Build failed on x86_64 with linux-2.6.16.60-0.21-smp Log: \ -I/home/vlad/kernel.org/x86_64/linux-2.6.16.60-0.21-smp/arch//include \ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -Werror-implicit-function-declaration -fno-strict-aliasing -fno-common -ffreestanding -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(cong)" -D"KBUILD_MODNAME=KBUILD_STR(rds)" -c -o /home/vlad/tmp/ofa_1_5_kernel-20090903-0200_linux-2.6.16.60-0.21-smp_x86_64_check/net/rds/.tmp_cong.o /home/vlad/tmp/ofa_1_5_kernel-20090903-0200_linux-2.6.16.60-0.21-smp_x86_64_check/net/rds/cong.c 
/home/vlad/tmp/ofa_1_5_kernel-20090903-0200_linux-2.6.16.60-0.21-smp_x86_64_check/net/rds/cong.c:36:35: error: asm-generic/bitops/le.h: No such file or directory make[3]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090903-0200_linux-2.6.16.60-0.21-smp_x86_64_check/net/rds/cong.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090903-0200_linux-2.6.16.60-0.21-smp_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_5_kernel-20090903-0200_linux-2.6.16.60-0.21-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.60-0.21-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-67.ELsmp Log: from /home/vlad/tmp/ofa_1_5_kernel-20090903-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/rds.h:4, from /home/vlad/tmp/ofa_1_5_kernel-20090903-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/cong.c:38: include/linux/skbuff.h: In function 'skb_add_data': include/linux/skbuff.h:1041: warning: pointer targets in passing argument 2 of 'csum_partial_copy_from_user' differ in signedness make[3]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090903-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/cong.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090903-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_5_kernel-20090903-0200_linux-2.6.9-67.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-67.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-78.ELsmp Log: from /home/vlad/tmp/ofa_1_5_kernel-20090903-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/rds.h:4, from /home/vlad/tmp/ofa_1_5_kernel-20090903-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/cong.c:38: include/linux/skbuff.h: In function 'skb_add_data': include/linux/skbuff.h:1041: 
warning: pointer targets in passing argument 2 of 'csum_partial_copy_from_user' differ in signedness make[3]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090903-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/cong.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090903-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_5_kernel-20090903-0200_linux-2.6.9-78.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-78.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- From Brian.Murrell at Sun.COM Thu Sep 3 03:46:12 2009 From: Brian.Murrell at Sun.COM (Brian J. Murrell) Date: Thu, 03 Sep 2009 06:46:12 -0400 Subject: [ofa-general] Bad interaction between ofed, NFS and Gaussian In-Reply-To: <4A9EE085.9050102@nasa.gov> References: <20090902163914.GA23358@lapinou.lsd.univ-montp2.fr> <4A9EC972.9050106@mellanox.co.il> <4A9EE085.9050102@nasa.gov> Message-ID: <1251974772.12806.1393.camel@pc.interlinx.bc.ca> On Wed, 2009-09-02 at 14:15 -0700, Jeff Becker wrote: > > Does it also happen with OFED 1.5 alpha? Thanks. Was bug 1671 landed to 1.5? > Tziporet Koren wrote: > > > It seems issues of NFS/RDMA backports. > > Can you install OFED without NFS/RDMA? But in 1.4.x, simply disabling NFS/RDMA does not prevent use of the backport headers that NFS/RDMA brings in. That's why I filed bug 1671. Jon Mason provided me a patch for bug 1671 which I backported to 1.4.x and his patch resolved problems I was having because I was getting NFS/RDMA headers even though I had selected to not build NFS/RDMA. Jon: can you attach your patch to bug 1671, just so it's on record? Maybe that patch will solve OP's problems, although I have to admit having jumped into this thread late and not fully understanding it's origins. b. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 197 bytes Desc: This is a digitally signed message part URL: From vst at vlnb.net Thu Sep 3 04:32:21 2009 From: vst at vlnb.net (Vladislav Bolkhovitin) Date: Thu, 03 Sep 2009 15:32:21 +0400 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: Message-ID: <4A9FA945.4070408@vlnb.net> Chris Worley, on 09/03/2009 08:08 AM wrote: > On Wed, Sep 2, 2009 at 2:58 PM, Chris Worley wrote: >> On Wed, Sep 2, 2009 at 2:00 PM, Bart Van Assche wrote: >>> On Wed, Sep 2, 2009 at 9:53 PM, Chris Worley wrote: >>>> On Wed, Sep 2, 2009 at 1:31 PM, Bart Van Assche wrote: >>>>> On Tue, Sep 1, 2009 at 1:04 AM, Chris Worley wrote: >>>>>> [ ... ] >>>>>> I've found a good kernel/scst mix to easily repeat this; I can get it >>>>>> to repeatedly hang w/ 8K block transfers running Ubuntu 9.04 w/ the >>>>>> 2.6.27-14-server kernel on _both_ target and initiator (i.e. no WinOF >>>>>> or OFED at all) and SCST rev 1062 on the target using one drive >>>>>> (performance is >600MB/s, >80K IOPS, on the 8KB block sizes being >>>>>> used). >>>>>> [ ... ] >>>>> Is there a special reason why you are using the 2.6.27-14-server >>>>> kernel ? AFAIK the latest Ubuntu 9.04 kernel is 2.6.28-15-server. >>>> No special reason other than it didn't get upgraded w/ the rest of the >>>> distro... started w/ 8.10. >> I'm upgrading too, to 9.04. > > I tried the 2.6.28-15-server kernel (along w/ the 9.04 upgrade), and > it does repeat the issue. > > In trying to build a kernel w/ lockdep support as Vlad requested, my > lack of Debian knowledge shone through, and, although I believe I > followed all the instructions correctly, I'm not sure if I have a > 2.6.28-15 or 2.6.28-10 kernel. Anyway, the issue is still repeatable. > > Whatever kernel that is, I have SRP hung currently. What should I > look for in /proc/lockd*? > > I don't think it's a kernel lock... 
I think it's a protocol lock, as I > can rmmod the target kernel modules (scst_vdisk, scst, and ib_srpt) > when the initiator gets in this state. Since you can rmmod SCST modules, then this shouldn't be SCST or backstorage SW/HW issue, because that means there are no stuck or lost SCSI commands. So, it should be issue of either SRP target/initiator, or OFED on the target or initiator, or your IB hardware on any node. You should enable lockdep on both target and initiator (better with other kernel debug facilities enabled, see the attached file as a sample) and reproduce the issue. There is a big chance that those facilities will spot what's going on wrong there. Vlad > Thanks, > > Chris >> Chris >>>> Do you think that kernel is better? >>> I noticed this while trying to reproduce this issue. I have no opinion >>> yet about which of these two kernels is better. I'll downgrade the >>> Ubuntu kernel in my setup. >>> >>> Bart. >>> > > ------------------------------------------------------------------------------ > Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day > trial. Simplify your report design, integration and deployment - and focus on > what you do best, core application coding. Discover what's new with > Crystal Reports now. http://p.sf.net/sfu/bobj-july > _______________________________________________ > Scst-devel mailing list > Scst-devel at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/scst-devel > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: kern_dbg.diff Type: text/x-patch Size: 4270 bytes Desc: not available URL: From tziporet at dev.mellanox.co.il Thu Sep 3 05:01:08 2009 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Thu, 03 Sep 2009 15:01:08 +0300 Subject: [ofa-general] InfiniBand/RDMA merge plans for 2.6.32 In-Reply-To: References: Message-ID: <4A9FB004.6020904@mellanox.co.il> Roland Dreier wrote: > Here are a few topics that I believe will not be ready in time for the > 2.6.32 window and will need to wait for 2.6.33 at least: > > - Jack's XRC patch set. I still need time to work through and clean > up the code. (I actually did make some progress on this during > this cycle -- if you look closely you can see my xrc branch has > changed -- but not enough to actually finish unfortunately) > > What about RDMAoE? Patches were sent few weeks ago and it seems you ignore them. Tziporet From hnrose at comcast.net Thu Sep 3 06:00:36 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Thu, 3 Sep 2009 09:00:36 -0400 Subject: [ofa-general] [PATCH] opensm doc: Indicated limited (rather than partial) partition membership Message-ID: <20090903130036.GA18519@comcast.net> Signed-off-by: Hal Rosenstock --- diff --git a/opensm/doc/partition-config.txt b/opensm/doc/partition-config.txt index ead3f76..f855268 100644 --- a/opensm/doc/partition-config.txt +++ b/opensm/doc/partition-config.txt @@ -10,7 +10,7 @@ when partition configuration file does not exist or cannot be accessed. The default partition has P_Key value 0x7fff. OpenSM's port will have full membership in default partition. All other end ports will have -partial membership. +limited membership. 
File Format diff --git a/opensm/man/opensm.8.in b/opensm/man/opensm.8.in index b23a973..5ad7631 100644 --- a/opensm/man/opensm.8.in +++ b/opensm/man/opensm.8.in @@ -1,4 +1,4 @@ -.TH OPENSM 8 "May 28, 2009" "OpenIB" "OpenIB Management" +.TH OPENSM 8 "September 3, 2009" "OpenIB" "OpenIB Management" .SH NAME opensm \- InfiniBand subnet manager and administration (SM/SA) @@ -428,7 +428,7 @@ when partition configuration file does not exist or cannot be accessed. The default partition has P_Key value 0x7fff. OpenSM\'s port will have full membership in default partition. All other end ports will have -partial membership. +limited membership. File Format From hal.rosenstock at gmail.com Thu Sep 3 06:17:40 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Thu, 3 Sep 2009 09:17:40 -0400 Subject: [ofa-general] question about partitioning IB networks In-Reply-To: <6203933669E90E4AB42B5BC4EDE38D350C7D048C32@orsmsx510.amr.corp.intel.com> References: <6203933669E90E4AB42B5BC4EDE38D350C7D048C32@orsmsx510.amr.corp.intel.com> Message-ID: On Mon, Aug 31, 2009 at 3:29 PM, Meyer, Donald J wrote: > I am trying to partition my IB network but I don’t seem to be able to > understand the opensm man page. > > > > First it says “The default partition has P_Key value 0x7fff. OpenSM´s port > will have full membership in default partition. All other end ports will > have partial membership.” but I don’t see the difference defined between > full and partial membership anywhere. Is it possible the reference was to > full and limited membership instead? > Yes, partial == limited. I've just sent a patch to change that word in the man page and doc. > Does this partition have to exist on all CA’s so the SM can “talk” > them? > Yes, this is an IBA requirement. 
> Also it says the default partition will be created “unconditionally even > when partition configuration file does not exist or cannot be accessed.” > Will it also be created if the partition configuration file exists but does > not have a default partition defined? > No. > Second, I see where CA’s can be members of multiple partitions (have > multiple P_keys). If a CA is in multiple partitions (has multiple P_Keys > assigned to it), which partition does it “send” on when the CA has packets > to send if more than one partition can reach the destination CA? > That's up to the application/ULP to set the proper PKey index. The application/ULP needs to ensure the destination is reachable via a common PKey. It does that via some sort of PathRecord request to the SA. > Also do switches (or any non CA’s) have to have P_Keys assigned for any > reason? > Yes, but with OpenSM they do not need configuration. OpenSM detects which switches are leaf switches with peer CA ports and sets up their partition tables appropriately. > > > Just as a sanity check, my interpretation so far is that my network should > have a partition configuration file similar to the following. Can anyone > tell me if I have this correct? In this example configuration, I am trying > to create two partitions. One with rack one and two, the other with rack > three and four: > > > > #Default partition (for SM control of the CA’s) > > Default=0x7fff,ipoib,rate=7:ALL=limited; > Default=0x7fff,ipoib,rate=7:ALL,SELF=full; > #rack1 > > rack1=0x111,ipoib,rate=7,defmember=full:; > > #rack2 > > rack2=0x111,ipoib,rate=7,defmember=full:; > > #rack3 > > rack3=0x112,ipoib,rate=7,defmember=full:; > > #rack4 > > rack4=0x112,ipoib,rate=7,defmember=full:; > I've never done it this way but it does look like the partition create code will detect the duplicated partitions (0x111 and 0x112) and merge ports from rack2 with rack1 and rack4 with rack3. 
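[Editorial sketch, not part of the thread.] The full/limited distinction and the merging of rack1 with rack2 can be pictured at the level of the 16-bit P_Key: the top bit carries the membership type (1 = full, 0 = limited) and the low 15 bits identify the partition. The C helpers below illustrate this; the macro and function names are hypothetical and are not taken from the OpenSM sources.

```c
#include <stdint.h>

#define PKEY_MEMBERSHIP_BIT 0x8000u  /* top bit: 1 = full, 0 = limited */
#define PKEY_BASE_MASK      0x7fffu  /* low 15 bits identify the partition */

/* Partition identifier with the membership bit stripped. */
uint16_t pkey_base(uint16_t pkey)
{
    return (uint16_t)(pkey & PKEY_BASE_MASK);
}

/* Non-zero if the P_Key denotes full membership. */
int pkey_is_full(uint16_t pkey)
{
    return (pkey & PKEY_MEMBERSHIP_BIT) != 0;
}

/* Two P_Keys name the same partition when their low 15 bits match,
 * regardless of membership type. */
int pkey_same_partition(uint16_t a, uint16_t b)
{
    return pkey_base(a) == pkey_base(b);
}
```

Under this reading, the default partition appears as 0xffff on a port with full membership and as 0x7fff on a port with limited membership, and two configuration entries that both use 0x111 describe one partition whatever names they carry.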
-- Hal > > > *Thanks,* > > *Don Meyer* > > *Senior Network/System Engineer/Programmer* > > US+ (253) 371-9532 iNet 8-371-9532 > > **Other names and brands may be claimed as the property of others* > > > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit > http://openib.org/mailman/listinfo/openib-general > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kliteyn at dev.mellanox.co.il Thu Sep 3 06:43:59 2009 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Thu, 03 Sep 2009 16:43:59 +0300 Subject: [ofa-general] question about partitioning IB networks In-Reply-To: References: <6203933669E90E4AB42B5BC4EDE38D350C7D048C32@orsmsx510.amr.corp.intel.com> Message-ID: <4A9FC81F.4040009@dev.mellanox.co.il> Hal Rosenstock wrote: > > > On Mon, Aug 31, 2009 at 3:29 PM, Meyer, Donald J > > wrote: > ... > ... > Just as a sanity check, my interpretation so far is that my network > should have a partition configuration file similar to the following. > Can anyone tell me if I have this correct? In this example > configuration, I am trying to create two partitions. One with rack > one and two, the other with rack three and four: > > > > #Default partition (for SM control of the CA’s) > > Default=0x7fff,ipoib,rate=7:ALL=limited; > > Default=0x7fff,ipoib,rate=7:ALL,SELF=full; > > #rack1 > > rack1=0x111,ipoib,rate=7,defmember=full:; > > #rack2 > > rack2=0x111,ipoib,rate=7,defmember=full:; > > #rack3 > > rack3=0x112,ipoib,rate=7,defmember=full:; > > #rack4 > > rack4=0x112,ipoib,rate=7,defmember=full:; > > I've never done it this way but it does look like the partition create > code will detect the duplicated partitions (0x111 and 0x112) and merge > ports from rack2 with rack1 and rack4 with rack3. It will. Note that partition names are meaningless in terms of IB management. 
Basically they are used just for logging. The only real partition ID is its pkey. -- Yevgeny > -- Hal > > > > > *Thanks,* > > *Don Meyer* > > /Senior Network/System Engineer/Programmer/ > > US+ (253) 371-9532 iNet 8-371-9532 > > /*Other names and brands may be claimed as the property of others/ > > > > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit > http://openib.org/mailman/listinfo/openib-general > > > > ------------------------------------------------------------------------ > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From hal.rosenstock at gmail.com Thu Sep 3 06:46:02 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Thu, 3 Sep 2009 09:46:02 -0400 Subject: [ofa-general] question about partitioning IB networks In-Reply-To: <4A9FC81F.4040009@dev.mellanox.co.il> References: <6203933669E90E4AB42B5BC4EDE38D350C7D048C32@orsmsx510.amr.corp.intel.com> <4A9FC81F.4040009@dev.mellanox.co.il> Message-ID: On Thu, Sep 3, 2009 at 9:43 AM, Yevgeny Kliteynik < kliteyn at dev.mellanox.co.il> wrote: > Hal Rosenstock wrote: > >> >> On Mon, Aug 31, 2009 at 3:29 PM, Meyer, Donald J < >> donald.j.meyer at intel.com > wrote: >> ... >> ... >> Just as a sanity check, my interpretation so far is that my network >> should have a partition configuration file similar to the following. >> Can anyone tell me if I have this correct? In this example >> configuration, I am trying to create two partitions. 
One with rack
>> one and two, the other with rack three and four:
>>
>> #Default partition (for SM control of the CA’s)
>>
>> Default=0x7fff,ipoib,rate=7:ALL=limited;
>>
>> Default=0x7fff,ipoib,rate=7:ALL,SELF=full;
>>
>> #rack1
>>
>> rack1=0x111,ipoib,rate=7,defmember=full:;
>>
>> #rack2
>>
>> rack2=0x111,ipoib,rate=7,defmember=full:;
>>
>> #rack3
>>
>> rack3=0x112,ipoib,rate=7,defmember=full:;
>>
>> #rack4
>>
>> rack4=0x112,ipoib,rate=7,defmember=full:;
>>
>> I've never done it this way but it does look like the partition create
>> code will detect the duplicated partitions (0x111 and 0x112) and merge ports
>> from rack2 with rack1 and rack4 with rack3.
>>
>
> It will.
> Note that partition names are meaningless in terms of IB management.
> Basically they are used just for logging. The only real partition ID
> is its pkey.
>

The low 15 bits of the pkey (that is, without the membership bit) denote the partition.

-- Hal

>
> -- Yevgeny
>
> -- Hal
>>
>> *Thanks,*
>>
>> *Don Meyer*
>>
>> /Senior Network/System Engineer/Programmer/
>>
>> US+ (253) 371-9532 iNet 8-371-9532
>>
>> /*Other names and brands may be claimed as the property of others/
>>
>> _______________________________________________
>> general mailing list
>> general at lists.openfabrics.org
>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>>
>> To unsubscribe, please visit
>> http://openib.org/mailman/listinfo/openib-general
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From swise at opengridcomputing.com Thu Sep 3 07:45:30 2009 From: swise at opengridcomputing.com (Steve Wise) Date: Thu, 03 Sep 2009 09:45:30 -0500 Subject: [ofa-general] Bad interaction between ofed, NFS and Gaussian In-Reply-To: <20090903075118.GE15063@lapinou.lsd.univ-montp2.fr> References: <20090902163914.GA23358@lapinou.lsd.univ-montp2.fr> <4A9EC972.9050106@mellanox.co.il> <20090903075118.GE15063@lapinou.lsd.univ-montp2.fr> Message-ID: <4A9FD68A.9020403@opengridcomputing.com> BOYRIE Fabrice wrote: > On Wed, Sep 02, 2009 at 10:37:22PM +0300, Tziporet Koren wrote: > >> It seems issues of NFS/RDMA backports. >> Can you install OFED without NFS/RDMA? >> You can change the conf file for this >> > > > I've tried the following configuration and rebooted the node. > The bug is always here. Did I forgot to disable something ? > > > I don't see the nfsrdma entry in your ofed.conf file. Was that generated by the 1.4.2 install.pl? > Fabrice BOYRIE > > cat ofed.conf > > > > kernel-ib=y > core=y > mthca=y > mlx4=y > mlx4_en=y > cxgb3=y > nes=y > ipath=y > ipoib=y > sdp=y > srp=y > srpt=y > rds=y > qlgc_vnic=y > kernel-ib-devel=n > ib-bonding=n > ib-bonding-debuginfo=n > libibverbs=y > libibverbs-devel=y > libibverbs-devel-static=y > libibverbs-utils=y > libibverbs-debuginfo=y > libmthca=y > libmthca-devel-static=y > libmthca-debuginfo=y > libmlx4=y > libmlx4-devel=y > libmlx4-debuginfo=n > libcxgb3=n > libcxgb3-devel=n > libcxgb3-debuginfo=n > libnes=y > libnes-devel-static=y > libnes-debuginfo=y > libipathverbs=y > libipathverbs-devel=y > libipathverbs-debuginfo=n > libibcm=y > libibcm-devel=y > libibcm-debuginfo=n > libibcommon=y > libibcommon-devel=y > libibcommon-static=y > libibcommon-debuginfo=n > libibumad=y > libibumad-devel=y > libibumad-static=y > libibumad-debuginfo=n > libibmad=y > libibmad-devel=y > libibmad-static=y > libibmad-debuginfo=y > ibsim=y > ibsim-debuginfo=n > librdmacm=n > librdmacm-utils=y > librdmacm-devel=y > librdmacm-debuginfo=n > libsdp=y 
> libsdp-devel=y > libsdp-debuginfo=y > opensm=y > opensm-libs=y > opensm-devel=y > opensm-debuginfo=n > opensm-static=y > compat-dapl=y > compat-dapl-devel=y > dapl=y > dapl-devel=y > dapl-devel-static=y > dapl-utils=y > dapl-debuginfo=n > perftest=y > mstflint=y > tvflash=y > qlvnictools=y > sdpnetstat=y > srptools=y > rds-tools=y > rnfs-utils=n > ibutils=y > infiniband-diags=y > qperf=y > qperf-debuginfo=y > ofed-docs=y > ofed-scripts=y > tgt-generic=y > mpi-selector=y > mvapich_gcc=y > mvapich2_gcc=y > openmpi_gcc=y > mpitests_mvapich_gcc=y > mpitests_mvapich2_gcc=y > mpitests_openmpi_gcc=y > build32=0 > prefix=/usr > mvapich2_conf_impl=ofa > mvapich2_conf_romio=1 > mvapich2_conf_shared_libs=1 > mvapich2_conf_ckpt=0 > mvapich2_conf_vcluster=small > From Fabrice.Boyrie at univ-montp2.fr Thu Sep 3 08:25:37 2009 From: Fabrice.Boyrie at univ-montp2.fr (BOYRIE Fabrice) Date: Thu, 3 Sep 2009 17:25:37 +0200 Subject: [ofa-general] Bad interaction between ofed, NFS and Gaussian In-Reply-To: <4A9FD68A.9020403@opengridcomputing.com> References: <20090902163914.GA23358@lapinou.lsd.univ-montp2.fr> <4A9EC972.9050106@mellanox.co.il> <20090903075118.GE15063@lapinou.lsd.univ-montp2.fr> <4A9FD68A.9020403@opengridcomputing.com> Message-ID: <20090903152537.GG15063@lapinou.lsd.univ-montp2.fr> On Thu, Sep 03, 2009 at 09:45:30AM -0500, Steve Wise wrote: > > BOYRIE Fabrice wrote: >> On Wed, Sep 02, 2009 at 10:37:22PM +0300, Tziporet Koren wrote: >> >>> It seems issues of NFS/RDMA backports. >>> Can you install OFED without NFS/RDMA? >>> You can change the conf file for this >>> >> >> >> I've tried the following configuration and rebooted the node. The bug >> is always here. Did I forgot to disable something ? >> >> >> > > I don't see the nfsrdma entry in your ofed.conf file. Was that > generated by the 1.4.2 install.pl? Yes The only line with nfs in the ofed.conf is rnfs-utils=n I attach the ofed.conf, to avoid any copy/paste error. 
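To rule out a copy/paste slip, the NFS-related entries can also be pulled out of the file mechanically. This is only a minimal sketch: it assumes ofed.conf is plain key=value lines as shown above, and the function name nfs_related_entries is made up for illustration (it is not part of install.pl or OFED).

```python
def nfs_related_entries(conf_text):
    """Return {key: value} for every key=value line whose key mentions 'nfs'."""
    entries = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if "nfs" in key.lower():
            entries[key.strip()] = value.strip()
    return entries


# A short excerpt of the ofed.conf quoted in this thread.
sample = """kernel-ib=y
core=y
rds=y
rnfs-utils=n
build32=0
"""

if __name__ == "__main__":
    print(nfs_related_entries(sample))
```

If a key such as nfsrdma were present in the file, it would show up in the returned dictionary next to rnfs-utils.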
Fabrice BOYRIE -------------- next part -------------- kernel-ib=y core=y mthca=y mlx4=y mlx4_en=y cxgb3=y nes=y ipath=y ipoib=y sdp=y srp=y srpt=y rds=y qlgc_vnic=y kernel-ib-devel=n ib-bonding=n ib-bonding-debuginfo=n libibverbs=y libibverbs-devel=y libibverbs-devel-static=y libibverbs-utils=y libibverbs-debuginfo=y libmthca=y libmthca-devel-static=y libmthca-debuginfo=y libmlx4=y libmlx4-devel=y libmlx4-debuginfo=n libcxgb3=n libcxgb3-devel=n libcxgb3-debuginfo=n libnes=y libnes-devel-static=y libnes-debuginfo=y libipathverbs=y libipathverbs-devel=y libipathverbs-debuginfo=n libibcm=y libibcm-devel=y libibcm-debuginfo=n libibcommon=y libibcommon-devel=y libibcommon-static=y libibcommon-debuginfo=n libibumad=y libibumad-devel=y libibumad-static=y libibumad-debuginfo=n libibmad=y libibmad-devel=y libibmad-static=y libibmad-debuginfo=y ibsim=y ibsim-debuginfo=n librdmacm=n librdmacm-utils=y librdmacm-devel=y librdmacm-debuginfo=n libsdp=y libsdp-devel=y libsdp-debuginfo=y opensm=y opensm-libs=y opensm-devel=y opensm-debuginfo=n opensm-static=y compat-dapl=y compat-dapl-devel=y dapl=y dapl-devel=y dapl-devel-static=y dapl-utils=y dapl-debuginfo=n perftest=y mstflint=y tvflash=y qlvnictools=y sdpnetstat=y srptools=y rds-tools=y rnfs-utils=n ibutils=y infiniband-diags=y qperf=y qperf-debuginfo=y ofed-docs=y ofed-scripts=y tgt-generic=y mpi-selector=y mvapich_gcc=y mvapich2_gcc=y openmpi_gcc=y mpitests_mvapich_gcc=y mpitests_mvapich2_gcc=y mpitests_openmpi_gcc=y build32=0 prefix=/usr mvapich2_conf_impl=ofa mvapich2_conf_romio=1 mvapich2_conf_shared_libs=1 mvapich2_conf_ckpt=0 mvapich2_conf_vcluster=small From worleys at gmail.com Thu Sep 3 08:35:49 2009 From: worleys at gmail.com (Chris Worley) Date: Thu, 3 Sep 2009 09:35:49 -0600 Subject: [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: Message-ID: On Wed, Sep 2, 2009 at 11:59 PM, Bart Van Assche wrote: > On Thu, Sep 3, 2009 at 6:08 AM, Chris Worley wrote: >> In trying 
to build a kernel w/ lockdep support as Vlad requested, my >> lack of Debian knowledge shone through, and, although I believe I >> followed all the instructions correctly, I'm not sure if I have a >> 2.6.28-15 or 2.6.28-10 kernel.  Anyway, the issue is still repeatable. >> >> Whatever kernel that is, I have SRP hung currently.  What should I >> look for in /proc/lockd*? > > Lockdep sends its error messages to the kernel ring buffer (dmesg / > /var/log/messages). An example of an error message generated by > lockdep can be found here: http://lkml.org/lkml/2007/5/22/443. The fio hang shows no lock issues (all 64 fio processes are in this state): [ 3715.150077] INFO: task fio:27967 blocked for more than 120 seconds. [ 3715.150131] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 3715.150183] fio D ffff8803c183baa0 0 27967 1 [ 3715.150187] ffff8803c183ba38 0000000000000046 ffff8803c450a920 0000000000000001 [ 3715.150191] ffff8803c183b9a8 ffff8803c450a1f0 ffff8803c18ec3e0 ffff8803c450a560 [ 3715.150194] 00000001c450a8e8 0000000000000001 ffff8803c450a560 ffffffff80219c49 [ 3715.150197] Call Trace: [ 3715.150205] [] ? sched_clock+0x9/0x10 [ 3715.150210] [] ? get_request_wait+0xb2/0x1b0 [ 3715.150214] [] ? _spin_unlock_irq+0x2b/0x40 [ 3715.150218] [] io_schedule+0x37/0x50 [ 3715.150220] [] get_request_wait+0xb7/0x1b0 [ 3715.150224] [] ? autoremove_wake_function+0x0/0x40 [ 3715.150226] [] ? elv_merge+0x32/0x1e0 [ 3715.150228] [] __make_request+0x8e/0x4c0 [ 3715.150231] [] ? __lock_acquire+0x33c/0x1250 [ 3715.150234] [] generic_make_request+0x37e/0x4c0 [ 3715.150236] [] ? native_sched_clock+0x20/0x80 [ 3715.150238] [] ? sched_clock+0x9/0x10 [ 3715.150240] [] submit_bio+0x75/0xf0 [ 3715.150243] [] ? bio_set_pages_dirty+0x4c/0x70 [ 3715.150246] [] dio_bio_submit+0x5e/0x90 [ 3715.150248] [] __blockdev_direct_IO+0x582/0xce0 [ 3715.150250] [] blkdev_direct_IO+0x49/0x50 [ 3715.150252] [] ? 
blkdev_get_blocks+0x0/0xc0 [ 3715.150256] [] generic_file_aio_read+0x6a1/0x6c0 [ 3715.150258] [] ? __lock_acquire+0x33c/0x1250 [ 3715.150260] [] ? native_sched_clock+0x20/0x80 [ 3715.150262] [] ? sched_clock+0x9/0x10 [ 3715.150265] [] ? aio_run_iocb+0x49/0x160 [ 3715.150267] [] ? _spin_unlock_irq+0x2b/0x40 [ 3715.150269] [] ? generic_file_aio_read+0x0/0x6c0 [ 3715.150272] [] aio_rw_vect_retry+0x7c/0x210 [ 3715.150274] [] ? aio_rw_vect_retry+0x0/0x210 [ 3715.150276] [] aio_run_iocb+0x8e/0x160 [ 3715.150278] [] sys_io_submit+0x286/0x700 [ 3715.150281] [] system_call_fastpath+0x16/0x1b [ 3715.150283] no locks held by fio/27967. Looking at a dump of all tasks, I don't see a thread labeled "ib_srp", but here are some of the other threads involved, everybody seems to just be waiting: All the AIO threads are waiting, as in: [43440.068513] aio/7 S ffff88042f855ee0 0 88 2 [43440.068515] ffff880423c25ec0 0000000000000046 ffff880423c2a1f0 ffff880423c2a1f0 [43440.068518] ffffffff80267ed9 ffff880423c2a1f0 ffff88042e93c3e0 ffff880423c2a560 [43440.068521] 0000000723c25e60 00000000ffff8d2a ffff880423c2a560 0000000000000282 [43440.068524] Call Trace: [43440.068526] [] ? prepare_to_wait+0x49/0x80 [43440.068528] [] ? trace_hardirqs_on+0xd/0x10 [43440.068530] [] worker_thread+0xed/0x130 [43440.068532] [] ? autoremove_wake_function+0x0/0x40 [43440.068533] [] ? worker_thread+0x0/0x130 [43440.068535] [] kthread+0x49/0x90 [43440.068537] [] child_rip+0xa/0x11 [43440.068539] [] ? restore_args+0x0/0x30 [43440.068541] [] ? kthread+0x0/0x90 [43440.068543] [] ? child_rip+0x0/0x11 One scsi_eh thread is in a "kobject_put" state: [43440.068580] scsi_eh_0 S ffff880423d59ed0 0 92 2 [43440.068582] ffff880423d59e60 0000000000000046 ffff880423d59dd0 ffffffff80422037 [43440.068585] ffff8804243b4080 ffff880423c743e0 ffff88042e808000 ffff880423c74750 [43440.068588] 0000000223d59e00 00000000ffff945b ffff880423c74750 0000000000000000 [43440.068591] Call Trace: [43440.068593] [] ? 
kobject_put+0x27/0x60 [43440.068596] [] ? __scsi_iterate_devices+0x77/0xa0 [43440.068599] [] scsi_error_handler+0x8c/0x5d0 [43440.068601] [] ? complete+0x4b/0x60 [43440.068603] [] ? scsi_error_handler+0x0/0x5d0 [43440.068605] [] kthread+0x49/0x90 [43440.068607] [] child_rip+0xa/0x11 [43440.068609] [] ? restore_args+0x0/0x30 [43440.068611] [] ? kthread+0x0/0x90 [43440.068613] [] ? child_rip+0x0/0x11 ... all the rest are in trace_hardirqs_on_caller (except for one other, shown later): [43440.068614] scsi_eh_1 S ffff880423d5bed0 0 93 2 [43440.068616] ffff880423d5be60 0000000000000046 ffff880423d5bdd0 ffffffff8027a38b [43440.068619] ffff8804243b5080 ffff880423c721f0 ffff88042e8fa1f0 ffff880423c72560 [43440.068622] 0000000623d5be00 00000000ffff8d39 ffff880423c72560 0000000000000000 [43440.068625] Call Trace: [43440.068626] [] ? trace_hardirqs_on_caller+0x13b/0x1a0 [43440.068628] [] ? __scsi_iterate_devices+0x6a/0xa0 [43440.068631] [] scsi_error_handler+0x8c/0x5d0 [43440.068632] [] ? complete+0x4b/0x60 [43440.068634] [] ? scsi_error_handler+0x0/0x5d0 [43440.068636] [] kthread+0x49/0x90 [43440.068638] [] child_rip+0xa/0x11 [43440.068640] [] ? restore_args+0x0/0x30 [43440.068642] [] ? kthread+0x0/0x90 [43440.068644] [] ? child_rip+0x0/0x11 The IB stack is in the following state: [43440.071112] mthca_catas S ffff88042f855ee0 0 2805 2 [43440.071114] ffff88042309bec0 0000000000000046 ffff880428c1a1f0 ffff880428c1a1f0 [43440.071117] ffffffff80267ed9 ffff880428c1a1f0 ffff88042e8fa1f0 ffff880428c1a560 [43440.071120] 000000062309be60 00000000ffff96d1 ffff880428c1a560 0000000000000282 [43440.071123] Call Trace: [43440.071125] [] ? prepare_to_wait+0x49/0x80 [43440.071127] [] ? trace_hardirqs_on+0xd/0x10 [43440.071129] [] worker_thread+0xed/0x130 [43440.071131] [] ? autoremove_wake_function+0x0/0x40 [43440.071133] [] ? worker_thread+0x0/0x130 [43440.071135] [] kthread+0x49/0x90 [43440.071136] [] child_rip+0xa/0x11 [43440.071138] [] ? restore_args+0x0/0x30 [43440.071140] [] ? 
kthread+0x0/0x90 [43440.071142] [] ? child_rip+0x0/0x11 [43440.071143] mlx4_err S ffff88042f855ee0 0 2808 2 [43440.071146] ffff88042d4ebec0 0000000000000046 ffff880428c1c3e0 ffff880428c1c3e0 [43440.071149] ffffffff80267ed9 ffff880428c1c3e0 ffff880428c18000 ffff880428c1c750 [43440.071152] 000000032d4ebe60 ffffffff8027a38b ffff880428c1c750 0000000000000282 [43440.071155] Call Trace: [43440.071157] [] ? prepare_to_wait+0x49/0x80 [43440.071159] [] ? trace_hardirqs_on_caller+0x13b/0x1a0 [43440.071160] [] ? trace_hardirqs_on+0xd/0x10 [43440.071162] [] worker_thread+0xed/0x130 [43440.071164] [] ? autoremove_wake_function+0x0/0x40 [43440.071166] [] ? worker_thread+0x0/0x130 [43440.071168] [] kthread+0x49/0x90 [43440.071170] [] child_rip+0xa/0x11 [43440.071172] [] ? restore_args+0x0/0x30 [43440.071173] [] ? kthread+0x0/0x90 [43440.071175] [] ? child_rip+0x0/0x11 [43440.071177] ib_mad1 S ffff88042f855ee0 0 2815 2 [43440.071179] ffff88042c4f3ec0 0000000000000046 ffff88042794c3e0 ffff88042794c3e0 [43440.071182] ffffffff80267ed9 ffff88042794c3e0 ffff88042efc43e0 ffff88042794c750 [43440.071185] 000000012c4f3e60 000000010041d148 ffff88042794c750 0000000000000282 [43440.071188] Call Trace: [43440.071190] [] ? prepare_to_wait+0x49/0x80 [43440.071191] [] ? trace_hardirqs_on+0xd/0x10 [43440.071193] [] worker_thread+0xed/0x130 [43440.071195] [] ? autoremove_wake_function+0x0/0x40 [43440.071197] [] ? worker_thread+0x0/0x130 [43440.071199] [] kthread+0x49/0x90 [43440.071201] [] child_rip+0xa/0x11 [43440.071202] [] ? restore_args+0x0/0x30 [43440.071204] [] ? kthread+0x0/0x90 [43440.071206] [] ? child_rip+0x0/0x11 [43440.071208] ib_mad2 S ffff88042f855ee0 0 2816 2 [43440.071210] ffff880422431ec0 0000000000000046 ffff88042c5343e0 ffff88042c5343e0 [43440.071213] ffffffff80267ed9 ffff88042c5343e0 ffff8804283fa1f0 ffff88042c534750 [43440.071216] 0000000622431e60 ffffffff8027a38b ffff88042c534750 0000000000000282 [43440.071219] Call Trace: [43440.071221] [] ? 
prepare_to_wait+0x49/0x80 [43440.071223] [] ? trace_hardirqs_on_caller+0x13b/0x1a0 [43440.071224] [] ? trace_hardirqs_on+0xd/0x10 [43440.071226] [] worker_thread+0xed/0x130 [43440.071228] [] ? autoremove_wake_function+0x0/0x40 [43440.071230] [] ? worker_thread+0x0/0x130 [43440.071232] [] kthread+0x49/0x90 [43440.071234] [] child_rip+0xa/0x11 [43440.071235] [] ? restore_args+0x0/0x30 [43440.071237] [] ? kthread+0x0/0x90 [43440.071239] [] ? child_rip+0x0/0x11 [43440.071241] ib_mcast S ffff88042f855ee0 0 2837 2 [43440.071243] ffff88042b783ec0 0000000000000046 ffff88042b6e0000 ffff88042b6e0000 [43440.071246] ffffffff80267ed9 ffff88042b6e0000 ffff88042e8fa1f0 ffff88042b6e0370 [43440.071249] 000000062b783e60 00000000ffff97c1 ffff88042b6e0370 0000000000000282 [43440.071252] Call Trace: [43440.071254] [] ? prepare_to_wait+0x49/0x80 [43440.071256] [] ? trace_hardirqs_on+0xd/0x10 [43440.071257] [] worker_thread+0xed/0x130 [43440.071259] [] ? autoremove_wake_function+0x0/0x40 [43440.071261] [] ? worker_thread+0x0/0x130 [43440.071263] [] kthread+0x49/0x90 [43440.071265] [] child_rip+0xa/0x11 [43440.071267] [] ? restore_args+0x0/0x30 [43440.071269] [] ? kthread+0x0/0x90 [43440.071270] [] ? child_rip+0x0/0x11 [43440.071272] ib_cm/0 S ffff88042f855ee0 0 2840 2 [43440.071274] ffff88042b79fec0 0000000000000046 ffff88042c5321f0 ffff88042c5321f0 [43440.071277] ffffffff80267ed9 ffff88042c5321f0 ffff8803c44343e0 ffff88042c532560 [43440.071280] 000000002b79fe60 ffffffff8027a38b ffff88042c532560 0000000000000282 [43440.071283] Call Trace: [43440.071285] [] ? prepare_to_wait+0x49/0x80 [43440.071287] [] ? trace_hardirqs_on_caller+0x13b/0x1a0 [43440.071288] [] ? trace_hardirqs_on+0xd/0x10 [43440.071290] [] worker_thread+0xed/0x130 [43440.071292] [] ? autoremove_wake_function+0x0/0x40 [43440.071294] [] ? worker_thread+0x0/0x130 [43440.071296] [] kthread+0x49/0x90 [43440.071298] [] child_rip+0xa/0x11 [43440.071299] [] ? restore_args+0x0/0x30 [43440.071301] [] ? 
kthread+0x0/0x90 [43440.071303] [] ? child_rip+0x0/0x11 [43440.071305] ib_cm/1 S ffff88042f855ee0 0 2841 2 [43440.071307] ffff88042b4d5ec0 0000000000000046 ffff88042b4d8000 ffff88042b4d8000 [43440.071310] ffffffff80267ed9 ffff88042b4d8000 ffff88042efc43e0 ffff88042b4d8370 [43440.071313] 000000012b4d5e60 000000010002d3a4 ffff88042b4d8370 0000000000000282 [43440.071316] Call Trace: [43440.071318] [] ? prepare_to_wait+0x49/0x80 [43440.071320] [] ? trace_hardirqs_on+0xd/0x10 [43440.071322] [] worker_thread+0xed/0x130 [43440.071324] [] ? autoremove_wake_function+0x0/0x40 [43440.071325] [] ? worker_thread+0x0/0x130 [43440.071327] [] kthread+0x49/0x90 [43440.071329] [] child_rip+0xa/0x11 [43440.071331] [] ? restore_args+0x0/0x30 [43440.071333] [] ? kthread+0x0/0x90 [43440.071334] [] ? child_rip+0x0/0x11 [43440.071336] ib_cm/2 S ffff88042f855ee0 0 2842 2 [43440.071338] ffff88042b4d7ec0 0000000000000046 ffff88042b4da1f0 ffff88042b4da1f0 [43440.071341] ffffffff80267ed9 ffff88042b4da1f0 ffff88042e808000 ffff88042b4da560 [43440.071344] 000000022b4d7e60 00000000ffff97c2 ffff88042b4da560 0000000000000282 [43440.071347] Call Trace: [43440.071349] [] ? prepare_to_wait+0x49/0x80 [43440.071351] [] ? trace_hardirqs_on+0xd/0x10 [43440.071353] [] worker_thread+0xed/0x130 [43440.071355] [] ? autoremove_wake_function+0x0/0x40 [43440.071356] [] ? worker_thread+0x0/0x130 [43440.071358] [] kthread+0x49/0x90 [43440.071360] [] child_rip+0xa/0x11 [43440.071362] [] ? restore_args+0x0/0x30 [43440.071364] [] ? kthread+0x0/0x90 [43440.071365] [] ? child_rip+0x0/0x11 [43440.071367] ib_cm/3 S ffff88042f855ee0 0 2843 2 [43440.071369] ffff88042b559ec0 0000000000000046 ffff88042b4dc3e0 ffff88042b4dc3e0 [43440.071372] ffffffff80267ed9 ffff88042b4dc3e0 ffff88042e83a1f0 ffff88042b4dc750 [43440.071375] 000000032b559e60 00000000ffff97c2 ffff88042b4dc750 0000000000000282 [43440.071378] Call Trace: [43440.071380] [] ? prepare_to_wait+0x49/0x80 [43440.071382] [] ? 
trace_hardirqs_on+0xd/0x10 [43440.071384] [] worker_thread+0xed/0x130 [43440.071386] [] ? autoremove_wake_function+0x0/0x40 [43440.071388] [] ? worker_thread+0x0/0x130 [43440.071390] [] kthread+0x49/0x90 [43440.071391] [] child_rip+0xa/0x11 [43440.071393] [] ? restore_args+0x0/0x30 [43440.071395] [] ? kthread+0x0/0x90 [43440.071397] [] ? child_rip+0x0/0x11 [43440.071398] ib_cm/4 S ffff88042f855ee0 0 2844 2 [43440.071401] ffff88042b55bec0 0000000000000046 ffff88042b550000 ffff88042b550000 [43440.071404] ffffffff80267ed9 ffff88042b550000 ffff88042e87c3e0 ffff88042b550370 [43440.071407] 000000042b55be60 00000000ffff97c2 ffff88042b550370 0000000000000282 [43440.071410] Call Trace: [43440.071411] [] ? prepare_to_wait+0x49/0x80 [43440.071413] [] ? trace_hardirqs_on+0xd/0x10 [43440.071415] [] worker_thread+0xed/0x130 [43440.071417] [] ? autoremove_wake_function+0x0/0x40 [43440.071419] [] ? worker_thread+0x0/0x130 [43440.071421] [] kthread+0x49/0x90 [43440.071423] [] child_rip+0xa/0x11 [43440.071424] [] ? restore_args+0x0/0x30 [43440.071426] [] ? kthread+0x0/0x90 [43440.071428] [] ? child_rip+0x0/0x11 [43440.071429] ib_cm/5 S ffff88042f855ee0 0 2845 2 [43440.071432] ffff88042b541ec0 0000000000000046 ffff88042b5521f0 ffff88042b5521f0 [43440.071435] ffffffff80267ed9 ffff88042b5521f0 ffff88042e8c0000 ffff88042b552560 [43440.071438] 000000052b541e60 000000010004ef98 ffff88042b552560 0000000000000282 [43440.071440] Call Trace: [43440.071442] [] ? prepare_to_wait+0x49/0x80 [43440.071444] [] ? trace_hardirqs_on+0xd/0x10 [43440.071446] [] worker_thread+0xed/0x130 [43440.071448] [] ? autoremove_wake_function+0x0/0x40 [43440.071450] [] ? worker_thread+0x0/0x130 [43440.071452] [] kthread+0x49/0x90 [43440.071453] [] child_rip+0xa/0x11 [43440.071455] [] ? restore_args+0x0/0x30 [43440.071457] [] ? kthread+0x0/0x90 [43440.071459] [] ? 
child_rip+0x0/0x11 [43440.071460] ib_cm/6 S ffff88042f855ee0 0 2846 2 [43440.071462] ffff88042b543ec0 0000000000000046 ffff88042b5543e0 ffff88042b5543e0 [43440.071465] ffffffff80267ed9 ffff88042b5543e0 ffff88042e8fa1f0 ffff88042b554750 [43440.071468] 000000062b543e60 00000000ffff97c2 ffff88042b554750 0000000000000282 [43440.071471] Call Trace: [43440.071472] [] ? prepare_to_wait+0x49/0x80 [43440.071474] [] ? trace_hardirqs_on+0xd/0x10 [43440.071476] [] worker_thread+0xed/0x130 [43440.071478] [] ? autoremove_wake_function+0x0/0x40 [43440.071480] [] ? worker_thread+0x0/0x130 [43440.071482] [] kthread+0x49/0x90 [43440.071484] [] child_rip+0xa/0x11 [43440.071486] [] ? restore_args+0x0/0x30 [43440.071487] [] ? kthread+0x0/0x90 [43440.071489] [] ? child_rip+0x0/0x11 [43440.071490] ib_cm/7 S ffff88042f855ee0 0 2847 2 [43440.071493] ffff88042b545ec0 0000000000000046 ffff88042b530000 ffff88042b530000 [43440.071496] ffffffff80267ed9 ffff88042b530000 ffff88042e93c3e0 ffff88042b530370 [43440.071499] 000000072b545e60 00000000ffff97c2 ffff88042b530370 0000000000000282 [43440.071502] Call Trace: [43440.071504] [] ? prepare_to_wait+0x49/0x80 [43440.071505] [] ? trace_hardirqs_on+0xd/0x10 [43440.071507] [] worker_thread+0xed/0x130 [43440.071509] [] ? autoremove_wake_function+0x0/0x40 [43440.071511] [] ? worker_thread+0x0/0x130 [43440.071513] [] kthread+0x49/0x90 [43440.071515] [] child_rip+0xa/0x11 [43440.071517] [] ? restore_args+0x0/0x30 [43440.071519] [] ? kthread+0x0/0x90 [43440.071520] [] ? child_rip+0x0/0x11 [43440.071522] ipoib S ffff88042f855ee0 0 2853 2 [43440.071524] ffff88042b5f9ec0 0000000000000046 ffff88042b5321f0 ffff88042b5321f0 [43440.071527] ffffffff80267ed9 ffff88042b5321f0 ffff88042e988000 ffff88042b532560 [43440.071530] 000000032b5f9e60 ffffffff8027a38b ffff88042b532560 0000000000000282 [43440.071533] Call Trace: [43440.071535] [] ? prepare_to_wait+0x49/0x80 [43440.071537] [] ? trace_hardirqs_on_caller+0x13b/0x1a0 [43440.071539] [] ? 
trace_hardirqs_on+0xd/0x10 [43440.071541] [] worker_thread+0xed/0x130 [43440.071543] [] ? autoremove_wake_function+0x0/0x40 [43440.071544] [] ? worker_thread+0x0/0x130 [43440.071546] [] kthread+0x49/0x90 [43440.071548] [] child_rip+0xa/0x11 [43440.071550] [] ? restore_args+0x0/0x30 [43440.071552] [] ? kthread+0x0/0x90 [43440.071553] [] ? child_rip+0x0/0x11 [43440.071555] ib_addr S ffff88042f855ee0 0 2870 2 [43440.071557] ffff88042ace9ec0 0000000000000046 ffff88042b5343e0 ffff88042b5343e0 [43440.071560] ffffffff80267ed9 ffff88042b5343e0 ffff88042e8c0000 ffff88042b534750 [43440.071563] 000000052ace9e60 0000000100419d43 ffff88042b534750 0000000000000282 [43440.071566] Call Trace: [43440.071568] [] ? prepare_to_wait+0x49/0x80 [43440.071570] [] ? trace_hardirqs_on+0xd/0x10 [43440.071572] [] worker_thread+0xed/0x130 [43440.071574] [] ? autoremove_wake_function+0x0/0x40 [43440.071575] [] ? worker_thread+0x0/0x130 [43440.071577] [] kthread+0x49/0x90 [43440.071579] [] child_rip+0xa/0x11 [43440.071581] [] ? restore_args+0x0/0x30 [43440.071583] [] ? kthread+0x0/0x90 [43440.071585] [] ? child_rip+0x0/0x11 [43440.071586] iw_cm_wq S ffff88042f855ee0 0 2872 2 [43440.071589] ffff880428373ec0 0000000000000046 ffff88042d8b21f0 ffff88042d8b21f0 [43440.071592] ffffffff80267ed9 ffff88042d8b21f0 ffff880423130000 ffff88042d8b2560 [43440.071595] 0000000328373e60 ffffffff8027a38b ffff88042d8b2560 0000000000000282 [43440.071598] Call Trace: [43440.071599] [] ? prepare_to_wait+0x49/0x80 [43440.071601] [] ? trace_hardirqs_on_caller+0x13b/0x1a0 [43440.071603] [] ? trace_hardirqs_on+0xd/0x10 [43440.071605] [] worker_thread+0xed/0x130 [43440.071607] [] ? autoremove_wake_function+0x0/0x40 [43440.071609] [] ? worker_thread+0x0/0x130 [43440.071611] [] kthread+0x49/0x90 [43440.071613] [] child_rip+0xa/0x11 [43440.071614] [] ? restore_args+0x0/0x30 [43440.071616] [] ? kthread+0x0/0x90 [43440.071618] [] ? 
child_rip+0x0/0x11 [43440.071619] rdma_cm S ffff88042f855ee0 0 2874 2 [43440.071622] ffff880429917ec0 0000000000000046 ffff88042c4ec3e0 ffff88042c4ec3e0 [43440.071625] ffffffff80267ed9 ffff88042c4ec3e0 ffff880428c343e0 ffff88042c4ec750 [43440.071628] 0000000629917e60 ffffffff8027a38b ffff88042c4ec750 0000000000000282 [43440.071631] Call Trace: [43440.071632] [] ? prepare_to_wait+0x49/0x80 [43440.071634] [] ? trace_hardirqs_on_caller+0x13b/0x1a0 [43440.071636] [] ? trace_hardirqs_on+0xd/0x10 [43440.071638] [] worker_thread+0xed/0x130 [43440.071640] [] ? autoremove_wake_function+0x0/0x40 [43440.071642] [] ? worker_thread+0x0/0x130 [43440.071644] [] kthread+0x49/0x90 [43440.071646] [] child_rip+0xa/0x11 [43440.071647] [] ? restore_args+0x0/0x30 [43440.071649] [] ? kthread+0x0/0x90 [43440.071651] [] ? child_rip+0x0/0x11 scsi_tgtd threads all look like: [43440.075668] scsi_tgtd/7 S ffff88042f855ee0 0 27784 2 [43440.075668] ffff880427057ec0 0000000000000046 ffff880427f043e0 ffff880427f043e0 [43440.075668] ffffffff80267ed9 ffff880427f043e0 ffff88042e93c3e0 ffff880427f04750 [43440.075668] 0000000727057e60 000000010002c6f4 ffff880427f04750 0000000000000282 [43440.075668] Call Trace: [43440.075668] [] ? prepare_to_wait+0x49/0x80 [43440.075668] [] ? trace_hardirqs_on+0xd/0x10 [43440.075668] [] worker_thread+0xed/0x130 [43440.075668] [] ? autoremove_wake_function+0x0/0x40 [43440.075668] [] ? worker_thread+0x0/0x130 [43440.075668] [] kthread+0x49/0x90 [43440.075668] [] child_rip+0xa/0x11 [43440.075668] [] ? restore_args+0x0/0x30 [43440.075668] [] ? kthread+0x0/0x90 [43440.075668] [] ? 
child_rip+0x0/0x11 Not sure what this thread is: [43440.075668] ib_fmr(mlx4_0 S ffff88042f855ee0 0 27790 2 [43440.075668] ffff8803c18a9ef0 0000000000000046 ffffffff80bf8a00 0000000000000000 [43440.075668] ffff880423eff080 ffff880427e9a1f0 ffff88042e8fa1f0 ffff880427e9a560 [43440.075668] 0000000627e9a1f0 000000010002c6f7 ffff880427e9a560 ffff880427e9a1f0 [43440.075668] Call Trace: [43440.075668] [] ib_fmr_cleanup_thread+0xb5/0xd0 [ib_core] [43440.075668] [] ? ib_fmr_cleanup_thread+0x0/0xd0 [ib_core] [43440.075668] [] kthread+0x49/0x90 [43440.075668] [] child_rip+0xa/0x11 [43440.075668] [] ? restore_args+0x0/0x30 [43440.075668] [] ? kthread+0x0/0x90 [43440.075668] [] ? child_rip+0x0/0x11 Another scsi_eh, in a different state: [43440.075668] scsi_eh_4 S ffff88042713bed0 0 27831 2 [43440.075668] ffff88042713be60 0000000000000046 ffffffff80244f0f ffff880427e9c3e0 [43440.075668] ffffffff806a98fb ffff880427e9c3e0 ffff88042e83a1f0 ffff880427e9c750 [43440.075668] 0000000344840a00 000000010002d3a4 ffff880427e9c750 ffffffff8027a3fd [43440.075668] Call Trace: [43440.075668] [] ? finish_task_switch+0x5f/0x120 [43440.075668] [] ? _spin_unlock_irq+0x2b/0x40 [43440.075668] [] ? trace_hardirqs_on+0xd/0x10 [43440.075668] [] ? _spin_unlock_irq+0x2b/0x40 [43440.075668] [] ? finish_task_switch+0x5f/0x120 [43440.075668] [] ? finish_task_switch+0x0/0x120 [43440.075668] [] scsi_error_handler+0x8c/0x5d0 [43440.075668] [] ? complete+0x4b/0x60 [43440.075668] [] ? scsi_error_handler+0x0/0x5d0 [43440.075668] [] kthread+0x49/0x90 [43440.075668] [] child_rip+0xa/0x11 [43440.075668] [] ? restore_args+0x0/0x30 [43440.075668] [] ? kthread+0x0/0x90 [43440.075668] [] ? child_rip+0x0/0x11 While I appreciate you and Vlad's help immensely, from what I gather (and I could be wrong), neither of you work on ib_srp. Is there someone from the initiator side we could bring-in on this? 
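For anyone wanting to sift a dump like the one above without reading every trace, here is a minimal sketch that counts tasks per scheduler state and lists those in uninterruptible sleep (D). It assumes the task header layout seen in this dump (timestamp, task name, state letter, a 16-digit hex word, then three integers, the second of which is the pid); the function name summarize and the sample text are illustrative only, and other kernel versions may print the header differently.

```python
import re
from collections import Counter

# Task header as it appears in this dump, e.g.
# "[43440.068513] aio/7 S ffff88042f855ee0 0 88 2"
HEADER = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(?P<name>\S+)\s+(?P<state>[RSDTZ])\s+"
    r"[0-9a-f]{16}\s+\d+\s+(?P<pid>\d+)\s+\d+\s*$"
)


def summarize(dump_text):
    """Return (Counter of task states, list of (name, pid) for D-state tasks)."""
    states = Counter()
    blocked = []
    for line in dump_text.splitlines():
        match = HEADER.match(line)
        if match is None:
            continue  # stack words and call-trace lines are ignored
        states[match["state"]] += 1
        if match["state"] == "D":
            blocked.append((match["name"], int(match["pid"])))
    return states, blocked


# A few lines copied from the dump in this message.
sample_dump = """\
[ 3715.150183] fio D ffff8803c183baa0 0 27967 1
[ 3715.150187] ffff8803c183ba38 0000000000000046 ffff8803c450a920 0000000000000001
[ 3715.150218] [] io_schedule+0x37/0x50
[43440.068513] aio/7 S ffff88042f855ee0 0 88 2
[43440.068530] [] worker_thread+0xed/0x130
[43440.068580] scsi_eh_0 S ffff880423d59ed0 0 92 2
"""

if __name__ == "__main__":
    states, blocked = summarize(sample_dump)
    print(dict(states))
    print(blocked)
```

Run over the full dump, this would make it easy to confirm that only the fio processes are in D state while the IB and SCSI kernel threads are all idle in S.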
Some clarifications on my testing procedures: in testing block sizes, I usually start w/ 1MB blocks, then decrease the block size to lesser powers of two, finishing with 512B blocks... so when I say "the problem occurs w/ the RHEL/OFED target at <2KB blocks", that means it made it through the 8KB blocks w/o hanging (where the Ubuntu hangs). While being able to remove the scst/scst_vdisk/ib_srpt modules from the target once the hang has occurred, and not being able to remove ib_srp on the initiator, seems to implicate the initiator, it could be that the target is in a state in the protocol where it believes nothing is outstanding, while the initiator is waiting in another state, for more from the target. If that is the case, then it's hard to say who's to blame. Thanks, Chris From swise at opengridcomputing.com Thu Sep 3 08:36:35 2009 From: swise at opengridcomputing.com (Steve Wise) Date: Thu, 03 Sep 2009 10:36:35 -0500 Subject: [ofa-general] Bad interaction between ofed, NFS and Gaussian In-Reply-To: <20090903152537.GG15063@lapinou.lsd.univ-montp2.fr> References: <20090902163914.GA23358@lapinou.lsd.univ-montp2.fr> <4A9EC972.9050106@mellanox.co.il> <20090903075118.GE15063@lapinou.lsd.univ-montp2.fr> <4A9FD68A.9020403@opengridcomputing.com> <20090903152537.GG15063@lapinou.lsd.univ-montp2.fr> Message-ID: <4A9FE283.4060405@opengridcomputing.com> what does 'modinfo xprtrdma' show? BOYRIE Fabrice wrote: > On Thu, Sep 03, 2009 at 09:45:30AM -0500, Steve Wise wrote: > >> BOYRIE Fabrice wrote: >> >>> On Wed, Sep 02, 2009 at 10:37:22PM +0300, Tziporet Koren wrote: >>> >>> >>>> It seems issues of NFS/RDMA backports. >>>> Can you install OFED without NFS/RDMA? >>>> You can change the conf file for this >>>> >>>> >>> I've tried the following configuration and rebooted the node. The bug >>> is always here. Did I forgot to disable something ? >>> >>> >>> >>> >> I don't see the nfsrdma entry in your ofed.conf file. Was that >> generated by the 1.4.2 install.pl? 
>> > > Yes > > The only line with nfs in the ofed.conf is > rnfs-utils=n > > I attach the ofed.conf, to avoid any copy/paste error. > > Fabrice BOYRIE > From donald.j.meyer at intel.com Thu Sep 3 10:12:06 2009 From: donald.j.meyer at intel.com (Meyer, Donald J) Date: Thu, 3 Sep 2009 10:12:06 -0700 Subject: [ofa-general] question about partitioning IB networks In-Reply-To: References: <6203933669E90E4AB42B5BC4EDE38D350C7D048C32@orsmsx510.amr.corp.intel.com> <4A9FC81F.4040009@dev.mellanox.co.il> Message-ID: <6203933669E90E4AB42B5BC4EDE38D350C7D1283EE@orsmsx510.amr.corp.intel.com> Hal, If you would like to use my example configuration in the man page (the one with realistic GUID's) please feel free to do so. The GUID's are all imaginary but realistic. Are you sure the default partition should be "Default=0x7fff,ipoib,rate=7:ALL,SELF=full;" and not "Default=0x7fff:SELF=full,ALL=limited;"? The second version forces both known and unknown CA's to be unable to reach any CA but the sm except via their own partition. It also seems to me that the first version bypasses partitioning by allowing CA's to use the default partition to reach other CA's not in the same partition. Also, if you would like, I would be happy to work on a version of the man page where I would try to possibly explain a bit more and have more complete examples. Thanks, Don Meyer Senior Network/System Engineer/Programmer US+ (253) 371-9532 iNet 8-371-9532 *Other names and brands may be claimed as the property of others ________________________________ From: Hal Rosenstock [mailto:hal.rosenstock at gmail.com] Sent: Thursday, September 03, 2009 6:46 AM To: kliteyn at dev.mellanox.co.il Cc: Meyer, Donald J; general at lists.openfabrics.org Subject: Re: [ofa-general] question about partitioning IB networks On Thu, Sep 3, 2009 at 9:43 AM, Yevgeny Kliteynik > wrote: Hal Rosenstock wrote: On Mon, Aug 31, 2009 at 3:29 PM, Meyer, Donald J >> wrote: ... ... 
Just as a sanity check, my interpretation so far is that my network should have a partition configuration file similar to the following. Can anyone tell me if I have this correct? In this example configuration, I am trying to create two partitions. One with rack one and two, the other with rack three and four: #Default partition (for SM control of the CA's) Default=0x7fff,ipoib,rate=7:ALL=limited; Default=0x7fff,ipoib,rate=7:ALL,SELF=full; #rack1 rack1=0x111,ipoib,rate=7,defmember=full:; #rack2 rack2=0x111,ipoib,rate=7,defmember=full:; #rack3 rack3=0x112,ipoib,rate=7,defmember=full:; #rack4 rack4=0x112,ipoib,rate=7,defmember=full:; I've never done it this way but it does look like the partition create code will detect the duplicated partitions (0x111 and 0x112) and merge ports from rack2 with rack1 and rack4 with rack3. It will. Note that partition names are meaningless in terms of IB management. Basically they are used just for logging. The only real partition ID is its pkey. The low 15 bits of the pkey (that is, excluding the membership bit) denote the partition. -- Hal -- Yevgeny -- Hal *Thanks,* *Don Meyer* /Senior Network/System Engineer/Programmer/ US+ (253) 371-9532 iNet 8-371-9532 /*Other names and brands may be claimed as the property of others/ _______________________________________________ general mailing list general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general ------------------------------------------------------------------------ _______________________________________________ general mailing list general at lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general -------------- next part -------------- An HTML attachment was scrubbed...
URL: From hal.rosenstock at gmail.com Thu Sep 3 10:31:25 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Thu, 3 Sep 2009 13:31:25 -0400 Subject: [ofa-general] question about partitioning IB networks In-Reply-To: <6203933669E90E4AB42B5BC4EDE38D350C7D1283EE@orsmsx510.amr.corp.intel.com> References: <6203933669E90E4AB42B5BC4EDE38D350C7D048C32@orsmsx510.amr.corp.intel.com> <4A9FC81F.4040009@dev.mellanox.co.il> <6203933669E90E4AB42B5BC4EDE38D350C7D1283EE@orsmsx510.amr.corp.intel.com> Message-ID: Don, On Thu, Sep 3, 2009 at 1:12 PM, Meyer, Donald J wrote: > Hal, > > > > If you would like to use my example configuration in the man page (the one > with realistic GUID’s) please feel free to do so. The GUID’s are all > imaginary but realistic. > > > > Are you sure the default partition should be > “Default=0x7fff,ipoib,rate=7:ALL,SELF=full;” and not > “Default=0x7fff:SELF=full,ALL=limited;”? > The second version forces both known and unknown CA’s to be unable to > reach any CA but the sm except via their own partition. > I thought that was what you wanted. I thought you only wanted the CAs to be able to talk with each other on the designated non-default partitions. > It also seems to me that the first version bypasses partitioning by > allowing CA’s to use the default partition to reach other CA’s not in the > same partition. > You mean that, since they are all full members of the default partition, they can talk to each other on that partition even though that is not allowed on some other partition? If so, yes. > > > Also, if you would like, I would be happy to work on a version of the man > page where I would try to possibly explain a bit more and have more complete > examples. > Sure; if you want, you are welcome to post patches to the list for review, comment, etc.
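To make the full/limited distinction concrete, here is a small sketch of the P_Key matching rule as I understand it from the InfiniBand spec: a P_Key is 16 bits, the top bit marks full membership, the low 15 bits identify the partition, and two ports can talk only if the low 15 bits match and at least one side is a full member. The function names below are made up for illustration and this is not OpenSM code.

```python
FULL_MEMBER_BIT = 0x8000  # top bit of the 16-bit P_Key: set means full member
BASE_MASK = 0x7FFF        # low 15 bits: the partition itself


def split_pkey(pkey):
    """Split a 16-bit P_Key into (partition base, is_full_member)."""
    return pkey & BASE_MASK, bool(pkey & FULL_MEMBER_BIT)


def can_communicate(pkey_a, pkey_b):
    """Same partition base, and at least one of the two ports is a full member."""
    base_a, full_a = split_pkey(pkey_a)
    base_b, full_b = split_pkey(pkey_b)
    return base_a == base_b and (full_a or full_b)


if __name__ == "__main__":
    # Default partition: limited CA (0x7fff) talking to the full-member SM (0xffff).
    print(can_communicate(0x7FFF, 0xFFFF))
    # Two CAs that are both limited members of the default partition.
    print(can_communicate(0x7FFF, 0x7FFF))
    # Two full members of partition 0x111, then 0x111 against 0x112.
    print(can_communicate(0x8111, 0x8111))
    print(can_communicate(0x8111, 0x8112))
```

Under this rule, "ALL=limited, SELF=full" on the default partition lets every CA reach the SM but not each other, while making ALL full members lets any two CAs talk over the default partition regardless of the rack partitions.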
-- Hal

>
> *Thanks,*
>
> *Don Meyer*
>
> *Senior Network/System Engineer/Programmer*
>
> US+ (253) 371-9532 iNet 8-371-9532
>
> **Other names and brands may be claimed as the property of others*
> ------------------------------
>
> *From:* Hal Rosenstock [mailto:hal.rosenstock at gmail.com]
> *Sent:* Thursday, September 03, 2009 6:46 AM
> *To:* kliteyn at dev.mellanox.co.il
> *Cc:* Meyer, Donald J; general at lists.openfabrics.org
> *Subject:* Re: [ofa-general] question about partitioning IB networks
>
> On Thu, Sep 3, 2009 at 9:43 AM, Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il> wrote:
>
> Hal Rosenstock wrote:
>
> On Mon, Aug 31, 2009 at 3:29 PM, Meyer, Donald J <donald.j.meyer at intel.com> wrote:
> ...
> ...
>
> Just as a sanity check, my interpretation so far is that my network
> should have a partition configuration file similar to the following.
> Can anyone tell me if I have this correct? In this example
> configuration, I am trying to create two partitions. One with rack
> one and two, the other with rack three and four:
>
> #Default partition (for SM control of the CA’s)
> Default=0x7fff,ipoib,rate=7:ALL=limited;
> Default=0x7fff,ipoib,rate=7:ALL,SELF=full;
> #rack1
> rack1=0x111,ipoib,rate=7,defmember=full:;
> #rack2
> rack2=0x111,ipoib,rate=7,defmember=full:;
> #rack3
> rack3=0x112,ipoib,rate=7,defmember=full:;
> #rack4
> rack4=0x112,ipoib,rate=7,defmember=full:;
>
> I've never done it this way but it does look like the partition create code
> will detect the duplicated partitions (0x111 and 0x112) and merge ports from
> rack2 with rack1 and rack4 with rack3.
>
> It will.
> Note that partition names are meaningless in terms of IB management.
> Basically they are used just for logging. The only real partition ID
> is its pkey.
>
> The low 15 bits (without membership bit) of pkey denote the partition.
> > > > -- Hal > > > > > -- Yevgeny > > -- Hal > > > *Thanks,* > > *Don Meyer* > > /Senior Network/System Engineer/Programmer/ > > US+ (253) 371-9532 iNet 8-371-9532 > > /*Other names and brands may be claimed as the property of others/ > > > > _______________________________________________ > general mailing list > > general at lists.openfabrics.org > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit > http://openib.org/mailman/listinfo/openib-general > > > ------------------------------------------------------------------------ > > > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit > http://openib.org/mailman/listinfo/openib-general > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From worleys at gmail.com Thu Sep 3 10:38:34 2009 From: worleys at gmail.com (Chris Worley) Date: Thu, 3 Sep 2009 11:38:34 -0600 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: <4A9FA945.4070408@vlnb.net> References: <4A9FA945.4070408@vlnb.net> Message-ID: On Thu, Sep 3, 2009 at 5:32 AM, Vladislav Bolkhovitin wrote: > Chris Worley, on 09/03/2009 08:08 AM wrote: >> >> On Wed, Sep 2, 2009 at 2:58 PM, Chris Worley wrote: >>> >>> On Wed, Sep 2, 2009 at 2:00 PM, Bart Van Assche >>> wrote: >>>> >>>> On Wed, Sep 2, 2009 at 9:53 PM, Chris Worley wrote: >>>>> >>>>> On Wed, Sep 2, 2009 at 1:31 PM, Bart Van >>>>> Assche wrote: >>>>>> >>>>>> On Tue, Sep 1, 2009 at 1:04 AM, Chris Worley wrote: >>>>>>> >>>>>>> [ ... ] >>>>>>> I've found a good kernel/scst mix to easily repeat this; I can get it >>>>>>> to repeatedly hang w/ 8K block transfers running Ubuntu 9.04 w/ the >>>>>>> 2.6.27-14-server kernel on _both_ target and initiator (i.e. 
no WinOF >>>>>>> or OFED at all) and SCST rev 1062 on the target using one drive >>>>>>> (performance is >600MB/s, >80K IOPS, on the 8KB block sizes being >>>>>>> used). >>>>>>> [ ... ] >>>>>> >>>>>> Is there a special reason why you are using the 2.6.27-14-server >>>>>> kernel ? AFAIK the latest Ubuntu 9.04 kernel is 2.6.28-15-server. >>>>> >>>>> No special reason other than it didn't get upgraded w/ the rest of the >>>>> distro... started w/ 8.10. >>> >>> I'm upgrading too, to 9.04. >> >> I tried the 2.6.28-15-server kernel (along w/ the 9.04 upgrade), and >> it does repeat the issue. >> >> In trying to build a kernel w/ lockdep support as Vlad requested, my >> lack of Debian knowledge shone through, and, although I believe I >> followed all the instructions correctly, I'm not sure if I have a >> 2.6.28-15 or 2.6.28-10 kernel.  Anyway, the issue is still repeatable. >> >> Whatever kernel that is, I have SRP hung currently.  What should I >> look for in /proc/lockd*? >> >> I don't think it's a kernel lock... I think it's a protocol lock, as I >> can rmmod the target kernel modules (scst_vdisk, scst, and ib_srpt) >> when the initiator gets in this state. > > Since you can rmmod SCST modules, then this shouldn't be SCST or backstorage > SW/HW issue, because that means there are no stuck or lost SCSI commands. At least on the target side. The initiator could think there are outstanding commands, when they were actually lost on the target (or the target completed them, and the initiator is in error not thinking they are completed). > So, it should be issue of either SRP target/initiator, or OFED on the target > or initiator, or your IB hardware on any node. I've used a couple of initiators (different systems) w/ different OSes, w/ different IB cards (all QDR) and different IB stacks (built-in vs. 
OFED) and can repeat the problem in all but the RHEL5.2/OFED 1.4.1 target and initiator (but, if the initiator is WinOF and the target is RHEL5.2/OFED1.4.1, then the problem does repeat). > > You should enable lockdep on both target and initiator (better with other > kernel debug facilities enabled, see the attached file as a sample) and > reproduce the issue. That's done and reported in another response; it doesn't seem to be a lock issue. > There is a big chance that those facilities will spot > what's going on wrong there. I applied the .config changes you suggested, and the kernel was certainly more verbose, but I don't think added any information. When the drives are attached over SRP, I see the following message: [ 454.317328] sd 4:0:0:3: [sde] Attached SCSI disk [ 454.317340] kobject: 'scsi_device' (ffff8804234a3aa0): kobject_add_internal: parent: '4:0:0:3', set: '' [ 454.317350] kobject: '4:0:0:3' (ffff880423cd2780): kobject_add_internal: parent: 'scsi_device', set: 'devices' [ 454.317378] kobject: '4:0:0:3' (ffff880423cd2780): kobject_uevent_env [ 454.317390] kobject: '4:0:0:3' (ffff880423cd2780): fill_kobj_path: path = '/devices/pci0000:00/0000:00:01.0/0000:01:00.0/host4/target4:0:0/4:0:0:3/scsi_device/4:0:0:3' [ 454.317437] kobject: 'scsi_generic' (ffff8804234a3c38): kobject_add_internal: parent: '4:0:0:3', set: '' [ 454.317447] kobject: 'sg5' (ffff88042ac4ecb8): kobject_add_internal: parent: 'scsi_generic', set: 'devices' [ 454.317489] kobject: 'sg5' (ffff88042ac4ecb8): kobject_uevent_env [ 454.317500] kobject: 'sg5' (ffff88042ac4ecb8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:01.0/0000:01:00.0/host4/target4:0:0/4:0:0:3/scsi_generic/sg5' [ 454.317523] sd 4:0:0:3: Attached scsi generic sg5 type 0 Is there somewhere else to look for problems? Thanks, Chris > > Vlad > >> Thanks, >> >> Chris >>> >>> Chris >>>>> >>>>> Do you think that kernel is better? >>>> >>>> I noticed this while trying to reproduce this issue. 
I have no opinion >>>> yet about which of these two kernels is better. I'll downgrade the >>>> Ubuntu kernel in my setup. >>>> >>>> Bart. >>>> >> >> >> ------------------------------------------------------------------------------ >> Let Crystal Reports handle the reporting - Free Crystal Reports 2008 >> 30-Day trial. Simplify your report design, integration and deployment - and >> focus on what you do best, core application coding. Discover what's new with >> Crystal Reports now.  http://p.sf.net/sfu/bobj-july >> _______________________________________________ >> Scst-devel mailing list >> Scst-devel at lists.sourceforge.net >> https://lists.sourceforge.net/lists/listinfo/scst-devel >> > > From rdreier at cisco.com Thu Sep 3 10:40:55 2009 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 03 Sep 2009 10:40:55 -0700 Subject: [ofa-general] InfiniBand/RDMA merge plans for 2.6.32 In-Reply-To: <4A9FB004.6020904@mellanox.co.il> (Tziporet Koren's message of "Thu, 03 Sep 2009 15:01:08 +0300") References: <4A9FB004.6020904@mellanox.co.il> Message-ID: > What about RDMAoE? > Patches were sent few weeks ago and it seems you ignore them. Sorry, I should have mentioned that. Yes, I have been ignoring the patches -- I want to get through XRC first, and also I would like to see a real spec for IBoE (that resolves issues like multicast interaction with IGMP, address resolution, etc) so we can judge whether the current patches match the long-term direction or are just a point-in-time hack. - R. From arlin.r.davis at intel.com Thu Sep 3 12:28:13 2009 From: arlin.r.davis at intel.com (Arlin Davis) Date: Thu, 3 Sep 2009 12:28:13 -0700 Subject: [ofa-general] [PATCH] uDAPL v2: scm, ucm: UD QP support was broken when porting to common openib code base. Message-ID: <2C05ECC7CE4F4A9EB9E711465D7D99A6@amr.corp.intel.com> create remote_ah was moved out of modify_qp_state function but not included in the RTU and ACCEPT code for UD QP's. 
qp type check should be on daddr not saddr in ucm cm code. QP number must be converted to host order before supplying remote_ah, and qp number to consumer. Modify QP state to RTR for UD QP mask setting incorrect. Signed-off-by: Arlin Davis --- dapl/openib_common/qp.c | 1 + dapl/openib_scm/cm.c | 32 ++++++++++++++++++++++++++++---- dapl/openib_ucm/cm.c | 6 +++--- 3 files changed, 32 insertions(+), 7 deletions(-) diff --git a/dapl/openib_common/qp.c b/dapl/openib_common/qp.c index 73d2c3f..581fc83 100644 --- a/dapl/openib_common/qp.c +++ b/dapl/openib_common/qp.c @@ -415,6 +415,7 @@ dapls_modify_qp_state(IN ib_qp_handle_t qp_handle, /* UD: already in RTR, RTS state */ if (qp_handle->qp_type == IBV_QPT_UD) { + mask = IBV_QP_STATE; if (ep_ptr->qp_state == IBV_QPS_RTR || ep_ptr->qp_state == IBV_QPS_RTS) return DAT_SUCCESS; diff --git a/dapl/openib_scm/cm.c b/dapl/openib_scm/cm.c index 8b85e15..06fff95 100644 --- a/dapl/openib_scm/cm.c +++ b/dapl/openib_scm/cm.c @@ -772,11 +772,16 @@ ud_bail: goto bail; } + dapl_log(DAPL_DBG_TYPE_CM, + " CONN_RTU: UD AH %p for lid 0x%x qpn 0x%x\n", + cm_ptr->ah, ntohs(cm_ptr->msg.saddr.ib.lid), + ntohl(cm_ptr->msg.saddr.ib.qpn)); + /* post EVENT, modify_qp created ah */ xevent.status = 0; xevent.type = DAT_IB_UD_REMOTE_AH; xevent.remote_ah.ah = cm_ptr->ah; - xevent.remote_ah.qpn = cm_ptr->msg.saddr.ib.qpn; + xevent.remote_ah.qpn = ntohl(cm_ptr->msg.saddr.ib.qpn); dapl_os_memcpy(&xevent.remote_ah.ia_addr, &ep_ptr->remote_ia_address, sizeof(union dcm_addr)); @@ -1153,6 +1158,7 @@ dapli_socket_accept_usr(DAPL_EP * ep_ptr, void dapli_socket_accept_rtu(dp_ib_cm_handle_t cm_ptr) { int len; + ib_cm_events_t event = IB_CME_CONNECTED; /* complete handshake after final QP state change, VER and OP */ len = recv(cm_ptr->socket, (char *)&cm_ptr->msg, 4, 0); @@ -1162,6 +1168,7 @@ void dapli_socket_accept_rtu(dp_ib_cm_handle_t cm_ptr) len, ntohs(cm_ptr->msg.op), inet_ntoa(((struct sockaddr_in *) &cm_ptr->msg.daddr.so)->sin_addr)); + event = 
IB_CME_DESTINATION_REJECT; goto bail; } @@ -1175,11 +1182,28 @@ void dapli_socket_accept_rtu(dp_ib_cm_handle_t cm_ptr) if (cm_ptr->msg.saddr.ib.qp_type == IBV_QPT_UD) { DAT_IB_EXTENSION_EVENT_DATA xevent; + ib_pd_handle_t pd_handle = + ((DAPL_PZ *)cm_ptr->ep->param.pz_handle)->pd_handle; + + cm_ptr->ah = dapls_create_ah(cm_ptr->hca, pd_handle, + cm_ptr->ep->qp_handle, + cm_ptr->msg.saddr.ib.lid, + NULL); + if (!cm_ptr->ah) { + event = IB_CME_LOCAL_FAILURE; + goto bail; + } + + dapl_log(DAPL_DBG_TYPE_CM, + " CONN_RTU: UD AH %p for lid 0x%x qpn 0x%x\n", + cm_ptr->ah, ntohs(cm_ptr->msg.saddr.ib.lid), + ntohl(cm_ptr->msg.saddr.ib.qpn)); + /* post EVENT, modify_qp created ah */ xevent.status = 0; xevent.type = DAT_IB_UD_PASSIVE_REMOTE_AH; xevent.remote_ah.ah = cm_ptr->ah; - xevent.remote_ah.qpn = cm_ptr->msg.saddr.ib.qpn; + xevent.remote_ah.qpn = ntohl(cm_ptr->msg.saddr.ib.qpn); dapl_os_memcpy(&xevent.remote_ah.ia_addr, &cm_ptr->msg.daddr.so, sizeof(union dcm_addr)); @@ -1200,14 +1224,14 @@ void dapli_socket_accept_rtu(dp_ib_cm_handle_t cm_ptr) } else { #endif cm_ptr->ep->cm_handle = cm_ptr; /* only RC, multi CR's on UD */ - dapls_cr_callback(cm_ptr, IB_CME_CONNECTED, NULL, cm_ptr->sp); + dapls_cr_callback(cm_ptr, event, NULL, cm_ptr->sp); } return; bail: dapls_modify_qp_state(cm_ptr->ep->qp_handle, IBV_QPS_ERR, 0, 0, 0); dapls_ib_cm_free(cm_ptr, cm_ptr->ep); - dapls_cr_callback(cm_ptr, IB_CME_DESTINATION_REJECT, NULL, cm_ptr->sp); + dapls_cr_callback(cm_ptr, event, NULL, cm_ptr->sp); } /* diff --git a/dapl/openib_ucm/cm.c b/dapl/openib_ucm/cm.c index ab3823e..a2db64e 100644 --- a/dapl/openib_ucm/cm.c +++ b/dapl/openib_ucm/cm.c @@ -930,7 +930,7 @@ ud_bail: xevent.status = 0; xevent.type = DAT_IB_UD_REMOTE_AH; xevent.remote_ah.ah = cm->hca->ib_trans.ah[lid]; - xevent.remote_ah.qpn = cm->msg.daddr.ib.qpn; + xevent.remote_ah.qpn = ntohl(cm->msg.daddr.ib.qpn); dapl_os_memcpy(&xevent.remote_ah.ia_addr, &cm->msg.daddr, sizeof(union dcm_addr)); @@ -1070,7 +1070,7 @@ static 
void ucm_accept_rtu(dp_ib_cm_handle_t cm, ib_cm_msg_t *msg) dapl_dbg_log(DAPL_DBG_TYPE_CM, " PASSIVE: connected!\n"); #ifdef DAT_EXTENSIONS - if (cm->msg.saddr.ib.qp_type == IBV_QPT_UD) { + if (cm->msg.daddr.ib.qp_type == IBV_QPT_UD) { DAT_IB_EXTENSION_EVENT_DATA xevent; uint16_t lid = ntohs(cm->msg.daddr.ib.lid); @@ -1078,7 +1078,7 @@ static void ucm_accept_rtu(dp_ib_cm_handle_t cm, ib_cm_msg_t *msg) xevent.status = 0; xevent.type = DAT_IB_UD_PASSIVE_REMOTE_AH; xevent.remote_ah.ah = cm->hca->ib_trans.ah[lid]; - xevent.remote_ah.qpn = cm->msg.daddr.ib.qpn; + xevent.remote_ah.qpn = ntohl(cm->msg.daddr.ib.qpn); dapl_os_memcpy(&xevent.remote_ah.ia_addr, &cm->msg.daddr, sizeof(cm->msg.daddr)); -- 1.5.2.5 From arlin.r.davis at intel.com Thu Sep 3 12:28:15 2009 From: arlin.r.davis at intel.com (Davis, Arlin R) Date: Thu, 3 Sep 2009 12:28:15 -0700 Subject: [ofa-general] [PATCH] uDAPL v2: dtest, dtestx: modifications for UD QP testing with ucm provider. Message-ID: remote_addr is wrong for IP remote address. The dtestx requires the server connect back to the client for the UD test. With the ucm provider you need to provide the QPN and the LID which you cannot get until the dtest client starts. So, for now, don't support UD testing on UCM providers. 
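
The remote_addr problem described above is the classic mix-up between a pointer and the address of the field that holds it. A minimal sketch in plain C, assuming only the POSIX getaddrinfo() interface; the helper name is hypothetical, and struct sockaddr stands in for the DAT address type used by dtest:

```c
#define _POSIX_C_SOURCE 200112L

#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not part of dtest): parse a numeric IPv4 string
 * and return the address in network byte order, or 0 on failure.
 */
static uint32_t resolved_ipv4(const char *numeric_host)
{
    struct addrinfo hints;
    struct addrinfo *target = NULL;
    struct sockaddr *remote_addr;
    uint32_t addr = 0;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST; /* parse only, no DNS lookup */

    if (getaddrinfo(numeric_host, NULL, &hints, &target) != 0)
        return 0;

    /*
     * Correct: ai_addr is already a pointer to the socket address.
     * Casting &target->ai_addr instead would yield the location of the
     * pointer field inside struct addrinfo, not the address it points to.
     */
    remote_addr = target->ai_addr;

    if (remote_addr->sa_family == AF_INET)
        addr = ((struct sockaddr_in *)remote_addr)->sin_addr.s_addr;

    freeaddrinfo(target);
    return addr;
}

/* Small self-check: the loopback literal must come back unchanged. */
static void check_loopback(void)
{
    assert(resolved_ipv4("127.0.0.1") == htonl(INADDR_LOOPBACK));
}
```

With the extra `&`, the cast compiles silently, which is presumably why the bug went unnoticed until an IP remote address was actually used.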
Signed-off-by: Arlin Davis --- test/dtest/dtest.c | 2 +- test/dtest/dtestx.c | 22 +++++++++++++++++----- 2 files changed, 18 insertions(+), 6 deletions(-) diff --git a/test/dtest/dtest.c b/test/dtest/dtest.c index 2f418fe..75cbe4c 100755 --- a/test/dtest/dtest.c +++ b/test/dtest/dtest.c @@ -1037,7 +1037,7 @@ DAT_RETURN connect_ep(char *hostname, DAT_CONN_QUAL conn_id) getpid(), (rval >> 0) & 0xff, (rval >> 8) & 0xff, (rval >> 16) & 0xff, (rval >> 24) & 0xff, conn_id); - remote_addr = (DAT_IA_ADDRESS_PTR)&target->ai_addr; /* IP */ + remote_addr = (DAT_IA_ADDRESS_PTR)target->ai_addr; /* IP */ no_resolution: for (i = 0; i < 48; i++) /* simple pattern in private data */ pdata[i] = i + 1; diff --git a/test/dtest/dtestx.c b/test/dtest/dtestx.c index af87af0..6e17b6d 100755 --- a/test/dtest/dtestx.c +++ b/test/dtest/dtestx.c @@ -349,7 +349,7 @@ void process_conn(int idx) DAT_EVENT event; DAT_COUNT nmore; DAT_RETURN status; - int pdata; + int pdata, exp_event; DAT_IB_EXTENSION_EVENT_DATA *ext_event = (DAT_IB_EXTENSION_EVENT_DATA *) & event.event_extension_data[0]; DAT_CONNECTION_EVENT_DATA *conn_event = @@ -366,12 +366,17 @@ void process_conn(int idx) event.event_number, conn_event->private_data, conn_event->private_data_size); + if (ud_test) + exp_event = DAT_IB_UD_CONNECTION_EVENT_ESTABLISHED; + else + exp_event = DAT_CONNECTION_EVENT_ESTABLISHED; + /* Waiting on CR's or CONN_EST */ - if (event.event_number != DAT_CONNECTION_EVENT_ESTABLISHED || + if (event.event_number != exp_event || (ud_test && event.event_number != - DAT_IB_UD_CONNECTION_EVENT_ESTABLISHED)) { + DAT_IB_UD_CONNECTION_EVENT_ESTABLISHED)) { printf("unexpected event, !conn established: 0x%x\n", - event.event_number); + event.event_number); exit(1); } @@ -441,6 +446,13 @@ int connect_ep(char *hostname) if (local.ib.qp_type == IBV_QPT_UD) { ucm = 1; + + if (ud_test) { + printf("%d UD test over UCM provider not supported\n", + getpid()); + exit(1); + } + printf("%d Local uCM Address = QPN=0x%x, LID=0x%x\n", 
getpid(), ntohl(local.ib.qpn), ntohs(local.ib.lid)); @@ -618,7 +630,7 @@ int connect_ep(char *hostname) strcpy((char *)buf[SND_RDMA_BUF_INDEX], "Client written data"); - remote_addr = (DAT_IA_ADDRESS_PTR)&target->ai_addr; /* IP */ + remote_addr = (DAT_IA_ADDRESS_PTR)target->ai_addr; /* IP */ no_resolution: /* one Client EP, multiple Server EPs, same conn_qual -- 1.5.2.5 From arlin.r.davis at intel.com Thu Sep 3 12:43:44 2009 From: arlin.r.davis at intel.com (Arlin Davis) Date: Thu, 3 Sep 2009 12:43:44 -0700 Subject: [ofa-general] [PATCH] uDAPL v2: winof: Convert windows version of dapl and dat libaries to use private heaps. Message-ID: <57C0A190340A40B0AD5CB093F480B8B5@amr.corp.intel.com> This allows for better support of memory registration caching by upper level libaries (MPI) that use SecureMemoryCacheCallback. It also makes it easier to debug heap corruption issues. Signed-off-by: Sean Hefty --- dapl/openib_scm/cm.c | 2 +- dapl/udapl/windows/dapl_osd.c | 7 ++++++- dapl/udapl/windows/dapl_osd.h | 19 +++++++++++++------ dat/udat/windows/dat_osd.c | 5 +++++ dat/udat/windows/dat_osd.h | 7 +++++-- 5 files changed, 30 insertions(+), 10 deletions(-) diff --git a/dapl/openib_scm/cm.c b/dapl/openib_scm/cm.c index 06fff95..8560788 100644 --- a/dapl/openib_scm/cm.c +++ b/dapl/openib_scm/cm.c @@ -1845,7 +1845,7 @@ void cr_thread(void *arg) } dapl_os_unlock(&hca_ptr->ib_trans.lock); - free(set); + dapl_os_free(set, sizeof(struct dapl_fd_set)); out: hca_ptr->ib_trans.cr_state = IB_THREAD_EXIT; dapl_dbg_log(DAPL_DBG_TYPE_UTIL, " cr_thread(hca %p) exit\n", hca_ptr); diff --git a/dapl/udapl/windows/dapl_osd.c b/dapl/udapl/windows/dapl_osd.c index 8097560..eb409cd 100644 --- a/dapl/udapl/windows/dapl_osd.c +++ b/dapl/udapl/windows/dapl_osd.c @@ -48,7 +48,7 @@ #include #include /* needed for getenv() */ - +HANDLE heap; /* * DllMain @@ -75,6 +75,10 @@ DllMain ( switch( fdwReason ) { case DLL_PROCESS_ATTACH: + heap = HeapCreate(0, 0, 0); + if (heap == NULL) { + return FALSE; + } /* 
* We don't attach/detach threads that need any sort * of initialization, so disable this ability to optimize @@ -112,6 +116,7 @@ DllMain ( */ dapl_fini (); #endif + HeapDestroy(heap); break; } return TRUE; diff --git a/dapl/udapl/windows/dapl_osd.h b/dapl/udapl/windows/dapl_osd.h index 6266f1f..5fb9363 100644 --- a/dapl/udapl/windows/dapl_osd.h +++ b/dapl/udapl/windows/dapl_osd.h @@ -350,6 +350,8 @@ dapl_os_wait_object_destroy ( * Memory Functions */ +extern HANDLE heap; + /* function prototypes */ STATIC __inline void *dapl_os_alloc (int size); @@ -389,19 +391,18 @@ dapl_os_sync_rdma_write ( STATIC __inline void *dapl_os_alloc (int size) { - return malloc (size); + return HeapAlloc(heap, 0, size); } STATIC __inline void *dapl_os_realloc (void *ptr, int size) { - return realloc(ptr, size); + return HeapReAlloc(heap, 0, ptr, size); } STATIC __inline void dapl_os_free (void *ptr, int size) { - size = size; - free (ptr); - ptr = NULL; + UNREFERENCED_PARAMETER(size); + HeapFree(heap, 0, ptr); } STATIC __inline void * dapl_os_memzero (void *loc, int size) @@ -427,7 +428,13 @@ STATIC __inline unsigned int dapl_os_strlen(const char *str) STATIC __inline char * dapl_os_strdup(const char *str) { - return _strdup(str); + char *dup; + + dup = dapl_os_alloc(strlen(str) + 1); + if (!dup) + return NULL; + strcpy(dup, str); + return dup; } diff --git a/dat/udat/windows/dat_osd.c b/dat/udat/windows/dat_osd.c index 5b57f43..37f3087 100644 --- a/dat/udat/windows/dat_osd.c +++ b/dat/udat/windows/dat_osd.c @@ -124,6 +124,7 @@ dat_os_dbg_print ( } } +HANDLE heap; BOOL APIENTRY DllMain( @@ -138,12 +139,16 @@ DllMain( switch( ul_reason_for_call ) { case DLL_PROCESS_ATTACH: + heap = HeapCreate(0, 0, 0); + if (heap == NULL) + return FALSE; DisableThreadLibraryCalls( h_module ); udat_check_state(); break; case DLL_PROCESS_DETACH: dat_fini(); + HeapDestroy(heap); } return TRUE; diff --git a/dat/udat/windows/dat_osd.h b/dat/udat/windows/dat_osd.h index d78fe44..6941e46 100644 --- 
a/dat/udat/windows/dat_osd.h
+++ b/dat/udat/windows/dat_osd.h
@@ -244,11 +244,13 @@ dat_os_usleep(
  *                                                                   *
  *********************************************************************/
 
+extern HANDLE heap;
+
 STATIC INLINE void *
 dat_os_alloc (
     int size)
 {
-    return malloc (size);
+    return HeapAlloc(heap, 0, size);
 }
 
 STATIC INLINE void
@@ -256,7 +258,8 @@ dat_os_free (
     void *ptr,
     int size)
 {
-    free (ptr);
+    UNREFERENCED_PARAMETER(size);
+    HeapFree(heap, 0, ptr);
 }
 
 STATIC INLINE void *
-- 
1.5.2.5

From meenakshi.venkataraman at intel.com  Thu Sep  3 14:17:46 2009
From: meenakshi.venkataraman at intel.com (Venkataraman, Meenakshi)
Date: Thu, 3 Sep 2009 14:17:46 -0700
Subject: [ofa-general] perftest 1.2 write bw test woes
Message-ID:

Hi list,

I'm trying to run perftest-1.2's write_bw test for iWARP, and I'm getting the following error.

./ib_write_bw -s 512 -n 10000 -q 2 -g 120 -t 200
------------------------------------------------------------------
                    RDMA_Write BW Test
Number of qp's running 2
Connection type : RC
Each Qp will post up to 120 messages each time
Inline data is used up to 400 bytes message
  local address:  LID 0x01, QPN 0x00f0, PSN 0x5978cb RKey 0x406557b8 VAddr 0x00000000609200
  remote address: LID 0x01, QPN 0x00e9, PSN 0x9239b0, RKey 0x7083a8d2 VAddr 0x00000000609200
Mtu : 2048
Failed to modify RC QP to RTS

Any idea what the error is? I'm new to RDMA, and wonder if I've missed any parameter.

Just FYI, I'm running the OFED 1.5 nightly release (14.july.2009) on an NE020 driver.

Thanks!
Meenakshi Venkataraman
~ 42 ~

From bart at atipa.com  Thu Sep  3 14:26:56 2009
From: bart at atipa.com (Bart Willems)
Date: Thu, 3 Sep 2009 16:26:56 -0500
Subject: [ofa-general] mvapich_pgi rpm ignored by install.pl
Message-ID:

Hi All,

I'm trying to install OFED 1.4 using the install.pl script in the OFED-1.4-mlnx8.tgz tarball. All goes well except for the mvapich_pgi and mvapich2_pgi packages.
The install script creates the RPMs fine, but does not install them. Manual installation of the RPMs goes without problems as well. Any ideas on why the install.pl script neglects mvapich_pgi-1.1.0-3143.x86_64.rpm?

Thanks,
Bart

From skylar.atip at gmail.com  Thu Sep  3 14:41:51 2009
From: skylar.atip at gmail.com (skylaratip)
Date: Thu, 3 Sep 2009 16:41:51 -0500
Subject: [ofa-general] install.pl - mpitests-3.1-891.src.rpm errors.
Message-ID: <72b78d5d0909031441y6b0d71edq4f8cf1c846752163@mail.gmail.com>

Hello,

I am attempting to install OFED-1.4-mlnx8 with PGI using the install.pl script. Looking at the logs, mpitests-3.1-891.src.rpm fails because, from what I can tell, the install.pl script is passing GCC flags, which causes the installer to break. Any ideas for a way to fix this?

Thanks
Skylar

Here's an excerpt from the log.

cd /var/tmp/OFED_topdir/BUILD/mpitests-3.1/osu_benchmarks-3.0 && make MPIHOME=/opt/ofed/mpi/pgi/openmpi-1.2.8
make[1]: Entering directory `/var/tmp/OFED_topdir/BUILD/mpitests-3.1/osu_benchmarks-3.0'
/opt/ofed/mpi/pgi/openmpi-1.2.8/bin/mpicc -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -I/opt/ofed/include -c -o osu_bcast.o osu_bcast.c
pgcc-Error-Unknown switch: -pipe
pgcc-Error-Unknown switch: -Wall
pgcc-Error-Unknown switch: -Wp,-D_FORTIFY_SOURCE=2
pgcc-Error-Unknown switch: -fexceptions
pgcc-Error-Unknown switch: -fstack-protector
pgcc-Error-Unknown switch: --param=ssp-buffer-size=4
pgcc-Error-Unknown switch: -m64
pgcc-Error-Unknown switch: -mtune=generic
make[1]: *** [osu_bcast.o] Error 1
make[1]: Leaving directory `/var/tmp/OFED_topdir/BUILD/mpitests-3.1/osu_benchmarks-3.0'
make: *** [osu] Error 2
error: Bad exit status from /var/tmp/rpm-tmp.46299 (%install)
-------------- next part -------------- An HTML attachment was scrubbed...
URL: From chien.tin.tung at intel.com Thu Sep 3 15:37:35 2009 From: chien.tin.tung at intel.com (Tung, Chien Tin) Date: Thu, 3 Sep 2009 15:37:35 -0700 Subject: [ofa-general] RE: perftest 1.2 write bw test woes In-Reply-To: References: Message-ID: <60BEFF3FBD4C6047B0F13F205CAFA383036E195C09@azsmsx501.amr.corp.intel.com> > ./ib_write_bw -s 512 -n 10000 -q 2 -g 120 -t 200 You will have better luck using ib_rdma_bw with iWARP adapters and -c argument. > Just FYI, I'm running the OFED 1.5 nightly release > (14.july.2009) on an NE020 driver. Don't expect to get support for pre-release builds. It is best if you stick with OFED 1.4.1(or 2). Lastly, wrong list. you want to ask OFED related questions on ewg. Chien From worleys at gmail.com Thu Sep 3 16:20:42 2009 From: worleys at gmail.com (Chris Worley) Date: Thu, 3 Sep 2009 17:20:42 -0600 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4A9FA945.4070408@vlnb.net> Message-ID: On Thu, Sep 3, 2009 at 11:38 AM, Chris Worley wrote: > On Thu, Sep 3, 2009 at 5:32 AM, Vladislav Bolkhovitin wrote: >> Chris Worley, on 09/03/2009 08:08 AM wrote: >>> >>> On Wed, Sep 2, 2009 at 2:58 PM, Chris Worley wrote: >>>> >>>> On Wed, Sep 2, 2009 at 2:00 PM, Bart Van Assche >>>> wrote: >>>>> >>>>> On Wed, Sep 2, 2009 at 9:53 PM, Chris Worley wrote: >>>>>> >>>>>> On Wed, Sep 2, 2009 at 1:31 PM, Bart Van >>>>>> Assche wrote: >>>>>>> >>>>>>> On Tue, Sep 1, 2009 at 1:04 AM, Chris Worley wrote: >>>>>>>> >>>>>>>> [ ... ] >>>>>>>> I've found a good kernel/scst mix to easily repeat this; I can get it >>>>>>>> to repeatedly hang w/ 8K block transfers running Ubuntu 9.04 w/ the >>>>>>>> 2.6.27-14-server kernel on _both_ target and initiator (i.e. no WinOF >>>>>>>> or OFED at all) and SCST rev 1062 on the target using one drive >>>>>>>> (performance is >600MB/s, >80K IOPS, on the 8KB block sizes being >>>>>>>> used). >>>>>>>> [ ... 
] >>>>>>> >>>>>>> Is there a special reason why you are using the 2.6.27-14-server >>>>>>> kernel ? AFAIK the latest Ubuntu 9.04 kernel is 2.6.28-15-server. >>>>>> >>>>>> No special reason other than it didn't get upgraded w/ the rest of the >>>>>> distro... started w/ 8.10. >>>> >>>> I'm upgrading too, to 9.04. >>> >>> I tried the 2.6.28-15-server kernel (along w/ the 9.04 upgrade), and >>> it does repeat the issue. >>> >>> In trying to build a kernel w/ lockdep support as Vlad requested, my >>> lack of Debian knowledge shone through, and, although I believe I >>> followed all the instructions correctly, I'm not sure if I have a >>> 2.6.28-15 or 2.6.28-10 kernel.  Anyway, the issue is still repeatable. >>> >>> Whatever kernel that is, I have SRP hung currently.  What should I >>> look for in /proc/lockd*? >>> >>> I don't think it's a kernel lock... I think it's a protocol lock, as I >>> can rmmod the target kernel modules (scst_vdisk, scst, and ib_srpt) >>> when the initiator gets in this state. >> >> Since you can rmmod SCST modules, then this shouldn't be SCST or backstorage >> SW/HW issue, because that means there are no stuck or lost SCSI commands. > > At least on the target side.  The initiator could think there are > outstanding commands, when they were actually lost on the target (or > the target completed them, and the initiator is in error not thinking > they are completed). > >> So, it should be issue of either SRP target/initiator, or OFED on the target >> or initiator, or your IB hardware on any node. > > I've used a couple of initiators (different systems) w/ different > OSes, w/ different IB cards (all QDR) and different IB stacks > (built-in vs. OFED) and can repeat the problem in all but the > RHEL5.2/OFED 1.4.1 target and initiator (but, if the initiator is > WinOF and the target is RHEL5.2/OFED1.4.1, then the problem does > repeat). 
Here's a twist: I used the Ubuntu initiator w/ one of the RHEL targets, and the RHEL initiator (same machine as was running WinOF from the beginning of this thread) w/ one of the Ubuntu targets: in both cases, the problem does not repeat. That makes it sound like OFED is the cure on either side of the connection, but does not explain the issue w/ WinOF (which does fail w/ either Ununtu or RHEL targets). Chris > >> >> You should enable lockdep on both target and initiator (better with other >> kernel debug facilities enabled, see the attached file as a sample) and >> reproduce the issue. > > That's done and reported in another response; it doesn't seem to be a > lock issue. > >> There is a big chance that those facilities will spot >> what's going on wrong there. > > I applied the .config changes you suggested, and the kernel was > certainly more verbose, but I don't think added any information.  When > the drives are attached over SRP, I see the following message: > > [  454.317328] sd 4:0:0:3: [sde] Attached SCSI disk > [  454.317340] kobject: 'scsi_device' (ffff8804234a3aa0): > kobject_add_internal: parent: '4:0:0:3', set: '' > [  454.317350] kobject: '4:0:0:3' (ffff880423cd2780): > kobject_add_internal: parent: 'scsi_device', set: 'devices' > [  454.317378] kobject: '4:0:0:3' (ffff880423cd2780): kobject_uevent_env > [  454.317390] kobject: '4:0:0:3' (ffff880423cd2780): fill_kobj_path: > path = '/devices/pci0000:00/0000:00:01.0/0000:01:00.0/host4/target4:0:0/4:0:0:3/scsi_device/4:0:0:3' > [  454.317437] kobject: 'scsi_generic' (ffff8804234a3c38): > kobject_add_internal: parent: '4:0:0:3', set: '' > [  454.317447] kobject: 'sg5' (ffff88042ac4ecb8): > kobject_add_internal: parent: 'scsi_generic', set: 'devices' > [  454.317489] kobject: 'sg5' (ffff88042ac4ecb8): kobject_uevent_env > [  454.317500] kobject: 'sg5' (ffff88042ac4ecb8): fill_kobj_path: path > = '/devices/pci0000:00/0000:00:01.0/0000:01:00.0/host4/target4:0:0/4:0:0:3/scsi_generic/sg5' > [  454.317523] sd 
4:0:0:3: Attached scsi generic sg5 type 0 > > Is there somewhere else to look for problems? > > Thanks, > > Chris >> >> Vlad >> >>> Thanks, >>> >>> Chris >>>> >>>> Chris >>>>>> >>>>>> Do you think that kernel is better? >>>>> >>>>> I noticed this while trying to reproduce this issue. I have no opinion >>>>> yet about which of these two kernels is better. I'll downgrade the >>>>> Ubuntu kernel in my setup. >>>>> >>>>> Bart. >>>>> >>> >>> >>> ------------------------------------------------------------------------------ >>> Let Crystal Reports handle the reporting - Free Crystal Reports 2008 >>> 30-Day trial. Simplify your report design, integration and deployment - and >>> focus on what you do best, core application coding. Discover what's new with >>> Crystal Reports now.  http://p.sf.net/sfu/bobj-july >>> _______________________________________________ >>> Scst-devel mailing list >>> Scst-devel at lists.sourceforge.net >>> https://lists.sourceforge.net/lists/listinfo/scst-devel >>> >> >> > From jeff.johnson at aeoncomputing.com Thu Sep 3 17:27:01 2009 From: jeff.johnson at aeoncomputing.com (Jeff Johnson) Date: Thu, 3 Sep 2009 17:27:01 -0700 Subject: [ofa-general] Cannot export multiple directories using nfs-rdma Message-ID: I have a nfs-rdma configuration using Mellanox ConnectX-DDR, ofed-1.4.2 on Centos 5.3 x86_64. My ConnectX cards are running 2.5.0 firmware as I have read that 2.6.0 had rdma issues. I saw these issues and down rev'd the cards to 2.5.0. I am seeing a peculiar behavior where if I export two separate directories from the server and attempt to mount them separately from a client I end up with the same export mounted to two different client directories. 
e.g.: server:/raid1 server:/raid2 'mount.rnfs 10.0.0.251:/raid1 /raid1 -i -o rdma,port=2050' client:/raid1 <---has server:/raid1 contents 'mount.rnfs 10.0.0.251:/raid2 /raid2 -i -o rdma,port=2050' client:/raid2 <---has server:/raid1 contents I have tried creating multiple rdma ports on the server (2050 and 2051) and then using different ports for each separate mount. The result is the same. I have verified that I am indeed mounting rdma and not merely ipoib. Is nfs-rdma capable of multiple exports? If so, I cannot find a method for dealing with multiple exports from the server or client side in any ofed docs. Thanks for any assistance. ------------------------------ Jeff Johnson Manager Aeon Computing jeff.johnson at aeoncomputing.com t: 858-412-3810 f: 858-412-3845 m: 619-204-9061 4905 Morena Boulevard, Suite 1313 - San Diego, CA 92117 From bart.vanassche at gmail.com Thu Sep 3 23:35:20 2009 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Fri, 4 Sep 2009 08:35:20 +0200 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4A9FA945.4070408@vlnb.net> Message-ID: On Fri, Sep 4, 2009 at 1:20 AM, Chris Worley wrote: > Here's a twist: I used the Ubuntu initiator w/ one of the RHEL > targets, and the RHEL initiator (same machine as was running WinOF > from the beginning of this thread) w/ one of the Ubuntu targets: in > both cases, the problem does not repeat. > > That makes it sound like OFED is the cure on either side of the > connection, but does not explain the issue w/ WinOF (which does fail > w/ either Ubuntu or RHEL targets). It might be a good idea to report the WinOF behavior via the OFED bugzilla (https://bugs.openfabrics.org/). Bart.
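The pass/fail combinations discussed in this SRP thread can be checked mechanically against the hypothesis that exactly one component is at fault. What follows is only a rough sketch, not anything posted to the list: the result table is a paraphrase of the combinations described in the thread (the Ubuntu-to-Ubuntu failure is inferred from "all but the RHEL5.2/OFED 1.4.1 target and initiator"), the stack labels and the function name single_fault_candidates are made up for illustration, and the model is coarser than a real analysis would be, since it treats each side's whole stack as one component rather than separating the SRP driver from the IB driver.

```python
from typing import List, Tuple

# (initiator stack, target stack, passed?) -- labels are hypothetical,
# results are paraphrased from the thread and should be treated as illustrative.
Test = Tuple[str, str, bool]

THREAD_RESULTS: List[Test] = [
    ("rhel-ofed", "rhel-ofed", True),
    ("ubuntu", "rhel-ofed", True),    # the "twist"
    ("rhel-ofed", "ubuntu", True),    # the "twist"
    ("ubuntu", "ubuntu", False),      # inferred, not stated explicitly
    ("winof", "rhel-ofed", False),
    ("winof", "ubuntu", False),
]


def single_fault_candidates(tests: List[Test]) -> List[Tuple[str, str]]:
    """Return every (role, stack) that alone explains the results.

    A component explains the results if every test that uses it fails
    and every test that does not use it passes.
    """
    components = sorted(
        {("initiator", i) for i, _, _ in tests}
        | {("target", t) for _, t, _ in tests}
    )
    candidates = []
    for role, stack in components:
        idx = 0 if role == "initiator" else 1
        # "uses the component" must coincide exactly with "failed".
        if all((test[idx] == stack) == (not test[2]) for test in tests):
            candidates.append((role, stack))
    return candidates


if __name__ == "__main__":
    print(single_fault_candidates(THREAD_RESULTS))
```

With the table as paraphrased above, no single component is consistent with all six results, which fits the observation that OFED on either side seems to help while WinOF still fails. Dropping the inferred Ubuntu-to-Ubuntu failure leaves the WinOF initiator as the only candidate, so the conclusion depends heavily on how repeatable each individual result is.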
From vlad at lists.openfabrics.org Fri Sep 4 03:06:23 2009 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Fri, 4 Sep 2009 03:06:23 -0700 (PDT) Subject: [ofa-general] ofa_1_5_kernel 20090904-0200 daily build status Message-ID: <20090904100623.9514EE61F0B@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git git_branch: ofed_kernel_1_5 Common build parameters: Passed: Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.26 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.27 Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18-128.el5 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.27 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.25 Passed on ia64 with linux-2.6.26 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Failed: Build failed on x86_64 with linux-2.6.16.60-0.21-smp Log: \ -I/home/vlad/kernel.org/x86_64/linux-2.6.16.60-0.21-smp/arch//include \ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -Werror-implicit-function-declaration -fno-strict-aliasing -fno-common -ffreestanding -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DMODULE 
-D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(cong)" -D"KBUILD_MODNAME=KBUILD_STR(rds)" -c -o /home/vlad/tmp/ofa_1_5_kernel-20090904-0200_linux-2.6.16.60-0.21-smp_x86_64_check/net/rds/.tmp_cong.o /home/vlad/tmp/ofa_1_5_kernel-20090904-0200_linux-2.6.16.60-0.21-smp_x86_64_check/net/rds/cong.c /home/vlad/tmp/ofa_1_5_kernel-20090904-0200_linux-2.6.16.60-0.21-smp_x86_64_check/net/rds/cong.c:36:35: error: asm-generic/bitops/le.h: No such file or directory make[3]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090904-0200_linux-2.6.16.60-0.21-smp_x86_64_check/net/rds/cong.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090904-0200_linux-2.6.16.60-0.21-smp_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_5_kernel-20090904-0200_linux-2.6.16.60-0.21-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.60-0.21-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-67.ELsmp Log: from /home/vlad/tmp/ofa_1_5_kernel-20090904-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/rds.h:4, from /home/vlad/tmp/ofa_1_5_kernel-20090904-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/cong.c:38: include/linux/skbuff.h: In function 'skb_add_data': include/linux/skbuff.h:1041: warning: pointer targets in passing argument 2 of 'csum_partial_copy_from_user' differ in signedness make[3]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090904-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/cong.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090904-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_5_kernel-20090904-0200_linux-2.6.9-67.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-67.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with 
linux-2.6.9-78.ELsmp Log: from /home/vlad/tmp/ofa_1_5_kernel-20090904-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/rds.h:4, from /home/vlad/tmp/ofa_1_5_kernel-20090904-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/cong.c:38: include/linux/skbuff.h: In function 'skb_add_data': include/linux/skbuff.h:1041: warning: pointer targets in passing argument 2 of 'csum_partial_copy_from_user' differ in signedness make[3]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090904-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/cong.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090904-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_5_kernel-20090904-0200_linux-2.6.9-78.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-78.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- From vst at vlnb.net Fri Sep 4 05:49:41 2009 From: vst at vlnb.net (Vladislav Bolkhovitin) Date: Fri, 04 Sep 2009 16:49:41 +0400 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4A9FA945.4070408@vlnb.net> Message-ID: <4AA10CE5.302@vlnb.net> Bart Van Assche, on 09/04/2009 10:35 AM wrote: > On Fri, Sep 4, 2009 at 1:20 AM, Chris Worley wrote: >> Here's a twist: I used the Ubuntu initiator w/ one of the RHEL >> targets, and the RHEL initiator (same machine as was running WinOF >> from the beginning of this thread) w/ one of the Ubuntu targets: in >> both cases, the problem does not repeat. >> >> That makes it sound like OFED is the cure on either side of the >> connection, but does not explain the issue w/ WinOF (which does fail >> w/ either Ubuntu or RHEL targets). > > It might be a good idea to report the WinOF behavior via the OFED > bugzilla (https://bugs.openfabrics.org/). I agree. It is 99% likely that this is OFED's problem, and 90% likely that it is on the initiator's side. Something must be racy there.
Vlad From ranjit.pandit.ib at gmail.com Fri Sep 4 11:27:41 2009 From: ranjit.pandit.ib at gmail.com (pandit ib) Date: Fri, 4 Sep 2009 11:27:41 -0700 Subject: [ofa-general] Re: [ewg] Update from September OpenFabrics Interoperability Event at UNH-IOL In-Reply-To: <48FE05F0.8070608@iol.unh.edu> References: <48FB2C81.3080301@mellanox.co.il> <4D511C95BE7F4D8E92B2BAE9E0D67AA0@annapurna> <48FE05F0.8070608@iol.unh.edu> Message-ID: <96f8e60e0909041127t1bcdea09t9ea20ec93a8f6f30@mail.gmail.com> Has there been any new interoperability testing between the iWARP vendors since Oct 08? Ranjit On Tue, Oct 21, 2008 at 9:40 AM, Bob Noseworthy wrote: > Greetings EWG members, >  A bug for the observed IPoIB issue was logged last Friday,  and updated > yesterday confirming that RC3 still demonstrates the issue. This is logged > as #1287 --  https://bugs.openfabrics.org/show_bug.cgi?id=1287 > > Further issues/observations from the recent OFA Interoperability Logo > Group's September Interoperability Event are at the end of this email. > Summary of reported IPoIB issue: > If IPoIB datagram mode is enabled,  and IP frames of 8K or larger are sent, >  and no ARP entry exists for the destination,  then the first IP frame is > always lost (ping used),  no matter what the timeout is set to (as high as > 15s) > > > The following is a short summary of various updates from the September > OpenFabrics Interoperability Event.  Due to confidentiality reasons, many > details are occluded.  Per the request of the IWG on Oct 14, this > information is being shared with the EWG. > > ================== > > > Below are rough notes from our testers, principally Nick Wood and Mike > Hagen. > IB update; > > 1. An SDP issue was observed once and not reproduced - suspected to be an > issue with starting testing too soon after netserver was started while all > three SDP tests were running simultaneously.   When retesting was performed > tests were not run simultaneously and no issues were seen. > > 2. 
An SRP issue was observed once and not reproduced - A vendor's SRP target > was seen to become unresponsive when srp_sg_tablesize was increased to 255. > Subsequent testing did not reproduce this behavior but is still being > pursued. > > 2a. A vendor's HCA was seen to perform slowly on SRP transfers; this was > traced to the default srp_sg_tablesize of 16, which had to be > increased to 131 for reasonable performance. Reminder - performance is > outside the scope of the Logo program. Tziporet - this default value perhaps > could be increased as recommended unless there is a reason 16 is preferred. > > > > 3. There is a link issue between two vendors' HCA cards. The fix that > was introduced allowed the link indication light to come up; however, > ibdiagnet never completes (hangs at IPoIB subnets check) and had to be > killed. Ibdiagnet also reports the following error: > > > -I--------------------------------------------------- > -I- PM Counters Info > -I--------------------------------------------------- > -E- Could not get PM info: > "pmGetPortCounters 0xffff 1" failed 4 consecutive times. > -E- Could not get PM info: > "pmGetPortCounters 0xffff 1" failed 4 consecutive times. > -I- No illegal PM counters values were found > > This happens with both VendorA cards when linked to any speed card from > VendorB *without* an sm running. If there is an sm running and the fix is in > place on the machines housing the VendorA cards then everything works > flawlessly when linked with any speed VendorB card. > > Upon removal of the cable from the VendorA card, that card gets put into a > bad state; with the fix in place and an sm running. The sm does not activate > the newly established link. This happened with VendorA cards to any VendorB > card. OpenSM also reports an error on screen; OpenSM: SM port is down. > Reestablishing the connection that was in place when the opensm instance was > started restores the active state.
> > One final bit of information that I have been able to glean. It does not > appear to matter if you restore the original connection that the opensm was > started on. The only connection that brings the card back to an active state > is if you link it with a qdr hca even if that connection was not the > original. If you then attempt to restore the original, the active state will > not be restored. > Currently this issue is presumed to be principally a vendor matter, but if > evidence points to additional issues with ibdiagnet, or other OFED matters, > then bugs will be filed. > > > 4. Similar to the above issue, it was observed that two vendors' HCAs that > should link at DDR when directly connected were actually linking at SDR > speeds, regardless of the cable used. This is a known issue; however, it seems > to be a failure of the Link Init test procedure as the highest denominator > speed is not achieved. > > 5. An issue with ibdiagnet was discovered by a vendor and bugs submitted > (unrelated to issue 3 above) > > ================== > > iWARP update; > > 1. "dapltest -T P" will not work between two  cards. They both have > implemented a different peer2peer protocol that ensures that a client does a > transfer before the server, to overcome the limitation in the iWARP standard > that says a client must send the first data or the connection must be torn > down. > > 2. The section in the IWG test suite covering dapl must be updated to > include at least some reference to /etc/dat.conf which must be configured in > order to use any dapl based application including many MPIs and dapltest. > (This was being addressed by Arlin Davis) > > 3. dapl2.0 and dapltest2.0 do not work with iWARP devices. From the base > OFED1.4 install dapl2.0-utils must be uninstalled and compat-dapl must be > installed from the OFED website. > > 4. Due to the dapl problems, Intel MPI works in single vendor environments > but will not work in multi-vendor environments. > > 5.
The default OpenMPI installed with OFED 1.4 is version 1.2.7.  iWARP > support is officially not added until OpenMPI 1.3. > > 6. Loopback functionality is still not seen by all vendors.  (this has > relevance to OFED feature enhancement #1275 > > > 7.  Dynamic links support was not seen by all vendors when using Intel MPI. > > > ================== > ================== > > > Testing is ongoing with RC3 and future 1.4RCs on a best effort basis until > the GA, at which time the Logo Event will be held for those participating. >  If you have additional questions about these comments,  the > Interoperability Events, Logo Events,  or the OFA Interoperability Test > Plan, please feel free to contact us here at UNH-IOL,  our OFA > Interoperability Logo Group team can be reached at ofalab at iol.unh.edu. > > The testplan, logo list and past logo reports can be reviewed at > http://www.iol.unh.edu/services/testing/ofa/ > > > Best Regards, > - Bob Noseworthy >  Chief Engineer / Technical Sherpa >    +1-909-891-0090 {unified phone number for office, cell, etc} >  +1-603-862-0090 {IOL Main number-associate this with any shipments} >  UNH-IOL > > > > > > > > > > Rupert Dance wrote: >> >> I have sent another reminder to UNH IOL to get this logged. I will >> continue >> to follow up on this. >> >> Thanks >> >> Rupert >> -----Original Message----- >> From: Tziporet Koren [mailto:tziporet at dev.mellanox.co.il] Sent: Sunday, >> October 19, 2008 8:48 AM >> To: Rupert Dance >> Cc: EWG >> Subject: Have you opened bugs to OFED 1.4 >> I mean the bugs you explained in the last OFED meeting. 
>> >> Thanks >> Tziporet >> >> > > _______________________________________________ > ewg mailing list > ewg at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg > From vlad at lists.openfabrics.org Sat Sep 5 03:07:35 2009 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Sat, 5 Sep 2009 03:07:35 -0700 (PDT) Subject: [ofa-general] ofa_1_5_kernel 20090905-0200 daily build status Message-ID: <20090905100735.5DC1CE61F74@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git git_branch: ofed_kernel_1_5 Common build parameters: Passed: Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.26 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.27 Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18-128.el5 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.27 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.25 Passed on ia64 with linux-2.6.26 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Failed: Build failed on x86_64 with linux-2.6.16.60-0.21-smp Log: \ -I/home/vlad/kernel.org/x86_64/linux-2.6.16.60-0.21-smp/arch//include \ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -Werror-implicit-function-declaration -fno-strict-aliasing -fno-common -ffreestanding -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare 
-fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(cong)" -D"KBUILD_MODNAME=KBUILD_STR(rds)" -c -o /home/vlad/tmp/ofa_1_5_kernel-20090905-0200_linux-2.6.16.60-0.21-smp_x86_64_check/net/rds/.tmp_cong.o /home/vlad/tmp/ofa_1_5_kernel-20090905-0200_linux-2.6.16.60-0.21-smp_x86_64_check/net/rds/cong.c /home/vlad/tmp/ofa_1_5_kernel-20090905-0200_linux-2.6.16.60-0.21-smp_x86_64_check/net/rds/cong.c:36:35: error: asm-generic/bitops/le.h: No such file or directory make[3]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090905-0200_linux-2.6.16.60-0.21-smp_x86_64_check/net/rds/cong.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090905-0200_linux-2.6.16.60-0.21-smp_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_5_kernel-20090905-0200_linux-2.6.16.60-0.21-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.60-0.21-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-67.ELsmp Log: from /home/vlad/tmp/ofa_1_5_kernel-20090905-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/rds.h:4, from /home/vlad/tmp/ofa_1_5_kernel-20090905-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/cong.c:38: include/linux/skbuff.h: In function 'skb_add_data': include/linux/skbuff.h:1041: warning: pointer targets in passing argument 2 of 'csum_partial_copy_from_user' differ in signedness make[3]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090905-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/cong.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090905-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_5_kernel-20090905-0200_linux-2.6.9-67.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory 
`/home/vlad/kernel.org/x86_64/linux-2.6.9-67.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-78.ELsmp Log: from /home/vlad/tmp/ofa_1_5_kernel-20090905-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/rds.h:4, from /home/vlad/tmp/ofa_1_5_kernel-20090905-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/cong.c:38: include/linux/skbuff.h: In function 'skb_add_data': include/linux/skbuff.h:1041: warning: pointer targets in passing argument 2 of 'csum_partial_copy_from_user' differ in signedness make[3]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090905-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/cong.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090905-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_5_kernel-20090905-0200_linux-2.6.9-78.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-78.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- From vlad at lists.openfabrics.org Sun Sep 6 03:06:10 2009 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Sun, 6 Sep 2009 03:06:10 -0700 (PDT) Subject: [ofa-general] ofa_1_5_kernel 20090906-0200 daily build status Message-ID: <20090906100611.2805FE61F15@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git git_branch: ofed_kernel_1_5 Common build parameters: Passed: Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.26 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.27 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18-128.el5 Passed on x86_64 with linux-2.6.19 Passed 
on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.27 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.25 Passed on ia64 with linux-2.6.26 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Failed: Build failed on x86_64 with linux-2.6.9-67.ELsmp Log: /home/vlad/tmp/ofa_1_5_kernel-20090906-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/tcp_listen.c:71: error: 'struct inet_sock' has no member named 'daddr' /home/vlad/tmp/ofa_1_5_kernel-20090906-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/tcp_listen.c:71: error: 'struct inet_sock' has no member named 'dport' /home/vlad/tmp/ofa_1_5_kernel-20090906-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/tcp_listen.c:73: error: 'struct inet_sock' has no member named 'saddr' /home/vlad/tmp/ofa_1_5_kernel-20090906-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/tcp_listen.c:73: error: 'struct inet_sock' has no member named 'daddr' make[3]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090906-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/tcp_listen.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090906-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_5_kernel-20090906-0200_linux-2.6.9-67.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-67.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-78.ELsmp Log: /home/vlad/tmp/ofa_1_5_kernel-20090906-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/tcp_listen.c:71: error: 
'struct inet_sock' has no member named 'daddr' /home/vlad/tmp/ofa_1_5_kernel-20090906-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/tcp_listen.c:71: error: 'struct inet_sock' has no member named 'dport' /home/vlad/tmp/ofa_1_5_kernel-20090906-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/tcp_listen.c:73: error: 'struct inet_sock' has no member named 'saddr' /home/vlad/tmp/ofa_1_5_kernel-20090906-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/tcp_listen.c:73: error: 'struct inet_sock' has no member named 'daddr' make[3]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090906-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/tcp_listen.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090906-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_5_kernel-20090906-0200_linux-2.6.9-78.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-78.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- From tziporet at mellanox.co.il Sun Sep 6 04:37:39 2009 From: tziporet at mellanox.co.il (Tziporet Koren) Date: Sun, 6 Sep 2009 14:37:39 +0300 Subject: [ofa-general] OFED 1.5-alpha 4 and RHEL 5.3 GA In-Reply-To: <1251473851.10055.3.camel@halves-ltc> References: <1251473851.10055.3.camel@halves-ltc> Message-ID: <2ED289D4E09FBD4D92D911E869B97FDDA11A6B@mtlexch01.mtl.com> Vlad/jack What should be fixed? tziporet -----Original Message----- From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Higor Aparecido Vieira Alves Sent: Friday, August 28, 2009 6:38 PM To: OpenIB Subject: [ofa-general] OFED 1.5-alpha 4 and RHEL 5.3 GA Hi Guys, I tried build OFED1.5 on RHEL 5.3 GA and got an error to build ofa_kernel. Build log attached. 
Regards, -- Higor Aparecido Vieira Alves Software Engineer Linux Technology Center IBM Systems & Technology Group From kliteyn at dev.mellanox.co.il Sun Sep 6 06:01:01 2009 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Sun, 06 Sep 2009 16:01:01 +0300 Subject: [ofa-general] Re: [PATCH] ibutils: ibdiagnet -r "Dead end" errors In-Reply-To: <20090707232243.GJ15871@sgi.com> References: <20090707232243.GJ15871@sgi.com> Message-ID: <4AA3B28D.5060101@dev.mellanox.co.il> akepner at sgi.com wrote: > On a cluster running sles11 and OFED 1.4, we recently started seeing errors > like this: > > # ibdiagnet -r > ..... > -I- > -I- Verifying all CA to CA paths ... > -E- Unassigned LFT for lid:1 Dead end at:S0800690000004057/U1 > -E- Fail to find a path from:r1i0n9/U1/1 to:r1lead/U1/1 > ... > > > But the forwarding tables (obtained with dump_lfts.sh, and smpdump) > are correct. The problem turned out to be that the string "-lft" was > being interpreted as a port number, resulting in an off-by-one error. > > The following fixed it for us. > > Signed-off-by: Arthur Kepner > --- > Thanks, applied. -- Yevgeny From kliteyn at mellanox.co.il Sun Sep 6 06:03:11 2009 From: kliteyn at mellanox.co.il (Yevgeny Kliteynik) Date: Sun, 06 Sep 2009 16:03:11 +0300 Subject: [ofa-general] [PATCH] ibdm/ibnl/* ibnl definition files for Sun IB QDR products In-Reply-To: <4A97A7C0.8010808@Sun.COM> References: <4A97A7C0.8010808@Sun.COM> Message-ID: <4AA3B30F.1080301@mellanox.co.il> Lars Paul Huse wrote: > ibnl definition files for Sun IB QDR products: > - 48 port QNEM > - 36 port Switch > - 72 port Switch > - 648 port Switch > > Signed-off-by: Lars Paul Huse > > Thanks, applied. 
-- Yevgeny From kliteyn at dev.mellanox.co.il Sun Sep 6 06:08:56 2009 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Sun, 06 Sep 2009 16:08:56 +0300 Subject: [ofa-general] Re: [PATCH 0/6] ibutils: Build fixes for FC11 In-Reply-To: <20090902120353.3ee1a8e2@frecb007965> References: <20090902120353.3ee1a8e2@frecb007965> Message-ID: <4AA3B468.8010802@dev.mellanox.co.il> Sebastien, sebastien dugue wrote: > Hi, > > here are some fixes I had to apply in order to be able to build under FC11 > due to some changes in the toolchain. > > Sebastien. > Thanks. I've checked in all but patch 3/6. See details in the mail. -- Yevgeny From kliteyn at dev.mellanox.co.il Sun Sep 6 06:09:30 2009 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Sun, 06 Sep 2009 16:09:30 +0300 Subject: [ofa-general] Re: [PATCH 1/6] ibutils/ibdm: Fix 'invalid conversion from const char* to char*' build error In-Reply-To: <20090902120516.57ca4e2b@frecb007965> References: <20090902120353.3ee1a8e2@frecb007965> <20090902120516.57ca4e2b@frecb007965> Message-ID: <4AA3B48A.8020203@dev.mellanox.co.il> sebastien dugue wrote: > This occurs under FC11 with gcc 4.4.0-4. > > Signed-off-by: Sebastien Dugue > Applied, thanks. -- Yevgeny From kliteyn at dev.mellanox.co.il Sun Sep 6 06:09:48 2009 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Sun, 06 Sep 2009 16:09:48 +0300 Subject: [ofa-general] Re: [PATCH 2/6] ibutils/ibdm: Add -fPIC to libreplace build In-Reply-To: <20090902120551.5bfd3fe0@frecb007965> References: <20090902120353.3ee1a8e2@frecb007965> <20090902120551.5bfd3fe0@frecb007965> Message-ID: <4AA3B49C.4050709@dev.mellanox.co.il> sebastien dugue wrote: > This allows to build under FC11. 
Otherwise, building shared libraries using > libreplace results in the following error: > > .../ibutils/ibdm/replace/libreplace.a(regex.o): relocation R_X86_64_32S against > `a local symbol' can not be used when making a shared object; > recompile with -fPIC > .../ibutils/ibdm/replace/libreplace.a: could not read symbols: Bad value > > Signed-off-by: Sebastien Dugue Applied, thanks. -- Yevgeny From kliteyn at dev.mellanox.co.il Sun Sep 6 06:10:19 2009 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Sun, 06 Sep 2009 16:10:19 +0300 Subject: [ofa-general] Re: [PATCH 4/6] ibutils: Add libibsysapi.so to the spec file In-Reply-To: <20090902120755.6a6ad964@frecb007965> References: <20090902120353.3ee1a8e2@frecb007965> <20090902120755.6a6ad964@frecb007965> Message-ID: <4AA3B4BB.1090502@dev.mellanox.co.il> sebastien dugue wrote: > > > Signed-off-by: Sebastien Dugue Applied, thanks. -- Yevgeny From kliteyn at dev.mellanox.co.il Sun Sep 6 06:11:13 2009 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Sun, 06 Sep 2009 16:11:13 +0300 Subject: [ofa-general] Re: [PATCH 5/6] ibutils: Allow parallel build In-Reply-To: <20090902120911.28e49f38@frecb007965> References: <20090902120353.3ee1a8e2@frecb007965> <20090902120911.28e49f38@frecb007965> Message-ID: <4AA3B4F1.60708@dev.mellanox.co.il> sebastien dugue wrote: > > > Signed-off-by: Sebastien Dugue Applied, thanks. -- Yevgeny From kliteyn at dev.mellanox.co.il Sun Sep 6 06:11:35 2009 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Sun, 06 Sep 2009 16:11:35 +0300 Subject: [ofa-general] Re: [PATCH 6/6] ibutils: Fix libibsysapi build for old autotools In-Reply-To: <20090902120958.27fe75c5@frecb007965> References: <20090902120353.3ee1a8e2@frecb007965> <20090902120958.27fe75c5@frecb007965> Message-ID: <4AA3B507.9090903@dev.mellanox.co.il> sebastien dugue wrote: > Signed-off-by: Sebastien Dugue Applied, thanks. 
-- Yevgeny From kliteyn at dev.mellanox.co.il Sun Sep 6 06:13:36 2009 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Sun, 06 Sep 2009 16:13:36 +0300 Subject: [ofa-general] Re: [PATCH 3/6] ibutils/ibdm: Fix libibsysapi build In-Reply-To: <20090902120646.0bc3db4b@frecb007965> References: <20090902120353.3ee1a8e2@frecb007965> <20090902120646.0bc3db4b@frecb007965> Message-ID: <4AA3B580.2000807@dev.mellanox.co.il> Sebastien, sebastien dugue wrote: > Add libibdmcom linker path to allow build under FC11. > > Signed-off-by: Sebastien Dugue > --- > ibdm/src/Makefile.am | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/ibdm/src/Makefile.am b/ibdm/src/Makefile.am > index 8b2f9ba..b763387 100644 > --- a/ibdm/src/Makefile.am > +++ b/ibdm/src/Makefile.am > @@ -61,7 +61,7 @@ ibnlparse_SOURCES = test_ibnl_parser.cpp > lib_LTLIBRARIES = libibsysapi.la > libibsysapi_la_SOURCES = ibsysapi.cpp > libibsysapi_la_LDFLAGS = -version-info 1:0:0 > -libibsysapi_la_LIBADD = -libdmcom > +libibsysapi_la_LIBADD = -L../ibdm -libdmcom This problem was already pointed out by Dale Purdy, and I already fixed it based on his suggestion. In fact, my fix is the same as yours :) -- Yevgeny From bart.vanassche at gmail.com Sun Sep 6 06:17:31 2009 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Sun, 6 Sep 2009 15:17:31 +0200 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4A9FA945.4070408@vlnb.net> Message-ID: On Fri, Sep 4, 2009 at 1:20 AM, Chris Worley wrote: > On Thu, Sep 3, 2009 at 11:38 AM, Chris Worley wrote: > > I've used a couple of initiators (different systems) w/ different > > OSes, w/ different IB cards (all QDR) and different IB stacks > > (built-in vs. OFED) and can repeat the problem in all but the > > RHEL5.2/OFED 1.4.1 target and initiator (but, if the initiator is > > WinOF and the target is RHEL5.2/OFED1.4.1, then the problem does > > repeat). 
> > Here's a twist: I used the Ubuntu initiator w/ one of the RHEL > targets, and the RHEL initiator (same machine as was running WinOF > from the beginning of this thread) w/ one of the Ubuntu targets: in > both cases, the problem does not repeat. > > That makes it sound like OFED is the cure on either side of the > connection, but does not explain the issue w/ WinOF (which does fail > w/ either Ubuntu or RHEL targets). These results are strange. Regarding the Linux-only tests, I was assuming failure of a single component (Ubuntu SRP initiator, OFED SRP initiator, Ubuntu IB driver, OFED IB driver or SRP target), but for each of these components there is at least one test that passes and at least one test that fails. So either my assumption is wrong or one of the above test results is not repeatable. Do you have the time to repeat the Linux-only tests? Bart. From worleys at gmail.com Sun Sep 6 06:36:23 2009 From: worleys at gmail.com (Chris Worley) Date: Sun, 6 Sep 2009 15:36:23 +0200 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4A9FA945.4070408@vlnb.net> Message-ID: On Sun, Sep 6, 2009 at 3:17 PM, Bart Van Assche wrote: > On Fri, Sep 4, 2009 at 1:20 AM, Chris Worley wrote: >> On Thu, Sep 3, 2009 at 11:38 AM, Chris Worley wrote: >> > I've used a couple of initiators (different systems) w/ different >> > OSes, w/ different IB cards (all QDR) and different IB stacks >> > (built-in vs. OFED) and can repeat the problem in all but the >> > RHEL5.2/OFED 1.4.1 target and initiator (but, if the initiator is >> > WinOF and the target is RHEL5.2/OFED1.4.1, then the problem does >> > repeat). >> >> Here's a twist: I used the Ubuntu initiator w/ one of the RHEL >> targets, and the RHEL initiator (same machine as was running WinOF >> from the beginning of this thread) w/ one of the Ubuntu targets: in >> both cases, the problem does not repeat.
>> >> That makes it sound like OFED is the cure on either side of the >> connection, but does not explain the issue w/ WinOF (which does fail >> w/ either Ubuntu or RHEL targets). > > These results are strange. Regarding the Linux-only tests, I was > assuming failure of a single component (Ubuntu SRP initiator, OFED SRP > initiator, Ubuntu IB driver, OFED IB driver or SRP target), but for > each of these components there is at least one test that passes and at > least one test that fails. So either my assumption is wrong or one of > the above test results is not repeatable. Do you have the time to > repeat the Linux-only tests? Last night I was rerunning the RHEL5.2 initiator w/ Ubuntu client, and the problem repeated; now, I can't repeat the case where it didn't fail. Still, no errors, other than the eventual timeouts previously shown; the target thinks all is fine, the initiator is stuck. Chris > > Bart. > From worleys at gmail.com Sun Sep 6 06:41:04 2009 From: worleys at gmail.com (Chris Worley) Date: Sun, 6 Sep 2009 15:41:04 +0200 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4A9FA945.4070408@vlnb.net> Message-ID: On Sun, Sep 6, 2009 at 3:36 PM, Chris Worley wrote: > On Sun, Sep 6, 2009 at 3:17 PM, Bart Van Assche wrote: >> On Fri, Sep 4, 2009 at 1:20 AM, Chris Worley wrote: >>> On Thu, Sep 3, 2009 at 11:38 AM, Chris Worley wrote: >>> > I've used a couple of initiators (different systems) w/ different >>> > OSes, w/ different IB cards (all QDR) and different IB stacks >>> > (built-in vs. OFED) and can repeat the problem in all but the >>> > RHEL5.2/OFED 1.4.1 target and initiator (but, if the initiator is >>> > WinOF and the target is RHEL5.2/OFED1.4.1, then the problem does >>> > repeat).
>>> >>> Here's a twist: I used the Ubuntu initiator w/ one of the RHEL >>> targets, and the RHEL initiator (same machine as was running WinOF >>> from the beginning of this thread) w/ one of the Ubuntu targets: in >>> both cases, the problem does not repeat. >>> >>> That makes it sound like OFED is the cure on either side of the >>> connection, but does not explain the issue w/ WinOF (which does fail >>> w/ either Ubuntu or RHEL targets). >> >> These results are strange. Regarding the Linux-only tests, I was >> assuming failure of a single component (Ubuntu SRP initiator, OFED SRP >> initiator, Ubuntu IB driver, OFED IB driver or SRP target), but for >> each of these components there is at least one test that passes and at >> least one test that fails. So either my assumption is wrong or one of >> the above test results is not repeatable. Do you have the time to >> repeat the Linux-only tests? > > Last night I was rerunning the RHEL5.2 initiator w/ Ubuntu client, and > the problem repeated; now, I can't repeat the case where it didn't > fail. Still, no errors, other than the eventual timeouts previously > shown; the target thinks all is fine, the initiator is stuck. ... and I haven't had any success w/ Ubuntu target and initiator, 8.10 or 9.04. Chris > > Chris >> >> Bart. >> > From vlad at dev.mellanox.co.il Sun Sep 6 07:18:15 2009 From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky) Date: Sun, 06 Sep 2009 17:18:15 +0300 Subject: [ofa-general] OFED 1.5-alpha 4 and RHEL 5.3 GA In-Reply-To: <2ED289D4E09FBD4D92D911E869B97FDDA11A6B@mtlexch01.mtl.com> References: <1251473851.10055.3.camel@halves-ltc> <2ED289D4E09FBD4D92D911E869B97FDDA11A6B@mtlexch01.mtl.com> Message-ID: <4AA3C4A7.4010901@dev.mellanox.co.il> Tziporet Koren wrote: > Vlad/jack > What should be fixed? > > tziporet > > Hi, I can't reproduce this failure with the latest build (OFED-1.5-20090905-0600) on RHEL5.3, 2.6.18-128.el5, ppc64. The issue was probably fixed after 1.5-alpha4.
Regards, Vladimir > -----Original Message----- > From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Higor Aparecido Vieira Alves > Sent: Friday, August 28, 2009 6:38 PM > To: OpenIB > Subject: [ofa-general] OFED 1.5-alpha 4 and RHEL 5.3 GA > > Hi Guys, > > I tried to build OFED 1.5 on RHEL 5.3 GA and got an error while building > ofa_kernel. Build log attached. > > > Regards, > > From tziporet at dev.mellanox.co.il Sun Sep 6 08:06:26 2009 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Sun, 06 Sep 2009 18:06:26 +0300 Subject: [ofa-general] InfiniBand/RDMA merge plans for 2.6.32 In-Reply-To: References: <4A9FB004.6020904@mellanox.co.il> Message-ID: <4AA3CFF2.20306@mellanox.co.il> Roland Dreier wrote: > > What about RDMAoE? > > Patches were sent a few weeks ago and it seems you are ignoring them. > > Sorry, I should have mentioned that. > > Yes, I have been ignoring the patches -- I want to get through XRC > first, We can help with XRC if this will expedite RDMAoE. Will it? Tziporet > > From sashak at voltaire.com Sun Sep 6 08:25:05 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 6 Sep 2009 18:25:05 +0300 Subject: [ofa-general] Re: [PATCH 1/2 v3] opensm: Storage organization for multicast groups In-Reply-To: <4A798D56.2020408@Voltaire.COM> References: <4A798D56.2020408@Voltaire.COM> Message-ID: <20090906152505.GC25241@me> Hi Slava, On 16:47 Wed 05 Aug , Slava Strebkov wrote: > > Subject: [PATCH 1/2] Storage organization for multicast groups > > Main purpose is to prepare infrastructure for (many) mgids > to one mlid compression. Proposed the following changes: > 1.
Element in mlid array is now a multicast group holder. > 2. mgrp_holder keeps a list of mgroups sharing same mlid. > With introduction of compression, there will be many > multicast groups per mlid. Current implementation keeps > one mgid to one mlid ratio. > 3. mgrp_holder has a map of ports sharing same mlid. Ports sorted > by port guid. Port map is necessary for building spanning > tree per mgroup_holder, not just for single mgroup. > 4. Element in port map keeps a list of mgroups opened by this port. > This allows quick deletion of mgroups when port changes > state to DOWN. > 5. Multicast processing functions use mgroup_holder object instead > of mgroup. > > Signed-off-by: Slava Strebkov > --- > opensm/include/opensm/osm_multicast.h | 343 +++++++++++++++++++++++++++++--- > opensm/include/opensm/osm_sm.h | 10 +- > opensm/include/opensm/osm_subnet.h | 38 ++-- > opensm/opensm/osm_drop_mgr.c | 14 +- > opensm/opensm/osm_mcast_mgr.c | 228 +++++++++++++--------- > opensm/opensm/osm_multicast.c | 198 +++++++++++++++++-- > opensm/opensm/osm_qos_policy.c | 38 ++-- > opensm/opensm/osm_sa.c | 31 +-- > opensm/opensm/osm_sa_mcmember_record.c | 94 +++++---- > opensm/opensm/osm_sa_path_record.c | 13 +- > opensm/opensm/osm_sm.c | 81 +++++++- > opensm/opensm/osm_subnet.c | 31 +++- > 12 files changed, 855 insertions(+), 264 deletions(-) > > diff --git a/opensm/include/opensm/osm_multicast.h b/opensm/include/opensm/osm_multicast.h > index 9a47de5..61d1ba6 100644 > --- a/opensm/include/opensm/osm_multicast.h > +++ b/opensm/include/opensm/osm_multicast.h > @@ -1,5 +1,5 @@ > /* > - * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. > + * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved. > * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. > * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. 
> * > @@ -107,6 +107,82 @@ typedef struct osm_mcast_mgr_ctxt { > * > * SEE ALSO > *********/ > +/****s* OpenSM: Multicast Group Holder/osm_mgrp_holder_t > +* NAME > +* osm_mgrp_holder_t > +* > +* DESCRIPTION > +* Holder for mgroups. > +* > +* The osm_mgrp_t object should be treated as opaque and should > +* be manipulated only through the provided functions. > +* > +* SYNOPSIS > +*/ > + > +typedef struct osm_mgrp_holder { > + cl_qmap_t mgrp_port_map; > + cl_qlist_t mgrp_list; > + osm_mtree_node_t *p_root; > + ib_net16_t mlid; > + boolean_t to_be_deleted; > + uint32_t last_tree_id; > + uint32_t last_change_id; > +} osm_mgrp_holder_t; I remember we discussed (off-list) that using a shorter name would be better. You proposed 'osm_mgrp_box' then... > + > +/* > +* FIELDS > +* mgrp_port_map > +* Map of all ports joined same mlid > +* > +* mgrp_list > +* List of mgroups having same mlid > +* > +* p_root > +* Pointer to the root "tree node" in the single spanning tree > +* for this multicast group holder.The nodes of the tree represent > +* switches. Member ports are not represented in the tree. > +* > +* mlid > +* mlid of current group holder > +* > +* to_be_deleted > +* Since holders are deleted when there are no mgroups in. > +* > +* last_change_id > +* a counter for the number of changes applied to the group in this holder. > +* This counter shuold be incremented on any modification > +* to the group: joining or leaving of ports. > +* > +* last_tree_id > +* the last change id used for building the current tree. > +*/ > + /****s* OpenSM: Multicast group Port /osm_mgrp_port _t > +* NAME > +* osm_mgrp_port _t > +* > +* DESCRIPTION > +* Holder for pointers to mgroups and port guid. > +* > +* > +* SYNOPSIS > +*/ > +typedef struct _osm_mgrp_port { No leading underscore is needed in public structure definitions. > + cl_map_item_t guid_item; > + cl_qlist_t mgroups; What is the purpose of this 'mgroups' list?
> + ib_net64_t port_guid; > +} osm_mgrp_port_t; > +/* > +* FIELDS > +* guid_item > +* Map for ports. Must be first element > +* > +* mgroups > +* List of mgroups opened by this port. > +* > +* portguid > +* guid of port representing current structure > +*/ > > /****s* OpenSM: Multicast Group/osm_mgrp_t > * NAME > @@ -122,14 +198,13 @@ typedef struct osm_mcast_mgr_ctxt { > */ > typedef struct osm_mgrp { > cl_fmap_item_t map_item; > + cl_list_item_t mlid_item; > + cl_list_item_t port_item; And, likewise, what is the purpose of this 'port_item' field? Well, I see below what you are trying to do - it serves the newly introduced holder_port_add/delete() APIs. To me it looks like a much simpler approach is possible - see the comments there. > ib_net16_t mlid; > - osm_mtree_node_t *p_root; > cl_qmap_t mcm_port_tbl; > ib_member_rec_t mcmember_rec; > boolean_t well_known; > boolean_t to_be_deleted; > - uint32_t last_change_id; > - uint32_t last_tree_id; > unsigned full_members; > } osm_mgrp_t; > /* > @@ -141,10 +216,11 @@ typedef struct osm_mgrp { > * The network ordered LID of this Multicast Group (must be > * >= 0xC000). > * > -* p_root > -* Pointer to the root "tree node" in the single spanning tree > -* for this multicast group. The nodes of the tree represent > -* switches. Member ports are not represented in the tree. > +* mlid_item > +* List item for groups with same MLID > +* > +* port_item > +* List item for groups opened on same port > * > * mcm_port_tbl > * Table (sorted by port GUID) of osm_mcm_port_t objects > @@ -163,14 +239,6 @@ typedef struct osm_mgrp { > * track the fact the group is about to be deleted so we can > * track the fact a new join is actually a create request. > * > -* last_change_id > -* a counter for the number of changes applied to the group. > -* This counter shuold be incremented on any modification > -* to the group: joining or leaving of ports. > -* > -* last_tree_id > -* the last change id used for building the current tree.
> -* > * SEE ALSO > *********/ > > @@ -456,30 +524,111 @@ osm_mgrp_delete_port(IN osm_subn_t * const p_subn, > int osm_mgrp_remove_port(osm_subn_t *subn, osm_log_t *log, osm_mgrp_t *mgrp, > osm_mcm_port_t *mcm, uint8_t join_state); > > -/****f* OpenSM: Multicast Group/osm_mgrp_apply_func > +/****f* OpenSM: Multicast Group Holder /osm_mgrp_holder_new > * NAME > -* osm_mgrp_apply_func > +* osm_mgrp_holder_new > * > * DESCRIPTION > -* Calls the specified function for each element in the tree. > -* Elements are passed to the callback function in no particular order. > +* Allocates and initializes a Multicast Group Holder for use. > * > * SYNOPSIS > */ > -void > -osm_mgrp_apply_func(const osm_mgrp_t * const p_mgrp, > - osm_mgrp_func_t p_func, void *context); > +osm_mgrp_holder_t *osm_mgrp_holder_new(IN osm_subn_t * p_subn, > + IN ib_net16_t mlid); > +/* > +* PARAMETERS > +* p_subn > +* (in) pointer to osm_subnet > +* mlid > +* [in] Multicast LID for this multicast group holder. > +* > +* RETURN VALUES > +* pointer to initialized osm_mgrp_holder_t > +* or NULL, if unsuccessful > +* > +* SEE ALSO > +* Multicast Group Holder, osm_mgrp_holder_delete > +*********/ > +/****f* OpenSM: Multicast Group Holder /osm_mgrp_holder_delete > +* NAME > +* osm_mgrp_holder_delete > +* > +* DESCRIPTION > +* Removes entry from array of holders > +* Removes port from mgroup port list > +* > +* SYNOPSIS > +*/ > +void osm_mgrp_holder_delete(IN osm_subn_t * p_subn, > + IN ib_net16_t mlid); > + > /* > * PARAMETERS > +* > +* p_subn > +* [in] Pointer to osm_subnet > +* > +* mlid > +* [in] holder's mlid > +* > +* RETURN VALUES > +* None. > +* > +* NOTES > +* > +* SEE ALSO > +* > +*********/ > +/****f* OpenSM: Multicast Group Holder /osm_mgrp_holder_add_mgrp_port > +* NAME > +* osm_mgrp_holder_port_add_mgrp > +* > +* DESCRIPTION > +* Allocates osm_mgrp_port_t for new port joined to mgroup with mlid of this holder, > +* and adds mgroup to mgroup map of existed osm_mgrp_port_t object. 
> +* > +* SYNOPSIS > +*/ > +ib_api_status_t osm_mgrp_holder_port_add_mgrp(IN osm_mgrp_holder_t * > + p_mgrp_holder, > + IN osm_mgrp_t * p_mgrp, > + IN ib_net64_t port_guid); I don't think you need such APIs; you can instead use the existing osm_mgrp_add_port() and osm_mgrp_remove_port() - both take a pointer to the mgrp as a parameter, and the mgrp "knows" its holder (mlid), so you just need to add the holder's port map update there and continue to use the existing code as is. > +/* > +* PARAMETERS > +* p_mgrp_holder > +* (in) pointer to osm_mgrp_holder_t > * p_mgrp > -* [in] Pointer to an osm_mgrp_t object. > +* (in) pointer to osm_mgrp_t > * > -* p_func > -* [in] Pointer to the users callback function. > +* RETURN VALUES > +* IB_SUCCESS or > +* IB_INSUFFICIENT_MEMORY > * > -* context > -* [in] User context passed to the callback function. > +* SEE ALSO > +* Multicast Group Holder, osm_mgrp_holder_delete_mgrp_port > +*********/ > +/****f* OpenSM: Multicast Group Holder /osm_mgrp_holder_delete_mgrp_port > +* NAME > +* osm_mgrp_holder_port_delete_mgrp > * > +* DESCRIPTION > +* Deletes osm_mgrp_port_t for specified port > +* > +* SYNOPSIS > +*/ > +void osm_mgrp_holder_port_delete_mgrp(IN osm_mgrp_holder_t * p_mgrp_holder, > + IN osm_mgrp_t * p_mgrp, > + IN ib_net64_t port_guid); Same as above. > +/* > +* PARAMETERS > +* p_mgrp_holder > +* [in] Pointer to an osm_mgrp_holder_t object. > +* > +* p_mgrp > +* (in) Pointer to osm_mgrp_t object > +* > +* port_guid > +* [in] Port guid of the departing port. > * > * RETURN VALUES > * None. > @@ -487,8 +636,144 @@ osm_mgrp_apply_func(const osm_mgrp_t * const p_mgrp, > * NOTES > * > * SEE ALSO > -* Multicast Group > +Multicast Group Holder,osm_holder_add_mgrp_port Bad formatting (here and in many other places).
> +*********/ > +/****f* OpenSM: Multicast Group Holder /osm_mgrp_holder_add_mgrp > +* NAME > +* osm_mgrp_holder_add_mgrp > +* > +* DESCRIPTION > +* Adds mgroup to holder according to its mgid > +* > +* > +* SYNOPSIS > +*/ > +void osm_mgrp_holder_add_mgrp(IN osm_mgrp_holder_t * p_mgrp_holder, > + IN osm_mgrp_t * p_mgrp, > + IN osm_log_t * const p_log); > +/* > +* PARAMETERS > +* > +* p_mgrp_holder > +* [in] Pointer to an osm_mgrp_holder_t object. > +* > +* p_mgrp > +* [in] mgroup to add. > +* > +* RETURN VALUES > +* None. > +* > +* NOTES > +* Updates common_mgid when holder is being reused > +* SEE ALSO > +* Multicast Group Holder,osm_mgrp_holder_delete_mgrp > +*********/ > +/****f* OpenSM: Multicast Group Holder /osm_mgrp_holder_delete_mgrp > +* NAME > +* osm_mgrp_holder_delete_mgrp > +* > +* DESCRIPTION > +* Deletes mgroup from holder according to its mgid > +* > +* > +* SYNOPSIS > +*/ > +void osm_mgrp_holder_delete_mgrp(IN osm_mgrp_holder_t * p_mgrp_holder, > + IN osm_mgrp_t * p_mgrp); Do you really need 'holder' parameter here? > +/* > +* PARAMETERS > +* > +* p_mgrp_holder > +* [in] Pointer to an osm_mgrp_holder_t object. > +* > +* p_mgrp > +* [in] mgroup to delete. > +* > +* RETURN VALUES > +* None. > +* > +* NOTES > +* > +* SEE ALSO > +* Multicast Group Holder,osm_mgrp_holder_add_mgrp > *********/ > > +/****f* OpenSM: Multicast Group Holder /osm_mgrp_holder_remove_port > +* NAME > +* osm_mgrp_holder_remove_port > +* > +* DESCRIPTION > +* Removes osm_mgrp_port_t from mgrp_port_map of holder > +* Removes port from mgroup port list > +* > +* SYNOPSIS > +*/ > +void osm_mgrp_holder_remove_port(IN osm_subn_t * const p_subn, > + IN osm_log_t * const p_log, > + IN osm_mgrp_holder_t * const p_mgrp_holder, > + IN const ib_net64_t port_guid); > +/* > +* PARAMETERS > +* > +* p_subn > +* [in] Pointer to the subnet object > +* > +* p_log > +* [in] The log object pointer > +* > +* p_mgrp_holder > +* [in] Pointer to an osm_mgrp_holder_t object. 
> +* > +* port_guid > +* [in] Port guid of the departing port. > +* > +* RETURN VALUES > +* None. > +* > +* NOTES > +* > +* SEE ALSO > +* > +*********/ > +/****f* OpenSM: Subnet/osm_get_mgrp_by_mlid > +* NAME > +* osm_get_mgrp_by_mlid > +* > +* DESCRIPTION > +* The looks for the given multicast group in the subnet table by mlid. > +* NOTE: this code is not thread safe. Need to grab the lock before > +* calling it. > +* > +* SYNOPSIS > +*/ > +static inline struct osm_mgrp_holder *osm_get_mgrp_holder_by_mlid(osm_subn_t const > + *p_subn, > + ib_net16_t mlid) > +{ > + return p_subn->mgroup_holders[cl_ntoh16(mlid) - IB_LID_MCAST_START_HO]; > +} Why is this function in this file? All 'osm_get_*_by_*()' helpers are located in osm_subnet.[ch]. > +/* > +* PARAMETERS > +* p_subn > +* [in] Pointer to an osm_subn_t object > +* > +* mlid > +* [in] The multicast group mlid in network order > +* > +* RETURN VALUES > +* The multicast group structure pointer if found. NULL otherwise. > +*********/ > +static inline ib_net16_t osm_mgrp_holder_get_mlid(IN osm_mgrp_holder_t * > + const p_mgrp_holder) > +{ > + return (p_mgrp_holder->mlid); > +} > + > +static inline boolean_t osm_mgrp_holder_is_empty(IN const osm_mgrp_holder_t * > + const p_mgrp_holder) > +{ > + return (cl_qmap_count(&p_mgrp_holder->mgrp_port_map) == 0); > +} Where is this function actually used? I didn't find any use. > + > END_C_DECLS > #endif /* _OSM_MULTICAST_H_ */ > diff --git a/opensm/include/opensm/osm_sm.h b/opensm/include/opensm/osm_sm.h > index cc8321d..7f898ad 100644 > --- a/opensm/include/opensm/osm_sm.h > +++ b/opensm/include/opensm/osm_sm.h > @@ -1,5 +1,5 @@ > /* > - * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. > + * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved. > * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. > * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
> * > @@ -61,6 +61,7 @@ > #include > #include > #include > +#include > > #ifdef __cplusplus > # define BEGIN_C_DECLS extern "C" { > @@ -539,7 +540,8 @@ osm_resp_send(IN osm_sm_t * sm, > ib_api_status_t > osm_sm_mcgrp_join(IN osm_sm_t * const p_sm, > IN const ib_net16_t mlid, > - IN const ib_net64_t port_guid); > + IN const ib_net64_t port_guid, > + IN const ib_gid_t * p_mgid); > /* > * PARAMETERS > * p_sm > @@ -551,6 +553,8 @@ osm_sm_mcgrp_join(IN osm_sm_t * const p_sm, > * port_guid > * [in] Port GUID to add to the group. > * > +* p_mgid > +* [in] MGID to add to the group holder. > * RETURN VALUES > * None > * > @@ -572,7 +576,7 @@ osm_sm_mcgrp_join(IN osm_sm_t * const p_sm, > */ > ib_api_status_t > osm_sm_mcgrp_leave(IN osm_sm_t * const p_sm, > - IN const ib_net16_t mlid, IN const ib_net64_t port_guid); > + IN osm_mgrp_t * p_mgrp, IN ib_net64_t port_guid); > /* > * PARAMETERS > * p_sm > diff --git a/opensm/include/opensm/osm_subnet.h b/opensm/include/opensm/osm_subnet.h > index 6c20de8..fad8780 100644 > --- a/opensm/include/opensm/osm_subnet.h > +++ b/opensm/include/opensm/osm_subnet.h > @@ -1,5 +1,5 @@ > /* > - * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. > + * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved. > * Copyright (c) 2002-2008 Mellanox Technologies LTD. All rights reserved. > * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. > * Copyright (c) 2008 Xsigo Systems Inc. All rights reserved. 
> @@ -513,7 +513,7 @@ typedef struct osm_subn { > boolean_t coming_out_of_standby; > unsigned need_update; > cl_fmap_t mgrp_mgid_tbl; > - void *mgroups[IB_LID_MCAST_END_HO - IB_LID_MCAST_START_HO + 1]; > + void *mgroup_holders[IB_LID_MCAST_END_HO - IB_LID_MCAST_START_HO + 1]; > } osm_subn_t; > /* > * FIELDS > @@ -634,8 +634,8 @@ typedef struct osm_subn { > * This flag should be on during first non-master heavy > * (including pre-master discovery stage) > * > -* mgroups > -* Array of pointers to all Multicast Group objects in the subnet. > +* mgroup_holders > +* Array of pointers to all Multicast Group Holder objects in the subnet. > * Indexed by MLID offset from base MLID. > * > * SEE ALSO > @@ -935,32 +935,34 @@ struct osm_port *osm_get_port_by_guid(IN osm_subn_t const *p_subn, > * osm_port_t > *********/ > > -/****f* OpenSM: Subnet/osm_get_mgrp_by_mlid > +/****f* OpenSM: Multicast Group Holder /osm_mgrp_holder_get_mlid_by_mgid > * NAME > -* osm_get_mgrp_by_mlid > +* osm_mgrp_holder_get_mlid_by_mgid > * > * DESCRIPTION > -* The looks for the given multicast group in the subnet table by mlid. > -* NOTE: this code is not thread safe. Need to grab the lock before > -* calling it. 
> +* Searches mgroup with given mgid > +* Returns mlid of the found mgroup > * > * SYNOPSIS > */ > -static inline > -struct osm_mgrp *osm_get_mgrp_by_mlid(osm_subn_t const *p_subn, ib_net16_t mlid) > -{ > - return p_subn->mgroups[cl_ntoh16(mlid) - IB_LID_MCAST_START_HO]; > -} > +ib_net16_t osm_mgrp_holder_get_mlid_by_mgid(IN osm_subn_t const *p_subn, > + IN const ib_gid_t * const p_mgid); If you are placing this helper here (in osm_subnet), please keep the naming convention, something like: osm_get_mgrp_holder_by_mgid(); > /* > * PARAMETERS > +* > * p_subn > -* [in] Pointer to an osm_subn_t object > +* [in] Pointer to osm_subn_t object > * > -* mlid > -* [in] The multicast group mlid in network order > +* p_mgid > +* [in] pointer to mgid > * > * RETURN VALUES > -* The multicast group structure pointer if found. NULL otherwise. > +* mlid of found holder, or zero. > +* > +* NOTES > +* > +* SEE ALSO > +* > *********/ > > /****f* OpenSM: Helper/osm_get_physp_by_mad_addr > diff --git a/opensm/opensm/osm_drop_mgr.c b/opensm/opensm/osm_drop_mgr.c > index c9a4f33..e1f2bd3 100644 > --- a/opensm/opensm/osm_drop_mgr.c > +++ b/opensm/opensm/osm_drop_mgr.c > @@ -1,5 +1,5 @@ > /* > - * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. > + * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved. > * Copyright (c) 2002-2008 Mellanox Technologies LTD. All rights reserved. > * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. > * Copyright (c) 2008 Xsigo Systems Inc. All rights reserved.
> @@ -158,7 +158,6 @@ static void drop_mgr_remove_port(osm_sm_t * sm, IN osm_port_t * p_port) > osm_port_t *p_port_check; > cl_qmap_t *p_sm_guid_tbl; > osm_mcm_info_t *p_mcm; > - osm_mgrp_t *p_mgrp; > cl_ptr_vector_t *p_port_lid_tbl; > uint16_t min_lid_ho; > uint16_t max_lid_ho; > @@ -168,6 +167,7 @@ static void drop_mgr_remove_port(osm_sm_t * sm, IN osm_port_t * p_port) > ib_gid_t port_gid; > ib_mad_notice_attr_t notice; > ib_api_status_t status; > + osm_mgrp_holder_t *p_mgrp_holder; > > OSM_LOG_ENTER(sm->p_log); > > @@ -212,10 +212,12 @@ static void drop_mgr_remove_port(osm_sm_t * sm, IN osm_port_t * p_port) > > p_mcm = (osm_mcm_info_t *) cl_qlist_remove_head(&p_port->mcm_list); > while (p_mcm != (osm_mcm_info_t *) cl_qlist_end(&p_port->mcm_list)) { > - p_mgrp = osm_get_mgrp_by_mlid(sm->p_subn, p_mcm->mlid); > - if (p_mgrp) { > - osm_mgrp_delete_port(sm->p_subn, sm->p_log, > - p_mgrp, p_port->guid); > + p_mgrp_holder = > + osm_get_mgrp_holder_by_mlid(sm->p_subn, p_mcm->mlid); > + if (p_mgrp_holder) { > + osm_mgrp_holder_remove_port(sm->p_subn, sm->p_log, > + p_mgrp_holder, > + p_port->guid); > osm_mcm_info_delete((osm_mcm_info_t *) p_mcm); > } > p_mcm = > diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c > index 4dbbaa0..f506393 100644 > --- a/opensm/opensm/osm_mcast_mgr.c > +++ b/opensm/opensm/osm_mcast_mgr.c > @@ -1,5 +1,5 @@ > /* > - * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. > + * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved. > * Copyright (c) 2002-2006 Mellanox Technologies LTD. All rights reserved. > * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. > * Copyright (c) 2008 Xsigo Systems Inc. All rights reserved. 
> @@ -55,6 +55,7 @@ > #include > #include > #include > +#include > > /********************************************************************** > **********************************************************************/ > @@ -111,14 +112,15 @@ static void mcast_mgr_purge_tree_node(IN osm_mtree_node_t * p_mtn) > > /********************************************************************** > **********************************************************************/ > -static void mcast_mgr_purge_tree(osm_sm_t * sm, IN osm_mgrp_t * p_mgrp) > +static void mcast_mgr_purge_tree(osm_sm_t * sm, > + IN osm_mgrp_holder_t * p_mgrp_holder) > { > OSM_LOG_ENTER(sm->p_log); > > - if (p_mgrp->p_root) > - mcast_mgr_purge_tree_node(p_mgrp->p_root); > + if (p_mgrp_holder->p_root) > + mcast_mgr_purge_tree_node(p_mgrp_holder->p_root); > > - p_mgrp->p_root = NULL; > + p_mgrp_holder->p_root = NULL; > > OSM_LOG_EXIT(sm->p_log); > } > @@ -126,41 +128,40 @@ static void mcast_mgr_purge_tree(osm_sm_t * sm, IN osm_mgrp_t * p_mgrp) > /********************************************************************** > **********************************************************************/ > static float osm_mcast_mgr_compute_avg_hops(osm_sm_t * sm, > - const osm_mgrp_t * p_mgrp, > + const osm_mgrp_holder_t * > + p_mgrp_holder, > const osm_switch_t * p_sw) > { > float avg_hops = 0; > uint32_t hops = 0; > uint32_t num_ports = 0; > const osm_port_t *p_port; > - const osm_mcm_port_t *p_mcm_port; > - const cl_qmap_t *p_mcm_tbl; > + const osm_mgrp_port_t *p_holder_port; > > OSM_LOG_ENTER(sm->p_log); > > - p_mcm_tbl = &p_mgrp->mcm_port_tbl; > > /* > For each member of the multicast group, compute the > number of hops to its base LID. 
> */ > - for (p_mcm_port = (osm_mcm_port_t *) cl_qmap_head(p_mcm_tbl); > - p_mcm_port != (osm_mcm_port_t *) cl_qmap_end(p_mcm_tbl); > - p_mcm_port = > - (osm_mcm_port_t *) cl_qmap_next(&p_mcm_port->map_item)) { > + for (p_holder_port = > + (osm_mgrp_port_t *) cl_qmap_head(&p_mgrp_holder->mgrp_port_map); > + p_holder_port != > + (osm_mgrp_port_t *) cl_qmap_end(&p_mgrp_holder->mgrp_port_map); > + p_holder_port = > + (osm_mgrp_port_t *) cl_qmap_next(&p_holder_port->guid_item)) { > /* > Acquire the port object for this port guid, then create > the new worker object to build the list. > */ > p_port = osm_get_port_by_guid(sm->p_subn, > - ib_gid_get_guid(&p_mcm_port-> > - port_gid)); > + p_holder_port->port_guid); > > if (!p_port) { > OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A18: " > "No port object for port 0x%016" PRIx64 "\n", > - cl_ntoh64(ib_gid_get_guid > - (&p_mcm_port->port_gid))); > + cl_ntoh64(p_holder_port->port_guid)); > continue; > } > > @@ -185,40 +186,39 @@ static float osm_mcast_mgr_compute_avg_hops(osm_sm_t * sm, > of the group HCAs > **********************************************************************/ > static float osm_mcast_mgr_compute_max_hops(osm_sm_t * sm, > - const osm_mgrp_t * p_mgrp, > + const osm_mgrp_holder_t * > + p_mgrp_holder, > const osm_switch_t * p_sw) > { > uint32_t max_hops = 0; > uint32_t hops = 0; > const osm_port_t *p_port; > - const osm_mcm_port_t *p_mcm_port; > - const cl_qmap_t *p_mcm_tbl; > + const osm_mgrp_port_t *p_mgrp_holder_port; > > OSM_LOG_ENTER(sm->p_log); > > - p_mcm_tbl = &p_mgrp->mcm_port_tbl; > > /* > For each member of the multicast group, compute the > number of hops to its base LID. 
> */ > - for (p_mcm_port = (osm_mcm_port_t *) cl_qmap_head(p_mcm_tbl); > - p_mcm_port != (osm_mcm_port_t *) cl_qmap_end(p_mcm_tbl); > - p_mcm_port = > - (osm_mcm_port_t *) cl_qmap_next(&p_mcm_port->map_item)) { > + for (p_mgrp_holder_port = > + (osm_mgrp_port_t *) cl_qmap_head(&p_mgrp_holder->mgrp_port_map); > + p_mgrp_holder_port != > + (osm_mgrp_port_t *) cl_qmap_end(&p_mgrp_holder->mgrp_port_map); > + p_mgrp_holder_port = > + (osm_mgrp_port_t *) cl_qmap_next(&p_mgrp_holder_port->guid_item)) { > /* > Acquire the port object for this port guid, then create > the new worker object to build the list. > */ > p_port = osm_get_port_by_guid(sm->p_subn, > - ib_gid_get_guid(&p_mcm_port-> > - port_gid)); > + p_mgrp_holder_port->port_guid); > > if (!p_port) { > OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A1A: " > "No port object for port 0x%016" PRIx64 "\n", > - cl_ntoh64(ib_gid_get_guid > - (&p_mcm_port->port_gid))); > + cl_ntoh64(p_mgrp_holder_port->port_guid)); > continue; > } > > @@ -244,7 +244,8 @@ static float osm_mcast_mgr_compute_max_hops(osm_sm_t * sm, > of the multicast group. 
> **********************************************************************/ > static osm_switch_t *mcast_mgr_find_optimal_switch(osm_sm_t * sm, > - const osm_mgrp_t * p_mgrp) > + const osm_mgrp_holder_t * > + p_mgrp_holder) > { > cl_qmap_t *p_sw_tbl; > const osm_switch_t *p_sw; > @@ -252,7 +253,7 @@ static osm_switch_t *mcast_mgr_find_optimal_switch(osm_sm_t * sm, > float hops = 0; > float best_hops = 10000; /* any big # will do */ > #ifdef OSM_VENDOR_INTF_ANAFA > - boolean_t use_avg_hops = TRUE; /* anafa2 - bug hca on switch *//* use max hops for root */ > + boolean_t use_avg_hops = TRUE; /* anafa2 - bug hca on switch *//* use max hops for root */ > #else > boolean_t use_avg_hops = FALSE; /* use max hops for root */ > #endif > @@ -261,7 +262,7 @@ static osm_switch_t *mcast_mgr_find_optimal_switch(osm_sm_t * sm, > > p_sw_tbl = &sm->p_subn->sw_guid_tbl; > > - CL_ASSERT(!osm_mgrp_is_empty(p_mgrp)); > + CL_ASSERT(!osm_mgrp_holder_is_empty(p_mgrp_holder)); > > for (p_sw = (osm_switch_t *) cl_qmap_head(p_sw_tbl); > p_sw != (osm_switch_t *) cl_qmap_end(p_sw_tbl); > @@ -270,9 +271,13 @@ static osm_switch_t *mcast_mgr_find_optimal_switch(osm_sm_t * sm, > continue; > > if (use_avg_hops) > - hops = osm_mcast_mgr_compute_avg_hops(sm, p_mgrp, p_sw); > + hops = > + osm_mcast_mgr_compute_avg_hops(sm, p_mgrp_holder, > + p_sw); > else > - hops = osm_mcast_mgr_compute_max_hops(sm, p_mgrp, p_sw); > + hops = > + osm_mcast_mgr_compute_max_hops(sm, p_mgrp_holder, > + p_sw); > > OSM_LOG(sm->p_log, OSM_LOG_DEBUG, > "Switch 0x%016" PRIx64 ", hops = %f\n", > @@ -301,7 +306,8 @@ static osm_switch_t *mcast_mgr_find_optimal_switch(osm_sm_t * sm, > This function returns the existing or optimal root swtich for the tree. 
> **********************************************************************/ > static osm_switch_t *mcast_mgr_find_root_switch(osm_sm_t * sm, > - const osm_mgrp_t * p_mgrp) > + const osm_mgrp_holder_t * > + p_mgrp_holder) > { > const osm_switch_t *p_sw = NULL; > > @@ -313,7 +319,7 @@ static osm_switch_t *mcast_mgr_find_root_switch(osm_sm_t * sm, > the root will be always on the first switch attached to it. > - Very bad ... > */ > - p_sw = mcast_mgr_find_optimal_switch(sm, p_mgrp); > + p_sw = mcast_mgr_find_optimal_switch(sm, p_mgrp_holder); > > OSM_LOG_EXIT(sm->p_log); > return (osm_switch_t *) p_sw; > @@ -393,7 +399,8 @@ static int mcast_mgr_set_tbl(osm_sm_t * sm, IN osm_switch_t * p_sw) > spanning tree that eminate from this switch. On input, the p_list > contains the group members that must be routed from this switch. > **********************************************************************/ > -static void mcast_mgr_subdivide(osm_sm_t * sm, osm_mgrp_t * p_mgrp, > +static void mcast_mgr_subdivide(osm_sm_t * sm, > + osm_mgrp_holder_t * p_mgrp_holder, > osm_switch_t * p_sw, cl_qlist_t * p_list, > cl_qlist_t * list_array, uint8_t array_size) > { > @@ -404,7 +411,7 @@ static void mcast_mgr_subdivide(osm_sm_t * sm, osm_mgrp_t * p_mgrp, > > OSM_LOG_ENTER(sm->p_log); > > - mlid_ho = cl_ntoh16(osm_mgrp_get_mlid(p_mgrp)); > + mlid_ho = cl_ntoh16(osm_mgrp_holder_get_mlid(p_mgrp_holder)); > > /* > For Multicast Groups, we want not to count on previous > @@ -494,7 +501,8 @@ static void mcast_mgr_purge_list(osm_sm_t * sm, cl_qlist_t * p_list) > > The function returns the newly created mtree node element. 
> **********************************************************************/ > -static osm_mtree_node_t *mcast_mgr_branch(osm_sm_t * sm, osm_mgrp_t * p_mgrp, > +static osm_mtree_node_t *mcast_mgr_branch(osm_sm_t * sm, > + osm_mgrp_holder_t * p_mgrp_holder, > osm_switch_t * p_sw, > cl_qlist_t * p_list, uint8_t depth, > uint8_t upstream_port, > @@ -520,7 +528,7 @@ static osm_mtree_node_t *mcast_mgr_branch(osm_sm_t * sm, osm_mgrp_t * p_mgrp, > > node_guid = osm_node_get_node_guid(p_sw->p_node); > node_guid_ho = cl_ntoh64(node_guid); > - mlid_ho = cl_ntoh16(osm_mgrp_get_mlid(p_mgrp)); > + mlid_ho = cl_ntoh16(osm_mgrp_holder_get_mlid(p_mgrp_holder)); > > OSM_LOG(sm->p_log, OSM_LOG_VERBOSE, > "Routing MLID 0x%X through switch 0x%" PRIx64 > @@ -597,7 +605,8 @@ static osm_mtree_node_t *mcast_mgr_branch(osm_sm_t * sm, osm_mgrp_t * p_mgrp, > for (i = 0; i < max_children; i++) > cl_qlist_init(&list_array[i]); > > - mcast_mgr_subdivide(sm, p_mgrp, p_sw, p_list, list_array, max_children); > + mcast_mgr_subdivide(sm, p_mgrp_holder, p_sw, p_list, list_array, > + max_children); > > p_tbl = osm_switch_get_mcast_tbl_ptr(p_sw); > > @@ -680,8 +689,9 @@ static osm_mtree_node_t *mcast_mgr_branch(osm_sm_t * sm, osm_mgrp_t * p_mgrp, > CL_ASSERT(p_remote_physp); > > p_mtn->child_array[i] = > - mcast_mgr_branch(sm, p_mgrp, p_remote_node->sw, > - p_port_list, depth, > + mcast_mgr_branch(sm, p_mgrp_holder, > + p_remote_node->sw, p_port_list, > + depth, > osm_physp_get_port_num > (p_remote_physp), p_max_depth); > } else { > @@ -716,11 +726,11 @@ Exit: > /********************************************************************** > **********************************************************************/ > static ib_api_status_t mcast_mgr_build_spanning_tree(osm_sm_t * sm, > - osm_mgrp_t * p_mgrp) > + osm_mgrp_holder_t * > + p_mgrp_holder) > { > - const cl_qmap_t *p_mcm_tbl; > const osm_port_t *p_port; > - const osm_mcm_port_t *p_mcm_port; > + const osm_mgrp_port_t *p_mgrp_port; > uint32_t num_ports; > 
cl_qlist_t port_list; > osm_switch_t *p_sw; > @@ -739,14 +749,13 @@ static ib_api_status_t mcast_mgr_build_spanning_tree(osm_sm_t * sm, > on multicast forwarding table information if the user wants to > preserve existing multicast routes. > */ > - mcast_mgr_purge_tree(sm, p_mgrp); > + mcast_mgr_purge_tree(sm, p_mgrp_holder); > > - p_mcm_tbl = &p_mgrp->mcm_port_tbl; > - num_ports = cl_qmap_count(p_mcm_tbl); > + num_ports = cl_qmap_count(&p_mgrp_holder->mgrp_port_map); > if (num_ports == 0) { > OSM_LOG(sm->p_log, OSM_LOG_VERBOSE, > "MLID 0x%X has no members - nothing to do\n", > - cl_ntoh16(osm_mgrp_get_mlid(p_mgrp))); > + cl_ntoh16(osm_mgrp_holder_get_mlid(p_mgrp_holder))); > goto Exit; > } > > @@ -766,11 +775,11 @@ static ib_api_status_t mcast_mgr_build_spanning_tree(osm_sm_t * sm, > Locate the switch around which to create the spanning > tree for this multicast group. > */ > - p_sw = mcast_mgr_find_root_switch(sm, p_mgrp); > + p_sw = mcast_mgr_find_root_switch(sm, p_mgrp_holder); > if (p_sw == NULL) { > OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A08: " > "Unable to locate a suitable switch for group 0x%X\n", > - cl_ntoh16(osm_mgrp_get_mlid(p_mgrp))); > + cl_ntoh16(osm_mgrp_holder_get_mlid(p_mgrp_holder))); > status = IB_ERROR; > goto Exit; > } > @@ -778,22 +787,22 @@ static ib_api_status_t mcast_mgr_build_spanning_tree(osm_sm_t * sm, > /* > Build the first "subset" containing all member ports. > */ > - for (p_mcm_port = (osm_mcm_port_t *) cl_qmap_head(p_mcm_tbl); > - p_mcm_port != (osm_mcm_port_t *) cl_qmap_end(p_mcm_tbl); > - p_mcm_port = > - (osm_mcm_port_t *) cl_qmap_next(&p_mcm_port->map_item)) { > + for (p_mgrp_port = > + (osm_mgrp_port_t *) cl_qmap_head(&p_mgrp_holder->mgrp_port_map); > + p_mgrp_port != > + (osm_mgrp_port_t *) cl_qmap_end(&p_mgrp_holder->mgrp_port_map); > + p_mgrp_port = > + (osm_mgrp_port_t *) cl_qmap_next(&p_mgrp_port->guid_item)) { > /* > Acquire the port object for this port guid, then create > the new worker object to build the list. 
> */ > - p_port = osm_get_port_by_guid(sm->p_subn, > - ib_gid_get_guid(&p_mcm_port-> > - port_gid)); > + p_port = > + osm_get_port_by_guid(sm->p_subn, p_mgrp_port->port_guid); > if (!p_port) { > OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A09: " > "No port object for port 0x%016" PRIx64 "\n", > - cl_ntoh64(ib_gid_get_guid > - (&p_mcm_port->port_gid))); > + cl_ntoh64(p_mgrp_port->port_guid)); > continue; > } > > @@ -801,8 +810,7 @@ static ib_api_status_t mcast_mgr_build_spanning_tree(osm_sm_t * sm, > if (p_wobj == NULL) { > OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A10: " > "Insufficient memory to route port 0x%016" > - PRIx64 "\n", > - cl_ntoh64(osm_port_get_guid(p_port))); > + PRIx64 "\n", cl_ntoh64(p_mgrp_port->port_guid)); > continue; > } > > @@ -810,12 +818,14 @@ static ib_api_status_t mcast_mgr_build_spanning_tree(osm_sm_t * sm, > } > > count = cl_qlist_count(&port_list); > - p_mgrp->p_root = mcast_mgr_branch(sm, p_mgrp, p_sw, &port_list, 0, 0, > - &max_depth); > + p_mgrp_holder->p_root = > + mcast_mgr_branch(sm, p_mgrp_holder, p_sw, &port_list, 0, 0, > + &max_depth); > > OSM_LOG(sm->p_log, OSM_LOG_VERBOSE, > "Configured MLID 0x%X for %u ports, max tree depth = %u\n", > - cl_ntoh16(osm_mgrp_get_mlid(p_mgrp)), count, max_depth); > + cl_ntoh16(osm_mgrp_holder_get_mlid(p_mgrp_holder)), count, > + max_depth); > > Exit: > OSM_LOG_EXIT(sm->p_log); > @@ -1023,17 +1033,20 @@ Exit: > NOTE : The lock should be held externally! 
> **********************************************************************/
> static ib_api_status_t mcast_mgr_process_mgrp(osm_sm_t * sm,
> - IN osm_mgrp_t * p_mgrp)
> + IN osm_mgrp_holder_t * p_mgrp_holder)
> {
> ib_api_status_t status = IB_SUCCESS;
> ib_net16_t mlid;
> + osm_mgrp_t *p_mgrp;
> + cl_list_item_t *p_item;
> + unsigned has_full_members = 0;
>
> OSM_LOG_ENTER(sm->p_log);
>
> - mlid = osm_mgrp_get_mlid(p_mgrp);
> + mlid = osm_mgrp_holder_get_mlid(p_mgrp_holder);
>
> OSM_LOG(sm->p_log, OSM_LOG_DEBUG,
> - "Processing multicast group 0x%X\n", cl_ntoh16(mlid));
> + "Processing multicast group_holder 0x%X\n", cl_ntoh16(mlid));
>
> /*
> Clear the multicast tables to start clean, then build
> @@ -1042,27 +1055,52 @@ static ib_api_status_t mcast_mgr_process_mgrp(osm_sm_t * sm,
> */
> mcast_mgr_clear(sm, cl_ntoh16(mlid));
>
> - if (p_mgrp->full_members) {
> - status = mcast_mgr_build_spanning_tree(sm, p_mgrp);
> + p_item = cl_qlist_head(&p_mgrp_holder->mgrp_list);
> + while (p_item != cl_qlist_end(&p_mgrp_holder->mgrp_list)) {
> + char gid_str[INET6_ADDRSTRLEN];
> + p_mgrp = (osm_mgrp_t *)
> + PARENT_STRUCT(p_item, osm_mgrp_t, mlid_item);
> + OSM_LOG(sm->p_log, OSM_LOG_DEBUG,
> + "MLID 0x%x has mgrp %s\n",cl_ntoh16(p_mgrp->mlid),
> + inet_ntop(AF_INET6,
> + p_mgrp->mcmember_rec.mgid.raw,
> + gid_str, sizeof(gid_str)));
> + p_item = cl_qlist_next(p_item);
> + if (p_mgrp->to_be_deleted) {
> + osm_mcm_port_t *p_mcm_port;

Wrong indentation (here and in many other places).

> + OSM_LOG(sm->p_log, OSM_LOG_DEBUG,
> + "Destroying mgrp %s with lid:0x%x\n",
> + inet_ntop(AF_INET6,
> + p_mgrp->mcmember_rec.mgid.raw,
> + gid_str, sizeof(gid_str)),
> + cl_ntoh16(p_mgrp->mlid));
> + osm_mgrp_holder_delete_mgrp(p_mgrp_holder, p_mgrp);
> + p_mcm_port = (osm_mcm_port_t *) cl_qmap_head(&p_mgrp->mcm_port_tbl);
> + while (p_mcm_port !=
> + (osm_mcm_port_t *) cl_qmap_end(&p_mgrp->mcm_port_tbl)) {
> + osm_mgrp_holder_port_delete_mgrp(p_mgrp_holder, p_mgrp,
> + p_mcm_port->port_gid.unicast.interface_id);
> + p_mcm_port =
> + (osm_mcm_port_t *) cl_qmap_next(&p_mcm_port->map_item);
> + }
> + cl_fmap_remove_item(&sm->p_subn->mgrp_mgid_tbl,
> + &p_mgrp->map_item);
> + osm_mgrp_delete(p_mgrp);

I'm not happy with this cleanup block. It actually removes a multicast group (mgrp), so it would be better to consolidate this code in a separate function.

> + }
> + else if (!has_full_members)
> + has_full_members = p_mgrp->full_members;

No need to bother with the condition check, something like:

	else
		has_full_members = 1;

should be enough here.

> + }
> + if (has_full_members) {
> + status = mcast_mgr_build_spanning_tree(sm, p_mgrp_holder);
> if (status != IB_SUCCESS) {
> OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A17: "
> "Unable to create spanning tree (%s)\n",
> ib_get_err_str(status));
> goto Exit;
> }
> - } else if (p_mgrp->to_be_deleted) {
> - OSM_LOG(sm->p_log, OSM_LOG_DEBUG,
> - "Destroying mgrp with lid:0x%x\n",
> - cl_ntoh16(p_mgrp->mlid));
> - sm->p_subn->mgroups[cl_ntoh16(p_mgrp->mlid) -
> - IB_LID_MCAST_START_HO] = NULL;
> - cl_fmap_remove_item(&sm->p_subn->mgrp_mgid_tbl,
> - &p_mgrp->map_item);
> - osm_mgrp_delete(p_mgrp);
> - goto Exit;
> + p_mgrp_holder->last_tree_id = p_mgrp_holder->last_change_id;

And where is holder deletion handled? I don't see it (except in the final osm_subnet cleanup). So does it mean that, once allocated, MLIDs will never be released?

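To make the question concrete, here is a minimal standalone sketch of the lifecycle I would expect. It is not OpenSM code: `struct holder`, `holders[]`, `holder_get()`, `holder_add_group()` and `holder_del_group()` are hypothetical stand-ins for osm_mgrp_holder_t, subn->mgroup_holders[] and the add/delete helpers, and a plain counter replaces mgrp_list. The only point is that a single helper both detaches the group and, when the last group is gone, frees the holder and clears its slot so the MLID can be handed out again.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MCAST_START 0xC000u	/* first multicast LID, host order */
#define MAX_HOLDERS 16		/* toy table size, assumption for this sketch */

/* Hypothetical stand-in for osm_mgrp_holder_t: one entry per MLID. */
struct holder {
	uint16_t mlid;		/* host order */
	unsigned num_groups;	/* groups (MGIDs) currently sharing this MLID */
};

/* Hypothetical stand-in for subn->mgroup_holders[]. */
static struct holder *holders[MAX_HOLDERS];

/* Look up the holder slot for an MLID; NULL means the MLID is free. */
static struct holder *holder_get(uint16_t mlid)
{
	assert(mlid >= MCAST_START && mlid < MCAST_START + MAX_HOLDERS);
	return holders[mlid - MCAST_START];
}

/* Attach one more group to the MLID, allocating the holder on first use. */
static struct holder *holder_add_group(uint16_t mlid)
{
	struct holder *h = holder_get(mlid);

	if (!h) {
		h = calloc(1, sizeof(*h));
		if (!h)
			return NULL;
		h->mlid = mlid;
		holders[mlid - MCAST_START] = h;
	}
	h->num_groups++;
	return h;
}

/*
 * Detach one group. When the last group goes away the holder itself is
 * freed and its slot is cleared, so the MLID becomes allocatable again.
 * Returns 1 if the holder was released, 0 otherwise.
 */
static int holder_del_group(uint16_t mlid)
{
	struct holder *h = holder_get(mlid);

	if (!h || h->num_groups == 0)
		return 0;
	if (--h->num_groups > 0)
		return 0;
	holders[mlid - MCAST_START] = NULL;
	free(h);
	return 1;
}
```

With a helper along these lines, the cleanup block in mcast_mgr_process_mgrp() could probably shrink to a single call, and the release of the holder would be visible in one place.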
> } > > - p_mgrp->last_tree_id = p_mgrp->last_change_id; > > Exit: > OSM_LOG_EXIT(sm->p_log); > @@ -1076,7 +1114,7 @@ int osm_mcast_mgr_process(osm_sm_t * sm) > osm_switch_t *p_sw; > cl_qmap_t *p_sw_tbl; > cl_qlist_t *p_list = &sm->mgrp_list; > - osm_mgrp_t *p_mgrp; > + osm_mgrp_holder_t *p_mgrp_holder; > int i, ret = 0; > > OSM_LOG_ENTER(sm->p_log); > @@ -1104,9 +1142,10 @@ int osm_mcast_mgr_process(osm_sm_t * sm) > of the subnet. Not due to a specific multicast request. > So the request type is subnet_change and the port guid is 0. > */ > - p_mgrp = sm->p_subn->mgroups[i]; > - if (p_mgrp) > - mcast_mgr_process_mgrp(sm, p_mgrp); > + p_mgrp_holder = sm->p_subn->mgroup_holders[i]; > + if (p_mgrp_holder) { > + mcast_mgr_process_mgrp(sm, p_mgrp_holder); > + } > } > > /* > @@ -1141,7 +1180,7 @@ int osm_mcast_mgr_process_mgroups(osm_sm_t * sm) > cl_qlist_t *p_list = &sm->mgrp_list; > osm_switch_t *p_sw; > cl_qmap_t *p_sw_tbl; > - osm_mgrp_t *p_mgrp; > + osm_mgrp_holder_t *p_mgrp_holder; > ib_net16_t mlid; > osm_mcast_mgr_ctxt_t *ctx; > int ret = 0; > @@ -1169,24 +1208,25 @@ int osm_mcast_mgr_process_mgroups(osm_sm_t * sm) > > /* since we delayed the execution we prefer to pass the > mlid as the mgrp identifier and then find it or abort */ > - p_mgrp = osm_get_mgrp_by_mlid(sm->p_subn, mlid); > - if (!p_mgrp) > + p_mgrp_holder = osm_get_mgrp_holder_by_mlid(sm->p_subn, mlid); > + if (!p_mgrp_holder) > continue; > > /* if there was no change from the last time > * we processed the group we can skip doing anything > */ > - if (p_mgrp->last_change_id == p_mgrp->last_tree_id) { > + if (p_mgrp_holder->last_change_id == > + p_mgrp_holder->last_tree_id) { > OSM_LOG(sm->p_log, OSM_LOG_DEBUG, > - "Skip processing mgrp with lid:0x%X change id:%u\n", > - cl_ntoh16(mlid), p_mgrp->last_change_id); > + "Skip processing p_mgrp_holder with lid:0x%X change id:%u\n", > + cl_ntoh16(mlid), p_mgrp_holder->last_change_id); > continue; > } > > OSM_LOG(sm->p_log, OSM_LOG_DEBUG, > "Processing mgrp 
with lid:0x%X change id:%u\n", > - cl_ntoh16(mlid), p_mgrp->last_change_id); > - mcast_mgr_process_mgrp(sm, p_mgrp); > + cl_ntoh16(mlid), p_mgrp_holder->last_change_id); > + mcast_mgr_process_mgrp(sm, p_mgrp_holder); > } > > /* > diff --git a/opensm/opensm/osm_multicast.c b/opensm/opensm/osm_multicast.c > index d2733c4..072b591 100644 > --- a/opensm/opensm/osm_multicast.c > +++ b/opensm/opensm/osm_multicast.c > @@ -1,5 +1,5 @@ > /* > - * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. > + * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved. > * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. > * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. > * > @@ -48,6 +48,7 @@ > #include > #include > #include > +#include > > /********************************************************************** > **********************************************************************/ > @@ -67,8 +68,6 @@ void osm_mgrp_delete(IN osm_mgrp_t * p_mgrp) > (osm_mcm_port_t *) cl_qmap_next(&p_mcm_port->map_item); > osm_mcm_port_delete(p_mcm_port); > } > - /* destroy the mtree_node structure */ > - osm_mtree_destroy(p_mgrp->p_root); > > free(p_mgrp); > } > @@ -86,9 +85,6 @@ osm_mgrp_t *osm_mgrp_new(IN const ib_net16_t mlid) > memset(p_mgrp, 0, sizeof(*p_mgrp)); > cl_qmap_init(&p_mgrp->mcm_port_tbl); > p_mgrp->mlid = mlid; > - p_mgrp->last_change_id = 0; > - p_mgrp->last_tree_id = 0; > - p_mgrp->to_be_deleted = FALSE; > > return p_mgrp; > } > @@ -133,6 +129,7 @@ osm_mcm_port_t *osm_mgrp_add_port(IN osm_subn_t * subn, osm_log_t * log, > ib_net64_t port_guid; > osm_mcm_port_t *p_mcm_port; > cl_map_item_t *prev_item; > + osm_mgrp_holder_t *p_mgrp_holder; > uint8_t prev_join_state = 0; > uint8_t prev_scope; > > @@ -167,9 +164,18 @@ osm_mcm_port_t *osm_mgrp_add_port(IN osm_subn_t * subn, osm_log_t * log, > p_mcm_port->scope_state = > ib_member_set_scope_state(prev_scope, > prev_join_state | join_state); > - } else { > - /* track the fact we 
modified the group ports */
> - p_mgrp->last_change_id++;
> + }
> +
> + p_mgrp_holder = osm_get_mgrp_holder_by_mlid(subn, p_mgrp->mlid);
> + if (! p_mgrp_holder ||

Is this a legal case?

> + (IB_SUCCESS != osm_mgrp_holder_port_add_mgrp(p_mgrp_holder,
> + p_mgrp, port_guid)) ) {
> + /* if the above failed and added port is new one, remove port also from mcm_port_tbl */
> + if (! prev_join_state) {

I'm not following. Why is this condition needed for error cleanup?

> + cl_qmap_remove_item(&p_mgrp->mcm_port_tbl, &p_mcm_port->map_item);
> + osm_mcm_port_delete(p_mcm_port);
> + }
> + return NULL;
> }
>
> if ((join_state & IB_JOIN_STATE_FULL) &&
> @@ -212,7 +218,6 @@ int osm_mgrp_remove_port(osm_subn_t * subn, osm_log_t * log, osm_mgrp_t * mgrp,
> cl_ntoh64(mcm->port_gid.unicast.interface_id));
> osm_mcm_port_delete(mcm);
> /* track the fact we modified the group */
> - mgrp->last_change_id++;
> ret = 1;
> }
>
> @@ -285,16 +290,173 @@ static void mgrp_apply_func_sub(const osm_mgrp_t * p_mgrp,
>
> /**********************************************************************
> **********************************************************************/
> -void osm_mgrp_apply_func(const osm_mgrp_t * p_mgrp, osm_mgrp_func_t p_func,
> - void *context)

This function seems to be unused, and removing it is basically fine, but please do it in a separate patch, remove its prototypes from the header files, and clean up the associated compiler warnings.

> +static osm_mgrp_port_t *osm_mgrp_port_new(ib_net64_t port_guid) > +{ > + osm_mgrp_port_t *p_mgrp_port = > + (osm_mgrp_port_t *) malloc(sizeof(osm_mgrp_port_t)); > + if (!p_mgrp_port) { > + return NULL; > + } > + memset(p_mgrp_port, 0, sizeof(*p_mgrp_port)); > + p_mgrp_port->port_guid = port_guid; > + cl_qlist_init(&p_mgrp_port->mgroups); > + return p_mgrp_port; > +} > + > +/********************************************************************** > + **********************************************************************/ > +osm_mgrp_holder_t *osm_mgrp_holder_new(IN osm_subn_t * p_subn, > + ib_net16_t mlid) > { > - osm_mtree_node_t *p_mtn; > + osm_mgrp_holder_t *p_mgrp_holder; > + p_mgrp_holder = > + p_subn->mgroup_holders[cl_ntoh16(mlid) - IB_LID_MCAST_START_HO] = > + (osm_mgrp_holder_t *) malloc(sizeof(*p_mgrp_holder)); > + if (!p_mgrp_holder) > + return NULL; > > - CL_ASSERT(p_mgrp); > - CL_ASSERT(p_func); > + memset(p_mgrp_holder, 0, sizeof(*p_mgrp_holder)); > + p_mgrp_holder->mlid = mlid; > + cl_qmap_init(&p_mgrp_holder->mgrp_port_map); > + cl_qlist_init(&p_mgrp_holder->mgrp_list); > + return p_mgrp_holder; > +} > + > +/********************************************************************** > + **********************************************************************/ > +void osm_mgrp_holder_delete(IN osm_subn_t *p_subn, ib_net16_t mlid) > +{ > + osm_mgrp_port_t *p_osm_mgr_port; > + cl_map_item_t *p_item; > + > + osm_mgrp_holder_t *p_mgrp_holder = > + p_subn->mgroup_holders[cl_ntoh16(mlid) - IB_LID_MCAST_START_HO]; > + p_item = cl_qmap_head(&p_mgrp_holder->mgrp_port_map); > + /* Delete ports shared same MLID */ > + while (p_item != cl_qmap_end(&p_mgrp_holder->mgrp_port_map)) { > + p_osm_mgr_port = (osm_mgrp_port_t *) p_item; > + cl_qlist_remove_all(&p_osm_mgr_port->mgroups); > + cl_qmap_remove_item(&p_mgrp_holder->mgrp_port_map, p_item); > + p_item = cl_qmap_head(&p_mgrp_holder->mgrp_port_map); > + free(p_osm_mgr_port); > + } > + /* Remove mgrp from this MLID */ > 
+ cl_qlist_remove_all(&p_mgrp_holder->mgrp_list);
> + /* Destroy the mtree_node structure */
> + osm_mtree_destroy(p_mgrp_holder->p_root);
> + p_subn->mgroup_holders[cl_ntoh16(mlid) - IB_LID_MCAST_START_HO] = NULL;
> + free(p_mgrp_holder);
> +}

As already commented, I cannot see where those resources are freed (except at final OpenSM destruction).

> +
> +/**********************************************************************
> + **********************************************************************/
> +void osm_mgrp_holder_remove_port(osm_subn_t * subn, osm_log_t * p_log,
> + osm_mgrp_holder_t * p_mgrp_holder,
> + ib_net64_t port_guid)
> +{
> + osm_mgrp_t *p_mgrp;
> + cl_list_item_t *p_item;
> +
> + OSM_LOG_ENTER(p_log);
> +
> + osm_mgrp_port_t *p_mgrp_port = (osm_mgrp_port_t *)
> + cl_qmap_remove(&p_mgrp_holder->mgrp_port_map, port_guid);
> + if (p_mgrp_port !=
> + (osm_mgrp_port_t *) cl_qmap_end(&p_mgrp_holder->mgrp_port_map)) {
> + char gid_str[INET6_ADDRSTRLEN];
> + OSM_LOG(p_log, OSM_LOG_DEBUG,
> + "port 0x%" PRIx64 " removed from mlid 0x%X\n",
> + port_guid, cl_ntoh16(p_mgrp_holder->mlid));
> + while ((p_item =
> + cl_qlist_remove_head(&p_mgrp_port->mgroups)) !=
> + cl_qlist_end(&p_mgrp_port->mgroups)) {
> + p_mgrp = (osm_mgrp_t *)
> + PARENT_STRUCT(p_item, osm_mgrp_t,port_item);
> + OSM_LOG(p_log, OSM_LOG_DEBUG,
> + "removing mgrp mgid %s from port 0x%" PRIx64"\n",
> + inet_ntop(AF_INET6,p_mgrp->mcmember_rec.mgid.raw,
> + gid_str, sizeof(gid_str)),
> + cl_ntoh64(port_guid));
> + osm_mgrp_delete_port(subn, p_log, p_mgrp, port_guid);
> + }
> + free(p_mgrp_port);
> + }
> + OSM_LOG_EXIT(p_log);
> +}
>
> - p_mtn = p_mgrp->p_root;
> +/**********************************************************************
> + **********************************************************************/
> +void osm_mgrp_holder_add_mgrp(osm_mgrp_holder_t * p_mgrp_holder,
> + osm_mgrp_t * p_mgrp, osm_log_t * p_log)
> +{
> + char gid_str[INET6_ADDRSTRLEN];
> +
> + OSM_LOG_ENTER(p_log);
> + p_mgrp_holder->to_be_deleted = 0;
> + cl_qlist_insert_tail(&p_mgrp_holder->mgrp_list, &p_mgrp->mlid_item);
> + OSM_LOG(p_log, OSM_LOG_DEBUG,
> + "mgrp with MGID:%s added to holder with mlid = 0x%X\n",
> + inet_ntop(AF_INET6, p_mgrp->mcmember_rec.mgid.raw, gid_str,
> + sizeof(gid_str)), cl_ntoh16(p_mgrp_holder->mlid));
> + p_mgrp_holder->last_change_id++;
> + OSM_LOG_EXIT(p_log);
> +}
>
> - if (p_mtn)
> - mgrp_apply_func_sub(p_mgrp, p_mtn, p_func, context);
> +/**********************************************************************
> + **********************************************************************/
> +void osm_mgrp_holder_delete_mgrp(osm_mgrp_holder_t * p_mgrp_holder,
> + osm_mgrp_t * p_mgrp)
> +{
> + p_mgrp->to_be_deleted = 1;
> + cl_qlist_remove_item(&p_mgrp_holder->mgrp_list, &p_mgrp->mlid_item);
> + if (0 == cl_qlist_count(&p_mgrp_holder->mgrp_list)) {
> + /* No more mgroups on this mlid */
> + p_mgrp_holder->to_be_deleted = 1;
> + p_mgrp_holder->last_tree_id = 0;
> + p_mgrp_holder->last_change_id = 0;
> + }
> +}
> +
> +/**********************************************************************
> + **********************************************************************/
> +ib_api_status_t osm_mgrp_holder_port_add_mgrp(osm_mgrp_holder_t * p_mgrp_holder,
> + osm_mgrp_t * p_mgrp,
> + ib_net64_t port_guid)
> +{
> + osm_mgrp_port_t *p_mgrp_port = (osm_mgrp_port_t *)
> + cl_qmap_get(&p_mgrp_holder->mgrp_port_map, port_guid);
> + if (p_mgrp_port ==
> + (osm_mgrp_port_t *) cl_qmap_end(&p_mgrp_holder->mgrp_port_map)) {
> + /* new port to mlid */
> + p_mgrp_port = osm_mgrp_port_new(port_guid);
> + if (!p_mgrp_port) {
> + return IB_INSUFFICIENT_MEMORY;
> + }
> + cl_qmap_insert(&p_mgrp_holder->mgrp_port_map,
> + p_mgrp_port->port_guid, &p_mgrp_port->guid_item);
> + }
> + cl_qlist_insert_tail(&p_mgrp_port->mgroups, &p_mgrp->port_item);

This function (osm_mgrp_holder_port_add_mgrp()) is called from osm_mgrp_add_port() and, as far as I can see, will be executed not only on a "pure" port addition but also when the port's join state in the specified MC group is changing. If so, wouldn't this potential double addition of the MC group corrupt this list?

> + p_mgrp_holder->last_change_id++;

And for the same reason, wouldn't last_change_id be updated even when no ports were actually added to the MC group?

> + return IB_SUCCESS;
> +}
> +
> +/**********************************************************************
> + **********************************************************************/
> +void osm_mgrp_holder_port_delete_mgrp(osm_mgrp_holder_t * p_mgrp_holder,
> + osm_mgrp_t * p_mgrp,
> + ib_net64_t port_guid)
> +{
> + osm_mgrp_port_t *p_mgrp_port = (osm_mgrp_port_t *)
> + cl_qmap_get(&p_mgrp_holder->mgrp_port_map, port_guid);
> + if (p_mgrp_port !=
> + (osm_mgrp_port_t *) cl_qmap_end(&p_mgrp_holder->mgrp_port_map)) {
> + cl_qlist_remove_item(&p_mgrp_port->mgroups, &p_mgrp->port_item);
> + if (0 == cl_qlist_count(&p_mgrp_port->mgroups)) {
> + /* No mgroups registered on this port for current mlid */
> + cl_qmap_remove_item(&p_mgrp_holder->mgrp_port_map,
> + &p_mgrp_port->guid_item);
> + free(p_mgrp_port);
> + }
> + p_mgrp_holder->last_change_id++;
> + }
> }
> diff --git a/opensm/opensm/osm_qos_policy.c b/opensm/opensm/osm_qos_policy.c
> index 7826578..041377f 100644
> --- a/opensm/opensm/osm_qos_policy.c
> +++ b/opensm/opensm/osm_qos_policy.c
> @@ -1,5 +1,5 @@
> /*
> - * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved.
> + * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved.
> * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
> * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
> * Copyright (c) 2008 Xsigo Systems Inc. All rights reserved.
> @@ -785,7 +785,9 @@ static void __qos_policy_validate_pkey( > uint8_t sl; > uint32_t flow; > uint8_t hop; > + osm_mgrp_holder_t * p_mgrp_holder; > osm_mgrp_t * p_mgrp; > + cl_list_item_t *p_item; > > if (!p_qos_policy || !p_qos_match_rule || !p_prtn) > return; > @@ -809,31 +811,35 @@ static void __qos_policy_validate_pkey( > if (!p_prtn->mlid) > return; > > - p_mgrp = osm_get_mgrp_by_mlid(p_qos_policy->p_subn, p_prtn->mlid); > - if (!p_mgrp) { > + p_mgrp_holder = > + osm_get_mgrp_holder_by_mlid(p_qos_policy->p_subn, p_prtn->mlid); > + if (!p_mgrp_holder) { > OSM_LOG(&p_qos_policy->p_subn->p_osm->log, OSM_LOG_ERROR, > - "ERR AC16: MCast group for partition with " > - "pkey 0x%04X not found\n", > - cl_ntoh16(p_prtn->pkey)); > + "ERR AC16: MCast mgrp_holder for partition with pkey 0x%04X not found\n", > + cl_ntoh16(p_prtn->pkey)); > return; > } > > - CL_ASSERT((cl_ntoh16(p_mgrp->mcmember_rec.pkey) & 0x7fff) == > - (cl_ntoh16(p_prtn->pkey) & 0x7fff)); > - > - ib_member_get_sl_flow_hop(p_mgrp->mcmember_rec.sl_flow_hop, > - &sl, &flow, &hop); > - if (sl != p_prtn->sl) { > - OSM_LOG(&p_qos_policy->p_subn->p_osm->log, OSM_LOG_DEBUG, > + p_item = cl_qlist_head(&p_mgrp_holder->mgrp_list); > + while (p_item != cl_qlist_end(&p_mgrp_holder->mgrp_list)) { > + p_mgrp = (osm_mgrp_t *) PARENT_STRUCT(p_item, osm_mgrp_t, > + mlid_item); > + p_item = cl_qlist_next(p_item); > + CL_ASSERT((cl_ntoh16(p_mgrp->mcmember_rec.pkey) & 0x7fff) == > + (cl_ntoh16(p_prtn->pkey) & 0x7fff)); > + ib_member_get_sl_flow_hop(p_mgrp->mcmember_rec.sl_flow_hop, > + &sl, &flow, &hop); > + if (sl != p_prtn->sl) { > + OSM_LOG(&p_qos_policy->p_subn->p_osm->log, OSM_LOG_DEBUG, > "Updating MCGroup (MLID 0x%04x) SL to " > "match partition SL (%u)\n", > cl_hton16(p_mgrp->mcmember_rec.mlid), > p_prtn->sl); > - p_mgrp->mcmember_rec.sl_flow_hop = > - ib_member_set_sl_flow_hop(p_prtn->sl, flow, hop); > + p_mgrp->mcmember_rec.sl_flow_hop = > + ib_member_set_sl_flow_hop(p_prtn->sl, flow, hop); > + } > } > } > - > 
/*************************************************** > ***************************************************/ > > diff --git a/opensm/opensm/osm_sa.c b/opensm/opensm/osm_sa.c > index fcc3f27..22dd495 100644 > --- a/opensm/opensm/osm_sa.c > +++ b/opensm/opensm/osm_sa.c > @@ -1,5 +1,5 @@ > /* > - * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. > + * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved. > * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. > * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. > * Copyright (c) 2008 Xsigo Systems Inc. All rights reserved. > @@ -706,17 +706,15 @@ static void sa_dump_all_sa(osm_opensm_t * p_osm, FILE * file) > { > struct opensm_dump_context dump_context; > osm_mgrp_t *p_mgrp; > - int i; > > dump_context.p_osm = p_osm; > dump_context.file = file; > OSM_LOG(&p_osm->log, OSM_LOG_DEBUG, "Dump multicast\n"); > cl_plock_acquire(&p_osm->lock); > - for (i = 0; i <= p_osm->subn.max_mcast_lid_ho - IB_LID_MCAST_START_HO; > - i++) { > - p_mgrp = p_osm->subn.mgroups[i]; > - if (p_mgrp) > - sa_dump_one_mgrp(p_mgrp, &dump_context); > + p_mgrp = (osm_mgrp_t*)cl_fmap_head(&p_osm->subn.mgrp_mgid_tbl); > + while (p_mgrp != (osm_mgrp_t*)cl_fmap_end(&p_osm->subn.mgrp_mgid_tbl)) { > + sa_dump_one_mgrp(p_mgrp, &dump_context); > + p_mgrp = (osm_mgrp_t*) cl_fmap_next(&p_mgrp->map_item); > } > OSM_LOG(&p_osm->log, OSM_LOG_DEBUG, "Dump inform\n"); > cl_qlist_apply_func(&p_osm->subn.sa_infr_list, > @@ -740,23 +738,16 @@ static osm_mgrp_t *load_mcgroup(osm_opensm_t * p_osm, ib_net16_t mlid, > unsigned well_known) > { > ib_net64_t comp_mask; > - osm_mgrp_t *p_mgrp; > > + cl_fmap_item_t *p_fitem; > + osm_mgrp_t *p_mgrp = NULL; > cl_plock_excl_acquire(&p_osm->lock); > > - p_mgrp = osm_get_mgrp_by_mlid(&p_osm->subn, mlid); > - if (p_mgrp) { > - if (!memcmp(&p_mgrp->mcmember_rec.mgid, &p_mcm_rec->mgid, > - sizeof(ib_gid_t))) { > - OSM_LOG(&p_osm->log, OSM_LOG_DEBUG, > - "mgrp %04x is already here.", 
cl_ntoh16(mlid)); > + p_fitem = cl_fmap_get(&p_osm->subn.mgrp_mgid_tbl, &p_mcm_rec->mgid); > + if (p_fitem != cl_fmap_end(&p_osm->subn.mgrp_mgid_tbl)) { > + OSM_LOG(&p_osm->log, OSM_LOG_DEBUG, > + "mgrp %04x is already here.", cl_ntoh16(mlid)); > goto _out; > - } > - OSM_LOG(&p_osm->log, OSM_LOG_VERBOSE, > - "mlid %04x is already used by another MC group. Will " > - "request clients reregistration.\n", cl_ntoh16(mlid)); > - p_mgrp = NULL; > - goto _out; > } > > comp_mask = IB_MCR_COMPMASK_MTU | IB_MCR_COMPMASK_MTU_SEL > diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c > index a9e0a3b..3838a08 100644 > --- a/opensm/opensm/osm_sa_mcmember_record.c > +++ b/opensm/opensm/osm_sa_mcmember_record.c > @@ -1,5 +1,5 @@ > /* > - * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. > + * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved. > * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. > * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. > * Copyright (c) 2008 Xsigo Systems Inc. All rights reserved. > @@ -121,14 +121,17 @@ static ib_net16_t get_new_mlid(osm_sa_t * sa, ib_net16_t requested_mlid) > > if (requested_mlid && cl_ntoh16(requested_mlid) >= IB_LID_MCAST_START_HO > && cl_ntoh16(requested_mlid) <= p_subn->max_mcast_lid_ho > - && !osm_get_mgrp_by_mlid(p_subn, requested_mlid)) > + && !osm_get_mgrp_holder_by_mlid(p_subn, requested_mlid)) > return requested_mlid; > > max = p_subn->max_mcast_lid_ho - IB_LID_MCAST_START_HO + 1; > for (i = 0; i < max; i++) { > - osm_mgrp_t *p_mgrp = sa->p_subn->mgroups[i]; > - if (!p_mgrp || p_mgrp->to_be_deleted) > - return cl_hton16(i + IB_LID_MCAST_START_HO); > + osm_mgrp_holder_t *p_mgrp_holder = sa->p_subn->mgroup_holders[i]; > + if (!p_mgrp_holder || p_mgrp_holder->to_be_deleted) { > + OSM_LOG(sa->p_log, OSM_LOG_DEBUG, "returning mgrp_holder to_be_deleted =%d\n", > + p_mgrp_holder ? 
p_mgrp_holder->to_be_deleted : 0);

Is such a debug message needed permanently (for more than development use)? I think we should avoid adding new debug messages when possible (as well as clean up existing ones): the OpenSM debug log is overflowing with junk and is almost unreadable now.

> + return cl_hton16(i + IB_LID_MCAST_START_HO);
> + }
> }
>
> return 0;
> @@ -146,8 +149,9 @@ static void cleanup_mgrp(IN osm_sa_t * sa, osm_mgrp_t * mgrp)
> /* Remove MGRP only if osm_mcm_port_t count is 0 and
> not a well known group */
> if (cl_is_qmap_empty(&mgrp->mcm_port_tbl) && !mgrp->well_known) {
> - sa->p_subn->mgroups[cl_ntoh16(mgrp->mlid) -
> - IB_LID_MCAST_START_HO] = NULL;
> + osm_mgrp_holder_t *p_mgrp_holder =
> + osm_get_mgrp_holder_by_mlid(sa->p_subn, mgrp->mlid);
> + osm_mgrp_holder_delete_mgrp(p_mgrp_holder, mgrp);
> cl_fmap_remove_item(&sa->p_subn->mgrp_mgid_tbl,
> &mgrp->map_item);
> osm_mgrp_delete(mgrp);
> @@ -802,19 +806,19 @@ static boolean_t mgrp_request_is_realizable(IN osm_sa_t * sa,
> Call this function to create a new mgrp.
> **********************************************************************/
> ib_api_status_t osm_mcmr_rcv_create_new_mgrp(IN osm_sa_t * sa,
> - IN ib_net64_t comp_mask,
> - IN const ib_member_rec_t *
> - const p_recvd_mcmember_rec,
> - IN const osm_physp_t * p_physp,
> - OUT osm_mgrp_t ** pp_mgrp)
> + IN ib_net64_t comp_mask,
> + IN const ib_member_rec_t *
> + const p_recvd_mcmember_rec,
> + IN const osm_physp_t * p_physp,
> + OUT osm_mgrp_t ** pp_mgrp)
> {
> - ib_net16_t mlid;
> + ib_net16_t mlid, existed_mlid;
> unsigned zero_mgid, i;
> uint8_t scope;
> ib_gid_t *p_mgid;
> - osm_mgrp_t *p_prev_mgrp;
> ib_api_status_t status = IB_SUCCESS;
> ib_member_rec_t mcm_rec = *p_recvd_mcmember_rec; /* copy for modifications */
> + osm_mgrp_holder_t * p_mgrp_holder;
>
> OSM_LOG_ENTER(sa->p_log);
>
> @@ -890,6 +894,15 @@ ib_api_status_t osm_mcmr_rcv_create_new_mgrp(IN osm_sa_t * sa,
> goto Exit;
> }
>
> + if (0 != (existed_mlid = osm_mgrp_holder_get_mlid_by_mgid(sa->p_subn, p_mgid))) {
> + char gid_str[INET6_ADDRSTRLEN];
> + mlid = existed_mlid;
> + OSM_LOG(sa->p_log, OSM_LOG_DEBUG,
> + "found existed mlid 0x%04x for mgid %s\n",
> + cl_ntoh16(mlid), inet_ntop(AF_INET6, p_mgid->raw,
> + gid_str, sizeof gid_str));

This debug message is pure duplication.

> + } > + > /* create a new MC Group */ > *pp_mgrp = osm_mgrp_new(mlid); > if (*pp_mgrp == NULL) { > @@ -914,25 +927,26 @@ ib_api_status_t osm_mcmr_rcv_create_new_mgrp(IN osm_sa_t * sa, > > /* Insert the new group in the data base */ > > - /* since we might have an old group by that mlid > - one whose deletion was delayed for an idle time > - we need to deallocate it first */ > - p_prev_mgrp = osm_get_mgrp_by_mlid(sa->p_subn, mlid); > - if (p_prev_mgrp) { > + > + p_mgrp_holder = osm_get_mgrp_holder_by_mlid(sa->p_subn, mlid); > + if (!p_mgrp_holder) { > OSM_LOG(sa->p_log, OSM_LOG_DEBUG, > - "Found previous group for mlid:0x%04x - " > - "Destroying it first\n", cl_ntoh16(mlid)); > - sa->p_subn->mgroups[cl_ntoh16(mlid) - IB_LID_MCAST_START_HO] = > - NULL; > - cl_fmap_remove_item(&sa->p_subn->mgrp_mgid_tbl, > - &p_prev_mgrp->map_item); > - osm_mgrp_delete(p_prev_mgrp); > + "Creating new mgrp_holder for mlid:0x%04x\n", > + cl_ntoh16(mlid)); > + p_mgrp_holder = osm_mgrp_holder_new(sa->p_subn, mlid); > } > > + if (!p_mgrp_holder) { > + OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 1B08: " > + "osm_mgrp_holder_new failed\n"); > + free_mlid(sa, mlid); > + status = IB_INSUFFICIENT_MEMORY; > + goto Exit; > + } > cl_fmap_insert(&sa->p_subn->mgrp_mgid_tbl, > &(*pp_mgrp)->mcmember_rec.mgid, &(*pp_mgrp)->map_item); > > - sa->p_subn->mgroups[cl_ntoh16(mlid) - IB_LID_MCAST_START_HO] = *pp_mgrp; > + osm_mgrp_holder_add_mgrp(p_mgrp_holder, *pp_mgrp, sa->p_log); > > Exit: > OSM_LOG_EXIT(sa->p_log); > @@ -1074,7 +1088,7 @@ static void mcmr_rcv_leave_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) > CL_PLOCK_RELEASE(sa->p_lock); > > /* we can leave if port was deleted from MCG */ > - if (removed && osm_sm_mcgrp_leave(sa->sm, mlid, portguid)) > + if (removed && osm_sm_mcgrp_leave(sa->sm, p_mgrp, portguid)) > OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 1B09: " > "osm_sm_mcgrp_leave failed\n"); > > @@ -1102,6 +1116,7 @@ static void mcmr_rcv_join_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) > 
osm_physp_t *p_request_physp; > uint8_t is_new_group; /* TRUE = there is a need to create a group */ > uint8_t join_state; > + osm_mgrp_holder_t *p_mgrp_holder; > > OSM_LOG_ENTER(sa->p_log); > > @@ -1275,6 +1290,8 @@ static void mcmr_rcv_join_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) > goto Exit; > } > > + p_mgrp_holder = osm_get_mgrp_holder_by_mlid(sa->p_subn, mlid); > + CL_ASSERT(p_mgrp_holder); > /* create or update existing port (join-state will be updated) */ > status = add_new_mgrp_port(sa, p_mgrp, p_recvd_mcmember_rec, > osm_madw_get_mad_addr_ptr(p_madw), > @@ -1282,6 +1299,8 @@ static void mcmr_rcv_join_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) > > if (status != IB_SUCCESS) { > /* we fail to add the port so we might need to delete the group */ > + osm_mgrp_holder_port_delete_mgrp(p_mgrp_holder, p_mgrp, > + p_recvd_mcmember_rec->port_gid.unicast.interface_id); > cleanup_mgrp(sa, p_mgrp); > > CL_PLOCK_RELEASE(sa->p_lock); > @@ -1304,7 +1323,7 @@ static void mcmr_rcv_join_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) > /* do the actual routing (actually schedule the update) */ > status = osm_sm_mcgrp_join(sa->sm, mlid, > p_recvd_mcmember_rec->port_gid.unicast. > - interface_id); > + interface_id, &p_recvd_mcmember_rec->mgid); > > if (status != IB_SUCCESS) { > OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 1B14: " > @@ -1315,9 +1334,10 @@ static void mcmr_rcv_join_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) > CL_PLOCK_EXCL_ACQUIRE(sa->p_lock); > > /* the request for routing failed so we need to remove the port */ > + osm_mgrp_holder_port_delete_mgrp(p_mgrp_holder, p_mgrp, > + p_recvd_mcmember_rec->port_gid.unicast.interface_id); > osm_mgrp_delete_port(sa->p_subn, sa->p_log, p_mgrp, > - p_recvd_mcmember_rec->port_gid. 
> - unicast.interface_id); > + p_recvd_mcmember_rec->port_gid.unicast.interface_id); > cleanup_mgrp(sa, p_mgrp); > CL_PLOCK_RELEASE(sa->p_lock); > osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RESOURCES); > @@ -1549,7 +1569,6 @@ static void mcmr_query_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) > osm_physp_t *p_req_physp; > boolean_t trusted_req; > osm_mgrp_t *p_mgrp; > - int i; > > OSM_LOG_ENTER(sa->p_log); > > @@ -1578,12 +1597,11 @@ static void mcmr_query_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) > CL_PLOCK_ACQUIRE(sa->p_lock); > > /* simply go over all MCGs and match */ > - for (i = 0; i <= sa->p_subn->max_mcast_lid_ho - IB_LID_MCAST_START_HO; > - i++) { > - p_mgrp = sa->p_subn->mgroups[i]; > - if (p_mgrp) > - mcmr_by_comp_mask(sa, p_rcvd_rec, comp_mask, p_mgrp, > - p_req_physp, trusted_req, &rec_list); > + p_mgrp = (osm_mgrp_t *) cl_fmap_head(&sa->p_subn->mgrp_mgid_tbl); > + while (p_mgrp != (osm_mgrp_t *) cl_fmap_end(&sa->p_subn->mgrp_mgid_tbl)) { > + mcmr_by_comp_mask(sa, p_rcvd_rec, comp_mask, p_mgrp, > + p_req_physp, trusted_req, &rec_list); > + p_mgrp = (osm_mgrp_t *) cl_fmap_next(&p_mgrp->map_item); > } > > CL_PLOCK_RELEASE(sa->p_lock); > diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c > index 75d9516..aa63d78 100644 > --- a/opensm/opensm/osm_sa_path_record.c > +++ b/opensm/opensm/osm_sa_path_record.c > @@ -1,5 +1,5 @@ > /* > - * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. > + * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved. > * Copyright (c) 2002-2006 Mellanox Technologies LTD. All rights reserved. > * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. > * Copyright (c) 2008 Xsigo Systems Inc. All rights reserved. 
> @@ -1468,11 +1468,14 @@ static osm_mgrp_t *pr_get_mgrp(IN osm_sa_t * sa, IN const osm_madw_t * p_madw) > mgrp = NULL; > goto Exit; > } > - } else > - if (!(mgrp = osm_get_mgrp_by_mlid(sa->p_subn, p_pr->dlid))) > - OSM_LOG(sa->p_log, OSM_LOG_ERROR, > - "ERR 1F11: " "No MC group found for PathRecord " > + } else { > + mgrp = osm_get_mgrp_by_mgid(sa, &p_pr->dgid); Such replacement seems wrong to me - it is under block where SA PR compmask has DLID (not DGID) bit on. > + if (!mgrp) > + OSM_LOG(sa->p_log, OSM_LOG_ERROR, > + "ERR 1F11: " > + "No MC group found for PathRecord " > "destination LID 0x%x\n", p_pr->dlid); > + } > } > > Exit: > diff --git a/opensm/opensm/osm_sm.c b/opensm/opensm/osm_sm.c > index b3ce69a..d990450 100644 > --- a/opensm/opensm/osm_sm.c > +++ b/opensm/opensm/osm_sm.c > @@ -1,5 +1,5 @@ > /* > - * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. > + * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved. > * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. > * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. > * Copyright (c) 2008 Xsigo Systems Inc. All rights reserved. > @@ -47,6 +47,7 @@ > > #include > #include > +#include > #include > #include > #include > @@ -468,12 +469,15 @@ static ib_api_status_t sm_mgrp_process(IN osm_sm_t * p_sm, > /********************************************************************** > **********************************************************************/ > ib_api_status_t osm_sm_mcgrp_join(IN osm_sm_t * p_sm, IN const ib_net16_t mlid, > - IN const ib_net64_t port_guid) > + IN const ib_net64_t port_guid, > + IN const ib_gid_t * p_mgid) Any reason to have 'mlid' as parameter for this function after adding MGID there? 
> { > - osm_mgrp_t *p_mgrp; > + osm_mgrp_t *p_mgrp = NULL; > osm_port_t *p_port; > ib_api_status_t status = IB_SUCCESS; > osm_mcm_info_t *p_mcm; > + cl_list_item_t *p_item; > + osm_mgrp_holder_t *p_mgrp_holder; > > OSM_LOG_ENTER(p_sm->p_log); > > @@ -497,8 +501,44 @@ ib_api_status_t osm_sm_mcgrp_join(IN osm_sm_t * p_sm, IN const ib_net16_t mlid, > /* > * If this multicast group does not already exist, create it. > */ > - p_mgrp = osm_get_mgrp_by_mlid(p_sm->p_subn, mlid); > - if (!p_mgrp || !osm_mgrp_is_guid(p_mgrp, port_guid)) { > + p_mgrp_holder = osm_get_mgrp_holder_by_mlid(p_sm->p_subn, mlid); > + if (p_mgrp_holder) { > + char gid_str[INET6_ADDRSTRLEN]; > + if (TRUE) { > + size_t gr_count = cl_qlist_count(&p_mgrp_holder->mgrp_list); > + OSM_LOG(p_sm->p_log, OSM_LOG_DEBUG, > + "mlid 0x%X has %lu mgroups\n", cl_ntoh16(mlid), gr_count); > + if (gr_count) { > + p_item = > + cl_qlist_head(&p_mgrp_holder->mgrp_list); > + while (p_item != > + cl_qlist_end(&p_mgrp_holder->mgrp_list)) { > + p_mgrp = (osm_mgrp_t *) > + PARENT_STRUCT(p_item, osm_mgrp_t, > + mlid_item); > + OSM_LOG(p_sm->p_log, OSM_LOG_DEBUG, > + "mlid 0x%X has mgrp with MGID: %s\n", > + cl_ntoh16(mlid), > + inet_ntop(AF_INET6, > + p_mgrp->mcmember_rec. > + mgid.raw, gid_str, > + sizeof gid_str)); > + p_item = cl_qlist_next(p_item); > + } > + } > + } What does this 'if (TRUE) ...' block do? You overwrite the resolved p_mgrp in the line below. > + p_mgrp = (osm_mgrp_t *)cl_fmap_get(&p_sm->p_subn->mgrp_mgid_tbl, p_mgid); > + if (p_mgrp == (osm_mgrp_t *)cl_fmap_end(&p_sm->p_subn->mgrp_mgid_tbl)) { > + p_mgrp = NULL; > + OSM_LOG(p_sm->p_log, OSM_LOG_ERROR, > + "group with MGID: %s not found on mlid 0x%X\n", > + inet_ntop(AF_INET6, > + p_mgid->raw, > + gid_str, sizeof gid_str), > + cl_ntoh16(mlid)); > + } > + } > + if (!p_mgrp_holder || !p_mgrp || !osm_mgrp_is_guid(p_mgrp, port_guid)) { > /* > * The group removed or the port is not a > * member of the group, then fail immediately. 
> @@ -513,6 +553,22 @@ ib_api_status_t osm_sm_mcgrp_join(IN osm_sm_t * p_sm, IN const ib_net16_t mlid, > goto Exit; > } > > + /* if there was no change from the last time > + * we processed the group we can skip doing anything > + */ > + if (p_mgrp_holder->last_change_id == p_mgrp_holder->last_tree_id) { > + OSM_LOG(p_sm->p_log, OSM_LOG_VERBOSE, > + "Skip processing mgrp holder with lid:0x%X last change id:%u\n", > + cl_ntoh16(mlid), p_mgrp_holder->last_change_id); > + goto Exit; > + } else { > + OSM_LOG(p_sm->p_log, OSM_LOG_DEBUG, > + "processing mgrp holder with lid:0x%X port: 0x%016" > + PRIx64 " last change id:%u tree id:%u\n", > + cl_ntoh16(mlid), cl_ntoh64(port_guid), > + p_mgrp_holder->last_change_id, > + p_mgrp_holder->last_tree_id); > + } How is this part related to the patch? (And do you need the 'holder' struct in this function for any other reason?) It looks like this is part of Eli's patch where he tried to fix the MCG join/leave bug; that patch still has some issues, was commented on, and no new version has been received yet. Sasha > /* > * Check if the object (according to mlid) already exists on this port. > * If it does - then no need to update it again, and no need to > @@ -549,12 +605,13 @@ Exit: > > /********************************************************************** > **********************************************************************/ > -ib_api_status_t osm_sm_mcgrp_leave(IN osm_sm_t * p_sm, IN const ib_net16_t mlid, > +ib_api_status_t osm_sm_mcgrp_leave(IN osm_sm_t * p_sm, IN osm_mgrp_t * p_mgrp, > IN const ib_net64_t port_guid) > { > - osm_mgrp_t *p_mgrp; > osm_port_t *p_port; > ib_api_status_t status; > + osm_mgrp_holder_t *p_mgrp_holder; > + ib_net16_t mlid = p_mgrp->mlid; > > OSM_LOG_ENTER(p_sm->p_log); > > @@ -577,21 +634,25 @@ ib_api_status_t osm_sm_mcgrp_leave(IN osm_sm_t * p_sm, IN const ib_net16_t mlid, > } > > /* > - * Get the multicast group object for this group. > + * Get the multicast group holder object for this group. 
> */ > - p_mgrp = osm_get_mgrp_by_mlid(p_sm->p_subn, mlid); > - if (!p_mgrp) { > + p_mgrp_holder = osm_get_mgrp_holder_by_mlid(p_sm->p_subn, mlid); > + if (!p_mgrp_holder) { > OSM_LOG(p_sm->p_log, OSM_LOG_ERROR, "ERR 2E08: " > "No multicast group for MLID 0x%X\n", cl_ntoh16(mlid)); > status = IB_INVALID_PARAMETER; > goto Exit; > } > > + osm_mgrp_holder_port_delete_mgrp(p_mgrp_holder, p_mgrp, port_guid); > /* > * Walk the list of ports in the group, and remove the appropriate one. > */ > osm_port_remove_mgrp(p_port, mlid); > > + OSM_LOG(p_sm->p_log, OSM_LOG_DEBUG, > + " Calling sm_mgrp_process for mgrp with mlid = 0x%X\n", > + cl_ntoh16(mlid)); > status = sm_mgrp_process(p_sm, p_mgrp); > Exit: > CL_PLOCK_RELEASE(p_sm->p_lock); > diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c > index 0d11811..6ed95d4 100644 > --- a/opensm/opensm/osm_subnet.c > +++ b/opensm/opensm/osm_subnet.c > @@ -1,5 +1,5 @@ > /* > - * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. > + * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved. > * Copyright (c) 2002-2008 Mellanox Technologies LTD. All rights reserved. > * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. > * Copyright (c) 2008 Xsigo Systems Inc. All rights reserved. 
> @@ -428,8 +428,9 @@ void osm_subn_destroy(IN osm_subn_t * const p_subn) > osm_switch_t *p_sw, *p_next_sw; > osm_remote_sm_t *p_rsm, *p_next_rsm; > osm_prtn_t *p_prtn, *p_next_prtn; > - osm_mgrp_t *p_mgrp; > + osm_mgrp_holder_t *p_mgrp_holder; > osm_infr_t *p_infr, *p_next_infr; > + osm_mgrp_t *p_mgrp; > > /* it might be a good idea to de-allocate all known objects */ > p_next_node = (osm_node_t *) cl_qmap_head(&p_subn->node_guid_tbl); > @@ -471,14 +472,20 @@ void osm_subn_destroy(IN osm_subn_t * const p_subn) > osm_prtn_delete(&p_prtn); > } > > - cl_fmap_remove_all(&p_subn->mgrp_mgid_tbl); > > for (i = 0; i <= p_subn->max_mcast_lid_ho - IB_LID_MCAST_START_HO; > i++) { > - p_mgrp = p_subn->mgroups[i]; > - p_subn->mgroups[i] = NULL; > - if (p_mgrp) > - osm_mgrp_delete(p_mgrp); > + p_mgrp_holder = p_subn->mgroup_holders[i]; > + if (p_mgrp_holder){ > + osm_mgrp_holder_delete(p_subn, p_mgrp_holder->mlid); > + } > + } > + > + p_mgrp = (osm_mgrp_t*)cl_fmap_head(&p_subn->mgrp_mgid_tbl); > + while (p_mgrp != (osm_mgrp_t*)cl_fmap_end(&p_subn->mgrp_mgid_tbl)) { > + cl_fmap_remove_item(&p_subn->mgrp_mgid_tbl, (cl_fmap_item_t*)p_mgrp); > + osm_mgrp_delete(p_mgrp); > + p_mgrp = (osm_mgrp_t*)cl_fmap_head(&p_subn->mgrp_mgid_tbl); > } > > p_next_infr = (osm_infr_t *) cl_qlist_head(&p_subn->sa_infr_list); > @@ -1646,3 +1653,13 @@ int osm_subn_write_conf_file(char *file_name, IN osm_subn_opt_t *const p_opts) > > return 0; > } > + > +ib_net16_t osm_mgrp_holder_get_mlid_by_mgid(IN osm_subn_t const *p_subn, > + IN const ib_gid_t * const p_mgid) > +{ > + osm_mgrp_t *p_mgrp = (osm_mgrp_t*)cl_fmap_get(&p_subn->mgrp_mgid_tbl, p_mgid); > + if (p_mgrp != (osm_mgrp_t*)cl_fmap_end(&p_subn->mgrp_mgid_tbl)) { > + return p_mgrp->mlid; > + } > + return 0; > +} > -- > 1.6.3.3 > From sashak at voltaire.com Sun Sep 6 08:49:01 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 6 Sep 2009 18:49:01 +0300 Subject: [ofa-general] [PATCH] opensm: use mgrp pointer in port mcm_info Message-ID: 
<20090906154901.GF25241@me> A port needs access to the multicast groups it has joined. Currently this is implemented by keeping a list of mcm_info elements, each of which stores the MLID of a multicast group. This assumes a one-to-one MGID to MLID mapping. This patch changes mcm_info to store a pointer to the multicast group object (mgrp) instead of the MLID, which makes it possible to compress several MGIDs onto one MLID. Signed-off-by: Sasha Khapyorsky --- opensm/include/opensm/osm_mcm_info.h | 13 +++++++------ opensm/include/opensm/osm_port.h | 13 +++++++------ opensm/opensm/osm_drop_mgr.c | 10 +++------- opensm/opensm/osm_mcm_info.c | 8 ++++---- opensm/opensm/osm_port.c | 10 +++++----- opensm/opensm/osm_sm.c | 6 +++--- 6 files changed, 29 insertions(+), 31 deletions(-) diff --git a/opensm/include/opensm/osm_mcm_info.h b/opensm/include/opensm/osm_mcm_info.h index dec607f..62ae326 100644 --- a/opensm/include/opensm/osm_mcm_info.h +++ b/opensm/include/opensm/osm_mcm_info.h @@ -47,6 +47,7 @@ #include #include #include +#include #ifdef __cplusplus # define BEGIN_C_DECLS extern "C" { @@ -73,15 +74,15 @@ BEGIN_C_DECLS */ typedef struct osm_mcm_info { cl_list_item_t list_item; - ib_net16_t mlid; + osm_mgrp_t *mgrp; } osm_mcm_info_t; /* * FIELDS * list_item * Linkage structure for cl_qlist. MUST BE FIRST MEMBER! * -* mlid -* MLID of this multicast group. +* mgrp +* The pointer to multicast group where this port is member of * * SEE ALSO *********/ @@ -95,11 +96,11 @@ typedef struct osm_mcm_info { * * SYNOPSIS */ -osm_mcm_info_t *osm_mcm_info_new(IN const ib_net16_t mlid); +osm_mcm_info_t *osm_mcm_info_new(IN osm_mgrp_t *mgrp); /* * PARAMETERS -* mlid -* [in] MLID value for this multicast group. +* mgrp +* [in] the pointer to multicast group. * * RETURN VALUES * Pointer to an initialized tree node. 
diff --git a/opensm/include/opensm/osm_port.h b/opensm/include/opensm/osm_port.h index 7079e74..0e0d3d2 100644 --- a/opensm/include/opensm/osm_port.h +++ b/opensm/include/opensm/osm_port.h @@ -65,6 +65,7 @@ BEGIN_C_DECLS */ struct osm_port; struct osm_node; +struct osm_mgrp; /****h* OpenSM/Physical Port * NAME @@ -1420,14 +1421,14 @@ osm_get_port_by_base_lid(IN const osm_subn_t * const p_subn, * SYNOPSIS */ ib_api_status_t -osm_port_add_mgrp(IN osm_port_t * const p_port, IN const ib_net16_t mlid); +osm_port_add_mgrp(IN osm_port_t * const p_port, IN struct osm_mgrp *mgrp); /* * PARAMETERS * p_port * [in] Pointer to an osm_port_t object. * -* mlid -* [in] MLID of the multicast group. +* mgrp +* [in] Pointer to the multicast group. * * RETURN VALUES * IB_SUCCESS @@ -1449,14 +1450,14 @@ osm_port_add_mgrp(IN osm_port_t * const p_port, IN const ib_net16_t mlid); * SYNOPSIS */ void -osm_port_remove_mgrp(IN osm_port_t * const p_port, IN const ib_net16_t mlid); +osm_port_remove_mgrp(IN osm_port_t * const p_port, IN struct osm_mgrp *mgrp); /* * PARAMETERS * p_port * [in] Pointer to an osm_port_t object. * -* mlid -* [in] MLID of the multicast group. +* mgrp +* [in] Pointer to the multicast group. * * RETURN VALUES * None. 
diff --git a/opensm/opensm/osm_drop_mgr.c b/opensm/opensm/osm_drop_mgr.c index c9a4f33..4891bb8 100644 --- a/opensm/opensm/osm_drop_mgr.c +++ b/opensm/opensm/osm_drop_mgr.c @@ -158,7 +158,6 @@ static void drop_mgr_remove_port(osm_sm_t * sm, IN osm_port_t * p_port) osm_port_t *p_port_check; cl_qmap_t *p_sm_guid_tbl; osm_mcm_info_t *p_mcm; - osm_mgrp_t *p_mgrp; cl_ptr_vector_t *p_port_lid_tbl; uint16_t min_lid_ho; uint16_t max_lid_ho; @@ -212,12 +211,9 @@ static void drop_mgr_remove_port(osm_sm_t * sm, IN osm_port_t * p_port) p_mcm = (osm_mcm_info_t *) cl_qlist_remove_head(&p_port->mcm_list); while (p_mcm != (osm_mcm_info_t *) cl_qlist_end(&p_port->mcm_list)) { - p_mgrp = osm_get_mgrp_by_mlid(sm->p_subn, p_mcm->mlid); - if (p_mgrp) { - osm_mgrp_delete_port(sm->p_subn, sm->p_log, - p_mgrp, p_port->guid); - osm_mcm_info_delete((osm_mcm_info_t *) p_mcm); - } + osm_mgrp_delete_port(sm->p_subn, sm->p_log, p_mcm->mgrp, + p_port->guid); + osm_mcm_info_delete(p_mcm); p_mcm = (osm_mcm_info_t *) cl_qlist_remove_head(&p_port->mcm_list); } diff --git a/opensm/opensm/osm_mcm_info.c b/opensm/opensm/osm_mcm_info.c index 0325a34..c07c70b 100644 --- a/opensm/opensm/osm_mcm_info.c +++ b/opensm/opensm/osm_mcm_info.c @@ -49,17 +49,17 @@ /********************************************************************** **********************************************************************/ -osm_mcm_info_t *osm_mcm_info_new(IN const ib_net16_t mlid) +osm_mcm_info_t *osm_mcm_info_new(IN osm_mgrp_t *mgrp) { osm_mcm_info_t *p_mcm; - p_mcm = (osm_mcm_info_t *) malloc(sizeof(*p_mcm)); + p_mcm = malloc(sizeof(*p_mcm)); if (p_mcm) { memset(p_mcm, 0, sizeof(*p_mcm)); - p_mcm->mlid = mlid; + p_mcm->mgrp = mgrp; } - return (p_mcm); + return p_mcm; } /********************************************************************** diff --git a/opensm/opensm/osm_port.c b/opensm/opensm/osm_port.c index 751c0f0..3470381 100644 --- a/opensm/opensm/osm_port.c +++ b/opensm/opensm/osm_port.c @@ -223,12 +223,12 @@ Found: 
/********************************************************************** **********************************************************************/ -ib_api_status_t osm_port_add_mgrp(IN osm_port_t * p_port, IN ib_net16_t mlid) +ib_api_status_t osm_port_add_mgrp(IN osm_port_t * p_port, IN osm_mgrp_t *mgrp) { ib_api_status_t status = IB_SUCCESS; osm_mcm_info_t *p_mcm; - p_mcm = osm_mcm_info_new(mlid); + p_mcm = osm_mcm_info_new(mgrp); if (p_mcm) cl_qlist_insert_tail(&p_port->mcm_list, (cl_list_item_t *) p_mcm); @@ -243,7 +243,7 @@ ib_api_status_t osm_port_add_mgrp(IN osm_port_t * p_port, IN ib_net16_t mlid) static cl_status_t port_mgrp_find_func(IN const cl_list_item_t * p_list_item, IN void *context) { - if (*((ib_net16_t *) context) == ((osm_mcm_info_t *) p_list_item)->mlid) + if (context == ((osm_mcm_info_t *) p_list_item)->mgrp) return CL_SUCCESS; else return CL_NOT_FOUND; @@ -251,12 +251,12 @@ static cl_status_t port_mgrp_find_func(IN const cl_list_item_t * p_list_item, /********************************************************************** **********************************************************************/ -void osm_port_remove_mgrp(IN osm_port_t * p_port, IN const ib_net16_t mlid) +void osm_port_remove_mgrp(IN osm_port_t * p_port, IN osm_mgrp_t *mgrp) { cl_list_item_t *p_mcm; p_mcm = cl_qlist_find_from_head(&p_port->mcm_list, port_mgrp_find_func, - &mlid); + mgrp); if (p_mcm != cl_qlist_end(&p_port->mcm_list)) { cl_qlist_remove_item(&p_port->mcm_list, p_mcm); diff --git a/opensm/opensm/osm_sm.c b/opensm/opensm/osm_sm.c index b3ce69a..2794775 100644 --- a/opensm/opensm/osm_sm.c +++ b/opensm/opensm/osm_sm.c @@ -520,7 +520,7 @@ ib_api_status_t osm_sm_mcgrp_join(IN osm_sm_t * p_sm, IN const ib_net16_t mlid, */ p_mcm = (osm_mcm_info_t *) cl_qlist_head(&p_port->mcm_list); while (p_mcm != (osm_mcm_info_t *) cl_qlist_end(&p_port->mcm_list)) { - if (p_mcm->mlid == mlid) { + if (p_mcm->mgrp->mlid == mlid) { OSM_LOG(p_sm->p_log, OSM_LOG_DEBUG, "Found mlid object for 
Port:" "0x%016" PRIx64 " lid:0x%X\n", @@ -530,7 +530,7 @@ ib_api_status_t osm_sm_mcgrp_join(IN osm_sm_t * p_sm, IN const ib_net16_t mlid, p_mcm = (osm_mcm_info_t *) cl_qlist_next(&p_mcm->list_item); } - status = osm_port_add_mgrp(p_port, mlid); + status = osm_port_add_mgrp(p_port, p_mgrp); if (status != IB_SUCCESS) { OSM_LOG(p_sm->p_log, OSM_LOG_ERROR, "ERR 2E03: " "Unable to associate port 0x%" PRIx64 " to mlid 0x%X\n", @@ -590,7 +590,7 @@ ib_api_status_t osm_sm_mcgrp_leave(IN osm_sm_t * p_sm, IN const ib_net16_t mlid, /* * Walk the list of ports in the group, and remove the appropriate one. */ - osm_port_remove_mgrp(p_port, mlid); + osm_port_remove_mgrp(p_port, p_mgrp); status = sm_mgrp_process(p_sm, p_mgrp); Exit: -- 1.6.4.2 From sashak at voltaire.com Sun Sep 6 08:49:31 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 6 Sep 2009 18:49:31 +0300 Subject: [ofa-general] [PATCH] opensm: use mgrp pointer as osm_sm_mcgrp_join/leave() parameter In-Reply-To: <20090906154901.GF25241@me> References: <20090906154901.GF25241@me> Message-ID: <20090906154931.GG25241@me> Use mgrp pointer to multicast group instead of mlid as parameter for osm_sm_mcgrp_join/leave() functions. This simplifies the current implementation, makes those functions MLID independent and in this way helps to implement MGID to MLID compression. 
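A minimal sketch of the idea, not part of the patch: once several MGIDs can share one MLID, a port's membership has to be tested against the group object itself rather than against its MLID. The types and function names below (struct mgrp, struct mcm_info, port_has_mlid, port_has_mgrp) are simplified, hypothetical stand-ins for illustration, not the OpenSM definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified multicast group: only what the sketch needs. */
struct mgrp {
	uint16_t mlid;		/* may be shared by several groups */
	uint8_t mgid[16];	/* unique per group */
};

/* Hypothetical per-port membership record, one per joined group. */
struct mcm_info {
	struct mcm_info *next;
	struct mgrp *mgrp;
};

/* MLID-based test: cannot tell apart groups that share an MLID. */
static int port_has_mlid(const struct mcm_info *head, uint16_t mlid)
{
	for (; head; head = head->next)
		if (head->mgrp->mlid == mlid)
			return 1;
	return 0;
}

/* Pointer-based test: matches only the exact group object. */
static int port_has_mgrp(const struct mcm_info *head, const struct mgrp *mgrp)
{
	assert(mgrp != NULL);
	for (; head; head = head->next)
		if (head->mgrp == mgrp)
			return 1;
	return 0;
}
```

With two groups that share MLID 0xC001 and a port joined only to the first, port_has_mlid() reports a match for either group, while port_has_mgrp() matches only the group the port actually joined.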
Signed-off-by: Sasha Khapyorsky --- opensm/include/opensm/osm_sm.h | 13 ++++--- opensm/opensm/osm_sa_mcmember_record.c | 8 +--- opensm/opensm/osm_sm.c | 57 ++++++------------------------- 3 files changed, 20 insertions(+), 58 deletions(-) diff --git a/opensm/include/opensm/osm_sm.h b/opensm/include/opensm/osm_sm.h index 152ecd7..0914a95 100644 --- a/opensm/include/opensm/osm_sm.h +++ b/opensm/include/opensm/osm_sm.h @@ -61,6 +61,7 @@ #include #include #include +#include #ifdef __cplusplus # define BEGIN_C_DECLS extern "C" { @@ -538,15 +539,15 @@ osm_resp_send(IN osm_sm_t * sm, */ ib_api_status_t osm_sm_mcgrp_join(IN osm_sm_t * const p_sm, - IN const ib_net16_t mlid, + IN osm_mgrp_t *mgrp, IN const ib_net64_t port_guid); /* * PARAMETERS * p_sm * [in] Pointer to an osm_sm_t object. * -* mlid -* [in] Multicast LID +* mgrp +* [in] Pointer to multicast group to join * * port_guid * [in] Port GUID to add to the group. @@ -572,14 +573,14 @@ osm_sm_mcgrp_join(IN osm_sm_t * const p_sm, */ ib_api_status_t osm_sm_mcgrp_leave(IN osm_sm_t * const p_sm, - IN const ib_net16_t mlid, IN const ib_net64_t port_guid); + IN osm_mgrp_t *mgrp, IN const ib_net64_t port_guid); /* * PARAMETERS * p_sm * [in] Pointer to an osm_sm_t object. * -* mlid -* [in] Multicast LID +* mgrp +* [in] Poniter to multicast group to leave * * port_guid * [in] Port GUID to remove from the group. 
diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c index a9e0a3b..7f2bc34 100644 --- a/opensm/opensm/osm_sa_mcmember_record.c +++ b/opensm/opensm/osm_sa_mcmember_record.c @@ -1010,7 +1010,6 @@ static void mcmr_rcv_leave_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) ib_sa_mad_t *p_sa_mad; ib_member_rec_t *p_recvd_mcmember_rec; ib_member_rec_t mcmember_rec; - ib_net16_t mlid; ib_net64_t portguid; osm_mcm_port_t *p_mcm_port; int removed; @@ -1041,7 +1040,6 @@ static void mcmr_rcv_leave_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) goto Exit; } - mlid = p_mgrp->mlid; portguid = p_recvd_mcmember_rec->port_gid.unicast.interface_id; /* check validity of the delete request o15-0.1.14 */ @@ -1074,7 +1072,7 @@ static void mcmr_rcv_leave_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) CL_PLOCK_RELEASE(sa->p_lock); /* we can leave if port was deleted from MCG */ - if (removed && osm_sm_mcgrp_leave(sa->sm, mlid, portguid)) + if (removed && osm_sm_mcgrp_leave(sa->sm, p_mgrp, portguid)) OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 1B09: " "osm_sm_mcgrp_leave failed\n"); @@ -1094,7 +1092,6 @@ static void mcmr_rcv_join_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) ib_sa_mad_t *p_sa_mad; ib_member_rec_t *p_recvd_mcmember_rec; ib_member_rec_t mcmember_rec; - ib_net16_t mlid; osm_mcm_port_t *p_mcmr_port; ib_net64_t portguid; osm_port_t *p_port; @@ -1217,7 +1214,6 @@ static void mcmr_rcv_join_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) is_new_group = 0; CL_ASSERT(p_mgrp); - mlid = p_mgrp->mlid; /* * o15-0.2.4: If SA supports UD multicast, then SA shall cause an @@ -1302,7 +1298,7 @@ static void mcmr_rcv_join_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) CL_PLOCK_RELEASE(sa->p_lock); /* do the actual routing (actually schedule the update) */ - status = osm_sm_mcgrp_join(sa->sm, mlid, + status = osm_sm_mcgrp_join(sa->sm, p_mgrp, p_recvd_mcmember_rec->port_gid.unicast. 
interface_id); diff --git a/opensm/opensm/osm_sm.c b/opensm/opensm/osm_sm.c index 2794775..50aee91 100644 --- a/opensm/opensm/osm_sm.c +++ b/opensm/opensm/osm_sm.c @@ -467,10 +467,9 @@ static ib_api_status_t sm_mgrp_process(IN osm_sm_t * p_sm, /********************************************************************** **********************************************************************/ -ib_api_status_t osm_sm_mcgrp_join(IN osm_sm_t * p_sm, IN const ib_net16_t mlid, +ib_api_status_t osm_sm_mcgrp_join(IN osm_sm_t * p_sm, IN osm_mgrp_t *mgrp, IN const ib_net64_t port_guid) { - osm_mgrp_t *p_mgrp; osm_port_t *p_port; ib_api_status_t status = IB_SUCCESS; osm_mcm_info_t *p_mcm; @@ -479,7 +478,7 @@ ib_api_status_t osm_sm_mcgrp_join(IN osm_sm_t * p_sm, IN const ib_net16_t mlid, OSM_LOG(p_sm->p_log, OSM_LOG_VERBOSE, "Port 0x%016" PRIx64 " joining MLID 0x%X\n", - cl_ntoh64(port_guid), cl_ntoh16(mlid)); + cl_ntoh64(port_guid), cl_ntoh16(mgrp->mlid)); /* * Acquire the port object for the port joining this group. @@ -495,51 +494,32 @@ ib_api_status_t osm_sm_mcgrp_join(IN osm_sm_t * p_sm, IN const ib_net16_t mlid, } /* - * If this multicast group does not already exist, create it. - */ - p_mgrp = osm_get_mgrp_by_mlid(p_sm->p_subn, mlid); - if (!p_mgrp || !osm_mgrp_is_guid(p_mgrp, port_guid)) { - /* - * The group removed or the port is not a - * member of the group, then fail immediately. - * This can happen since the spinlock is released briefly - * before the SA calls this function. - */ - OSM_LOG(p_sm->p_log, OSM_LOG_ERROR, "ERR 2E12: " - "MC group with mlid 0x%x doesn't exist or " - "port 0x%016" PRIx64 " is not in the group.\n", - cl_ntoh16(mlid), cl_ntoh64(port_guid)); - status = IB_NOT_FOUND; - goto Exit; - } - - /* * Check if the object (according to mlid) already exists on this port. * If it does - then no need to update it again, and no need to * create the mc tree again. Just goto Exit. 
*/ p_mcm = (osm_mcm_info_t *) cl_qlist_head(&p_port->mcm_list); while (p_mcm != (osm_mcm_info_t *) cl_qlist_end(&p_port->mcm_list)) { - if (p_mcm->mgrp->mlid == mlid) { + if (p_mcm->mgrp == mgrp) { OSM_LOG(p_sm->p_log, OSM_LOG_DEBUG, "Found mlid object for Port:" "0x%016" PRIx64 " lid:0x%X\n", - cl_ntoh64(port_guid), cl_ntoh16(mlid)); + cl_ntoh64(port_guid), cl_ntoh16(mgrp->mlid)); goto Exit; } p_mcm = (osm_mcm_info_t *) cl_qlist_next(&p_mcm->list_item); } - status = osm_port_add_mgrp(p_port, p_mgrp); + status = osm_port_add_mgrp(p_port, mgrp); if (status != IB_SUCCESS) { OSM_LOG(p_sm->p_log, OSM_LOG_ERROR, "ERR 2E03: " "Unable to associate port 0x%" PRIx64 " to mlid 0x%X\n", cl_ntoh64(osm_port_get_guid(p_port)), - cl_ntoh16(osm_mgrp_get_mlid(p_mgrp))); + cl_ntoh16(osm_mgrp_get_mlid(mgrp))); goto Exit; } - status = sm_mgrp_process(p_sm, p_mgrp); + status = sm_mgrp_process(p_sm, mgrp); Exit: CL_PLOCK_RELEASE(p_sm->p_lock); OSM_LOG_EXIT(p_sm->p_log); @@ -549,10 +529,9 @@ Exit: /********************************************************************** **********************************************************************/ -ib_api_status_t osm_sm_mcgrp_leave(IN osm_sm_t * p_sm, IN const ib_net16_t mlid, +ib_api_status_t osm_sm_mcgrp_leave(IN osm_sm_t * p_sm, IN osm_mgrp_t *mgrp, IN const ib_net64_t port_guid) { - osm_mgrp_t *p_mgrp; osm_port_t *p_port; ib_api_status_t status; @@ -560,7 +539,7 @@ ib_api_status_t osm_sm_mcgrp_leave(IN osm_sm_t * p_sm, IN const ib_net16_t mlid, OSM_LOG(p_sm->p_log, OSM_LOG_VERBOSE, "Port 0x%" PRIx64 " leaving MLID 0x%X\n", - cl_ntoh64(port_guid), cl_ntoh16(mlid)); + cl_ntoh64(port_guid), cl_ntoh16(mgrp->mlid)); /* * Acquire the port object for the port leaving this group. @@ -576,23 +555,9 @@ ib_api_status_t osm_sm_mcgrp_leave(IN osm_sm_t * p_sm, IN const ib_net16_t mlid, goto Exit; } - /* - * Get the multicast group object for this group. 
- */ - p_mgrp = osm_get_mgrp_by_mlid(p_sm->p_subn, mlid); - if (!p_mgrp) { - OSM_LOG(p_sm->p_log, OSM_LOG_ERROR, "ERR 2E08: " - "No multicast group for MLID 0x%X\n", cl_ntoh16(mlid)); - status = IB_INVALID_PARAMETER; - goto Exit; - } - - /* - * Walk the list of ports in the group, and remove the appropriate one. - */ - osm_port_remove_mgrp(p_port, p_mgrp); + osm_port_remove_mgrp(p_port, mgrp); - status = sm_mgrp_process(p_sm, p_mgrp); + status = sm_mgrp_process(p_sm, mgrp); Exit: CL_PLOCK_RELEASE(p_sm->p_lock); OSM_LOG_EXIT(p_sm->p_log); -- 1.6.4.2 From sashak at voltaire.com Sun Sep 6 08:50:06 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 6 Sep 2009 18:50:06 +0300 Subject: [ofa-general] [PATCH] opensm: remove not used osm_mgrp_apply_func() function In-Reply-To: <20090906154901.GF25241@me> References: <20090906154901.GF25241@me> Message-ID: <20090906155006.GH25241@me> Remove the unused osm_mgrp_apply_func() function and its associated types, variables, etc. Signed-off-by: Sasha Khapyorsky --- opensm/include/opensm/osm_multicast.h | 66 --------------------------------- opensm/opensm/osm_multicast.c | 38 ------------------- 2 files changed, 0 insertions(+), 104 deletions(-) diff --git a/opensm/include/opensm/osm_multicast.h b/opensm/include/opensm/osm_multicast.h index 9a47de5..eda447e 100644 --- a/opensm/include/opensm/osm_multicast.h +++ b/opensm/include/opensm/osm_multicast.h @@ -174,38 +174,6 @@ typedef struct osm_mgrp { * SEE ALSO *********/ -/****f* OpenSM: Vendor API/osm_mgrp_func_t -* NAME -* osm_mgrp_func_t -* -* DESCRIPTION -* Callback for the osm_mgrp_apply_func function. -* The callback function must not modify the tree linkage. -* -* SYNOPSIS -*/ -typedef void (*osm_mgrp_func_t) (IN const osm_mgrp_t * const p_mgrp, - IN const osm_mtree_node_t * const p_mtn, - IN void *context); -/* -* PARAMETERS -* p_mgrp -* [in] Pointer to the multicast group object. -* -* p_mtn -* [in] Pointer to the multicast tree node. -* -* context -* [in] User context. 
-* -* RETURN VALUES -* None. -* -* NOTES -* -* SEE ALSO -*********/ - /****f* OpenSM: Multicast Group/osm_mgrp_new * NAME * osm_mgrp_new @@ -456,39 +424,5 @@ osm_mgrp_delete_port(IN osm_subn_t * const p_subn, int osm_mgrp_remove_port(osm_subn_t *subn, osm_log_t *log, osm_mgrp_t *mgrp, osm_mcm_port_t *mcm, uint8_t join_state); -/****f* OpenSM: Multicast Group/osm_mgrp_apply_func -* NAME -* osm_mgrp_apply_func -* -* DESCRIPTION -* Calls the specified function for each element in the tree. -* Elements are passed to the callback function in no particular order. -* -* SYNOPSIS -*/ -void -osm_mgrp_apply_func(const osm_mgrp_t * const p_mgrp, - osm_mgrp_func_t p_func, void *context); -/* -* PARAMETERS -* p_mgrp -* [in] Pointer to an osm_mgrp_t object. -* -* p_func -* [in] Pointer to the users callback function. -* -* context -* [in] User context passed to the callback function. -* -* -* RETURN VALUES -* None. -* -* NOTES -* -* SEE ALSO -* Multicast Group -*********/ - END_C_DECLS #endif /* _OSM_MULTICAST_H_ */ diff --git a/opensm/opensm/osm_multicast.c b/opensm/opensm/osm_multicast.c index d2733c4..4b4a6b0 100644 --- a/opensm/opensm/osm_multicast.c +++ b/opensm/opensm/osm_multicast.c @@ -260,41 +260,3 @@ boolean_t osm_mgrp_is_port_present(IN const osm_mgrp_t * p_mgrp, *pp_mcm_port = NULL; return FALSE; } - -/********************************************************************** - **********************************************************************/ -static void mgrp_apply_func_sub(const osm_mgrp_t * p_mgrp, - const osm_mtree_node_t * p_mtn, - osm_mgrp_func_t p_func, void *context) -{ - uint8_t i = 0; - uint8_t max_children; - osm_mtree_node_t *p_child_mtn; - - /* Call the user, then recurse. 
*/ - p_func(p_mgrp, p_mtn, context); - - max_children = osm_mtree_node_get_max_children(p_mtn); - for (i = 0; i < max_children; i++) { - p_child_mtn = osm_mtree_node_get_child(p_mtn, i); - if (p_child_mtn) - mgrp_apply_func_sub(p_mgrp, p_child_mtn, p_func, - context); - } -} - -/********************************************************************** - **********************************************************************/ -void osm_mgrp_apply_func(const osm_mgrp_t * p_mgrp, osm_mgrp_func_t p_func, - void *context) -{ - osm_mtree_node_t *p_mtn; - - CL_ASSERT(p_mgrp); - CL_ASSERT(p_func); - - p_mtn = p_mgrp->p_root; - - if (p_mtn) - mgrp_apply_func_sub(p_mgrp, p_mtn, p_func, context); -} -- 1.6.4.2 From sashak at voltaire.com Sun Sep 6 08:53:39 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 6 Sep 2009 18:53:39 +0300 Subject: [ofa-general] Re: [PATCH] opensm doc: Indicated limited (rather than partial) partition membership In-Reply-To: <20090903130036.GA18519@comcast.net> References: <20090903130036.GA18519@comcast.net> Message-ID: <20090906155339.GI25241@me> On 09:00 Thu 03 Sep , Hal Rosenstock wrote: > > Signed-off-by: Hal Rosenstock Applied. Thanks. Sasha From sashak at voltaire.com Sun Sep 6 10:39:00 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 6 Sep 2009 20:39:00 +0300 Subject: [ofa-general] [PATCH] opensm: improve multicast re-routing requests processing In-Reply-To: <20090906154931.GG25241@me> References: <20090906154901.GF25241@me> <20090906154931.GG25241@me> Message-ID: <20090906173900.GK25241@me> When there are two or more changes in the same multicast group, multiple multicast re-routing requests are created and processed. To prevent this, use an array of requests indexed by the MLID value minus IB_LID_MCAST_START_HO: for each multicast group change, just mark that the specific MLID requires re-routing, so that "duplicated" requests are merged there.
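The request merging described above can be sketched as follows. This is a minimal, self-contained illustration under stated assumptions, not the code from the patch: struct reroute_reqs and the reroute_* functions are hypothetical names, and the multicast LID bounds (0xC000 to 0xFFFE, host order) are defined locally so the example compiles on its own.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Defined locally for the sketch: multicast LID range in host order. */
#define IB_LID_MCAST_START_HO 0xC000u
#define IB_LID_MCAST_END_HO   0xFFFEu
#define MLID_SLOTS (IB_LID_MCAST_END_HO - IB_LID_MCAST_START_HO + 1)

/* Hypothetical stand-in for the request state kept by the SM. */
struct reroute_reqs {
	uint8_t pending[MLID_SLOTS];	/* 1 = this MLID needs re-routing */
	unsigned max_idx;		/* highest index marked so far */
};

static void reroute_init(struct reroute_reqs *r)
{
	assert(r != NULL);
	memset(r, 0, sizeof(*r));
}

/* Mark an MLID for re-routing; repeated requests collapse into one flag. */
static int reroute_request(struct reroute_reqs *r, uint16_t mlid_ho)
{
	unsigned idx;

	if (mlid_ho < IB_LID_MCAST_START_HO || mlid_ho > IB_LID_MCAST_END_HO)
		return -1;
	idx = mlid_ho - IB_LID_MCAST_START_HO;
	r->pending[idx] = 1;
	if (idx > r->max_idx)
		r->max_idx = idx;
	return 0;
}

/* Visit each pending MLID once, clear its flag, return how many were handled. */
static unsigned reroute_process(struct reroute_reqs *r,
				void (*handle)(uint16_t mlid_ho))
{
	unsigned i, handled = 0;

	for (i = 0; i <= r->max_idx; i++) {
		if (!r->pending[i])
			continue;
		r->pending[i] = 0;
		if (handle)
			handle((uint16_t)(i + IB_LID_MCAST_START_HO));
		handled++;
	}
	r->max_idx = 0;
	return handled;
}
```

Requesting MLID 0xC005 twice and 0xC001 once leaves two flags set, so one pass of reroute_process() handles two MLIDs instead of three queued requests. Since only the MLID is needed to find the slot, a pending request stays valid even if the group object has been freed in the meantime.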
This also lets us process deletion of multicast routing entries for already removed groups knowing only the MLID, without using the group's content. As a result we no longer need to delay multicast group deletion (the 'to_be_deleted' flag), which will simplify many multicast-related code flows. Signed-off-by: Sasha Khapyorsky --- opensm/include/opensm/osm_sm.h | 4 +- opensm/opensm/osm_mcast_mgr.c | 27 +++++++++------------ opensm/opensm/osm_sm.c | 49 +++++++++++++--------------------------- 3 files changed, 30 insertions(+), 50 deletions(-) diff --git a/opensm/include/opensm/osm_sm.h b/opensm/include/opensm/osm_sm.h index 0914a95..986143a 100644 --- a/opensm/include/opensm/osm_sm.h +++ b/opensm/include/opensm/osm_sm.h @@ -126,8 +126,8 @@ typedef struct osm_sm { cl_dispatcher_t *p_disp; cl_plock_t *p_lock; atomic32_t sm_trans_id; - cl_spinlock_t mgrp_lock; - cl_qlist_t mgrp_list; + unsigned mlids_req_max; + uint8_t *mlids_req; osm_sm_mad_ctrl_t mad_ctrl; osm_lid_mgr_t lid_mgr; osm_ucast_mgr_t ucast_mgr; diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c index d7c5ce1..dd504ef 100644 --- a/opensm/opensm/osm_mcast_mgr.c +++ b/opensm/opensm/osm_mcast_mgr.c @@ -1116,7 +1116,6 @@ static int mcast_mgr_set_mftables(osm_sm_t * sm) int osm_mcast_mgr_process(osm_sm_t * sm) { cl_qmap_t *p_sw_tbl; - cl_qlist_t *p_list = &sm->mgrp_list; osm_mgrp_t *p_mgrp; int i, ret = 0; @@ -1150,16 +1149,14 @@ int osm_mcast_mgr_process(osm_sm_t * sm) mcast_mgr_process_mgrp(sm, p_mgrp); } + memset(sm->mlids_req, 0, sm->mlids_req_max); + sm->mlids_req_max = 0; + /* Walk the switches and download the tables for each.
*/ ret = mcast_mgr_set_mftables(sm); - while (!cl_is_qlist_empty(p_list)) { - cl_list_item_t *p = cl_qlist_remove_head(p_list); - free(p); - } - exit: CL_PLOCK_RELEASE(sm->p_lock); @@ -1174,11 +1171,10 @@ exit: **********************************************************************/ int osm_mcast_mgr_process_mgroups(osm_sm_t * sm) { - cl_qlist_t *p_list = &sm->mgrp_list; osm_mgrp_t *p_mgrp; ib_net16_t mlid; - osm_mcast_mgr_ctxt_t *ctx; int ret = 0; + unsigned i; OSM_LOG_ENTER(sm->p_log); @@ -1192,14 +1188,12 @@ int osm_mcast_mgr_process_mgroups(osm_sm_t * sm) goto exit; } - while (!cl_is_qlist_empty(p_list)) { - ctx = (osm_mcast_mgr_ctxt_t *) cl_qlist_remove_head(p_list); - - /* nice copy no warning on size diff */ - memcpy(&mlid, &ctx->mlid, sizeof(mlid)); + for (i = 0; i <= sm->mlids_req_max; i++) { + if (!sm->mlids_req[i]) + continue; + sm->mlids_req[i] = 0; - /* we can destroy the context now */ - free(ctx); + mlid = cl_hton16(i + IB_LID_MCAST_START_HO); /* since we delayed the execution we prefer to pass the mlid as the mgrp identifier and then find it or abort */ @@ -1223,6 +1217,9 @@ int osm_mcast_mgr_process_mgroups(osm_sm_t * sm) mcast_mgr_process_mgrp(sm, p_mgrp); } + memset(sm->mlids_req, 0, sm->mlids_req_max); + sm->mlids_req_max = 0; + /* Walk the switches and download the tables for each. 
*/ diff --git a/opensm/opensm/osm_sm.c b/opensm/opensm/osm_sm.c index 50aee91..e446c9d 100644 --- a/opensm/opensm/osm_sm.c +++ b/opensm/opensm/osm_sm.c @@ -166,7 +166,6 @@ void osm_sm_construct(IN osm_sm_t * p_sm) cl_event_construct(&p_sm->subnet_up_event); cl_event_wheel_construct(&p_sm->trap_aging_tracker); cl_thread_construct(&p_sm->sweeper); - cl_spinlock_construct(&p_sm->mgrp_lock); osm_sm_mad_ctrl_construct(&p_sm->mad_ctrl); osm_lid_mgr_construct(&p_sm->lid_mgr); osm_ucast_mgr_construct(&p_sm->ucast_mgr); @@ -234,8 +233,8 @@ void osm_sm_destroy(IN osm_sm_t * p_sm) cl_event_destroy(&p_sm->signal_event); cl_event_destroy(&p_sm->subnet_up_event); cl_spinlock_destroy(&p_sm->signal_lock); - cl_spinlock_destroy(&p_sm->mgrp_lock); cl_spinlock_destroy(&p_sm->state_lock); + free(p_sm->mlids_req); osm_log(p_sm->p_log, OSM_LOG_SYS, "Exiting SM\n"); /* Format Waived */ OSM_LOG_EXIT(p_sm->p_log); @@ -288,11 +287,14 @@ ib_api_status_t osm_sm_init(IN osm_sm_t * p_sm, IN osm_subn_t * p_subn, if (status != CL_SUCCESS) goto Exit; - cl_qlist_init(&p_sm->mgrp_list); - - status = cl_spinlock_init(&p_sm->mgrp_lock); - if (status != CL_SUCCESS) + p_sm->mlids_req_max = 0; + p_sm->mlids_req = malloc((IB_LID_MCAST_END_HO - IB_LID_MCAST_START_HO + + 1) * sizeof(p_sm->mlids_req[0])); + if (!p_sm->mlids_req) goto Exit; + memset(p_sm->mlids_req, 0, + (IB_LID_MCAST_END_HO - IB_LID_MCAST_START_HO + + 1) * sizeof(p_sm->mlids_req[0])); status = osm_sm_mad_ctrl_init(&p_sm->mad_ctrl, p_sm->p_subn, p_sm->p_mad_pool, p_sm->p_vl15, @@ -441,32 +443,15 @@ Exit: /********************************************************************** **********************************************************************/ -static ib_api_status_t sm_mgrp_process(IN osm_sm_t * p_sm, - IN osm_mgrp_t * p_mgrp) +static void request_mlid(osm_sm_t * sm, uint16_t mlid) { - osm_mcast_mgr_ctxt_t *ctx; - - /* - * 'Schedule' all the QP0 traffic for when the state manager - * isn't busy trying to do something else. 
- */ - ctx = malloc(sizeof(*ctx)); - if (!ctx) - return IB_ERROR; - memset(ctx, 0, sizeof(*ctx)); - ctx->mlid = p_mgrp->mlid; - - cl_spinlock_acquire(&p_sm->mgrp_lock); - cl_qlist_insert_tail(&p_sm->mgrp_list, &ctx->list_item); - cl_spinlock_release(&p_sm->mgrp_lock); - - osm_sm_signal(p_sm, OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST); - - return IB_SUCCESS; + mlid -= IB_LID_MCAST_START_HO; + sm->mlids_req[mlid] = 1; + if (sm->mlids_req_max < mlid) + sm->mlids_req_max = mlid; + osm_sm_signal(sm, OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST); } -/********************************************************************** - **********************************************************************/ ib_api_status_t osm_sm_mcgrp_join(IN osm_sm_t * p_sm, IN osm_mgrp_t *mgrp, IN const ib_net64_t port_guid) { @@ -519,7 +504,7 @@ ib_api_status_t osm_sm_mcgrp_join(IN osm_sm_t * p_sm, IN osm_mgrp_t *mgrp, goto Exit; } - status = sm_mgrp_process(p_sm, mgrp); + request_mlid(p_sm, cl_ntoh16(mgrp->mlid)); Exit: CL_PLOCK_RELEASE(p_sm->p_lock); OSM_LOG_EXIT(p_sm->p_log); @@ -527,8 +512,6 @@ Exit: return status; } -/********************************************************************** - **********************************************************************/ ib_api_status_t osm_sm_mcgrp_leave(IN osm_sm_t * p_sm, IN osm_mgrp_t *mgrp, IN const ib_net64_t port_guid) { @@ -557,7 +540,7 @@ ib_api_status_t osm_sm_mcgrp_leave(IN osm_sm_t * p_sm, IN osm_mgrp_t *mgrp, osm_port_remove_mgrp(p_port, mgrp); - status = sm_mgrp_process(p_sm, mgrp); + request_mlid(p_sm, cl_hton16(mgrp->mlid)); Exit: CL_PLOCK_RELEASE(p_sm->p_lock); OSM_LOG_EXIT(p_sm->p_log); -- 1.6.4.2 From sashak at voltaire.com Sun Sep 6 10:45:48 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 6 Sep 2009 20:45:48 +0300 Subject: [ofa-general] [PATCH] opensm/multicast: remove change id tracking In-Reply-To: <20090906173900.GK25241@me> References: <20090906154901.GF25241@me> <20090906154931.GG25241@me> <20090906173900.GK25241@me> 
Message-ID: <20090906174548.GL25241@me> Following the previous patch 'opensm: improve multicast re-routing requests processing', remove the multicast change id tracking (last_change_id, last_tree_id). We don't need it anymore: all requests are processed only once per mlid. Signed-off-by: Sasha Khapyorsky --- opensm/include/opensm/osm_multicast.h | 10 ---------- opensm/opensm/osm_mcast_mgr.c | 21 ++------------------- opensm/opensm/osm_multicast.c | 7 ------- 3 files changed, 2 insertions(+), 36 deletions(-) diff --git a/opensm/include/opensm/osm_multicast.h b/opensm/include/opensm/osm_multicast.h index eda447e..ce3d310 100644 --- a/opensm/include/opensm/osm_multicast.h +++ b/opensm/include/opensm/osm_multicast.h @@ -128,8 +128,6 @@ typedef struct osm_mgrp { ib_member_rec_t mcmember_rec; boolean_t well_known; boolean_t to_be_deleted; - uint32_t last_change_id; - uint32_t last_tree_id; unsigned full_members; } osm_mgrp_t; /* @@ -163,14 +161,6 @@ typedef struct osm_mgrp { * track the fact the group is about to be deleted so we can * track the fact a new join is actually a create request. * -* last_change_id -* a counter for the number of changes applied to the group. -* This counter shuold be incremented on any modification -* to the group: joining or leaving of ports. -* -* last_tree_id -* the last change id used for building the current tree.
-* * SEE ALSO *********/ diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c index dd504ef..0553277 100644 --- a/opensm/opensm/osm_mcast_mgr.c +++ b/opensm/opensm/osm_mcast_mgr.c @@ -1039,12 +1039,10 @@ static ib_api_status_t mcast_mgr_process_mgrp(osm_sm_t * sm, if (p_mgrp->full_members) { status = mcast_mgr_build_spanning_tree(sm, p_mgrp); - if (status != IB_SUCCESS) { + if (status != IB_SUCCESS) OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A17: " "Unable to create spanning tree (%s)\n", ib_get_err_str(status)); - goto Exit; - } } else if (p_mgrp->to_be_deleted) { OSM_LOG(sm->p_log, OSM_LOG_DEBUG, "Destroying mgrp with lid:0x%x\n", @@ -1054,12 +1052,8 @@ static ib_api_status_t mcast_mgr_process_mgrp(osm_sm_t * sm, cl_fmap_remove_item(&sm->p_subn->mgrp_mgid_tbl, &p_mgrp->map_item); osm_mgrp_delete(p_mgrp); - goto Exit; } - p_mgrp->last_tree_id = p_mgrp->last_change_id; - -Exit: OSM_LOG_EXIT(sm->p_log); return status; } @@ -1201,19 +1195,8 @@ int osm_mcast_mgr_process_mgroups(osm_sm_t * sm) if (!p_mgrp) continue; - /* if there was no change from the last time - * we processed the group we can skip doing anything - */ - if (p_mgrp->last_change_id == p_mgrp->last_tree_id) { - OSM_LOG(sm->p_log, OSM_LOG_DEBUG, - "Skip processing mgrp with lid:0x%X change id:%u\n", - cl_ntoh16(mlid), p_mgrp->last_change_id); - continue; - } - OSM_LOG(sm->p_log, OSM_LOG_DEBUG, - "Processing mgrp with lid:0x%X change id:%u\n", - cl_ntoh16(mlid), p_mgrp->last_change_id); + "Processing mgrp with lid:0x%x\n", cl_ntoh16(mlid)); mcast_mgr_process_mgrp(sm, p_mgrp); } diff --git a/opensm/opensm/osm_multicast.c b/opensm/opensm/osm_multicast.c index 4b4a6b0..1326161 100644 --- a/opensm/opensm/osm_multicast.c +++ b/opensm/opensm/osm_multicast.c @@ -86,8 +86,6 @@ osm_mgrp_t *osm_mgrp_new(IN const ib_net16_t mlid) memset(p_mgrp, 0, sizeof(*p_mgrp)); cl_qmap_init(&p_mgrp->mcm_port_tbl); p_mgrp->mlid = mlid; - p_mgrp->last_change_id = 0; - p_mgrp->last_tree_id = 0; 
p_mgrp->to_be_deleted = FALSE; return p_mgrp; @@ -167,9 +165,6 @@ osm_mcm_port_t *osm_mgrp_add_port(IN osm_subn_t * subn, osm_log_t * log, p_mcm_port->scope_state = ib_member_set_scope_state(prev_scope, prev_join_state | join_state); - } else { - /* track the fact we modified the group ports */ - p_mgrp->last_change_id++; } if ((join_state & IB_JOIN_STATE_FULL) && @@ -211,8 +206,6 @@ int osm_mgrp_remove_port(osm_subn_t * subn, osm_log_t * log, osm_mgrp_t * mgrp, OSM_LOG(log, OSM_LOG_DEBUG, "removing port 0x%" PRIx64 "\n", cl_ntoh64(mcm->port_gid.unicast.interface_id)); osm_mcm_port_delete(mcm); - /* track the fact we modified the group */ - mgrp->last_change_id++; ret = 1; } -- 1.6.4.2 From sashak at voltaire.com Sun Sep 6 14:38:52 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 7 Sep 2009 00:38:52 +0300 Subject: [ofa-general] [PATCH] opensm/osm_mcast_mgr.c: multicast routing by mlid - renaming In-Reply-To: <20090906174548.GL25241@me> References: <20090906154901.GF25241@me> <20090906154931.GG25241@me> <20090906173900.GK25241@me> <20090906174548.GL25241@me> Message-ID: <20090906213852.GM25241@me> This patch renames mcast_mgr_process_mgrp() to mcast_mgr_process_mlid(); the function now takes the MLID (in host byte order) as its parameter. This simplifies caller flows and makes the code ready for MGID to MLID compression. Signed-off-by: Sasha Khapyorsky --- opensm/opensm/osm_mcast_mgr.c | 71 +++++++++++++--------------------------- 1 files changed, 23 insertions(+), 48 deletions(-) diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c index 0553277..24fb5b1 100644 --- a/opensm/opensm/osm_mcast_mgr.c +++ b/opensm/opensm/osm_mcast_mgr.c @@ -1017,41 +1017,37 @@ Exit: Process the entire group. NOTE : The lock should be held externally!
**********************************************************************/ -static ib_api_status_t mcast_mgr_process_mgrp(osm_sm_t * sm, - IN osm_mgrp_t * p_mgrp) +static ib_api_status_t mcast_mgr_process_mlid(osm_sm_t * sm, uint16_t mlid) { ib_api_status_t status = IB_SUCCESS; - ib_net16_t mlid; + osm_mgrp_t *mgrp; OSM_LOG_ENTER(sm->p_log); - mlid = osm_mgrp_get_mlid(p_mgrp); - OSM_LOG(sm->p_log, OSM_LOG_DEBUG, - "Processing multicast group 0x%X\n", cl_ntoh16(mlid)); + "Processing multicast group with lid 0x%X\n", mlid); - /* - Clear the multicast tables to start clean, then build + /* Clear the multicast tables to start clean, then build the spanning tree which sets the mcast table bits for each - port in the group. - */ - mcast_mgr_clear(sm, cl_ntoh16(mlid)); + port in the group. */ + mcast_mgr_clear(sm, mlid); + + mgrp = osm_get_mgrp_by_mlid(sm->p_subn, cl_hton16(mlid)); + if (!mgrp) /* already removed */ + return IB_SUCCESS; - if (p_mgrp->full_members) { - status = mcast_mgr_build_spanning_tree(sm, p_mgrp); + if (mgrp->full_members) { + status = mcast_mgr_build_spanning_tree(sm, mgrp); if (status != IB_SUCCESS) OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A17: " - "Unable to create spanning tree (%s)\n", - ib_get_err_str(status)); - } else if (p_mgrp->to_be_deleted) { + "Unable to create spanning tree (%s) for mlid " + "0x%x\n", ib_get_err_str(status), mlid); + } else if (mgrp->to_be_deleted) { OSM_LOG(sm->p_log, OSM_LOG_DEBUG, - "Destroying mgrp with lid:0x%x\n", - cl_ntoh16(p_mgrp->mlid)); - sm->p_subn->mgroups[cl_ntoh16(p_mgrp->mlid) - - IB_LID_MCAST_START_HO] = NULL; - cl_fmap_remove_item(&sm->p_subn->mgrp_mgid_tbl, - &p_mgrp->map_item); - osm_mgrp_delete(p_mgrp); + "Destroying mgrp with lid:0x%x\n", mlid); + sm->p_subn->mgroups[mlid - IB_LID_MCAST_START_HO] = NULL; + cl_fmap_remove_item(&sm->p_subn->mgrp_mgid_tbl, &mgrp->map_item); + osm_mgrp_delete(mgrp); } OSM_LOG_EXIT(sm->p_log); @@ -1110,7 +1106,6 @@ static int mcast_mgr_set_mftables(osm_sm_t * sm) int 
osm_mcast_mgr_process(osm_sm_t * sm) { cl_qmap_t *p_sw_tbl; - osm_mgrp_t *p_mgrp; int i, ret = 0; OSM_LOG_ENTER(sm->p_log); @@ -1132,16 +1127,9 @@ int osm_mcast_mgr_process(osm_sm_t * sm) } for (i = 0; i <= sm->p_subn->max_mcast_lid_ho - IB_LID_MCAST_START_HO; - i++) { - /* - We reached here due to some change that caused a heavy sweep - of the subnet. Not due to a specific multicast request. - So the request type is subnet_change and the port guid is 0. - */ - p_mgrp = sm->p_subn->mgroups[i]; - if (p_mgrp) - mcast_mgr_process_mgrp(sm, p_mgrp); - } + i++) + if (sm->p_subn->mgroups[i]) + mcast_mgr_process_mlid(sm, i + IB_LID_MCAST_START_HO); memset(sm->mlids_req, 0, sm->mlids_req_max); sm->mlids_req_max = 0; @@ -1165,8 +1153,6 @@ exit: **********************************************************************/ int osm_mcast_mgr_process_mgroups(osm_sm_t * sm) { - osm_mgrp_t *p_mgrp; - ib_net16_t mlid; int ret = 0; unsigned i; @@ -1186,18 +1172,7 @@ int osm_mcast_mgr_process_mgroups(osm_sm_t * sm) if (!sm->mlids_req[i]) continue; sm->mlids_req[i] = 0; - - mlid = cl_hton16(i + IB_LID_MCAST_START_HO); - - /* since we delayed the execution we prefer to pass the - mlid as the mgrp identifier and then find it or abort */ - p_mgrp = osm_get_mgrp_by_mlid(sm->p_subn, mlid); - if (!p_mgrp) - continue; - - OSM_LOG(sm->p_log, OSM_LOG_DEBUG, - "Processing mgrp with lid:0x%x\n", cl_ntoh16(mlid)); - mcast_mgr_process_mgrp(sm, p_mgrp); + mcast_mgr_process_mlid(sm, i + IB_LID_MCAST_START_HO); } memset(sm->mlids_req, 0, sm->mlids_req_max); -- 1.6.4.2 From tom at opengridcomputing.com Sun Sep 6 22:32:39 2009 From: tom at opengridcomputing.com (Tom Tucker) Date: Mon, 07 Sep 2009 00:32:39 -0500 Subject: [ofa-general] Cannot export multiple directories using nfs-rdma In-Reply-To: References: Message-ID: <4AA49AF7.7030407@opengridcomputing.com> Jeff Johnson wrote: > I have a nfs-rdma configuration using Mellanox ConnectX-DDR, ofed-1.4.2 > on Centos 5.3 x86_64. 
> My ConnectX cards are running 2.5.0 firmware as I have read that 2.6.0 > had rdma issues. I saw these issues and down rev'd the cards to 2.5.0. > > I am seeing a peculiar behavior where if I export two separate > directories from the server and attempt to mount them separately from a > client I end up with the same export mounted to two different client > directories. > Hi Jeff: The mount service does not run over RDMA, it only runs on TCP/UDP. I believe you should be able to reproduce this behavior on plain old GigE/IPoIB. Is this the case? Tom > e.g.: server:/raid1 > server:/raid2 > > 'mount.rnfs 10.0.0.251:/raid1 /raid1 -i -o rdma,port=2050' > client:/raid1 <---has server:/raid1 contents > 'mount.rnfs 10.0.0.251:/raid2 /raid2 -i -o rdma,port=2050' > client:/raid2 <---has server:/raid1 contents > > I have tried creating multiple rdma ports on the server (2050 and > 2051) and then using different ports for each separate mount. The result > is the same. > > I have verified that I am indeed mounting rdma and not merely ipoib. > > Is nfs-rdma capable of multiple exports? If so, I cannot find a > method for dealing with multiple exports from the server or client side > in any ofed docs. > > Thanks for any assistance.. 
> > ------------------------------ > Jeff Johnson > Manager > Aeon Computing > > jeff.johnson at aeoncomputing.com > t: 858-412-3810 f: 858-412-3845 > m: 619-204-9061 > > 4905 Morena Boulevard, Suite 1313 - San Diego, CA 92117 > > > > > > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit > http://openib.org/mailman/listinfo/openib-general From sebastien.dugue at bull.net Sun Sep 6 23:24:12 2009 From: sebastien.dugue at bull.net (sebastien dugue) Date: Mon, 7 Sep 2009 08:24:12 +0200 Subject: [ofa-general] Re: [PATCH 0/6] ibutils: Build fixes for FC11 In-Reply-To: <4AA3B468.8010802@dev.mellanox.co.il> References: <20090902120353.3ee1a8e2@frecb007965> <4AA3B468.8010802@dev.mellanox.co.il> Message-ID: <20090907082412.27564622@frecb007965> On Sun, 06 Sep 2009 16:08:56 +0300 Yevgeny Kliteynik wrote: > Sebastien, > > sebastien dugue wrote: > > Hi, > > > > here are some fixes I had to apply in order to be able to build under FC11 > > due to some changes in the toolchain. > > > > Sebastien. > > > > Thanks. I've checked in all but patch 3/6. > See details in the mail. Thanks Yevgeny. Sebastien. > > -- Yevgeny > From sebastien.dugue at bull.net Sun Sep 6 23:25:00 2009 From: sebastien.dugue at bull.net (sebastien dugue) Date: Mon, 7 Sep 2009 08:25:00 +0200 Subject: [ofa-general] Re: [PATCH 3/6] ibutils/ibdm: Fix libibsysapi build In-Reply-To: <4AA3B580.2000807@dev.mellanox.co.il> References: <20090902120353.3ee1a8e2@frecb007965> <20090902120646.0bc3db4b@frecb007965> <4AA3B580.2000807@dev.mellanox.co.il> Message-ID: <20090907082500.324510eb@frecb007965> On Sun, 06 Sep 2009 16:13:36 +0300 Yevgeny Kliteynik wrote: > Sebastien, > > sebastien dugue wrote: > > Add libibdmcom linker path to allow build under FC11. 
> > > > Signed-off-by: Sebastien Dugue > > --- > > ibdm/src/Makefile.am | 2 +- > > 1 files changed, 1 insertions(+), 1 deletions(-) > > > > diff --git a/ibdm/src/Makefile.am b/ibdm/src/Makefile.am > > index 8b2f9ba..b763387 100644 > > --- a/ibdm/src/Makefile.am > > +++ b/ibdm/src/Makefile.am > > @@ -61,7 +61,7 @@ ibnlparse_SOURCES = test_ibnl_parser.cpp > > lib_LTLIBRARIES = libibsysapi.la > > libibsysapi_la_SOURCES = ibsysapi.cpp > > libibsysapi_la_LDFLAGS = -version-info 1:0:0 > > -libibsysapi_la_LIBADD = -libdmcom > > +libibsysapi_la_LIBADD = -L../ibdm -libdmcom > > This problem was already pointed out by Dale Purdy, > and I already fixed it based on his suggestion. > In fact, my fix is the same as yours :) Great! I haven't noticed it. Thanks, Sebastien. > > -- Yevgeny > From kliteyn at dev.mellanox.co.il Sun Sep 6 23:44:03 2009 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Mon, 07 Sep 2009 09:44:03 +0300 Subject: [ofa-general] Re: [PATCH 3/6] ibutils/ibdm: Fix libibsysapi build In-Reply-To: <20090907082500.324510eb@frecb007965> References: <20090902120353.3ee1a8e2@frecb007965> <20090902120646.0bc3db4b@frecb007965> <4AA3B580.2000807@dev.mellanox.co.il> <20090907082500.324510eb@frecb007965> Message-ID: <4AA4ABB3.9030101@dev.mellanox.co.il> sebastien dugue wrote: >> This problem was already pointed out by Dale Purdy, >> and I already fixed it based on his suggestion. >> In fact, my fix is the same as yours :) > > Great! I haven't noticed it. Well, it is probably because I forgot to push it into the trunk... Sorry :) -- Yevgeny > Thanks, > > > Sebastien. 
> >> -- Yevgeny >> > From sebastien.dugue at bull.net Mon Sep 7 00:14:45 2009 From: sebastien.dugue at bull.net (sebastien dugue) Date: Mon, 7 Sep 2009 09:14:45 +0200 Subject: [ofa-general] Re: [PATCH 3/6] ibutils/ibdm: Fix libibsysapi build In-Reply-To: <4AA4ABB3.9030101@dev.mellanox.co.il> References: <20090902120353.3ee1a8e2@frecb007965> <20090902120646.0bc3db4b@frecb007965> <4AA3B580.2000807@dev.mellanox.co.il> <20090907082500.324510eb@frecb007965> <4AA4ABB3.9030101@dev.mellanox.co.il> Message-ID: <20090907091445.0cec6fdf@frecb007965> On Mon, 07 Sep 2009 09:44:03 +0300 Yevgeny Kliteynik wrote: > sebastien dugue wrote: > >> This problem was already pointed out by Dale Purdy, > >> and I already fixed it based on his suggestion. > >> In fact, my fix is the same as yours :) > > > > Great! I haven't noticed it. > > Well, it is probably because I forgot to push > it into the trunk... Sorry :) > Ah OK, just pulled it. Thanks, Sebastien. From vlad at lists.openfabrics.org Mon Sep 7 03:05:37 2009 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Mon, 7 Sep 2009 03:05:37 -0700 (PDT) Subject: [ofa-general] ofa_1_5_kernel 20090907-0200 daily build status Message-ID: <20090907100538.11846E60FD1@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git git_branch: ofed_kernel_1_5 Common build parameters: Passed: Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.26 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.27 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18-128.el5 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with 
linux-2.6.26 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.27 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.25 Passed on ia64 with linux-2.6.26 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Failed: Build failed on x86_64 with linux-2.6.9-78.ELsmp Log: Build failed on x86_64 with linux-2.6.9-67.ELsmp Log: /home/vlad/tmp/ofa_1_5_kernel-20090907-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/tcp_listen.c:71: error: 'struct inet_sock' has no member named 'daddr' /home/vlad/tmp/ofa_1_5_kernel-20090907-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/tcp_listen.c:71: error: 'struct inet_sock' has no member named 'dport' /home/vlad/tmp/ofa_1_5_kernel-20090907-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/tcp_listen.c:73: error: 'struct inet_sock' has no member named 'saddr' /home/vlad/tmp/ofa_1_5_kernel-20090907-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/tcp_listen.c:73: error: 'struct inet_sock' has no member named 'daddr' make[3]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090907-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/tcp_listen.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090907-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_5_kernel-20090907-0200_linux-2.6.9-67.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-67.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- /home/vlad/tmp/ofa_1_5_kernel-20090907-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/tcp_listen.c:71: error: 'struct inet_sock' has no member named 'daddr' /home/vlad/tmp/ofa_1_5_kernel-20090907-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/tcp_listen.c:71: error: 'struct 
inet_sock' has no member named 'dport' /home/vlad/tmp/ofa_1_5_kernel-20090907-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/tcp_listen.c:73: error: 'struct inet_sock' has no member named 'saddr' /home/vlad/tmp/ofa_1_5_kernel-20090907-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/tcp_listen.c:73: error: 'struct inet_sock' has no member named 'daddr' make[3]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090907-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/tcp_listen.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090907-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_5_kernel-20090907-0200_linux-2.6.9-78.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-78.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- From vst at vlnb.net Mon Sep 7 04:58:25 2009 From: vst at vlnb.net (Vladislav Bolkhovitin) Date: Mon, 07 Sep 2009 15:58:25 +0400 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4A9FA945.4070408@vlnb.net> Message-ID: <4AA4F561.504@vlnb.net> Chris Worley, on 09/06/2009 05:41 PM wrote: > On Sun, Sep 6, 2009 at 3:36 PM, Chris Worley wrote: >> On Sun, Sep 6, 2009 at 3:17 PM, Bart Van Assche wrote: >>> On Fri, Sep 4, 2009 at 1:20 AM, Chris Worley wrote: >>>> On Thu, Sep 3, 2009 at 11:38 AM, Chris Worley wrote: >>>>> I've used a couple of initiators (different systems) w/ different >>>>> OSes, w/ different IB cards (all QDR) and different IB stacks >>>>> (built-in vs. OFED) and can repeat the problem in all but the >>>>> RHEL5.2/OFED 1.4.1 target and initiator (but, if the initiator is >>>>> WinOF and the target is RHEL5.2/OFED1.4.1, then the problem does >>>>> repeat). 
>>>> Here's a twist: I used the Ubuntu initiator w/ one of the RHEL >>>> targets, and the RHEL initiator (same machine as was running WinOF >>>> from the beginning of this thread) w/ one of the Ubuntu targets: in >>>> both cases, the problem does not repeat. >>>> >>>> That makes it sound like OFED is the cure on either side of the >>>> connection, but does not explain the issue w/ WinOF (which does fail >>>> w/ either Ununtu or RHEL targets). >>> These results are strange. Regarding the Linux-only tests, I was >>> assuming failure of a single component (Ubuntu SRP initiator, OFED SRP >>> initiator, Ubuntu IB driver, OFED IB driver or SRP target), but for >>> each of these components there is at least one test that passes and at >>> least one test that fails. So either my assumption is wrong or one of >>> the above test results is not repeatable. Do you have the time to >>> repeat the Linux-only tests ? >> Last night I was rerunning the RHEL5.2 initiator w/ Ubuntu client, and >> the problem repeated; now, I can't repeat the case where it didn't >> fail. Still, no errors, other than the eventual timeouts previously >> shown; the target thinks all is fine, the initiator is stuck. > > ... and I haven't had any success w/ Ubuntu target and initiator, 8.10 or 9.04. 1. Try with kernel parameter maxcpus=1. It will somehow relax possible races you have, although not completely. 2. Try with another hardware, including motherboard. You can have something like http://lkml.org/lkml/2007/7/31/558 (not exactly it, of course) > Chris >> Chris >>> Bart. 
> >>> From sashak at voltaire.com Mon Sep 7 05:10:03 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 7 Sep 2009 15:10:03 +0300 Subject: [ofa-general] [PATCH] opensm/multicast: kill mc group to_be_deleted flag In-Reply-To: <20090906213852.GM25241@me> References: <20090906154901.GF25241@me> <20090906154931.GG25241@me> <20090906173900.GK25241@me> <20090906174548.GL25241@me> <20090906213852.GM25241@me> Message-ID: <20090907121003.GN25241@me> This removes the delayed multicast group deletion logic. We don't need it anymore because MFT cleanup (when a group is removed) is performed using the MLID only (not the mgrp content), as requested by the corresponding osm_sm_mcgrp_leave() call. Signed-off-by: Sasha Khapyorsky --- opensm/include/opensm/osm_multicast.h | 7 +--- opensm/opensm/osm_mcast_mgr.c | 13 +----- opensm/opensm/osm_multicast.c | 20 ++++++---- opensm/opensm/osm_sa_mcmember_record.c | 65 ++++--------------------------- 4 files changed, 24 insertions(+), 81 deletions(-) diff --git a/opensm/include/opensm/osm_multicast.h b/opensm/include/opensm/osm_multicast.h index ce3d310..181f0db 100644 --- a/opensm/include/opensm/osm_multicast.h +++ b/opensm/include/opensm/osm_multicast.h @@ -127,7 +127,6 @@ typedef struct osm_mgrp { cl_qmap_t mcm_port_tbl; ib_member_rec_t mcmember_rec; boolean_t well_known; - boolean_t to_be_deleted; unsigned full_members; } osm_mgrp_t; /* @@ -156,11 +155,6 @@ typedef struct osm_mgrp { * is created during the initialization of SM/SA and will be * present even if there are no ports for this group * -* to_be_deleted -* Since groups are deleted only after re-route we need to -* track the fact the group is about to be deleted so we can -* track the fact a new join is actually a create request.
-* * SEE ALSO *********/ @@ -413,6 +407,7 @@ osm_mgrp_delete_port(IN osm_subn_t * const p_subn, int osm_mgrp_remove_port(osm_subn_t *subn, osm_log_t *log, osm_mgrp_t *mgrp, osm_mcm_port_t *mcm, uint8_t join_state); +void osm_mgrp_cleanup(osm_subn_t *subn, osm_mgrp_t *mpgr); END_C_DECLS #endif /* _OSM_MULTICAST_H_ */ diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c index f3ddacf..c1d1916 100644 --- a/opensm/opensm/osm_mcast_mgr.c +++ b/opensm/opensm/osm_mcast_mgr.c @@ -1030,21 +1030,12 @@ static ib_api_status_t mcast_mgr_process_mlid(osm_sm_t * sm, uint16_t mlid) mcast_mgr_clear(sm, mlid); mgrp = osm_get_mgrp_by_mlid(sm->p_subn, cl_hton16(mlid)); - if (!mgrp) /* already removed */ - return IB_SUCCESS; - - if (mgrp->full_members) { + if (mgrp) { status = mcast_mgr_build_spanning_tree(sm, mgrp); if (status != IB_SUCCESS) OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A17: " "Unable to create spanning tree (%s) for mlid " "0x%x\n", ib_get_err_str(status), mlid); - } else if (mgrp->to_be_deleted) { - OSM_LOG(sm->p_log, OSM_LOG_DEBUG, - "Destroying mgrp with lid:0x%x\n", mlid); - sm->p_subn->mgroups[mlid - IB_LID_MCAST_START_HO] = NULL; - cl_fmap_remove_item(&sm->p_subn->mgrp_mgid_tbl, &mgrp->map_item); - osm_mgrp_delete(mgrp); } OSM_LOG_EXIT(sm->p_log); @@ -1120,7 +1111,7 @@ int osm_mcast_mgr_process(osm_sm_t * sm) for (i = 0; i <= sm->p_subn->max_mcast_lid_ho - IB_LID_MCAST_START_HO; i++) - if (sm->p_subn->mgroups[i]) + if (sm->p_subn->mgroups[i] || sm->mlids_req[i]) mcast_mgr_process_mlid(sm, i + IB_LID_MCAST_START_HO); memset(sm->mlids_req, 0, sm->mlids_req_max); diff --git a/opensm/opensm/osm_multicast.c b/opensm/opensm/osm_multicast.c index 1326161..242eae7 100644 --- a/opensm/opensm/osm_multicast.c +++ b/opensm/opensm/osm_multicast.c @@ -86,11 +86,19 @@ osm_mgrp_t *osm_mgrp_new(IN const ib_net16_t mlid) memset(p_mgrp, 0, sizeof(*p_mgrp)); cl_qmap_init(&p_mgrp->mcm_port_tbl); p_mgrp->mlid = mlid; - p_mgrp->to_be_deleted = FALSE; return p_mgrp; } 
+void osm_mgrp_cleanup(osm_subn_t *subn, osm_mgrp_t *mgrp) +{ + if (mgrp->full_members || mgrp->well_known) + return; + subn->mgroups[cl_ntoh16(mgrp->mlid) - IB_LID_MCAST_START_HO] = NULL; + cl_fmap_remove_item(&subn->mgrp_mgid_tbl, &mgrp->map_item); + osm_mgrp_delete(mgrp); +} + /********************************************************************** **********************************************************************/ static void mgrp_send_notice(osm_subn_t * subn, osm_log_t * log, @@ -169,10 +177,8 @@ osm_mcm_port_t *osm_mgrp_add_port(IN osm_subn_t * subn, osm_log_t * log, if ((join_state & IB_JOIN_STATE_FULL) && !(prev_join_state & IB_JOIN_STATE_FULL) && - (++p_mgrp->full_members == 1)) { + ++p_mgrp->full_members == 1) mgrp_send_notice(subn, log, p_mgrp, 66); - p_mgrp->to_be_deleted = 0; - } return (p_mcm_port); } @@ -213,11 +219,8 @@ int osm_mgrp_remove_port(osm_subn_t * subn, osm_log_t * log, osm_mgrp_t * mgrp, but only if it is not a well known group */ if ((port_join_state & IB_JOIN_STATE_FULL) && !(new_join_state & IB_JOIN_STATE_FULL) && - (--mgrp->full_members == 0)) { + --mgrp->full_members == 0) mgrp_send_notice(subn, log, mgrp, 67); - if (!mgrp->well_known) - mgrp->to_be_deleted = 1; - } return ret; } @@ -230,6 +233,7 @@ void osm_mgrp_delete_port(osm_subn_t * subn, osm_log_t * log, osm_mgrp_t * mgrp, if (item != cl_qmap_end(&mgrp->mcm_port_tbl)) osm_mgrp_remove_port(subn, log, mgrp, (osm_mcm_port_t *) item, 0xf); + osm_mgrp_cleanup(subn, mgrp); } /********************************************************************** diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c index e8aecc4..5fc1064 100644 --- a/opensm/opensm/osm_sa_mcmember_record.c +++ b/opensm/opensm/osm_sa_mcmember_record.c @@ -125,36 +125,14 @@ static ib_net16_t get_new_mlid(osm_sa_t * sa, ib_net16_t requested_mlid) return requested_mlid; max = p_subn->max_mcast_lid_ho - IB_LID_MCAST_START_HO + 1; - for (i = 0; i < max; i++) { - osm_mgrp_t *p_mgrp 
= sa->p_subn->mgroups[i]; - if (!p_mgrp || p_mgrp->to_be_deleted) + for (i = 0; i < max; i++) + if (!sa->p_subn->mgroups[i]) return cl_hton16(i + IB_LID_MCAST_START_HO); - } return 0; } /********************************************************************* - This procedure is only invoked to cleanup an INTERMEDIATE mgrp. - If there is only one port on the mgrp it means that the current - request was the only member and the group is not really needed. So - we silently drop it. Since it was an intermediate group no need to - re-route it. -**********************************************************************/ -static void cleanup_mgrp(IN osm_sa_t * sa, osm_mgrp_t * mgrp) -{ - /* Remove MGRP only if osm_mcm_port_t count is 0 and - not a well known group */ - if (cl_is_qmap_empty(&mgrp->mcm_port_tbl) && !mgrp->well_known) { - sa->p_subn->mgroups[cl_ntoh16(mgrp->mlid) - - IB_LID_MCAST_START_HO] = NULL; - cl_fmap_remove_item(&sa->p_subn->mgrp_mgid_tbl, - &mgrp->map_item); - osm_mgrp_delete(mgrp); - } -} - -/********************************************************************* Add a port to the group. Calculating its PROXY_JOIN by the Port and requester gids. 
**********************************************************************/ @@ -812,7 +790,6 @@ ib_api_status_t osm_mcmr_rcv_create_new_mgrp(IN osm_sa_t * sa, unsigned zero_mgid, i; uint8_t scope; ib_gid_t *p_mgid; - osm_mgrp_t *p_prev_mgrp; ib_api_status_t status = IB_SUCCESS; ib_member_rec_t mcm_rec = *p_recvd_mcmember_rec; /* copy for modifications */ @@ -913,25 +890,8 @@ ib_api_status_t osm_mcmr_rcv_create_new_mgrp(IN osm_sa_t * sa, (*pp_mgrp)->mcmember_rec.pkt_life |= 2 << 6; /* exactly */ /* Insert the new group in the data base */ - - /* since we might have an old group by that mlid - one whose deletion was delayed for an idle time - we need to deallocate it first */ - p_prev_mgrp = osm_get_mgrp_by_mlid(sa->p_subn, mlid); - if (p_prev_mgrp) { - OSM_LOG(sa->p_log, OSM_LOG_DEBUG, - "Found previous group for mlid:0x%04x - " - "Destroying it first\n", cl_ntoh16(mlid)); - sa->p_subn->mgroups[cl_ntoh16(mlid) - IB_LID_MCAST_START_HO] = - NULL; - cl_fmap_remove_item(&sa->p_subn->mgrp_mgid_tbl, - &p_prev_mgrp->map_item); - osm_mgrp_delete(p_prev_mgrp); - } - cl_fmap_insert(&sa->p_subn->mgrp_mgid_tbl, &(*pp_mgrp)->mcmember_rec.mgid, &(*pp_mgrp)->map_item); - sa->p_subn->mgroups[cl_ntoh16(mlid) - IB_LID_MCAST_START_HO] = *pp_mgrp; Exit: @@ -975,8 +935,7 @@ osm_mgrp_t *osm_get_mgrp_by_mgid(IN osm_sa_t * sa, IN ib_gid_t * p_mgid) } mg = (osm_mgrp_t *)cl_fmap_get(&sa->p_subn->mgrp_mgid_tbl, p_mgid); - if (mg != (osm_mgrp_t *)cl_fmap_end(&sa->p_subn->mgrp_mgid_tbl) - && !mg->to_be_deleted) + if (mg != (osm_mgrp_t *)cl_fmap_end(&sa->p_subn->mgrp_mgid_tbl)) return mg; return NULL; @@ -1078,6 +1037,9 @@ static void mcmr_rcv_leave_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) mcmr_rcv_respond(sa, p_madw, &mcmember_rec); + CL_PLOCK_EXCL_ACQUIRE(sa->p_lock); + osm_mgrp_cleanup(sa->p_subn, p_mgrp); + CL_PLOCK_RELEASE(sa->p_lock); Exit: OSM_LOG_EXIT(sa->p_log); } @@ -1151,7 +1113,7 @@ static void mcmr_rcv_join_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) /* do we need to create a 
new group? */ p_mgrp = osm_get_mgrp_by_mgid(sa, &p_recvd_mcmember_rec->mgid); - if (!p_mgrp || p_mgrp->to_be_deleted) { + if (!p_mgrp) { /* check for JoinState.FullMember = 1 o15.0.1.9 */ if ((join_state & 0x01) != 0x01) { char gid_str[INET6_ADDRSTRLEN]; @@ -1235,7 +1197,7 @@ static void mcmr_rcv_join_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) || !validate_port_caps(sa->p_log, p_mgrp, p_physp) || !(join_state != 0)) { /* since we might have created the new group we need to cleanup */ - cleanup_mgrp(sa, p_mgrp); + osm_mgrp_cleanup(sa->p_subn, p_mgrp); CL_PLOCK_RELEASE(sa->p_lock); OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 1B12: " "validate_more_comp_fields, validate_port_caps, " @@ -1268,7 +1230,7 @@ static void mcmr_rcv_join_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) &p_mcmr_port); if (status != IB_SUCCESS) { /* we fail to add the port so we might need to delete the group */ - cleanup_mgrp(sa, p_mgrp); + osm_mgrp_cleanup(sa->p_subn, p_mgrp); CL_PLOCK_RELEASE(sa->p_lock); osm_sa_send_error(sa, p_madw, status == IB_INVALID_PARAMETER ? IB_SA_MAD_STATUS_REQ_INVALID : @@ -1299,7 +1261,6 @@ static void mcmr_rcv_join_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) osm_mgrp_delete_port(sa->p_subn, sa->p_log, p_mgrp, p_recvd_mcmember_rec->port_gid. 
unicast.interface_id); - cleanup_mgrp(sa, p_mgrp); CL_PLOCK_RELEASE(sa->p_lock); osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RESOURCES); goto Exit; @@ -1371,14 +1332,6 @@ static void mcmr_by_comp_mask(osm_sa_t * sa, const ib_member_rec_t * p_rcvd_rec, OSM_LOG(sa->p_log, OSM_LOG_DEBUG, "Checking mlid:0x%X\n", cl_ntoh16(p_mgrp->mlid)); - /* the group might be marked for deletion */ - if (p_mgrp->to_be_deleted) { - OSM_LOG(sa->p_log, OSM_LOG_DEBUG, - "Group mlid:0x%X is marked to be deleted\n", - cl_ntoh16(p_mgrp->mlid)); - goto Exit; - } - /* first try to eliminate the group by MGID, MLID, or P_Key */ if ((IB_MCR_COMPMASK_MGID & comp_mask) && memcmp(&p_rcvd_rec->mgid, &p_mgrp->mcmember_rec.mgid, -- 1.6.4.2 From slavas at Voltaire.COM Mon Sep 7 06:39:39 2009 From: slavas at Voltaire.COM (Slava Strebkov) Date: Mon, 07 Sep 2009 16:39:39 +0300 Subject: [ofa-general] [PATCH v3] opensm: support routing engine update Message-ID: <4AA50D1B.70509@Voltaire.COM> Set up a routing engine only when it is about to be used, and delete its resources when the routing algorithm fails. This saves allocations for routing algorithms that are not used.
Signed-off-by: Slava Strebkov --- opensm/include/opensm/osm_opensm.h | 5 +++ opensm/opensm/osm_opensm.c | 57 +++++++++++++++++++++++++++++++----- opensm/opensm/osm_subnet.c | 7 ++++- opensm/opensm/osm_ucast_mgr.c | 28 +++++++++++++++++ 4 files changed, 88 insertions(+), 9 deletions(-) diff --git a/opensm/include/opensm/osm_opensm.h b/opensm/include/opensm/osm_opensm.h index c121be4..ca0fddb 100644 --- a/opensm/include/opensm/osm_opensm.h +++ b/opensm/include/opensm/osm_opensm.h @@ -109,6 +109,7 @@ typedef enum _osm_routing_engine_type { } osm_routing_engine_type_t; /***********/ +struct osm_opensm; /****s* OpenSM: OpenSM/osm_routing_engine * NAME * struct osm_routing_engine @@ -122,6 +123,8 @@ typedef enum _osm_routing_engine_type { struct osm_routing_engine { const char *name; void *context; + int initialized; + int (*setup) (struct osm_routing_engine *re, struct osm_opensm *p_osm); int (*build_lid_matrices) (void *context); int (*ucast_build_fwd_tables) (void *context); void (*ucast_dump_tables) (void *context); @@ -183,6 +186,7 @@ typedef struct osm_opensm { cl_dispatcher_t disp; cl_plock_t lock; struct osm_routing_engine *routing_engine_list; + struct osm_routing_engine *last_routing_engine; osm_routing_engine_type_t routing_engine_used; osm_stats_t stats; osm_console_t console; @@ -522,6 +526,7 @@ extern volatile unsigned int osm_exit_flag; * DESCRIPTION * Set to one to cause all threads to leave *********/ +void osm_update_routing_engines(osm_opensm_t *osm, const char *engine_names); END_C_DECLS #endif /* _OSM_OPENSM_H_ */ diff --git a/opensm/opensm/osm_opensm.c b/opensm/opensm/osm_opensm.c index 50d1349..f90584d 100644 --- a/opensm/opensm/osm_opensm.c +++ b/opensm/opensm/osm_opensm.c @@ -169,14 +169,7 @@ static void setup_routing_engine(osm_opensm_t *osm, const char *name) memset(re, 0, sizeof(struct osm_routing_engine)); re->name = m->name; - if (m->setup(re, osm)) { - OSM_LOG(&osm->log, OSM_LOG_VERBOSE, - "setup of routing" - " engine \'%s\' failed\n", 
name); - return; - } - OSM_LOG(&osm->log, OSM_LOG_DEBUG, - "\'%s\' routing engine set up\n", re->name); + re->setup = m->setup; append_routing_engine(osm, re); return; } @@ -236,6 +229,54 @@ static void destroy_routing_engines(osm_opensm_t *osm) r->delete(r->context); free(r); } + osm->routing_engine_list = NULL; +} + +static void update_routing_engine( + struct osm_routing_engine *cur, + struct osm_routing_engine *last) +{ + struct osm_routing_engine *next = cur->next; + if (!last) + return; /* no last routing engine */ + memcpy(cur, last, sizeof(*cur)); + /* restore next */ + cur->next = next; +} + +void osm_update_routing_engines(osm_opensm_t *osm, const char *engine_names) +{ + struct osm_routing_engine *r, *l; + /* find used routing engine and save as last */ + l = r = osm->routing_engine_list; + if (r && osm->routing_engine_used == osm_routing_engine_type(r->name)) { + osm->last_routing_engine = r; + osm->routing_engine_list = r->next; + } + else while ((r = r->next)) { + if (osm->routing_engine_used == + osm_routing_engine_type(r->name)) { + osm->last_routing_engine = r; + l->next = r->next; + break; + } + l = r; + } + /* cleanup prev routing engine list and replace with current list */ + destroy_routing_engines(osm); + setup_routing_engines(osm, engine_names); + /* check if last routing engine exist in new list and update callbacks */ + r = osm->routing_engine_list; + while (r) { + if (osm->routing_engine_used == + osm_routing_engine_type(r->name)) { + update_routing_engine(r, osm->last_routing_engine); + free(osm->last_routing_engine); + osm->last_routing_engine = NULL; + break; + } + r = r->next; + } } /********************************************************************** diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c index 8d63a75..742ae64 100644 --- a/opensm/opensm/osm_subnet.c +++ b/opensm/opensm/osm_subnet.c @@ -152,6 +152,11 @@ static void opts_setup_sm_priority(osm_subn_t *p_subn, void *p_val) osm_set_sm_priority(p_sm, 
sm_priority); } +static void opts_setup_routing_engine(osm_subn_t *p_subn, void *p_val) +{ + osm_update_routing_engines(p_subn->p_osm, p_val); +} + static void opts_parse_net64(IN osm_subn_t *p_subn, IN char *p_key, IN char *p_val_str, void *p_v1, void *p_v2, void (*pfn)(osm_subn_t *, void *)) @@ -324,7 +329,7 @@ static const opt_rec_t opt_tbl[] = { { "hop_weights_file", OPT_OFFSET(hop_weights_file), opts_parse_charp, NULL, 0 }, { "port_profile_switch_nodes", OPT_OFFSET(port_profile_switch_nodes), opts_parse_boolean, NULL, 1 }, { "sweep_on_trap", OPT_OFFSET(sweep_on_trap), opts_parse_boolean, NULL, 1 }, - { "routing_engine", OPT_OFFSET(routing_engine_names), opts_parse_charp, NULL, 0 }, + { "routing_engine", OPT_OFFSET(routing_engine_names), opts_parse_charp, opts_setup_routing_engine, 1 }, { "connect_roots", OPT_OFFSET(connect_roots), opts_parse_boolean, NULL, 1 }, { "use_ucast_cache", OPT_OFFSET(use_ucast_cache), opts_parse_boolean, NULL, 1 }, { "log_file", OPT_OFFSET(log_file), opts_parse_charp, NULL, 0 }, diff --git a/opensm/opensm/osm_ucast_mgr.c b/opensm/opensm/osm_ucast_mgr.c index 39d825c..d6294ac 100644 --- a/opensm/opensm/osm_ucast_mgr.c +++ b/opensm/opensm/osm_ucast_mgr.c @@ -998,8 +998,23 @@ int osm_ucast_mgr_process(IN osm_ucast_mgr_t * p_mgr) p_osm->routing_engine_used = OSM_ROUTING_ENGINE_TYPE_NONE; while (p_routing_eng) { + if (!p_routing_eng->initialized && + p_routing_eng->setup(p_routing_eng, p_osm)) { + OSM_LOG(p_mgr->p_log, OSM_LOG_ERROR, + "ERR 3A0F: setup of routing engine \'%s\' failed\n", + p_routing_eng->name); + p_routing_eng = p_routing_eng->next; + continue; + } + OSM_LOG(p_mgr->p_log, OSM_LOG_INFO, + "\'%s\' routing engine set up\n", p_routing_eng->name); + p_routing_eng->initialized = 1; if (!ucast_mgr_route(p_routing_eng, p_osm)) break; + /* delete unused routing engine */ + if (p_routing_eng->delete) + p_routing_eng->delete(p_routing_eng->context); + p_routing_eng->initialized = 0; p_routing_eng = p_routing_eng->next; } @@ -1011,6 
+1026,19 @@ int osm_ucast_mgr_process(IN osm_ucast_mgr_t * p_mgr) p_osm->routing_engine_used = OSM_ROUTING_ENGINE_TYPE_MINHOP; } + /* if for some reason different routing engine is used */ + /* cleanup last unused routing engine */ + p_routing_eng = p_osm->last_routing_engine; + if (p_routing_eng) { + if (p_routing_eng->initialized && + p_routing_eng->delete && + p_osm->routing_engine_used != + osm_routing_engine_type(p_routing_eng->name)) + p_routing_eng->delete(p_routing_eng->context); + free(p_routing_eng); + p_osm->last_routing_engine = NULL; + } + OSM_LOG(p_mgr->p_log, OSM_LOG_INFO, "%s tables configured on all switches\n", osm_routing_engine_type_str(p_osm->routing_engine_used)); -- 1.6.3.3 From rdreier at cisco.com Mon Sep 7 08:37:26 2009 From: rdreier at cisco.com (Roland Dreier) Date: Mon, 07 Sep 2009 08:37:26 -0700 Subject: [ofa-general] [NEW PATCH] IB/mad: Fix possible lock-lock-timer deadlock Message-ID: A new interface was added to the core workqueue API to make handling cancel_delayed_work() deadlocks easier, so a simpler fix for bug 13757 as below becomes possible. Bart, it would be great if you could retest this, since it is what I am planning on sending upstream for 2.6.31. (This patch depends on 4e49627b, "workqueues: introduce __cancel_delayed_work()", which was merged for 2.6.31-rc9; alternatively my for-next branch is now rebased on top of -rc9 and has this patch plus everything else queued for 2.6.32). Thanks, Roland Lockdep reported a possible deadlock with cm_id_priv->lock, mad_agent_priv->lock and mad_agent_priv->timed_work.timer; this happens because the mad module does cancel_delayed_work(&mad_agent_priv->timed_work); while holding mad_agent_priv->lock. cancel_delayed_work() internally does del_timer_sync(&mad_agent_priv->timed_work.timer). 
This can turn into a deadlock because mad_agent_priv->lock is taken inside cm_id_priv->lock, so we can get the following set of contexts that deadlock each other: A: holding cm_id_priv->lock, waiting for mad_agent_priv->lock B: holding mad_agent_priv->lock, waiting for del_timer_sync() C: interrupt during mad_agent_priv->timed_work.timer that takes cm_id_priv->lock Fix this by using the new __cancel_delayed_work() interface (which internally does del_timer() instead of del_timer_sync()) in all the places where we are holding a lock. Addresses: http://bugzilla.kernel.org/show_bug.cgi?id=13757 Reported-by: Bart Van Assche Signed-off-by: Roland Dreier --- drivers/infiniband/core/mad.c | 6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c index de922a0..bc30c00 100644 --- a/drivers/infiniband/core/mad.c +++ b/drivers/infiniband/core/mad.c @@ -1974,7 +1974,7 @@ static void adjust_timeout(struct ib_mad_agent_private *mad_agent_priv) unsigned long delay; if (list_empty(&mad_agent_priv->wait_list)) { - cancel_delayed_work(&mad_agent_priv->timed_work); + __cancel_delayed_work(&mad_agent_priv->timed_work); } else { mad_send_wr = list_entry(mad_agent_priv->wait_list.next, struct ib_mad_send_wr_private, @@ -1983,7 +1983,7 @@ static void adjust_timeout(struct ib_mad_agent_private *mad_agent_priv) if (time_after(mad_agent_priv->timeout, mad_send_wr->timeout)) { mad_agent_priv->timeout = mad_send_wr->timeout; - cancel_delayed_work(&mad_agent_priv->timed_work); + __cancel_delayed_work(&mad_agent_priv->timed_work); delay = mad_send_wr->timeout - jiffies; if ((long)delay <= 0) delay = 1; @@ -2023,7 +2023,7 @@ static void wait_for_response(struct ib_mad_send_wr_private *mad_send_wr) /* Reschedule a work item if we have a shorter timeout */ if (mad_agent_priv->wait_list.next == &mad_send_wr->agent_list) { - cancel_delayed_work(&mad_agent_priv->timed_work); + 
__cancel_delayed_work(&mad_agent_priv->timed_work); queue_delayed_work(mad_agent_priv->qp_info->port_priv->wq, &mad_agent_priv->timed_work, delay); } -- 1.6.4 From sashak at voltaire.com Mon Sep 7 08:47:47 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 7 Sep 2009 18:47:47 +0300 Subject: [ofa-general] [PATCH] opensm: port object reference in mcm ports list Message-ID: <20090907154747.GO25241@me> This adds a reference to the port object in each entry of a multicast group's list of port-related structures. This saves a couple of lookups in the port GUID table and simplifies some interfaces. Signed-off-by: Sasha Khapyorsky --- opensm/include/opensm/osm_mcm_port.h | 24 +++++++++----- opensm/include/opensm/osm_multicast.h | 14 +++++--- opensm/opensm/osm_mcast_mgr.c | 54 ++----------------------------- opensm/opensm/osm_mcm_port.c | 11 +++--- opensm/opensm/osm_multicast.c | 16 +++------ opensm/opensm/osm_sa.c | 24 +++++++------- opensm/opensm/osm_sa_mcmember_record.c | 13 +++----- 7 files changed, 56 insertions(+), 100 deletions(-) diff --git a/opensm/include/opensm/osm_mcm_port.h b/opensm/include/opensm/osm_mcm_port.h index c2b18de..99ded21 100644 --- a/opensm/include/opensm/osm_mcm_port.h +++ b/opensm/include/opensm/osm_mcm_port.h @@ -46,6 +46,7 @@ #include #include #include +#include #ifdef __cplusplus # define BEGIN_C_DECLS extern "C" { @@ -71,6 +72,7 @@ BEGIN_C_DECLS */ typedef struct osm_mcm_port { cl_map_item_t map_item; + const osm_port_t *port; ib_gid_t port_gid; uint8_t scope_state; boolean_t proxy_join; @@ -80,6 +82,9 @@ typedef struct osm_mcm_port { * map_item * Map Item for qmap linkage. Must be first element!! * +* port +* Reference to the parent port. +* * port_gid * GID of the member port. 
* @@ -106,19 +111,20 @@ typedef struct osm_mcm_port { * * SYNOPSIS */ -osm_mcm_port_t *osm_mcm_port_new(IN const ib_gid_t * const p_port_gid, - IN const uint8_t scope_state, - IN const boolean_t proxy_join); +osm_mcm_port_t *osm_mcm_port_new(IN const osm_port_t * port, + IN ib_member_rec_t *mcmr, + IN boolean_t proxy_join); /* * PARAMETERS -* p_port_gid -* [in] Pointer to the GID of the port to add to the multicast group. +* port +* [in] Pointer to the port object. +* GID of the port to add to the multicast group. * -* scope_state -* [in] scope state of the join request +* mcmr +* [in] Pointer to MCMember record of the join request * -* proxy_join -* [in] proxy_join state analyzed from the request +* proxy_join +* [in] proxy_join state analyzed from the request * * RETURN VALUES * Pointer to the allocated and initialized MCM Port object. diff --git a/opensm/include/opensm/osm_multicast.h b/opensm/include/opensm/osm_multicast.h index 181f0db..759fba1 100644 --- a/opensm/include/opensm/osm_multicast.h +++ b/opensm/include/opensm/osm_multicast.h @@ -310,19 +310,21 @@ static inline ib_net16_t osm_mgrp_get_mlid(IN const osm_mgrp_t * const p_mgrp) */ osm_mcm_port_t *osm_mgrp_add_port(osm_subn_t *subn, osm_log_t *log, IN osm_mgrp_t * const p_mgrp, - IN const ib_gid_t * const p_port_gid, - IN const uint8_t join_state, + IN osm_port_t *port, IN ib_member_rec_t *mcmr, IN boolean_t proxy_join); /* * PARAMETERS * p_mgrp * [in] Pointer to an osm_mgrp_t object to initialize. * -* p_port_gid -* [in] Pointer to the GID of the port to add to the multicast group. +* port +* [in] Pointer to an osm_port_t object * -* join_state -* [in] The join state for this port in the group. +* mcmr +* [in] Pointer to MCMember record received for the join +* +* proxy_join +* [in] The proxy join state for this port in the group. 
* * RETURN VALUES * IB_SUCCESS diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c index c1d1916..3894677 100644 --- a/opensm/opensm/osm_mcast_mgr.c +++ b/opensm/opensm/osm_mcast_mgr.c @@ -132,7 +132,6 @@ static float osm_mcast_mgr_compute_avg_hops(osm_sm_t * sm, float avg_hops = 0; uint32_t hops = 0; uint32_t num_ports = 0; - const osm_port_t *p_port; const osm_mcm_port_t *p_mcm_port; const cl_qmap_t *p_mcm_tbl; @@ -148,23 +147,7 @@ static float osm_mcast_mgr_compute_avg_hops(osm_sm_t * sm, p_mcm_port != (osm_mcm_port_t *) cl_qmap_end(p_mcm_tbl); p_mcm_port = (osm_mcm_port_t *) cl_qmap_next(&p_mcm_port->map_item)) { - /* - Acquire the port object for this port guid, then create - the new worker object to build the list. - */ - p_port = osm_get_port_by_guid(sm->p_subn, - ib_gid_get_guid(&p_mcm_port-> - port_gid)); - - if (!p_port) { - OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A18: " - "No port object for port 0x%016" PRIx64 "\n", - cl_ntoh64(ib_gid_get_guid - (&p_mcm_port->port_gid))); - continue; - } - - hops += osm_switch_get_port_least_hops(p_sw, p_port); + hops += osm_switch_get_port_least_hops(p_sw, p_mcm_port->port); num_ports++; } @@ -190,7 +173,6 @@ static float osm_mcast_mgr_compute_max_hops(osm_sm_t * sm, { uint32_t max_hops = 0; uint32_t hops = 0; - const osm_port_t *p_port; const osm_mcm_port_t *p_mcm_port; const cl_qmap_t *p_mcm_tbl; @@ -206,23 +188,7 @@ static float osm_mcast_mgr_compute_max_hops(osm_sm_t * sm, p_mcm_port != (osm_mcm_port_t *) cl_qmap_end(p_mcm_tbl); p_mcm_port = (osm_mcm_port_t *) cl_qmap_next(&p_mcm_port->map_item)) { - /* - Acquire the port object for this port guid, then create - the new worker object to build the list. 
- */ - p_port = osm_get_port_by_guid(sm->p_subn, - ib_gid_get_guid(&p_mcm_port-> - port_gid)); - - if (!p_port) { - OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A1A: " - "No port object for port 0x%016" PRIx64 "\n", - cl_ntoh64(ib_gid_get_guid - (&p_mcm_port->port_gid))); - continue; - } - - hops = osm_switch_get_port_least_hops(p_sw, p_port); + hops = osm_switch_get_port_least_hops(p_sw, p_mcm_port->port); if (hops > max_hops) max_hops = hops; } @@ -714,7 +680,6 @@ static ib_api_status_t mcast_mgr_build_spanning_tree(osm_sm_t * sm, osm_mgrp_t * p_mgrp) { const cl_qmap_t *p_mcm_tbl; - const osm_port_t *p_port; const osm_mcm_port_t *p_mcm_port; uint32_t num_ports; cl_qlist_t port_list; @@ -781,23 +746,12 @@ static ib_api_status_t mcast_mgr_build_spanning_tree(osm_sm_t * sm, Acquire the port object for this port guid, then create the new worker object to build the list. */ - p_port = osm_get_port_by_guid(sm->p_subn, - ib_gid_get_guid(&p_mcm_port-> - port_gid)); - if (!p_port) { - OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A09: " - "No port object for port 0x%016" PRIx64 "\n", - cl_ntoh64(ib_gid_get_guid - (&p_mcm_port->port_gid))); - continue; - } - - p_wobj = mcast_work_obj_new(p_port); + p_wobj = mcast_work_obj_new(p_mcm_port->port); if (p_wobj == NULL) { OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A10: " "Insufficient memory to route port 0x%016" PRIx64 "\n", - cl_ntoh64(osm_port_get_guid(p_port))); + cl_ntoh64(osm_port_get_guid(p_mcm_port->port))); continue; } diff --git a/opensm/opensm/osm_mcm_port.c b/opensm/opensm/osm_mcm_port.c index b6b6149..7f5e9ff 100644 --- a/opensm/opensm/osm_mcm_port.c +++ b/opensm/opensm/osm_mcm_port.c @@ -50,17 +50,18 @@ /********************************************************************** **********************************************************************/ -osm_mcm_port_t *osm_mcm_port_new(IN const ib_gid_t * const p_port_gid, - IN const uint8_t scope_state, - IN const boolean_t proxy_join) +osm_mcm_port_t *osm_mcm_port_new(IN const 
osm_port_t *port, + IN ib_member_rec_t *mcmr, + IN boolean_t proxy_join) { osm_mcm_port_t *p_mcm; p_mcm = malloc(sizeof(*p_mcm)); if (p_mcm) { memset(p_mcm, 0, sizeof(*p_mcm)); - p_mcm->port_gid = *p_port_gid; - p_mcm->scope_state = scope_state; + p_mcm->port = port; + p_mcm->port_gid = mcmr->port_gid; + p_mcm->scope_state = mcmr->scope_state; p_mcm->proxy_join = proxy_join; } diff --git a/opensm/opensm/osm_multicast.c b/opensm/opensm/osm_multicast.c index 242eae7..b0ab4df 100644 --- a/opensm/opensm/osm_multicast.c +++ b/opensm/opensm/osm_multicast.c @@ -131,23 +131,19 @@ static void mgrp_send_notice(osm_subn_t * subn, osm_log_t * log, /********************************************************************** **********************************************************************/ osm_mcm_port_t *osm_mgrp_add_port(IN osm_subn_t * subn, osm_log_t * log, - IN osm_mgrp_t * p_mgrp, - IN const ib_gid_t * p_port_gid, - IN const uint8_t join_state, + IN osm_mgrp_t * p_mgrp, osm_port_t *port, + IN ib_member_rec_t *mcmr, IN boolean_t proxy_join) { - ib_net64_t port_guid; osm_mcm_port_t *p_mcm_port; cl_map_item_t *prev_item; - uint8_t prev_join_state = 0; + uint8_t prev_join_state = 0, join_state = mcmr->scope_state; uint8_t prev_scope; - p_mcm_port = osm_mcm_port_new(p_port_gid, join_state, proxy_join); + p_mcm_port = osm_mcm_port_new(port, mcmr, proxy_join); if (!p_mcm_port) return NULL; - port_guid = p_port_gid->unicast.interface_id; - /* prev_item = cl_qmap_insert(...) Pointer to the item in the map with the specified key. If insertion @@ -155,8 +151,8 @@ osm_mcm_port_t *osm_mgrp_add_port(IN osm_subn_t * subn, osm_log_t * log, specified key already exists in the map, the pointer to that item is returned. 
*/ - prev_item = cl_qmap_insert(&p_mgrp->mcm_port_tbl, - port_guid, &p_mcm_port->map_item); + prev_item = cl_qmap_insert(&p_mgrp->mcm_port_tbl, port->guid, + &p_mcm_port->map_item); /* if already exists - revert the insertion and only update join state */ if (prev_item != &p_mcm_port->map_item) { diff --git a/opensm/opensm/osm_sa.c b/opensm/opensm/osm_sa.c index fcc3f27..02737c2 100644 --- a/opensm/opensm/osm_sa.c +++ b/opensm/opensm/osm_sa.c @@ -995,26 +995,26 @@ int osm_sa_db_file_load(osm_opensm_t * p_osm) if (!p_mgrp) rereg_clients = 1; } else if (p_mgrp && !strncmp(p, "mcm_port", 8)) { - ib_gid_t port_gid; + ib_member_rec_t mcmr; ib_net64_t guid; - uint8_t scope_state; - boolean_t proxy_join; + osm_port_t *port; + boolean_t proxy; PARSE_AHEAD(p, net64, " port_gid=0x", - &port_gid.unicast.prefix); + &mcmr.port_gid.unicast.prefix); PARSE_AHEAD(p, net64, ":0x", - &port_gid.unicast.interface_id); - PARSE_AHEAD(p, net8, " scope_state=0x", &scope_state); + &mcmr.port_gid.unicast.interface_id); + PARSE_AHEAD(p, net8, " scope_state=0x", &mcmr.scope_state); PARSE_AHEAD(p, net8, " proxy_join=0x", &val); - proxy_join = val; + proxy = val; - guid = port_gid.unicast.interface_id; - if (cl_qmap_get(&p_mgrp->mcm_port_tbl, - port_gid.unicast.interface_id) == + guid = mcmr.port_gid.unicast.interface_id; + port = osm_get_port_by_guid(&p_osm->subn, guid); + if (port && + cl_qmap_get(&p_mgrp->mcm_port_tbl, guid) == cl_qmap_end(&p_mgrp->mcm_port_tbl)) osm_mgrp_add_port(&p_osm->subn, &p_osm->log, - p_mgrp, &port_gid, - scope_state, proxy_join); + p_mgrp, port, &mcmr, proxy); } else if (!strncmp(p, "Service Record:", 15)) { ib_service_record_t s_rec; uint32_t modified_time, lease_period; diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c index 5fc1064..c2027d1 100644 --- a/opensm/opensm/osm_sa_mcmember_record.c +++ b/opensm/opensm/osm_sa_mcmember_record.c @@ -137,6 +137,7 @@ static ib_net16_t get_new_mlid(osm_sa_t * sa, ib_net16_t 
requested_mlid) requester gids. **********************************************************************/ static ib_api_status_t add_new_mgrp_port(osm_sa_t * sa, IN osm_mgrp_t * p_mgrp, + IN osm_port_t *port, IN ib_member_rec_t * p_recvd_mcmember_rec, IN osm_mad_addr_t * p_mad_addr, @@ -172,10 +173,8 @@ static ib_api_status_t add_new_mgrp_port(osm_sa_t * sa, IN osm_mgrp_t * p_mgrp, "Create new port with proxy_join TRUE\n"); } - *pp_mcmr_port = osm_mgrp_add_port(sa->p_subn, sa->p_log, p_mgrp, - &p_recvd_mcmember_rec->port_gid, - p_recvd_mcmember_rec->scope_state, - proxy_join); + *pp_mcmr_port = osm_mgrp_add_port(sa->p_subn, sa->p_log, p_mgrp, port, + p_recvd_mcmember_rec, proxy_join); if (*pp_mcmr_port == NULL) { OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 1B06: " "osm_mgrp_add_port failed\n"); @@ -410,7 +409,6 @@ static boolean_t validate_modify(IN osm_sa_t * sa, IN osm_mgrp_t * p_mgrp, if the requester GID == PortGID */ res = osm_get_gid_by_mad_addr(sa->p_log, sa->p_subn, p_mad_addr, &request_gid); - if (res != IB_SUCCESS) { OSM_LOG(sa->p_log, OSM_LOG_DEBUG, "Could not find port for requested address\n"); @@ -443,8 +441,7 @@ static boolean_t validate_modify(IN osm_sa_t * sa, IN osm_mgrp_t * p_mgrp, OSM_LOG(sa->p_log, OSM_LOG_DEBUG, "ProxyJoin but port not in partition. stored:" "0x%016" PRIx64 " request:0x%016" PRIx64 "\n", - cl_ntoh64((*pp_mcm_port)->port_gid.unicast. - interface_id), + cl_ntoh64((*pp_mcm_port)->port->guid), cl_ntoh64(p_mad_addr->addr_type.gsi.grh_info. 
src_gid.unicast.interface_id)); return FALSE; @@ -1225,7 +1222,7 @@ static void mcmr_rcv_join_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) } /* create or update existing port (join-state will be updated) */ - status = add_new_mgrp_port(sa, p_mgrp, p_recvd_mcmember_rec, + status = add_new_mgrp_port(sa, p_mgrp, p_port, p_recvd_mcmember_rec, osm_madw_get_mad_addr_ptr(p_madw), &p_mcmr_port); if (status != IB_SUCCESS) { -- 1.6.4.2 From sashak at voltaire.com Mon Sep 7 09:00:06 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 7 Sep 2009 19:00:06 +0300 Subject: [ofa-general] Re: [PATCH 1/2 v3] opensm: Storage organization for multicast groups In-Reply-To: <20090906152505.GC25241@me> References: <4A798D56.2020408@Voltaire.COM> <20090906152505.GC25241@me> Message-ID: <20090907160006.GP25241@me> On 18:25 Sun 06 Sep , Sasha Khapyorsky wrote: > > > > Subject: [PATCH 1/2] Storage organization for multicast groups BTW, I pushed now series of various multicast improvements. As far as I understand this should simplify your implementation dramatically. Sasha From bart.vanassche at gmail.com Mon Sep 7 13:27:14 2009 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Mon, 7 Sep 2009 22:27:14 +0200 Subject: [ofa-general] Re: [NEW PATCH] IB/mad: Fix possible lock-lock-timer deadlock In-Reply-To: References: Message-ID: On Mon, Sep 7, 2009 at 5:37 PM, Roland Dreier wrote: > A new interface was added to the core workqueue API to make handling > cancel_delayed_work() deadlocks easier, so a simpler fix for bug 13757 > as below becomes possible.  Bart, it would be great if you could retest > this, since it is what I am planning on sending upstream for 2.6.31. > (This patch depends on 4e49627b, "workqueues: introduce > __cancel_delayed_work()", which was merged for 2.6.31-rc9; alternatively > my for-next branch is now rebased on top of -rc9 and has this patch plus > everything else queued for 2.6.32). 
Hello Roland, With 2.6.31-rc9 + patch 4e49627b9bc29a14b393c480e8c979e3bc922ef7 + the patch you posted at the start of this thread the following lockdep complaint was triggered on the SRP initiator system during SRP login: ====================================================== [ INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected ] 2.6.31-rc9 #2 ------------------------------------------------------ ibsrpdm/4290 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: (&(&rmpp_recv->cleanup_work)->timer){+.-...}, at: [] del_timer_sync+0x0/0xa0 and this task is already holding: (&mad_agent_priv->lock){..-...}, at: [] ib_cancel_rmpp_recvs+0x28/0x118 [ib_mad] which would create a new lock dependency: (&mad_agent_priv->lock){..-...} -> (&(&rmpp_recv->cleanup_work)->timer){+.-...} but this new dependency connects a HARDIRQ-irq-safe lock: (&priv->lock){-.-...} ... which became HARDIRQ-irq-safe at: [] 0xffffffffffffffff to a HARDIRQ-irq-unsafe lock: (&(&rmpp_recv->cleanup_work)->timer){+.-...} ... which became HARDIRQ-irq-unsafe at: ... [] 0xffffffffffffffff other info that might help us debug this: 2 locks held by ibsrpdm/4290: #0: (&port->file_mutex){+.+.+.}, at: [] ib_umad_close+0x39/0x120 [ib_umad] #1: (&mad_agent_priv->lock){..-...}, at: [] ib_cancel_rmpp_recvs+0x28/0x118 [ib_mad] [ ... ] stack backtrace: Pid: 4290, comm: ibsrpdm Not tainted 2.6.31-rc9 #2 Call Trace: [] check_usage+0x3ba/0x470 [] check_irq_usage+0x64/0x100 [] __lock_acquire+0xf72/0x1b50 [] lock_acquire+0x56/0x80 [] ? del_timer_sync+0x0/0xa0 [] del_timer_sync+0x3d/0xa0 [] ? del_timer_sync+0x0/0xa0 [] ib_cancel_rmpp_recvs+0x62/0x118 [ib_mad] [] ib_unregister_mad_agent+0x385/0x580 [ib_mad] [] ? mark_held_locks+0x6c/0x90 [] ib_umad_close+0xd2/0x120 [ib_umad] [] __fput+0xd0/0x1e0 [] fput+0x1d/0x30 [] filp_close+0x5b/0x90 [] put_files_struct+0x84/0xe0 [] exit_files+0x4e/0x60 [] do_exit+0x709/0x790 [] ? up_read+0x26/0x30 [] ? 
retint_swapgs+0xe/0x13 [] do_group_exit+0x3e/0xb0 [] sys_exit_group+0x12/0x20 [] system_call_fastpath+0x16/0x1b Bart. From sashak at voltaire.com Mon Sep 7 16:09:57 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 8 Sep 2009 02:09:57 +0300 Subject: [ofa-general] Re: [PATCH 0/3] Fat-Tree code cleanup In-Reply-To: <4A8C4943.6090408@morey-chaisemartin.com> References: <4A8C4943.6090408@morey-chaisemartin.com> Message-ID: <20090907230957.GQ25241@me> On 20:49 Wed 19 Aug , Nicolas Morey-Chaisemartin wrote: > Except the first one, patches in this series are trivial cleanups. > The first one remove the reverse_hop parameter which is not used anymore thanks to the current_hop. > > Nicolas Morey-Chaisemartin (3): > osm_ucast_ftree.c: Removed reverse_hop parameters from > fabric_route_upgoing_by_going_down > osm_ucast_ftree.c: Cleaned up many comments > osm_ucast_ftree.c: Applied osm_indent Applied all three. Thanks. Sasha From rdreier at cisco.com Mon Sep 7 21:21:37 2009 From: rdreier at cisco.com (Roland Dreier) Date: Mon, 07 Sep 2009 21:21:37 -0700 Subject: [ofa-general] Re: [NEW PATCH] IB/mad: Fix possible lock-lock-timer deadlock In-Reply-To: (Bart Van Assche's message of "Mon, 7 Sep 2009 22:27:14 +0200") References: Message-ID: > With 2.6.31-rc9 + patch 4e49627b9bc29a14b393c480e8c979e3bc922ef7 + the > patch you posted at the start of this thread the following lockdep > complaint was triggered on the SRP initiator system during SRP login: > > ====================================================== > [ INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected ] > 2.6.31-rc9 #2 > ------------------------------------------------------ > ibsrpdm/4290 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: > (&(&rmpp_recv->cleanup_work)->timer){+.-...}, at: > [] del_timer_sync+0x0/0xa0 > > and this task is already holding: > (&mad_agent_priv->lock){..-...}, at: [] > ib_cancel_rmpp_recvs+0x28/0x118 [ib_mad] > which would create a new lock dependency: > 
(&mad_agent_priv->lock){..-...} -> (&(&rmpp_recv->cleanup_work)->timer){+.-...} And this report doesn't happen with the older patch? (Did you do the same testing with the older patch that triggered this?) Because this looks like a *different* incarnation of the same lock->lock->delayed work/timer that we're trying to fix here -- the delayed work is now rmpp_recv->cleanup_work in this case instead of mad_agent_priv->timed_work as it was before. - R. From rdreier at cisco.com Mon Sep 7 21:56:41 2009 From: rdreier at cisco.com (Roland Dreier) Date: Mon, 07 Sep 2009 21:56:41 -0700 Subject: [ofa-general] [PLEASE READ] Transition from general@lists.openfabrics.org to linux-rdma@vger.kernel.org Message-ID: Hi everyone, As you may have noticed, the linux-rdma at vger.kernel.org list is up and running. A number of sites have started archiving it -- in no particular order, the sites I know of are: http://www.spinics.net/lists/linux-rdma/ http://marc.info/?l=linux-rdma http://www.mail-archive.com/linux-rdma at vger.kernel.org/ http://dir.gmane.org/gmane.linux.drivers.rdma If you know of any archives I've missed, please let me know. Another nice tool we have with the new list is http://patchwork.kernel.org/project/linux-rdma/list/ which will make me have to work a little harder to lose patches. Also, if you are someone who handles a driver or some other part of the tree, you can register and I will be able to delegate patches to you. I would suggest that everyone who is currently following general at lists.openfabrics.org subscribe to the new list (send the line "subscribe linux-rdma" in the body of a message to majordomo at vger.kernel.org). Also, I ask everyone sending email to the old list to please start copying the new vger list as well (especially for patches, so that the patchwork tool catches them).
As a date to finalize the transition, I propose the end of the month -- that is, on October 1 we shut down the old general@ list and expect everything to be cut over by then. I'll send a number of reminders before then, but please do speak up if this schedule is too tight -- I don't think keeping the old list running for longer is a big hardship for anyone. Finally, I'll plan to merge the following for 2.6.32: MAINTAINERS: InfiniBand/RDMA mailing list transition to vger InfiniBand/RDMA development discussion is moving from general at lists.openfabrics.org to linux-rdma at vger.kernel.org. Signed-off-by: Roland Dreier --- MAINTAINERS | 12 ++++++------ 1 files changed, 6 insertions(+), 6 deletions(-) diff --git a/MAINTAINERS b/MAINTAINERS index 8dca9d8..989ff11 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -439,7 +439,7 @@ F: drivers/hwmon/ams/ AMSO1100 RNIC DRIVER M: Tom Tucker M: Steve Wise -L: general at lists.openfabrics.org +L: linux-rdma at vger.kernel.org S: Maintained F: drivers/infiniband/hw/amso1100/ @@ -1494,7 +1494,7 @@ F: drivers/net/cxgb3/ CXGB3 IWARP RNIC DRIVER (IW_CXGB3) M: Steve Wise -L: general at lists.openfabrics.org +L: linux-rdma at vger.kernel.org W: http://www.openfabrics.org S: Supported F: drivers/infiniband/hw/cxgb3/ @@ -1868,7 +1868,7 @@ F: fs/efs/ EHCA (IBM GX bus InfiniBand adapter) DRIVER M: Hoang-Nam Nguyen M: Christoph Raisch -L: general at lists.openfabrics.org +L: linux-rdma at vger.kernel.org S: Supported F: drivers/infiniband/hw/ehca/ @@ -2552,7 +2552,7 @@ INFINIBAND SUBSYSTEM M: Roland Dreier M: Sean Hefty M: Hal Rosenstock -L: general at lists.openfabrics.org (moderated for non-subscribers) +L: linux-rdma at vger.kernel.org W: http://www.openib.org/ T: git git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git S: Supported @@ -2729,7 +2729,7 @@ F: drivers/net/ipg.c IPATH DRIVER M: Ralph Campbell -L: general at lists.openfabrics.org +L: linux-rdma at vger.kernel.org T: git git://git.qlogic.com/ipath-linux-2.6 S: 
Supported F: drivers/infiniband/hw/ipath/ @@ -3485,7 +3485,7 @@ F: drivers/scsi/NCR_D700.* NETEFFECT IWARP RNIC DRIVER (IW_NES) M: Faisal Latif M: Chien Tung -L: general at lists.openfabrics.org +L: linux-rdma at vger.kernel.org W: http://www.neteffect.com S: Supported F: drivers/infiniband/hw/nes/ -- 1.6.4 From bart.vanassche at gmail.com Mon Sep 7 23:25:53 2009 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Tue, 8 Sep 2009 08:25:53 +0200 Subject: [ofa-general] Re: [NEW PATCH] IB/mad: Fix possible lock-lock-timer deadlock In-Reply-To: References: Message-ID: On Tue, Sep 8, 2009 at 6:21 AM, Roland Dreier wrote: > >  > With 2.6.31-rc9 + patch 4e49627b9bc29a14b393c480e8c979e3bc922ef7 + the >  > patch you posted at the start of this thread the following lockdep >  > complaint was triggered on the SRP initiator system during SRP login: >  > >  > ====================================================== >  > [ INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected ] >  > 2.6.31-rc9 #2 >  > ------------------------------------------------------ >  > ibsrpdm/4290 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: >  >  (&(&rmpp_recv->cleanup_work)->timer){+.-...}, at: >  > [] del_timer_sync+0x0/0xa0 >  > >  > and this task is already holding: >  >  (&mad_agent_priv->lock){..-...}, at: [] >  > ib_cancel_rmpp_recvs+0x28/0x118 [ib_mad] >  > which would create a new lock dependency: >  >  (&mad_agent_priv->lock){..-...} -> (&(&rmpp_recv->cleanup_work)->timer){+.-...} > > And this report doesn't happen with the older patch?  (Did you do the > same testing with the older patch that triggered this) > > Because this looks like a *different* incarnation of the same > lock->lock->delayed work/timer that we're trying to fix here -- the > delayed work is now rmpp_recv->cleanup_work in this case instead of > mad_agent_priv->timed_work as it was before. 
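For readers following the thread, the lock->lock->delayed work/timer pattern in the quote above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the ib_mad code and not the patch under discussion: every name below (model_lock, model_work, model_agent, model_cancel_sync, cancel_all_locked, cancel_all_unlocked) is hypothetical, and the "synchronous cancel" is a stand-in that merely counts how often it is called while the agent lock is held. It contrasts cancelling while holding the lock with one common alternative, detaching the entries under the lock and cancelling them after the lock has been dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names come from ib_mad. */
struct model_lock {
    bool held;
};

struct model_work {
    struct model_work *next;
    bool pending;               /* delayed work/timer is still queued */
};

struct model_agent {
    struct model_lock lock;
    struct model_work *rmpp_list;
    int lock_order_violations;  /* sync cancels issued with the lock held */
};

void model_lock_acquire(struct model_lock *l)
{
    assert(!l->held);
    l->held = true;
}

void model_lock_release(struct model_lock *l)
{
    assert(l->held);
    l->held = false;
}

/*
 * Stand-in for a synchronous cancel. A real synchronous cancel waits for
 * a running handler to finish, so issuing it while holding a lock that
 * the handler path depends on is the ordering this model flags.
 */
void model_cancel_sync(struct model_agent *a, struct model_work *w)
{
    if (a->lock.held)
        a->lock_order_violations++;
    w->pending = false;
}

/* Problematic ordering: cancel each entry while still holding the lock. */
int cancel_all_locked(struct model_agent *a)
{
    int n = 0;

    model_lock_acquire(&a->lock);
    for (struct model_work *w = a->rmpp_list; w != NULL; w = w->next) {
        model_cancel_sync(a, w);
        n++;
    }
    a->rmpp_list = NULL;
    model_lock_release(&a->lock);
    return n;
}

/* Alternative ordering: detach under the lock, cancel after unlocking. */
int cancel_all_unlocked(struct model_agent *a)
{
    struct model_work *detached;
    int n = 0;

    model_lock_acquire(&a->lock);
    detached = a->rmpp_list;    /* nobody else can reach the entries now */
    a->rmpp_list = NULL;
    model_lock_release(&a->lock);

    for (struct model_work *w = detached; w != NULL; w = w->next) {
        model_cancel_sync(a, w);
        n++;
    }
    return n;
}
```

With two pending entries on the list, cancel_all_locked() records one violation per entry, while cancel_all_unlocked() cancels the same entries without recording any. Whether this ordering is appropriate for the real code depends on details that are not visible in this thread.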
The above issue does not occur with the for-next branch of the infiniband git tree, but does occur with 2.6.31-rc9 + aforementioned patches. As far as I can see commit 721d67cdca5b7642b380ca0584de8dceecf6102f (http://git.kernel.org/?p=linux/kernel/git/roland/infiniband.git;a=commitdiff;h=721d67cdca5b7642b380ca0584de8dceecf6102f) is not yet included in 2.6.31-rc9. Could this be related to the above issue? Bart. From Lars.Paul.Huse at Sun.COM Tue Sep 8 00:58:35 2009 From: Lars.Paul.Huse at Sun.COM (Lars Paul Huse) Date: Tue, 08 Sep 2009 09:58:35 +0200 Subject: [ofa-general] [PATCH] ibdm/ibnl/SUNDSC*.ibnl Corrected ibnl definition files for Sun IB QDR products Message-ID: <4AA60EAB.1020101@Sun.COM> Updated ibnl definition files for Sun IB QDR products: SUNDCS648QDR: Corrected card numbering & plug row SUNDCS72QDR: Corrected plug row Signed-off-by: Lars Paul Huse --- diff --git a/ibdm/ibnl/SUNDCS648QDR.ibnl b/ibdm/ibnl/SUNDCS648QDR.ibnl index a8b6558..af4ab6b 100644 --- a/ibdm/ibnl/SUNDCS648QDR.ibnl +++ b/ibdm/ibnl/SUNDCS648QDR.ibnl @@ -80,2054 +80,2054 @@ NODE SW 36 MT48436 U1 TOPSYSTEM SUNDCS648QDR,SUN-M9-648 +SUBSYSTEM SPINE fc0A + P1 -10G-> lc0A P13 + P2 -10G-> lc0B P14 + P3 -10G-> lc0C P13 + P4 -10G-> lc0D P14 + P5 -10G-> lc8A P13 + P6 -10G-> lc8C P13 + P7 -10G-> lc8B P14 + P8 -10G-> lc7A P13 + P9 -10G-> lc8D P14 + P10 -10G-> lc7C P13 + P11 -10G-> lc7B P140 + P12 -10G-> lc6A P13 + P13 -10G-> lc5B P14 + P14 -10G-> lc5A P13 + P15 -10G-> lc6D P14 + P16 -10G-> lc6C P13 + P17 -10G-> lc6B P14 + P18 -10G-> lc7D P14 + P19 -10G-> lc1D P14 + P20 -10G-> lc1C P13 + P21 -10G-> lc1B P14 + P22 -10G-> lc1A P13 + P23 -10G-> lc2D P14 + P24 -10G-> lc2B P14 + P25 -10G-> lc2C P13 + P26 -10G-> lc3D P14 + P27 -10G-> lc2A P13 + P28 -10G-> lc3B P14 + P29 -10G-> lc3C P13 + P30 -10G-> lc4D P14 + P31 -10G-> lc5C P13 + P32 -10G-> lc5D P14 + P33 -10G-> lc4A P13 + P34 -10G-> lc4B P14 + P35 -10G-> lc4C P13 + P36 -10G-> lc3A P13 + +SUBSYSTEM SPINE fc0B + P1 -10G-> lc7D P13 + P2 -10G-> lc7A P14 
+ P3 -10G-> lc7B P13 + P4 -10G-> lc7C P14 + P5 -10G-> lc6D P13 + P6 -10G-> lc6B P13 + P7 -10G-> lc6A P14 + P8 -10G-> lc5D P13 + P9 -10G-> lc6C P14 + P10 -10G-> lc5B P13 + P11 -10G-> lc5A P14 + P12 -10G-> lc4D P13 + P13 -10G-> lc3A P14 + P14 -10G-> lc3D P13 + P15 -10G-> lc4C P14 + P16 -10G-> lc4B P13 + P17 -10G-> lc4A P14 + P18 -10G-> lc5C P14 + P19 -10G-> lc8C P14 + P20 -10G-> lc8B P13 + P21 -10G-> lc8A P14 + P22 -10G-> lc8D P13 + P23 -10G-> lc0C P14 + P24 -10G-> lc0A P14 + P25 -10G-> lc0B P13 + P26 -10G-> lc1C P14 + P27 -10G-> lc0D P13 + P28 -10G-> lc1A P14 + P29 -10G-> lc1B P13 + P30 -10G-> lc2C P14 + P31 -10G-> lc3B P13 + P32 -10G-> lc3C P14 + P33 -10G-> lc2D P13 + P34 -10G-> lc2A P14 + P35 -10G-> lc2B P13 + P36 -10G-> lc1D P13 + SUBSYSTEM SPINE fc1A - P1 -10G-> lc1A P13 - P2 -10G-> lc1B P14 - P3 -10G-> lc1C P13 - P4 -10G-> lc1D P14 - P5 -10G-> lc9A P13 - P6 -10G-> lc9C P13 - P7 -10G-> lc9B P14 - P8 -10G-> lc8A P13 - P9 -10G-> lc9D P14 - P10 -10G-> lc8C P13 - P11 -10G-> lc8B P140 - P12 -10G-> lc7A P13 - P13 -10G-> lc6B P14 - P14 -10G-> lc6A P13 - P15 -10G-> lc7D P14 - P16 -10G-> lc7C P13 - P17 -10G-> lc7B P14 - P18 -10G-> lc8D P14 - P19 -10G-> lc2D P14 - P20 -10G-> lc2C P13 - P21 -10G-> lc2B P14 - P22 -10G-> lc2A P13 - P23 -10G-> lc3D P14 - P24 -10G-> lc3B P14 - P25 -10G-> lc3C P13 - P26 -10G-> lc4D P14 - P27 -10G-> lc3A P13 - P28 -10G-> lc4B P14 - P29 -10G-> lc4C P13 - P30 -10G-> lc5D P14 - P31 -10G-> lc6C P13 - P32 -10G-> lc6D P14 - P33 -10G-> lc5A P13 - P34 -10G-> lc5B P14 - P35 -10G-> lc5C P13 - P36 -10G-> lc4A P13 + P1 -10G-> lc0A P15 + P2 -10G-> lc0B P16 + P3 -10G-> lc0C P15 + P4 -10G-> lc0D P16 + P5 -10G-> lc8A P15 + P6 -10G-> lc8C P15 + P7 -10G-> lc8B P16 + P8 -10G-> lc7A P15 + P9 -10G-> lc8D P16 + P10 -10G-> lc7C P15 + P11 -10G-> lc7B P16 + P12 -10G-> lc6A P15 + P13 -10G-> lc5B P16 + P14 -10G-> lc5A P15 + P15 -10G-> lc6D P16 + P16 -10G-> lc6C P15 + P17 -10G-> lc6B P16 + P18 -10G-> lc7D P16 + P19 -10G-> lc1D P16 + P20 -10G-> lc1C P15 + P21 -10G-> lc1B 
P16 + P22 -10G-> lc1A P15 + P23 -10G-> lc2D P16 + P24 -10G-> lc2B P16 + P25 -10G-> lc2C P15 + P26 -10G-> lc3D P16 + P27 -10G-> lc2A P15 + P28 -10G-> lc3B P16 + P29 -10G-> lc3C P15 + P30 -10G-> lc4D P16 + P31 -10G-> lc5C P15 + P32 -10G-> lc5D P16 + P33 -10G-> lc4A P15 + P34 -10G-> lc4B P16 + P35 -10G-> lc4C P15 + P36 -10G-> lc3A P15 SUBSYSTEM SPINE fc1B - P1 -10G-> lc8D P13 - P2 -10G-> lc8A P14 - P3 -10G-> lc8B P13 - P4 -10G-> lc8C P14 - P5 -10G-> lc7D P13 - P6 -10G-> lc7B P13 - P7 -10G-> lc7A P14 - P8 -10G-> lc6D P13 - P9 -10G-> lc7C P14 - P10 -10G-> lc6B P13 - P11 -10G-> lc6A P14 - P12 -10G-> lc5D P13 - P13 -10G-> lc4A P14 - P14 -10G-> lc4D P13 - P15 -10G-> lc5C P14 - P16 -10G-> lc5B P13 - P17 -10G-> lc5A P14 - P18 -10G-> lc6C P14 - P19 -10G-> lc9C P14 - P20 -10G-> lc9B P13 - P21 -10G-> lc9A P14 - P22 -10G-> lc9D P13 - P23 -10G-> lc1C P14 - P24 -10G-> lc1A P14 - P25 -10G-> lc1B P13 - P26 -10G-> lc2C P14 - P27 -10G-> lc1D P13 - P28 -10G-> lc2A P14 - P29 -10G-> lc2B P13 - P30 -10G-> lc3C P14 - P31 -10G-> lc4B P13 - P32 -10G-> lc4C P14 - P33 -10G-> lc3D P13 - P34 -10G-> lc3A P14 - P35 -10G-> lc3B P13 - P36 -10G-> lc2D P13 + P1 -10G-> lc7D P15 + P2 -10G-> lc7A P16 + P3 -10G-> lc7B P15 + P4 -10G-> lc7C P16 + P5 -10G-> lc6D P15 + P6 -10G-> lc6B P15 + P7 -10G-> lc6A P16 + P8 -10G-> lc5D P15 + P9 -10G-> lc6C P16 + P10 -10G-> lc5B P15 + P11 -10G-> lc5A P16 + P12 -10G-> lc4D P15 + P13 -10G-> lc3A P16 + P14 -10G-> lc3D P15 + P15 -10G-> lc4C P16 + P16 -10G-> lc4B P15 + P17 -10G-> lc4A P16 + P18 -10G-> lc5C P16 + P19 -10G-> lc8C P16 + P20 -10G-> lc8B P15 + P21 -10G-> lc8A P16 + P22 -10G-> lc8D P15 + P23 -10G-> lc0C P16 + P24 -10G-> lc0A P16 + P25 -10G-> lc0B P15 + P26 -10G-> lc1C P16 + P27 -10G-> lc0D P15 + P28 -10G-> lc1A P16 + P29 -10G-> lc1B P15 + P30 -10G-> lc2C P16 + P31 -10G-> lc3B P15 + P32 -10G-> lc3C P16 + P33 -10G-> lc2D P15 + P34 -10G-> lc2A P16 + P35 -10G-> lc2B P15 + P36 -10G-> lc1D P15 SUBSYSTEM SPINE fc2A - P1 -10G-> lc1A P15 - P2 -10G-> lc1B P16 - P3 -10G-> 
lc1C P15 - P4 -10G-> lc1D P16 - P5 -10G-> lc9A P15 - P6 -10G-> lc9C P15 - P7 -10G-> lc9B P16 - P8 -10G-> lc8A P15 - P9 -10G-> lc9D P16 - P10 -10G-> lc8C P15 - P11 -10G-> lc8B P16 - P12 -10G-> lc7A P15 - P13 -10G-> lc6B P16 - P14 -10G-> lc6A P15 - P15 -10G-> lc7D P16 - P16 -10G-> lc7C P15 - P17 -10G-> lc7B P16 - P18 -10G-> lc8D P16 - P19 -10G-> lc2D P16 - P20 -10G-> lc2C P15 - P21 -10G-> lc2B P16 - P22 -10G-> lc2A P15 - P23 -10G-> lc3D P16 - P24 -10G-> lc3B P16 - P25 -10G-> lc3C P15 - P26 -10G-> lc4D P16 - P27 -10G-> lc3A P15 - P28 -10G-> lc4B P16 - P29 -10G-> lc4C P15 - P30 -10G-> lc5D P16 - P31 -10G-> lc6C P15 - P32 -10G-> lc6D P16 - P33 -10G-> lc5A P15 - P34 -10G-> lc5B P16 - P35 -10G-> lc5C P15 - P36 -10G-> lc4A P15 + P1 -10G-> lc0A P17 + P2 -10G-> lc0B P18 + P3 -10G-> lc0C P17 + P4 -10G-> lc0D P18 + P5 -10G-> lc8A P17 + P6 -10G-> lc8C P17 + P7 -10G-> lc8B P18 + P8 -10G-> lc7A P17 + P9 -10G-> lc8D P18 + P10 -10G-> lc7C P17 + P11 -10G-> lc7B P18 + P12 -10G-> lc6A P17 + P13 -10G-> lc5B P18 + P14 -10G-> lc5A P17 + P15 -10G-> lc6D P18 + P16 -10G-> lc6C P17 + P17 -10G-> lc6B P18 + P18 -10G-> lc7D P18 + P19 -10G-> lc1D P18 + P20 -10G-> lc1C P17 + P21 -10G-> lc1B P18 + P22 -10G-> lc1A P17 + P23 -10G-> lc2D P18 + P24 -10G-> lc2B P18 + P25 -10G-> lc2C P17 + P26 -10G-> lc3D P18 + P27 -10G-> lc2A P17 + P28 -10G-> lc3B P18 + P29 -10G-> lc3C P17 + P30 -10G-> lc4D P18 + P31 -10G-> lc5C P17 + P32 -10G-> lc5D P18 + P33 -10G-> lc4A P17 + P34 -10G-> lc4B P18 + P35 -10G-> lc4C P17 + P36 -10G-> lc3A P17 SUBSYSTEM SPINE fc2B - P1 -10G-> lc8D P15 - P2 -10G-> lc8A P16 - P3 -10G-> lc8B P15 - P4 -10G-> lc8C P16 - P5 -10G-> lc7D P15 - P6 -10G-> lc7B P15 - P7 -10G-> lc7A P16 - P8 -10G-> lc6D P15 - P9 -10G-> lc7C P16 - P10 -10G-> lc6B P15 - P11 -10G-> lc6A P16 - P12 -10G-> lc5D P15 - P13 -10G-> lc4A P16 - P14 -10G-> lc4D P15 - P15 -10G-> lc5C P16 - P16 -10G-> lc5B P15 - P17 -10G-> lc5A P16 - P18 -10G-> lc6C P16 - P19 -10G-> lc9C P16 - P20 -10G-> lc9B P15 - P21 -10G-> lc9A P16 - P22 -10G-> 
lc9D P15 - P23 -10G-> lc1C P16 - P24 -10G-> lc1A P16 - P25 -10G-> lc1B P15 - P26 -10G-> lc2C P16 - P27 -10G-> lc1D P15 - P28 -10G-> lc2A P16 - P29 -10G-> lc2B P15 - P30 -10G-> lc3C P16 - P31 -10G-> lc4B P15 - P32 -10G-> lc4C P16 - P33 -10G-> lc3D P15 - P34 -10G-> lc3A P16 - P35 -10G-> lc3B P15 - P36 -10G-> lc2D P15 + P1 -10G-> lc7D P17 + P2 -10G-> lc7A P18 + P3 -10G-> lc7B P17 + P4 -10G-> lc7C P18 + P5 -10G-> lc6D P17 + P6 -10G-> lc6B P17 + P7 -10G-> lc6A P18 + P8 -10G-> lc5D P17 + P9 -10G-> lc6C P18 + P10 -10G-> lc5B P17 + P11 -10G-> lc5A P18 + P12 -10G-> lc4D P17 + P13 -10G-> lc3A P18 + P14 -10G-> lc3D P17 + P15 -10G-> lc4C P18 + P16 -10G-> lc4B P17 + P17 -10G-> lc4A P18 + P18 -10G-> lc5C P18 + P19 -10G-> lc8C P18 + P20 -10G-> lc8B P17 + P21 -10G-> lc8A P18 + P22 -10G-> lc8D P17 + P23 -10G-> lc0C P18 + P24 -10G-> lc0A P18 + P25 -10G-> lc0B P17 + P26 -10G-> lc1C P18 + P27 -10G-> lc0D P17 + P28 -10G-> lc1A P18 + P29 -10G-> lc1B P17 + P30 -10G-> lc2C P18 + P31 -10G-> lc3B P17 + P32 -10G-> lc3C P18 + P33 -10G-> lc2D P17 + P34 -10G-> lc2A P18 + P35 -10G-> lc2B P17 + P36 -10G-> lc1D P17 SUBSYSTEM SPINE fc3A - P1 -10G-> lc1A P17 - P2 -10G-> lc1B P18 - P3 -10G-> lc1C P17 - P4 -10G-> lc1D P18 - P5 -10G-> lc9A P17 - P6 -10G-> lc9C P17 - P7 -10G-> lc9B P18 - P8 -10G-> lc8A P17 - P9 -10G-> lc9D P18 - P10 -10G-> lc8C P17 - P11 -10G-> lc8B P18 - P12 -10G-> lc7A P17 - P13 -10G-> lc6B P18 - P14 -10G-> lc6A P17 - P15 -10G-> lc7D P18 - P16 -10G-> lc7C P17 - P17 -10G-> lc7B P18 - P18 -10G-> lc8D P18 - P19 -10G-> lc2D P18 - P20 -10G-> lc2C P17 - P21 -10G-> lc2B P18 - P22 -10G-> lc2A P17 - P23 -10G-> lc3D P18 - P24 -10G-> lc3B P18 - P25 -10G-> lc3C P17 - P26 -10G-> lc4D P18 - P27 -10G-> lc3A P17 - P28 -10G-> lc4B P18 - P29 -10G-> lc4C P17 - P30 -10G-> lc5D P18 - P31 -10G-> lc6C P17 - P32 -10G-> lc6D P18 - P33 -10G-> lc5A P17 - P34 -10G-> lc5B P18 - P35 -10G-> lc5C P17 - P36 -10G-> lc4A P17 + P1 -10G-> lc0A P12 + P2 -10G-> lc0B P11 + P3 -10G-> lc0C P12 + P4 -10G-> lc0D P11 + P5 -10G-> 
lc8A P12 + P6 -10G-> lc8C P12 + P7 -10G-> lc8B P11 + P8 -10G-> lc7A P12 + P9 -10G-> lc8D P11 + P10 -10G-> lc7C P12 + P11 -10G-> lc7B P11 + P12 -10G-> lc6A P12 + P13 -10G-> lc5B P11 + P14 -10G-> lc5A P12 + P15 -10G-> lc6D P11 + P16 -10G-> lc6C P12 + P17 -10G-> lc6B P11 + P18 -10G-> lc7D P11 + P19 -10G-> lc1D P11 + P20 -10G-> lc1C P12 + P21 -10G-> lc1B P11 + P22 -10G-> lc1A P12 + P23 -10G-> lc2D P11 + P24 -10G-> lc2B P11 + P25 -10G-> lc2C P12 + P26 -10G-> lc3D P11 + P27 -10G-> lc2A P12 + P28 -10G-> lc3B P11 + P29 -10G-> lc3C P12 + P30 -10G-> lc4D P11 + P31 -10G-> lc5C P12 + P32 -10G-> lc5D P11 + P33 -10G-> lc4A P12 + P34 -10G-> lc4B P11 + P35 -10G-> lc4C P12 + P36 -10G-> lc3A P12 SUBSYSTEM SPINE fc3B - P1 -10G-> lc8D P17 - P2 -10G-> lc8A P18 - P3 -10G-> lc8B P17 - P4 -10G-> lc8C P18 - P5 -10G-> lc7D P17 - P6 -10G-> lc7B P17 - P7 -10G-> lc7A P18 - P8 -10G-> lc6D P17 - P9 -10G-> lc7C P18 - P10 -10G-> lc6B P17 - P11 -10G-> lc6A P18 - P12 -10G-> lc5D P17 - P13 -10G-> lc4A P18 - P14 -10G-> lc4D P17 - P15 -10G-> lc5C P18 - P16 -10G-> lc5B P17 - P17 -10G-> lc5A P18 - P18 -10G-> lc6C P18 - P19 -10G-> lc9C P18 - P20 -10G-> lc9B P17 - P21 -10G-> lc9A P18 - P22 -10G-> lc9D P17 - P23 -10G-> lc1C P18 - P24 -10G-> lc1A P18 - P25 -10G-> lc1B P17 - P26 -10G-> lc2C P18 - P27 -10G-> lc1D P17 - P28 -10G-> lc2A P18 - P29 -10G-> lc2B P17 - P30 -10G-> lc3C P18 - P31 -10G-> lc4B P17 - P32 -10G-> lc4C P18 - P33 -10G-> lc3D P17 - P34 -10G-> lc3A P18 - P35 -10G-> lc3B P17 - P36 -10G-> lc2D P17 + P1 -10G-> lc7D P12 + P2 -10G-> lc7A P11 + P3 -10G-> lc7B P12 + P4 -10G-> lc7C P11 + P5 -10G-> lc6D P12 + P6 -10G-> lc6B P12 + P7 -10G-> lc6A P11 + P8 -10G-> lc5D P12 + P9 -10G-> lc6C P11 + P10 -10G-> lc5B P12 + P11 -10G-> lc5A P11 + P12 -10G-> lc4D P12 + P13 -10G-> lc3A P11 + P14 -10G-> lc3D P12 + P15 -10G-> lc4C P11 + P16 -10G-> lc4B P12 + P17 -10G-> lc4A P11 + P18 -10G-> lc5C P11 + P19 -10G-> lc8C P11 + P20 -10G-> lc8B P12 + P21 -10G-> lc8A P11 + P22 -10G-> lc8D P12 + P23 -10G-> lc0C P11 + P24 
-10G-> lc0A P11 + P25 -10G-> lc0B P12 + P26 -10G-> lc1C P11 + P27 -10G-> lc0D P12 + P28 -10G-> lc1A P11 + P29 -10G-> lc1B P12 + P30 -10G-> lc2C P11 + P31 -10G-> lc3B P12 + P32 -10G-> lc3C P11 + P33 -10G-> lc2D P12 + P34 -10G-> lc2A P11 + P35 -10G-> lc2B P12 + P36 -10G-> lc1D P12 SUBSYSTEM SPINE fc4A - P1 -10G-> lc1A P12 - P2 -10G-> lc1B P11 - P3 -10G-> lc1C P12 - P4 -10G-> lc1D P11 - P5 -10G-> lc9A P12 - P6 -10G-> lc9C P12 - P7 -10G-> lc9B P11 - P8 -10G-> lc8A P12 - P9 -10G-> lc9D P11 - P10 -10G-> lc8C P12 - P11 -10G-> lc8B P11 - P12 -10G-> lc7A P12 - P13 -10G-> lc6B P11 - P14 -10G-> lc6A P12 - P15 -10G-> lc7D P11 - P16 -10G-> lc7C P12 - P17 -10G-> lc7B P11 - P18 -10G-> lc8D P11 - P19 -10G-> lc2D P11 - P20 -10G-> lc2C P12 - P21 -10G-> lc2B P11 - P22 -10G-> lc2A P12 - P23 -10G-> lc3D P11 - P24 -10G-> lc3B P11 - P25 -10G-> lc3C P12 - P26 -10G-> lc4D P11 - P27 -10G-> lc3A P12 - P28 -10G-> lc4B P11 - P29 -10G-> lc4C P12 - P30 -10G-> lc5D P11 - P31 -10G-> lc6C P12 - P32 -10G-> lc6D P11 - P33 -10G-> lc5A P12 - P34 -10G-> lc5B P11 - P35 -10G-> lc5C P12 - P36 -10G-> lc4A P12 + P1 -10G-> lc0A P10 + P2 -10G-> lc0B P9 + P3 -10G-> lc0C P10 + P4 -10G-> lc0D P9 + P5 -10G-> lc8A P10 + P6 -10G-> lc8C P10 + P7 -10G-> lc8B P9 + P8 -10G-> lc7A P10 + P9 -10G-> lc8D P9 + P10 -10G-> lc7C P10 + P11 -10G-> lc7B P9 + P12 -10G-> lc6A P10 + P13 -10G-> lc5B P9 + P14 -10G-> lc5A P10 + P15 -10G-> lc6D P9 + P16 -10G-> lc6C P10 + P17 -10G-> lc6B P9 + P18 -10G-> lc7D P9 + P19 -10G-> lc1D P9 + P20 -10G-> lc1C P10 + P21 -10G-> lc1B P9 + P22 -10G-> lc1A P10 + P23 -10G-> lc2D P9 + P24 -10G-> lc2B P9 + P25 -10G-> lc2C P10 + P26 -10G-> lc3D P9 + P27 -10G-> lc2A P10 + P28 -10G-> lc3B P9 + P29 -10G-> lc3C P10 + P30 -10G-> lc4D P9 + P31 -10G-> lc5C P10 + P32 -10G-> lc5D P9 + P33 -10G-> lc4A P10 + P34 -10G-> lc4B P9 + P35 -10G-> lc4C P10 + P36 -10G-> lc3A P10 SUBSYSTEM SPINE fc4B - P1 -10G-> lc8D P12 - P2 -10G-> lc8A P11 - P3 -10G-> lc8B P12 - P4 -10G-> lc8C P11 - P5 -10G-> lc7D P12 - P6 -10G-> lc7B P12 - 
P7 -10G-> lc7A P11 - P8 -10G-> lc6D P12 - P9 -10G-> lc7C P11 - P10 -10G-> lc6B P12 - P11 -10G-> lc6A P11 - P12 -10G-> lc5D P12 - P13 -10G-> lc4A P11 - P14 -10G-> lc4D P12 - P15 -10G-> lc5C P11 - P16 -10G-> lc5B P12 - P17 -10G-> lc5A P11 - P18 -10G-> lc6C P11 - P19 -10G-> lc9C P11 - P20 -10G-> lc9B P12 - P21 -10G-> lc9A P11 - P22 -10G-> lc9D P12 - P23 -10G-> lc1C P11 - P24 -10G-> lc1A P11 - P25 -10G-> lc1B P12 - P26 -10G-> lc2C P11 - P27 -10G-> lc1D P12 - P28 -10G-> lc2A P11 - P29 -10G-> lc2B P12 - P30 -10G-> lc3C P11 - P31 -10G-> lc4B P12 - P32 -10G-> lc4C P11 - P33 -10G-> lc3D P12 - P34 -10G-> lc3A P11 - P35 -10G-> lc3B P12 - P36 -10G-> lc2D P12 + P1 -10G-> lc7D P10 + P2 -10G-> lc7A P9 + P3 -10G-> lc7B P10 + P4 -10G-> lc7C P9 + P5 -10G-> lc6D P10 + P6 -10G-> lc6B P10 + P7 -10G-> lc6A P9 + P8 -10G-> lc5D P10 + P9 -10G-> lc6C P9 + P10 -10G-> lc5B P10 + P11 -10G-> lc5A P9 + P12 -10G-> lc4D P10 + P13 -10G-> lc3A P9 + P14 -10G-> lc3D P10 + P15 -10G-> lc4C P9 + P16 -10G-> lc4B P10 + P17 -10G-> lc4A P9 + P18 -10G-> lc5C P9 + P19 -10G-> lc8C P9 + P20 -10G-> lc8B P10 + P21 -10G-> lc8A P9 + P22 -10G-> lc8D P10 + P23 -10G-> lc0C P9 + P24 -10G-> lc0A P9 + P25 -10G-> lc0B P10 + P26 -10G-> lc1C P9 + P27 -10G-> lc0D P10 + P28 -10G-> lc1A P9 + P29 -10G-> lc1B P10 + P30 -10G-> lc2C P9 + P31 -10G-> lc3B P10 + P32 -10G-> lc3C P9 + P33 -10G-> lc2D P10 + P34 -10G-> lc2A P9 + P35 -10G-> lc2B P10 + P36 -10G-> lc1D P10 SUBSYSTEM SPINE fc5A - P1 -10G-> lc1A P10 - P2 -10G-> lc1B P9 - P3 -10G-> lc1C P10 - P4 -10G-> lc1D P9 - P5 -10G-> lc9A P10 - P6 -10G-> lc9C P10 - P7 -10G-> lc9B P9 - P8 -10G-> lc8A P10 - P9 -10G-> lc9D P9 - P10 -10G-> lc8C P10 - P11 -10G-> lc8B P9 - P12 -10G-> lc7A P10 - P13 -10G-> lc6B P9 - P14 -10G-> lc6A P10 - P15 -10G-> lc7D P9 - P16 -10G-> lc7C P10 - P17 -10G-> lc7B P9 - P18 -10G-> lc8D P9 - P19 -10G-> lc2D P9 - P20 -10G-> lc2C P10 - P21 -10G-> lc2B P9 - P22 -10G-> lc2A P10 - P23 -10G-> lc3D P9 - P24 -10G-> lc3B P9 - P25 -10G-> lc3C P10 - P26 -10G-> lc4D P9 - P27 
-10G-> lc3A P10 - P28 -10G-> lc4B P9 - P29 -10G-> lc4C P10 - P30 -10G-> lc5D P9 - P31 -10G-> lc6C P10 - P32 -10G-> lc6D P9 - P33 -10G-> lc5A P10 - P34 -10G-> lc5B P9 - P35 -10G-> lc5C P10 - P36 -10G-> lc4A P10 + P1 -10G-> lc0A P8 + P2 -10G-> lc0B P7 + P3 -10G-> lc0C P8 + P4 -10G-> lc0D P7 + P5 -10G-> lc8A P8 + P6 -10G-> lc8C P8 + P7 -10G-> lc8B P7 + P8 -10G-> lc7A P8 + P9 -10G-> lc8D P7 + P10 -10G-> lc7C P8 + P11 -10G-> lc7B P7 + P12 -10G-> lc6A P8 + P13 -10G-> lc5B P7 + P14 -10G-> lc5A P8 + P15 -10G-> lc6D P7 + P16 -10G-> lc6C P8 + P17 -10G-> lc6B P7 + P18 -10G-> lc7D P7 + P19 -10G-> lc1D P7 + P20 -10G-> lc1C P8 + P21 -10G-> lc1B P7 + P22 -10G-> lc1A P8 + P23 -10G-> lc2D P7 + P24 -10G-> lc2B P7 + P25 -10G-> lc2C P8 + P26 -10G-> lc3D P7 + P27 -10G-> lc2A P8 + P28 -10G-> lc3B P7 + P29 -10G-> lc3C P8 + P30 -10G-> lc4D P7 + P31 -10G-> lc5C P8 + P32 -10G-> lc5D P7 + P33 -10G-> lc4A P8 + P34 -10G-> lc4B P7 + P35 -10G-> lc4C P8 + P36 -10G-> lc3A P8 SUBSYSTEM SPINE fc5B - P1 -10G-> lc8D P10 - P2 -10G-> lc8A P9 - P3 -10G-> lc8B P10 - P4 -10G-> lc8C P9 - P5 -10G-> lc7D P10 - P6 -10G-> lc7B P10 - P7 -10G-> lc7A P9 - P8 -10G-> lc6D P10 - P9 -10G-> lc7C P9 - P10 -10G-> lc6B P10 - P11 -10G-> lc6A P9 - P12 -10G-> lc5D P10 - P13 -10G-> lc4A P9 - P14 -10G-> lc4D P10 - P15 -10G-> lc5C P9 - P16 -10G-> lc5B P10 - P17 -10G-> lc5A P9 - P18 -10G-> lc6C P9 - P19 -10G-> lc9C P9 - P20 -10G-> lc9B P10 - P21 -10G-> lc9A P9 - P22 -10G-> lc9D P10 - P23 -10G-> lc1C P9 - P24 -10G-> lc1A P9 - P25 -10G-> lc1B P10 - P26 -10G-> lc2C P9 - P27 -10G-> lc1D P10 - P28 -10G-> lc2A P9 - P29 -10G-> lc2B P10 - P30 -10G-> lc3C P9 - P31 -10G-> lc4B P10 - P32 -10G-> lc4C P9 - P33 -10G-> lc3D P10 - P34 -10G-> lc3A P9 - P35 -10G-> lc3B P10 - P36 -10G-> lc2D P10 + P1 -10G-> lc7D P8 + P2 -10G-> lc7A P7 + P3 -10G-> lc7B P8 + P4 -10G-> lc7C P7 + P5 -10G-> lc6D P8 + P6 -10G-> lc6B P8 + P7 -10G-> lc6A P7 + P8 -10G-> lc5D P8 + P9 -10G-> lc6C P7 + P10 -10G-> lc5B P8 + P11 -10G-> lc5A P7 + P12 -10G-> lc4D P8 + P13 -10G-> 
lc3A P7 + P14 -10G-> lc3D P8 + P15 -10G-> lc4C P7 + P16 -10G-> lc4B P8 + P17 -10G-> lc4A P7 + P18 -10G-> lc5C P7 + P19 -10G-> lc8C P7 + P20 -10G-> lc8B P8 + P21 -10G-> lc8A P7 + P22 -10G-> lc8D P8 + P23 -10G-> lc0C P7 + P24 -10G-> lc0A P7 + P25 -10G-> lc0B P8 + P26 -10G-> lc1C P7 + P27 -10G-> lc0D P8 + P28 -10G-> lc1A P7 + P29 -10G-> lc1B P8 + P30 -10G-> lc2C P7 + P31 -10G-> lc3B P8 + P32 -10G-> lc3C P7 + P33 -10G-> lc2D P8 + P34 -10G-> lc2A P7 + P35 -10G-> lc2B P8 + P36 -10G-> lc1D P8 SUBSYSTEM SPINE fc6A - P1 -10G-> lc1A P8 - P2 -10G-> lc1B P7 - P3 -10G-> lc1C P8 - P4 -10G-> lc1D P7 - P5 -10G-> lc9A P8 - P6 -10G-> lc9C P8 - P7 -10G-> lc9B P7 - P8 -10G-> lc8A P8 - P9 -10G-> lc9D P7 - P10 -10G-> lc8C P8 - P11 -10G-> lc8B P7 - P12 -10G-> lc7A P8 - P13 -10G-> lc6B P7 - P14 -10G-> lc6A P8 - P15 -10G-> lc7D P7 - P16 -10G-> lc7C P8 - P17 -10G-> lc7B P7 - P18 -10G-> lc8D P7 - P19 -10G-> lc2D P7 - P20 -10G-> lc2C P8 - P21 -10G-> lc2B P7 - P22 -10G-> lc2A P8 - P23 -10G-> lc3D P7 - P24 -10G-> lc3B P7 - P25 -10G-> lc3C P8 - P26 -10G-> lc4D P7 - P27 -10G-> lc3A P8 - P28 -10G-> lc4B P7 - P29 -10G-> lc4C P8 - P30 -10G-> lc5D P7 - P31 -10G-> lc6C P8 - P32 -10G-> lc6D P7 - P33 -10G-> lc5A P8 - P34 -10G-> lc5B P7 - P35 -10G-> lc5C P8 - P36 -10G-> lc4A P8 + P1 -10G-> lc0A P6 + P2 -10G-> lc0B P5 + P3 -10G-> lc0C P6 + P4 -10G-> lc0D P5 + P5 -10G-> lc8A P6 + P6 -10G-> lc8C P6 + P7 -10G-> lc8B P5 + P8 -10G-> lc7A P6 + P9 -10G-> lc8D P5 + P10 -10G-> lc7C P6 + P11 -10G-> lc7B P5 + P12 -10G-> lc6A P6 + P13 -10G-> lc5B P5 + P14 -10G-> lc5A P6 + P15 -10G-> lc6D P5 + P16 -10G-> lc6C P6 + P17 -10G-> lc6B P5 + P18 -10G-> lc7D P5 + P19 -10G-> lc1D P5 + P20 -10G-> lc1C P6 + P21 -10G-> lc1B P5 + P22 -10G-> lc1A P6 + P23 -10G-> lc2D P5 + P24 -10G-> lc2B P5 + P25 -10G-> lc2C P6 + P26 -10G-> lc3D P5 + P27 -10G-> lc2A P6 + P28 -10G-> lc3B P5 + P29 -10G-> lc3C P6 + P30 -10G-> lc4D P5 + P31 -10G-> lc5C P6 + P32 -10G-> lc5D P5 + P33 -10G-> lc4A P6 + P34 -10G-> lc4B P5 + P35 -10G-> lc4C P6 + P36 -10G-> 
lc3A P6 SUBSYSTEM SPINE fc6B - P1 -10G-> lc8D P8 - P2 -10G-> lc8A P7 - P3 -10G-> lc8B P8 - P4 -10G-> lc8C P7 - P5 -10G-> lc7D P8 - P6 -10G-> lc7B P8 - P7 -10G-> lc7A P7 - P8 -10G-> lc6D P8 - P9 -10G-> lc7C P7 - P10 -10G-> lc6B P8 - P11 -10G-> lc6A P7 - P12 -10G-> lc5D P8 - P13 -10G-> lc4A P7 - P14 -10G-> lc4D P8 - P15 -10G-> lc5C P7 - P16 -10G-> lc5B P8 - P17 -10G-> lc5A P7 - P18 -10G-> lc6C P7 - P19 -10G-> lc9C P7 - P20 -10G-> lc9B P8 - P21 -10G-> lc9A P7 - P22 -10G-> lc9D P8 - P23 -10G-> lc1C P7 - P24 -10G-> lc1A P7 - P25 -10G-> lc1B P8 - P26 -10G-> lc2C P7 - P27 -10G-> lc1D P8 - P28 -10G-> lc2A P7 - P29 -10G-> lc2B P8 - P30 -10G-> lc3C P7 - P31 -10G-> lc4B P8 - P32 -10G-> lc4C P7 - P33 -10G-> lc3D P8 - P34 -10G-> lc3A P7 - P35 -10G-> lc3B P8 - P36 -10G-> lc2D P8 + P1 -10G-> lc7D P6 + P2 -10G-> lc7A P5 + P3 -10G-> lc7B P6 + P4 -10G-> lc7C P5 + P5 -10G-> lc6D P6 + P6 -10G-> lc6B P6 + P7 -10G-> lc6A P5 + P8 -10G-> lc5D P6 + P9 -10G-> lc6C P5 + P10 -10G-> lc5B P6 + P11 -10G-> lc5A P5 + P12 -10G-> lc4D P6 + P13 -10G-> lc3A P5 + P14 -10G-> lc3D P6 + P15 -10G-> lc4C P5 + P16 -10G-> lc4B P6 + P17 -10G-> lc4A P5 + P18 -10G-> lc5C P5 + P19 -10G-> lc8C P5 + P20 -10G-> lc8B P6 + P21 -10G-> lc8A P5 + P22 -10G-> lc8D P6 + P23 -10G-> lc0C P5 + P24 -10G-> lc0A P5 + P25 -10G-> lc0B P6 + P26 -10G-> lc1C P5 + P27 -10G-> lc0D P6 + P28 -10G-> lc1A P5 + P29 -10G-> lc1B P6 + P30 -10G-> lc2C P5 + P31 -10G-> lc3B P6 + P32 -10G-> lc3C P5 + P33 -10G-> lc2D P6 + P34 -10G-> lc2A P5 + P35 -10G-> lc2B P6 + P36 -10G-> lc1D P6 SUBSYSTEM SPINE fc7A - P1 -10G-> lc1A P6 - P2 -10G-> lc1B P5 - P3 -10G-> lc1C P6 - P4 -10G-> lc1D P5 - P5 -10G-> lc9A P6 - P6 -10G-> lc9C P6 - P7 -10G-> lc9B P5 - P8 -10G-> lc8A P6 - P9 -10G-> lc9D P5 - P10 -10G-> lc8C P6 - P11 -10G-> lc8B P5 - P12 -10G-> lc7A P6 - P13 -10G-> lc6B P5 - P14 -10G-> lc6A P6 - P15 -10G-> lc7D P5 - P16 -10G-> lc7C P6 - P17 -10G-> lc7B P5 - P18 -10G-> lc8D P5 - P19 -10G-> lc2D P5 - P20 -10G-> lc2C P6 - P21 -10G-> lc2B P5 - P22 -10G-> lc2A P6 - 
P23 -10G-> lc3D P5 - P24 -10G-> lc3B P5 - P25 -10G-> lc3C P6 - P26 -10G-> lc4D P5 - P27 -10G-> lc3A P6 - P28 -10G-> lc4B P5 - P29 -10G-> lc4C P6 - P30 -10G-> lc5D P5 - P31 -10G-> lc6C P6 - P32 -10G-> lc6D P5 - P33 -10G-> lc5A P6 - P34 -10G-> lc5B P5 - P35 -10G-> lc5C P6 - P36 -10G-> lc4A P6 + P1 -10G-> lc0A P4 + P2 -10G-> lc0B P3 + P3 -10G-> lc0C P4 + P4 -10G-> lc0D P3 + P5 -10G-> lc8A P4 + P6 -10G-> lc8C P4 + P7 -10G-> lc8B P3 + P8 -10G-> lc7A P4 + P9 -10G-> lc8D P3 + P10 -10G-> lc7C P4 + P11 -10G-> lc7B P3 + P12 -10G-> lc6A P4 + P13 -10G-> lc5B P3 + P14 -10G-> lc5A P4 + P15 -10G-> lc6D P3 + P16 -10G-> lc6C P4 + P17 -10G-> lc6B P3 + P18 -10G-> lc7D P3 + P19 -10G-> lc1D P3 + P20 -10G-> lc1C P4 + P21 -10G-> lc1B P3 + P22 -10G-> lc1A P4 + P23 -10G-> lc2D P3 + P24 -10G-> lc2B P3 + P25 -10G-> lc2C P4 + P26 -10G-> lc3D P3 + P27 -10G-> lc2A P4 + P28 -10G-> lc3B P3 + P29 -10G-> lc3C P4 + P30 -10G-> lc4D P3 + P31 -10G-> lc5C P4 + P32 -10G-> lc5D P3 + P33 -10G-> lc4A P4 + P34 -10G-> lc4B P3 + P35 -10G-> lc4C P4 + P36 -10G-> lc3A P4 SUBSYSTEM SPINE fc7B - P1 -10G-> lc8D P6 - P2 -10G-> lc8A P5 - P3 -10G-> lc8B P6 - P4 -10G-> lc8C P5 - P5 -10G-> lc7D P6 - P6 -10G-> lc7B P6 - P7 -10G-> lc7A P5 - P8 -10G-> lc6D P6 - P9 -10G-> lc7C P5 - P10 -10G-> lc6B P6 - P11 -10G-> lc6A P5 - P12 -10G-> lc5D P6 - P13 -10G-> lc4A P5 - P14 -10G-> lc4D P6 - P15 -10G-> lc5C P5 - P16 -10G-> lc5B P6 - P17 -10G-> lc5A P5 - P18 -10G-> lc6C P5 - P19 -10G-> lc9C P5 - P20 -10G-> lc9B P6 - P21 -10G-> lc9A P5 - P22 -10G-> lc9D P6 - P23 -10G-> lc1C P5 - P24 -10G-> lc1A P5 - P25 -10G-> lc1B P6 - P26 -10G-> lc2C P5 - P27 -10G-> lc1D P6 - P28 -10G-> lc2A P5 - P29 -10G-> lc2B P6 - P30 -10G-> lc3C P5 - P31 -10G-> lc4B P6 - P32 -10G-> lc4C P5 - P33 -10G-> lc3D P6 - P34 -10G-> lc3A P5 - P35 -10G-> lc3B P6 - P36 -10G-> lc2D P6 + P1 -10G-> lc7D P4 + P2 -10G-> lc7A P3 + P3 -10G-> lc7B P4 + P4 -10G-> lc7C P3 + P5 -10G-> lc6D P4 + P6 -10G-> lc6B P4 + P7 -10G-> lc6A P3 + P8 -10G-> lc5D P4 + P9 -10G-> lc6C P3 + P10 -10G-> 
lc5B P4 + P11 -10G-> lc5A P3 + P12 -10G-> lc4D P4 + P13 -10G-> lc3A P3 + P14 -10G-> lc3D P4 + P15 -10G-> lc4C P3 + P16 -10G-> lc4B P4 + P17 -10G-> lc4A P3 + P18 -10G-> lc5C P3 + P19 -10G-> lc8C P3 + P20 -10G-> lc8B P4 + P21 -10G-> lc8A P3 + P22 -10G-> lc8D P4 + P23 -10G-> lc0C P3 + P24 -10G-> lc0A P3 + P25 -10G-> lc0B P4 + P26 -10G-> lc1C P3 + P27 -10G-> lc0D P4 + P28 -10G-> lc1A P3 + P29 -10G-> lc1B P4 + P30 -10G-> lc2C P3 + P31 -10G-> lc3B P4 + P32 -10G-> lc3C P3 + P33 -10G-> lc2D P4 + P34 -10G-> lc2A P3 + P35 -10G-> lc2B P4 + P36 -10G-> lc1D P4 SUBSYSTEM SPINE fc8A - P1 -10G-> lc1A P4 - P2 -10G-> lc1B P3 - P3 -10G-> lc1C P4 - P4 -10G-> lc1D P3 - P5 -10G-> lc9A P4 - P6 -10G-> lc9C P4 - P7 -10G-> lc9B P3 - P8 -10G-> lc8A P4 - P9 -10G-> lc9D P3 - P10 -10G-> lc8C P4 - P11 -10G-> lc8B P3 - P12 -10G-> lc7A P4 - P13 -10G-> lc6B P3 - P14 -10G-> lc6A P4 - P15 -10G-> lc7D P3 - P16 -10G-> lc7C P4 - P17 -10G-> lc7B P3 - P18 -10G-> lc8D P3 - P19 -10G-> lc2D P3 - P20 -10G-> lc2C P4 - P21 -10G-> lc2B P3 - P22 -10G-> lc2A P4 - P23 -10G-> lc3D P3 - P24 -10G-> lc3B P3 - P25 -10G-> lc3C P4 - P26 -10G-> lc4D P3 - P27 -10G-> lc3A P4 - P28 -10G-> lc4B P3 - P29 -10G-> lc4C P4 - P30 -10G-> lc5D P3 - P31 -10G-> lc6C P4 - P32 -10G-> lc6D P3 - P33 -10G-> lc5A P4 - P34 -10G-> lc5B P3 - P35 -10G-> lc5C P4 - P36 -10G-> lc4A P4 + P1 -10G-> lc0A P2 + P2 -10G-> lc0B P1 + P3 -10G-> lc0C P2 + P4 -10G-> lc0D P1 + P5 -10G-> lc8A P2 + P6 -10G-> lc8C P2 + P7 -10G-> lc8B P1 + P8 -10G-> lc7A P2 + P9 -10G-> lc8D P1 + P10 -10G-> lc7C P2 + P11 -10G-> lc7B P1 + P12 -10G-> lc6A P2 + P13 -10G-> lc5B P1 + P14 -10G-> lc5A P2 + P15 -10G-> lc6D P1 + P16 -10G-> lc6C P2 + P17 -10G-> lc6B P1 + P18 -10G-> lc7D P1 + P19 -10G-> lc1D P1 + P20 -10G-> lc1C P2 + P21 -10G-> lc1B P1 + P22 -10G-> lc1A P2 + P23 -10G-> lc2D P1 + P24 -10G-> lc2B P1 + P25 -10G-> lc2C P2 + P26 -10G-> lc3D P1 + P27 -10G-> lc2A P2 + P28 -10G-> lc3B P1 + P29 -10G-> lc3C P2 + P30 -10G-> lc4D P1 + P31 -10G-> lc5C P2 + P32 -10G-> lc5D P1 + P33 -10G-> 
lc4A P2 + P34 -10G-> lc4B P1 + P35 -10G-> lc4C P2 + P36 -10G-> lc3A P2 SUBSYSTEM SPINE fc8B - P1 -10G-> lc8D P4 - P2 -10G-> lc8A P3 - P3 -10G-> lc8B P4 - P4 -10G-> lc8C P3 - P5 -10G-> lc7D P4 - P6 -10G-> lc7B P4 - P7 -10G-> lc7A P3 - P8 -10G-> lc6D P4 - P9 -10G-> lc7C P3 - P10 -10G-> lc6B P4 - P11 -10G-> lc6A P3 - P12 -10G-> lc5D P4 - P13 -10G-> lc4A P3 - P14 -10G-> lc4D P4 - P15 -10G-> lc5C P3 - P16 -10G-> lc5B P4 - P17 -10G-> lc5A P3 - P18 -10G-> lc6C P3 - P19 -10G-> lc9C P3 - P20 -10G-> lc9B P4 - P21 -10G-> lc9A P3 - P22 -10G-> lc9D P4 - P23 -10G-> lc1C P3 - P24 -10G-> lc1A P3 - P25 -10G-> lc1B P4 - P26 -10G-> lc2C P3 - P27 -10G-> lc1D P4 - P28 -10G-> lc2A P3 - P29 -10G-> lc2B P4 - P30 -10G-> lc3C P3 - P31 -10G-> lc4B P4 - P32 -10G-> lc4C P3 - P33 -10G-> lc3D P4 - P34 -10G-> lc3A P3 - P35 -10G-> lc3B P4 - P36 -10G-> lc2D P4 + P1 -10G-> lc7D P2 + P2 -10G-> lc7A P1 + P3 -10G-> lc7B P2 + P4 -10G-> lc7C P1 + P5 -10G-> lc6D P2 + P6 -10G-> lc6B P2 + P7 -10G-> lc6A P1 + P8 -10G-> lc5D P2 + P9 -10G-> lc6C P1 + P10 -10G-> lc5B P2 + P11 -10G-> lc5A P1 + P12 -10G-> lc4D P2 + P13 -10G-> lc3A P1 + P14 -10G-> lc3D P2 + P15 -10G-> lc4C P1 + P16 -10G-> lc4B P2 + P17 -10G-> lc4A P1 + P18 -10G-> lc5C P1 + P19 -10G-> lc8C P1 + P20 -10G-> lc8B P2 + P21 -10G-> lc8A P1 + P22 -10G-> lc8D P2 + P23 -10G-> lc0C P1 + P24 -10G-> lc0A P1 + P25 -10G-> lc0B P2 + P26 -10G-> lc1C P1 + P27 -10G-> lc0D P2 + P28 -10G-> lc1A P1 + P29 -10G-> lc1B P2 + P30 -10G-> lc2C P1 + P31 -10G-> lc3B P2 + P32 -10G-> lc3C P1 + P33 -10G-> lc2D P2 + P34 -10G-> lc2A P1 + P35 -10G-> lc2B P2 + P36 -10G-> lc1D P2 + +SUBSYSTEM LEAF lc0A + P1 -10G-> fc8B P24 + P2 -10G-> fc8A P1 + P3 -10G-> fc7B P24 + P4 -10G-> fc7A P1 + P5 -10G-> fc6B P24 + P6 -10G-> fc6A P1 + P7 -10G-> fc5B P24 + P8 -10G-> fc5A P1 + P9 -10G-> fc4B P24 + P10 -10G-> fc4A P1 + P11 -10G-> fc3B P24 + P12 -10G-> fc3A P1 + P13 -10G-> fc0A P1 + P14 -10G-> fc0B P24 + P15 -10G-> fc1A P1 + P16 -10G-> fc1B P24 + P17 -10G-> fc2A P1 + P18 -10G-> fc2B P24 + P19 -10G-> 
lc0-0B/P3 + P20 -10G-> lc0-0A/P3 + P21 -10G-> lc0-0A/P2 + P22 -10G-> lc0-0A/P1 + P23 -10G-> lc0-0B/P2 + P24 -10G-> lc0-0B/P1 + P25 -10G-> lc0-1B/P3 + P26 -10G-> lc0-1A/P3 + P27 -10G-> lc0-1A/P2 + P28 -10G-> lc0-1A/P1 + P29 -10G-> lc0-1B/P2 + P30 -10G-> lc0-1B/P1 + P31 -10G-> lc0-2B/P1 + P32 -10G-> lc0-2B/P2 + P33 -10G-> lc0-2A/P1 + P34 -10G-> lc0-2A/P2 + P35 -10G-> lc0-2A/P3 + P36 -10G-> lc0-2B/P3 + +SUBSYSTEM LEAF lc0B + P1 -10G-> fc8A P2 + P2 -10G-> fc8B P25 + P3 -10G-> fc7A P2 + P4 -10G-> fc7B P25 + P5 -10G-> fc6A P2 + P6 -10G-> fc6B P25 + P7 -10G-> fc5A P2 + P8 -10G-> fc5B P25 + P9 -10G-> fc4A P2 + P10 -10G-> fc4B P25 + P11 -10G-> fc3A P2 + P12 -10G-> fc3B P25 + P13 -10G-> fc0B P25 + P14 -10G-> fc0A P2 + P15 -10G-> fc1B P25 + P16 -10G-> fc1A P2 + P17 -10G-> fc2B P25 + P18 -10G-> fc2A P2 + P19 -10G-> lc0-3B/P3 + P20 -10G-> lc0-3A/P3 + P21 -10G-> lc0-3A/P2 + P22 -10G-> lc0-3A/P1 + P23 -10G-> lc0-3B/P2 + P24 -10G-> lc0-3B/P1 + P25 -10G-> lc0-4B/P3 + P26 -10G-> lc0-4A/P3 + P27 -10G-> lc0-4A/P2 + P28 -10G-> lc0-4A/P1 + P29 -10G-> lc0-4B/P2 + P30 -10G-> lc0-4B/P1 + P31 -10G-> lc0-5B/P1 + P32 -10G-> lc0-5B/P2 + P33 -10G-> lc0-5A/P1 + P34 -10G-> lc0-5A/P2 + P35 -10G-> lc0-5A/P3 + P36 -10G-> lc0-5B/P3 -SUBSYSTEM SPINE fc9A - P1 -10G-> lc1A P2 - P2 -10G-> lc1B P1 - P3 -10G-> lc1C P2 - P4 -10G-> lc1D P1 - P5 -10G-> lc9A P2 - P6 -10G-> lc9C P2 - P7 -10G-> lc9B P1 - P8 -10G-> lc8A P2 - P9 -10G-> lc9D P1 - P10 -10G-> lc8C P2 - P11 -10G-> lc8B P1 - P12 -10G-> lc7A P2 - P13 -10G-> lc6B P1 - P14 -10G-> lc6A P2 - P15 -10G-> lc7D P1 - P16 -10G-> lc7C P2 - P17 -10G-> lc7B P1 - P18 -10G-> lc8D P1 - P19 -10G-> lc2D P1 - P20 -10G-> lc2C P2 - P21 -10G-> lc2B P1 - P22 -10G-> lc2A P2 - P23 -10G-> lc3D P1 - P24 -10G-> lc3B P1 - P25 -10G-> lc3C P2 - P26 -10G-> lc4D P1 - P27 -10G-> lc3A P2 - P28 -10G-> lc4B P1 - P29 -10G-> lc4C P2 - P30 -10G-> lc5D P1 - P31 -10G-> lc6C P2 - P32 -10G-> lc6D P1 - P33 -10G-> lc5A P2 - P34 -10G-> lc5B P1 - P35 -10G-> lc5C P2 - P36 -10G-> lc4A P2 +SUBSYSTEM 
LEAF lc0C + P1 -10G-> fc8B P23 + P2 -10G-> fc8A P3 + P3 -10G-> fc7B P23 + P4 -10G-> fc7A P3 + P5 -10G-> fc6B P23 + P6 -10G-> fc6A P3 + P7 -10G-> fc5B P23 + P8 -10G-> fc5A P3 + P9 -10G-> fc4B P23 + P10 -10G-> fc4A P3 + P11 -10G-> fc3B P23 + P12 -10G-> fc3A P3 + P13 -10G-> fc0A P3 + P14 -10G-> fc0B P23 + P15 -10G-> fc1A P3 + P16 -10G-> fc1B P23 + P17 -10G-> fc2A P3 + P18 -10G-> fc2B P23 + P19 -10G-> lc0-6B/P3 + P20 -10G-> lc0-6A/P3 + P21 -10G-> lc0-6A/P2 + P22 -10G-> lc0-6A/P1 + P23 -10G-> lc0-6B/P2 + P24 -10G-> lc0-6B/P1 + P25 -10G-> lc0-7B/P3 + P26 -10G-> lc0-7A/P3 + P27 -10G-> lc0-7A/P2 + P28 -10G-> lc0-7A/P1 + P29 -10G-> lc0-7B/P2 + P30 -10G-> lc0-7B/P1 + P31 -10G-> lc0-8B/P1 + P32 -10G-> lc0-8B/P2 + P33 -10G-> lc0-8A/P1 + P34 -10G-> lc0-8A/P2 + P35 -10G-> lc0-8A/P3 + P36 -10G-> lc0-8B/P3 -SUBSYSTEM SPINE fc9B - P1 -10G-> lc8D P2 - P2 -10G-> lc8A P1 - P3 -10G-> lc8B P2 - P4 -10G-> lc8C P1 - P5 -10G-> lc7D P2 - P6 -10G-> lc7B P2 - P7 -10G-> lc7A P1 - P8 -10G-> lc6D P2 - P9 -10G-> lc7C P1 - P10 -10G-> lc6B P2 - P11 -10G-> lc6A P1 - P12 -10G-> lc5D P2 - P13 -10G-> lc4A P1 - P14 -10G-> lc4D P2 - P15 -10G-> lc5C P1 - P16 -10G-> lc5B P2 - P17 -10G-> lc5A P1 - P18 -10G-> lc6C P1 - P19 -10G-> lc9C P1 - P20 -10G-> lc9B P2 - P21 -10G-> lc9A P1 - P22 -10G-> lc9D P2 - P23 -10G-> lc1C P1 - P24 -10G-> lc1A P1 - P25 -10G-> lc1B P2 - P26 -10G-> lc2C P1 - P27 -10G-> lc1D P2 - P28 -10G-> lc2A P1 - P29 -10G-> lc2B P2 - P30 -10G-> lc3C P1 - P31 -10G-> lc4B P2 - P32 -10G-> lc4C P1 - P33 -10G-> lc3D P2 - P34 -10G-> lc3A P1 - P35 -10G-> lc3B P2 - P36 -10G-> lc2D P2 +SUBSYSTEM LEAF lc0D + P1 -10G-> fc8A P4 + P2 -10G-> fc8B P27 + P3 -10G-> fc7A P4 + P4 -10G-> fc7B P27 + P5 -10G-> fc6A P4 + P6 -10G-> fc6B P27 + P7 -10G-> fc5A P4 + P8 -10G-> fc5B P27 + P9 -10G-> fc4A P4 + P10 -10G-> fc4B P27 + P11 -10G-> fc3A P4 + P12 -10G-> fc3B P27 + P13 -10G-> fc0B P27 + P14 -10G-> fc0A P4 + P15 -10G-> fc1B P27 + P16 -10G-> fc1A P4 + P17 -10G-> fc2B P27 + P18 -10G-> fc2A P4 + P19 -10G-> lc0-9B/P3 + P20 
-10G-> lc0-9A/P3 + P21 -10G-> lc0-9A/P2 + P22 -10G-> lc0-9A/P1 + P23 -10G-> lc0-9B/P2 + P24 -10G-> lc0-9B/P1 + P25 -10G-> lc0-10B/P3 + P26 -10G-> lc0-10A/P3 + P27 -10G-> lc0-10A/P2 + P28 -10G-> lc0-10A/P1 + P29 -10G-> lc0-10B/P2 + P30 -10G-> lc0-10B/P1 + P31 -10G-> lc0-11B/P1 + P32 -10G-> lc0-11B/P2 + P33 -10G-> lc0-11A/P1 + P34 -10G-> lc0-11A/P2 + P35 -10G-> lc0-11A/P3 + P36 -10G-> lc0-11B/P3 SUBSYSTEM LEAF lc1A - P1 -10G-> fc9B P24 - P2 -10G-> fc9A P1 - P3 -10G-> fc8B P24 - P4 -10G-> fc8A P1 - P5 -10G-> fc7B P24 - P6 -10G-> fc7A P1 - P7 -10G-> fc6B P24 - P8 -10G-> fc6A P1 - P9 -10G-> fc5B P24 - P10 -10G-> fc5A P1 - P11 -10G-> fc4B P24 - P12 -10G-> fc4A P1 - P13 -10G-> fc1A P1 - P14 -10G-> fc1B P24 - P15 -10G-> fc2A P1 - P16 -10G-> fc2B P24 - P17 -10G-> fc3A P1 - P18 -10G-> fc3B P24 - P19 -10G-> lc1-0A/P3 - P20 -10G-> lc1-0B/P3 - P21 -10G-> lc1-0B/P2 - P22 -10G-> lc1-0B/P1 - P23 -10G-> lc1-0A/P2 - P24 -10G-> lc1-0A/P1 - P25 -10G-> lc1-1A/P3 - P26 -10G-> lc1-1B/P3 - P27 -10G-> lc1-1B/P2 - P28 -10G-> lc1-1B/P1 - P29 -10G-> lc1-1A/P2 - P30 -10G-> lc1-1A/P1 - P31 -10G-> lc1-2A/P1 - P32 -10G-> lc1-2A/P2 - P33 -10G-> lc1-2B/P1 - P34 -10G-> lc1-2B/P2 - P35 -10G-> lc1-2B/P3 - P36 -10G-> lc1-2A/P3 + P1 -10G-> fc8B P28 + P2 -10G-> fc8A P22 + P3 -10G-> fc7B P28 + P4 -10G-> fc7A P22 + P5 -10G-> fc6B P28 + P6 -10G-> fc6A P22 + P7 -10G-> fc5B P28 + P8 -10G-> fc5A P22 + P9 -10G-> fc4B P28 + P10 -10G-> fc4A P22 + P11 -10G-> fc3B P28 + P12 -10G-> fc3A P22 + P13 -10G-> fc0A P22 + P14 -10G-> fc0B P28 + P15 -10G-> fc1A P22 + P16 -10G-> fc1B P28 + P17 -10G-> fc2A P22 + P18 -10G-> fc2B P28 + P19 -10G-> lc1-0B/P3 + P20 -10G-> lc1-0A/P3 + P21 -10G-> lc1-0A/P2 + P22 -10G-> lc1-0A/P1 + P23 -10G-> lc1-0B/P2 + P24 -10G-> lc1-0B/P1 + P25 -10G-> lc1-1B/P3 + P26 -10G-> lc1-1A/P3 + P27 -10G-> lc1-1A/P2 + P28 -10G-> lc1-1A/P1 + P29 -10G-> lc1-1B/P2 + P30 -10G-> lc1-1B/P1 + P31 -10G-> lc1-2B/P1 + P32 -10G-> lc1-2B/P2 + P33 -10G-> lc1-2A/P1 + P34 -10G-> lc1-2A/P2 + P35 -10G-> lc1-2A/P3 + P36 -10G-> 
lc1-2B/P3 SUBSYSTEM LEAF lc1B - P1 -10G-> fc9A P2 - P2 -10G-> fc9B P25 - P3 -10G-> fc8A P2 - P4 -10G-> fc8B P25 - P5 -10G-> fc7A P2 - P6 -10G-> fc7B P25 - P7 -10G-> fc6A P2 - P8 -10G-> fc6B P25 - P9 -10G-> fc5A P2 - P10 -10G-> fc5B P25 - P11 -10G-> fc4A P2 - P12 -10G-> fc4B P25 - P13 -10G-> fc1B P25 - P14 -10G-> fc1A P2 - P15 -10G-> fc2B P25 - P16 -10G-> fc2A P2 - P17 -10G-> fc3B P25 - P18 -10G-> fc3A P2 - P19 -10G-> lc1-3A/P3 - P20 -10G-> lc1-3B/P3 - P21 -10G-> lc1-3B/P2 - P22 -10G-> lc1-3B/P1 - P23 -10G-> lc1-3A/P2 - P24 -10G-> lc1-3A/P1 - P25 -10G-> lc1-4A/P3 - P26 -10G-> lc1-4B/P3 - P27 -10G-> lc1-4B/P2 - P28 -10G-> lc1-4B/P1 - P29 -10G-> lc1-4A/P2 - P30 -10G-> lc1-4A/P1 - P31 -10G-> lc1-5A/P1 - P32 -10G-> lc1-5A/P2 - P33 -10G-> lc1-5B/P1 - P34 -10G-> lc1-5B/P2 - P35 -10G-> lc1-5B/P3 - P36 -10G-> lc1-5A/P3 + P1 -10G-> fc8A P21 + P2 -10G-> fc8B P29 + P3 -10G-> fc7A P21 + P4 -10G-> fc7B P29 + P5 -10G-> fc6A P21 + P6 -10G-> fc6B P29 + P7 -10G-> fc5A P21 + P8 -10G-> fc5B P29 + P9 -10G-> fc4A P21 + P10 -10G-> fc4B P29 + P11 -10G-> fc3A P21 + P12 -10G-> fc3B P29 + P13 -10G-> fc0B P29 + P14 -10G-> fc0A P21 + P15 -10G-> fc1B P29 + P16 -10G-> fc1A P21 + P17 -10G-> fc2B P29 + P18 -10G-> fc2A P21 + P19 -10G-> lc1-3B/P3 + P20 -10G-> lc1-3A/P3 + P21 -10G-> lc1-3A/P2 + P22 -10G-> lc1-3A/P1 + P23 -10G-> lc1-3B/P2 + P24 -10G-> lc1-3B/P1 + P25 -10G-> lc1-4B/P3 + P26 -10G-> lc1-4A/P3 + P27 -10G-> lc1-4A/P2 + P28 -10G-> lc1-4A/P1 + P29 -10G-> lc1-4B/P2 + P30 -10G-> lc1-4B/P1 + P31 -10G-> lc1-5B/P1 + P32 -10G-> lc1-5B/P2 + P33 -10G-> lc1-5A/P1 + P34 -10G-> lc1-5A/P2 + P35 -10G-> lc1-5A/P3 + P36 -10G-> lc1-5B/P3 SUBSYSTEM LEAF lc1C - P1 -10G-> fc9B P23 - P2 -10G-> fc9A P3 - P3 -10G-> fc8B P23 - P4 -10G-> fc8A P3 - P5 -10G-> fc7B P23 - P6 -10G-> fc7A P3 - P7 -10G-> fc6B P23 - P8 -10G-> fc6A P3 - P9 -10G-> fc5B P23 - P10 -10G-> fc5A P3 - P11 -10G-> fc4B P23 - P12 -10G-> fc4A P3 - P13 -10G-> fc1A P3 - P14 -10G-> fc1B P23 - P15 -10G-> fc2A P3 - P16 -10G-> fc2B P23 - P17 -10G-> fc3A P3 
- P18 -10G-> fc3B P23 - P19 -10G-> lc1-6A/P3 - P20 -10G-> lc1-6B/P3 - P21 -10G-> lc1-6B/P2 - P22 -10G-> lc1-6B/P1 - P23 -10G-> lc1-6A/P2 - P24 -10G-> lc1-6A/P1 - P25 -10G-> lc1-7A/P3 - P26 -10G-> lc1-7B/P3 - P27 -10G-> lc1-7B/P2 - P28 -10G-> lc1-7B/P1 - P29 -10G-> lc1-7A/P2 - P30 -10G-> lc1-7A/P1 - P31 -10G-> lc1-8A/P1 - P32 -10G-> lc1-8A/P2 - P33 -10G-> lc1-8B/P1 - P34 -10G-> lc1-8B/P2 - P35 -10G-> lc1-8B/P3 - P36 -10G-> lc1-8A/P3 + P1 -10G-> fc8B P26 + P2 -10G-> fc8A P20 + P3 -10G-> fc7B P26 + P4 -10G-> fc7A P20 + P5 -10G-> fc6B P26 + P6 -10G-> fc6A P20 + P7 -10G-> fc5B P26 + P8 -10G-> fc5A P20 + P9 -10G-> fc4B P26 + P10 -10G-> fc4A P20 + P11 -10G-> fc3B P26 + P12 -10G-> fc3A P20 + P13 -10G-> fc0A P20 + P14 -10G-> fc0B P26 + P15 -10G-> fc1A P20 + P16 -10G-> fc1B P26 + P17 -10G-> fc2A P20 + P18 -10G-> fc2B P26 + P19 -10G-> lc1-6B/P3 + P20 -10G-> lc1-6A/P3 + P21 -10G-> lc1-6A/P2 + P22 -10G-> lc1-6A/P1 + P23 -10G-> lc1-6B/P2 + P24 -10G-> lc1-6B/P1 + P25 -10G-> lc1-7B/P3 + P26 -10G-> lc1-7A/P3 + P27 -10G-> lc1-7A/P2 + P28 -10G-> lc1-7A/P1 + P29 -10G-> lc1-7B/P2 + P30 -10G-> lc1-7B/P1 + P31 -10G-> lc1-8B/P1 + P32 -10G-> lc1-8B/P2 + P33 -10G-> lc1-8A/P1 + P34 -10G-> lc1-8A/P2 + P35 -10G-> lc1-8A/P3 + P36 -10G-> lc1-8B/P3 SUBSYSTEM LEAF lc1D - P1 -10G-> fc9A P4 - P2 -10G-> fc9B P27 - P3 -10G-> fc8A P4 - P4 -10G-> fc8B P27 - P5 -10G-> fc7A P4 - P6 -10G-> fc7B P27 - P7 -10G-> fc6A P4 - P8 -10G-> fc6B P27 - P9 -10G-> fc5A P4 - P10 -10G-> fc5B P27 - P11 -10G-> fc4A P4 - P12 -10G-> fc4B P27 - P13 -10G-> fc1B P27 - P14 -10G-> fc1A P4 - P15 -10G-> fc2B P27 - P16 -10G-> fc2A P4 - P17 -10G-> fc3B P27 - P18 -10G-> fc3A P4 - P19 -10G-> lc1-9A/P3 - P20 -10G-> lc1-9B/P3 - P21 -10G-> lc1-9B/P2 - P22 -10G-> lc1-9B/P1 - P23 -10G-> lc1-9A/P2 - P24 -10G-> lc1-9A/P1 - P25 -10G-> lc1-10A/P3 - P26 -10G-> lc1-10B/P3 - P27 -10G-> lc1-10B/P2 - P28 -10G-> lc1-10B/P1 - P29 -10G-> lc1-10A/P2 - P30 -10G-> lc1-10A/P1 - P31 -10G-> lc1-11A/P1 - P32 -10G-> lc1-11A/P2 - P33 -10G-> lc1-11B/P1 - P34 
-10G-> lc1-11B/P2 - P35 -10G-> lc1-11B/P3 - P36 -10G-> lc1-11A/P3 + P1 -10G-> fc8A P19 + P2 -10G-> fc8B P36 + P3 -10G-> fc7A P19 + P4 -10G-> fc7B P36 + P5 -10G-> fc6A P19 + P6 -10G-> fc6B P36 + P7 -10G-> fc5A P19 + P8 -10G-> fc5B P36 + P9 -10G-> fc4A P19 + P10 -10G-> fc4B P36 + P11 -10G-> fc3A P19 + P12 -10G-> fc3B P36 + P13 -10G-> fc0B P36 + P14 -10G-> fc0A P19 + P15 -10G-> fc1B P36 + P16 -10G-> fc1A P19 + P17 -10G-> fc2B P36 + P18 -10G-> fc2A P19 + P19 -10G-> lc1-9B/P3 + P20 -10G-> lc1-9A/P3 + P21 -10G-> lc1-9A/P2 + P22 -10G-> lc1-9A/P1 + P23 -10G-> lc1-9B/P2 + P24 -10G-> lc1-9B/P1 + P25 -10G-> lc1-10B/P3 + P26 -10G-> lc1-10A/P3 + P27 -10G-> lc1-10A/P2 + P28 -10G-> lc1-10A/P1 + P29 -10G-> lc1-10B/P2 + P30 -10G-> lc1-10B/P1 + P31 -10G-> lc1-11B/P1 + P32 -10G-> lc1-11B/P2 + P33 -10G-> lc1-11A/P1 + P34 -10G-> lc1-11A/P2 + P35 -10G-> lc1-11A/P3 + P36 -10G-> lc1-11B/P3 SUBSYSTEM LEAF lc2A - P1 -10G-> fc9B P28 - P2 -10G-> fc9A P22 - P3 -10G-> fc8B P28 - P4 -10G-> fc8A P22 - P5 -10G-> fc7B P28 - P6 -10G-> fc7A P22 - P7 -10G-> fc6B P28 - P8 -10G-> fc6A P22 - P9 -10G-> fc5B P28 - P10 -10G-> fc5A P22 - P11 -10G-> fc4B P28 - P12 -10G-> fc4A P22 - P13 -10G-> fc1A P22 - P14 -10G-> fc1B P28 - P15 -10G-> fc2A P22 - P16 -10G-> fc2B P28 - P17 -10G-> fc3A P22 - P18 -10G-> fc3B P28 - P19 -10G-> lc2-0A/P3 - P20 -10G-> lc2-0B/P3 - P21 -10G-> lc2-0B/P2 - P22 -10G-> lc2-0B/P1 - P23 -10G-> lc2-0A/P2 - P24 -10G-> lc2-0A/P1 - P25 -10G-> lc2-1A/P3 - P26 -10G-> lc2-1B/P3 - P27 -10G-> lc2-1B/P2 - P28 -10G-> lc2-1B/P1 - P29 -10G-> lc2-1A/P2 - P30 -10G-> lc2-1A/P1 - P31 -10G-> lc2-2A/P1 - P32 -10G-> lc2-2A/P2 - P33 -10G-> lc2-2B/P1 - P34 -10G-> lc2-2B/P2 - P35 -10G-> lc2-2B/P3 - P36 -10G-> lc2-2A/P3 + P1 -10G-> fc8B P34 + P2 -10G-> fc8A P27 + P3 -10G-> fc7B P34 + P4 -10G-> fc7A P27 + P5 -10G-> fc6B P34 + P6 -10G-> fc6A P27 + P7 -10G-> fc5B P34 + P8 -10G-> fc5A P27 + P9 -10G-> fc4B P34 + P10 -10G-> fc4A P27 + P11 -10G-> fc3B P34 + P12 -10G-> fc3A P27 + P13 -10G-> fc0A P27 + P14 -10G-> fc0B P34 
+ P15 -10G-> fc1A P27 + P16 -10G-> fc1B P34 + P17 -10G-> fc2A P27 + P18 -10G-> fc2B P34 + P19 -10G-> lc2-0B/P3 + P20 -10G-> lc2-0A/P3 + P21 -10G-> lc2-0A/P2 + P22 -10G-> lc2-0A/P1 + P23 -10G-> lc2-0B/P2 + P24 -10G-> lc2-0B/P1 + P25 -10G-> lc2-1B/P3 + P26 -10G-> lc2-1A/P3 + P27 -10G-> lc2-1A/P2 + P28 -10G-> lc2-1A/P1 + P29 -10G-> lc2-1B/P2 + P30 -10G-> lc2-1B/P1 + P31 -10G-> lc2-2B/P1 + P32 -10G-> lc2-2B/P2 + P33 -10G-> lc2-2A/P1 + P34 -10G-> lc2-2A/P2 + P35 -10G-> lc2-2A/P3 + P36 -10G-> lc2-2B/P3 SUBSYSTEM LEAF lc2B - P1 -10G-> fc9A P21 - P2 -10G-> fc9B P29 - P3 -10G-> fc8A P21 - P4 -10G-> fc8B P29 - P5 -10G-> fc7A P21 - P6 -10G-> fc7B P29 - P7 -10G-> fc6A P21 - P8 -10G-> fc6B P29 - P9 -10G-> fc5A P21 - P10 -10G-> fc5B P29 - P11 -10G-> fc4A P21 - P12 -10G-> fc4B P29 - P13 -10G-> fc1B P29 - P14 -10G-> fc1A P21 - P15 -10G-> fc2B P29 - P16 -10G-> fc2A P21 - P17 -10G-> fc3B P29 - P18 -10G-> fc3A P21 - P19 -10G-> lc2-3A/P3 - P20 -10G-> lc2-3B/P3 - P21 -10G-> lc2-3B/P2 - P22 -10G-> lc2-3B/P1 - P23 -10G-> lc2-3A/P2 - P24 -10G-> lc2-3A/P1 - P25 -10G-> lc2-4A/P3 - P26 -10G-> lc2-4B/P3 - P27 -10G-> lc2-4B/P2 - P28 -10G-> lc2-4B/P1 - P29 -10G-> lc2-4A/P2 - P30 -10G-> lc2-4A/P1 - P31 -10G-> lc2-5A/P1 - P32 -10G-> lc2-5A/P2 - P33 -10G-> lc2-5B/P1 - P34 -10G-> lc2-5B/P2 - P35 -10G-> lc2-5B/P3 - P36 -10G-> lc2-5A/P3 + P1 -10G-> fc8A P24 + P2 -10G-> fc8B P35 + P3 -10G-> fc7A P24 + P4 -10G-> fc7B P35 + P5 -10G-> fc6A P24 + P6 -10G-> fc6B P35 + P7 -10G-> fc5A P24 + P8 -10G-> fc5B P35 + P9 -10G-> fc4A P24 + P10 -10G-> fc4B P35 + P11 -10G-> fc3A P24 + P12 -10G-> fc3B P35 + P13 -10G-> fc0B P35 + P14 -10G-> fc0A P24 + P15 -10G-> fc1B P35 + P16 -10G-> fc1A P24 + P17 -10G-> fc2B P35 + P18 -10G-> fc2A P24 + P19 -10G-> lc2-3B/P3 + P20 -10G-> lc2-3A/P3 + P21 -10G-> lc2-3A/P2 + P22 -10G-> lc2-3A/P1 + P23 -10G-> lc2-3B/P2 + P24 -10G-> lc2-3B/P1 + P25 -10G-> lc2-4B/P3 + P26 -10G-> lc2-4A/P3 + P27 -10G-> lc2-4A/P2 + P28 -10G-> lc2-4A/P1 + P29 -10G-> lc2-4B/P2 + P30 -10G-> lc2-4B/P1 + P31 -10G-> 
lc2-5B/P1 + P32 -10G-> lc2-5B/P2 + P33 -10G-> lc2-5A/P1 + P34 -10G-> lc2-5A/P2 + P35 -10G-> lc2-5A/P3 + P36 -10G-> lc2-5B/P3 SUBSYSTEM LEAF lc2C - P1 -10G-> fc9B P26 - P2 -10G-> fc9A P20 - P3 -10G-> fc8B P26 - P4 -10G-> fc8A P20 - P5 -10G-> fc7B P26 - P6 -10G-> fc7A P20 - P7 -10G-> fc6B P26 - P8 -10G-> fc6A P20 - P9 -10G-> fc5B P26 - P10 -10G-> fc5A P20 - P11 -10G-> fc4B P26 - P12 -10G-> fc4A P20 - P13 -10G-> fc1A P20 - P14 -10G-> fc1B P26 - P15 -10G-> fc2A P20 - P16 -10G-> fc2B P26 - P17 -10G-> fc3A P20 - P18 -10G-> fc3B P26 - P19 -10G-> lc2-6A/P3 - P20 -10G-> lc2-6B/P3 - P21 -10G-> lc2-6B/P2 - P22 -10G-> lc2-6B/P1 - P23 -10G-> lc2-6A/P2 - P24 -10G-> lc2-6A/P1 - P25 -10G-> lc2-7A/P3 - P26 -10G-> lc2-7B/P3 - P27 -10G-> lc2-7B/P2 - P28 -10G-> lc2-7B/P1 - P29 -10G-> lc2-7A/P2 - P30 -10G-> lc2-7A/P1 - P31 -10G-> lc2-8A/P1 - P32 -10G-> lc2-8A/P2 - P33 -10G-> lc2-8B/P1 - P34 -10G-> lc2-8B/P2 - P35 -10G-> lc2-8B/P3 - P36 -10G-> lc2-8A/P3 + P1 -10G-> fc8B P30 + P2 -10G-> fc8A P25 + P3 -10G-> fc7B P30 + P4 -10G-> fc7A P25 + P5 -10G-> fc6B P30 + P6 -10G-> fc6A P25 + P7 -10G-> fc5B P30 + P8 -10G-> fc5A P25 + P9 -10G-> fc4B P30 + P10 -10G-> fc4A P25 + P11 -10G-> fc3B P30 + P12 -10G-> fc3A P25 + P13 -10G-> fc0A P25 + P14 -10G-> fc0B P30 + P15 -10G-> fc1A P25 + P16 -10G-> fc1B P30 + P17 -10G-> fc2A P25 + P18 -10G-> fc2B P30 + P19 -10G-> lc2-6B/P3 + P20 -10G-> lc2-6A/P3 + P21 -10G-> lc2-6A/P2 + P22 -10G-> lc2-6A/P1 + P23 -10G-> lc2-6B/P2 + P24 -10G-> lc2-6B/P1 + P25 -10G-> lc2-7B/P3 + P26 -10G-> lc2-7A/P3 + P27 -10G-> lc2-7A/P2 + P28 -10G-> lc2-7A/P1 + P29 -10G-> lc2-7B/P2 + P30 -10G-> lc2-7B/P1 + P31 -10G-> lc2-8B/P1 + P32 -10G-> lc2-8B/P2 + P33 -10G-> lc2-8A/P1 + P34 -10G-> lc2-8A/P2 + P35 -10G-> lc2-8A/P3 + P36 -10G-> lc2-8B/P3 SUBSYSTEM LEAF lc2D - P1 -10G-> fc9A P19 - P2 -10G-> fc9B P36 - P3 -10G-> fc8A P19 - P4 -10G-> fc8B P36 - P5 -10G-> fc7A P19 - P6 -10G-> fc7B P36 - P7 -10G-> fc6A P19 - P8 -10G-> fc6B P36 - P9 -10G-> fc5A P19 - P10 -10G-> fc5B P36 - P11 -10G-> fc4A P19 
- P12 -10G-> fc4B P36 - P13 -10G-> fc1B P36 - P14 -10G-> fc1A P19 - P15 -10G-> fc2B P36 - P16 -10G-> fc2A P19 - P17 -10G-> fc3B P36 - P18 -10G-> fc3A P19 - P19 -10G-> lc2-9A/P3 - P20 -10G-> lc2-9B/P3 - P21 -10G-> lc2-9B/P2 - P22 -10G-> lc2-9B/P1 - P23 -10G-> lc2-9A/P2 - P24 -10G-> lc2-9A/P1 - P25 -10G-> lc2-10A/P3 - P26 -10G-> lc2-10B/P3 - P27 -10G-> lc2-10B/P2 - P28 -10G-> lc2-10B/P1 - P29 -10G-> lc2-10A/P2 - P30 -10G-> lc2-10A/P1 - P31 -10G-> lc2-11A/P1 - P32 -10G-> lc2-11A/P2 - P33 -10G-> lc2-11B/P1 - P34 -10G-> lc2-11B/P2 - P35 -10G-> lc2-11B/P3 - P36 -10G-> lc2-11A/P3 + P1 -10G-> fc8A P23 + P2 -10G-> fc8B P33 + P3 -10G-> fc7A P23 + P4 -10G-> fc7B P33 + P5 -10G-> fc6A P23 + P6 -10G-> fc6B P33 + P7 -10G-> fc5A P23 + P8 -10G-> fc5B P33 + P9 -10G-> fc4A P23 + P10 -10G-> fc4B P33 + P11 -10G-> fc3A P23 + P12 -10G-> fc3B P33 + P13 -10G-> fc0B P33 + P14 -10G-> fc0A P23 + P15 -10G-> fc1B P33 + P16 -10G-> fc1A P23 + P17 -10G-> fc2B P33 + P18 -10G-> fc2A P23 + P19 -10G-> lc2-9B/P3 + P20 -10G-> lc2-9A/P3 + P21 -10G-> lc2-9A/P2 + P22 -10G-> lc2-9A/P1 + P23 -10G-> lc2-9B/P2 + P24 -10G-> lc2-9B/P1 + P25 -10G-> lc2-10B/P3 + P26 -10G-> lc2-10A/P3 + P27 -10G-> lc2-10A/P2 + P28 -10G-> lc2-10A/P1 + P29 -10G-> lc2-10B/P2 + P30 -10G-> lc2-10B/P1 + P31 -10G-> lc2-11B/P1 + P32 -10G-> lc2-11B/P2 + P33 -10G-> lc2-11A/P1 + P34 -10G-> lc2-11A/P2 + P35 -10G-> lc2-11A/P3 + P36 -10G-> lc2-11B/P3 SUBSYSTEM LEAF lc3A - P1 -10G-> fc9B P34 - P2 -10G-> fc9A P27 - P3 -10G-> fc8B P34 - P4 -10G-> fc8A P27 - P5 -10G-> fc7B P34 - P6 -10G-> fc7A P27 - P7 -10G-> fc6B P34 - P8 -10G-> fc6A P27 - P9 -10G-> fc5B P34 - P10 -10G-> fc5A P27 - P11 -10G-> fc4B P34 - P12 -10G-> fc4A P27 - P13 -10G-> fc1A P27 - P14 -10G-> fc1B P34 - P15 -10G-> fc2A P27 - P16 -10G-> fc2B P34 - P17 -10G-> fc3A P27 - P18 -10G-> fc3B P34 - P19 -10G-> lc3-0A/P3 - P20 -10G-> lc3-0B/P3 - P21 -10G-> lc3-0B/P2 - P22 -10G-> lc3-0B/P1 - P23 -10G-> lc3-0A/P2 - P24 -10G-> lc3-0A/P1 - P25 -10G-> lc3-1A/P3 - P26 -10G-> lc3-1B/P3 - P27 -10G-> 
lc3-1B/P2 - P28 -10G-> lc3-1B/P1 - P29 -10G-> lc3-1A/P2 - P30 -10G-> lc3-1A/P1 - P31 -10G-> lc3-2A/P1 - P32 -10G-> lc3-2A/P2 - P33 -10G-> lc3-2B/P1 - P34 -10G-> lc3-2B/P2 - P35 -10G-> lc3-2B/P3 - P36 -10G-> lc3-2A/P3 + P1 -10G-> fc8B P13 + P2 -10G-> fc8A P36 + P3 -10G-> fc7B P13 + P4 -10G-> fc7A P36 + P5 -10G-> fc6B P13 + P6 -10G-> fc6A P36 + P7 -10G-> fc5B P13 + P8 -10G-> fc5A P36 + P9 -10G-> fc4B P13 + P10 -10G-> fc4A P36 + P11 -10G-> fc3B P13 + P12 -10G-> fc3A P36 + P13 -10G-> fc0A P36 + P14 -10G-> fc0B P13 + P15 -10G-> fc1A P36 + P16 -10G-> fc1B P13 + P17 -10G-> fc2A P36 + P18 -10G-> fc2B P13 + P19 -10G-> lc3-0B/P3 + P20 -10G-> lc3-0A/P3 + P21 -10G-> lc3-0A/P2 + P22 -10G-> lc3-0A/P1 + P23 -10G-> lc3-0B/P2 + P24 -10G-> lc3-0B/P1 + P25 -10G-> lc3-1B/P3 + P26 -10G-> lc3-1A/P3 + P27 -10G-> lc3-1A/P2 + P28 -10G-> lc3-1A/P1 + P29 -10G-> lc3-1B/P2 + P30 -10G-> lc3-1B/P1 + P31 -10G-> lc3-2B/P1 + P32 -10G-> lc3-2B/P2 + P33 -10G-> lc3-2A/P1 + P34 -10G-> lc3-2A/P2 + P35 -10G-> lc3-2A/P3 + P36 -10G-> lc3-2B/P3 SUBSYSTEM LEAF lc3B - P1 -10G-> fc9A P24 - P2 -10G-> fc9B P35 - P3 -10G-> fc8A P24 - P4 -10G-> fc8B P35 - P5 -10G-> fc7A P24 - P6 -10G-> fc7B P35 - P7 -10G-> fc6A P24 - P8 -10G-> fc6B P35 - P9 -10G-> fc5A P24 - P10 -10G-> fc5B P35 - P11 -10G-> fc4A P24 - P12 -10G-> fc4B P35 - P13 -10G-> fc1B P35 - P14 -10G-> fc1A P24 - P15 -10G-> fc2B P35 - P16 -10G-> fc2A P24 - P17 -10G-> fc3B P35 - P18 -10G-> fc3A P24 - P19 -10G-> lc3-3A/P3 - P20 -10G-> lc3-3B/P3 - P21 -10G-> lc3-3B/P2 - P22 -10G-> lc3-3B/P1 - P23 -10G-> lc3-3A/P2 - P24 -10G-> lc3-3A/P1 - P25 -10G-> lc3-4A/P3 - P26 -10G-> lc3-4B/P3 - P27 -10G-> lc3-4B/P2 - P28 -10G-> lc3-4B/P1 - P29 -10G-> lc3-4A/P2 - P30 -10G-> lc3-4A/P1 - P31 -10G-> lc3-5A/P1 - P32 -10G-> lc3-5A/P2 - P33 -10G-> lc3-5B/P1 - P34 -10G-> lc3-5B/P2 - P35 -10G-> lc3-5B/P3 - P36 -10G-> lc3-5A/P3 + P1 -10G-> fc8A P28 + P2 -10G-> fc8B P31 + P3 -10G-> fc7A P28 + P4 -10G-> fc7B P31 + P5 -10G-> fc6A P28 + P6 -10G-> fc6B P31 + P7 -10G-> fc5A P28 + P8 -10G-> 
fc5B P31 + P9 -10G-> fc4A P28 + P10 -10G-> fc4B P31 + P11 -10G-> fc3A P28 + P12 -10G-> fc3B P31 + P13 -10G-> fc0B P31 + P14 -10G-> fc0A P28 + P15 -10G-> fc1B P31 + P16 -10G-> fc1A P28 + P17 -10G-> fc2B P31 + P18 -10G-> fc2A P28 + P19 -10G-> lc3-3B/P3 + P20 -10G-> lc3-3A/P3 + P21 -10G-> lc3-3A/P2 + P22 -10G-> lc3-3A/P1 + P23 -10G-> lc3-3B/P2 + P24 -10G-> lc3-3B/P1 + P25 -10G-> lc3-4B/P3 + P26 -10G-> lc3-4A/P3 + P27 -10G-> lc3-4A/P2 + P28 -10G-> lc3-4A/P1 + P29 -10G-> lc3-4B/P2 + P30 -10G-> lc3-4B/P1 + P31 -10G-> lc3-5B/P1 + P32 -10G-> lc3-5B/P2 + P33 -10G-> lc3-5A/P1 + P34 -10G-> lc3-5A/P2 + P35 -10G-> lc3-5A/P3 + P36 -10G-> lc3-5B/P3 SUBSYSTEM LEAF lc3C - P1 -10G-> fc9B P30 - P2 -10G-> fc9A P25 - P3 -10G-> fc8B P30 - P4 -10G-> fc8A P25 - P5 -10G-> fc7B P30 - P6 -10G-> fc7A P25 - P7 -10G-> fc6B P30 - P8 -10G-> fc6A P25 - P9 -10G-> fc5B P30 - P10 -10G-> fc5A P25 - P11 -10G-> fc4B P30 - P12 -10G-> fc4A P25 - P13 -10G-> fc1A P25 - P14 -10G-> fc1B P30 - P15 -10G-> fc2A P25 - P16 -10G-> fc2B P30 - P17 -10G-> fc3A P25 - P18 -10G-> fc3B P30 - P19 -10G-> lc3-6A/P3 - P20 -10G-> lc3-6B/P3 - P21 -10G-> lc3-6B/P2 - P22 -10G-> lc3-6B/P1 - P23 -10G-> lc3-6A/P2 - P24 -10G-> lc3-6A/P1 - P25 -10G-> lc3-7A/P3 - P26 -10G-> lc3-7B/P3 - P27 -10G-> lc3-7B/P2 - P28 -10G-> lc3-7B/P1 - P29 -10G-> lc3-7A/P2 - P30 -10G-> lc3-7A/P1 - P31 -10G-> lc3-8A/P1 - P32 -10G-> lc3-8A/P2 - P33 -10G-> lc3-8B/P1 - P34 -10G-> lc3-8B/P2 - P35 -10G-> lc3-8B/P3 - P36 -10G-> lc3-8A/P3 + P1 -10G-> fc8B P32 + P2 -10G-> fc8A P29 + P3 -10G-> fc7B P32 + P4 -10G-> fc7A P29 + P5 -10G-> fc6B P32 + P6 -10G-> fc6A P29 + P7 -10G-> fc5B P32 + P8 -10G-> fc5A P29 + P9 -10G-> fc4B P32 + P10 -10G-> fc4A P29 + P11 -10G-> fc3B P32 + P12 -10G-> fc3A P29 + P13 -10G-> fc0A P29 + P14 -10G-> fc0B P32 + P15 -10G-> fc1A P29 + P16 -10G-> fc1B P32 + P17 -10G-> fc2A P29 + P18 -10G-> fc2B P32 + P19 -10G-> lc3-6B/P3 + P20 -10G-> lc3-6A/P3 + P21 -10G-> lc3-6A/P2 + P22 -10G-> lc3-6A/P1 + P23 -10G-> lc3-6B/P2 + P24 -10G-> lc3-6B/P1 + P25 
-10G-> lc3-7B/P3 + P26 -10G-> lc3-7A/P3 + P27 -10G-> lc3-7A/P2 + P28 -10G-> lc3-7A/P1 + P29 -10G-> lc3-7B/P2 + P30 -10G-> lc3-7B/P1 + P31 -10G-> lc3-8B/P1 + P32 -10G-> lc3-8B/P2 + P33 -10G-> lc3-8A/P1 + P34 -10G-> lc3-8A/P2 + P35 -10G-> lc3-8A/P3 + P36 -10G-> lc3-8B/P3 SUBSYSTEM LEAF lc3D - P1 -10G-> fc9A P23 - P2 -10G-> fc9B P33 - P3 -10G-> fc8A P23 - P4 -10G-> fc8B P33 - P5 -10G-> fc7A P23 - P6 -10G-> fc7B P33 - P7 -10G-> fc6A P23 - P8 -10G-> fc6B P33 - P9 -10G-> fc5A P23 - P10 -10G-> fc5B P33 - P11 -10G-> fc4A P23 - P12 -10G-> fc4B P33 - P13 -10G-> fc1B P33 - P14 -10G-> fc1A P23 - P15 -10G-> fc2B P33 - P16 -10G-> fc2A P23 - P17 -10G-> fc3B P33 - P18 -10G-> fc3A P23 - P19 -10G-> lc3-9A/P3 - P20 -10G-> lc3-9B/P3 - P21 -10G-> lc3-9B/P2 - P22 -10G-> lc3-9B/P1 - P23 -10G-> lc3-9A/P2 - P24 -10G-> lc3-9A/P1 - P25 -10G-> lc3-10A/P3 - P26 -10G-> lc3-10B/P3 - P27 -10G-> lc3-10B/P2 - P28 -10G-> lc3-10B/P1 - P29 -10G-> lc3-10A/P2 - P30 -10G-> lc3-10A/P1 - P31 -10G-> lc3-11A/P1 - P32 -10G-> lc3-11A/P2 - P33 -10G-> lc3-11B/P1 - P34 -10G-> lc3-11B/P2 - P35 -10G-> lc3-11B/P3 - P36 -10G-> lc3-11A/P3 + P1 -10G-> fc8A P26 + P2 -10G-> fc8B P14 + P3 -10G-> fc7A P26 + P4 -10G-> fc7B P14 + P5 -10G-> fc6A P26 + P6 -10G-> fc6B P14 + P7 -10G-> fc5A P26 + P8 -10G-> fc5B P14 + P9 -10G-> fc4A P26 + P10 -10G-> fc4B P14 + P11 -10G-> fc3A P26 + P12 -10G-> fc3B P14 + P13 -10G-> fc0B P14 + P14 -10G-> fc0A P26 + P15 -10G-> fc1B P14 + P16 -10G-> fc1A P26 + P17 -10G-> fc2B P14 + P18 -10G-> fc2A P26 + P19 -10G-> lc3-9B/P3 + P20 -10G-> lc3-9A/P3 + P21 -10G-> lc3-9A/P2 + P22 -10G-> lc3-9A/P1 + P23 -10G-> lc3-9B/P2 + P24 -10G-> lc3-9B/P1 + P25 -10G-> lc3-10B/P3 + P26 -10G-> lc3-10A/P3 + P27 -10G-> lc3-10A/P2 + P28 -10G-> lc3-10A/P1 + P29 -10G-> lc3-10B/P2 + P30 -10G-> lc3-10B/P1 + P31 -10G-> lc3-11B/P1 + P32 -10G-> lc3-11B/P2 + P33 -10G-> lc3-11A/P1 + P34 -10G-> lc3-11A/P2 + P35 -10G-> lc3-11A/P3 + P36 -10G-> lc3-11B/P3 SUBSYSTEM LEAF lc4A - P1 -10G-> fc9B P13 - P2 -10G-> fc9A P36 - P3 -10G-> fc8B P13 
- P4 -10G-> fc8A P36 - P5 -10G-> fc7B P13 - P6 -10G-> fc7A P36 - P7 -10G-> fc6B P13 - P8 -10G-> fc6A P36 - P9 -10G-> fc5B P13 - P10 -10G-> fc5A P36 - P11 -10G-> fc4B P13 - P12 -10G-> fc4A P36 - P13 -10G-> fc1A P36 - P14 -10G-> fc1B P13 - P15 -10G-> fc2A P36 - P16 -10G-> fc2B P13 - P17 -10G-> fc3A P36 - P18 -10G-> fc3B P13 - P19 -10G-> lc4-0A/P3 - P20 -10G-> lc4-0B/P3 - P21 -10G-> lc4-0B/P2 - P22 -10G-> lc4-0B/P1 - P23 -10G-> lc4-0A/P2 - P24 -10G-> lc4-0A/P1 - P25 -10G-> lc4-1A/P3 - P26 -10G-> lc4-1B/P3 - P27 -10G-> lc4-1B/P2 - P28 -10G-> lc4-1B/P1 - P29 -10G-> lc4-1A/P2 - P30 -10G-> lc4-1A/P1 - P31 -10G-> lc4-2A/P1 - P32 -10G-> lc4-2A/P2 - P33 -10G-> lc4-2B/P1 - P34 -10G-> lc4-2B/P2 - P35 -10G-> lc4-2B/P3 - P36 -10G-> lc4-2A/P3 + P1 -10G-> fc8B P17 + P2 -10G-> fc8A P33 + P3 -10G-> fc7B P17 + P4 -10G-> fc7A P33 + P5 -10G-> fc6B P17 + P6 -10G-> fc6A P33 + P7 -10G-> fc5B P17 + P8 -10G-> fc5A P33 + P9 -10G-> fc4B P17 + P10 -10G-> fc4A P33 + P11 -10G-> fc3B P17 + P12 -10G-> fc3A P33 + P13 -10G-> fc0A P33 + P14 -10G-> fc0B P17 + P15 -10G-> fc1A P33 + P16 -10G-> fc1B P17 + P17 -10G-> fc2A P33 + P18 -10G-> fc2B P17 + P19 -10G-> lc4-0B/P3 + P20 -10G-> lc4-0A/P3 + P21 -10G-> lc4-0A/P2 + P22 -10G-> lc4-0A/P1 + P23 -10G-> lc4-0B/P2 + P24 -10G-> lc4-0B/P1 + P25 -10G-> lc4-1B/P3 + P26 -10G-> lc4-1A/P3 + P27 -10G-> lc4-1A/P2 + P28 -10G-> lc4-1A/P1 + P29 -10G-> lc4-1B/P2 + P30 -10G-> lc4-1B/P1 + P31 -10G-> lc4-2B/P1 + P32 -10G-> lc4-2B/P2 + P33 -10G-> lc4-2A/P1 + P34 -10G-> lc4-2A/P2 + P35 -10G-> lc4-2A/P3 + P36 -10G-> lc4-2B/P3 SUBSYSTEM LEAF lc4B - P1 -10G-> fc9A P28 - P2 -10G-> fc9B P31 - P3 -10G-> fc8A P28 - P4 -10G-> fc8B P31 - P5 -10G-> fc7A P28 - P6 -10G-> fc7B P31 - P7 -10G-> fc6A P28 - P8 -10G-> fc6B P31 - P9 -10G-> fc5A P28 - P10 -10G-> fc5B P31 - P11 -10G-> fc4A P28 - P12 -10G-> fc4B P31 - P13 -10G-> fc1B P31 - P14 -10G-> fc1A P28 - P15 -10G-> fc2B P31 - P16 -10G-> fc2A P28 - P17 -10G-> fc3B P31 - P18 -10G-> fc3A P28 - P19 -10G-> lc4-3A/P3 - P20 -10G-> lc4-3B/P3 - P21 
-10G-> lc4-3B/P2 - P22 -10G-> lc4-3B/P1 - P23 -10G-> lc4-3A/P2 - P24 -10G-> lc4-3A/P1 - P25 -10G-> lc4-4A/P3 - P26 -10G-> lc4-4B/P3 - P27 -10G-> lc4-4B/P2 - P28 -10G-> lc4-4B/P1 - P29 -10G-> lc4-4A/P2 - P30 -10G-> lc4-4A/P1 - P31 -10G-> lc4-5A/P1 - P32 -10G-> lc4-5A/P2 - P33 -10G-> lc4-5B/P1 - P34 -10G-> lc4-5B/P2 - P35 -10G-> lc4-5B/P3 - P36 -10G-> lc4-5A/P3 + P1 -10G-> fc8A P34 + P2 -10G-> fc8B P16 + P3 -10G-> fc7A P34 + P4 -10G-> fc7B P16 + P5 -10G-> fc6A P34 + P6 -10G-> fc6B P16 + P7 -10G-> fc5A P34 + P8 -10G-> fc5B P16 + P9 -10G-> fc4A P34 + P10 -10G-> fc4B P16 + P11 -10G-> fc3A P34 + P12 -10G-> fc3B P16 + P13 -10G-> fc0B P16 + P14 -10G-> fc0A P34 + P15 -10G-> fc1B P16 + P16 -10G-> fc1A P34 + P17 -10G-> fc2B P16 + P18 -10G-> fc2A P34 + P19 -10G-> lc4-3B/P3 + P20 -10G-> lc4-3A/P3 + P21 -10G-> lc4-3A/P2 + P22 -10G-> lc4-3A/P1 + P23 -10G-> lc4-3B/P2 + P24 -10G-> lc4-3B/P1 + P25 -10G-> lc4-4B/P3 + P26 -10G-> lc4-4A/P3 + P27 -10G-> lc4-4A/P2 + P28 -10G-> lc4-4A/P1 + P29 -10G-> lc4-4B/P2 + P30 -10G-> lc4-4B/P1 + P31 -10G-> lc4-5B/P1 + P32 -10G-> lc4-5B/P2 + P33 -10G-> lc4-5A/P1 + P34 -10G-> lc4-5A/P2 + P35 -10G-> lc4-5A/P3 + P36 -10G-> lc4-5B/P3 SUBSYSTEM LEAF lc4C - P1 -10G-> fc9B P32 - P2 -10G-> fc9A P29 - P3 -10G-> fc8B P32 - P4 -10G-> fc8A P29 - P5 -10G-> fc7B P32 - P6 -10G-> fc7A P29 - P7 -10G-> fc6B P32 - P8 -10G-> fc6A P29 - P9 -10G-> fc5B P32 - P10 -10G-> fc5A P29 - P11 -10G-> fc4B P32 - P12 -10G-> fc4A P29 - P13 -10G-> fc1A P29 - P14 -10G-> fc1B P32 - P15 -10G-> fc2A P29 - P16 -10G-> fc2B P32 - P17 -10G-> fc3A P29 - P18 -10G-> fc3B P32 - P19 -10G-> lc4-6A/P3 - P20 -10G-> lc4-6B/P3 - P21 -10G-> lc4-6B/P2 - P22 -10G-> lc4-6B/P1 - P23 -10G-> lc4-6A/P2 - P24 -10G-> lc4-6A/P1 - P25 -10G-> lc4-7A/P3 - P26 -10G-> lc4-7B/P3 - P27 -10G-> lc4-7B/P2 - P28 -10G-> lc4-7B/P1 - P29 -10G-> lc4-7A/P2 - P30 -10G-> lc4-7A/P1 - P31 -10G-> lc4-8A/P1 - P32 -10G-> lc4-8A/P2 - P33 -10G-> lc4-8B/P1 - P34 -10G-> lc4-8B/P2 - P35 -10G-> lc4-8B/P3 - P36 -10G-> lc4-8A/P3 + P1 -10G-> 
fc8B P15 + P2 -10G-> fc8A P35 + P3 -10G-> fc7B P15 + P4 -10G-> fc7A P35 + P5 -10G-> fc6B P15 + P6 -10G-> fc6A P35 + P7 -10G-> fc5B P15 + P8 -10G-> fc5A P35 + P9 -10G-> fc4B P15 + P10 -10G-> fc4A P35 + P11 -10G-> fc3B P15 + P12 -10G-> fc3A P35 + P13 -10G-> fc0A P35 + P14 -10G-> fc0B P15 + P15 -10G-> fc1A P35 + P16 -10G-> fc1B P15 + P17 -10G-> fc2A P35 + P18 -10G-> fc2B P15 + P19 -10G-> lc4-6B/P3 + P20 -10G-> lc4-6A/P3 + P21 -10G-> lc4-6A/P2 + P22 -10G-> lc4-6A/P1 + P23 -10G-> lc4-6B/P2 + P24 -10G-> lc4-6B/P1 + P25 -10G-> lc4-7B/P3 + P26 -10G-> lc4-7A/P3 + P27 -10G-> lc4-7A/P2 + P28 -10G-> lc4-7A/P1 + P29 -10G-> lc4-7B/P2 + P30 -10G-> lc4-7B/P1 + P31 -10G-> lc4-8B/P1 + P32 -10G-> lc4-8B/P2 + P33 -10G-> lc4-8A/P1 + P34 -10G-> lc4-8A/P2 + P35 -10G-> lc4-8A/P3 + P36 -10G-> lc4-8B/P3 SUBSYSTEM LEAF lc4D - P1 -10G-> fc9A P26 - P2 -10G-> fc9B P14 - P3 -10G-> fc8A P26 - P4 -10G-> fc8B P14 - P5 -10G-> fc7A P26 - P6 -10G-> fc7B P14 - P7 -10G-> fc6A P26 - P8 -10G-> fc6B P14 - P9 -10G-> fc5A P26 - P10 -10G-> fc5B P14 - P11 -10G-> fc4A P26 - P12 -10G-> fc4B P14 - P13 -10G-> fc1B P14 - P14 -10G-> fc1A P26 - P15 -10G-> fc2B P14 - P16 -10G-> fc2A P26 - P17 -10G-> fc3B P14 - P18 -10G-> fc3A P26 - P19 -10G-> lc4-9A/P3 - P20 -10G-> lc4-9B/P3 - P21 -10G-> lc4-9B/P2 - P22 -10G-> lc4-9B/P1 - P23 -10G-> lc4-9A/P2 - P24 -10G-> lc4-9A/P1 - P25 -10G-> lc4-10A/P3 - P26 -10G-> lc4-10B/P3 - P27 -10G-> lc4-10B/P2 - P28 -10G-> lc4-10B/P1 - P29 -10G-> lc4-10A/P2 - P30 -10G-> lc4-10A/P1 - P31 -10G-> lc4-11A/P1 - P32 -10G-> lc4-11A/P2 - P33 -10G-> lc4-11B/P1 - P34 -10G-> lc4-11B/P2 - P35 -10G-> lc4-11B/P3 - P36 -10G-> lc4-11A/P3 + P1 -10G-> fc8A P30 + P2 -10G-> fc8B P12 + P3 -10G-> fc7A P30 + P4 -10G-> fc7B P12 + P5 -10G-> fc6A P30 + P6 -10G-> fc6B P12 + P7 -10G-> fc5A P30 + P8 -10G-> fc5B P12 + P9 -10G-> fc4A P30 + P10 -10G-> fc4B P12 + P11 -10G-> fc3A P30 + P12 -10G-> fc3B P12 + P13 -10G-> fc0B P12 + P14 -10G-> fc0A P30 + P15 -10G-> fc1B P12 + P16 -10G-> fc1A P30 + P17 -10G-> fc2B P12 + P18 -10G-> 
fc2A P30 + P19 -10G-> lc4-9B/P3 + P20 -10G-> lc4-9A/P3 + P21 -10G-> lc4-9A/P2 + P22 -10G-> lc4-9A/P1 + P23 -10G-> lc4-9B/P2 + P24 -10G-> lc4-9B/P1 + P25 -10G-> lc4-10B/P3 + P26 -10G-> lc4-10A/P3 + P27 -10G-> lc4-10A/P2 + P28 -10G-> lc4-10A/P1 + P29 -10G-> lc4-10B/P2 + P30 -10G-> lc4-10B/P1 + P31 -10G-> lc4-11B/P1 + P32 -10G-> lc4-11B/P2 + P33 -10G-> lc4-11A/P1 + P34 -10G-> lc4-11A/P2 + P35 -10G-> lc4-11A/P3 + P36 -10G-> lc4-11B/P3 SUBSYSTEM LEAF lc5A - P1 -10G-> fc9B P17 - P2 -10G-> fc9A P33 - P3 -10G-> fc8B P17 - P4 -10G-> fc8A P33 - P5 -10G-> fc7B P17 - P6 -10G-> fc7A P33 - P7 -10G-> fc6B P17 - P8 -10G-> fc6A P33 - P9 -10G-> fc5B P17 - P10 -10G-> fc5A P33 - P11 -10G-> fc4B P17 - P12 -10G-> fc4A P33 - P13 -10G-> fc1A P33 - P14 -10G-> fc1B P17 - P15 -10G-> fc2A P33 - P16 -10G-> fc2B P17 - P17 -10G-> fc3A P33 - P18 -10G-> fc3B P17 - P19 -10G-> lc5-0A/P3 - P20 -10G-> lc5-0B/P3 - P21 -10G-> lc5-0B/P2 - P22 -10G-> lc5-0B/P1 - P23 -10G-> lc5-0A/P2 - P24 -10G-> lc5-0A/P1 - P25 -10G-> lc5-1A/P3 - P26 -10G-> lc5-1B/P3 - P27 -10G-> lc5-1B/P2 - P28 -10G-> lc5-1B/P1 - P29 -10G-> lc5-1A/P2 - P30 -10G-> lc5-1A/P1 - P31 -10G-> lc5-2A/P1 - P32 -10G-> lc5-2A/P2 - P33 -10G-> lc5-2B/P1 - P34 -10G-> lc5-2B/P2 - P35 -10G-> lc5-2B/P3 - P36 -10G-> lc5-2A/P3 + P1 -10G-> fc8B P11 + P2 -10G-> fc8A P14 + P3 -10G-> fc7B P11 + P4 -10G-> fc7A P14 + P5 -10G-> fc6B P11 + P6 -10G-> fc6A P14 + P7 -10G-> fc5B P11 + P8 -10G-> fc5A P14 + P9 -10G-> fc4B P11 + P10 -10G-> fc4A P14 + P11 -10G-> fc3B P11 + P12 -10G-> fc3A P14 + P13 -10G-> fc0A P14 + P14 -10G-> fc0B P11 + P15 -10G-> fc1A P14 + P16 -10G-> fc1B P11 + P17 -10G-> fc2A P14 + P18 -10G-> fc2B P11 + P19 -10G-> lc5-0B/P3 + P20 -10G-> lc5-0A/P3 + P21 -10G-> lc5-0A/P2 + P22 -10G-> lc5-0A/P1 + P23 -10G-> lc5-0B/P2 + P24 -10G-> lc5-0B/P1 + P25 -10G-> lc5-1B/P3 + P26 -10G-> lc5-1A/P3 + P27 -10G-> lc5-1A/P2 + P28 -10G-> lc5-1A/P1 + P29 -10G-> lc5-1B/P2 + P30 -10G-> lc5-1B/P1 + P31 -10G-> lc5-2B/P1 + P32 -10G-> lc5-2B/P2 + P33 -10G-> lc5-2A/P1 + P34 
-10G-> lc5-2A/P2 + P35 -10G-> lc5-2A/P3 + P36 -10G-> lc5-2B/P3 SUBSYSTEM LEAF lc5B - P1 -10G-> fc9A P34 - P2 -10G-> fc9B P16 - P3 -10G-> fc8A P34 - P4 -10G-> fc8B P16 - P5 -10G-> fc7A P34 - P6 -10G-> fc7B P16 - P7 -10G-> fc6A P34 - P8 -10G-> fc6B P16 - P9 -10G-> fc5A P34 - P10 -10G-> fc5B P16 - P11 -10G-> fc4A P34 - P12 -10G-> fc4B P16 - P13 -10G-> fc1B P16 - P14 -10G-> fc1A P34 - P15 -10G-> fc2B P16 - P16 -10G-> fc2A P34 - P17 -10G-> fc3B P16 - P18 -10G-> fc3A P34 - P19 -10G-> lc5-3A/P3 - P20 -10G-> lc5-3B/P3 - P21 -10G-> lc5-3B/P2 - P22 -10G-> lc5-3B/P1 - P23 -10G-> lc5-3A/P2 - P24 -10G-> lc5-3A/P1 - P25 -10G-> lc5-4A/P3 - P26 -10G-> lc5-4B/P3 - P27 -10G-> lc5-4B/P2 - P28 -10G-> lc5-4B/P1 - P29 -10G-> lc5-4A/P2 - P30 -10G-> lc5-4A/P1 - P31 -10G-> lc5-5A/P1 - P32 -10G-> lc5-5A/P2 - P33 -10G-> lc5-5B/P1 - P34 -10G-> lc5-5B/P2 - P35 -10G-> lc5-5B/P3 - P36 -10G-> lc5-5A/P3 + P1 -10G-> fc8A P13 + P2 -10G-> fc8B P10 + P3 -10G-> fc7A P13 + P4 -10G-> fc7B P10 + P5 -10G-> fc6A P13 + P6 -10G-> fc6B P10 + P7 -10G-> fc5A P13 + P8 -10G-> fc5B P10 + P9 -10G-> fc4A P13 + P10 -10G-> fc4B P10 + P11 -10G-> fc3A P13 + P12 -10G-> fc3B P10 + P13 -10G-> fc0B P10 + P14 -10G-> fc0A P13 + P15 -10G-> fc1B P10 + P16 -10G-> fc1A P13 + P17 -10G-> fc2B P10 + P18 -10G-> fc2A P13 + P19 -10G-> lc5-3B/P3 + P20 -10G-> lc5-3A/P3 + P21 -10G-> lc5-3A/P2 + P22 -10G-> lc5-3A/P1 + P23 -10G-> lc5-3B/P2 + P24 -10G-> lc5-3B/P1 + P25 -10G-> lc5-4B/P3 + P26 -10G-> lc5-4A/P3 + P27 -10G-> lc5-4A/P2 + P28 -10G-> lc5-4A/P1 + P29 -10G-> lc5-4B/P2 + P30 -10G-> lc5-4B/P1 + P31 -10G-> lc5-5B/P1 + P32 -10G-> lc5-5B/P2 + P33 -10G-> lc5-5A/P1 + P34 -10G-> lc5-5A/P2 + P35 -10G-> lc5-5A/P3 + P36 -10G-> lc5-5B/P3 SUBSYSTEM LEAF lc5C - P1 -10G-> fc9B P15 - P2 -10G-> fc9A P35 - P3 -10G-> fc8B P15 - P4 -10G-> fc8A P35 - P5 -10G-> fc7B P15 - P6 -10G-> fc7A P35 - P7 -10G-> fc6B P15 - P8 -10G-> fc6A P35 - P9 -10G-> fc5B P15 - P10 -10G-> fc5A P35 - P11 -10G-> fc4B P15 - P12 -10G-> fc4A P35 - P13 -10G-> fc1A P35 - P14 -10G-> fc1B 
P15 - P15 -10G-> fc2A P35 - P16 -10G-> fc2B P15 - P17 -10G-> fc3A P35 - P18 -10G-> fc3B P15 - P19 -10G-> lc5-6A/P3 - P20 -10G-> lc5-6B/P3 - P21 -10G-> lc5-6B/P2 - P22 -10G-> lc5-6B/P1 - P23 -10G-> lc5-6A/P2 - P24 -10G-> lc5-6A/P1 - P25 -10G-> lc5-7A/P3 - P26 -10G-> lc5-7B/P3 - P27 -10G-> lc5-7B/P2 - P28 -10G-> lc5-7B/P1 - P29 -10G-> lc5-7A/P2 - P30 -10G-> lc5-7A/P1 - P31 -10G-> lc5-8A/P1 - P32 -10G-> lc5-8A/P2 - P33 -10G-> lc5-8B/P1 - P34 -10G-> lc5-8B/P2 - P35 -10G-> lc5-8B/P3 - P36 -10G-> lc5-8A/P3 + P1 -10G-> fc8B P18 + P2 -10G-> fc8A P31 + P3 -10G-> fc7B P18 + P4 -10G-> fc7A P31 + P5 -10G-> fc6B P18 + P6 -10G-> fc6A P31 + P7 -10G-> fc5B P18 + P8 -10G-> fc5A P31 + P9 -10G-> fc4B P18 + P10 -10G-> fc4A P31 + P11 -10G-> fc3B P18 + P12 -10G-> fc3A P31 + P13 -10G-> fc0A P31 + P14 -10G-> fc0B P18 + P15 -10G-> fc1A P31 + P16 -10G-> fc1B P18 + P17 -10G-> fc2A P31 + P18 -10G-> fc2B P18 + P19 -10G-> lc5-6B/P3 + P20 -10G-> lc5-6A/P3 + P21 -10G-> lc5-6A/P2 + P22 -10G-> lc5-6A/P1 + P23 -10G-> lc5-6B/P2 + P24 -10G-> lc5-6B/P1 + P25 -10G-> lc5-7B/P3 + P26 -10G-> lc5-7A/P3 + P27 -10G-> lc5-7A/P2 + P28 -10G-> lc5-7A/P1 + P29 -10G-> lc5-7B/P2 + P30 -10G-> lc5-7B/P1 + P31 -10G-> lc5-8B/P1 + P32 -10G-> lc5-8B/P2 + P33 -10G-> lc5-8A/P1 + P34 -10G-> lc5-8A/P2 + P35 -10G-> lc5-8A/P3 + P36 -10G-> lc5-8B/P3 SUBSYSTEM LEAF lc5D - P1 -10G-> fc9A P30 - P2 -10G-> fc9B P12 - P3 -10G-> fc8A P30 - P4 -10G-> fc8B P12 - P5 -10G-> fc7A P30 - P6 -10G-> fc7B P12 - P7 -10G-> fc6A P30 - P8 -10G-> fc6B P12 - P9 -10G-> fc5A P30 - P10 -10G-> fc5B P12 - P11 -10G-> fc4A P30 - P12 -10G-> fc4B P12 - P13 -10G-> fc1B P12 - P14 -10G-> fc1A P30 - P15 -10G-> fc2B P12 - P16 -10G-> fc2A P30 - P17 -10G-> fc3B P12 - P18 -10G-> fc3A P30 - P19 -10G-> lc5-9A/P3 - P20 -10G-> lc5-9B/P3 - P21 -10G-> lc5-9B/P2 - P22 -10G-> lc5-9B/P1 - P23 -10G-> lc5-9A/P2 - P24 -10G-> lc5-9A/P1 - P25 -10G-> lc5-10A/P3 - P26 -10G-> lc5-10B/P3 - P27 -10G-> lc5-10B/P2 - P28 -10G-> lc5-10B/P1 - P29 -10G-> lc5-10A/P2 - P30 -10G-> lc5-10A/P1 - 
P31 -10G-> lc5-11A/P1 - P32 -10G-> lc5-11A/P2 - P33 -10G-> lc5-11B/P1 - P34 -10G-> lc5-11B/P2 - P35 -10G-> lc5-11B/P3 - P36 -10G-> lc5-11A/P3 + P1 -10G-> fc8A P32 + P2 -10G-> fc8B P8 + P3 -10G-> fc7A P32 + P4 -10G-> fc7B P8 + P5 -10G-> fc6A P32 + P6 -10G-> fc6B P8 + P7 -10G-> fc5A P32 + P8 -10G-> fc5B P8 + P9 -10G-> fc4A P32 + P10 -10G-> fc4B P8 + P11 -10G-> fc3A P32 + P12 -10G-> fc3B P8 + P13 -10G-> fc0B P8 + P14 -10G-> fc0A P32 + P15 -10G-> fc1B P8 + P16 -10G-> fc1A P32 + P17 -10G-> fc2B P8 + P18 -10G-> fc2A P32 + P19 -10G-> lc5-9B/P3 + P20 -10G-> lc5-9A/P3 + P21 -10G-> lc5-9A/P2 + P22 -10G-> lc5-9A/P1 + P23 -10G-> lc5-9B/P2 + P24 -10G-> lc5-9B/P1 + P25 -10G-> lc5-10B/P3 + P26 -10G-> lc5-10A/P3 + P27 -10G-> lc5-10A/P2 + P28 -10G-> lc5-10A/P1 + P29 -10G-> lc5-10B/P2 + P30 -10G-> lc5-10B/P1 + P31 -10G-> lc5-11B/P1 + P32 -10G-> lc5-11B/P2 + P33 -10G-> lc5-11A/P1 + P34 -10G-> lc5-11A/P2 + P35 -10G-> lc5-11A/P3 + P36 -10G-> lc5-11B/P3 SUBSYSTEM LEAF lc6A - P1 -10G-> fc9B P11 - P2 -10G-> fc9A P14 - P3 -10G-> fc8B P11 - P4 -10G-> fc8A P14 - P5 -10G-> fc7B P11 - P6 -10G-> fc7A P14 - P7 -10G-> fc6B P11 - P8 -10G-> fc6A P14 - P9 -10G-> fc5B P11 - P10 -10G-> fc5A P14 - P11 -10G-> fc4B P11 - P12 -10G-> fc4A P14 - P13 -10G-> fc1A P14 - P14 -10G-> fc1B P11 - P15 -10G-> fc2A P14 - P16 -10G-> fc2B P11 - P17 -10G-> fc3A P14 - P18 -10G-> fc3B P11 - P19 -10G-> lc6-0A/P3 - P20 -10G-> lc6-0B/P3 - P21 -10G-> lc6-0B/P2 - P22 -10G-> lc6-0B/P1 - P23 -10G-> lc6-0A/P2 - P24 -10G-> lc6-0A/P1 - P25 -10G-> lc6-1A/P3 - P26 -10G-> lc6-1B/P3 - P27 -10G-> lc6-1B/P2 - P28 -10G-> lc6-1B/P1 - P29 -10G-> lc6-1A/P2 - P30 -10G-> lc6-1A/P1 - P31 -10G-> lc6-2A/P1 - P32 -10G-> lc6-2A/P2 - P33 -10G-> lc6-2B/P1 - P34 -10G-> lc6-2B/P2 - P35 -10G-> lc6-2B/P3 - P36 -10G-> lc6-2A/P3 + P1 -10G-> fc8B P7 + P2 -10G-> fc8A P12 + P3 -10G-> fc7B P7 + P4 -10G-> fc7A P12 + P5 -10G-> fc6B P7 + P6 -10G-> fc6A P12 + P7 -10G-> fc5B P7 + P8 -10G-> fc5A P12 + P9 -10G-> fc4B P7 + P10 -10G-> fc4A P12 + P11 -10G-> fc3B P7 + P12 
-10G-> fc3A P12 + P13 -10G-> fc0A P12 + P14 -10G-> fc0B P7 + P15 -10G-> fc1A P12 + P16 -10G-> fc1B P7 + P17 -10G-> fc2A P12 + P18 -10G-> fc2B P7 + P19 -10G-> lc6-0B/P3 + P20 -10G-> lc6-0A/P3 + P21 -10G-> lc6-0A/P2 + P22 -10G-> lc6-0A/P1 + P23 -10G-> lc6-0B/P2 + P24 -10G-> lc6-0B/P1 + P25 -10G-> lc6-1B/P3 + P26 -10G-> lc6-1A/P3 + P27 -10G-> lc6-1A/P2 + P28 -10G-> lc6-1A/P1 + P29 -10G-> lc6-1B/P2 + P30 -10G-> lc6-1B/P1 + P31 -10G-> lc6-2B/P1 + P32 -10G-> lc6-2B/P2 + P33 -10G-> lc6-2A/P1 + P34 -10G-> lc6-2A/P2 + P35 -10G-> lc6-2A/P3 + P36 -10G-> lc6-2B/P3 SUBSYSTEM LEAF lc6B - P1 -10G-> fc9A P13 - P2 -10G-> fc9B P10 - P3 -10G-> fc8A P13 - P4 -10G-> fc8B P10 - P5 -10G-> fc7A P13 - P6 -10G-> fc7B P10 - P7 -10G-> fc6A P13 - P8 -10G-> fc6B P10 - P9 -10G-> fc5A P13 - P10 -10G-> fc5B P10 - P11 -10G-> fc4A P13 - P12 -10G-> fc4B P10 - P13 -10G-> fc1B P10 - P14 -10G-> fc1A P13 - P15 -10G-> fc2B P10 - P16 -10G-> fc2A P13 - P17 -10G-> fc3B P10 - P18 -10G-> fc3A P13 - P19 -10G-> lc6-3A/P3 - P20 -10G-> lc6-3B/P3 - P21 -10G-> lc6-3B/P2 - P22 -10G-> lc6-3B/P1 - P23 -10G-> lc6-3A/P2 - P24 -10G-> lc6-3A/P1 - P25 -10G-> lc6-4A/P3 - P26 -10G-> lc6-4B/P3 - P27 -10G-> lc6-4B/P2 - P28 -10G-> lc6-4B/P1 - P29 -10G-> lc6-4A/P2 - P30 -10G-> lc6-4A/P1 - P31 -10G-> lc6-5A/P1 - P32 -10G-> lc6-5A/P2 - P33 -10G-> lc6-5B/P1 - P34 -10G-> lc6-5B/P2 - P35 -10G-> lc6-5B/P3 - P36 -10G-> lc6-5A/P3 + P1 -10G-> fc8A P17 + P2 -10G-> fc8B P6 + P3 -10G-> fc7A P17 + P4 -10G-> fc7B P6 + P5 -10G-> fc6A P17 + P6 -10G-> fc6B P6 + P7 -10G-> fc5A P17 + P8 -10G-> fc5B P6 + P9 -10G-> fc4A P17 + P10 -10G-> fc4B P6 + P11 -10G-> fc3A P17 + P12 -10G-> fc3B P6 + P13 -10G-> fc0B P6 + P14 -10G-> fc0A P17 + P15 -10G-> fc1B P6 + P16 -10G-> fc1A P17 + P17 -10G-> fc2B P6 + P18 -10G-> fc2A P17 + P19 -10G-> lc6-3B/P3 + P20 -10G-> lc6-3A/P3 + P21 -10G-> lc6-3A/P2 + P22 -10G-> lc6-3A/P1 + P23 -10G-> lc6-3B/P2 + P24 -10G-> lc6-3B/P1 + P25 -10G-> lc6-4B/P3 + P26 -10G-> lc6-4A/P3 + P27 -10G-> lc6-4A/P2 + P28 -10G-> lc6-4A/P1 + P29 
-10G-> lc6-4B/P2 + P30 -10G-> lc6-4B/P1 + P31 -10G-> lc6-5B/P1 + P32 -10G-> lc6-5B/P2 + P33 -10G-> lc6-5A/P1 + P34 -10G-> lc6-5A/P2 + P35 -10G-> lc6-5A/P3 + P36 -10G-> lc6-5B/P3 SUBSYSTEM LEAF lc6C - P1 -10G-> fc9B P18 - P2 -10G-> fc9A P31 - P3 -10G-> fc8B P18 - P4 -10G-> fc8A P31 - P5 -10G-> fc7B P18 - P6 -10G-> fc7A P31 - P7 -10G-> fc6B P18 - P8 -10G-> fc6A P31 - P9 -10G-> fc5B P18 - P10 -10G-> fc5A P31 - P11 -10G-> fc4B P18 - P12 -10G-> fc4A P31 - P13 -10G-> fc1A P31 - P14 -10G-> fc1B P18 - P15 -10G-> fc2A P31 - P16 -10G-> fc2B P18 - P17 -10G-> fc3A P31 - P18 -10G-> fc3B P18 - P19 -10G-> lc6-6A/P3 - P20 -10G-> lc6-6B/P3 - P21 -10G-> lc6-6B/P2 - P22 -10G-> lc6-6B/P1 - P23 -10G-> lc6-6A/P2 - P24 -10G-> lc6-6A/P1 - P25 -10G-> lc6-7A/P3 - P26 -10G-> lc6-7B/P3 - P27 -10G-> lc6-7B/P2 - P28 -10G-> lc6-7B/P1 - P29 -10G-> lc6-7A/P2 - P30 -10G-> lc6-7A/P1 - P31 -10G-> lc6-8A/P1 - P32 -10G-> lc6-8A/P2 - P33 -10G-> lc6-8B/P1 - P34 -10G-> lc6-8B/P2 - P35 -10G-> lc6-8B/P3 - P36 -10G-> lc6-8A/P3 + P1 -10G-> fc8B P9 + P2 -10G-> fc8A P16 + P3 -10G-> fc7B P9 + P4 -10G-> fc7A P16 + P5 -10G-> fc6B P9 + P6 -10G-> fc6A P16 + P7 -10G-> fc5B P9 + P8 -10G-> fc5A P16 + P9 -10G-> fc4B P9 + P10 -10G-> fc4A P16 + P11 -10G-> fc3B P9 + P12 -10G-> fc3A P16 + P13 -10G-> fc0A P16 + P14 -10G-> fc0B P9 + P15 -10G-> fc1A P16 + P16 -10G-> fc1B P9 + P17 -10G-> fc2A P16 + P18 -10G-> fc2B P9 + P19 -10G-> lc6-6B/P3 + P20 -10G-> lc6-6A/P3 + P21 -10G-> lc6-6A/P2 + P22 -10G-> lc6-6A/P1 + P23 -10G-> lc6-6B/P2 + P24 -10G-> lc6-6B/P1 + P25 -10G-> lc6-7B/P3 + P26 -10G-> lc6-7A/P3 + P27 -10G-> lc6-7A/P2 + P28 -10G-> lc6-7A/P1 + P29 -10G-> lc6-7B/P2 + P30 -10G-> lc6-7B/P1 + P31 -10G-> lc6-8B/P1 + P32 -10G-> lc6-8B/P2 + P33 -10G-> lc6-8A/P1 + P34 -10G-> lc6-8A/P2 + P35 -10G-> lc6-8A/P3 + P36 -10G-> lc6-8B/P3 SUBSYSTEM LEAF lc6D - P1 -10G-> fc9A P32 - P2 -10G-> fc9B P8 - P3 -10G-> fc8A P32 - P4 -10G-> fc8B P8 - P5 -10G-> fc7A P32 - P6 -10G-> fc7B P8 - P7 -10G-> fc6A P32 - P8 -10G-> fc6B P8 - P9 -10G-> fc5A P32 - 
P10 -10G-> fc5B P8 - P11 -10G-> fc4A P32 - P12 -10G-> fc4B P8 - P13 -10G-> fc1B P8 - P14 -10G-> fc1A P32 - P15 -10G-> fc2B P8 - P16 -10G-> fc2A P32 - P17 -10G-> fc3B P8 - P18 -10G-> fc3A P32 - P19 -10G-> lc6-9A/P3 - P20 -10G-> lc6-9B/P3 - P21 -10G-> lc6-9B/P2 - P22 -10G-> lc6-9B/P1 - P23 -10G-> lc6-9A/P2 - P24 -10G-> lc6-9A/P1 - P25 -10G-> lc6-10A/P3 - P26 -10G-> lc6-10B/P3 - P27 -10G-> lc6-10B/P2 - P28 -10G-> lc6-10B/P1 - P29 -10G-> lc6-10A/P2 - P30 -10G-> lc6-10A/P1 - P31 -10G-> lc6-11A/P1 - P32 -10G-> lc6-11A/P2 - P33 -10G-> lc6-11B/P1 - P34 -10G-> lc6-11B/P2 - P35 -10G-> lc6-11B/P3 - P36 -10G-> lc6-11A/P3 + P1 -10G-> fc8A P15 + P2 -10G-> fc8B P5 + P3 -10G-> fc7A P15 + P4 -10G-> fc7B P5 + P5 -10G-> fc6A P15 + P6 -10G-> fc6B P5 + P7 -10G-> fc5A P15 + P8 -10G-> fc5B P5 + P9 -10G-> fc4A P15 + P10 -10G-> fc4B P5 + P11 -10G-> fc3A P15 + P12 -10G-> fc3B P5 + P13 -10G-> fc0B P5 + P14 -10G-> fc0A P15 + P15 -10G-> fc1B P5 + P16 -10G-> fc1A P15 + P17 -10G-> fc2B P5 + P18 -10G-> fc2A P15 + P19 -10G-> lc6-9B/P3 + P20 -10G-> lc6-9A/P3 + P21 -10G-> lc6-9A/P2 + P22 -10G-> lc6-9A/P1 + P23 -10G-> lc6-9B/P2 + P24 -10G-> lc6-9B/P1 + P25 -10G-> lc6-10B/P3 + P26 -10G-> lc6-10A/P3 + P27 -10G-> lc6-10A/P2 + P28 -10G-> lc6-10A/P1 + P29 -10G-> lc6-10B/P2 + P30 -10G-> lc6-10B/P1 + P31 -10G-> lc6-11B/P1 + P32 -10G-> lc6-11B/P2 + P33 -10G-> lc6-11A/P1 + P34 -10G-> lc6-11A/P2 + P35 -10G-> lc6-11A/P3 + P36 -10G-> lc6-11B/P3 SUBSYSTEM LEAF lc7A - P1 -10G-> fc9B P7 - P2 -10G-> fc9A P12 - P3 -10G-> fc8B P7 - P4 -10G-> fc8A P12 - P5 -10G-> fc7B P7 - P6 -10G-> fc7A P12 - P7 -10G-> fc6B P7 - P8 -10G-> fc6A P12 - P9 -10G-> fc5B P7 - P10 -10G-> fc5A P12 - P11 -10G-> fc4B P7 - P12 -10G-> fc4A P12 - P13 -10G-> fc1A P12 - P14 -10G-> fc1B P7 - P15 -10G-> fc2A P12 - P16 -10G-> fc2B P7 - P17 -10G-> fc3A P12 - P18 -10G-> fc3B P7 - P19 -10G-> lc7-0A/P3 - P20 -10G-> lc7-0B/P3 - P21 -10G-> lc7-0B/P2 - P22 -10G-> lc7-0B/P1 - P23 -10G-> lc7-0A/P2 - P24 -10G-> lc7-0A/P1 - P25 -10G-> lc7-1A/P3 - P26 -10G-> 
lc7-1B/P3 - P27 -10G-> lc7-1B/P2 - P28 -10G-> lc7-1B/P1 - P29 -10G-> lc7-1A/P2 - P30 -10G-> lc7-1A/P1 - P31 -10G-> lc7-2A/P1 - P32 -10G-> lc7-2A/P2 - P33 -10G-> lc7-2B/P1 - P34 -10G-> lc7-2B/P2 - P35 -10G-> lc7-2B/P3 - P36 -10G-> lc7-2A/P3 + P1 -10G-> fc8B P2 + P2 -10G-> fc8A P8 + P3 -10G-> fc7B P2 + P4 -10G-> fc7A P8 + P5 -10G-> fc6B P2 + P6 -10G-> fc6A P8 + P7 -10G-> fc5B P2 + P8 -10G-> fc5A P8 + P9 -10G-> fc4B P2 + P10 -10G-> fc4A P8 + P11 -10G-> fc3B P2 + P12 -10G-> fc3A P8 + P13 -10G-> fc0A P8 + P14 -10G-> fc0B P2 + P15 -10G-> fc1A P8 + P16 -10G-> fc1B P2 + P17 -10G-> fc2A P8 + P18 -10G-> fc2B P2 + P19 -10G-> lc7-0B/P3 + P20 -10G-> lc7-0A/P3 + P21 -10G-> lc7-0A/P2 + P22 -10G-> lc7-0A/P1 + P23 -10G-> lc7-0B/P2 + P24 -10G-> lc7-0B/P1 + P25 -10G-> lc7-1B/P3 + P26 -10G-> lc7-1A/P3 + P27 -10G-> lc7-1A/P2 + P28 -10G-> lc7-1A/P1 + P29 -10G-> lc7-1B/P2 + P30 -10G-> lc7-1B/P1 + P31 -10G-> lc7-2B/P1 + P32 -10G-> lc7-2B/P2 + P33 -10G-> lc7-2A/P1 + P34 -10G-> lc7-2A/P2 + P35 -10G-> lc7-2A/P3 + P36 -10G-> lc7-2B/P3 SUBSYSTEM LEAF lc7B - P1 -10G-> fc9A P17 - P2 -10G-> fc9B P6 - P3 -10G-> fc8A P17 - P4 -10G-> fc8B P6 - P5 -10G-> fc7A P17 - P6 -10G-> fc7B P6 - P7 -10G-> fc6A P17 - P8 -10G-> fc6B P6 - P9 -10G-> fc5A P17 - P10 -10G-> fc5B P6 - P11 -10G-> fc4A P17 - P12 -10G-> fc4B P6 - P13 -10G-> fc1B P6 - P14 -10G-> fc1A P17 - P15 -10G-> fc2B P6 - P16 -10G-> fc2A P17 - P17 -10G-> fc3B P6 - P18 -10G-> fc3A P17 - P19 -10G-> lc7-3A/P3 - P20 -10G-> lc7-3B/P3 - P21 -10G-> lc7-3B/P2 - P22 -10G-> lc7-3B/P1 - P23 -10G-> lc7-3A/P2 - P24 -10G-> lc7-3A/P1 - P25 -10G-> lc7-4A/P3 - P26 -10G-> lc7-4B/P3 - P27 -10G-> lc7-4B/P2 - P28 -10G-> lc7-4B/P1 - P29 -10G-> lc7-4A/P2 - P30 -10G-> lc7-4A/P1 - P31 -10G-> lc7-5A/P1 - P32 -10G-> lc7-5A/P2 - P33 -10G-> lc7-5B/P1 - P34 -10G-> lc7-5B/P2 - P35 -10G-> lc7-5B/P3 - P36 -10G-> lc7-5A/P3 + P1 -10G-> fc8A P11 + P2 -10G-> fc8B P3 + P3 -10G-> fc7A P11 + P4 -10G-> fc7B P3 + P5 -10G-> fc6A P11 + P6 -10G-> fc6B P3 + P7 -10G-> fc5A P11 + P8 -10G-> fc5B P3 
+ P9 -10G-> fc4A P11 + P10 -10G-> fc4B P3 + P11 -10G-> fc3A P11 + P12 -10G-> fc3B P3 + P13 -10G-> fc0B P3 + P14 -10G-> fc0A P11 + P15 -10G-> fc1B P3 + P16 -10G-> fc1A P11 + P17 -10G-> fc2B P3 + P18 -10G-> fc2A P11 + P19 -10G-> lc7-3B/P3 + P20 -10G-> lc7-3A/P3 + P21 -10G-> lc7-3A/P2 + P22 -10G-> lc7-3A/P1 + P23 -10G-> lc7-3B/P2 + P24 -10G-> lc7-3B/P1 + P25 -10G-> lc7-4B/P3 + P26 -10G-> lc7-4A/P3 + P27 -10G-> lc7-4A/P2 + P28 -10G-> lc7-4A/P1 + P29 -10G-> lc7-4B/P2 + P30 -10G-> lc7-4B/P1 + P31 -10G-> lc7-5B/P1 + P32 -10G-> lc7-5B/P2 + P33 -10G-> lc7-5A/P1 + P34 -10G-> lc7-5A/P2 + P35 -10G-> lc7-5A/P3 + P36 -10G-> lc7-5B/P3 SUBSYSTEM LEAF lc7C - P1 -10G-> fc9B P9 - P2 -10G-> fc9A P16 - P3 -10G-> fc8B P9 - P4 -10G-> fc8A P16 - P5 -10G-> fc7B P9 - P6 -10G-> fc7A P16 - P7 -10G-> fc6B P9 - P8 -10G-> fc6A P16 - P9 -10G-> fc5B P9 - P10 -10G-> fc5A P16 - P11 -10G-> fc4B P9 - P12 -10G-> fc4A P16 - P13 -10G-> fc1A P16 - P14 -10G-> fc1B P9 - P15 -10G-> fc2A P16 - P16 -10G-> fc2B P9 - P17 -10G-> fc3A P16 - P18 -10G-> fc3B P9 - P19 -10G-> lc7-6A/P3 - P20 -10G-> lc7-6B/P3 - P21 -10G-> lc7-6B/P2 - P22 -10G-> lc7-6B/P1 - P23 -10G-> lc7-6A/P2 - P24 -10G-> lc7-6A/P1 - P25 -10G-> lc7-7A/P3 - P26 -10G-> lc7-7B/P3 - P27 -10G-> lc7-7B/P2 - P28 -10G-> lc7-7B/P1 - P29 -10G-> lc7-7A/P2 - P30 -10G-> lc7-7A/P1 - P31 -10G-> lc7-8A/P1 - P32 -10G-> lc7-8A/P2 - P33 -10G-> lc7-8B/P1 - P34 -10G-> lc7-8B/P2 - P35 -10G-> lc7-8B/P3 - P36 -10G-> lc7-8A/P3 + P1 -10G-> fc8B P4 + P2 -10G-> fc8A P10 + P3 -10G-> fc7B P4 + P4 -10G-> fc7A P10 + P5 -10G-> fc6B P4 + P6 -10G-> fc6A P10 + P7 -10G-> fc5B P4 + P8 -10G-> fc5A P10 + P9 -10G-> fc4B P4 + P10 -10G-> fc4A P10 + P11 -10G-> fc3B P4 + P12 -10G-> fc3A P10 + P13 -10G-> fc0A P10 + P14 -10G-> fc0B P4 + P15 -10G-> fc1A P10 + P16 -10G-> fc1B P4 + P17 -10G-> fc2A P10 + P18 -10G-> fc2B P4 + P19 -10G-> lc7-6B/P3 + P20 -10G-> lc7-6A/P3 + P21 -10G-> lc7-6A/P2 + P22 -10G-> lc7-6A/P1 + P23 -10G-> lc7-6B/P2 + P24 -10G-> lc7-6B/P1 + P25 -10G-> lc7-7B/P3 + P26 -10G-> 
lc7-7A/P3 + P27 -10G-> lc7-7A/P2 + P28 -10G-> lc7-7A/P1 + P29 -10G-> lc7-7B/P2 + P30 -10G-> lc7-7B/P1 + P31 -10G-> lc7-8B/P1 + P32 -10G-> lc7-8B/P2 + P33 -10G-> lc7-8A/P1 + P34 -10G-> lc7-8A/P2 + P35 -10G-> lc7-8A/P3 + P36 -10G-> lc7-8B/P3 SUBSYSTEM LEAF lc7D - P1 -10G-> fc9A P15 - P2 -10G-> fc9B P5 - P3 -10G-> fc8A P15 - P4 -10G-> fc8B P5 - P5 -10G-> fc7A P15 - P6 -10G-> fc7B P5 - P7 -10G-> fc6A P15 - P8 -10G-> fc6B P5 - P9 -10G-> fc5A P15 - P10 -10G-> fc5B P5 - P11 -10G-> fc4A P15 - P12 -10G-> fc4B P5 - P13 -10G-> fc1B P5 - P14 -10G-> fc1A P15 - P15 -10G-> fc2B P5 - P16 -10G-> fc2A P15 - P17 -10G-> fc3B P5 - P18 -10G-> fc3A P15 - P19 -10G-> lc7-9A/P3 - P20 -10G-> lc7-9B/P3 - P21 -10G-> lc7-9B/P2 - P22 -10G-> lc7-9B/P1 - P23 -10G-> lc7-9A/P2 - P24 -10G-> lc7-9A/P1 - P25 -10G-> lc7-10A/P3 - P26 -10G-> lc7-10B/P3 - P27 -10G-> lc7-10B/P2 - P28 -10G-> lc7-10B/P1 - P29 -10G-> lc7-10A/P2 - P30 -10G-> lc7-10A/P1 - P31 -10G-> lc7-11A/P1 - P32 -10G-> lc7-11A/P2 - P33 -10G-> lc7-11B/P1 - P34 -10G-> lc7-11B/P2 - P35 -10G-> lc7-11B/P3 - P36 -10G-> lc7-11A/P3 + P1 -10G-> fc8A P18 + P2 -10G-> fc8B P1 + P3 -10G-> fc7A P18 + P4 -10G-> fc7B P1 + P5 -10G-> fc6A P18 + P6 -10G-> fc6B P1 + P7 -10G-> fc5A P18 + P8 -10G-> fc5B P1 + P9 -10G-> fc4A P18 + P10 -10G-> fc4B P1 + P11 -10G-> fc3A P18 + P12 -10G-> fc3B P1 + P13 -10G-> fc0B P1 + P14 -10G-> fc0A P18 + P15 -10G-> fc1B P1 + P16 -10G-> fc1A P18 + P17 -10G-> fc2B P1 + P18 -10G-> fc2A P18 + P19 -10G-> lc7-9B/P3 + P20 -10G-> lc7-9A/P3 + P21 -10G-> lc7-9A/P2 + P22 -10G-> lc7-9A/P1 + P23 -10G-> lc7-9B/P2 + P24 -10G-> lc7-9B/P1 + P25 -10G-> lc7-10B/P3 + P26 -10G-> lc7-10A/P3 + P27 -10G-> lc7-10A/P2 + P28 -10G-> lc7-10A/P1 + P29 -10G-> lc7-10B/P2 + P30 -10G-> lc7-10B/P1 + P31 -10G-> lc7-11B/P1 + P32 -10G-> lc7-11B/P2 + P33 -10G-> lc7-11A/P1 + P34 -10G-> lc7-11A/P2 + P35 -10G-> lc7-11A/P3 + P36 -10G-> lc7-11B/P3 SUBSYSTEM LEAF lc8A - P1 -10G-> fc9B P2 - P2 -10G-> fc9A P8 - P3 -10G-> fc8B P2 - P4 -10G-> fc8A P8 - P5 -10G-> fc7B P2 - P6 -10G-> 
fc7A P8 - P7 -10G-> fc6B P2 - P8 -10G-> fc6A P8 - P9 -10G-> fc5B P2 - P10 -10G-> fc5A P8 - P11 -10G-> fc4B P2 - P12 -10G-> fc4A P8 - P13 -10G-> fc1A P8 - P14 -10G-> fc1B P2 - P15 -10G-> fc2A P8 - P16 -10G-> fc2B P2 - P17 -10G-> fc3A P8 - P18 -10G-> fc3B P2 - P19 -10G-> lc8-0A/P3 - P20 -10G-> lc8-0B/P3 - P21 -10G-> lc8-0B/P2 - P22 -10G-> lc8-0B/P1 - P23 -10G-> lc8-0A/P2 - P24 -10G-> lc8-0A/P1 - P25 -10G-> lc8-1A/P3 - P26 -10G-> lc8-1B/P3 - P27 -10G-> lc8-1B/P2 - P28 -10G-> lc8-1B/P1 - P29 -10G-> lc8-1A/P2 - P30 -10G-> lc8-1A/P1 - P31 -10G-> lc8-2A/P1 - P32 -10G-> lc8-2A/P2 - P33 -10G-> lc8-2B/P1 - P34 -10G-> lc8-2B/P2 - P35 -10G-> lc8-2B/P3 - P36 -10G-> lc8-2A/P3 + P1 -10G-> fc8B P21 + P2 -10G-> fc8A P5 + P3 -10G-> fc7B P21 + P4 -10G-> fc7A P5 + P5 -10G-> fc6B P21 + P6 -10G-> fc6A P5 + P7 -10G-> fc5B P21 + P8 -10G-> fc5A P5 + P9 -10G-> fc4B P21 + P10 -10G-> fc4A P5 + P11 -10G-> fc3B P21 + P12 -10G-> fc3A P5 + P13 -10G-> fc0A P5 + P14 -10G-> fc0B P21 + P15 -10G-> fc1A P5 + P16 -10G-> fc1B P21 + P17 -10G-> fc2A P5 + P18 -10G-> fc2B P21 + P19 -10G-> lc8-0B/P3 + P20 -10G-> lc8-0A/P3 + P21 -10G-> lc8-0A/P2 + P22 -10G-> lc8-0A/P1 + P23 -10G-> lc8-0B/P2 + P24 -10G-> lc8-0B/P1 + P25 -10G-> lc8-1B/P3 + P26 -10G-> lc8-1A/P3 + P27 -10G-> lc8-1A/P2 + P28 -10G-> lc8-1A/P1 + P29 -10G-> lc8-1B/P2 + P30 -10G-> lc8-1B/P1 + P31 -10G-> lc8-2B/P1 + P32 -10G-> lc8-2B/P2 + P33 -10G-> lc8-2A/P1 + P34 -10G-> lc8-2A/P2 + P35 -10G-> lc8-2A/P3 + P36 -10G-> lc8-2B/P3 SUBSYSTEM LEAF lc8B - P1 -10G-> fc9A P11 - P2 -10G-> fc9B P3 - P3 -10G-> fc8A P11 - P4 -10G-> fc8B P3 - P5 -10G-> fc7A P11 - P6 -10G-> fc7B P3 - P7 -10G-> fc6A P11 - P8 -10G-> fc6B P3 - P9 -10G-> fc5A P11 - P10 -10G-> fc5B P3 - P11 -10G-> fc4A P11 - P12 -10G-> fc4B P3 - P13 -10G-> fc1B P3 - P14 -10G-> fc1A P11 - P15 -10G-> fc2B P3 - P16 -10G-> fc2A P11 - P17 -10G-> fc3B P3 - P18 -10G-> fc3A P11 - P19 -10G-> lc8-3A/P3 - P20 -10G-> lc8-3B/P3 - P21 -10G-> lc8-3B/P2 - P22 -10G-> lc8-3B/P1 - P23 -10G-> lc8-3A/P2 - P24 -10G-> lc8-3A/P1 
- P25 -10G-> lc8-4A/P3 - P26 -10G-> lc8-4B/P3 - P27 -10G-> lc8-4B/P2 - P28 -10G-> lc8-4B/P1 - P29 -10G-> lc8-4A/P2 - P30 -10G-> lc8-4A/P1 - P31 -10G-> lc8-5A/P1 - P32 -10G-> lc8-5A/P2 - P33 -10G-> lc8-5B/P1 - P34 -10G-> lc8-5B/P2 - P35 -10G-> lc8-5B/P3 - P36 -10G-> lc8-5A/P3 + P1 -10G-> fc8A P7 + P2 -10G-> fc8B P20 + P3 -10G-> fc7A P7 + P4 -10G-> fc7B P20 + P5 -10G-> fc6A P7 + P6 -10G-> fc6B P20 + P7 -10G-> fc5A P7 + P8 -10G-> fc5B P20 + P9 -10G-> fc4A P7 + P10 -10G-> fc4B P20 + P11 -10G-> fc3A P7 + P12 -10G-> fc3B P20 + P13 -10G-> fc0B P20 + P14 -10G-> fc0A P7 + P15 -10G-> fc1B P20 + P16 -10G-> fc1A P7 + P17 -10G-> fc2B P20 + P18 -10G-> fc2A P7 + P19 -10G-> lc8-3B/P3 + P20 -10G-> lc8-3A/P3 + P21 -10G-> lc8-3A/P2 + P22 -10G-> lc8-3A/P1 + P23 -10G-> lc8-3B/P2 + P24 -10G-> lc8-3B/P1 + P25 -10G-> lc8-4B/P3 + P26 -10G-> lc8-4A/P3 + P27 -10G-> lc8-4A/P2 + P28 -10G-> lc8-4A/P1 + P29 -10G-> lc8-4B/P2 + P30 -10G-> lc8-4B/P1 + P31 -10G-> lc8-5B/P1 + P32 -10G-> lc8-5B/P2 + P33 -10G-> lc8-5A/P1 + P34 -10G-> lc8-5A/P2 + P35 -10G-> lc8-5A/P3 + P36 -10G-> lc8-5B/P3 SUBSYSTEM LEAF lc8C - P1 -10G-> fc9B P4 - P2 -10G-> fc9A P10 - P3 -10G-> fc8B P4 - P4 -10G-> fc8A P10 - P5 -10G-> fc7B P4 - P6 -10G-> fc7A P10 - P7 -10G-> fc6B P4 - P8 -10G-> fc6A P10 - P9 -10G-> fc5B P4 - P10 -10G-> fc5A P10 - P11 -10G-> fc4B P4 - P12 -10G-> fc4A P10 - P13 -10G-> fc1A P10 - P14 -10G-> fc1B P4 - P15 -10G-> fc2A P10 - P16 -10G-> fc2B P4 - P17 -10G-> fc3A P10 - P18 -10G-> fc3B P4 - P19 -10G-> lc8-6A/P3 - P20 -10G-> lc8-6B/P3 - P21 -10G-> lc8-6B/P2 - P22 -10G-> lc8-6B/P1 - P23 -10G-> lc8-6A/P2 - P24 -10G-> lc8-6A/P1 - P25 -10G-> lc8-7A/P3 - P26 -10G-> lc8-7B/P3 - P27 -10G-> lc8-7B/P2 - P28 -10G-> lc8-7B/P1 - P29 -10G-> lc8-7A/P2 - P30 -10G-> lc8-7A/P1 - P31 -10G-> lc8-8A/P1 - P32 -10G-> lc8-8A/P2 - P33 -10G-> lc8-8B/P1 - P34 -10G-> lc8-8B/P2 - P35 -10G-> lc8-8B/P3 - P36 -10G-> lc8-8A/P3 + P1 -10G-> fc8B P19 + P2 -10G-> fc8A P6 + P3 -10G-> fc7B P19 + P4 -10G-> fc7A P6 + P5 -10G-> fc6B P19 + P6 -10G-> fc6A 
P6 + P7 -10G-> fc5B P19 + P8 -10G-> fc5A P6 + P9 -10G-> fc4B P19 + P10 -10G-> fc4A P6 + P11 -10G-> fc3B P19 + P12 -10G-> fc3A P6 + P13 -10G-> fc0A P6 + P14 -10G-> fc0B P19 + P15 -10G-> fc1A P6 + P16 -10G-> fc1B P19 + P17 -10G-> fc2A P6 + P18 -10G-> fc2B P19 + P19 -10G-> lc8-6B/P3 + P20 -10G-> lc8-6A/P3 + P21 -10G-> lc8-6A/P2 + P22 -10G-> lc8-6A/P1 + P23 -10G-> lc8-6B/P2 + P24 -10G-> lc8-6B/P1 + P25 -10G-> lc8-7B/P3 + P26 -10G-> lc8-7A/P3 + P27 -10G-> lc8-7A/P2 + P28 -10G-> lc8-7A/P1 + P29 -10G-> lc8-7B/P2 + P30 -10G-> lc8-7B/P1 + P31 -10G-> lc8-8B/P1 + P32 -10G-> lc8-8B/P2 + P33 -10G-> lc8-8A/P1 + P34 -10G-> lc8-8A/P2 + P35 -10G-> lc8-8A/P3 + P36 -10G-> lc8-8B/P3 SUBSYSTEM LEAF lc8D - P1 -10G-> fc9A P18 - P2 -10G-> fc9B P1 - P3 -10G-> fc8A P18 - P4 -10G-> fc8B P1 - P5 -10G-> fc7A P18 - P6 -10G-> fc7B P1 - P7 -10G-> fc6A P18 - P8 -10G-> fc6B P1 - P9 -10G-> fc5A P18 - P10 -10G-> fc5B P1 - P11 -10G-> fc4A P18 - P12 -10G-> fc4B P1 - P13 -10G-> fc1B P1 - P14 -10G-> fc1A P18 - P15 -10G-> fc2B P1 - P16 -10G-> fc2A P18 - P17 -10G-> fc3B P1 - P18 -10G-> fc3A P18 - P19 -10G-> lc8-9A/P3 - P20 -10G-> lc8-9B/P3 - P21 -10G-> lc8-9B/P2 - P22 -10G-> lc8-9B/P1 - P23 -10G-> lc8-9A/P2 - P24 -10G-> lc8-9A/P1 - P25 -10G-> lc8-10A/P3 - P26 -10G-> lc8-10B/P3 - P27 -10G-> lc8-10B/P2 - P28 -10G-> lc8-10B/P1 - P29 -10G-> lc8-10A/P2 - P30 -10G-> lc8-10A/P1 - P31 -10G-> lc8-11A/P1 - P32 -10G-> lc8-11A/P2 - P33 -10G-> lc8-11B/P1 - P34 -10G-> lc8-11B/P2 - P35 -10G-> lc8-11B/P3 - P36 -10G-> lc8-11A/P3 - -SUBSYSTEM LEAF lc9A - P1 -10G-> fc9B P21 - P2 -10G-> fc9A P5 - P3 -10G-> fc8B P21 - P4 -10G-> fc8A P5 - P5 -10G-> fc7B P21 - P6 -10G-> fc7A P5 - P7 -10G-> fc6B P21 - P8 -10G-> fc6A P5 - P9 -10G-> fc5B P21 - P10 -10G-> fc5A P5 - P11 -10G-> fc4B P21 - P12 -10G-> fc4A P5 - P13 -10G-> fc1A P5 - P14 -10G-> fc1B P21 - P15 -10G-> fc2A P5 - P16 -10G-> fc2B P21 - P17 -10G-> fc3A P5 - P18 -10G-> fc3B P21 - P19 -10G-> lc9-0A/P3 - P20 -10G-> lc9-0B/P3 - P21 -10G-> lc9-0B/P2 - P22 -10G-> lc9-0B/P1 - P23 
-10G-> lc9-0A/P2 - P24 -10G-> lc9-0A/P1 - P25 -10G-> lc9-1A/P3 - P26 -10G-> lc9-1B/P3 - P27 -10G-> lc9-1B/P2 - P28 -10G-> lc9-1B/P1 - P29 -10G-> lc9-1A/P2 - P30 -10G-> lc9-1A/P1 - P31 -10G-> lc9-2A/P1 - P32 -10G-> lc9-2A/P2 - P33 -10G-> lc9-2B/P1 - P34 -10G-> lc9-2B/P2 - P35 -10G-> lc9-2B/P3 - P36 -10G-> lc9-2A/P3 - -SUBSYSTEM LEAF lc9B - P1 -10G-> fc9A P7 - P2 -10G-> fc9B P20 - P3 -10G-> fc8A P7 - P4 -10G-> fc8B P20 - P5 -10G-> fc7A P7 - P6 -10G-> fc7B P20 - P7 -10G-> fc6A P7 - P8 -10G-> fc6B P20 - P9 -10G-> fc5A P7 - P10 -10G-> fc5B P20 - P11 -10G-> fc4A P7 - P12 -10G-> fc4B P20 - P13 -10G-> fc1B P20 - P14 -10G-> fc1A P7 - P15 -10G-> fc2B P20 - P16 -10G-> fc2A P7 - P17 -10G-> fc3B P20 - P18 -10G-> fc3A P7 - P19 -10G-> lc9-3A/P3 - P20 -10G-> lc9-3B/P3 - P21 -10G-> lc9-3B/P2 - P22 -10G-> lc9-3B/P1 - P23 -10G-> lc9-3A/P2 - P24 -10G-> lc9-3A/P1 - P25 -10G-> lc9-4A/P3 - P26 -10G-> lc9-4B/P3 - P27 -10G-> lc9-4B/P2 - P28 -10G-> lc9-4B/P1 - P29 -10G-> lc9-4A/P2 - P30 -10G-> lc9-4A/P1 - P31 -10G-> lc9-5A/P1 - P32 -10G-> lc9-5A/P2 - P33 -10G-> lc9-5B/P1 - P34 -10G-> lc9-5B/P2 - P35 -10G-> lc9-5B/P3 - P36 -10G-> lc9-5A/P3 - -SUBSYSTEM LEAF lc9C - P1 -10G-> fc9B P19 - P2 -10G-> fc9A P6 - P3 -10G-> fc8B P19 - P4 -10G-> fc8A P6 - P5 -10G-> fc7B P19 - P6 -10G-> fc7A P6 - P7 -10G-> fc6B P19 - P8 -10G-> fc6A P6 - P9 -10G-> fc5B P19 - P10 -10G-> fc5A P6 - P11 -10G-> fc4B P19 - P12 -10G-> fc4A P6 - P13 -10G-> fc1A P6 - P14 -10G-> fc1B P19 - P15 -10G-> fc2A P6 - P16 -10G-> fc2B P19 - P17 -10G-> fc3A P6 - P18 -10G-> fc3B P19 - P19 -10G-> lc9-6A/P3 - P20 -10G-> lc9-6B/P3 - P21 -10G-> lc9-6B/P2 - P22 -10G-> lc9-6B/P1 - P23 -10G-> lc9-6A/P2 - P24 -10G-> lc9-6A/P1 - P25 -10G-> lc9-7A/P3 - P26 -10G-> lc9-7B/P3 - P27 -10G-> lc9-7B/P2 - P28 -10G-> lc9-7B/P1 - P29 -10G-> lc9-7A/P2 - P30 -10G-> lc9-7A/P1 - P31 -10G-> lc9-8A/P1 - P32 -10G-> lc9-8A/P2 - P33 -10G-> lc9-8B/P1 - P34 -10G-> lc9-8B/P2 - P35 -10G-> lc9-8B/P3 - P36 -10G-> lc9-8A/P3 - -SUBSYSTEM LEAF lc9D - P1 -10G-> fc9A P9 - P2 
-10G-> fc9B P22 - P3 -10G-> fc8A P9 - P4 -10G-> fc8B P22 - P5 -10G-> fc7A P9 - P6 -10G-> fc7B P22 - P7 -10G-> fc6A P9 - P8 -10G-> fc6B P22 - P9 -10G-> fc5A P9 - P10 -10G-> fc5B P22 - P11 -10G-> fc4A P9 - P12 -10G-> fc4B P22 - P13 -10G-> fc1B P22 - P14 -10G-> fc1A P9 - P15 -10G-> fc2B P22 - P16 -10G-> fc2A P9 - P17 -10G-> fc3B P22 - P18 -10G-> fc3A P9 - P19 -10G-> lc9-9A/P3 - P20 -10G-> lc9-9B/P3 - P21 -10G-> lc9-9B/P2 - P22 -10G-> lc9-9B/P1 - P23 -10G-> lc9-9A/P2 - P24 -10G-> lc9-9A/P1 - P25 -10G-> lc9-10A/P3 - P26 -10G-> lc9-10B/P3 - P27 -10G-> lc9-10B/P2 - P28 -10G-> lc9-10B/P1 - P29 -10G-> lc9-10A/P2 - P30 -10G-> lc9-10A/P1 - P31 -10G-> lc9-11A/P1 - P32 -10G-> lc9-11A/P2 - P33 -10G-> lc9-11B/P1 - P34 -10G-> lc9-11B/P2 - P35 -10G-> lc9-11B/P3 - P36 -10G-> lc9-11A/P3 + P1 -10G-> fc8A P9 + P2 -10G-> fc8B P22 + P3 -10G-> fc7A P9 + P4 -10G-> fc7B P22 + P5 -10G-> fc6A P9 + P6 -10G-> fc6B P22 + P7 -10G-> fc5A P9 + P8 -10G-> fc5B P22 + P9 -10G-> fc4A P9 + P10 -10G-> fc4B P22 + P11 -10G-> fc3A P9 + P12 -10G-> fc3B P22 + P13 -10G-> fc0B P22 + P14 -10G-> fc0A P9 + P15 -10G-> fc1B P22 + P16 -10G-> fc1A P9 + P17 -10G-> fc2B P22 + P18 -10G-> fc2A P9 + P19 -10G-> lc8-9B/P3 + P20 -10G-> lc8-9A/P3 + P21 -10G-> lc8-9A/P2 + P22 -10G-> lc8-9A/P1 + P23 -10G-> lc8-9B/P2 + P24 -10G-> lc8-9B/P1 + P25 -10G-> lc8-10B/P3 + P26 -10G-> lc8-10A/P3 + P27 -10G-> lc8-10A/P2 + P28 -10G-> lc8-10A/P1 + P29 -10G-> lc8-10B/P2 + P30 -10G-> lc8-10B/P1 + P31 -10G-> lc8-11B/P1 + P32 -10G-> lc8-11B/P2 + P33 -10G-> lc8-11A/P1 + P34 -10G-> lc8-11A/P2 + P35 -10G-> lc8-11A/P3 + P36 -10G-> lc8-11B/P3 diff --git a/ibdm/ibnl/SUNDCS72QDR.ibnl b/ibdm/ibnl/SUNDCS72QDR.ibnl index 1907ec3..fee233a 100644 --- a/ibdm/ibnl/SUNDCS72QDR.ibnl +++ b/ibdm/ibnl/SUNDCS72QDR.ibnl @@ -176,24 +176,24 @@ SUBSYSTEM LEAF SW-D P16 -10G-> SW-E P18 P17 -10G-> SW-F P17 P18 -10G-> SW-E P16 - P19 -10G-> C-9A/P3 - P20 -10G-> C-9B/P3 - P21 -10G-> C-9B/P2 - P22 -10G-> C-9B/P1 - P23 -10G-> C-9A/P2 - P24 -10G-> C-9A/P1 - P25 -10G-> C-10A/P3 - 
P26 -10G-> C-10B/P3 - P27 -10G-> C-10B/P2 - P28 -10G-> C-10B/P1 - P29 -10G-> C-10A/P2 - P30 -10G-> C-10A/P1 - P31 -10G-> C-11A/P1 - P32 -10G-> C-11A/P2 - P33 -10G-> C-11B/P1 - P34 -10G-> C-11B/P2 - P35 -10G-> C-11B/P3 - P36 -10G-> C-11A/P3 + P19 -10G-> C-9B/P3 + P20 -10G-> C-9A/P3 + P21 -10G-> C-9A/P2 + P22 -10G-> C-9A/P1 + P23 -10G-> C-9B/P2 + P24 -10G-> C-9B/P1 + P25 -10G-> C-10B/P3 + P26 -10G-> C-10A/P3 + P27 -10G-> C-10A/P2 + P28 -10G-> C-10A/P1 + P29 -10G-> C-10B/P2 + P30 -10G-> C-10B/P1 + P31 -10G-> C-11B/P1 + P32 -10G-> C-11B/P2 + P33 -10G-> C-11A/P1 + P34 -10G-> C-11A/P2 + P35 -10G-> C-11A/P3 + P36 -10G-> C-11B/P3 SUBSYSTEM LEAF SW-C P1 -10G-> SW-F P9 @@ -214,24 +214,24 @@ SUBSYSTEM LEAF SW-C P16 -10G-> SW-E P24 P17 -10G-> SW-F P23 P18 -10G-> SW-E P22 - P19 -10G-> C-6A/P3 - P20 -10G-> C-6B/P3 - P21 -10G-> C-6B/P2 - P22 -10G-> C-6B/P1 - P23 -10G-> C-6A/P2 - P24 -10G-> C-6A/P1 - P25 -10G-> C-7A/P3 - P26 -10G-> C-7B/P3 - P27 -10G-> C-7B/P2 - P28 -10G-> C-7B/P1 - P29 -10G-> C-7A/P2 - P30 -10G-> C-7A/P1 - P31 -10G-> C-8A/P1 - P32 -10G-> C-8A/P2 - P33 -10G-> C-8B/P1 - P34 -10G-> C-8B/P2 - P35 -10G-> C-8B/P3 - P36 -10G-> C-8A/P3 + P19 -10G-> C-6B/P3 + P20 -10G-> C-6A/P3 + P21 -10G-> C-6A/P2 + P22 -10G-> C-6A/P1 + P23 -10G-> C-6B/P2 + P24 -10G-> C-6B/P1 + P25 -10G-> C-7B/P3 + P26 -10G-> C-7A/P3 + P27 -10G-> C-7A/P2 + P28 -10G-> C-7A/P1 + P29 -10G-> C-7B/P2 + P30 -10G-> C-7B/P1 + P31 -10G-> C-8B/P1 + P32 -10G-> C-8B/P2 + P33 -10G-> C-8A/P1 + P34 -10G-> C-8A/P2 + P35 -10G-> C-8A/P3 + P36 -10G-> C-8B/P3 SUBSYSTEM LEAF SW-B P1 -10G-> SW-E P28 @@ -252,24 +252,24 @@ SUBSYSTEM LEAF SW-B P16 -10G-> SW-F P18 P17 -10G-> SW-E P17 P18 -10G-> SW-F P16 - P19 -10G-> C-3A/P3 - P20 -10G-> C-3B/P3 - P21 -10G-> C-3B/P2 - P22 -10G-> C-3B/P1 - P23 -10G-> C-3A/P2 - P24 -10G-> C-3A/P1 - P25 -10G-> C-4A/P3 - P26 -10G-> C-4B/P3 - P27 -10G-> C-4B/P2 - P28 -10G-> C-4B/P1 - P29 -10G-> C-4A/P2 - P30 -10G-> C-4A/P1 - P31 -10G-> C-5A/P1 - P32 -10G-> C-5A/P2 - P33 -10G-> C-5B/P1 - P34 -10G-> 
C-5B/P2 - P35 -10G-> C-5B/P3 - P36 -10G-> C-5A/P3 + P19 -10G-> C-3B/P3 + P20 -10G-> C-3A/P3 + P21 -10G-> C-3A/P2 + P22 -10G-> C-3A/P1 + P23 -10G-> C-3B/P2 + P24 -10G-> C-3B/P1 + P25 -10G-> C-4B/P3 + P26 -10G-> C-4A/P3 + P27 -10G-> C-4A/P2 + P28 -10G-> C-4A/P1 + P29 -10G-> C-4B/P2 + P30 -10G-> C-4B/P1 + P31 -10G-> C-5B/P1 + P32 -10G-> C-5B/P2 + P33 -10G-> C-5A/P1 + P34 -10G-> C-5A/P2 + P35 -10G-> C-5A/P3 + P36 -10G-> C-5B/P3 SUBSYSTEM LEAF SW-A P1 -10G-> SW-E P9 @@ -290,22 +290,22 @@ SUBSYSTEM LEAF SW-A P16 -10G-> SW-F P24 P17 -10G-> SW-E P23 P18 -10G-> SW-F P22 - P19 -10G-> C-0A/P3 - P20 -10G-> C-0B/P3 - P21 -10G-> C-0B/P2 - P22 -10G-> C-0B/P1 - P23 -10G-> C-0A/P2 - P24 -10G-> C-0A/P1 - P25 -10G-> C-1A/P3 - P26 -10G-> C-1B/P3 - P27 -10G-> C-1B/P2 - P28 -10G-> C-1B/P1 - P29 -10G-> C-1A/P2 - P30 -10G-> C-1A/P1 - P31 -10G-> C-2A/P1 - P32 -10G-> C-2A/P2 - P33 -10G-> C-2B/P1 - P34 -10G-> C-2B/P2 - P35 -10G-> C-2B/P3 - P36 -10G-> C-2A/P3 + P19 -10G-> C-0B/P3 + P20 -10G-> C-0A/P3 + P21 -10G-> C-0A/P2 + P22 -10G-> C-0A/P1 + P23 -10G-> C-0B/P2 + P24 -10G-> C-0B/P1 + P25 -10G-> C-1B/P3 + P26 -10G-> C-1A/P3 + P27 -10G-> C-1A/P2 + P28 -10G-> C-1A/P1 + P29 -10G-> C-1B/P2 + P30 -10G-> C-1B/P1 + P31 -10G-> C-2B/P1 + P32 -10G-> C-2B/P2 + P33 -10G-> C-2A/P1 + P34 -10G-> C-2A/P2 + P35 -10G-> C-2A/P3 + P36 -10G-> C-2B/P3 From vlad at lists.openfabrics.org Tue Sep 8 03:08:09 2009 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Tue, 8 Sep 2009 03:08:09 -0700 (PDT) Subject: [ofa-general] ofa_1_5_kernel 20090908-0200 daily build status Message-ID: <20090908100809.84DB6E61F0F@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git git_branch: ofed_kernel_1_5 Common build parameters: Passed: Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.26 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.22 
Passed on i686 with linux-2.6.27 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18-128.el5 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.27 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.25 Passed on ia64 with linux-2.6.26 Passed on ppc64 with linux-2.6.19 Passed on ppc64 with linux-2.6.18 Failed: Build failed on x86_64 with linux-2.6.9-78.ELsmp Log: /home/vlad/tmp/ofa_1_5_kernel-20090908-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/tcp_listen.c:71: error: 'struct inet_sock' has no member named 'daddr' /home/vlad/tmp/ofa_1_5_kernel-20090908-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/tcp_listen.c:71: error: 'struct inet_sock' has no member named 'dport' /home/vlad/tmp/ofa_1_5_kernel-20090908-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/tcp_listen.c:73: error: 'struct inet_sock' has no member named 'saddr' /home/vlad/tmp/ofa_1_5_kernel-20090908-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/tcp_listen.c:73: error: 'struct inet_sock' has no member named 'daddr' make[3]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090908-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds/tcp_listen.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090908-0200_linux-2.6.9-78.ELsmp_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_5_kernel-20090908-0200_linux-2.6.9-78.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-78.ELsmp' make: *** [kernel] Error 2 
---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-67.ELsmp Log: /home/vlad/tmp/ofa_1_5_kernel-20090908-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/tcp_listen.c:71: error: 'struct inet_sock' has no member named 'daddr' /home/vlad/tmp/ofa_1_5_kernel-20090908-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/tcp_listen.c:71: error: 'struct inet_sock' has no member named 'dport' /home/vlad/tmp/ofa_1_5_kernel-20090908-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/tcp_listen.c:73: error: 'struct inet_sock' has no member named 'saddr' /home/vlad/tmp/ofa_1_5_kernel-20090908-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/tcp_listen.c:73: error: 'struct inet_sock' has no member named 'daddr' make[3]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090908-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds/tcp_listen.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090908-0200_linux-2.6.9-67.ELsmp_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_5_kernel-20090908-0200_linux-2.6.9-67.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-67.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- From jsquyres at cisco.com Tue Sep 8 05:20:35 2009 From: jsquyres at cisco.com (Jeff Squyres) Date: Tue, 8 Sep 2009 08:20:35 -0400 Subject: [ofa-general] Re: [PLEASE READ] Transition from general@lists.openfabrics.org to linux-rdma@vger.kernel.org In-Reply-To: References: Message-ID: +1 to everything Roland said (I'm deleting general at lists.openfabrics.org from my addressbook after sending this). I'd suggest making a clean break -- on 1 Oct, make "general at lists.openfabrics.org " autoreply with something like "This list has moved; please re-send your message to linux-rdma at vger.kernel.org" or somesuch. 
On Sep 8, 2009, at 12:56 AM, Roland Dreier (rdreier) wrote:

> Hi everyone,
>
> As you may have noticed, the linux-rdma at vger.kernel.org list is up and
> running. A number of sites have started archiving it -- in no
> particular order, the sites I know of are:
>
> http://www.spinics.net/lists/linux-rdma/
> http://marc.info/?l=linux-rdma
> http://www.mail-archive.com/linux-rdma at vger.kernel.org/
> http://dir.gmane.org/gmane.linux.drivers.rdma
>
> If you know of any archives I've missed, please let me know.
>
> Another nice tool we have with the new list is
>
> http://patchwork.kernel.org/project/linux-rdma/list/
>
> which will make me have to work a little harder to lose patches. Also,
> if you are someone who handles a driver or some other part of the tree,
> you can register and I will be able to delegate patches to you.
>
> I would suggest that everyone who is currently following
> general at lists.openfabrics.org to subscribe to the new list (send the
> line "subscribe linux-rdma" in the body of a message to
> majordomo at vger.kernel.org). Also, everyone sending email to the old
> list to also please start to copy the new vger list as well (especially
> for patches, so that the patchwork tool catches them).
>
> As a date to finalize the transition, I propose the end of the month --
> that is, on October 1 we shut down the old general@ list and expect
> everything to be cut over by then. I'll send a number of reminders
> before then, but please do speak up if this schedule is too tight -- I
> don't think keeping the old list running for longer is a big hardship
> for anyone.
>
> Finally, I'll plan to merge the following for 2.6.32:
>
>
> MAINTAINERS: InfiniBand/RDMA mailing list transition to vger
>
> InfiniBand/RDMA development discussion is moving from
> general at lists.openfabrics.org to linux-rdma at vger.kernel.org.
>
> Signed-off-by: Roland Dreier
> ---
>  MAINTAINERS | 12 ++++++------
>  1 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 8dca9d8..989ff11 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -439,7 +439,7 @@ F: drivers/hwmon/ams/
>  AMSO1100 RNIC DRIVER
>  M: Tom Tucker
>  M: Steve Wise
> -L: general at lists.openfabrics.org
> +L: linux-rdma at vger.kernel.org
>  S: Maintained
>  F: drivers/infiniband/hw/amso1100/
>
> @@ -1494,7 +1494,7 @@ F: drivers/net/cxgb3/
>
>  CXGB3 IWARP RNIC DRIVER (IW_CXGB3)
>  M: Steve Wise
> -L: general at lists.openfabrics.org
> +L: linux-rdma at vger.kernel.org
>  W: http://www.openfabrics.org
>  S: Supported
>  F: drivers/infiniband/hw/cxgb3/
> @@ -1868,7 +1868,7 @@ F: fs/efs/
>  EHCA (IBM GX bus InfiniBand adapter) DRIVER
>  M: Hoang-Nam Nguyen
>  M: Christoph Raisch
> -L: general at lists.openfabrics.org
> +L: linux-rdma at vger.kernel.org
>  S: Supported
>  F: drivers/infiniband/hw/ehca/
>
> @@ -2552,7 +2552,7 @@ INFINIBAND SUBSYSTEM
>  M: Roland Dreier
>  M: Sean Hefty
>  M: Hal Rosenstock
> -L: general at lists.openfabrics.org (moderated for non-subscribers)
> +L: linux-rdma at vger.kernel.org
>  W: http://www.openib.org/
>  T: git git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git
>  S: Supported
> @@ -2729,7 +2729,7 @@ F: drivers/net/ipg.c
>
>  IPATH DRIVER
>  M: Ralph Campbell
> -L: general at lists.openfabrics.org
> +L: linux-rdma at vger.kernel.org
>  T: git git://git.qlogic.com/ipath-linux-2.6
>  S: Supported
>  F: drivers/infiniband/hw/ipath/
> @@ -3485,7 +3485,7 @@ F: drivers/scsi/NCR_D700.*
>  NETEFFECT IWARP RNIC DRIVER (IW_NES)
>  M: Faisal Latif
>  M: Chien Tung
> -L: general at lists.openfabrics.org
> +L: linux-rdma at vger.kernel.org
>  W: http://www.neteffect.com
>  S: Supported
>  F: drivers/infiniband/hw/nes/
> --
> 1.6.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

--
Jeff Squyres
jsquyres at cisco.com

From bart.vanassche at gmail.com Tue Sep 8 10:01:42 2009
From: bart.vanassche at gmail.com (Bart Van Assche)
Date: Tue, 8 Sep 2009 19:01:42 +0200
Subject: [ofa-general] Re: [NEW PATCH] IB/mad: Fix possible lock-lock-timer deadlock
In-Reply-To:
References:
Message-ID:

On Tue, Sep 8, 2009 at 8:25 AM, Bart Van Assche wrote:
> On Tue, Sep 8, 2009 at 6:21 AM, Roland Dreier wrote:
> > > With 2.6.31-rc9 + patch 4e49627b9bc29a14b393c480e8c979e3bc922ef7 + the
> > > patch you posted at the start of this thread the following lockdep
> > > complaint was triggered on the SRP initiator system during SRP login:
> > >
> > > ======================================================
> > > [ INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected ]
> > > 2.6.31-rc9 #2
> > > ------------------------------------------------------
> > > ibsrpdm/4290 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
> > >  (&(&rmpp_recv->cleanup_work)->timer){+.-...}, at:
> > > [] del_timer_sync+0x0/0xa0
> > >
> > > and this task is already holding:
> > >  (&mad_agent_priv->lock){..-...}, at: []
> > > ib_cancel_rmpp_recvs+0x28/0x118 [ib_mad]
> > > which would create a new lock dependency:
> > >  (&mad_agent_priv->lock){..-...} -> (&(&rmpp_recv->cleanup_work)->timer){+.-...}
> >
> > And this report doesn't happen with the older patch? (Did you do the
> > same testing with the older patch that triggered this)
> >
> > Because this looks like a *different* incarnation of the same
> > lock->lock->delayed work/timer that we're trying to fix here -- the
> > delayed work is now rmpp_recv->cleanup_work in this case instead of
> > mad_agent_priv->timed_work as it was before.
>
> The above issue does not occur with the for-next branch of the
> infiniband git tree, but does occur with 2.6.31-rc9 + aforementioned
> patches.
>
> As far as I can see commit 721d67cdca5b7642b380ca0584de8dceecf6102f
> (http://git.kernel.org/?p=linux/kernel/git/roland/infiniband.git;a=commitdiff;h=721d67cdca5b7642b380ca0584de8dceecf6102f)
> is not yet included in 2.6.31-rc9. Could this be related to the above
> issue ?

Update: patch 721d67cdca5b7642b380ca0584de8dceecf6102f does not apply
cleanly to 2.6.31-rc9, so I have been using a slightly modified
version of this patch
(http://bugzilla.kernel.org/attachment.cgi?id=22624).

I have retested the 2.6.31-rc9 kernel with the following patches applied to it:
* patch 4e49627b9bc29a14b393c480e8c979e3bc922ef7
* http://bugzilla.kernel.org/attachment.cgi?id=22624
* the patch posted at the start of this thread.

With this combination I did not observe any lockdep complaints.

Bart.

From rdreier at cisco.com Tue Sep 8 10:15:35 2009
From: rdreier at cisco.com (Roland Dreier)
Date: Tue, 08 Sep 2009 10:15:35 -0700
Subject: [ofa-general] Re: [NEW PATCH] IB/mad: Fix possible lock-lock-timer deadlock
In-Reply-To: (Bart Van Assche's message of "Tue, 8 Sep 2009 08:25:53 +0200")
References:
Message-ID:

> The above issue does not occur with the for-next branch of the
> infiniband git tree, but does occur with 2.6.31-rc9 + aforementioned
> patches.
>
> As far as I can see commit 721d67cdca5b7642b380ca0584de8dceecf6102f
> (http://git.kernel.org/?p=linux/kernel/git/roland/infiniband.git;a=commitdiff;h=721d67cdca5b7642b380ca0584de8dceecf6102f)
> is not yet included in 2.6.31-rc9. Could this be related to the above
> issue ?

Yes, that would make sense. "priv->lock" -- ie the ipoib lock whose
coverage is reduced in 721d67cd -- is in the lockdep report you posted.
So it seems likely that 721d67cd makes the mad_rmpp report not trigger.
However I think the mad_rmpp code does still have a lock-lock-timer
problem that could cause lockdep reports in the future, so I'll have a
look at fixing it.

Do you happen to have the full lockdep output from this test handy? I'm
curious to see exactly how the mad_rmpp lock gets linked to priv->lock.

Thanks,
Roland

From rdreier at cisco.com Tue Sep 8 10:17:01 2009
From: rdreier at cisco.com (Roland Dreier)
Date: Tue, 08 Sep 2009 10:17:01 -0700
Subject: [ofa-general] Re: [NEW PATCH] IB/mad: Fix possible lock-lock-timer deadlock
In-Reply-To: (Bart Van Assche's message of "Tue, 8 Sep 2009 19:01:42 +0200")
References:
Message-ID:

> Update: patch 721d67cdca5b7642b380ca0584de8dceecf6102f does not apply
> cleanly to 2.6.31-rc9, so I have been using a slightly modified
> version of this patch
> (http://bugzilla.kernel.org/attachment.cgi?id=22624).
>
> I have retested the 2.6.31-rc9 kernel with the following patches applied to it:
> * patch 4e49627b9bc29a14b393c480e8c979e3bc922ef7
> * http://bugzilla.kernel.org/attachment.cgi?id=22624
> * the patch posted at the start of this thread.
>
> With this combination I did not observe any lockdep complaints.

OK, thanks. That makes sense -- the new mad patch should be equivalent
to the old in terms of what it fixes.

From bart.vanassche at gmail.com Tue Sep 8 12:09:42 2009
From: bart.vanassche at gmail.com (Bart Van Assche)
Date: Tue, 8 Sep 2009 21:09:42 +0200
Subject: [ofa-general] Re: [NEW PATCH] IB/mad: Fix possible lock-lock-timer deadlock
In-Reply-To:
References:
Message-ID:

On Tue, Sep 8, 2009 at 7:15 PM, Roland Dreier wrote:
>
> > The above issue does not occur with the for-next branch of the
> > infiniband git tree, but does occur with 2.6.31-rc9 + aforementioned
> > patches.
> >
> > As far as I can see commit 721d67cdca5b7642b380ca0584de8dceecf6102f
> > (http://git.kernel.org/?p=linux/kernel/git/roland/infiniband.git;a=commitdiff;h=721d67cdca5b7642b380ca0584de8dceecf6102f)
> > is not yet included in 2.6.31-rc9. Could this be related to the above
> > issue ?
>
> Yes, that would make sense. "priv->lock" -- ie the ipoib lock whose
> coverage is reduced in 721d67cd -- is in the lockdep report you posted.
> So it seems likely that 721d67cd makes the mad_rmpp report not trigger. > However I think the mad_rmpp code does still have a lock-lock-timer > problem that could cause lockdep reports in the future, so I'll have a > look at fixing it. > > Do you happen to have the full lockdep output from this test handy?  I'm > curious to see exactly how the mad_rmpp lock gets linked to priv->lock. The full lockdep output is as follows: ====================================================== [ INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected ] 2.6.31-rc9 #2 ------------------------------------------------------ ibsrpdm/4290 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: (&(&rmpp_recv->cleanup_work)->timer){+.-...}, at: [] del_timer_sync+0x0/0xa0 and this task is already holding: (&mad_agent_priv->lock){..-...}, at: [] ib_cancel_rmpp_recvs+0x28/0x118 [ib_mad] which would create a new lock dependency: (&mad_agent_priv->lock){..-...} -> (&(&rmpp_recv->cleanup_work)->timer){+.-...} but this new dependency connects a HARDIRQ-irq-safe lock: (&priv->lock){-.-...} ... which became HARDIRQ-irq-safe at: [] 0xffffffffffffffff to a HARDIRQ-irq-unsafe lock: (&(&rmpp_recv->cleanup_work)->timer){+.-...} ... which became HARDIRQ-irq-unsafe at: ... [] 0xffffffffffffffff other info that might help us debug this: 2 locks held by ibsrpdm/4290: #0: (&port->file_mutex){+.+.+.}, at: [] ib_umad_close+0x39/0x120 [ib_umad] #1: (&mad_agent_priv->lock){..-...}, at: [] ib_cancel_rmpp_recvs+0x28/0x118 [ib_mad] the HARDIRQ-irq-safe lock's dependencies: -> (&priv->lock){-.-...} ops: 0 { IN-HARDIRQ-W at: [] 0xffffffffffffffff IN-SOFTIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irq+0x3c/0x50 [] ipoib_mcast_join_task+0x1fe/0x380 [ib_ipoib] [] worker_thread+0x1a3/0x300 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff } ... 
key at: [] __key.42387+0x0/0xffffffffffff8ba3 [ib_ipoib] -> (&n->list_lock){..-...} ops: 0 { IN-SOFTIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock+0x36/0x50 [] add_partial+0x21/0x70 [] __slab_free+0x1ec/0x390 [] kmem_cache_free+0x95/0xf0 [] acpi_os_release_object+0x9/0xd [] acpi_ut_delete_object_desc+0x48/0x4c [] acpi_ut_delete_internal_obj+0x3af/0x3ba [] acpi_ut_update_ref_count+0x187/0x1d9 [] acpi_ut_update_object_reference+0x115/0x18f [] acpi_ut_remove_reference+0x65/0x6c [] acpi_ex_create_method+0x9b/0xaa [] acpi_ds_load1_end_op+0x1ba/0x245 [] acpi_ps_parse_loop+0x786/0x93e [] acpi_ps_parse_aml+0x10d/0x3df [] acpi_ns_one_complete_parse+0x131/0x14c [] acpi_ns_parse_table+0x49/0x8c [] acpi_ns_load_table+0x7a/0x114 [] acpi_load_tables+0x6d/0x15a [] acpi_early_init+0x60/0xf5 [] start_kernel+0x372/0x429 [] x86_64_start_reservations+0x99/0xb9 [] x86_64_start_kernel+0xe0/0xf2 [] 0xffffffffffffffff } ... key at: [] __key.25366+0x0/0x8 ... acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock+0x36/0x50 [] get_partial_node+0x4c/0xf0 [] __slab_alloc+0x105/0x540 [] kmem_cache_alloc+0xf6/0x100 [] ipoib_mcast_alloc+0x25/0xb0 [ib_ipoib] [] ipoib_mcast_restart_task+0x1c7/0x510 [ib_ipoib] [] __ipoib_ib_dev_flush+0xfc/0x250 [ib_ipoib] [] ipoib_ib_dev_flush_normal+0x15/0x20 [ib_ipoib] [] worker_thread+0x1a3/0x300 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff -> (&list->lock#4){..-...} ops: 0 { IN-SOFTIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] 0xffffffffffffffff } ... key at: [] __key.18496+0x0/0xffffffffffff8b87 [ib_ipoib] ... 
acquired at: [] 0xffffffffffffffff -> (&device->client_data_lock){..-...} ops: 0 { IN-SOFTIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] add_client_context+0x55/0xb0 [ib_core] [] ib_register_device+0x439/0x4b0 [ib_core] [] mlx4_ib_add+0x52e/0x600 [mlx4_ib] [] mlx4_add_device+0x3c/0xa0 [mlx4_core] [] mlx4_register_interface+0x6b/0xb0 [mlx4_core] [] 0xffffffffa0419010 [] do_one_initcall+0x3c/0x170 [] sys_init_module+0xaf/0x200 [] system_call_fastpath+0x16/0x1b [] 0xffffffffffffffff } ... key at: [] __key.17696+0x0/0xffffffffffff45cb [ib_core] ... acquired at: [] 0xffffffffffffffff -> (&port->lock){-.-...} ops: 0 { IN-HARDIRQ-W at: [] 0xffffffffffffffff IN-SOFTIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] ib_sa_join_multicast+0x144/0x410 [ib_sa] [] ipoib_mcast_join+0x11f/0x1e0 [ib_ipoib] [] ipoib_mcast_join_task+0xec/0x380 [ib_ipoib] [] worker_thread+0x1a3/0x300 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff } ... key at: [] __key.23380+0x0/0xffffffffffffcc27 [ib_sa] -> (&group->lock){-.-...} ops: 0 { IN-HARDIRQ-W at: [] 0xffffffffffffffff IN-SOFTIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] ib_sa_join_multicast+0x312/0x410 [ib_sa] [] ipoib_mcast_join+0x11f/0x1e0 [ib_ipoib] [] ipoib_mcast_join_task+0xec/0x380 [ib_ipoib] [] worker_thread+0x1a3/0x300 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff } ... key at: [] __key.23161+0x0/0xffffffffffffcc17 [ib_sa] -> (&cwq->lock){-.-...} ops: 0 { IN-HARDIRQ-W at: [] 0xffffffffffffffff IN-SOFTIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] 0xffffffffffffffff } ... 
key at: [] __key.23406+0x0/0x8 -> (&q->lock){-.-.-.} ops: 0 { IN-HARDIRQ-W at: [] 0xffffffffffffffff IN-SOFTIRQ-W at: [] 0xffffffffffffffff IN-RECLAIM_FS-W at: [] __lock_acquire+0x73a/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] prepare_to_wait+0x2c/0x90 [] kswapd+0x105/0x800 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irq+0x3c/0x50 [] wait_for_common+0x43/0x1a0 [] wait_for_completion+0x18/0x20 [] kthread_create+0x9f/0x130 [] migration_call+0x362/0x4fb [] migration_init+0x22/0x58 [] do_one_initcall+0x3c/0x170 [] kernel_init+0x6d/0x1a8 [] child_rip+0xa/0x20 [] 0xffffffffffffffff } ... key at: [] __key.17808+0x0/0x8 -> (&rq->lock){-.-.-.} ops: 0 { IN-HARDIRQ-W at: [] 0xffffffffffffffff IN-SOFTIRQ-W at: [] 0xffffffffffffffff IN-RECLAIM_FS-W at: [] __lock_acquire+0x73a/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock+0x36/0x50 [] task_rq_lock+0x4d/0x90 [] set_cpus_allowed_ptr+0x2a/0x190 [] kswapd+0x84/0x800 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] rq_attach_root+0x26/0x110 [] sched_init+0x2c0/0x436 [] start_kernel+0x16b/0x429 [] x86_64_start_reservations+0x99/0xb9 [] x86_64_start_kernel+0xe0/0xf2 [] 0xffffffffffffffff } ... key at: [] __key.45497+0x0/0x8 -> (&vec->lock){-.-...} ops: 0 { IN-HARDIRQ-W at: [] 0xffffffffffffffff IN-SOFTIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] cpupri_set+0xc6/0x160 [] rq_online_rt+0x47/0x90 [] set_rq_online+0x5e/0x80 [] rq_attach_root+0xe8/0x110 [] sched_init+0x2c0/0x436 [] start_kernel+0x16b/0x429 [] x86_64_start_reservations+0x99/0xb9 [] x86_64_start_kernel+0xe0/0xf2 [] 0xffffffffffffffff } ... key at: [] __key.14614+0x0/0x3c ... 
acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] cpupri_set+0xc6/0x160 [] rq_online_rt+0x47/0x90 [] set_rq_online+0x5e/0x80 [] rq_attach_root+0xe8/0x110 [] sched_init+0x2c0/0x436 [] start_kernel+0x16b/0x429 [] x86_64_start_reservations+0x99/0xb9 [] x86_64_start_kernel+0xe0/0xf2 [] 0xffffffffffffffff -> (&rt_b->rt_runtime_lock){-.....} ops: 0 { IN-HARDIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock+0x36/0x50 [] enqueue_task_rt+0x1ec/0x2a0 [] enqueue_task+0x5b/0x70 [] activate_task+0x28/0x40 [] try_to_wake_up+0x1a8/0x2b0 [] wake_up_process+0x10/0x20 [] migration_call+0x58/0x4fb [] migration_init+0x40/0x58 [] do_one_initcall+0x3c/0x170 [] kernel_init+0x6d/0x1a8 [] child_rip+0xa/0x20 [] 0xffffffffffffffff } ... key at: [] __key.36636+0x0/0x8 -> (&cpu_base->lock){-.-...} ops: 0 { IN-HARDIRQ-W at: [] 0xffffffffffffffff IN-SOFTIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] lock_hrtimer_base+0x2c/0x60 [] __hrtimer_start_range_ns+0x37/0x290 [] enqueue_task_rt+0x242/0x2a0 [] enqueue_task+0x5b/0x70 [] activate_task+0x28/0x40 [] try_to_wake_up+0x1a8/0x2b0 [] wake_up_process+0x10/0x20 [] migration_call+0x58/0x4fb [] migration_init+0x40/0x58 [] do_one_initcall+0x3c/0x170 [] kernel_init+0x6d/0x1a8 [] child_rip+0xa/0x20 [] 0xffffffffffffffff } ... key at: [] __key.19841+0x0/0x8 ... 
acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] lock_hrtimer_base+0x2c/0x60 [] __hrtimer_start_range_ns+0x37/0x290 [] enqueue_task_rt+0x242/0x2a0 [] enqueue_task+0x5b/0x70 [] activate_task+0x28/0x40 [] try_to_wake_up+0x1a8/0x2b0 [] wake_up_process+0x10/0x20 [] migration_call+0x58/0x4fb [] migration_init+0x40/0x58 [] do_one_initcall+0x3c/0x170 [] kernel_init+0x6d/0x1a8 [] child_rip+0xa/0x20 [] 0xffffffffffffffff -> (&rt_rq->rt_runtime_lock){-.....} ops: 0 { IN-HARDIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock+0x36/0x50 [] update_curr_rt+0xf1/0x190 [] dequeue_task_rt+0x1f/0x80 [] dequeue_task+0xb5/0xf0 [] deactivate_task+0x28/0x40 [] thread_return+0x11f/0x8bc [] schedule+0x13/0x40 [] migration_thread+0x1c8/0x2c0 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff } ... key at: [] __key.45477+0x0/0x8 ... acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock+0x36/0x50 [] __enable_runtime+0x39/0x80 [] rq_online_rt+0x28/0x90 [] set_rq_online+0x5e/0x80 [] migration_call+0x8d/0x4fb [] notifier_call_chain+0x3f/0x80 [] raw_notifier_call_chain+0x11/0x20 [] _cpu_up+0x126/0x12c [] cpu_up+0x77/0x89 [] kernel_init+0xe2/0x1a8 [] child_rip+0xa/0x20 [] 0xffffffffffffffff ... acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock+0x36/0x50 [] enqueue_task_rt+0x1ec/0x2a0 [] enqueue_task+0x5b/0x70 [] activate_task+0x28/0x40 [] try_to_wake_up+0x1a8/0x2b0 [] wake_up_process+0x10/0x20 [] migration_call+0x58/0x4fb [] migration_init+0x40/0x58 [] do_one_initcall+0x3c/0x170 [] kernel_init+0x6d/0x1a8 [] child_rip+0xa/0x20 [] 0xffffffffffffffff ... 
acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock+0x36/0x50 [] update_curr_rt+0xf1/0x190 [] dequeue_task_rt+0x1f/0x80 [] dequeue_task+0xb5/0xf0 [] deactivate_task+0x28/0x40 [] thread_return+0x11f/0x8bc [] schedule+0x13/0x40 [] migration_thread+0x1c8/0x2c0 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff -> (&rq->lock/1){..-...} ops: 0 { IN-SOFTIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_nested+0x41/0x60 [] double_rq_lock+0x72/0x90 [] __migrate_task+0x6f/0x120 [] migration_thread+0x9d/0x2c0 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff } ... key at: [] __key.45497+0x1/0x8 ... acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_nested+0x41/0x60 [] double_rq_lock+0x72/0x90 [] __migrate_task+0x6f/0x120 [] migration_thread+0x9d/0x2c0 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff -> (&sig->cputimer.lock){......} ops: 0 { INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] thread_group_cputimer+0x38/0xf0 [] posix_cpu_timers_exit_group+0x15/0x40 [] release_task+0x2b8/0x3f0 [] do_exit+0x58d/0x790 [] __module_put_and_exit+0x19/0x20 [] cryptomgr_test+0x32/0x50 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff } ... key at: [] __key.15508+0x0/0x8 ... acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock+0x36/0x50 [] update_curr+0x118/0x140 [] dequeue_task_fair+0x4d/0x280 [] dequeue_task+0xb5/0xf0 [] deactivate_task+0x28/0x40 [] thread_return+0x11f/0x8bc [] schedule+0x13/0x40 [] do_exit+0x536/0x790 [] do_group_exit+0x3e/0xb0 [] sys_exit_group+0x12/0x20 [] system_call_fastpath+0x16/0x1b [] 0xffffffffffffffff ... 
acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock+0x36/0x50 [] task_rq_lock+0x4d/0x90 [] try_to_wake_up+0x3f/0x2b0 [] default_wake_function+0xd/0x10 [] __wake_up_common+0x5a/0x90 [] complete+0x3f/0x60 [] kthreadd+0xb0/0x160 [] child_rip+0xa/0x20 [] 0xffffffffffffffff -> (&ep->lock){......} ops: 0 { INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] sys_epoll_ctl+0x380/0x510 [] system_call_fastpath+0x16/0x1b [] 0xffffffffffffffff } ... key at: [] __key.22538+0x0/0x10 ... acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock+0x36/0x50 [] task_rq_lock+0x4d/0x90 [] try_to_wake_up+0x3f/0x2b0 [] default_wake_function+0xd/0x10 [] __wake_up_common+0x5a/0x90 [] __wake_up_locked+0x13/0x20 [] ep_poll_callback+0x8d/0x120 [] __wake_up_common+0x5a/0x90 [] __wake_up_sync_key+0x4e/0x70 [] sock_def_readable+0x43/0x80 [] unix_stream_connect+0x44a/0x470 [] sys_connect+0x71/0xb0 [] system_call_fastpath+0x16/0x1b [] 0xffffffffffffffff ... acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] ep_poll_callback+0x2e/0x120 [] __wake_up_common+0x5a/0x90 [] __wake_up_sync_key+0x4e/0x70 [] sock_def_readable+0x43/0x80 [] unix_stream_connect+0x44a/0x470 [] sys_connect+0x71/0xb0 [] system_call_fastpath+0x16/0x1b [] 0xffffffffffffffff ... acquired at: [] 0xffffffffffffffff ... acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] __queue_work+0x1f/0x50 [] queue_work_on+0x4f/0x60 [] queue_work+0x29/0x60 [] ib_sa_join_multicast+0x35d/0x410 [ib_sa] [] ipoib_mcast_join+0x11f/0x1e0 [ib_ipoib] [] ipoib_mcast_join_task+0xec/0x380 [ib_ipoib] [] worker_thread+0x1a3/0x300 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff ... 
acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock+0x36/0x50 [] mcast_groups_event+0x7a/0xd0 [ib_sa] [] mcast_event_handler+0x41/0x70 [ib_sa] [] ib_dispatch_event+0x39/0x70 [ib_core] [] mlx4_ib_process_mad+0x4a7/0x500 [mlx4_ib] [] ib_mad_completion_handler+0x302/0x760 [ib_mad] [] worker_thread+0x1a3/0x300 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff ... acquired at: [] 0xffffffffffffffff ... acquired at: [] 0xffffffffffffffff -> (&qp->sq.lock){-.-...} ops: 0 { IN-HARDIRQ-W at: [] 0xffffffffffffffff IN-SOFTIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] mlx4_ib_post_send+0x39/0xc50 [mlx4_ib] [] ib_send_mad+0x165/0x3a0 [ib_mad] [] ib_post_send_mad+0x14c/0x720 [ib_mad] [] send_mad+0xb4/0x110 [ib_sa] [] ib_sa_mcmember_rec_query+0x15b/0x1c0 [ib_sa] [] mcast_work_handler+0x10c/0x8c0 [ib_sa] [] worker_thread+0x1a3/0x300 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff } ... key at: [] __key.21638+0x0/0xffffffffffffcf32 [mlx4_ib] ... 
acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] mlx4_ib_post_send+0x39/0xc50 [mlx4_ib] [] ipoib_send+0x469/0x850 [ib_ipoib] [] ipoib_mcast_send+0x1aa/0x3f0 [ib_ipoib] [] ipoib_path_lookup+0x122/0x2e0 [ib_ipoib] [] ipoib_start_xmit+0x17d/0x440 [ib_ipoib] [] dev_hard_start_xmit+0x2bd/0x340 [] __qdisc_run+0x25e/0x2b0 [] dev_queue_xmit+0x2f0/0x4c0 [] neigh_connected_output+0xa9/0xe0 [] ip_finish_output+0x2e6/0x340 [] ip_mc_output+0x220/0x260 [] ip_local_out+0x20/0x30 [] ip_push_pending_frames+0x2ec/0x450 [] udp_push_pending_frames+0x13f/0x400 [] udp_sendmsg+0x33f/0x770 [] inet_sendmsg+0x45/0x80 [] sock_sendmsg+0xdf/0x110 [] sys_sendmsg+0x189/0x320 [] system_call_fastpath+0x16/0x1b [] 0xffffffffffffffff -> (&base->lock){..-...} ops: 0 { IN-SOFTIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] lock_timer_base+0x36/0x70 [] mod_timer+0x3c/0x110 [] con_init+0x272/0x277 [] console_init+0x22/0x36 [] start_kernel+0x25d/0x429 [] x86_64_start_reservations+0x99/0xb9 [] x86_64_start_kernel+0xe0/0xf2 [] 0xffffffffffffffff } ... key at: [] __key.23060+0x0/0x8 ... 
acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] lock_timer_base+0x36/0x70 [] mod_timer+0x3c/0x110 [] add_timer+0x13/0x20 [] queue_delayed_work_on+0xa3/0xd0 [] queue_delayed_work+0x1c/0x30 [] ipoib_mcast_join_complete+0x1de/0x240 [ib_ipoib] [] join_handler+0x1b2/0x1f0 [ib_sa] [] ib_sa_mcmember_rec_callback+0x6e/0x70 [ib_sa] [] send_handler+0xac/0xd0 [ib_sa] [] timeout_sends+0xd2/0x1d0 [ib_mad] [] worker_thread+0x1a3/0x300 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff -> (&sa_dev->port[i].ah_lock){-.-...} ops: 0 { IN-HARDIRQ-W at: [] 0xffffffffffffffff IN-SOFTIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irq+0x3c/0x50 [] update_sm_ah+0xf5/0x170 [ib_sa] [] ib_sa_add_one+0x21b/0x250 [ib_sa] [] ib_register_device+0x443/0x4b0 [ib_core] [] mlx4_ib_add+0x52e/0x600 [mlx4_ib] [] mlx4_add_device+0x3c/0xa0 [mlx4_core] [] mlx4_register_interface+0x6b/0xb0 [mlx4_core] [] 0xffffffffa0419010 [] do_one_initcall+0x3c/0x170 [] sys_init_module+0xaf/0x200 [] system_call_fastpath+0x16/0x1b [] 0xffffffffffffffff } ... key at: [] __key.18811+0x0/0xffffffffffffccbf [ib_sa] ... acquired at: [] 0xffffffffffffffff -> (&tid_lock){..-...} ops: 0 { IN-SOFTIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] init_mad+0x2e/0x70 [ib_sa] [] ib_sa_mcmember_rec_query+0xfa/0x1c0 [ib_sa] [] mcast_work_handler+0x10c/0x8c0 [ib_sa] [] worker_thread+0x1a3/0x300 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff } ... key at: [] __key.18872+0x0/0xffffffffffffccb7 [ib_sa] ... 
acquired at: [] 0xffffffffffffffff -> (query_idr.lock){..-...} ops: 0 { IN-SOFTIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] idr_pre_get+0x30/0x90 [] send_mad+0x2f/0x110 [ib_sa] [] ib_sa_mcmember_rec_query+0x15b/0x1c0 [ib_sa] [] mcast_work_handler+0x10c/0x8c0 [ib_sa] [] worker_thread+0x1a3/0x300 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff } ... key at: [] query_idr+0x30/0xffffffffffffcf47 [ib_sa] ... acquired at: [] 0xffffffffffffffff -> (&idr_lock){..-...} ops: 0 { IN-SOFTIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] send_mad+0x43/0x110 [ib_sa] [] ib_sa_mcmember_rec_query+0x15b/0x1c0 [ib_sa] [] mcast_work_handler+0x10c/0x8c0 [ib_sa] [] worker_thread+0x1a3/0x300 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff } ... key at: [] __key.18871+0x0/0xffffffffffffccaf [ib_sa] ... acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] get_from_free_list+0x23/0x60 [] idr_get_empty_slot+0x2a5/0x2d0 [] idr_get_new_above_int+0x1c/0x90 [] idr_get_new+0x13/0x40 [] send_mad+0x58/0x110 [ib_sa] [] ib_sa_mcmember_rec_query+0x15b/0x1c0 [ib_sa] [] mcast_work_handler+0x10c/0x8c0 [ib_sa] [] worker_thread+0x1a3/0x300 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff ... acquired at: [] 0xffffffffffffffff -> (&mad_agent_priv->lock){..-...} ops: 0 { IN-SOFTIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] ib_post_send_mad+0xe2/0x720 [ib_mad] [] send_mad+0xb4/0x110 [ib_sa] [] ib_sa_mcmember_rec_query+0x15b/0x1c0 [ib_sa] [] mcast_work_handler+0x10c/0x8c0 [ib_sa] [] worker_thread+0x1a3/0x300 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff } ... key at: [] __key.18167+0x0/0xffffffffffffc74a [ib_mad] ... 
acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] lock_timer_base+0x36/0x70 [] mod_timer+0x3c/0x110 [] add_timer+0x13/0x20 [] queue_delayed_work_on+0xa3/0xd0 [] queue_delayed_work+0x1c/0x30 [] wait_for_response+0xea/0xf0 [ib_mad] [] ib_mad_complete_send_wr+0x106/0x250 [ib_mad] [] ib_mad_send_done_handler+0xf3/0x240 [ib_mad] [] ib_mad_completion_handler+0x143/0x760 [ib_mad] [] worker_thread+0x1a3/0x300 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff ... acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] __queue_work+0x1f/0x50 [] queue_work_on+0x4f/0x60 [] queue_work+0x29/0x60 [] queue_delayed_work+0x25/0x30 [] wait_for_response+0xea/0xf0 [ib_mad] [] ib_reset_mad_timeout+0x22/0x30 [ib_mad] [] ib_modify_mad+0x12a/0x190 [ib_mad] [] ib_cancel_mad+0xb/0x10 [ib_mad] [] cm_work_handler+0x6ff/0x103c [ib_cm] [] worker_thread+0x1a3/0x300 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff -> (&(&rmpp_recv->timeout_work)->timer){......} ops: 0 { INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] del_timer_sync+0x3d/0xa0 [] ib_cancel_rmpp_recvs+0x49/0x118 [ib_mad] [] ib_unregister_mad_agent+0x385/0x580 [ib_mad] [] ib_umad_close+0xd2/0x120 [ib_umad] [] __fput+0xd0/0x1e0 [] fput+0x1d/0x30 [] filp_close+0x5b/0x90 [] put_files_struct+0x84/0xe0 [] exit_files+0x4e/0x60 [] do_exit+0x709/0x790 [] do_group_exit+0x3e/0xb0 [] sys_exit_group+0x12/0x20 [] system_call_fastpath+0x16/0x1b [] 0xffffffffffffffff } ... key at: [] __key.18191+0x0/0xffffffffffffc6ce [ib_mad] ... 
acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] del_timer_sync+0x3d/0xa0 [] ib_cancel_rmpp_recvs+0x49/0x118 [ib_mad] [] ib_unregister_mad_agent+0x385/0x580 [ib_mad] [] ib_umad_close+0xd2/0x120 [ib_umad] [] __fput+0xd0/0x1e0 [] fput+0x1d/0x30 [] filp_close+0x5b/0x90 [] put_files_struct+0x84/0xe0 [] exit_files+0x4e/0x60 [] do_exit+0x709/0x790 [] do_group_exit+0x3e/0xb0 [] sys_exit_group+0x12/0x20 [] system_call_fastpath+0x16/0x1b [] 0xffffffffffffffff ... acquired at: [] 0xffffffffffffffff -> (&mad_queue->lock){..-...} ops: 0 { IN-SOFTIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] ib_mad_post_receive_mads+0xaf/0x2c0 [ib_mad] [] ib_mad_init_device+0x237/0x640 [ib_mad] [] ib_register_device+0x443/0x4b0 [ib_core] [] mlx4_ib_add+0x52e/0x600 [mlx4_ib] [] mlx4_add_device+0x3c/0xa0 [mlx4_core] [] mlx4_register_interface+0x6b/0xb0 [mlx4_core] [] 0xffffffffa0419010 [] do_one_initcall+0x3c/0x170 [] sys_init_module+0xaf/0x200 [] system_call_fastpath+0x16/0x1b [] 0xffffffffffffffff } ... key at: [] __key.20179+0x0/0xffffffffffffc776 [ib_mad] ... acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] mlx4_ib_post_send+0x39/0xc50 [mlx4_ib] [] ib_send_mad+0x165/0x3a0 [ib_mad] [] ib_post_send_mad+0x14c/0x720 [ib_mad] [] send_mad+0xb4/0x110 [ib_sa] [] ib_sa_mcmember_rec_query+0x15b/0x1c0 [ib_sa] [] mcast_work_handler+0x10c/0x8c0 [ib_sa] [] worker_thread+0x1a3/0x300 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff ... acquired at: [] 0xffffffffffffffff ... 
acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] complete+0x23/0x60 [] path_rec_completion+0x4b4/0x540 [ib_ipoib] [] ib_sa_path_rec_callback+0x4b/0x70 [ib_sa] [] recv_handler+0x37/0x70 [ib_sa] [] ib_mad_completion_handler+0x63e/0x760 [ib_mad] [] worker_thread+0x1a3/0x300 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff ... acquired at: [] 0xffffffffffffffff -> (&cq->lock){..-...} ops: 0 { IN-SOFTIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] mlx4_ib_poll_cq+0x2d/0x820 [mlx4_ib] [] ib_mad_completion_handler+0x57/0x760 [ib_mad] [] worker_thread+0x1a3/0x300 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff } ... key at: [] __key.20459+0x0/0xffffffffffffcf66 [mlx4_ib] -> (&srq->lock){..-...} ops: 0 { IN-SOFTIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] mlx4_ib_post_srq_recv+0x2c/0x1a0 [mlx4_ib] [] ipoib_cm_post_receive_srq+0x98/0x120 [ib_ipoib] [] ipoib_cm_dev_init+0x442/0x610 [ib_ipoib] [] ipoib_transport_dev_init+0xa7/0x3b0 [ib_ipoib] [] ipoib_ib_dev_init+0x3e/0xd0 [ib_ipoib] [] ipoib_dev_init+0x9f/0x120 [ib_ipoib] [] ipoib_add_one+0x2aa/0x4d0 [ib_ipoib] [] ib_register_client+0x86/0xb0 [ib_core] [] 0xffffffffa044a103 [] do_one_initcall+0x3c/0x170 [] sys_init_module+0xaf/0x200 [] system_call_fastpath+0x16/0x1b [] 0xffffffffffffffff } ... key at: [] __key.20091+0x0/0xffffffffffffcf22 [mlx4_ib] ... 
acquired at: [] 0xffffffffffffffff -> (&qp_table->lock){-.....} ops: 0 { IN-HARDIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irq+0x3c/0x50 [] mlx4_qp_alloc+0x102/0x1b0 [mlx4_core] [] create_qp_common+0x42d/0x990 [mlx4_ib] [] mlx4_ib_create_qp+0x124/0x1a0 [mlx4_ib] [] ib_create_qp+0x18/0xa0 [ib_core] [] create_mad_qp+0x7a/0xd0 [ib_mad] [] ib_mad_init_device+0x382/0x640 [ib_mad] [] ib_register_device+0x443/0x4b0 [ib_core] [] mlx4_ib_add+0x52e/0x600 [mlx4_ib] [] mlx4_add_device+0x3c/0xa0 [mlx4_core] [] mlx4_register_interface+0x6b/0xb0 [mlx4_core] [] 0xffffffffa0419010 [] do_one_initcall+0x3c/0x170 [] sys_init_module+0xaf/0x200 [] system_call_fastpath+0x16/0x1b [] 0xffffffffffffffff } ... key at: [] __key.18883+0x0/0xffffffffffff67c2 [mlx4_core] ... acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] mlx4_qp_remove+0x30/0x80 [mlx4_core] [] mlx4_ib_destroy_qp+0x340/0x390 [mlx4_ib] [] ib_destroy_qp+0x34/0x80 [ib_core] [] ipoib_cm_tx_reap+0x1fa/0x5b0 [ib_ipoib] [] worker_thread+0x1a3/0x300 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff ... 
acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] mlx4_ib_poll_cq+0x2d/0x820 [mlx4_ib] [] poll_tx+0x35/0x1b0 [ib_ipoib] [] ipoib_send+0x5a5/0x850 [ib_ipoib] [] ipoib_mcast_send+0x1aa/0x3f0 [ib_ipoib] [] ipoib_path_lookup+0x122/0x2e0 [ib_ipoib] [] ipoib_start_xmit+0x17d/0x440 [ib_ipoib] [] dev_hard_start_xmit+0x2bd/0x340 [] __qdisc_run+0x25e/0x2b0 [] dev_queue_xmit+0x2f0/0x4c0 [] neigh_connected_output+0xa9/0xe0 [] ip_finish_output+0x2e6/0x340 [] ip_mc_output+0x220/0x260 [] ip_local_out+0x20/0x30 [] ip_push_pending_frames+0x2ec/0x450 [] udp_push_pending_frames+0x13f/0x400 [] udp_sendmsg+0x33f/0x770 [] inet_sendmsg+0x45/0x80 [] sock_sendmsg+0xdf/0x110 [] sys_sendmsg+0x189/0x320 [] system_call_fastpath+0x16/0x1b [] 0xffffffffffffffff the HARDIRQ-irq-unsafe lock's dependencies: -> (&(&rmpp_recv->cleanup_work)->timer){+.-...} ops: 0 { HARDIRQ-ON-W at: [] 0xffffffffffffffff IN-SOFTIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] 0xffffffffffffffff } ... key at: [] __key.18194+0x0/0xffffffffffffc6de [ib_mad] -> (&cwq->lock){-.-...} ops: 0 { IN-HARDIRQ-W at: [] 0xffffffffffffffff IN-SOFTIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] 0xffffffffffffffff } ... key at: [] __key.23406+0x0/0x8 -> (&q->lock){-.-.-.} ops: 0 { IN-HARDIRQ-W at: [] 0xffffffffffffffff IN-SOFTIRQ-W at: [] 0xffffffffffffffff IN-RECLAIM_FS-W at: [] __lock_acquire+0x73a/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] prepare_to_wait+0x2c/0x90 [] kswapd+0x105/0x800 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irq+0x3c/0x50 [] wait_for_common+0x43/0x1a0 [] wait_for_completion+0x18/0x20 [] kthread_create+0x9f/0x130 [] migration_call+0x362/0x4fb [] migration_init+0x22/0x58 [] do_one_initcall+0x3c/0x170 [] kernel_init+0x6d/0x1a8 [] child_rip+0xa/0x20 [] 0xffffffffffffffff } ... 
key at: [] __key.17808+0x0/0x8 -> (&rq->lock){-.-.-.} ops: 0 { IN-HARDIRQ-W at: [] 0xffffffffffffffff IN-SOFTIRQ-W at: [] 0xffffffffffffffff IN-RECLAIM_FS-W at: [] __lock_acquire+0x73a/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock+0x36/0x50 [] task_rq_lock+0x4d/0x90 [] set_cpus_allowed_ptr+0x2a/0x190 [] kswapd+0x84/0x800 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] rq_attach_root+0x26/0x110 [] sched_init+0x2c0/0x436 [] start_kernel+0x16b/0x429 [] x86_64_start_reservations+0x99/0xb9 [] x86_64_start_kernel+0xe0/0xf2 [] 0xffffffffffffffff } ... key at: [] __key.45497+0x0/0x8 -> (&vec->lock){-.-...} ops: 0 { IN-HARDIRQ-W at: [] 0xffffffffffffffff IN-SOFTIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] cpupri_set+0xc6/0x160 [] rq_online_rt+0x47/0x90 [] set_rq_online+0x5e/0x80 [] rq_attach_root+0xe8/0x110 [] sched_init+0x2c0/0x436 [] start_kernel+0x16b/0x429 [] x86_64_start_reservations+0x99/0xb9 [] x86_64_start_kernel+0xe0/0xf2 [] 0xffffffffffffffff } ... key at: [] __key.14614+0x0/0x3c ... 
acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] cpupri_set+0xc6/0x160 [] rq_online_rt+0x47/0x90 [] set_rq_online+0x5e/0x80 [] rq_attach_root+0xe8/0x110 [] sched_init+0x2c0/0x436 [] start_kernel+0x16b/0x429 [] x86_64_start_reservations+0x99/0xb9 [] x86_64_start_kernel+0xe0/0xf2 [] 0xffffffffffffffff -> (&rt_b->rt_runtime_lock){-.....} ops: 0 { IN-HARDIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock+0x36/0x50 [] enqueue_task_rt+0x1ec/0x2a0 [] enqueue_task+0x5b/0x70 [] activate_task+0x28/0x40 [] try_to_wake_up+0x1a8/0x2b0 [] wake_up_process+0x10/0x20 [] migration_call+0x58/0x4fb [] migration_init+0x40/0x58 [] do_one_initcall+0x3c/0x170 [] kernel_init+0x6d/0x1a8 [] child_rip+0xa/0x20 [] 0xffffffffffffffff } ... key at: [] __key.36636+0x0/0x8 -> (&cpu_base->lock){-.-...} ops: 0 { IN-HARDIRQ-W at: [] 0xffffffffffffffff IN-SOFTIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] lock_hrtimer_base+0x2c/0x60 [] __hrtimer_start_range_ns+0x37/0x290 [] enqueue_task_rt+0x242/0x2a0 [] enqueue_task+0x5b/0x70 [] activate_task+0x28/0x40 [] try_to_wake_up+0x1a8/0x2b0 [] wake_up_process+0x10/0x20 [] migration_call+0x58/0x4fb [] migration_init+0x40/0x58 [] do_one_initcall+0x3c/0x170 [] kernel_init+0x6d/0x1a8 [] child_rip+0xa/0x20 [] 0xffffffffffffffff } ... key at: [] __key.19841+0x0/0x8 ... 
acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] lock_hrtimer_base+0x2c/0x60 [] __hrtimer_start_range_ns+0x37/0x290 [] enqueue_task_rt+0x242/0x2a0 [] enqueue_task+0x5b/0x70 [] activate_task+0x28/0x40 [] try_to_wake_up+0x1a8/0x2b0 [] wake_up_process+0x10/0x20 [] migration_call+0x58/0x4fb [] migration_init+0x40/0x58 [] do_one_initcall+0x3c/0x170 [] kernel_init+0x6d/0x1a8 [] child_rip+0xa/0x20 [] 0xffffffffffffffff -> (&rt_rq->rt_runtime_lock){-.....} ops: 0 { IN-HARDIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock+0x36/0x50 [] update_curr_rt+0xf1/0x190 [] dequeue_task_rt+0x1f/0x80 [] dequeue_task+0xb5/0xf0 [] deactivate_task+0x28/0x40 [] thread_return+0x11f/0x8bc [] schedule+0x13/0x40 [] migration_thread+0x1c8/0x2c0 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff } ... key at: [] __key.45477+0x0/0x8 ... acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock+0x36/0x50 [] __enable_runtime+0x39/0x80 [] rq_online_rt+0x28/0x90 [] set_rq_online+0x5e/0x80 [] migration_call+0x8d/0x4fb [] notifier_call_chain+0x3f/0x80 [] raw_notifier_call_chain+0x11/0x20 [] _cpu_up+0x126/0x12c [] cpu_up+0x77/0x89 [] kernel_init+0xe2/0x1a8 [] child_rip+0xa/0x20 [] 0xffffffffffffffff ... acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock+0x36/0x50 [] enqueue_task_rt+0x1ec/0x2a0 [] enqueue_task+0x5b/0x70 [] activate_task+0x28/0x40 [] try_to_wake_up+0x1a8/0x2b0 [] wake_up_process+0x10/0x20 [] migration_call+0x58/0x4fb [] migration_init+0x40/0x58 [] do_one_initcall+0x3c/0x170 [] kernel_init+0x6d/0x1a8 [] child_rip+0xa/0x20 [] 0xffffffffffffffff ... 
acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock+0x36/0x50 [] update_curr_rt+0xf1/0x190 [] dequeue_task_rt+0x1f/0x80 [] dequeue_task+0xb5/0xf0 [] deactivate_task+0x28/0x40 [] thread_return+0x11f/0x8bc [] schedule+0x13/0x40 [] migration_thread+0x1c8/0x2c0 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff -> (&rq->lock/1){..-...} ops: 0 { IN-SOFTIRQ-W at: [] 0xffffffffffffffff INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_nested+0x41/0x60 [] double_rq_lock+0x72/0x90 [] __migrate_task+0x6f/0x120 [] migration_thread+0x9d/0x2c0 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff } ... key at: [] __key.45497+0x1/0x8 ... acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_nested+0x41/0x60 [] double_rq_lock+0x72/0x90 [] __migrate_task+0x6f/0x120 [] migration_thread+0x9d/0x2c0 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff -> (&sig->cputimer.lock){......} ops: 0 { INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] thread_group_cputimer+0x38/0xf0 [] posix_cpu_timers_exit_group+0x15/0x40 [] release_task+0x2b8/0x3f0 [] do_exit+0x58d/0x790 [] __module_put_and_exit+0x19/0x20 [] cryptomgr_test+0x32/0x50 [] kthread+0x56/0x90 [] child_rip+0xa/0x20 [] 0xffffffffffffffff } ... key at: [] __key.15508+0x0/0x8 ... acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock+0x36/0x50 [] update_curr+0x118/0x140 [] dequeue_task_fair+0x4d/0x280 [] dequeue_task+0xb5/0xf0 [] deactivate_task+0x28/0x40 [] thread_return+0x11f/0x8bc [] schedule+0x13/0x40 [] do_exit+0x536/0x790 [] do_group_exit+0x3e/0xb0 [] sys_exit_group+0x12/0x20 [] system_call_fastpath+0x16/0x1b [] 0xffffffffffffffff ... 
acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock+0x36/0x50 [] task_rq_lock+0x4d/0x90 [] try_to_wake_up+0x3f/0x2b0 [] default_wake_function+0xd/0x10 [] __wake_up_common+0x5a/0x90 [] complete+0x3f/0x60 [] kthreadd+0xb0/0x160 [] child_rip+0xa/0x20 [] 0xffffffffffffffff -> (&ep->lock){......} ops: 0 { INITIAL USE at: [] __lock_acquire+0x171/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] sys_epoll_ctl+0x380/0x510 [] system_call_fastpath+0x16/0x1b [] 0xffffffffffffffff } ... key at: [] __key.22538+0x0/0x10 ... acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock+0x36/0x50 [] task_rq_lock+0x4d/0x90 [] try_to_wake_up+0x3f/0x2b0 [] default_wake_function+0xd/0x10 [] __wake_up_common+0x5a/0x90 [] __wake_up_locked+0x13/0x20 [] ep_poll_callback+0x8d/0x120 [] __wake_up_common+0x5a/0x90 [] __wake_up_sync_key+0x4e/0x70 [] sock_def_readable+0x43/0x80 [] unix_stream_connect+0x44a/0x470 [] sys_connect+0x71/0xb0 [] system_call_fastpath+0x16/0x1b [] 0xffffffffffffffff ... acquired at: [] __lock_acquire+0x1135/0x1b50 [] lock_acquire+0x56/0x80 [] _spin_lock_irqsave+0x41/0x60 [] ep_poll_callback+0x2e/0x120 [] __wake_up_common+0x5a/0x90 [] __wake_up_sync_key+0x4e/0x70 [] sock_def_readable+0x43/0x80 [] unix_stream_connect+0x44a/0x470 [] sys_connect+0x71/0xb0 [] system_call_fastpath+0x16/0x1b [] 0xffffffffffffffff ... acquired at: [] 0xffffffffffffffff ... acquired at: [] 0xffffffffffffffff stack backtrace: Pid: 4290, comm: ibsrpdm Not tainted 2.6.31-rc9 #2 Call Trace: [] check_usage+0x3ba/0x470 [] check_irq_usage+0x64/0x100 [] __lock_acquire+0xf72/0x1b50 [] lock_acquire+0x56/0x80 [] ? del_timer_sync+0x0/0xa0 [] del_timer_sync+0x3d/0xa0 [] ? del_timer_sync+0x0/0xa0 [] ib_cancel_rmpp_recvs+0x62/0x118 [ib_mad] [] ib_unregister_mad_agent+0x385/0x580 [ib_mad] [] ? 
mark_held_locks+0x6c/0x90
 [] ib_umad_close+0xd2/0x120 [ib_umad]
 [] __fput+0xd0/0x1e0
 [] fput+0x1d/0x30
 [] filp_close+0x5b/0x90
 [] put_files_struct+0x84/0xe0
 [] exit_files+0x4e/0x60
 [] do_exit+0x709/0x790
 [] ? up_read+0x26/0x30
 [] ? retint_swapgs+0xe/0x13
 [] do_group_exit+0x3e/0xb0
 [] sys_exit_group+0x12/0x20
 [] system_call_fastpath+0x16/0x1b
scsi host6: ib_srp: new target: id_ext 0002c9030003cca2 ioc_guid 0002c9030003cca2 pkey ffff service_id 0002c9030003cca2 dgid fe80:0000:0000:0000:0002:c903:0003:cca3
scsi6 : SRP.T10:0002C9030003CCA2
scsi 6:0:0:0: Direct-Access SCST_FIO disk01 102 PQ: 0 ANSI: 5
sd 6:0:0:0: Attached scsi generic sg2 type 0
sd 6:0:0:0: [sdb] 2097152 512-byte hardware sectors: (1.07 GB/1.00 GiB)
sd 6:0:0:0: [sdb] Write Protect is off
sd 6:0:0:0: [sdb] Mode Sense: 83 00 10 08
sd 6:0:0:0: [sdb] Write cache: disabled, read cache: enabled, supports DPO and FUA
 sdb: unknown partition table
sd 6:0:0:0: [sdb] Attached SCSI disk

From jon at opengridcomputing.com Tue Sep 8 13:42:41 2009
From: jon at opengridcomputing.com (Jon Mason)
Date: Tue, 8 Sep 2009 15:42:41 -0500
Subject: [ofa-general] Bad interaction between ofed, NFS and Gaussian
In-Reply-To: <1251974772.12806.1393.camel@pc.interlinx.bc.ca>
References: <20090902163914.GA23358@lapinou.lsd.univ-montp2.fr> <4A9EC972.9050106@mellanox.co.il> <4A9EE085.9050102@nasa.gov> <1251974772.12806.1393.camel@pc.interlinx.bc.ca>
Message-ID: <20090908204241.GI4090@opengridcomputing.com>

On Thu, Sep 03, 2009 at 06:46:12AM -0400, Brian J. Murrell wrote:
> On Wed, 2009-09-02 at 14:15 -0700, Jeff Becker wrote:
> >
> > Does it also happen with OFED 1.5 alpha? Thanks.
>
> Was bug 1671 landed to 1.5?

No, it's still in my local tree. I was waiting for you to verify it on
OFED 1.5 before I committed it.

>
> > Tziporet Koren wrote:
> >
> > > It seems issues of NFS/RDMA backports.
> > > Can you install OFED without NFS/RDMA?
>
> But in 1.4.x, simply disabling NFS/RDMA does not prevent use of the
> backport headers that NFS/RDMA brings in. That's why I filed bug 1671.

It all depends on the application, but I doubt that any application
aside from Lustre has the header file implications.

> Jon Mason provided me a patch for bug 1671 which I backported to 1.4.x
> and his patch resolved problems I was having because I was getting
> NFS/RDMA headers even though I had selected to not build NFS/RDMA.
>
> Jon: can you attach your patch to bug 1671, just so it's on record?

I'll send it out as an RFC shortly, and push it when you confirm it
solves the issue on OFED 1.5. Sound reasonable?

Thanks,
Jon

> Maybe that patch will solve OP's problems, although I have to admit
> having jumped into this thread late and not fully understanding its
> origins.
>
> b.
>
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

From swise at opengridcomputing.com Tue Sep 8 14:30:01 2009
From: swise at opengridcomputing.com (Steve Wise)
Date: Tue, 08 Sep 2009 16:30:01 -0500
Subject: [ofa-general] [PATCH 1/2] RDMA/cxgb3: Don't ignore insert_handle() failures.
Message-ID: <20090908213001.15369.15629.stgit@build.ogc.int> Signed-off-by: Steve Wise --- drivers/infiniband/hw/cxgb3/iwch_mem.c | 21 ++++++++--- drivers/infiniband/hw/cxgb3/iwch_provider.c | 50 +++++++++++++++++++-------- 2 files changed, 49 insertions(+), 22 deletions(-) diff --git a/drivers/infiniband/hw/cxgb3/iwch_mem.c b/drivers/infiniband/hw/cxgb3/iwch_mem.c index ec49a5c..e1ec65e 100644 --- a/drivers/infiniband/hw/cxgb3/iwch_mem.c +++ b/drivers/infiniband/hw/cxgb3/iwch_mem.c @@ -39,7 +39,7 @@ #include "iwch.h" #include "iwch_provider.h" -static void iwch_finish_mem_reg(struct iwch_mr *mhp, u32 stag) +static int iwch_finish_mem_reg(struct iwch_mr *mhp, u32 stag) { u32 mmid; @@ -47,14 +47,15 @@ static void iwch_finish_mem_reg(struct iwch_mr *mhp, u32 stag) mhp->attr.stag = stag; mmid = stag >> 8; mhp->ibmr.rkey = mhp->ibmr.lkey = stag; - insert_handle(mhp->rhp, &mhp->rhp->mmidr, mhp, mmid); PDBG("%s mmid 0x%x mhp %p\n", __func__, mmid, mhp); + return insert_handle(mhp->rhp, &mhp->rhp->mmidr, mhp, mmid); } int iwch_register_mem(struct iwch_dev *rhp, struct iwch_pd *php, struct iwch_mr *mhp, int shift) { u32 stag; + int ret; if (cxio_register_phys_mem(&rhp->rdev, &stag, mhp->attr.pdid, @@ -66,9 +67,11 @@ int iwch_register_mem(struct iwch_dev *rhp, struct iwch_pd *php, mhp->attr.pbl_size, mhp->attr.pbl_addr)) return -ENOMEM; - iwch_finish_mem_reg(mhp, stag); - - return 0; + ret = iwch_finish_mem_reg(mhp, stag); + if (ret) + cxio_dereg_mem(&rhp->rdev, mhp->attr.stag, mhp->attr.pbl_size, + mhp->attr.pbl_addr); + return ret; } int iwch_reregister_mem(struct iwch_dev *rhp, struct iwch_pd *php, @@ -77,6 +80,7 @@ int iwch_reregister_mem(struct iwch_dev *rhp, struct iwch_pd *php, int npages) { u32 stag; + int ret; /* We could support this... 
*/ if (npages > mhp->attr.pbl_size) @@ -93,9 +97,12 @@ int iwch_reregister_mem(struct iwch_dev *rhp, struct iwch_pd *php, mhp->attr.pbl_size, mhp->attr.pbl_addr)) return -ENOMEM; - iwch_finish_mem_reg(mhp, stag); + ret = iwch_finish_mem_reg(mhp, stag); + if (ret) + cxio_dereg_mem(&rhp->rdev, mhp->attr.stag, mhp->attr.pbl_size, + mhp->attr.pbl_addr); - return 0; + return ret; } int iwch_alloc_pbl(struct iwch_mr *mhp, int npages) diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c b/drivers/infiniband/hw/cxgb3/iwch_provider.c index 72aa57c..6895523 100644 --- a/drivers/infiniband/hw/cxgb3/iwch_provider.c +++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c @@ -195,7 +195,11 @@ static struct ib_cq *iwch_create_cq(struct ib_device *ibdev, int entries, int ve spin_lock_init(&chp->lock); atomic_set(&chp->refcnt, 1); init_waitqueue_head(&chp->wait); - insert_handle(rhp, &rhp->cqidr, chp, chp->cq.cqid); + if (insert_handle(rhp, &rhp->cqidr, chp, chp->cq.cqid)) { + cxio_destroy_cq(&chp->rhp->rdev, &chp->cq); + kfree(chp); + return ERR_PTR(-ENOMEM); + } if (ucontext) { struct iwch_mm_entry *mm; @@ -750,7 +754,11 @@ static struct ib_mw *iwch_alloc_mw(struct ib_pd *pd) mhp->attr.stag = stag; mmid = (stag) >> 8; mhp->ibmw.rkey = stag; - insert_handle(rhp, &rhp->mmidr, mhp, mmid); + if (insert_handle(rhp, &rhp->mmidr, mhp, mmid)) { + cxio_deallocate_window(&rhp->rdev, mhp->attr.stag); + kfree(mhp); + return ERR_PTR(-ENOMEM); + } PDBG("%s mmid 0x%x mhp %p stag 0x%x\n", __func__, mmid, mhp, stag); return &(mhp->ibmw); } @@ -778,37 +786,43 @@ static struct ib_mr *iwch_alloc_fast_reg_mr(struct ib_pd *pd, int pbl_depth) struct iwch_mr *mhp; u32 mmid; u32 stag = 0; - int ret; + int ret = 0; php = to_iwch_pd(pd); rhp = php->rhp; mhp = kzalloc(sizeof(*mhp), GFP_KERNEL); if (!mhp) - return ERR_PTR(-ENOMEM); + goto err; mhp->rhp = rhp; ret = iwch_alloc_pbl(mhp, pbl_depth); - if (ret) { - kfree(mhp); - return ERR_PTR(ret); - } + if (ret) + goto err1; mhp->attr.pbl_size = pbl_depth; ret = 
cxio_allocate_stag(&rhp->rdev, &stag, php->pdid, mhp->attr.pbl_size, mhp->attr.pbl_addr); - if (ret) { - iwch_free_pbl(mhp); - kfree(mhp); - return ERR_PTR(ret); - } + if (ret) + goto err2; mhp->attr.pdid = php->pdid; mhp->attr.type = TPT_NON_SHARED_MR; mhp->attr.stag = stag; mhp->attr.state = 1; mmid = (stag) >> 8; mhp->ibmr.rkey = mhp->ibmr.lkey = stag; - insert_handle(rhp, &rhp->mmidr, mhp, mmid); + if (insert_handle(rhp, &rhp->mmidr, mhp, mmid)) + goto err3; + PDBG("%s mmid 0x%x mhp %p stag 0x%x\n", __func__, mmid, mhp, stag); return &(mhp->ibmr); +err3: + cxio_dereg_mem(&rhp->rdev, stag, mhp->attr.pbl_size, + mhp->attr.pbl_addr); +err2: + iwch_free_pbl(mhp); +err1: + kfree(mhp); +err: + return ERR_PTR(ret); } static struct ib_fast_reg_page_list *iwch_alloc_fastreg_pbl( @@ -961,7 +975,13 @@ static struct ib_qp *iwch_create_qp(struct ib_pd *pd, spin_lock_init(&qhp->lock); init_waitqueue_head(&qhp->wait); atomic_set(&qhp->refcnt, 1); - insert_handle(rhp, &rhp->qpidr, qhp, qhp->wq.qpid); + + if (insert_handle(rhp, &rhp->qpidr, qhp, qhp->wq.qpid)) { + cxio_destroy_qp(&rhp->rdev, &qhp->wq, + ucontext ? &ucontext->uctx : &rhp->rdev.uctx); + kfree(qhp); + return ERR_PTR(-ENOMEM); + } if (udata) { From swise at opengridcomputing.com Tue Sep 8 14:30:06 2009 From: swise at opengridcomputing.com (Steve Wise) Date: Tue, 08 Sep 2009 16:30:06 -0500 Subject: [ofa-general] [PATCH 2/2] RDMA/cxgb3: clean up properly on FW mismatch failures. In-Reply-To: <20090908213001.15369.15629.stgit@build.ogc.int> References: <20090908213001.15369.15629.stgit@build.ogc.int> Message-ID: <20090908213006.15369.19707.stgit@build.ogc.int> FW mismatches can cause a crash in the iw_cxgb3 event handler. 
- NULL the t3cdev->ulp pointer on failures in cxio_rdev_open() - Silently ignore events with the ulp ptr is null in iwch_err_handler() Signed-off-by: Steve Wise --- drivers/infiniband/hw/cxgb3/cxio_hal.c | 1 + drivers/infiniband/hw/cxgb3/iwch.c | 5 ++++- 2 files changed, 5 insertions(+), 1 deletions(-) diff --git a/drivers/infiniband/hw/cxgb3/cxio_hal.c b/drivers/infiniband/hw/cxgb3/cxio_hal.c index 4dec515..68955f8 100644 --- a/drivers/infiniband/hw/cxgb3/cxio_hal.c +++ b/drivers/infiniband/hw/cxgb3/cxio_hal.c @@ -1034,6 +1034,7 @@ err3: err2: cxio_hal_destroy_ctrl_qp(rdev_p); err1: + rdev_p->t3cdev_p->ulp = (void *) NULL; list_del(&rdev_p->entry); return err; } diff --git a/drivers/infiniband/hw/cxgb3/iwch.c b/drivers/infiniband/hw/cxgb3/iwch.c index 5796170..3f0c99d 100644 --- a/drivers/infiniband/hw/cxgb3/iwch.c +++ b/drivers/infiniband/hw/cxgb3/iwch.c @@ -165,10 +165,13 @@ static void close_rnic_dev(struct t3cdev *tdev) static void iwch_event_handler(struct t3cdev *tdev, u32 evt, u32 port_id) { struct cxio_rdev *rdev = tdev->ulp; - struct iwch_dev *rnicp = rdev_to_iwch_dev(rdev); + struct iwch_dev *rnicp; struct ib_event event; u32 portnum = port_id + 1; + if (!rdev) + return; + rnicp = rdev_to_iwch_dev(rdev); switch (evt) { case OFFLOAD_STATUS_DOWN: { rdev->flags = CXIO_ERROR_FATAL; From hnrose at comcast.net Tue Sep 8 15:11:37 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Tue, 8 Sep 2009 18:11:37 -0400 Subject: [ofa-general] [PATCH] infiniband-diags/ibportstate: Support changing of link width Message-ID: <20090908221137.GA24265@comcast.net> Also, update man page Signed-off-by: Hal Rosenstock --- diff --git a/infiniband-diags/man/ibportstate.8 b/infiniband-diags/man/ibportstate.8 index 9b5e618..b64c18d 100644 --- a/infiniband-diags/man/ibportstate.8 +++ b/infiniband-diags/man/ibportstate.8 @@ -1,4 +1,4 @@ -.TH IBPORTSTATE 8 "October 19, 2006" "OpenIB" "OpenIB Diagnostics" +.TH IBPORTSTATE 8 "September 8, 2009" "OpenIB" "OpenIB Diagnostics" .SH NAME 
ibportstate \- handle port (physical) state and link speed of an InfiniBand port @@ -23,16 +23,19 @@ also allows the link speed enabled on any IB port to be adjusted. .TP op Port operations allowed - supported ops: enable, disable, reset, speed, query + supported ops: enable, disable, reset, speed, width, query Default is query .PP ops enable, disable, and reset are only allowed on switch ports (An error is indicated if attempted on CA or router ports) - speed op is allowed on any port + speed and width ops are allowed on any port speed values are legal values for PortInfo:LinkSpeedEnabled (An error is indicated if PortInfo:LinkSpeedSupported does not support this setting) - (NOTE: Speed changes are not effected until the port goes through + width valyes are legal values for PortInfo:LinkWidthEnabled + (An error is indicated if PortInfo:LinkWidthSupported does not support + this setting) + (NOTE: Speed and width changes are not effected until the port goes through link renegotiation) query also validates port characteristics (link width and speed) based on the peer port. 
This checking is done when the port @@ -108,8 +111,10 @@ ibportstate -D 0 1 # (query) by direct route ibportstate 3 1 reset # by lid .PP ibportstate 3 1 speed 1 # by lid +.PP +ibportstate 3 1 width 1 # by lid .SH AUTHOR .TP Hal Rosenstock -.RI < halr at voltaire.com > +.RI < hal.rosenstock at gmail.com > diff --git a/infiniband-diags/src/ibportstate.c b/infiniband-diags/src/ibportstate.c index 76e74f7..d20961f 100644 --- a/infiniband-diags/src/ibportstate.c +++ b/infiniband-diags/src/ibportstate.c @@ -204,6 +204,7 @@ int main(int argc, char **argv) int err; int port_op = 0; /* default to query */ int speed = 15; + int new_width = 255; int is_switch = 1; int state, physstate, lwe, lws, lwa, lse, lss, lsa; int peerlocalportnum, peerlwe, peerlws, peerlwa, peerlse, peerlss, @@ -216,13 +217,14 @@ int main(int argc, char **argv) int selfport = 0; char usage_args[] = " []\n" - "\nSupported ops: enable, disable, reset, speed, query"; + "\nSupported ops: enable, disable, reset, speed, width, query"; const char *usage_examples[] = { "3 1 disable\t\t\t# by lid", "-G 0x2C9000100D051 1 enable\t# by guid", "-D 0 1\t\t\t# (query) by direct route", "3 1 reset\t\t\t# by lid", "3 1 speed 1\t\t\t# by lid", + "3 1 width 1\t\t\t# by lid", NULL }; @@ -263,6 +265,15 @@ int main(int argc, char **argv) speed = strtoul(argv[3], 0, 0); if (speed > 15) IBERROR("invalid speed value %d", speed); + } else if (!strcmp(argv[2], "width")) { + if (argc < 4) + IBERROR + ("width requires an additional parameter"); + port_op = 5; + /* Parse width value */ + new_width = strtoul(argv[3], 0, 0); + if (new_width > 255) + IBERROR("invalid width value %d", new_width); } } @@ -298,6 +309,11 @@ int main(int argc, char **argv) speed); mad_set_field(data, 0, IB_PORT_STATE_F, 0); mad_set_field(data, 0, IB_PORT_PHYS_STATE_F, 0); + } else if (port_op == 5) { /* Set width */ + mad_set_field(data, 0, IB_PORT_LINK_WIDTH_ENABLED_F, + new_width); + mad_set_field(data, 0, IB_PORT_STATE_F, 0); + mad_set_field(data, 0, 
IB_PORT_PHYS_STATE_F, 0); } err = set_port_info(&portid, data, portnum, port_op);

From hal.rosenstock at gmail.com Tue Sep 8 15:23:24 2009
From: hal.rosenstock at gmail.com (Hal Rosenstock)
Date: Tue, 8 Sep 2009 18:23:24 -0400
Subject: [ofa-general] [PATCH] opensm: improve multicast re-routing requests processing
In-Reply-To: <20090906173900.GK25241@me>
References: <20090906154901.GF25241@me> <20090906154931.GG25241@me> <20090906173900.GK25241@me>
Message-ID: 

On Sun, Sep 6, 2009 at 1:39 PM, Sasha Khapyorsky wrote:

>
> When we have two or more changes in a same multicast group multiple
> multicast rerouting requests will be created and processed. To prevent
> this we will use array of requests indexed by mlid value minus
> IB_LID_MCAST_START_HO and for each multicast group change we will just
> mark that specific mlid requires re-routing and "duplicated" requests
> will be merged there.
>
> Also in this way we will be able to process multicast group routing
> entries deletion for already removed groups by just knowing its MLID
> and not using its content - this will let us to not delay multicast
> groups deletion ('to_be_deleted' flag) and will simplify many multicast
> related code flows.
>

While the delay adds complexity, it is a feature. Delayed deletion (and
join) is allowed by IBA and is needed in a fast changing subnet when
there are a lot of groups changing. This was seen quite a while ago and
was how OpenSM evolved based on field experience and other testing.
Eitan is the expert here. IMO support for this needs to be added (back
in).
-- Hal > > Signed-off-by: Sasha Khapyorsky > --- > opensm/include/opensm/osm_sm.h | 4 +- > opensm/opensm/osm_mcast_mgr.c | 27 +++++++++------------ > opensm/opensm/osm_sm.c | 49 > +++++++++++++--------------------------- > 3 files changed, 30 insertions(+), 50 deletions(-) > > diff --git a/opensm/include/opensm/osm_sm.h > b/opensm/include/opensm/osm_sm.h > index 0914a95..986143a 100644 > --- a/opensm/include/opensm/osm_sm.h > +++ b/opensm/include/opensm/osm_sm.h > @@ -126,8 +126,8 @@ typedef struct osm_sm { > cl_dispatcher_t *p_disp; > cl_plock_t *p_lock; > atomic32_t sm_trans_id; > - cl_spinlock_t mgrp_lock; > - cl_qlist_t mgrp_list; > + unsigned mlids_req_max; > + uint8_t *mlids_req; > osm_sm_mad_ctrl_t mad_ctrl; > osm_lid_mgr_t lid_mgr; > osm_ucast_mgr_t ucast_mgr; > diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c > index d7c5ce1..dd504ef 100644 > --- a/opensm/opensm/osm_mcast_mgr.c > +++ b/opensm/opensm/osm_mcast_mgr.c > @@ -1116,7 +1116,6 @@ static int mcast_mgr_set_mftables(osm_sm_t * sm) > int osm_mcast_mgr_process(osm_sm_t * sm) > { > cl_qmap_t *p_sw_tbl; > - cl_qlist_t *p_list = &sm->mgrp_list; > osm_mgrp_t *p_mgrp; > int i, ret = 0; > > @@ -1150,16 +1149,14 @@ int osm_mcast_mgr_process(osm_sm_t * sm) > mcast_mgr_process_mgrp(sm, p_mgrp); > } > > + memset(sm->mlids_req, 0, sm->mlids_req_max); > + sm->mlids_req_max = 0; > + > /* > Walk the switches and download the tables for each. 
> */ > ret = mcast_mgr_set_mftables(sm); > > - while (!cl_is_qlist_empty(p_list)) { > - cl_list_item_t *p = cl_qlist_remove_head(p_list); > - free(p); > - } > - > exit: > CL_PLOCK_RELEASE(sm->p_lock); > > @@ -1174,11 +1171,10 @@ exit: > **********************************************************************/ > int osm_mcast_mgr_process_mgroups(osm_sm_t * sm) > { > - cl_qlist_t *p_list = &sm->mgrp_list; > osm_mgrp_t *p_mgrp; > ib_net16_t mlid; > - osm_mcast_mgr_ctxt_t *ctx; > int ret = 0; > + unsigned i; > > OSM_LOG_ENTER(sm->p_log); > > @@ -1192,14 +1188,12 @@ int osm_mcast_mgr_process_mgroups(osm_sm_t * sm) > goto exit; > } > > - while (!cl_is_qlist_empty(p_list)) { > - ctx = (osm_mcast_mgr_ctxt_t *) > cl_qlist_remove_head(p_list); > - > - /* nice copy no warning on size diff */ > - memcpy(&mlid, &ctx->mlid, sizeof(mlid)); > + for (i = 0; i <= sm->mlids_req_max; i++) { > + if (!sm->mlids_req[i]) > + continue; > + sm->mlids_req[i] = 0; > > - /* we can destroy the context now */ > - free(ctx); > + mlid = cl_hton16(i + IB_LID_MCAST_START_HO); > > /* since we delayed the execution we prefer to pass the > mlid as the mgrp identifier and then find it or abort */ > @@ -1223,6 +1217,9 @@ int osm_mcast_mgr_process_mgroups(osm_sm_t * sm) > mcast_mgr_process_mgrp(sm, p_mgrp); > } > > + memset(sm->mlids_req, 0, sm->mlids_req_max); > + sm->mlids_req_max = 0; > + > /* > Walk the switches and download the tables for each. 
> */ > diff --git a/opensm/opensm/osm_sm.c b/opensm/opensm/osm_sm.c > index 50aee91..e446c9d 100644 > --- a/opensm/opensm/osm_sm.c > +++ b/opensm/opensm/osm_sm.c > @@ -166,7 +166,6 @@ void osm_sm_construct(IN osm_sm_t * p_sm) > cl_event_construct(&p_sm->subnet_up_event); > cl_event_wheel_construct(&p_sm->trap_aging_tracker); > cl_thread_construct(&p_sm->sweeper); > - cl_spinlock_construct(&p_sm->mgrp_lock); > osm_sm_mad_ctrl_construct(&p_sm->mad_ctrl); > osm_lid_mgr_construct(&p_sm->lid_mgr); > osm_ucast_mgr_construct(&p_sm->ucast_mgr); > @@ -234,8 +233,8 @@ void osm_sm_destroy(IN osm_sm_t * p_sm) > cl_event_destroy(&p_sm->signal_event); > cl_event_destroy(&p_sm->subnet_up_event); > cl_spinlock_destroy(&p_sm->signal_lock); > - cl_spinlock_destroy(&p_sm->mgrp_lock); > cl_spinlock_destroy(&p_sm->state_lock); > + free(p_sm->mlids_req); > > osm_log(p_sm->p_log, OSM_LOG_SYS, "Exiting SM\n"); /* Format > Waived */ > OSM_LOG_EXIT(p_sm->p_log); > @@ -288,11 +287,14 @@ ib_api_status_t osm_sm_init(IN osm_sm_t * p_sm, IN > osm_subn_t * p_subn, > if (status != CL_SUCCESS) > goto Exit; > > - cl_qlist_init(&p_sm->mgrp_list); > - > - status = cl_spinlock_init(&p_sm->mgrp_lock); > - if (status != CL_SUCCESS) > + p_sm->mlids_req_max = 0; > + p_sm->mlids_req = malloc((IB_LID_MCAST_END_HO - > IB_LID_MCAST_START_HO + > + 1) * sizeof(p_sm->mlids_req[0])); > + if (!p_sm->mlids_req) > goto Exit; > + memset(p_sm->mlids_req, 0, > + (IB_LID_MCAST_END_HO - IB_LID_MCAST_START_HO + > + 1) * sizeof(p_sm->mlids_req[0])); > > status = osm_sm_mad_ctrl_init(&p_sm->mad_ctrl, p_sm->p_subn, > p_sm->p_mad_pool, p_sm->p_vl15, > @@ -441,32 +443,15 @@ Exit: > > /********************************************************************** > **********************************************************************/ > -static ib_api_status_t sm_mgrp_process(IN osm_sm_t * p_sm, > - IN osm_mgrp_t * p_mgrp) > +static void request_mlid(osm_sm_t * sm, uint16_t mlid) > { > - osm_mcast_mgr_ctxt_t *ctx; > - > - /* > - * 
'Schedule' all the QP0 traffic for when the state manager > - * isn't busy trying to do something else. > - */ > - ctx = malloc(sizeof(*ctx)); > - if (!ctx) > - return IB_ERROR; > - memset(ctx, 0, sizeof(*ctx)); > - ctx->mlid = p_mgrp->mlid; > - > - cl_spinlock_acquire(&p_sm->mgrp_lock); > - cl_qlist_insert_tail(&p_sm->mgrp_list, &ctx->list_item); > - cl_spinlock_release(&p_sm->mgrp_lock); > - > - osm_sm_signal(p_sm, OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST); > - > - return IB_SUCCESS; > + mlid -= IB_LID_MCAST_START_HO; > + sm->mlids_req[mlid] = 1; > + if (sm->mlids_req_max < mlid) > + sm->mlids_req_max = mlid; > + osm_sm_signal(sm, OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST); > } > > -/********************************************************************** > - **********************************************************************/ > ib_api_status_t osm_sm_mcgrp_join(IN osm_sm_t * p_sm, IN osm_mgrp_t *mgrp, > IN const ib_net64_t port_guid) > { > @@ -519,7 +504,7 @@ ib_api_status_t osm_sm_mcgrp_join(IN osm_sm_t * p_sm, > IN osm_mgrp_t *mgrp, > goto Exit; > } > > - status = sm_mgrp_process(p_sm, mgrp); > + request_mlid(p_sm, cl_ntoh16(mgrp->mlid)); > Exit: > CL_PLOCK_RELEASE(p_sm->p_lock); > OSM_LOG_EXIT(p_sm->p_log); > @@ -527,8 +512,6 @@ Exit: > return status; > } > > -/********************************************************************** > - **********************************************************************/ > ib_api_status_t osm_sm_mcgrp_leave(IN osm_sm_t * p_sm, IN osm_mgrp_t > *mgrp, > IN const ib_net64_t port_guid) > { > @@ -557,7 +540,7 @@ ib_api_status_t osm_sm_mcgrp_leave(IN osm_sm_t * p_sm, > IN osm_mgrp_t *mgrp, > > osm_port_remove_mgrp(p_port, mgrp); > > - status = sm_mgrp_process(p_sm, mgrp); > + request_mlid(p_sm, cl_hton16(mgrp->mlid)); > Exit: > CL_PLOCK_RELEASE(p_sm->p_lock); > OSM_LOG_EXIT(p_sm->p_log); > -- > 1.6.4.2 > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > 
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> To unsubscribe, please visit
> http://openib.org/mailman/listinfo/openib-general
>

From worleys at gmail.com Tue Sep 8 15:29:56 2009
From: worleys at gmail.com (Chris Worley)
Date: Wed, 9 Sep 2009 00:29:56 +0200
Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs
In-Reply-To: <4AA4F561.504@vlnb.net>
References: <4A9FA945.4070408@vlnb.net> <4AA4F561.504@vlnb.net>
Message-ID:

On Mon, Sep 7, 2009 at 1:58 PM, Vladislav Bolkhovitin wrote:
> Chris Worley, on 09/06/2009 05:41 PM wrote:
>>
>> On Sun, Sep 6, 2009 at 3:36 PM, Chris Worley wrote:
>>>
>>> On Sun, Sep 6, 2009 at 3:17 PM, Bart Van Assche
>>> wrote:
>>>>
>>>> On Fri, Sep 4, 2009 at 1:20 AM, Chris Worley wrote:
>>>>>
>>>>> On Thu, Sep 3, 2009 at 11:38 AM, Chris Worley wrote:
>>>>>>
>>>>>> I've used a couple of initiators (different systems) w/ different
>>>>>> OSes, w/ different IB cards (all QDR) and different IB stacks
>>>>>> (built-in vs. OFED) and can repeat the problem in all but the
>>>>>> RHEL5.2/OFED 1.4.1 target and initiator (but, if the initiator is
>>>>>> WinOF and the target is RHEL5.2/OFED1.4.1, then the problem does
>>>>>> repeat).
>>>>>
>>>>> Here's a twist: I used the Ubuntu initiator w/ one of the RHEL
>>>>> targets, and the RHEL initiator (same machine as was running WinOF
>>>>> from the beginning of this thread) w/ one of the Ubuntu targets: in
>>>>> both cases, the problem does not repeat.
>>>>>
>>>>> That makes it sound like OFED is the cure on either side of the
>>>>> connection, but does not explain the issue w/ WinOF (which does fail
>>>>> w/ either Ubuntu or RHEL targets).
>>>>
>>>> These results are strange. Regarding the Linux-only tests, I was
>>>> assuming failure of a single component (Ubuntu SRP initiator, OFED SRP
>>>> initiator, Ubuntu IB driver, OFED IB driver or SRP target), but for
>>>> each of these components there is at least one test that passes and at
>>>> least one test that fails. So either my assumption is wrong or one of
>>>> the above test results is not repeatable. Do you have the time to
>>>> repeat the Linux-only tests ?
>>>
>>> Last night I was rerunning the RHEL5.2 initiator w/ Ubuntu client, and
>>> the problem repeated; now, I can't repeat the case where it didn't
>>> fail. Still, no errors, other than the eventual timeouts previously
>>> shown; the target thinks all is fine, the initiator is stuck.
>>
>> ... and I haven't had any success w/ Ubuntu target and initiator, 8.10 or
>> 9.04.
>
> 1. Try with kernel parameter maxcpus=1. It will somehow relax possible races
> you have, although not completely.

I'll give it a try if/when I can use my systems again. I've already
done the lock testing as suggested, with no problems. It really seems
like a protocol issue... something is getting dropped (maybe by the IB
hardware), and either there are no retries or the target rejects the
retries.

>
> 2. Try with another hardware, including motherboard. You can have something
> like http://lkml.org/lkml/2007/7/31/558 (not exactly it, of course)

I'm wondering why it's so easily repeatable by me, and those I work
with, and nobody else?

I have another completely different configuration w/ the same issue...

Today I was running on a dual-socket NHM box and an eight-socket
Opteron box. Both were running RHEL 5.3 w/ 2.6.18-128.7.1.0.1.el5,
with all the SCST recommended modifications (which I haven't been
running on my systems) and OFED 1.4.1. The Opteron was the target.

The apps on the initiator don't need great speed, just very low
latency. The app run time is ~4 hours... 2.5 hours in, the hang
occurs... same symptom, although these aren't raw disks running fio
benchmarks... these are LVM striped (4MB) partitions w/ ext3 fs'es on
top. The average block sizes as seen from the target are between 64KB
and 128KB... not the 8KB and smaller I've been talking about in other
tests, although all I can tell is what iostat shows me, and it just
displays averages.

But, the same issue occurs... the apps on the initiator hang, and the
target thinks all is well.

An app will hang in one of the file systems... the others seem to be
working well (even though they are comprised of the same drives as the
hung fs/app), for example: you can do a "find ." from their root w/o
hanging "find", but if you try that in the fs where the app is hung,
"find" will hang. Lvscan/pvscan will hang too.

Strangely, restarting the target (removing the ib_srpt and scst_vdisk
modules, then re-registering the disks with scst_vdisk and
re-modprobing ib_srpt from scratch) causes the apps on the initiator
to un-hang and make progress again... (but eventually hang again...
seemingly more readily than before).
While nothing other than the messages you'd expect (from
re-registering the drives to the initiator logging in) occurs on the
target, the initiator has much to say during this re-registration
period, starting w/ the time-out (that has been shown previously):

Sep 8 22:04:07 nameme kernel: sd 30:0:0:3: timing out command, waited 360s
Sep 8 22:04:07 nameme kernel: sd 30:0:0:3: SCSI error: return code = 0x06000000
Sep 8 22:04:07 nameme kernel: end_request: I/O error, dev sdo, sector 45304704
Sep 8 22:04:16 nameme kernel: sd 29:0:0:8: timing out command, waited 360s
Sep 8 22:04:16 nameme kernel: sd 29:0:0:8: SCSI error: return code = 0x06000000
Sep 8 22:04:16 nameme kernel: end_request: I/O error, dev sdj, sector 133535344
Sep 8 22:04:16 nameme kernel: sd 29:0:0:1: timing out command, waited 360s
Sep 8 22:04:16 nameme kernel: sd 29:0:0:1: SCSI error: return code = 0x06000000
Sep 8 22:04:16 nameme kernel: end_request: I/O error, dev sdc, sector 55150000
Sep 8 22:04:16 nameme kernel: sd 29:0:0:5: timing out command, waited 360s
Sep 8 22:04:16 nameme kernel: sd 29:0:0:5: SCSI error: return code = 0x06000000
Sep 8 22:04:16 nameme kernel: end_request: I/O error, dev sdg, sector 110218720
Sep 8 22:04:16 nameme kernel: sd 29:0:0:6: timing out command, waited 360s
Sep 8 22:04:16 nameme kernel: sd 29:0:0:6: SCSI error: return code = 0x06000000
Sep 8 22:04:16 nameme kernel: end_request: I/O error, dev sdh, sector 133521792
Sep 8 22:04:16 nameme kernel: sd 30:0:0:9: timing out command, waited 360s
Sep 8 22:04:16 nameme kernel: sd 30:0:0:9: SCSI error: return code = 0x06000000
Sep 8 22:04:16 nameme kernel: end_request: I/O error, dev sdu, sector 123498408
Sep 8 22:04:16 nameme kernel: sd 29:0:0:7: timing out command, waited 360s
Sep 8 22:04:16 nameme kernel: sd 29:0:0:7: SCSI error: return code = 0x06000000
Sep 8 22:04:16 nameme kernel: end_request: I/O error, dev sdi, sector 61892872
Sep 8 22:04:26 nameme kernel: sd 29:0:0:0: timing out command, waited 360s
Sep 8 22:04:26 nameme kernel: sd 29:0:0:0: SCSI error: return code = 0x06000000
Sep 8 22:04:26 nameme kernel: end_request: I/O error, dev sdb, sector 111165032
Sep 8 22:04:35 nameme kernel: sd 29:0:0:3: timing out command, waited 360s
Sep 8 22:04:35 nameme kernel: sd 29:0:0:3: SCSI error: return code = 0x06000000
Sep 8 22:04:35 nameme kernel: end_request: I/O error, dev sde, sector 110192456
Sep 8 22:04:35 nameme kernel: sd 29:0:0:4: timing out command, waited 360s
Sep 8 22:04:35 nameme kernel: sd 29:0:0:4: SCSI error: return code = 0x06000000
Sep 8 22:04:35 nameme kernel: end_request: I/O error, dev sdf, sector 81712464
Sep 8 22:04:45 nameme kernel: sd 29:0:0:9: timing out command, waited 360s
Sep 8 22:04:45 nameme kernel: sd 29:0:0:9: SCSI error: return code = 0x06000000
Sep 8 22:04:45 nameme kernel: end_request: I/O error, dev sdk, sector 133530608
Sep 8 22:04:45 nameme kernel: sd 30:0:0:4: timing out command, waited 360s
Sep 8 22:04:45 nameme kernel: sd 30:0:0:4: SCSI error: return code = 0x06000000
Sep 8 22:04:45 nameme kernel: end_request: I/O error, dev sdp, sector 133527104
Sep 8 22:04:45 nameme kernel: sd 30:0:0:8: timing out command, waited 360s
Sep 8 22:04:45 nameme kernel: sd 30:0:0:8: SCSI error: return code = 0x06000000
Sep 8 22:04:45 nameme kernel: end_request: I/O error, dev sdt, sector 133534080
Sep 8 22:04:45 nameme kernel: sd 30:0:0:5: timing out command, waited 360s
Sep 8 22:04:45 nameme kernel: sd 30:0:0:5: SCSI error: return code = 0x06000000
Sep 8 22:04:45 nameme kernel: end_request: I/O error, dev sdq, sector 133525384
Sep 8 22:04:45 nameme kernel: sd 30:0:0:6: timing out command, waited 360s
Sep 8 22:04:45 nameme kernel: sd 30:0:0:6: SCSI error: return code = 0x06000000
Sep 8 22:04:45 nameme kernel: end_request: I/O error, dev sdr, sector 136997696
Sep 8 22:04:50 nameme kernel: sd 30:0:0:0: timing out command, waited 360s
Sep 8 22:04:50 nameme kernel: sd 30:0:0:0: SCSI error: return code = 0x06000000
Sep 8 22:04:50 nameme kernel: end_request: I/O error, dev sdl, sector 133540752
Sep 8 22:05:24 nameme kernel: sd 29:0:0:2: timing out command, waited 360s
Sep 8 22:05:24 nameme kernel: sd 29:0:0:2: SCSI error: return code = 0x06000000
Sep 8 22:05:24 nameme kernel: end_request: I/O error, dev sdd, sector 138572160
Sep 8 22:06:37 nameme kernel: host27: ib_srp: DREQ received - connection closed
Sep 8 22:06:37 nameme kernel: host28: ib_srp: DREQ received - connection closed
Sep 8 22:06:37 nameme kernel: host29: ib_srp: DREQ received - connection closed
Sep 8 22:06:37 nameme kernel: host30: ib_srp: DREQ received - connection closed
Sep 8 22:06:37 nameme kernel: host31: ib_srp: DREQ received - connection closed
Sep 8 22:06:37 nameme kernel: host32: ib_srp: DREQ received - connection closed
Sep 8 22:06:39 nameme kernel: host27: ib_srp: connection closed
Sep 8 22:06:39 nameme kernel: ib_srp: host27: add qp_in_err timer
Sep 8 22:06:39 nameme kernel: host28: ib_srp: connection closed
Sep 8 22:06:39 nameme kernel: ib_srp: host28: add qp_in_err timer
Sep 8 22:06:39 nameme kernel: host29: ib_srp: connection closed
Sep 8 22:06:39 nameme kernel: ib_srp: host29: add qp_in_err timer
Sep 8 22:06:39 nameme kernel: host30: ib_srp: connection closed
Sep 8 22:06:39 nameme kernel: ib_srp: host30: add qp_in_err timer
Sep 8 22:06:39 nameme kernel: host31: ib_srp: connection closed
Sep 8 22:06:39 nameme kernel: ib_srp: host31: add qp_in_err timer
Sep 8 22:06:39 nameme kernel: host32: ib_srp: connection closed
Sep 8 22:06:39 nameme kernel: ib_srp: host32: add qp_in_err timer
Sep 8 22:07:04 nameme kernel: host27: ib_srp: srp_qp_in_err_timer called
Sep 8 22:07:04 nameme kernel: host27: ib_srp: srp_qp_in_err_timer flushed reset - done
Sep 8 22:07:04 nameme kernel: host28: ib_srp: srp_qp_in_err_timer called
Sep 8 22:07:04 nameme kernel: host28: ib_srp: srp_qp_in_err_timer flushed reset - done
Sep 8 22:07:04 nameme kernel: host29: ib_srp: srp_qp_in_err_timer called
Sep 8 22:07:04 nameme kernel: host29: ib_srp: srp_qp_in_err_timer flushed reset - done
Sep 8 22:07:04 nameme kernel: host30: ib_srp: srp_qp_in_err_timer called
Sep 8 22:07:04 nameme kernel: host30: ib_srp: srp_qp_in_err_timer flushed reset - done
Sep 8 22:07:04 nameme kernel: host31: ib_srp: srp_qp_in_err_timer called
Sep 8 22:07:04 nameme kernel: host31: ib_srp: srp_qp_in_err_timer flushed reset - done
Sep 8 22:07:04 nameme kernel: host32: ib_srp: srp_qp_in_err_timer called
Sep 8 22:07:04 nameme kernel: host32: ib_srp: srp_qp_in_err_timer flushed reset - done
Sep 8 22:09:10 nameme kernel: host27: ib_srp: DREQ received - connection closed
Sep 8 22:09:10 nameme kernel: host28: ib_srp: DREQ received - connection closed
Sep 8 22:09:10 nameme kernel: host29: ib_srp: DREQ received - connection closed
Sep 8 22:09:10 nameme kernel: host30: ib_srp: DREQ received - connection closed
Sep 8 22:09:10 nameme kernel: host31: ib_srp: DREQ received - connection closed
Sep 8 22:09:10 nameme kernel: host32: ib_srp: DREQ received - connection closed
Sep 8 22:09:12 nameme kernel: host27: ib_srp: connection closed
Sep 8 22:09:12 nameme kernel: ib_srp: host27: add qp_in_err timer
Sep 8 22:09:12 nameme kernel: host28: ib_srp: connection closed
Sep 8 22:09:12 nameme kernel: ib_srp: host28: add qp_in_err timer
Sep 8 22:09:12 nameme kernel: host29: ib_srp: connection closed
Sep 8 22:09:12 nameme kernel: ib_srp: host29: add qp_in_err timer
Sep 8 22:09:12 nameme kernel: host30: ib_srp: connection closed
Sep 8 22:09:12 nameme kernel: ib_srp: host30: add qp_in_err timer
Sep 8 22:09:12 nameme kernel: host31: ib_srp: connection closed
Sep 8 22:09:12 nameme kernel: ib_srp: host31: add qp_in_err timer
Sep 8 22:09:12 nameme kernel: host32: ib_srp: connection closed
Sep 8 22:09:12 nameme kernel: ib_srp: host32: add qp_in_err timer
Sep 8 22:09:37 nameme kernel: host27: ib_srp: srp_qp_in_err_timer called
Sep 8 22:09:37 nameme kernel: host27: ib_srp: srp_qp_in_err_timer flushed reset - done
Sep 8 22:09:37 nameme kernel: host28: ib_srp: srp_qp_in_err_timer called
Sep 8 22:09:37 nameme kernel: host28: ib_srp: srp_qp_in_err_timer flushed reset - done
Sep 8 22:09:37 nameme kernel: host29: ib_srp: srp_qp_in_err_timer called
Sep 8 22:09:37 nameme kernel: host29: ib_srp: srp_qp_in_err_timer flushed reset - done
Sep 8 22:09:37 nameme kernel: host30: ib_srp: srp_qp_in_err_timer called
Sep 8 22:09:37 nameme kernel: host30: ib_srp: srp_qp_in_err_timer flushed reset - done
Sep 8 22:09:37 nameme kernel: host31: ib_srp: srp_qp_in_err_timer called
Sep 8 22:09:37 nameme kernel: host31: ib_srp: srp_qp_in_err_timer flushed reset - done
Sep 8 22:09:37 nameme kernel: host32: ib_srp: srp_qp_in_err_timer called
Sep 8 22:09:37 nameme kernel: host32: ib_srp: srp_qp_in_err_timer flushed reset - done
Sep 8 22:10:36 nameme kernel: host27: ib_srp: DREQ received - connection closed
Sep 8 22:10:36 nameme kernel: host28: ib_srp: DREQ received - connection closed
Sep 8 22:10:36 nameme kernel: host29: ib_srp: DREQ received - connection closed
Sep 8 22:10:36 nameme kernel: host30: ib_srp: DREQ received - connection closed
Sep 8 22:10:36 nameme kernel: host31: ib_srp: DREQ received - connection closed
Sep 8 22:10:36 nameme kernel: host32: ib_srp: DREQ received - connection closed
Sep 8 22:10:39 nameme kernel: host27: ib_srp: connection closed
Sep 8 22:10:39 nameme kernel: ib_srp: host27: add qp_in_err timer
Sep 8 22:10:39 nameme kernel: host28: ib_srp: connection closed
Sep 8 22:10:39 nameme kernel: ib_srp: host28: add qp_in_err timer
Sep 8 22:10:39 nameme kernel: host29: ib_srp: connection closed
Sep 8 22:10:39 nameme kernel: ib_srp: host29: add qp_in_err timer
Sep 8 22:10:39 nameme kernel: host30: ib_srp: connection closed
Sep 8 22:10:39 nameme kernel: ib_srp: host30: add qp_in_err timer
Sep 8 22:10:39 nameme kernel: host31: ib_srp: connection closed
Sep 8 22:10:39 nameme kernel: ib_srp: host31: add qp_in_err timer
Sep 8 22:10:39 nameme kernel: host32: ib_srp: connection closed
Sep 8 22:10:39 nameme kernel: ib_srp: host32: add qp_in_err timer
Sep 8 22:11:04 nameme kernel: host27: ib_srp: srp_qp_in_err_timer called
Sep 8 22:11:04 nameme kernel: host27: ib_srp: srp_qp_in_err_timer flushed reset - done
Sep 8 22:11:04 nameme kernel: host28: ib_srp: srp_qp_in_err_timer called
Sep 8 22:11:04 nameme kernel: host28: ib_srp: srp_qp_in_err_timer flushed reset - done
Sep 8 22:11:04 nameme kernel: host29: ib_srp: srp_qp_in_err_timer called
Sep 8 22:11:04 nameme kernel: host29: ib_srp: srp_qp_in_err_timer flushed reset - done
Sep 8 22:11:04 nameme kernel: host30: ib_srp: srp_qp_in_err_timer called
Sep 8 22:11:04 nameme kernel: host30: ib_srp: srp_qp_in_err_timer flushed reset - done
Sep 8 22:11:04 nameme kernel: host31: ib_srp: srp_qp_in_err_timer called
Sep 8 22:11:04 nameme kernel: host31: ib_srp: srp_qp_in_err_timer flushed reset - done
Sep 8 22:11:04 nameme kernel: host32: ib_srp: srp_qp_in_err_timer called
Sep 8 22:11:04 nameme kernel: host32: ib_srp: srp_qp_in_err_timer flushed reset - done

Here is the start of another hang, un-hung by restarting the target:

Sep 8 22:32:33 nameme kernel: sd 29:0:0:7: timing out command, waited 360s
Sep 8 22:32:33 nameme kernel: sd 29:0:0:7: SCSI error: return code = 0x06000000
Sep 8 22:32:33 nameme kernel: end_request: I/O error, dev sdi, sector 123549288
Sep 8 22:32:33 nameme kernel: device-mapper: multipath: Failing path 8:128.
Sep 8 22:32:33 nameme kernel: sd 29:0:0:8: timing out command, waited 360s
Sep 8 22:32:33 nameme kernel: sd 29:0:0:8: SCSI error: return code = 0x06000000
Sep 8 22:32:33 nameme kernel: end_request: I/O error, dev sdj, sector 123538312
Sep 8 22:32:33 nameme kernel: device-mapper: multipath: Failing path 8:144.
Sep 8 22:38:33 nameme kernel: sd 29:0:0:8: timing out command, waited 360s
Sep 8 22:38:33 nameme kernel: sd 29:0:0:8: SCSI error: return code = 0x06000000
Sep 8 22:38:33 nameme kernel: end_request: I/O error, dev sdj, sector 140607872
Sep 8 22:38:33 nameme kernel: sd 29:0:0:7: timing out command, waited 360s
Sep 8 22:38:33 nameme kernel: sd 29:0:0:7: SCSI error: return code = 0x06000000
Sep 8 22:38:33 nameme kernel: end_request: I/O error, dev sdi, sector 123549416
Sep 8 22:38:33 nameme kernel: sd 29:0:0:0: timing out command, waited 360s
Sep 8 22:38:33 nameme kernel: sd 29:0:0:0: SCSI error: return code = 0x06000000
Sep 8 22:38:33 nameme kernel: end_request: I/O error, dev sdb, sector 123535744
Sep 8 22:38:33 nameme kernel: device-mapper: multipath: Failing path 8:16.
Sep 8 22:38:33 nameme kernel: sd 29:0:0:2: timing out command, waited 360s
Sep 8 22:38:33 nameme kernel: sd 29:0:0:2: SCSI error: return code = 0x06000000
Sep 8 22:38:33 nameme kernel: end_request: I/O error, dev sdd, sector 123535744
Sep 8 22:38:33 nameme kernel: device-mapper: multipath: Failing path 8:48.
Sep 8 22:38:33 nameme kernel: sd 29:0:0:6: timing out command, waited 360s
Sep 8 22:38:33 nameme kernel: sd 29:0:0:6: SCSI error: return code = 0x06000000
Sep 8 22:38:33 nameme kernel: end_request: I/O error, dev sdh, sector 123534960
Sep 8 22:38:33 nameme kernel: device-mapper: multipath: Failing path 8:112.
Sep 8 22:40:39 nameme kernel: sd 29:0:0:3: timing out command, waited 360s
Sep 8 22:40:39 nameme kernel: sd 29:0:0:3: SCSI error: return code = 0x06000000
Sep 8 22:40:39 nameme kernel: end_request: I/O error, dev sde, sector 123559464
Sep 8 22:40:39 nameme kernel: device-mapper: multipath: Failing path 8:64.
Sep 8 22:41:32 nameme kernel: host27: ib_srp: DREQ received - connection closed
Sep 8 22:41:32 nameme kernel: host28: ib_srp: DREQ received - connection closed
Sep 8 22:41:32 nameme kernel: host29: ib_srp: DREQ received - connection closed
Sep 8 22:41:32 nameme kernel: host30: ib_srp: DREQ received - connection closed
Sep 8 22:41:32 nameme kernel: host31: ib_srp: DREQ received - connection closed
Sep 8 22:41:32 nameme kernel: host32: ib_srp: DREQ received - connection closed
Sep 8 22:41:34 nameme kernel: host27: ib_srp: connection closed
Sep 8 22:41:34 nameme kernel: ib_srp: host27: add qp_in_err timer
Sep 8 22:41:34 nameme kernel: host28: ib_srp: connection closed
Sep 8 22:41:34 nameme kernel: ib_srp: host28: add qp_in_err timer
Sep 8 22:41:34 nameme kernel: host29: ib_srp: connection closed
Sep 8 22:41:34 nameme kernel: ib_srp: host29: add qp_in_err timer
Sep 8 22:41:34 nameme kernel: host30: ib_srp: connection closed
Sep 8 22:41:34 nameme kernel: ib_srp: host30: add qp_in_err timer
Sep 8 22:41:34 nameme kernel: host31: ib_srp: connection closed
Sep 8 22:41:34 nameme kernel: ib_srp: host31: add qp_in_err timer
Sep 8 22:41:34 nameme kernel: host32: ib_srp: connection closed
Sep 8 22:41:34 nameme kernel: ib_srp: host32: add qp_in_err timer
Sep 8 22:41:59 nameme kernel: host27: ib_srp: srp_qp_in_err_timer called
Sep 8 22:41:59 nameme kernel: host27: ib_srp: srp_qp_in_err_timer flushed reset - done
Sep 8 22:41:59 nameme kernel: host28: ib_srp: srp_qp_in_err_timer called
Sep 8 22:41:59 nameme kernel: host28: ib_srp: srp_qp_in_err_timer flushed reset - done
Sep 8 22:41:59 nameme kernel: host29: ib_srp: srp_qp_in_err_timer called
Sep 8 22:41:59 nameme kernel: host29: ib_srp: srp_qp_in_err_timer flushed reset - done
Sep 8 22:41:59 nameme kernel: host30: ib_srp: srp_qp_in_err_timer called
Sep 8 22:41:59 nameme kernel: host30: ib_srp: srp_qp_in_err_timer flushed reset - done
Sep 8 22:41:59 nameme kernel: host31: ib_srp: srp_qp_in_err_timer called
Sep 8 22:41:59 nameme kernel: host31: ib_srp: srp_qp_in_err_timer flushed reset - done
Sep 8 22:41:59 nameme kernel: host32: ib_srp: srp_qp_in_err_timer called
Sep 8 22:41:59 nameme kernel: host32: ib_srp: srp_qp_in_err_timer flushed reset - done

If I get a chance to work w/ my systems again, I'll see if this
re-registration helps there too.

Chris
>
>> Chris
>>>
>>> Chris
>>>>
>>>> Bart.
>>>>
>
>

From vlad at lists.openfabrics.org Wed Sep 9 03:06:17 2009
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Wed, 9 Sep 2009 03:06:17 -0700 (PDT)
Subject: [ofa-general] ofa_1_5_kernel 20090909-0200 daily build status
Message-ID: <20090909100617.D44CBE61DDB@openfabrics.org>

This email was generated automatically, please do not reply

git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git
git_branch: ofed_kernel_1_5

Common build parameters:

Passed:
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.26
Passed on i686 with linux-2.6.24
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.27
Passed on x86_64 with linux-2.6.16.60-0.21-smp
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.18-128.el5
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.18-93.el5
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.26
Passed on x86_64 with linux-2.6.24
Passed on x86_64 with linux-2.6.25
Passed on x86_64 with linux-2.6.27
Passed on x86_64 with linux-2.6.9-67.ELsmp
Passed on x86_64 with linux-2.6.9-78.ELsmp
Passed on ia64 with linux-2.6.18
Passed on ia64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.22
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.24
Passed on ia64 with linux-2.6.25
Passed on ia64 with linux-2.6.26
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.19

Failed:

From dzieko at wcss.pl Wed Sep 9
03:28:37 2009
From: dzieko at wcss.pl (Pawel Dziekonski)
Date: Wed, 9 Sep 2009 12:28:37 +0200
Subject: [ofa-general] ibpanic
In-Reply-To: <20090807112526.GD21691@cefeid.wcss.wroc.pl>
References: <20090807112526.GD21691@cefeid.wcss.wroc.pl>
Message-ID: <20090909102837.GE9156@cefeid.wcss.wroc.pl>

Hi,

today I got the following:

kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [20876] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [21040] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [21050] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [21137] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [21246] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [21256] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [21343] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [21431] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [21441] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [21528] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory)
kernel: NETDEV WATCHDOG: ib0: transmit timed out
kernel: ib0: transmit timeout: latency 9037 msecs
kernel: ib0: queue stopped 1, tx_head 35341, tx_tail 35213
kernel: NETDEV WATCHDOG: ib0: transmit timed out
kernel: ib0: transmit timeout: latency 10087 msecs
kernel: ib0: queue stopped 1, tx_head 35341, tx_tail 35213

It seems that the IB connection is lost completely (Lustre not working,
can't ping remote IPoIB addresses, etc.).

Is it a hardware problem with the IB interface or something else?

regards, P

# ibv_devinfo
hca_id: mthca0
    fw_ver: 1.2.0
    node_guid: 0030:487e:0c06:0000
    sys_image_guid: 0030:487e:0c06:0003
    vendor_id: 0x02c9
    vendor_part_id: 25204
    hw_ver: 0xA0
    board_id: SM_0000000003
    phys_port_cnt: 1
        port: 1
            state: PORT_INIT (2)
            max_mtu: 2048 (4)
            active_mtu: 2048 (4)
            sm_lid: 1
            port_lid: 448
            port_lmc: 0x00

# ofed_info
OFED-1.3.1
libibverbs:
git://git.openfabrics.org/ofed_1_3/libibverbs.git ofed_1_3
commit 40b771aa6a9c0ad092b2e20775b4723d3b173792
libmthca:
git://git.openfabrics.org/ofed_1_3/libmthca.git ofed_1_3
commit 9501e698d257949acfab2edc90812602966dbcc9
libmlx4:
git://git.openfabrics.org/ofed_1_3/libmlx4.git ofed_1_3
commit 3869d6dab7e12fe452270ca641f7dd7082b42482
libehca:
git://git.openfabrics.org/ofed_1_3/libehca.git ofed_1_3
commit fd898180cfa3b737f893f432a80b91bac3396325
libipathverbs:
git://git.openfabrics.org/ofed_1_3/libipathverbs.git ofed_1_3
commit 82be4d81859d1fd2edf830220fe65a9923b80a46
libcxgb3:
git://git.openfabrics.org/ofed_1_3/libcxgb3.git ofed_1_3
commit 6f7485feb244d8571fcab2292ef92c97bea48df0
libnes:
git://git.openfabrics.org/ofed_1_3/libnes.git ofed_1_3
commit 471fa2e5a7bb2f8946119396358c31adcc6c2fb3
libibcm:
git://git.openfabrics.org/ofed_1_3/libibcm.git ofed_1_3
commit 53ec35f544bbc1838bbadc2210909c25a954a5e2
librdmacm:
git://git.openfabrics.org/ofed_1_3/librdmacm.git ofed_1_3
commit a0ef80a1e0d5debdae48a844fbc8d09aec5b24b1
dapl1:
git://git.openfabrics.org/ofed_1_3/dapl1.git ofed_1_3
commit 7a9b58d6c50fc0a357de540ec3eb2ab2e07f8779
dapl2:
git://git.openfabrics.org/ofed_1_3/dapl2.git ofed_1_3
commit 2583f07d9d0f55eee14e0b0e6074bc6fd0712177
libsdp:
git://git.openfabrics.org/ofed_1_3/libsdp.git ofed_1_3
commit c8102dccc502930442b23de658674d386456b350
sdpnetstat:
git://git.openfabrics.org/ofed_1_3/sdpnetstat.git ofed_1_3
commit 3341620a7259c4f7bdd4180864b98e260c3dc223
srptools:
git://git.openfabrics.org/ofed_1_3/srptools.git ofed_1_3
commit e0ce2d42eeb25f8e89b8f6daaa32a630c9b64f0d
perftest:
git://git.openfabrics.org/ofed_1_3/perftest.git ofed_1_3
commit 6321b5468f7293088cc003809049c02b176130d8
qlvnictools:
git://git.openfabrics.org/ofed_1_3/qlvnictools.git ofed_1_3
commit 086f9cb80ee790d61bddaf201ecbae32a2ff21dd
tvflash:
git://git.openfabrics.org/ofed_1_3/tvflash.git ofed_1_3
commit f5e7407a7f2058448df5e5320d9843f944427429
mstflint:
git://git.openfabrics.org/ofed_1_3/mstflint.git ofed_1_3
commit 78bbd3d521a9078553a991111ffb6f76665b9ee9
qperf:
git://git.openfabrics.org/ofed_1_3/qperf.git ofed_1_3
commit 6221aabd038df0b7033e035378ca190641ed2295
management:
git://git.openfabrics.org/ofed_1_3/management.git ofed_1_3
commit d9c852406dae14e8284f9cfb1c7f495bbb55fddf
ibutils:
git://git.openfabrics.org/ofed_1_3/ibutils.git ofed_1_3
commit 7daf94fab6eaf307316326f3f49704e6080a1508
ibsim:
git://git.openfabrics.org/ofed_1_3/ibsim.git ofed_1_3
commit 55113d9f919709c7c97ea41d29991941b9c8be70
ofa_kernel-1.3.1:
Git:
git://git.openfabrics.org/ofed_1_3/linux-2.6.git ofed_kernel
commit 39e1dc833f98e5134f91fcf7f33df402adf4bc0c

# MPI
mvapich-1.0.1-2533.src.rpm
mvapich2-1.0.3-1.src.rpm
openmpi-1.2.6-1.src.rpm
mpitests-3.0-773.src.rpm

--
Pawel Dziekonski
Wroclaw Centre for Networking & Supercomputing, HPC Department
Politechnika Wr., pl. Grunwaldzki 9, bud.
D2/101, 50-377 Wroclaw, POLAND
phone: +48 71 3202043, fax: +48 71 3225797, http://www.wcss.wroc.pl

From hnrose at comcast.net Wed Sep 9 06:04:29 2009
From: hnrose at comcast.net (Hal Rosenstock)
Date: Wed, 9 Sep 2009 09:04:29 -0400
Subject: [ofa-general] [PATCHv2] opensm/osm_inform.c: For traps 64-67, use GID from DataDetails in log message
Message-ID: <20090909130429.GA1946@comcast.net>

Issuer GID is uninteresting for SM generated notices

Signed-off-by: Hal Rosenstock
---
Changes since v1:
Unified OSM_LOG call for traps 64-67
Also, added log level check

diff --git a/opensm/opensm/osm_inform.c b/opensm/opensm/osm_inform.c
index 990f1e0..6e1a2b5 100644
--- a/opensm/opensm/osm_inform.c
+++ b/opensm/opensm/osm_inform.c
@@ -312,7 +312,7 @@ static ib_api_status_t send_report(IN osm_infr_t * p_infr_rec, /* the informinfo
  /* it is better to use LIDs since the GIDs might not be there for SMI traps */
  OSM_LOG(p_log, OSM_LOG_DEBUG,
  "Forwarding Notice Event from LID:%u"
- " to InformInfo LID: %u TID:0x%X\n",
+ " to InformInfo LID:%u TID:0x%X\n",
  cl_ntoh16(p_ntc->issuer_lid),
  cl_ntoh16(p_infr_rec->report_addr.dest_lid),
  trap_fwd_trans_id);
@@ -545,6 +545,7 @@ ib_api_status_t osm_report_notice(IN osm_log_t * p_log, IN osm_subn_t * p_subn,
  cl_list_t infr_to_remove_list;
  osm_infr_t *p_infr_rec;
  osm_infr_t *p_next_infr_rec;
+ ib_gid_t *p_gid;

  OSM_LOG_ENTER(p_log);

@@ -559,8 +560,18 @@ ib_api_status_t osm_report_notice(IN osm_log_t * p_log, IN osm_subn_t * p_subn,
  return (IB_ERROR);
  }

+ if (!osm_log_is_active(p_log, OSM_LOG_INFO))
+ goto skip_log;
+
  /* an official Event information log */
- if (ib_notice_is_generic(p_ntc))
+ if (ib_notice_is_generic(p_ntc)) {
+ if ((p_ntc->g_or_v.generic.trap_num == CL_HTON16(64)) ||
+ (p_ntc->g_or_v.generic.trap_num == CL_HTON16(65)) ||
+ (p_ntc->g_or_v.generic.trap_num == CL_HTON16(66)) ||
+ (p_ntc->g_or_v.generic.trap_num == CL_HTON16(67)))
+ p_gid = (ib_gid_t *)&p_ntc->data_details.ntc_64_67.gid.raw;
+ else
+ p_gid = (ib_gid_t *)&p_ntc->issuer_gid.raw;
  OSM_LOG(p_log, OSM_LOG_INFO,
  "Reporting Generic Notice type:%u num:%u (%s)"
  " from LID:%u GID:%s\n",
@@ -568,9 +579,8 @@ ib_api_status_t osm_report_notice(IN osm_log_t * p_log, IN osm_subn_t * p_subn,
  cl_ntoh16(p_ntc->g_or_v.generic.trap_num),
  ib_get_trap_str(p_ntc->g_or_v.generic.trap_num),
  cl_ntoh16(p_ntc->issuer_lid),
- inet_ntop(AF_INET6, p_ntc->issuer_gid.raw, gid_str,
- sizeof gid_str));
- else
+ inet_ntop(AF_INET6, p_gid->raw, gid_str, sizeof gid_str));
+ } else
  OSM_LOG(p_log, OSM_LOG_INFO,
  "Reporting Vendor Notice type:%u vend:%u dev:%u"
  " from LID:%u GID:%s\n",
@@ -581,6 +591,7 @@ ib_api_status_t osm_report_notice(IN osm_log_t * p_log, IN osm_subn_t * p_subn,
  inet_ntop(AF_INET6, p_ntc->issuer_gid.raw, gid_str,
  sizeof gid_str));

+skip_log:
  /* Create a list that will hold all the infr records that should
  be removed due to violation. o13-17.1.2 */
  cl_list_construct(&infr_to_remove_list);

From ifan at novaglobal.com.sg Wed Sep 9 08:12:46 2009
From: ifan at novaglobal.com.sg (Ifan W)
Date: Wed, 9 Sep 2009 23:12:46 +0800
Subject: [ofa-general] SRP on RHEL 5.3/OFED 1.3 vs RHEL 5.1/OFED 1.2?
In-Reply-To: <20090522184006.GE26282@starfish.mcs.anl.gov>
References: <20090522184006.GE26282@starfish.mcs.anl.gov>
Message-ID:

Hi John,

I saw your post and was curious whether you have found the answer to
your problem. I am currently facing the same problem on RHEL 5.3 +
OFED 1.4 connecting to a DDN 9900.

I would appreciate it if you could share your findings so far. Thanks.

Regards,
Ifan

On May 23, 2009, at 2:40 AM, John Valdes wrote:

> Hi all,
>
> We have a storage array (a DDN 9550) attached to 8 servers via IB.
> This setup has been running fine for the last 1.5 years or so, with
> the servers running RHEL 5.1 and the OFED (OpenIB) 1.2 stack that's
> included with RHEL 5.1.
>
> Recently, we tried to upgrade to new servers running RHEL 5.3 with
> its bundled OFED 1.3 stack, but now we're seeing frequent timeouts
> resulting in LUN resets and SCSI command aborts between the servers
> and the DDN. As far as we can tell, our IB setup on the servers under
> 5.3 is identical to the setup under 5.1, so we don't know why we're
> seeing the timeouts and resets.
>
> Is anyone aware of any changes when using IB SRP w/ RHEL 5.3 and OFED
> 1.3 vs RHEL 5.1/OFED 1.2 which might be causing this?
>
> For reference, here are some of the details of our setup:
>
> OLD CONFIGURATION
> -----------------
> * SuperMicro P4DP6 motherboard, w/ dual Xeon CPUs (x86, single core
>   "Prestonia"), all circa 2002 hardware
> * Cisco SFS-HCA-X2T7-A1 IB HCA (aka Mellanox Cougar Cub), 133 MHz PCI-X,
>   128 MB memory, Firmware v3.5.917, dual port (port 1 attached to DDN)
> * RHEL 5.1 w/ bundled OFED/OpenIB 1.2
> * ib_mthca module loaded w/o any extra options
> * ib_srp module loaded w/ option "srp_sg_tablesize=255"
> * Connection to DDN established using "srp_daemon" invoked as:
>   "srp_daemon -coe" with options "max_sect=8192,max_cmd_per_lun=5"
>   given in /etc/srp_daemon.conf (Note that due to a bug in the OFED
>   1.2 srp_daemon, the "max_sect=8192" option is ignored, which is OK
>   since we weren't taking advantage of that option).
> * 7 DDN LUNs are accessed by all 8 servers as clustered logical
>   volumes (under RedHat's CLVM) holding GFS filesystems.
> * 8 unique (not-shared) DDN LUNs are accessed by the servers (one LUN
>   per server) as a plain disk holding an ext3 filesystem.
>
> NEW CONFIGURATION
> -----------------
> * SuperMicro H8DME-2 motherboard, w/ dual quad-core AMD Opteron
>   2342, x86_64
> * Cisco SFS-HCA-X2T7-A1 IB HCA (aka Mellanox Cougar Cub), 133 MHz PCI-X,
>   128 MB memory, Firmware v3.5.917, dual port (port 1 attached to DDN)
>   --same card as in old configuration, physically moved to new servers
> * RHEL 5.3 w/ bundled OFED/OpenIB 1.3
> * ib_mthca module loaded w/o any extra options
> * ib_srp module loaded w/ option "srp_sg_tablesize=255"
> * Connection to DDN established using "srp_daemon" invoked as:
>   "srp_daemon -coe -f /etc/ofed/srp_daemon.conf" with options
>   "max_sect=8192,max_cmd_per_lun=5" in srp_daemon.conf
> * 7 DDN LUNs are accessed by all 8 servers as clustered logical
>   volumes (under RedHat's CLVM) holding GFS filesystems.
> * 8 unique (not-shared) DDN LUNs are accessed by the servers (one LUN
>   per server) as a plain disk holding an ext3 filesystem.
>
>
> With the new configuration, timeouts/resets have frequently occurred
> when starting up CLVM on the servers (eg, when the servers scan the
> LUNs looking for the Linux (clustered) LVM data) as well as when doing
> I/O to the mounted filesystems. Just to make sure the CLVM/GFS setup
> wasn't causing problems, we tested the plain ext3 filesystem on the
> non-shared LUN from one of the new servers, and when doing a simple
> "dd" to the LUN, we were still seeing timeouts and LUN resets.
>
> Does any of this sound familiar to anyone? Do you have a recommended
> IB/SRP setup for RHEL 5.3?
>
> John
>
> ----------------------------------------------------------------------
> John Valdes Mathematics and Computer Science Division
> valdes at anl.gov Argonne National Laboratory
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

From bart.vanassche at gmail.com Wed Sep 9 09:38:01 2009
From: bart.vanassche at gmail.com (Bart Van Assche)
Date: Wed, 9 Sep 2009 18:38:01 +0200
Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs
In-Reply-To:
References: <4A9FA945.4070408@vlnb.net> <4AA4F561.504@vlnb.net>
Message-ID:

On Wed, Sep 9, 2009 at 12:29 AM, Chris Worley wrote:
> I'm wondering why it's so easily repeatable by me, and those I work
> with, and nobody else? I have another completely different
> configuration w/ the same issue...

It would help if you could run the following test:
* Connect an SRP initiator back-to-back to the SRP target.
* Install an operating system combination on initiator and target with which SRP did not work properly under heavy load.
* Install and run OpenSM on the target if this software is not yet running on the target.
* Repeat the SRP stress test on the initiator system that is connected back-to-back to the SRP target.

This will tell us whether or not the IB switch or its firmware is causing the SRP issues.

Bart.

From eitan at mellanox.co.il Wed Sep 9 11:27:49 2009
From: eitan at mellanox.co.il (Eitan Zahavi)
Date: Wed, 9 Sep 2009 21:27:49 +0300
Subject: [ofa-general] [PATCH] opensm: improve multicast re-routing requests processing
In-Reply-To:
References: <20090906154901.GF25241@me> <20090906154931.GG25241@me><20090906173900.GK25241@me>
Message-ID:

Hi All,

Sorry I am not following the exact details of the changes made.
But one thing must be clear: the deferred multicast handling is mandatory for the SM. The reason is that many requests may arrive together. Deferred execution enables an immediate response to the client and reduces the computation time dramatically.

The need to respond immediately, rather than wait for the MFT calculation and setting, arises from the fact that there is only one timeout value for MAD requests. Increasing that timeout to cover the expected delay of N MFT changes would make error handling too slow.

The reduced computation comes from the fact that the routing is not recalculated for each request but only after a set of changes has been made. This happens automatically: when the deferred execution runs, all the changes made since the last MFT routing are handled together.

I am not sure what the change is, but please make sure to test it on a large cluster where all nodes join or leave multiple groups at once.

Eitan

Eitan Zahavi
Senior Engineering Director
Mellanox Technologies LTD
Tel:+972-4-9097208
Fax:+972-4-9593245
P.O. Box 586 Yokneam 20692 ISRAEL

________________________________
From: Hal Rosenstock [mailto:hal.rosenstock at gmail.com]
Sent: Wednesday, September 09, 2009 1:23 AM
To: Sasha Khapyorsky
Cc: OpenIB; Eli Dorfman; Eitan Zahavi; Yevgeny Kliteynik
Subject: Re: [ofa-general] [PATCH] opensm: improve multicast re-routing requests processing

On Sun, Sep 6, 2009 at 1:39 PM, Sasha Khapyorsky wrote:

When we have two or more changes in the same multicast group, multiple multicast rerouting requests will be created and processed. To prevent this we will use an array of requests indexed by mlid value minus IB_LID_MCAST_START_HO, and for each multicast group change we will just mark that the specific mlid requires re-routing; "duplicated" requests will be merged there.
Also in this way we will be able to process deletion of multicast group routing entries for already removed groups by just knowing the group's MLID and not using its content - this will let us avoid delaying multicast group deletion ('to_be_deleted' flag) and will simplify many multicast related code flows.

While the delay adds complexity, it is a feature. Delayed deletion (and join) is allowed by IBA and is needed in a fast changing subnet when there are a lot of groups changing. This was seen quite a while ago and was how OpenSM evolved based on field experience and other testing. Eitan is the expert here. IMO support for this needs to be added (back in).

-- Hal

Signed-off-by: Sasha Khapyorsky
---
opensm/include/opensm/osm_sm.h | 4 +-
opensm/opensm/osm_mcast_mgr.c | 27 +++++++++------------
opensm/opensm/osm_sm.c | 49 +++++++++++++---------------------------
3 files changed, 30 insertions(+), 50 deletions(-)

diff --git a/opensm/include/opensm/osm_sm.h b/opensm/include/opensm/osm_sm.h
index 0914a95..986143a 100644
--- a/opensm/include/opensm/osm_sm.h
+++ b/opensm/include/opensm/osm_sm.h
@@ -126,8 +126,8 @@ typedef struct osm_sm {
cl_dispatcher_t *p_disp; cl_plock_t *p_lock; atomic32_t sm_trans_id; - cl_spinlock_t mgrp_lock; - cl_qlist_t mgrp_list; + unsigned mlids_req_max; + uint8_t *mlids_req; osm_sm_mad_ctrl_t mad_ctrl; osm_lid_mgr_t lid_mgr; osm_ucast_mgr_t ucast_mgr;
diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c
index d7c5ce1..dd504ef 100644
--- a/opensm/opensm/osm_mcast_mgr.c
+++ b/opensm/opensm/osm_mcast_mgr.c
@@ -1116,7 +1116,6 @@ static int mcast_mgr_set_mftables(osm_sm_t * sm)
int osm_mcast_mgr_process(osm_sm_t * sm) { cl_qmap_t *p_sw_tbl; - cl_qlist_t *p_list = &sm->mgrp_list; osm_mgrp_t *p_mgrp; int i, ret = 0;
@@ -1150,16 +1149,14 @@ int osm_mcast_mgr_process(osm_sm_t * sm)
mcast_mgr_process_mgrp(sm, p_mgrp); } + memset(sm->mlids_req, 0, sm->mlids_req_max); + sm->mlids_req_max = 0; + /* Walk the switches and download the tables for
each. */ ret = mcast_mgr_set_mftables(sm); - while (!cl_is_qlist_empty(p_list)) { - cl_list_item_t *p = cl_qlist_remove_head(p_list); - free(p); - } - exit: CL_PLOCK_RELEASE(sm->p_lock); @@ -1174,11 +1171,10 @@ exit: **********************************************************************/ int osm_mcast_mgr_process_mgroups(osm_sm_t * sm) { - cl_qlist_t *p_list = &sm->mgrp_list; osm_mgrp_t *p_mgrp; ib_net16_t mlid; - osm_mcast_mgr_ctxt_t *ctx; int ret = 0; + unsigned i; OSM_LOG_ENTER(sm->p_log); @@ -1192,14 +1188,12 @@ int osm_mcast_mgr_process_mgroups(osm_sm_t * sm) goto exit; } - while (!cl_is_qlist_empty(p_list)) { - ctx = (osm_mcast_mgr_ctxt_t *) cl_qlist_remove_head(p_list); - - /* nice copy no warning on size diff */ - memcpy(&mlid, &ctx->mlid, sizeof(mlid)); + for (i = 0; i <= sm->mlids_req_max; i++) { + if (!sm->mlids_req[i]) + continue; + sm->mlids_req[i] = 0; - /* we can destroy the context now */ - free(ctx); + mlid = cl_hton16(i + IB_LID_MCAST_START_HO); /* since we delayed the execution we prefer to pass the mlid as the mgrp identifier and then find it or abort */ @@ -1223,6 +1217,9 @@ int osm_mcast_mgr_process_mgroups(osm_sm_t * sm) mcast_mgr_process_mgrp(sm, p_mgrp); } + memset(sm->mlids_req, 0, sm->mlids_req_max); + sm->mlids_req_max = 0; + /* Walk the switches and download the tables for each. 
*/ diff --git a/opensm/opensm/osm_sm.c b/opensm/opensm/osm_sm.c index 50aee91..e446c9d 100644 --- a/opensm/opensm/osm_sm.c +++ b/opensm/opensm/osm_sm.c @@ -166,7 +166,6 @@ void osm_sm_construct(IN osm_sm_t * p_sm) cl_event_construct(&p_sm->subnet_up_event); cl_event_wheel_construct(&p_sm->trap_aging_tracker); cl_thread_construct(&p_sm->sweeper); - cl_spinlock_construct(&p_sm->mgrp_lock); osm_sm_mad_ctrl_construct(&p_sm->mad_ctrl); osm_lid_mgr_construct(&p_sm->lid_mgr); osm_ucast_mgr_construct(&p_sm->ucast_mgr); @@ -234,8 +233,8 @@ void osm_sm_destroy(IN osm_sm_t * p_sm) cl_event_destroy(&p_sm->signal_event); cl_event_destroy(&p_sm->subnet_up_event); cl_spinlock_destroy(&p_sm->signal_lock); - cl_spinlock_destroy(&p_sm->mgrp_lock); cl_spinlock_destroy(&p_sm->state_lock); + free(p_sm->mlids_req); osm_log(p_sm->p_log, OSM_LOG_SYS, "Exiting SM\n"); /* Format Waived */ OSM_LOG_EXIT(p_sm->p_log); @@ -288,11 +287,14 @@ ib_api_status_t osm_sm_init(IN osm_sm_t * p_sm, IN osm_subn_t * p_subn, if (status != CL_SUCCESS) goto Exit; - cl_qlist_init(&p_sm->mgrp_list); - - status = cl_spinlock_init(&p_sm->mgrp_lock); - if (status != CL_SUCCESS) + p_sm->mlids_req_max = 0; + p_sm->mlids_req = malloc((IB_LID_MCAST_END_HO - IB_LID_MCAST_START_HO + + 1) * sizeof(p_sm->mlids_req[0])); + if (!p_sm->mlids_req) goto Exit; + memset(p_sm->mlids_req, 0, + (IB_LID_MCAST_END_HO - IB_LID_MCAST_START_HO + + 1) * sizeof(p_sm->mlids_req[0])); status = osm_sm_mad_ctrl_init(&p_sm->mad_ctrl, p_sm->p_subn, p_sm->p_mad_pool, p_sm->p_vl15, @@ -441,32 +443,15 @@ Exit: /********************************************************************** **********************************************************************/ -static ib_api_status_t sm_mgrp_process(IN osm_sm_t * p_sm, - IN osm_mgrp_t * p_mgrp) +static void request_mlid(osm_sm_t * sm, uint16_t mlid) { - osm_mcast_mgr_ctxt_t *ctx; - - /* - * 'Schedule' all the QP0 traffic for when the state manager - * isn't busy trying to do something else. 
- */ - ctx = malloc(sizeof(*ctx)); - if (!ctx) - return IB_ERROR; - memset(ctx, 0, sizeof(*ctx)); - ctx->mlid = p_mgrp->mlid; - - cl_spinlock_acquire(&p_sm->mgrp_lock); - cl_qlist_insert_tail(&p_sm->mgrp_list, &ctx->list_item); - cl_spinlock_release(&p_sm->mgrp_lock); - - osm_sm_signal(p_sm, OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST); - - return IB_SUCCESS; + mlid -= IB_LID_MCAST_START_HO; + sm->mlids_req[mlid] = 1; + if (sm->mlids_req_max < mlid) + sm->mlids_req_max = mlid; + osm_sm_signal(sm, OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST); } -/********************************************************************** - **********************************************************************/ ib_api_status_t osm_sm_mcgrp_join(IN osm_sm_t * p_sm, IN osm_mgrp_t *mgrp, IN const ib_net64_t port_guid) { @@ -519,7 +504,7 @@ ib_api_status_t osm_sm_mcgrp_join(IN osm_sm_t * p_sm, IN osm_mgrp_t *mgrp, goto Exit; } - status = sm_mgrp_process(p_sm, mgrp); + request_mlid(p_sm, cl_ntoh16(mgrp->mlid)); Exit: CL_PLOCK_RELEASE(p_sm->p_lock); OSM_LOG_EXIT(p_sm->p_log); @@ -527,8 +512,6 @@ Exit: return status; } -/********************************************************************** - **********************************************************************/ ib_api_status_t osm_sm_mcgrp_leave(IN osm_sm_t * p_sm, IN osm_mgrp_t *mgrp, IN const ib_net64_t port_guid) { @@ -557,7 +540,7 @@ ib_api_status_t osm_sm_mcgrp_leave(IN osm_sm_t * p_sm, IN osm_mgrp_t *mgrp, osm_port_remove_mgrp(p_port, mgrp); - status = sm_mgrp_process(p_sm, mgrp); + request_mlid(p_sm, cl_hton16(mgrp->mlid)); Exit: CL_PLOCK_RELEASE(p_sm->p_lock); OSM_LOG_EXIT(p_sm->p_log); -- 1.6.4.2 _______________________________________________ general mailing list general at lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general -------------- next part -------------- An HTML attachment was scrubbed... 
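
Editorial sketch of the request-coalescing idea discussed in this thread. The patch above replaces a queue of per-change requests with an array of flags indexed by MLID minus the multicast LID base, so that repeated changes to one group collapse into a single pending re-route, which is then handled in one deferred pass. The code below is a minimal standalone illustration under stated assumptions, not the OpenSM implementation: `struct reroute_reqs`, `reqs_init`, `reqs_mark`, and `reqs_drain` are hypothetical names, the constants 0xC000 and 0xFFFE are the multicast LID range that IB_LID_MCAST_START_HO and IB_LID_MCAST_END_HO are assumed to denote, MLIDs are taken in host byte order, and locking and the actual MFT computation are left out.

```c
#include <stdint.h>
#include <string.h>

/* Multicast LID range in host byte order (assumed values). */
#define MCAST_LID_START 0xC000u
#define MCAST_LID_END   0xFFFEu
#define MCAST_LID_COUNT (MCAST_LID_END - MCAST_LID_START + 1u)

/* Hypothetical stand-in for the request state the patch keeps in osm_sm_t. */
struct reroute_reqs {
	uint8_t pending[MCAST_LID_COUNT]; /* 1 = this MLID needs re-routing */
	unsigned max_idx;                 /* highest index marked so far */
	int any;                          /* nonzero once anything is marked */
};

static void reqs_init(struct reroute_reqs *r)
{
	memset(r, 0, sizeof(*r));
}

/*
 * Mark an MLID as needing re-routing. Marking the same MLID again is a
 * no-op, which is how duplicate requests get merged.
 * Returns 0 on success, -1 if the LID is outside the multicast range.
 */
static int reqs_mark(struct reroute_reqs *r, uint16_t mlid)
{
	unsigned idx;

	if (mlid < MCAST_LID_START || mlid > MCAST_LID_END)
		return -1;
	idx = mlid - MCAST_LID_START;
	r->pending[idx] = 1;
	if (!r->any || idx > r->max_idx)
		r->max_idx = idx;
	r->any = 1;
	return 0;
}

/*
 * Deferred pass: invoke the handler (if given) once per marked MLID,
 * clear the marks, and return how many distinct MLIDs were handled.
 */
static unsigned reqs_drain(struct reroute_reqs *r,
			   void (*handler)(uint16_t mlid, void *ctx),
			   void *ctx)
{
	unsigned i, handled = 0;

	if (!r->any)
		return 0;
	for (i = 0; i <= r->max_idx; i++) {
		if (!r->pending[i])
			continue;
		r->pending[i] = 0;
		if (handler)
			handler((uint16_t)(MCAST_LID_START + i), ctx);
		handled++;
	}
	r->max_idx = 0;
	r->any = 0;
	return handled;
}
```

In this sketch a join and a leave on the same group before the next deferred pass cost one handler call rather than two, which is the saving Eitan describes for bursts of membership changes. The separate `any` flag is a design choice of the sketch: it keeps "nothing pending" distinct from "only index 0 pending".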
From rdreier at cisco.com Wed Sep 9 11:29:40 2009
From: rdreier at cisco.com (Roland Dreier)
Date: Wed, 09 Sep 2009 11:29:40 -0700
Subject: [ofa-general] Re: [PATCH 2/2] RDMA/cxgb3: clean up properly on FW mismatch failures.
In-Reply-To: <20090908213006.15369.19707.stgit@build.ogc.int> (Steve Wise's message of "Tue, 08 Sep 2009 16:30:06 -0500")
References: <20090908213001.15369.15629.stgit@build.ogc.int> <20090908213006.15369.19707.stgit@build.ogc.int>
Message-ID:

thanks, applied both patches.

From vst at vlnb.net Wed Sep 9 11:38:30 2009
From: vst at vlnb.net (Vladislav Bolkhovitin)
Date: Wed, 09 Sep 2009 22:38:30 +0400
Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs
In-Reply-To:
References: <4A9FA945.4070408@vlnb.net> <4AA4F561.504@vlnb.net>
Message-ID: <4AA7F626.7060804@vlnb.net>

Chris Worley, on 09/09/2009 02:29 AM wrote:
> On Mon, Sep 7, 2009 at 1:58 PM, Vladislav Bolkhovitin wrote:
>> Chris Worley, on 09/06/2009 05:41 PM wrote:
>>> On Sun, Sep 6, 2009 at 3:36 PM, Chris Worley wrote:
>>>> On Sun, Sep 6, 2009 at 3:17 PM, Bart Van Assche
>>>> wrote:
>>>>> On Fri, Sep 4, 2009 at 1:20 AM, Chris Worley wrote:
>>>>>> On Thu, Sep 3, 2009 at 11:38 AM, Chris Worley wrote:
>>>>>>> I've used a couple of initiators (different systems) w/ different
>>>>>>> OSes, w/ different IB cards (all QDR) and different IB stacks
>>>>>>> (built-in vs. OFED) and can repeat the problem in all but the
>>>>>>> RHEL5.2/OFED 1.4.1 target and initiator (but, if the initiator is
>>>>>>> WinOF and the target is RHEL5.2/OFED1.4.1, then the problem does
>>>>>>> repeat).
>>>>>> Here's a twist: I used the Ubuntu initiator w/ one of the RHEL
>>>>>> targets, and the RHEL initiator (same machine as was running WinOF
>>>>>> from the beginning of this thread) w/ one of the Ubuntu targets: in
>>>>>> both cases, the problem does not repeat.
>>>>>>
>>>>>> That makes it sound like OFED is the cure on either side of the
>>>>>> connection, but does not explain the issue w/ WinOF (which does fail
>>>>>> w/ either Ubuntu or RHEL targets).
>>>>> These results are strange. Regarding the Linux-only tests, I was
>>>>> assuming failure of a single component (Ubuntu SRP initiator, OFED SRP
>>>>> initiator, Ubuntu IB driver, OFED IB driver or SRP target), but for
>>>>> each of these components there is at least one test that passes and at
>>>>> least one test that fails. So either my assumption is wrong or one of
>>>>> the above test results is not repeatable. Do you have the time to
>>>>> repeat the Linux-only tests ?
>>>> Last night I was rerunning the RHEL5.2 initiator w/ Ubuntu client, and
>>>> the problem repeated; now, I can't repeat the case where it didn't
>>>> fail. Still, no errors, other than the eventual timeouts previously
>>>> shown; the target thinks all is fine, the initiator is stuck.
>>> ... and I haven't had any success w/ Ubuntu target and initiator, 8.10 or
>>> 9.04.
>> 1. Try with kernel parameter maxcpus=1. It will somehow relax possible races
>> you have, although not completely.
>
> I'll give it a try if/when I can use my systems again.
>
> I've already done the lock testing as suggested, with no problems.
>
> It really seems like a protocol issue... something is getting dropped
> (maybe by the IB hardware), and either no retries or the target
> rejects the retries.
>> 2. Try with another hardware, including motherboard. You can have something
>> like http://lkml.org/lkml/2007/7/31/558 (not exactly it, of course)
>
> I'm wondering why it's so easily repeatable by me, and those I work
> with, and nobody else? I have another completely different
> configuration w/ the same issue...
>
> Today I was running on a dual-socket NHM and an eight-socket Opteron
> boxes.
Both were running RHEL 5.3 w/ 2.6.18-128.7.1.0.1.el5, with all > the SCST recommended modifications (which I haven't been running on my > systems) and OFED 1.4.1. The Opteron was the target. > > The apps on the initiator don't need great speed, just very low > latency. The app run time is ~4 hours... 2.5hours in, the hang > occurs... same symptom, although this isn't raw disks running fio > benchmarks... these are LVM striped (4MB) partitions w/ ext3 fs'es on > top. > > The average block sizes as seen from the target are between 64KB and > 128KB... not the 8KB and smaller I've been talking about in other > tests, although all I can tell is what iostat shows me, and it just > displays averages. > > But, the same issue occurs... the apps on the initiator hang, and the > target thinks all is well. An app will hang in one of the file > systems... the others seem to be working well (even though they are > comprised of the same drives as the hung fs/app), for example: you can > do a "find ." from their root w/o hanging "find", but if you try that > in the fs where the app is hung, "find" will hang. Lvscan/pvscan will > hang too. > > Strangely, restarting the target (removing ib_srpd and scst_vdisk > modules, then re-registering the disks with scst_vdisk and > re-modprobing ib_srpt from scratch) causes the apps on the initiator > to un-hang and make progress again... (but eventually hang again... > seemingly more readily than before). 
> > While nothing other than the messages you'd expect (from > re-registering the drives to the initiator logging in) occur on the > target, the initiator has much to say during this re-registration > period, starting w/ the time-out (that has been shown previously): > > Sep 8 22:04:07 nameme kernel: sd 30:0:0:3: timing out command, waited 360s > Sep 8 22:04:07 nameme kernel: sd 30:0:0:3: SCSI error: return code = 0x06000000 > Sep 8 22:04:07 nameme kernel: end_request: I/O error, dev sdo, sector 45304704 > Sep 8 22:04:16 nameme kernel: sd 29:0:0:8: timing out command, waited 360s > Sep 8 22:04:16 nameme kernel: sd 29:0:0:8: SCSI error: return code = 0x06000000 > Sep 8 22:04:16 nameme kernel: end_request: I/O error, dev sdj, sector 133535344 > Sep 8 22:04:16 nameme kernel: sd 29:0:0:1: timing out command, waited 360s > Sep 8 22:04:16 nameme kernel: sd 29:0:0:1: SCSI error: return code = 0x06000000 > Sep 8 22:04:16 nameme kernel: end_request: I/O error, dev sdc, sector 55150000 > Sep 8 22:04:16 nameme kernel: sd 29:0:0:5: timing out command, waited 360s > Sep 8 22:04:16 nameme kernel: sd 29:0:0:5: SCSI error: return code = 0x06000000 > Sep 8 22:04:16 nameme kernel: end_request: I/O error, dev sdg, sector 110218720 > Sep 8 22:04:16 nameme kernel: sd 29:0:0:6: timing out command, waited 360s > Sep 8 22:04:16 nameme kernel: sd 29:0:0:6: SCSI error: return code = 0x06000000 > Sep 8 22:04:16 nameme kernel: end_request: I/O error, dev sdh, sector 133521792 > Sep 8 22:04:16 nameme kernel: sd 30:0:0:9: timing out command, waited 360s > Sep 8 22:04:16 nameme kernel: sd 30:0:0:9: SCSI error: return code = 0x06000000 > Sep 8 22:04:16 nameme kernel: end_request: I/O error, dev sdu, sector 123498408 > Sep 8 22:04:16 nameme kernel: sd 29:0:0:7: timing out command, waited 360s > Sep 8 22:04:16 nameme kernel: sd 29:0:0:7: SCSI error: return code = 0x06000000 > Sep 8 22:04:16 nameme kernel: end_request: I/O error, dev sdi, sector 61892872 > Sep 8 22:04:26 nameme kernel: sd 29:0:0:0: 
timing out command, waited 360s > Sep 8 22:04:26 nameme kernel: sd 29:0:0:0: SCSI error: return code = 0x06000000 > Sep 8 22:04:26 nameme kernel: end_request: I/O error, dev sdb, sector 111165032 > Sep 8 22:04:35 nameme kernel: sd 29:0:0:3: timing out command, waited 360s > Sep 8 22:04:35 nameme kernel: sd 29:0:0:3: SCSI error: return code = 0x06000000 > Sep 8 22:04:35 nameme kernel: end_request: I/O error, dev sde, sector 110192456 > Sep 8 22:04:35 nameme kernel: sd 29:0:0:4: timing out command, waited 360s > Sep 8 22:04:35 nameme kernel: sd 29:0:0:4: SCSI error: return code = 0x06000000 > Sep 8 22:04:35 nameme kernel: end_request: I/O error, dev sdf, sector 81712464 > Sep 8 22:04:45 nameme kernel: sd 29:0:0:9: timing out command, waited 360s > Sep 8 22:04:45 nameme kernel: sd 29:0:0:9: SCSI error: return code = 0x06000000 > Sep 8 22:04:45 nameme kernel: end_request: I/O error, dev sdk, sector 133530608 > Sep 8 22:04:45 nameme kernel: sd 30:0:0:4: timing out command, waited 360s > Sep 8 22:04:45 nameme kernel: sd 30:0:0:4: SCSI error: return code = 0x06000000 > Sep 8 22:04:45 nameme kernel: end_request: I/O error, dev sdp, sector 133527104 > Sep 8 22:04:45 nameme kernel: sd 30:0:0:8: timing out command, waited 360s > Sep 8 22:04:45 nameme kernel: sd 30:0:0:8: SCSI error: return code = 0x06000000 > Sep 8 22:04:45 nameme kernel: end_request: I/O error, dev sdt, sector 133534080 > Sep 8 22:04:45 nameme kernel: sd 30:0:0:5: timing out command, waited 360s > Sep 8 22:04:45 nameme kernel: sd 30:0:0:5: SCSI error: return code = 0x06000000 > Sep 8 22:04:45 nameme kernel: end_request: I/O error, dev sdq, sector 133525384 > Sep 8 22:04:45 nameme kernel: sd 30:0:0:6: timing out command, waited 360s > Sep 8 22:04:45 nameme kernel: sd 30:0:0:6: SCSI error: return code = 0x06000000 > Sep 8 22:04:45 nameme kernel: end_request: I/O error, dev sdr, sector 136997696 > Sep 8 22:04:50 nameme kernel: sd 30:0:0:0: timing out command, waited 360s > Sep 8 22:04:50 nameme kernel: sd 
30:0:0:0: SCSI error: return code = 0x06000000 > Sep 8 22:04:50 nameme kernel: end_request: I/O error, dev sdl, sector 133540752 > Sep 8 22:05:24 nameme kernel: sd 29:0:0:2: timing out command, waited 360s > Sep 8 22:05:24 nameme kernel: sd 29:0:0:2: SCSI error: return code = 0x06000000 > Sep 8 22:05:24 nameme kernel: end_request: I/O error, dev sdd, sector 138572160 > Sep 8 22:06:37 nameme kernel: host27: ib_srp: DREQ received - > connection closed > Sep 8 22:06:37 nameme kernel: host28: ib_srp: DREQ received - > connection closed > Sep 8 22:06:37 nameme kernel: host29: ib_srp: DREQ received - > connection closed > Sep 8 22:06:37 nameme kernel: host30: ib_srp: DREQ received - > connection closed > Sep 8 22:06:37 nameme kernel: host31: ib_srp: DREQ received - > connection closed > Sep 8 22:06:37 nameme kernel: host32: ib_srp: DREQ received - > connection closed > Sep 8 22:06:39 nameme kernel: host27: ib_srp: connection closed > Sep 8 22:06:39 nameme kernel: ib_srp: host27: add qp_in_err timer > Sep 8 22:06:39 nameme kernel: host28: ib_srp: connection closed > Sep 8 22:06:39 nameme kernel: ib_srp: host28: add qp_in_err timer > Sep 8 22:06:39 nameme kernel: host29: ib_srp: connection closed > Sep 8 22:06:39 nameme kernel: ib_srp: host29: add qp_in_err timer > Sep 8 22:06:39 nameme kernel: host30: ib_srp: connection closed > Sep 8 22:06:39 nameme kernel: ib_srp: host30: add qp_in_err timer > Sep 8 22:06:39 nameme kernel: host31: ib_srp: connection closed > Sep 8 22:06:39 nameme kernel: ib_srp: host31: add qp_in_err timer > Sep 8 22:06:39 nameme kernel: host32: ib_srp: connection closed > Sep 8 22:06:39 nameme kernel: ib_srp: host32: add qp_in_err timer > Sep 8 22:07:04 nameme kernel: host27: ib_srp: srp_qp_in_err_timer called > Sep 8 22:07:04 nameme kernel: host27: ib_srp: srp_qp_in_err_timer > flushed reset - done > Sep 8 22:07:04 nameme kernel: host28: ib_srp: srp_qp_in_err_timer called > Sep 8 22:07:04 nameme kernel: host28: ib_srp: srp_qp_in_err_timer > flushed 
reset - done > Sep 8 22:07:04 nameme kernel: host29: ib_srp: srp_qp_in_err_timer called > Sep 8 22:07:04 nameme kernel: host29: ib_srp: srp_qp_in_err_timer > flushed reset - done > Sep 8 22:07:04 nameme kernel: host30: ib_srp: srp_qp_in_err_timer called > Sep 8 22:07:04 nameme kernel: host30: ib_srp: srp_qp_in_err_timer > flushed reset - done > Sep 8 22:07:04 nameme kernel: host31: ib_srp: srp_qp_in_err_timer called > Sep 8 22:07:04 nameme kernel: host31: ib_srp: srp_qp_in_err_timer > flushed reset - done > Sep 8 22:07:04 nameme kernel: host32: ib_srp: srp_qp_in_err_timer called > Sep 8 22:07:04 nameme kernel: host32: ib_srp: srp_qp_in_err_timer > flushed reset - done > Sep 8 22:09:10 nameme kernel: host27: ib_srp: DREQ received - > connection closed > Sep 8 22:09:10 nameme kernel: host28: ib_srp: DREQ received - > connection closed > Sep 8 22:09:10 nameme kernel: host29: ib_srp: DREQ received - > connection closed > Sep 8 22:09:10 nameme kernel: host30: ib_srp: DREQ received - > connection closed > Sep 8 22:09:10 nameme kernel: host31: ib_srp: DREQ received - > connection closed > Sep 8 22:09:10 nameme kernel: host32: ib_srp: DREQ received - > connection closed > Sep 8 22:09:12 nameme kernel: host27: ib_srp: connection closed > Sep 8 22:09:12 nameme kernel: ib_srp: host27: add qp_in_err timer > Sep 8 22:09:12 nameme kernel: host28: ib_srp: connection closed > Sep 8 22:09:12 nameme kernel: ib_srp: host28: add qp_in_err timer > Sep 8 22:09:12 nameme kernel: host29: ib_srp: connection closed > Sep 8 22:09:12 nameme kernel: ib_srp: host29: add qp_in_err timer > Sep 8 22:09:12 nameme kernel: host30: ib_srp: connection closed > Sep 8 22:09:12 nameme kernel: ib_srp: host30: add qp_in_err timer > Sep 8 22:09:12 nameme kernel: host31: ib_srp: connection closed > Sep 8 22:09:12 nameme kernel: ib_srp: host31: add qp_in_err timer > Sep 8 22:09:12 nameme kernel: host32: ib_srp: connection closed > Sep 8 22:09:12 nameme kernel: ib_srp: host32: add qp_in_err timer > Sep 8 
22:09:37 nameme kernel: host27: ib_srp: srp_qp_in_err_timer called > Sep 8 22:09:37 nameme kernel: host27: ib_srp: srp_qp_in_err_timer > flushed reset - done > Sep 8 22:09:37 nameme kernel: host28: ib_srp: srp_qp_in_err_timer called > Sep 8 22:09:37 nameme kernel: host28: ib_srp: srp_qp_in_err_timer > flushed reset - done > Sep 8 22:09:37 nameme kernel: host29: ib_srp: srp_qp_in_err_timer called > Sep 8 22:09:37 nameme kernel: host29: ib_srp: srp_qp_in_err_timer > flushed reset - done > Sep 8 22:09:37 nameme kernel: host30: ib_srp: srp_qp_in_err_timer called > Sep 8 22:09:37 nameme kernel: host30: ib_srp: srp_qp_in_err_timer > flushed reset - done > Sep 8 22:09:37 nameme kernel: host31: ib_srp: srp_qp_in_err_timer called > Sep 8 22:09:37 nameme kernel: host31: ib_srp: srp_qp_in_err_timer > flushed reset - done > Sep 8 22:09:37 nameme kernel: host32: ib_srp: srp_qp_in_err_timer called > Sep 8 22:09:37 nameme kernel: host32: ib_srp: srp_qp_in_err_timer > flushed reset - done > Sep 8 22:10:36 nameme kernel: host27: ib_srp: DREQ received - > connection closed > Sep 8 22:10:36 nameme kernel: host28: ib_srp: DREQ received - > connection closed > Sep 8 22:10:36 nameme kernel: host29: ib_srp: DREQ received - > connection closed > Sep 8 22:10:36 nameme kernel: host30: ib_srp: DREQ received - > connection closed > Sep 8 22:10:36 nameme kernel: host31: ib_srp: DREQ received - > connection closed > Sep 8 22:10:36 nameme kernel: host32: ib_srp: DREQ received - > connection closed > Sep 8 22:10:39 nameme kernel: host27: ib_srp: connection closed > Sep 8 22:10:39 nameme kernel: ib_srp: host27: add qp_in_err timer > Sep 8 22:10:39 nameme kernel: host28: ib_srp: connection closed > Sep 8 22:10:39 nameme kernel: ib_srp: host28: add qp_in_err timer > Sep 8 22:10:39 nameme kernel: host29: ib_srp: connection closed > Sep 8 22:10:39 nameme kernel: ib_srp: host29: add qp_in_err timer > Sep 8 22:10:39 nameme kernel: host30: ib_srp: connection closed > Sep 8 22:10:39 nameme kernel: ib_srp: 
host30: add qp_in_err timer > Sep 8 22:10:39 nameme kernel: host31: ib_srp: connection closed > Sep 8 22:10:39 nameme kernel: ib_srp: host31: add qp_in_err timer > Sep 8 22:10:39 nameme kernel: host32: ib_srp: connection closed > Sep 8 22:10:39 nameme kernel: ib_srp: host32: add qp_in_err timer > Sep 8 22:11:04 nameme kernel: host27: ib_srp: srp_qp_in_err_timer called > Sep 8 22:11:04 nameme kernel: host27: ib_srp: srp_qp_in_err_timer > flushed reset - done > Sep 8 22:11:04 nameme kernel: host28: ib_srp: srp_qp_in_err_timer called > Sep 8 22:11:04 nameme kernel: host28: ib_srp: srp_qp_in_err_timer > flushed reset - done > Sep 8 22:11:04 nameme kernel: host29: ib_srp: srp_qp_in_err_timer called > Sep 8 22:11:04 nameme kernel: host29: ib_srp: srp_qp_in_err_timer > flushed reset - done > Sep 8 22:11:04 nameme kernel: host30: ib_srp: srp_qp_in_err_timer called > Sep 8 22:11:04 nameme kernel: host30: ib_srp: srp_qp_in_err_timer > flushed reset - done > Sep 8 22:11:04 nameme kernel: host31: ib_srp: srp_qp_in_err_timer called > Sep 8 22:11:04 nameme kernel: host31: ib_srp: srp_qp_in_err_timer > flushed reset - done > Sep 8 22:11:04 nameme kernel: host32: ib_srp: srp_qp_in_err_timer called > Sep 8 22:11:04 nameme kernel: host32: ib_srp: srp_qp_in_err_timer > flushed reset - done > > > Here is the start of another hang, un-hung by restarting the target, > > Sep 8 22:32:33 nameme kernel: sd 29:0:0:7: timing out command, waited 360s > Sep 8 22:32:33 nameme kernel: sd 29:0:0:7: SCSI error: return code = 0x06000000 > Sep 8 22:32:33 nameme kernel: end_request: I/O error, dev sdi, sector 123549288 > Sep 8 22:32:33 nameme kernel: device-mapper: multipath: Failing path 8:128. 
> Sep 8 22:32:33 nameme kernel: sd 29:0:0:8: timing out command, waited 360s > Sep 8 22:32:33 nameme kernel: sd 29:0:0:8: SCSI error: return code = 0x06000000 > Sep 8 22:32:33 nameme kernel: end_request: I/O error, dev sdj, sector 123538312 > Sep 8 22:32:33 nameme kernel: device-mapper: multipath: Failing path 8:144. > Sep 8 22:38:33 nameme kernel: sd 29:0:0:8: timing out command, waited 360s > Sep 8 22:38:33 nameme kernel: sd 29:0:0:8: SCSI error: return code = 0x06000000 > Sep 8 22:38:33 nameme kernel: end_request: I/O error, dev sdj, sector 140607872 > Sep 8 22:38:33 nameme kernel: sd 29:0:0:7: timing out command, waited 360s > Sep 8 22:38:33 nameme kernel: sd 29:0:0:7: SCSI error: return code = 0x06000000 > Sep 8 22:38:33 nameme kernel: end_request: I/O error, dev sdi, sector 123549416 > Sep 8 22:38:33 nameme kernel: sd 29:0:0:0: timing out command, waited 360s > Sep 8 22:38:33 nameme kernel: sd 29:0:0:0: SCSI error: return code = 0x06000000 > Sep 8 22:38:33 nameme kernel: end_request: I/O error, dev sdb, sector 123535744 > Sep 8 22:38:33 nameme kernel: device-mapper: multipath: Failing path 8:16. > Sep 8 22:38:33 nameme kernel: sd 29:0:0:2: timing out command, waited 360s > Sep 8 22:38:33 nameme kernel: sd 29:0:0:2: SCSI error: return code = 0x06000000 > Sep 8 22:38:33 nameme kernel: end_request: I/O error, dev sdd, sector 123535744 > Sep 8 22:38:33 nameme kernel: device-mapper: multipath: Failing path 8:48. > Sep 8 22:38:33 nameme kernel: sd 29:0:0:6: timing out command, waited 360s > Sep 8 22:38:33 nameme kernel: sd 29:0:0:6: SCSI error: return code = 0x06000000 > Sep 8 22:38:33 nameme kernel: end_request: I/O error, dev sdh, sector 123534960 > Sep 8 22:38:33 nameme kernel: device-mapper: multipath: Failing path 8:112. 
> Sep 8 22:40:39 nameme kernel: sd 29:0:0:3: timing out command, waited 360s > Sep 8 22:40:39 nameme kernel: sd 29:0:0:3: SCSI error: return code = 0x06000000 It means that the ini driver stopped accepting commands and not recovered after full line of resets. > Sep 8 22:40:39 nameme kernel: end_request: I/O error, dev sde, sector 123559464 > Sep 8 22:40:39 nameme kernel: device-mapper: multipath: Failing path 8:64. > Sep 8 22:41:32 nameme kernel: host27: ib_srp: DREQ received - > connection closed > Sep 8 22:41:32 nameme kernel: host28: ib_srp: DREQ received - > connection closed > Sep 8 22:41:32 nameme kernel: host29: ib_srp: DREQ received - > connection closed > Sep 8 22:41:32 nameme kernel: host30: ib_srp: DREQ received - > connection closed > Sep 8 22:41:32 nameme kernel: host31: ib_srp: DREQ received - > connection closed > Sep 8 22:41:32 nameme kernel: host32: ib_srp: DREQ received - > connection closed > Sep 8 22:41:34 nameme kernel: host27: ib_srp: connection closed > Sep 8 22:41:34 nameme kernel: ib_srp: host27: add qp_in_err timer > Sep 8 22:41:34 nameme kernel: host28: ib_srp: connection closed > Sep 8 22:41:34 nameme kernel: ib_srp: host28: add qp_in_err timer > Sep 8 22:41:34 nameme kernel: host29: ib_srp: connection closed > Sep 8 22:41:34 nameme kernel: ib_srp: host29: add qp_in_err timer > Sep 8 22:41:34 nameme kernel: host30: ib_srp: connection closed > Sep 8 22:41:34 nameme kernel: ib_srp: host30: add qp_in_err timer > Sep 8 22:41:34 nameme kernel: host31: ib_srp: connection closed > Sep 8 22:41:34 nameme kernel: ib_srp: host31: add qp_in_err timer > Sep 8 22:41:34 nameme kernel: host32: ib_srp: connection closed > Sep 8 22:41:34 nameme kernel: ib_srp: host32: add qp_in_err timer > Sep 8 22:41:59 nameme kernel: host27: ib_srp: srp_qp_in_err_timer called > Sep 8 22:41:59 nameme kernel: host27: ib_srp: srp_qp_in_err_timer > flushed reset - done > Sep 8 22:41:59 nameme kernel: host28: ib_srp: srp_qp_in_err_timer called > Sep 8 22:41:59 nameme kernel: 
host28: ib_srp: srp_qp_in_err_timer > flushed reset - done > Sep 8 22:41:59 nameme kernel: host29: ib_srp: srp_qp_in_err_timer called > Sep 8 22:41:59 nameme kernel: host29: ib_srp: srp_qp_in_err_timer > flushed reset - done > Sep 8 22:41:59 nameme kernel: host30: ib_srp: srp_qp_in_err_timer called > Sep 8 22:41:59 nameme kernel: host30: ib_srp: srp_qp_in_err_timer > flushed reset - done > Sep 8 22:41:59 nameme kernel: host31: ib_srp: srp_qp_in_err_timer called > Sep 8 22:41:59 nameme kernel: host31: ib_srp: srp_qp_in_err_timer > flushed reset - done > Sep 8 22:41:59 nameme kernel: host32: ib_srp: srp_qp_in_err_timer called > Sep 8 22:41:59 nameme kernel: host32: ib_srp: srp_qp_in_err_timer Those messages should be analyzed and can be a key. > flushed reset - done > > If I get a chance to work w/ my systems again, I'll see if this > re-registration helps there too. > > Chris >>> Chris >>>> Chris >>>>> Bart. From tziporet at dev.mellanox.co.il Wed Sep 9 11:55:57 2009 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Wed, 09 Sep 2009 21:55:57 +0300 Subject: [ofa-general] ibpanic In-Reply-To: <20090909102837.GE9156@cefeid.wcss.wroc.pl> References: <20090807112526.GD21691@cefeid.wcss.wroc.pl> <20090909102837.GE9156@cefeid.wcss.wroc.pl> Message-ID: <4AA7FA3D.6010707@mellanox.co.il> Pawel Dziekonski wrote: > Hi, > > today I got the following: > > kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11) > perfquery: ibpanic: [20876] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory) > kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11) > perfquery: ibpanic: [21040] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory) > kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11) > perfquery: ibpanic: [21050] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory) > kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11) > perfquery: ibpanic: [21137] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate 
memory) > kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11) > perfquery: ibpanic: [21246] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory) > kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11) > perfquery: ibpanic: [21256] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory) > kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11) > perfquery: ibpanic: [21343] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory) > kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11) > perfquery: ibpanic: [21431] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory) > kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11) > perfquery: ibpanic: [21441] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory) > kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11) > perfquery: ibpanic: [21528] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory) > kernel: NETDEV WATCHDOG: ib0: transmit timed out > kernel: ib0: transmit timeout: latency 9037 msecs > kernel: ib0: queue stopped 1, tx_head 35341, tx_tail 35213 > kernel: NETDEV WATCHDOG: ib0: transmit timed out > kernel: ib0: transmit timeout: latency 10087 msecs > kernel: ib0: queue stopped 1, tx_head 35341, tx_tail 35213 > > Seems that IB connection is lost completely (lustre not working, can't ping > remote IPoIB addresses etc.). > > Is it a hardware problem with IB iface or something else? > > regards, P > > # ibv_devinfo > hca_id: mthca0 > fw_ver: 1.2.0 > node_guid: 0030:487e:0c06:0000 > sys_image_guid: 0030:487e:0c06:0003 > vendor_id: 0x02c9 > vendor_part_id: 25204 > hw_ver: 0xA0 > board_id: SM_0000000003 > phys_port_cnt: 1 > port: 1 > state: PORT_INIT (2) > max_mtu: 2048 (4) > active_mtu: 2048 (4) > sm_lid: 1 > port_lid: 448 > port_lmc: 0x00 > > > > It seems HW issue but to ensure it can you see if it also happened with OFED 1.4? 
If yes, please approach Mellanox support, since you might need a FW update.

Tziporet
>
>

From bs_lists at aakef.fastmail.fm Wed Sep 9 13:18:50 2009
From: bs_lists at aakef.fastmail.fm (Bernd Schubert)
Date: Wed, 9 Sep 2009 22:18:50 +0200
Subject: [ofa-general] ibpanic
In-Reply-To: <4AA7FA3D.6010707@mellanox.co.il>
References: <20090807112526.GD21691@cefeid.wcss.wroc.pl> <20090909102837.GE9156@cefeid.wcss.wroc.pl> <4AA7FA3D.6010707@mellanox.co.il>
Message-ID: <200909092218.50792.bs_lists@aakef.fastmail.fm>

On Wednesday 09 September 2009, Tziporet Koren wrote:
> > Seems that IB connection is lost completely (lustre not working, can't
> > ping remote IPoIB addresses etc.).

> It seems HW issue but to ensure it can you see if it also happened with
> OFED 1.4?
> If yes please approach Mellanox support since you might need a FW update

That might be a bit difficult for Pawel to test. I'm not sure which lustre
version they are running now, but probably not yet 1.8.x, which officially
supports ofed-1.4.
Pawel, if you would rather not test ofed-1.4 with lustre-1.8, we (DDN) have
backported ofed-1.4 support to lustre-1.6. Just mail me at my DDN address if
you are interested.

Cheers,
Bernd

From rdreier at cisco.com Wed Sep 9 13:42:21 2009
From: rdreier at cisco.com (Roland Dreier)
Date: Wed, 09 Sep 2009 13:42:21 -0700
Subject: [ofa-general] [PATCH/RFC] IB/mad: Fix lock-lock-timer deadlock in RMPP code (was: [NEW PATCH] IB/mad: Fix possible lock-lock-timer deadlock)
In-Reply-To: (Bart Van Assche's message of "Tue, 8 Sep 2009 21:09:42 +0200")
References:
Message-ID:

Holding agent->lock across cancel_delayed_work() (which does
del_timer_sync()) in ib_cancel_rmpp_recvs() leads to lockdep reports of
possible lock-timer deadlocks if a consumer ever does something that
connects agent->lock to a lock taken in IRQ context (cf
http://marc.info/?l=linux-rdma&m=125243699026045).

However, it seems this locking is not necessary here, since the locking
did not prevent the rmpp_list from having an item added immediately
after the lock is dropped -- so there must be sufficient synchronization
protecting the rmpp_list without the locking here.  Therefore, we can
fix the lockdep issue by simply deleting the locking.

Hal/Sean, does this look right to you?

---
 drivers/infiniband/core/mad_rmpp.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/mad_rmpp.c b/drivers/infiniband/core/mad_rmpp.c
index 57a3c6f..865c109 100644
--- a/drivers/infiniband/core/mad_rmpp.c
+++ b/drivers/infiniband/core/mad_rmpp.c
@@ -85,12 +85,10 @@ void ib_cancel_rmpp_recvs(struct ib_mad_agent_private *agent)
 	struct mad_rmpp_recv *rmpp_recv, *temp_rmpp_recv;
 	unsigned long flags;
 
-	spin_lock_irqsave(&agent->lock, flags);
 	list_for_each_entry(rmpp_recv, &agent->rmpp_list, list) {
 		cancel_delayed_work(&rmpp_recv->timeout_work);
 		cancel_delayed_work(&rmpp_recv->cleanup_work);
 	}
-	spin_unlock_irqrestore(&agent->lock, flags);
 
 	flush_workqueue(agent->qp_info->port_priv->wq);

From sean.hefty at intel.com Wed Sep 9 14:22:28 2009
From: sean.hefty at intel.com (Sean Hefty)
Date: Wed, 9 Sep 2009 14:22:28 -0700
Subject: [ofa-general] RE: [PATCH/RFC] IB/mad: Fix lock-lock-timer deadlock in RMPP code (was: [NEW PATCH] IB/mad: Fix possible lock-lock-timer deadlock)
In-Reply-To:
References:
Message-ID:

>Holding agent->lock across cancel_delayed_work() (which does
>del_timer_sync()) in ib_cancel_rmpp_recvs() leads to lockdep reports of
>possible lock-timer deadlocks if a consumer ever does something that
>connects agent->lock to a lock taken in IRQ context (cf
>http://marc.info/?l=linux-rdma&m=125243699026045).
>
>However, it seems this locking is not necessary here, since the locking
>did not prevent the rmpp_list from having an item added immediately
>after the lock is dropped -- so there must be sufficient synchronization
>protecting the rmpp_list without the locking here.  Therefore, we can
>fix the lockdep issue by simply deleting the locking.

The locking is needed to protect against items being removed from rmpp_list in
recv_timeout_handler() and recv_cleanup_handler().  No new items should be added
to the rmpp_list when ib_cancel_rmpp_recvs() is running (or there's a separate
bug).

- Sean

From rdreier at cisco.com Wed Sep 9 14:34:41 2009
From: rdreier at cisco.com (Roland Dreier)
Date: Wed, 09 Sep 2009 14:34:41 -0700
Subject: [ofa-general] Re: [PATCH/RFC] IB/mad: Fix lock-lock-timer deadlock in RMPP code
In-Reply-To: (Sean Hefty's message of "Wed, 9 Sep 2009 14:22:28 -0700")
References:
Message-ID:

 > The locking is needed to protect against items being removed from rmpp_list in
 > recv_timeout_handler() and recv_cleanup_handler().  No new items should be added
 > to the rmpp_list when ib_cancel_rmpp_recvs() is running (or there's a separate
 > bug).

Ah, I see.  That's trickier I guess... hmm...

 - R.
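
[Illustrative sketch, not part of the thread and not the fix that was
adopted.] The exchange above turns on two constraints: the list walk in
ib_cancel_rmpp_recvs() has to be protected against the handlers removing
entries, yet a synchronous cancel should not run while agent->lock is held.
One common way to reconcile the two is a two-phase cancel: mark every entry
under the lock so that handlers stop unlinking, drop the lock, then do the
blocking work. The userspace analog below is a minimal sketch of that idea
under stated assumptions: a pthread mutex stands in for the spinlock, and
every name in it (struct agent, struct recv_entry, RECV_CANCELING,
timeout_handler(), cancel_all()) is hypothetical. It also relies on Sean's
point that no new entries are added while the cancel is running.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
enum recv_state { RECV_ACTIVE, RECV_CANCELING };

struct recv_entry {
	struct recv_entry *next;
	enum recv_state state;
};

struct agent {
	pthread_mutex_t lock;		/* stands in for agent->lock */
	struct recv_entry *head;	/* stands in for agent->rmpp_list */
};

void agent_init(struct agent *a)
{
	assert(a != NULL);
	pthread_mutex_init(&a->lock, NULL);
	a->head = NULL;
}

/* Push an entry onto the front of the list, under the lock. */
void agent_add(struct agent *a, struct recv_entry *e)
{
	assert(a != NULL && e != NULL);
	pthread_mutex_lock(&a->lock);
	e->state = RECV_ACTIVE;
	e->next = a->head;
	a->head = e;
	pthread_mutex_unlock(&a->lock);
}

/*
 * Handler side (analog of a timeout/cleanup handler): unlink the entry,
 * unless a cancel is in progress. Returns 1 if the entry was unlinked.
 */
int timeout_handler(struct agent *a, struct recv_entry *e)
{
	struct recv_entry **pp;
	int removed = 0;

	pthread_mutex_lock(&a->lock);
	if (e->state != RECV_CANCELING) {
		for (pp = &a->head; *pp != NULL; pp = &(*pp)->next) {
			if (*pp == e) {
				*pp = e->next;
				e->next = NULL;
				removed = 1;
				break;
			}
		}
	}
	pthread_mutex_unlock(&a->lock);
	return removed;
}

/*
 * Cancel side. Phase 1 marks every entry under the lock. Phase 2 walks
 * the list without the lock, which is safe only because handlers no
 * longer unlink marked entries and nothing is being added. Returns the
 * number of entries visited in phase 2.
 */
int cancel_all(struct agent *a)
{
	struct recv_entry *e;
	int n = 0;

	pthread_mutex_lock(&a->lock);
	for (e = a->head; e != NULL; e = e->next)
		e->state = RECV_CANCELING;
	pthread_mutex_unlock(&a->lock);

	for (e = a->head; e != NULL; e = e->next) {
		/* A blocking cancel of the entry's timers would go here. */
		n++;
	}
	return n;
}
```

The cost of this pattern is an extra state that every handler must check
before it touches the list, so it only helps if all removal paths honor the
mark.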
From arlin.r.davis at intel.com Wed Sep 9 15:14:42 2009 From: arlin.r.davis at intel.com (Davis, Arlin R) Date: Wed, 9 Sep 2009 15:14:42 -0700 Subject: [ofa-general] [PATCH 2/4] DAPL v2: ucm cleanup extra cr/lf Message-ID: Signed-off-by: Arlin Davis --- dapl/openib_ucm/cm.c | 108 +++++++++++++++++++++++++------------------------- 1 files changed, 54 insertions(+), 54 deletions(-) diff --git a/dapl/openib_ucm/cm.c b/dapl/openib_ucm/cm.c index 72de5d5..aa6bb73 100644 --- a/dapl/openib_ucm/cm.c +++ b/dapl/openib_ucm/cm.c @@ -246,12 +246,12 @@ retry: struct ibv_wc wc; /* process completions, based on UCM_SND_BURST */ - ret = ibv_poll_cq(tp->scq, 1, &wc); - if (ret < 0) { + ret = ibv_poll_cq(tp->scq, 1, &wc); + if (ret < 0) { dapl_log(DAPL_DBG_TYPE_WARN, " get_smsg: cq %p %s\n", tp->scq, strerror(errno)); - } + } /* free up completed sends, update tail */ if (ret > 0) { tp->s_tl = (int)wc.wr_id; @@ -268,19 +268,19 @@ retry: /* RECEIVE CM MESSAGE PROCESSING */ static int ucm_post_rmsg(ib_hca_transport_t *tp, ib_cm_msg_t *msg) -{ - struct ibv_recv_wr recv_wr, *recv_err; - struct ibv_sge sge; - - recv_wr.next = NULL; - recv_wr.sg_list = &sge; - recv_wr.num_sge = 1; - recv_wr.wr_id = (uint64_t)(uintptr_t) msg; - sge.length = sizeof(ib_cm_msg_t) + sizeof(struct ibv_grh); - sge.lkey = tp->mr_rbuf->lkey; - sge.addr = (uintptr_t)((char *)msg - sizeof(struct ibv_grh)); - - return (ibv_post_recv(tp->qp, &recv_wr, &recv_err)); +{ + struct ibv_recv_wr recv_wr, *recv_err; + struct ibv_sge sge; + + recv_wr.next = NULL; + recv_wr.sg_list = &sge; + recv_wr.num_sge = 1; + recv_wr.wr_id = (uint64_t)(uintptr_t) msg; + sge.length = sizeof(ib_cm_msg_t) + sizeof(struct ibv_grh); + sge.lkey = tp->mr_rbuf->lkey; + sge.addr = (uintptr_t)((char *)msg - sizeof(struct ibv_grh)); + + return (ibv_post_recv(tp->qp, &recv_wr, &recv_err)); } static int ucm_reject(ib_hca_transport_t *tp, ib_cm_msg_t *msg) @@ -426,18 +426,18 @@ dp_ib_cm_handle_t ucm_cm_find(ib_hca_transport_t *tp, ib_cm_msg_t *msg) /* 
Get rmsgs from CM completion queue, 10 at a time */ static void ucm_recv(ib_hca_transport_t *tp) { - struct ibv_wc wc[10]; - ib_cm_msg_t *msg; - dp_ib_cm_handle_t cm; - int i, ret, notify = 0; - struct ibv_cq *ibv_cq = NULL; - DAPL_HCA *hca; - + struct ibv_wc wc[10]; + ib_cm_msg_t *msg; + dp_ib_cm_handle_t cm; + int i, ret, notify = 0; + struct ibv_cq *ibv_cq = NULL; + DAPL_HCA *hca; + /* POLLIN on channel FD */ ret = ibv_get_cq_event(tp->rch, &ibv_cq, (void *)&hca); if (ret == 0) { ibv_ack_cq_events(ibv_cq, 1); - } + } retry: ret = ibv_poll_cq(tp->rcq, 10, wc); if (ret <= 0) { @@ -493,9 +493,9 @@ static int ucm_send(ib_hca_transport_t *tp, ib_cm_msg_t *msg) { ib_cm_msg_t *smsg = NULL; struct ibv_send_wr wr, *bad_wr; - struct ibv_sge sge; - int len, ret = -1; - uint16_t dlid = ntohs(msg->daddr.ib.lid); + struct ibv_sge sge; + int len, ret = -1; + uint16_t dlid = ntohs(msg->daddr.ib.lid); /* Get message from send queue, copy data, and send */ dapl_os_lock(&tp->slock); @@ -504,39 +504,39 @@ static int ucm_send(ib_hca_transport_t *tp, ib_cm_msg_t *msg) len = ((sizeof(*msg) - DCM_MAX_PDATA_SIZE) + ntohs(msg->p_size)); dapl_os_memcpy(smsg, msg, len); - - wr.next = NULL; - wr.sg_list = &sge; - wr.num_sge = 1; - wr.opcode = IBV_WR_SEND; - wr.wr_id = (unsigned long)tp->s_hd; - wr.send_flags = (wr.wr_id % UCM_SND_BURST) ? 0 : IBV_SEND_SIGNALED; - if (len <= tp->max_inline_send) - wr.send_flags |= IBV_SEND_INLINE; - - sge.length = len; - sge.lkey = tp->mr_sbuf->lkey; - sge.addr = (uintptr_t)smsg; - + + wr.next = NULL; + wr.sg_list = &sge; + wr.num_sge = 1; + wr.opcode = IBV_WR_SEND; + wr.wr_id = (unsigned long)tp->s_hd; + wr.send_flags = (wr.wr_id % UCM_SND_BURST) ? 
0 : IBV_SEND_SIGNALED; + if (len <= tp->max_inline_send) + wr.send_flags |= IBV_SEND_INLINE; + + sge.length = len; + sge.lkey = tp->mr_sbuf->lkey; + sge.addr = (uintptr_t)smsg; + dapl_dbg_log(DAPL_DBG_TYPE_CM, " ucm_send: op %d ln %d lid %x c_qpn %x rport %d\n", ntohs(smsg->op), len, htons(smsg->daddr.ib.lid), - htonl(smsg->dqpn), htons(smsg->dport)); - - /* empty slot, then create AH */ - if (!tp->ah[dlid]) { - tp->ah[dlid] = - dapls_create_ah(tp->hca, tp->pd, tp->qp, - htons(dlid), NULL); - if (!tp->ah[dlid]) - goto bail; - } - + htonl(smsg->dqpn), htons(smsg->dport)); + + /* empty slot, then create AH */ + if (!tp->ah[dlid]) { + tp->ah[dlid] = + dapls_create_ah(tp->hca, tp->pd, tp->qp, + htons(dlid), NULL); + if (!tp->ah[dlid]) + goto bail; + } + wr.wr.ud.ah = tp->ah[dlid]; wr.wr.ud.remote_qpn = ntohl(smsg->dqpn); - wr.wr.ud.remote_qkey = DAT_UD_QKEY; + wr.wr.ud.remote_qkey = DAT_UD_QKEY; - ret = ibv_post_send(tp->qp, &wr, &bad_wr); + ret = ibv_post_send(tp->qp, &wr, &bad_wr); bail: dapl_os_unlock(&tp->slock); return ret; -- 1.5.2.5 From arlin.r.davis at intel.com Wed Sep 9 15:14:33 2009 From: arlin.r.davis at intel.com (Arlin Davis) Date: Wed, 9 Sep 2009 15:14:33 -0700 Subject: [ofa-general] [PATCH 1/4] DAPL v2: ucm: fix issues with UD type QP's Message-ID: private data size not in host order when processing connection events. ud extentions event should include original ia_addr and qpn used during connection and not the IB qpn. ucm QP service resource cleanup in wrong order. cleanup extra cr/lf device.c Signed-off-by: Arlin Davis --- dapl/openib_common/qp.c | 5 + dapl/openib_ucm/cm.c | 71 +++++++++++++--- dapl/openib_ucm/device.c | 206 ++++++++++++++++++++++++---------------------- 3 files changed, 172 insertions(+), 110 deletions(-) diff --git a/dapl/openib_common/qp.c b/dapl/openib_common/qp.c index 581fc83..09a61b1 100644 --- a/dapl/openib_common/qp.c +++ b/dapl/openib_common/qp.c @@ -557,6 +557,7 @@ dapls_create_ah(IN DAPL_HCA *hca, /* address handle. 
RC and UD */ qp_attr.ah_attr.dlid = ntohs(lid); if (gid != NULL) { + dapl_log(DAPL_DBG_TYPE_CM, "dapl_create_ah: with GID\n"); qp_attr.ah_attr.is_global = 1; qp_attr.ah_attr.grh.dgid.global.subnet_prefix = ntohll(gid->global.subnet_prefix); @@ -569,6 +570,10 @@ dapls_create_ah(IN DAPL_HCA *hca, qp_attr.ah_attr.src_path_bits = 0; qp_attr.ah_attr.port_num = hca->port_num; + dapl_log(DAPL_DBG_TYPE_CM, + " dapls_create_ah: port %x lid %x pd %p ctx %p handle 0x%x\n", + hca->port_num,qp_attr.ah_attr.dlid, pd, pd->context, pd->handle); + /* UD: create AH for remote side */ ah = ibv_create_ah(pd, &qp_attr.ah_attr); if (!ah) { diff --git a/dapl/openib_ucm/cm.c b/dapl/openib_ucm/cm.c index a2db64e..72de5d5 100644 --- a/dapl/openib_ucm/cm.c +++ b/dapl/openib_ucm/cm.c @@ -381,7 +381,7 @@ dp_ib_cm_handle_t ucm_cm_find(ib_hca_transport_t *tp, ib_cm_msg_t *msg) continue; dapl_dbg_log(DAPL_DBG_TYPE_CM, - " MATCH? cm %p st %s sport %x sqpn %x lid %x\n", + " MATCH? cm %p st %s sport %d sqpn %x lid %x\n", cm, dapl_cm_state_str(cm->state), ntohs(cm->msg.sport), ntohl(cm->msg.sqpn), ntohs(cm->msg.saddr.ib.lid)); @@ -755,7 +755,7 @@ DAT_RETURN dapli_cm_connect(DAPL_EP *ep, dp_ib_cm_handle_t cm) { dapl_log(DAPL_DBG_TYPE_EP, - " connect: lid %x qpn %x lport %d p_sz=%d -> " + " connect: lid %x i_qpn %x lport %d p_sz=%d -> " " lid %x c_qpn %x rport %d\n", htons(cm->msg.saddr.ib.lid), htonl(cm->msg.saddr.ib.qpn), htons(cm->msg.sport), htons(cm->msg.p_size), @@ -934,6 +934,30 @@ ud_bail: dapl_os_memcpy(&xevent.remote_ah.ia_addr, &cm->msg.daddr, sizeof(union dcm_addr)); + /* remote ia_addr reference includes ucm qpn, not IB qpn */ + ((union dcm_addr*) + &xevent.remote_ah.ia_addr)->ib.qpn = cm->msg.dqpn; + + dapl_dbg_log(DAPL_DBG_TYPE_EP, + " ACTIVE: UD xevent ah %p qpn 0x%x lid 0x%x\n", + xevent.remote_ah.ah, xevent.remote_ah.qpn, lid); + dapl_dbg_log(DAPL_DBG_TYPE_EP, + " ACTIVE: UD xevent ia_addr qp_type %d, port %d" + " lid 0x%x qpn 0x%x gid 0x"F64x" 0x"F64x" \n", + ((union dcm_addr*) + 
&xevent.remote_ah.ia_addr)->ib.qp_type, + ((union dcm_addr*) + &xevent.remote_ah.ia_addr)->ib.port_num, + ntohs(((union dcm_addr*) + &xevent.remote_ah.ia_addr)->ib.lid), + ntohl(((union dcm_addr*) + &xevent.remote_ah.ia_addr)->ib.qpn), + ntohll(((union dcm_addr*) + &xevent.remote_ah.ia_addr)-> + ib.gid.global.subnet_prefix), + ntohll(((union dcm_addr*) + &xevent.remote_ah.ia_addr)-> + ib.gid.global.interface_id)); if (event == IB_CME_CONNECTED) event = DAT_IB_UD_CONNECTION_EVENT_ESTABLISHED; @@ -944,7 +968,7 @@ ud_bail: (DAPL_EVD *)cm->ep->param.connect_evd_handle, event, (DAT_EP_HANDLE)ep, - (DAT_COUNT)cm->msg.p_size, + (DAT_COUNT)ntohs(cm->msg.p_size), (DAT_PVOID *)cm->msg.p_data, (DAT_PVOID *)&xevent); @@ -1026,7 +1050,7 @@ static void ucm_accept(ib_cm_srvc_handle_t cm, ib_cm_msg_t *msg) dapls_evd_post_cr_event_ext(acm->sp, DAT_IB_UD_CONNECTION_REQUEST_EVENT, acm, - (DAT_COUNT)acm->msg.p_size, + (DAT_COUNT)ntohs(acm->msg.p_size), (DAT_PVOID *)acm->msg.p_data, (DAT_PVOID *)&xevent); } else @@ -1070,7 +1094,7 @@ static void ucm_accept_rtu(dp_ib_cm_handle_t cm, ib_cm_msg_t *msg) dapl_dbg_log(DAPL_DBG_TYPE_CM, " PASSIVE: connected!\n"); #ifdef DAT_EXTENSIONS - if (cm->msg.daddr.ib.qp_type == IBV_QPT_UD) { + if (cm->msg.saddr.ib.qp_type == IBV_QPT_UD) { DAT_IB_EXTENSION_EVENT_DATA xevent; uint16_t lid = ntohs(cm->msg.daddr.ib.lid); @@ -1081,13 +1105,37 @@ static void ucm_accept_rtu(dp_ib_cm_handle_t cm, ib_cm_msg_t *msg) xevent.remote_ah.qpn = ntohl(cm->msg.daddr.ib.qpn); dapl_os_memcpy(&xevent.remote_ah.ia_addr, &cm->msg.daddr, - sizeof(cm->msg.daddr)); + sizeof(union dcm_addr)); + /* remote ia_addr reference includes ucm qpn, not IB qpn */ + ((union dcm_addr*) + &xevent.remote_ah.ia_addr)->ib.qpn = cm->msg.dqpn; + + dapl_dbg_log(DAPL_DBG_TYPE_EP, + " PASSIVE: UD xevent ah %p qpn 0x%x lid 0x%x\n", + xevent.remote_ah.ah, xevent.remote_ah.qpn, lid); + dapl_dbg_log(DAPL_DBG_TYPE_EP, + " PASSIVE: UD xevent ia_addr qp_type %d, port %d" + " lid 0x%x qpn 0x%x gid 0x"F64x" 
0x"F64x" \n", + ((union dcm_addr*) + &xevent.remote_ah.ia_addr)->ib.qp_type, + ((union dcm_addr*) + &xevent.remote_ah.ia_addr)->ib.port_num, + ntohs(((union dcm_addr*) + &xevent.remote_ah.ia_addr)->ib.lid), + ntohl(((union dcm_addr*) + &xevent.remote_ah.ia_addr)->ib.qpn), + ntohll(((union dcm_addr*) + &xevent.remote_ah.ia_addr)-> + ib.gid.global.subnet_prefix), + ntohll(((union dcm_addr*) + &xevent.remote_ah.ia_addr)-> + ib.gid.global.interface_id)); dapls_evd_post_connection_event_ext( (DAPL_EVD *)cm->ep->param.connect_evd_handle, DAT_IB_UD_CONNECTION_EVENT_ESTABLISHED, (DAT_EP_HANDLE)cm->ep, - (DAT_COUNT)cm->msg.p_size, + (DAT_COUNT)ntohs(cm->msg.p_size), (DAT_PVOID *)cm->msg.p_data, (DAT_PVOID *)&xevent); @@ -1130,9 +1178,9 @@ dapli_accept_usr(DAPL_EP *ep, DAPL_CR *cr, DAT_COUNT p_size, DAT_PVOID p_data) dapl_dbg_log(DAPL_DBG_TYPE_CM, " ACCEPT_USR: remote port_num=%d lid=%x" " iqp=%x qp_type %d, psize=%d\n", - cm->msg.daddr.ib.port_num, cm->msg.daddr.ib.lid, - cm->msg.daddr.ib.qpn, cm->msg.daddr.ib.qp_type, - cm->msg.p_size); + cm->msg.daddr.ib.port_num, ntohs(cm->msg.daddr.ib.lid), + ntohl(cm->msg.daddr.ib.qpn), cm->msg.daddr.ib.qp_type, + ntohs(cm->msg.p_size)); dapl_dbg_log(DAPL_DBG_TYPE_CM, " ACCEPT_USR: remote GID subnet %016llx id %016llx\n", @@ -1186,7 +1234,7 @@ dapli_accept_usr(DAPL_EP *ep, DAPL_CR *cr, DAT_COUNT p_size, DAT_PVOID p_data) /* setup local QP info and type from EP, copy pdata, for reply */ cm->msg.op = htons(DCM_REP); cm->msg.saddr.ib.qpn = htonl(ep->qp_handle->qp_num); - cm->msg.saddr.ib.qp_type = htons(ep->qp_handle->qp_type); + cm->msg.saddr.ib.qp_type = ep->qp_handle->qp_type; cm->msg.saddr.ib.port_num = cm->hca->port_num; cm->msg.saddr.ib.lid = cm->hca->ib_trans.addr.ib.lid; cm->msg.saddr.ib.gid = cm->hca->ib_trans.addr.ib.gid; @@ -1254,6 +1302,7 @@ dapls_ib_connect(IN DAT_EP_HANDLE ep_handle, /* remote uCM information, comes from consumer provider r_addr */ cm->msg.dport = htons((uint16_t)r_psp); cm->msg.dqpn = cm->msg.daddr.ib.qpn; 
+ cm->msg.daddr.ib.qpn = 0; /* don't have a remote qpn until reply */ if (p_size) { cm->msg.p_size = htons(p_size); diff --git a/dapl/openib_ucm/device.c b/dapl/openib_ucm/device.c index 329b050..243044a 100644 --- a/dapl/openib_ucm/device.c +++ b/dapl/openib_ucm/device.c @@ -281,8 +281,9 @@ found: } dapl_dbg_log(DAPL_DBG_TYPE_UTIL, - " open_hca: devname %s, port %d, hostname_IP %s\n", + " open_hca: devname %s, ctx %p port %d, hostname_IP %s\n", ibv_get_device_name(hca_ptr->ib_trans.ib_dev), + hca_ptr->ib_hca_handle, hca_ptr->ib_trans.addr.ib.port_num, inet_ntoa(((struct sockaddr_in *) &hca_ptr->hca_address)->sin_addr)); @@ -371,72 +372,79 @@ static void ucm_service_destroy(IN DAPL_HCA *hca) ib_hca_transport_t *tp = &hca->ib_trans; int msg_size = sizeof(ib_cm_msg_t); - if (tp->pd) - ibv_dealloc_pd(tp->pd); - - if (tp->rch) - ibv_destroy_comp_channel(tp->rch); - - if (tp->scq) - ibv_destroy_cq(tp->scq); - - if (tp->rcq) - ibv_destroy_cq(tp->rcq); - - if (tp->qp) - ibv_destroy_qp(tp->qp); - if (tp->mr_sbuf) ibv_dereg_mr(tp->mr_sbuf); if (tp->mr_sbuf) - ibv_dereg_mr(tp->mr_sbuf); - - if (tp->ah) - dapl_os_free(tp->ah, (sizeof(*tp->ah) * 0xffff)); - - if (tp->sid) - dapl_os_free(tp->sid, (sizeof(*tp->sid) * 0xffff)); - - if (tp->rbuf) - dapl_os_free(tp->rbuf, (msg_size * tp->qpe)); - - if (tp->sbuf) - dapl_os_free(tp->sbuf, (msg_size * tp->qpe)); + ibv_dereg_mr(tp->mr_sbuf); + + if (tp->qp) + ibv_destroy_qp(tp->qp); + + if (tp->scq) + ibv_destroy_cq(tp->scq); + + if (tp->rcq) + ibv_destroy_cq(tp->rcq); + + if (tp->rch) + ibv_destroy_comp_channel(tp->rch); + + dapl_log(DAPL_DBG_TYPE_UTIL, + " destroy_service: pd %p ctx %p handle 0x%x\n", + tp->pd, tp->pd->context, tp->pd->handle); + if (tp->pd) + ibv_dealloc_pd(tp->pd); + + if (tp->ah) + dapl_os_free(tp->ah, (sizeof(*tp->ah) * 0xffff)); + + if (tp->sid) + dapl_os_free(tp->sid, (sizeof(*tp->sid) * 0xffff)); + + if (tp->rbuf) + dapl_os_free(tp->rbuf, (msg_size * tp->qpe)); + + if (tp->sbuf) + dapl_os_free(tp->sbuf, 
(msg_size * tp->qpe)); } static int ucm_service_create(IN DAPL_HCA *hca) { - struct ibv_qp_init_attr qp_create; - ib_hca_transport_t *tp = &hca->ib_trans; - struct ibv_recv_wr recv_wr, *recv_err; - struct ibv_sge sge; - int i, mlen = sizeof(ib_cm_msg_t); - int hlen = sizeof(struct ibv_grh); /* hdr included with UD recv */ + struct ibv_qp_init_attr qp_create; + ib_hca_transport_t *tp = &hca->ib_trans; + struct ibv_recv_wr recv_wr, *recv_err; + struct ibv_sge sge; + int i, mlen = sizeof(ib_cm_msg_t); + int hlen = sizeof(struct ibv_grh); /* hdr included with UD recv */ dapl_dbg_log(DAPL_DBG_TYPE_UTIL, " ucm_create: \n"); /* get queue sizes */ tp->qpe = dapl_os_get_env_val("DAPL_UCM_QPE", UCM_DEFAULT_QPE); - tp->cqe = dapl_os_get_env_val("DAPL_UCM_CQE", UCM_DEFAULT_CQE); - tp->pd = ibv_alloc_pd(hca->ib_hca_handle); - if (!tp->pd) - goto bail; - - tp->rch = ibv_create_comp_channel(hca->ib_hca_handle); - if (!tp->rch) - goto bail; - - tp->scq = ibv_create_cq(hca->ib_hca_handle, tp->cqe, hca, NULL, 0); - if (!tp->scq) - goto bail; + tp->cqe = dapl_os_get_env_val("DAPL_UCM_CQE", UCM_DEFAULT_CQE); + tp->pd = ibv_alloc_pd(hca->ib_hca_handle); + if (!tp->pd) + goto bail; + + dapl_log(DAPL_DBG_TYPE_UTIL, + " create_service: pd %p ctx %p handle 0x%x\n", + tp->pd, tp->pd->context, tp->pd->handle); + + tp->rch = ibv_create_comp_channel(hca->ib_hca_handle); + if (!tp->rch) + goto bail; + + tp->scq = ibv_create_cq(hca->ib_hca_handle, tp->cqe, hca, NULL, 0); + if (!tp->scq) + goto bail; - tp->rcq = ibv_create_cq(hca->ib_hca_handle, tp->cqe, hca, tp->rch, 0); - if (!tp->rcq) - goto bail; - - if(ibv_req_notify_cq(tp->rcq, 0)) - goto bail; + tp->rcq = ibv_create_cq(hca->ib_hca_handle, tp->cqe, hca, tp->rch, 0); + if (!tp->rcq) + goto bail; + + if(ibv_req_notify_cq(tp->rcq, 0)) + goto bail; dapl_os_memzero((void *)&qp_create, sizeof(qp_create)); qp_create.qp_type = IBV_QPT_UD; @@ -446,59 +454,59 @@ static int ucm_service_create(IN DAPL_HCA *hca) qp_create.cap.max_send_sge = 
qp_create.cap.max_recv_sge = 1; qp_create.cap.max_inline_data = tp->max_inline_send; qp_create.qp_context = (void *)hca; - - tp->qp = ibv_create_qp(tp->pd, &qp_create); - if (!tp->qp) - goto bail; - + + tp->qp = ibv_create_qp(tp->pd, &qp_create); + if (!tp->qp) + goto bail; + tp->ah = (ib_ah_handle_t*) dapl_os_alloc(sizeof(ib_ah_handle_t) * 0xffff); tp->sid = (uint8_t*) dapl_os_alloc(sizeof(uint8_t) * 0xffff); tp->rbuf = (void*) dapl_os_alloc((mlen + hlen) * tp->qpe); tp->sbuf = (void*) dapl_os_alloc(mlen * tp->qpe); - - if (!tp->ah || !tp->rbuf || !tp->sbuf || !tp->sid) - goto bail; - + + if (!tp->ah || !tp->rbuf || !tp->sbuf || !tp->sid) + goto bail; + (void)dapl_os_memzero(tp->ah, (sizeof(ib_ah_handle_t) * 0xffff)); (void)dapl_os_memzero(tp->sid, (sizeof(uint8_t) * 0xffff)); tp->sid[0] = 1; /* resv slot 0, 0 == no ports available */ - (void)dapl_os_memzero(tp->rbuf, ((mlen + hlen) * tp->qpe)); - (void)dapl_os_memzero(tp->sbuf, (mlen * tp->qpe)); - - tp->mr_sbuf = ibv_reg_mr(tp->pd, tp->sbuf, - (mlen * tp->qpe), - IBV_ACCESS_LOCAL_WRITE); - if (!tp->mr_sbuf) - goto bail; - - tp->mr_rbuf = ibv_reg_mr(tp->pd, tp->rbuf, - ((mlen + hlen) * tp->qpe), - IBV_ACCESS_LOCAL_WRITE); - if (!tp->mr_rbuf) - goto bail; - - /* modify UD QP: init, rtr, rts */ - if ((dapls_modify_qp_ud(hca, tp->qp)) != DAT_SUCCESS) - goto bail; - - /* post receive buffers, setup head, tail pointers */ - recv_wr.next = NULL; - recv_wr.sg_list = &sge; - recv_wr.num_sge = 1; - sge.length = mlen + hlen; - sge.lkey = tp->mr_rbuf->lkey; - - for (i = 0; i < tp->qpe; i++) { - recv_wr.wr_id = - (uintptr_t)((char *)&tp->rbuf[i] + - sizeof(struct ibv_grh)); - sge.addr = (uintptr_t) &tp->rbuf[i]; - if (ibv_post_recv(tp->qp, &recv_wr, &recv_err)) - goto bail; - } - - /* save qp_num as part of ia_address, network order */ - tp->addr.ib.qpn = htonl(tp->qp->qp_num); + (void)dapl_os_memzero(tp->rbuf, ((mlen + hlen) * tp->qpe)); + (void)dapl_os_memzero(tp->sbuf, (mlen * tp->qpe)); + + tp->mr_sbuf = 
ibv_reg_mr(tp->pd, tp->sbuf, + (mlen * tp->qpe), + IBV_ACCESS_LOCAL_WRITE); + if (!tp->mr_sbuf) + goto bail; + + tp->mr_rbuf = ibv_reg_mr(tp->pd, tp->rbuf, + ((mlen + hlen) * tp->qpe), + IBV_ACCESS_LOCAL_WRITE); + if (!tp->mr_rbuf) + goto bail; + + /* modify UD QP: init, rtr, rts */ + if ((dapls_modify_qp_ud(hca, tp->qp)) != DAT_SUCCESS) + goto bail; + + /* post receive buffers, setup head, tail pointers */ + recv_wr.next = NULL; + recv_wr.sg_list = &sge; + recv_wr.num_sge = 1; + sge.length = mlen + hlen; + sge.lkey = tp->mr_rbuf->lkey; + + for (i = 0; i < tp->qpe; i++) { + recv_wr.wr_id = + (uintptr_t)((char *)&tp->rbuf[i] + + sizeof(struct ibv_grh)); + sge.addr = (uintptr_t) &tp->rbuf[i]; + if (ibv_post_recv(tp->qp, &recv_wr, &recv_err)) + goto bail; + } + + /* save qp_num as part of ia_address, network order */ + tp->addr.ib.qpn = htonl(tp->qp->qp_num); return 0; bail: dapl_log(DAPL_DBG_TYPE_ERR, -- 1.5.2.5 From arlin.r.davis at intel.com Wed Sep 9 15:14:44 2009 From: arlin.r.davis at intel.com (Davis, Arlin R) Date: Wed, 9 Sep 2009 15:14:44 -0700 Subject: [ofa-general] [PATCH 3/4] DAPL v2: ucm: For UD type QP's, return CR p_data with CONN_EST event on passive side. Message-ID: Intel MPI uses the p_data provided with CONN_EST as a reference to the UD pair and remote rank. The ucm provider was overwriting the CR p_data with the ACCEPT p_data. Change to save CR p_data but also provide storage for user provided ACCEPT p_data in case the REPLY is lost and needs retransmitted. p_data size was provided to event processing in network order instead of host order. For new QP's create new address handles and do not use existing AH's created for the CM. Different PD's are associated with each. 
Signed-off-by: Arlin Davis --- dapl/openib_common/dapl_ib_dto.h | 2 +- dapl/openib_scm/cm.c | 4 +- dapl/openib_ucm/cm.c | 62 +++++++++++++++++++++++++------------ dapl/openib_ucm/dapl_ib_util.h | 2 + 4 files changed, 47 insertions(+), 23 deletions(-) diff --git a/dapl/openib_common/dapl_ib_dto.h b/dapl/openib_common/dapl_ib_dto.h index e6c03b2..b93565c 100644 --- a/dapl/openib_common/dapl_ib_dto.h +++ b/dapl/openib_common/dapl_ib_dto.h @@ -346,7 +346,7 @@ dapls_ib_post_ext_send ( dapl_dbg_log(DAPL_DBG_TYPE_EP, " post_ext: OP_SEND_UD ah=%p" " qp_num=0x%x\n", - remote_ah, remote_ah->qpn); + remote_ah->ah, remote_ah->qpn); wr.opcode = OP_SEND; wr.wr.ud.ah = remote_ah->ah; diff --git a/dapl/openib_scm/cm.c b/dapl/openib_scm/cm.c index 8560788..2403918 100644 --- a/dapl/openib_scm/cm.c +++ b/dapl/openib_scm/cm.c @@ -795,7 +795,7 @@ ud_bail: (DAPL_EVD *) ep_ptr->param.connect_evd_handle, event, (DAT_EP_HANDLE) ep_ptr, - (DAT_COUNT) cm_ptr->msg.p_size, + (DAT_COUNT) exp, (DAT_PVOID *) cm_ptr->msg.p_data, (DAT_PVOID *) &xevent); @@ -1213,7 +1213,7 @@ void dapli_socket_accept_rtu(dp_ib_cm_handle_t cm_ptr) cm_ptr->ep->param.connect_evd_handle, DAT_IB_UD_CONNECTION_EVENT_ESTABLISHED, (DAT_EP_HANDLE) cm_ptr->ep, - (DAT_COUNT) cm_ptr->msg.p_size, + (DAT_COUNT) ntohs(cm_ptr->msg.p_size), (DAT_PVOID *) cm_ptr->msg.p_data, (DAT_PVOID *) &xevent); diff --git a/dapl/openib_ucm/cm.c b/dapl/openib_ucm/cm.c index aa6bb73..5c5287f 100644 --- a/dapl/openib_ucm/cm.c +++ b/dapl/openib_ucm/cm.c @@ -182,7 +182,7 @@ static int dapl_select(struct dapl_fd_set *set) static void ucm_accept(ib_cm_srvc_handle_t cm, ib_cm_msg_t *msg); static void ucm_connect_rtu(dp_ib_cm_handle_t cm, ib_cm_msg_t *msg); static void ucm_accept_rtu(dp_ib_cm_handle_t cm, ib_cm_msg_t *msg); -static int ucm_send(ib_hca_transport_t *tp, ib_cm_msg_t *msg); +static int ucm_send(ib_hca_transport_t *tp, ib_cm_msg_t *msg, DAT_PVOID p_data, DAT_COUNT p_size); DAT_RETURN dapli_cm_disconnect(dp_ib_cm_handle_t cm); #define 
UCM_SND_BURST 100 @@ -304,7 +304,7 @@ static int ucm_reject(ib_hca_transport_t *tp, ib_cm_msg_t *msg) ntohs(smsg.daddr.ib.lid), ntohl(smsg.dqpn), ntohs(smsg.dport)); - return (ucm_send(tp, &smsg)); + return (ucm_send(tp, &smsg, NULL, 0)); } static void ucm_process_recv(ib_hca_transport_t *tp, @@ -489,7 +489,7 @@ retry: } /* ACTIVE/PASSIVE: build and send CM message out of CM object */ -static int ucm_send(ib_hca_transport_t *tp, ib_cm_msg_t *msg) +static int ucm_send(ib_hca_transport_t *tp, ib_cm_msg_t *msg, DAT_PVOID p_data, DAT_COUNT p_size) { ib_cm_msg_t *smsg = NULL; struct ibv_send_wr wr, *bad_wr; @@ -502,8 +502,10 @@ static int ucm_send(ib_hca_transport_t *tp, ib_cm_msg_t *msg) if ((smsg = ucm_get_smsg(tp)) == NULL) goto bail; - len = ((sizeof(*msg) - DCM_MAX_PDATA_SIZE) + ntohs(msg->p_size)); + len = (sizeof(*msg) - DCM_MAX_PDATA_SIZE); dapl_os_memcpy(smsg, msg, len); + if (p_size) + dapl_os_memcpy(&smsg->p_data, p_data, p_size); wr.next = NULL; wr.sg_list = &sge; @@ -514,13 +516,13 @@ static int ucm_send(ib_hca_transport_t *tp, ib_cm_msg_t *msg) if (len <= tp->max_inline_send) wr.send_flags |= IBV_SEND_INLINE; - sge.length = len; + sge.length = len + p_size; sge.lkey = tp->mr_sbuf->lkey; sge.addr = (uintptr_t)smsg; dapl_dbg_log(DAPL_DBG_TYPE_CM, " ucm_send: op %d ln %d lid %x c_qpn %x rport %d\n", - ntohs(smsg->op), len, htons(smsg->daddr.ib.lid), + ntohs(smsg->op), sge.length, htons(smsg->daddr.ib.lid), htonl(smsg->dqpn), htons(smsg->dport)); /* empty slot, then create AH */ @@ -717,7 +719,7 @@ DAT_RETURN dapli_cm_disconnect(dp_ib_cm_handle_t cm) } else { /* send disc, schedule destroy */ cm->msg.op = htons(DCM_DREQ); - if (ucm_send(&cm->hca->ib_trans, &cm->msg)) { + if (ucm_send(&cm->hca->ib_trans, &cm->msg, NULL, 0)) { dapl_log(DAPL_DBG_TYPE_WARN, " disc_req: ERR-> %s lid %d qpn %d" " r_psp %d \n", strerror(errno), @@ -788,7 +790,8 @@ dapli_cm_connect(DAPL_EP *ep, dp_ib_cm_handle_t cm) dapl_os_unlock(&cm->lock); cm->msg.op = htons(DCM_REQ); - if 
(ucm_send(&cm->hca->ib_trans, &cm->msg)) + if (ucm_send(&cm->hca->ib_trans, &cm->msg, + &cm->msg.p_data, ntohs(cm->msg.p_size))) goto bail; /* first time through, put on work queue */ @@ -910,10 +913,10 @@ static void ucm_connect_rtu(dp_ib_cm_handle_t cm, ib_cm_msg_t *msg) } dapl_os_unlock(&cm->ep->header.lock); - /* Send RTU */ + /* Send RTU, no private data */ cm->msg.op = htons(DCM_RTU); - if (ucm_send(&cm->hca->ib_trans, &cm->msg)) + if (ucm_send(&cm->hca->ib_trans, &cm->msg, NULL, 0)) goto bail; /* init cm_handle and post the event with private data */ @@ -929,11 +932,21 @@ ud_bail: /* post EVENT, modify_qp, AH already created, ucm msg */ xevent.status = 0; xevent.type = DAT_IB_UD_REMOTE_AH; - xevent.remote_ah.ah = cm->hca->ib_trans.ah[lid]; xevent.remote_ah.qpn = ntohl(cm->msg.daddr.ib.qpn); + xevent.remote_ah.ah = dapls_create_ah(cm->hca, + cm->ep->qp_handle->pd, + cm->ep->qp_handle, + htons(lid), + NULL); + if (xevent.remote_ah.ah == NULL) { + event = IB_CME_LOCAL_FAILURE; + goto bail; + } + dapl_os_memcpy(&xevent.remote_ah.ia_addr, &cm->msg.daddr, sizeof(union dcm_addr)); + /* remote ia_addr reference includes ucm qpn, not IB qpn */ ((union dcm_addr*) &xevent.remote_ah.ia_addr)->ib.qpn = cm->msg.dqpn; @@ -1086,10 +1099,6 @@ static void ucm_accept_rtu(dp_ib_cm_handle_t cm, ib_cm_msg_t *msg) cm->state = DCM_CONNECTED; dapl_os_unlock(&cm->lock); - if (msg->p_size) - dapl_os_memcpy(cm->msg.p_data, - msg->p_data, ntohs(msg->p_size)); - /* final data exchange if remote QP state is good to go */ dapl_dbg_log(DAPL_DBG_TYPE_CM, " PASSIVE: connected!\n"); @@ -1101,11 +1110,19 @@ static void ucm_accept_rtu(dp_ib_cm_handle_t cm, ib_cm_msg_t *msg) /* post EVENT, modify_qp, AH already created, ucm msg */ xevent.status = 0; xevent.type = DAT_IB_UD_PASSIVE_REMOTE_AH; - xevent.remote_ah.ah = cm->hca->ib_trans.ah[lid]; xevent.remote_ah.qpn = ntohl(cm->msg.daddr.ib.qpn); + xevent.remote_ah.ah = dapls_create_ah(cm->hca, + cm->ep->qp_handle->pd, + cm->ep->qp_handle, + 
htons(lid), + NULL); + if (xevent.remote_ah.ah == NULL) + goto bail; + dapl_os_memcpy(&xevent.remote_ah.ia_addr, &cm->msg.daddr, sizeof(union dcm_addr)); + /* remote ia_addr reference includes ucm qpn, not IB qpn */ ((union dcm_addr*) &xevent.remote_ah.ia_addr)->ib.qpn = cm->msg.dqpn; @@ -1238,9 +1255,14 @@ dapli_accept_usr(DAPL_EP *ep, DAPL_CR *cr, DAT_COUNT p_size, DAT_PVOID p_data) cm->msg.saddr.ib.port_num = cm->hca->port_num; cm->msg.saddr.ib.lid = cm->hca->ib_trans.addr.ib.lid; cm->msg.saddr.ib.gid = cm->hca->ib_trans.addr.ib.gid; - dapl_os_memcpy(&cm->msg.p_data, p_data, p_size); - - if (ucm_send(&cm->hca->ib_trans, &cm->msg)) + + /* + * UD: deliver p_data with REQ and EST event, keep REQ p_data in + * cm->msg.p_data and save REPLY accept data in cm->p_data for retries + */ + cm->p_size = p_size; + dapl_os_memcpy(&cm->p_data, p_data, p_size); + if (ucm_send(&cm->hca->ib_trans, &cm->msg, p_data, p_size)) goto bail; /* save state and setup valid reference to EP, HCA */ @@ -1565,7 +1587,7 @@ dapls_ib_reject_connection(IN dp_ib_cm_handle_t cm, if (psize) dapl_os_memcpy(&cm->msg.p_data, pdata, psize); - if (ucm_send(&cm->hca->ib_trans, &cm->msg)) { + if (ucm_send(&cm->hca->ib_trans, &cm->msg, NULL, 0)) { dapl_log(DAPL_DBG_TYPE_WARN, " cm_reject: ERR: %s\n", strerror(errno)); return DAT_INTERNAL_ERROR; diff --git a/dapl/openib_ucm/dapl_ib_util.h b/dapl/openib_ucm/dapl_ib_util.h index dfee2b9..ef5358a 100644 --- a/dapl/openib_ucm/dapl_ib_util.h +++ b/dapl/openib_ucm/dapl_ib_util.h @@ -45,6 +45,8 @@ struct ib_cm_handle struct dapl_hca *hca; struct dapl_sp *sp; struct dapl_ep *ep; + uint16_t p_size; /* accept p_data, for retries */ + uint8_t p_data[DCM_MAX_PDATA_SIZE]; ib_cm_msg_t msg; }; -- 1.5.2.5 From arlin.r.davis at intel.com Wed Sep 9 15:14:46 2009 From: arlin.r.davis at intel.com (Davis, Arlin R) Date: Wed, 9 Sep 2009 15:14:46 -0700 Subject: [ofa-general] [PATCH 4/4] DAPL v2: ucm: tighten up locking with CM processing, state changes Message-ID: tighten up 
locking on CM processing and state changes and reduce the send completion threshold to 50 from 100 to replenish the request message faster. Signed-off-by: Arlin Davis --- dapl/openib_ucm/cm.c | 24 ++++++++++++++++++------ 1 files changed, 18 insertions(+), 6 deletions(-) diff --git a/dapl/openib_ucm/cm.c b/dapl/openib_ucm/cm.c index 5c5287f..e76e920 100644 --- a/dapl/openib_ucm/cm.c +++ b/dapl/openib_ucm/cm.c @@ -185,7 +185,7 @@ static void ucm_accept_rtu(dp_ib_cm_handle_t cm, ib_cm_msg_t *msg); static int ucm_send(ib_hca_transport_t *tp, ib_cm_msg_t *msg, DAT_PVOID p_data, DAT_COUNT p_size); DAT_RETURN dapli_cm_disconnect(dp_ib_cm_handle_t cm); -#define UCM_SND_BURST 100 +#define UCM_SND_BURST 50 /* Service ids - port space */ static uint16_t ucm_get_port(ib_hca_transport_t *tp, uint16_t port) @@ -916,11 +916,14 @@ static void ucm_connect_rtu(dp_ib_cm_handle_t cm, ib_cm_msg_t *msg) /* Send RTU, no private data */ cm->msg.op = htons(DCM_RTU); + dapl_os_lock(&cm->lock); + cm->state = DCM_CONNECTED; + dapl_os_unlock(&cm->lock); + if (ucm_send(&cm->hca->ib_trans, &cm->msg, NULL, 0)) goto bail; /* init cm_handle and post the event with private data */ - cm->state = DCM_CONNECTED; dapl_dbg_log(DAPL_DBG_TYPE_EP, " ACTIVE: connected!\n"); #ifdef DAT_EXTENSIONS @@ -986,7 +989,10 @@ ud_bail: (DAT_PVOID *)&xevent); /* we are done, don't destroy cm_ptr, need pdata */ + dapl_os_lock(&cm->lock); cm->state = DCM_RELEASED; + dapl_os_unlock(&cm->lock); + } else #endif { @@ -1157,7 +1163,9 @@ static void ucm_accept_rtu(dp_ib_cm_handle_t cm, ib_cm_msg_t *msg) (DAT_PVOID *)&xevent); /* done with CM object, don't destroy cm, need pdata */ + dapl_os_lock(&cm->lock); cm->state = DCM_RELEASED; + dapl_os_unlock(&cm->lock); } else { #endif cm->ep->cm_handle = cm; /* only RC, multi CR's on UD */ @@ -1262,8 +1270,6 @@ dapli_accept_usr(DAPL_EP *ep, DAPL_CR *cr, DAT_COUNT p_size, DAT_PVOID p_data) */ cm->p_size = p_size; dapl_os_memcpy(&cm->p_data, p_data, p_size); - if 
(ucm_send(&cm->hca->ib_trans, &cm->msg, p_data, p_size)) - goto bail; /* save state and setup valid reference to EP, HCA */ dapl_os_lock(&cm->lock); @@ -1272,6 +1278,9 @@ dapli_accept_usr(DAPL_EP *ep, DAPL_CR *cr, DAT_COUNT p_size, DAT_PVOID p_data) cm->state = DCM_ACCEPTED; dapl_os_unlock(&cm->lock); + if (ucm_send(&cm->hca->ib_trans, &cm->msg, p_data, p_size)) + goto bail; + dapl_dbg_log(DAPL_DBG_TYPE_CM, " PASSIVE: accepted!\n"); return DAT_SUCCESS; @@ -1587,14 +1596,17 @@ dapls_ib_reject_connection(IN dp_ib_cm_handle_t cm, if (psize) dapl_os_memcpy(&cm->msg.p_data, pdata, psize); + /* cr_thread will destroy CR */ + dapl_os_lock(&cm->lock); + cm->state = DCM_REJECTING; + dapl_os_unlock(&cm->lock); + if (ucm_send(&cm->hca->ib_trans, &cm->msg, NULL, 0)) { dapl_log(DAPL_DBG_TYPE_WARN, " cm_reject: ERR: %s\n", strerror(errno)); return DAT_INTERNAL_ERROR; } - /* cr_thread will destroy CR */ - cm->state = DCM_REJECTING; send(cm->hca->ib_trans.scm[1], "w", sizeof "w", 0); return DAT_SUCCESS; } -- 1.5.2.5 From Barry.Mavin at recital.com Wed Sep 9 17:02:24 2009 From: Barry.Mavin at recital.com (Barry Mavin) Date: Thu, 10 Sep 2009 05:32:24 +0530 Subject: [ofa-general] [PATCH 4/4] DAPL v2: ucm: tighten up locking with CM processing, state changes In-Reply-To: Message-ID: Hi Not sure if this is the correct place to ask this question. We have a cluster site with OFED 1.4.1 installed using mellanox cards and switches which we installed from source on all boxes. The cluster is behaving itself but we have had some throughput problems with ib in our applications. We have tried to use NFS / RDMA but cannot get it to work "out-of-the-box". The kernel modules all load fine but any attempt to mount from the client hangs. On some occasions it even causes the servers to reboot. OS is redhat 5.3. Mellanox fw is all at 2.6 Has anyone got NFS / RDMA working with redhat 5.3?
--- Regards Barry Mavin Recital Corporation
From landman at scalableinformatics.com Wed Sep 9 17:16:34 2009 From: landman at scalableinformatics.com (Joe Landman) Date: Wed, 09 Sep 2009 20:16:34 -0400 Subject: [ofa-general] [PATCH 4/4] DAPL v2: ucm: tighten up locking
with CM processing, state changes In-Reply-To: References: Message-ID: <4AA84562.8030604@scalableinformatics.com> Barry Mavin wrote: > OS is redhat 5.3. > Mellanox fw is all at 2.6 > > Has anyone got NFS / RDMA working with redhat 5.3? Yes, but you have to update nfs-tools, and change your mount to use the 'insecure' option. Grab http://www.kernel.org/pub/linux/utils/nfs/nfs-utils-1.1.6.tar.gz and build it (1.2.0 may work, we haven't played with it). -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics, Inc. email: landman at scalableinformatics.com web : http://scalableinformatics.com http://scalableinformatics.com/jackrabbit phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615
From Barry.Mavin at recital.com Wed Sep 9 17:33:46 2009 From: Barry.Mavin at recital.com (Barry Mavin) Date: Thu, 10 Sep 2009 06:03:46 +0530 Subject: [ofa-general] [PATCH 4/4] DAPL v2: ucm: tighten up locking with CM processing, state changes In-Reply-To: <4AA84562.8030604@scalableinformatics.com> Message-ID: I have specified insecure on the server side. Is that required on the client mount command also? Looking at the logs, the mount is successful but an I/O error happens afterwards. OFED 1.4.1 installed nfs-utils 1.1.6. Do they need reinstalled again? I did in fact try to build 1.2 and configure fails against OFED 1.4.1 on RH 5.3 --- Regards Barry Mavin Recital Corporation Chairman and CEO Website: http://www.recital.com MSN Messenger: Barry_Mavin at msn.com Skype: BarryMavin Direct line worldwide: +1 9785224139
From Barry.Mavin at recital.com Wed Sep 9 22:20:39 2009 From: Barry.Mavin at recital.com (Barry Mavin) Date: Thu, 10 Sep 2009 10:50:39 +0530 Subject: [ofa-general] [PATCH 4/4] DAPL v2: ucm: tighten up locking with CM processing, state changes In-Reply-To: Message-ID: I am still having problems mounting an NFS share with RDMA. [root at senkas1 nfs-utils-1.1.6]# /sbin/mount.nfs 10.10.10.3:/home/nfs /mnt/nfs -o rdma,port=20049 mount.nfs: Unsupported nfs mount option: rdma This is after updating to 1.1.6 nfs-utils. --- Regards Barry Mavin Recital Corporation Chairman and CEO Website: http://www.recital.com MSN Messenger: Barry_Mavin at msn.com Skype: BarryMavin Direct line worldwide: +1 9785224139
From keshetti.mahesh at gmail.com Thu Sep 10 02:45:52 2009 From: keshetti.mahesh at gmail.com (Keshetti Mahesh) Date: Thu, 10 Sep 2009 15:15:52 +0530 Subject: [ofa-general] [PATCH] Added 'ibcheckspeed', Fix 'ibcheckportwidth' Message-ID: <829ded920909100245r1dfabe03k57b2088843c0a136@mail.gmail.com> - Added 'ibcheckspeed': Similar to 'ibcheckwidth' in operation and usage. Reports error/warning messages if the LinkSpeedActive is 2.5 Gbps when the LinkSpeedSupported is more than 2.5 Gbps. - Fix 'ibcheckportwidth': Exit if the maximum LinkWidthSupported is 1X instead of processing next line of output.
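The 'ibcheckportwidth' fix described above comes down to how awk treats "next" versus "exit". The following is only a minimal sketch of that difference, not the script itself: the sample port records and the names check_old and check_new are made up for illustration, and the records are assumed to be already reduced to the "Name#value" form that the scripts feed to awk.

```shell
#!/bin/sh
# Hypothetical sample records in "Name#value" form (not real smpquery output).
sample_1x='LinkWidthSupported#1X
LinkWidthActive#1X'
sample_4x='LinkWidthSupported#1X or 4X
LinkWidthActive#1X'

# Old guard: "next" only skips the current record, so the
# LinkWidthActive record is still checked on a 1X-only port.
check_old() {
    printf '%s\n' "$1" | awk -F '#' '
    /^LinkWidthSupported/ { if ($2 != "1X") { next } }
    /^LinkWidthActive/    { if ($2 == "1X") warn = 1 }
    END { print (warn ? "warn" : "ok") }'
}

# New guard: "exit" stops reading input and jumps straight to END,
# so a 1X-only port never reaches the LinkWidthActive check.
check_new() {
    printf '%s\n' "$1" | awk -F '#' '
    /^LinkWidthSupported/ { if ($2 == "1X") { exit } }
    /^LinkWidthActive/    { if ($2 == "1X") warn = 1 }
    END { print (warn ? "warn" : "ok") }'
}

check_old "$sample_1x"   # prints "warn": a 1X-only port is flagged anyway
check_new "$sample_1x"   # prints "ok"
check_new "$sample_4x"   # prints "warn": 4X-capable link running at 1X
```

Note that "exit" inside a pattern action still runs the END rule, which is why the new guard falls through to the normal exit status handling with an empty warning.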
Signed-off-by: Keshetti Mahesh --- infiniband-diags/scripts/ibcheckportspeed.in | 146 ++++++++++++++++++++++++++ infiniband-diags/scripts/ibcheckportwidth.in | 2 +- infiniband-diags/scripts/ibcheckspeed.in | 135 ++++++++++++++++++++++++ 3 files changed, 282 insertions(+), 1 deletions(-) create mode 100644 infiniband-diags/scripts/ibcheckportspeed.in create mode 100644 infiniband-diags/scripts/ibcheckspeed.in diff --git a/infiniband-diags/scripts/ibcheckportspeed.in b/infiniband-diags/scripts/ibcheckportspeed.in new file mode 100644 index 0000000..538a7a7 --- /dev/null +++ b/infiniband-diags/scripts/ibcheckportspeed.in @@ -0,0 +1,146 @@ +#!/bin/sh + +IBPATH=${IBPATH:- at IBSCRIPTPATH@} + +function usage() { + echo Usage: `basename $0` "[-h] [-v] [-N | -nocolor] [-G]" \ + "[-C ca_name] [-P ca_port] [-t(imeout) timeout_ms] " + exit -1 +} + +function green() { + if [ "$bw" = "yes" ]; then + if [ "$verbose" = "yes" ]; then + echo $1 + fi + return + fi + if [ "$verbose" = "yes" ]; then + echo -e "\\033[1;032m" $1 "\\033[0;39m" + fi +} + +function red() { + if [ "$bw" = "yes" ]; then + echo $1 + return + fi + echo -e "\\033[1;031m" $1 "\\033[0;39m" +} + +function blue() +{ + if [ "$bw" = "yes" ]; then + echo $1 + return + fi + echo -e "\033[1;034m" $1 "\033[0;39m" +} + +guid_addr="" +bw="" +verbose="" +ca_info="" + +while [ "$1" ]; do + case $1 in + -G) + guid_addr=yes + ;; + -nocolor|-N) + bw=yes + ;; + -v) + verbose=yes + ;; + -P | -C | -t | -timeout) + case $2 in + -*) + usage + ;; + esac + if [ x$2 = x ] ; then + usage + fi + ca_info="$ca_info $1 $2" + shift + ;; + -*) + usage + ;; + *) + break + ;; + esac + shift +done + +if [ $# -lt 2 ] +then + usage +fi + +portnum=$2 + +if [ "$guid_addr" ] +then + if ! lid=`$IBPATH/ibaddr $ca_info -G -L $1 | awk '/failed/{exit -1} {print $3}'` + then + echo -n "guid $1 address resolution: " + red "FAILED" + exit -1 + fi + guid=$1 +else + lid=$1 + if ! 
temp=`$IBPATH/ibaddr $ca_info -L $1 | awk '/failed/{exit -1} {print $3}'` + then + echo -n "lid $1 address resolution: " + red "FAILED" + exit -1 + fi +fi + +text="`eval $IBPATH/smpquery $ca_info portinfo $lid $portnum`" +rv=$? +if echo "$text" | sed 's/[0-9]/#&/;s/ Gbps//g' | awk -v mono=$bw -F '#' ' +function blue(s) +{ + if (mono) + printf s + else if (!quiet) { + printf "\033[1;034m" s + printf "\033[0;39m" + } +} + +# Only check LinkSpeedActive if LinkSpeedSupported is not 2.5 Gbps +/^LinkSpeedSupported/{ if ($2 == "2.5") { exit } } +/^LinkSpeedActive/{ if ($2 == "2.5") warn = warn "#warn: Link configured as 2.5 Gbps lid '$lid' port '$portnum'\n"} + +/^ib/ {print $0; next} +/ibpanic:/ {print $0} +/ibwarn:/ {print $0} +/iberror:/ {print $0} + +END { + if (err != "") { + blue(err) + exit -1 + } + if (warn != "") { + blue(warn) + exit -1 + } + exit 0 +}' 2>&1 && test $rv -eq 0 ; then + if [ "$verbose" = "yes" ]; then + echo -n "Port check lid $lid port $portnum: " + green "OK" + fi + exit 0 +else + echo -n "Port check lid $lid port $portnum: " + red "FAILED" + exit -1 +fi diff --git a/infiniband-diags/scripts/ibcheckportwidth.in b/infiniband-diags/scripts/ibcheckportwidth.in index 32c5c5e..60a0892 100644 --- a/infiniband-diags/scripts/ibcheckportwidth.in +++ b/infiniband-diags/scripts/ibcheckportwidth.in @@ -103,7 +103,7 @@ function blue(s) } # Only check LinkWidthActive if LinkWidthSupported is not 1X -/^LinkWidthSupported/{ if ($2 != "1X") { next } } +/^LinkWidthSupported/{ if ($2 == "1X") { exit } } /^LinkWidthActive/{ if ($2 == "1X") warn = warn "#warn: Link configured as 1X lid '$lid' port '$portnum'\n"} /^ib/ {print $0; next} diff --git a/infiniband-diags/scripts/ibcheckspeed.in b/infiniband-diags/scripts/ibcheckspeed.in new file mode 100644 index 0000000..25c2201 --- /dev/null +++ b/infiniband-diags/scripts/ibcheckspeed.in @@ -0,0 +1,135 @@ +#!/bin/sh + +IBPATH=${IBPATH:- at IBSCRIPTPATH@} + +function usage() { + echo Usage: `basename $0` "[-h] [-v] [-N | 
-nocolor]" \ + "[ \| -C ca_name -P ca_port -t(imeout) timeout_ms]" + exit -1 +} + +function user_abort() { + echo "Aborted" + exit 1 +} + +trap user_abort SIGINT + +gflags="" +verbose="" +v=0 +ntype="" +nodeguid="" +oldlid="" +topofile="" +ca_info="" + +while [ "$1" ]; do + case $1 in + -h) + usage + ;; + -N|-nocolor) + gflags=-N + ;; + -v) + verbose="-v" + v=1 + ;; + -P | -C | -t | -timeout) + case $2 in + -*) + usage + ;; + esac + if [ x$2 = x ] ; then + usage + fi + ca_info="$ca_info $1 $2" + shift + ;; + -*) + usage + ;; + *) + if [ "$topofile" ]; then + usage + fi + topofile="$1" + ;; + esac + shift +done + +if [ "$topofile" ]; then + netcmd="cat $topofile" +else + netcmd="$IBPATH/ibnetdiscover $ca_info" +fi + +text="`eval $netcmd`" +rv=$? +echo "$text" | awk ' +BEGIN { + ne=0 + pe=0 +} +function check_node(lid) +{ + nodechecked=1 + if (system("'$IBPATH'/ibchecknode'"$ca_info"' '$gflags' '$verbose' " lid)) { + ne++ + badnode=1 + return + } +} + +/^Ca/ || /^Switch/ || /^Rt/ { + nnodes++ + ntype=$1; nodeguid=substr($3, 4, 16); ports=$2 + if ('$v') + print "\n# Checking " ntype ": nodeguid 0x" nodeguid + + nodechecked=0 + badnode=0 + if (ntype != "Switch") + next + + lid = substr($0, index($0, "port 0 lid ") + 11) + lid = substr(lid, 1, index(lid, " ") - 1) + check_node(lid) + } +/^\[/ { + nports++ + port = $1 + if (!nodechecked) { + lid = substr($0, index($0, " lid ") + 5) + lid = substr(lid, 1, index(lid, " ") - 1) + check_node(lid) + } + if (badnode) { + print "\n# " ntype ": nodeguid 0x" nodeguid " failed" + next + } + sub("\\(.*\\)", "", port) + gsub("[\\[\\]]", "", port) + if (system("'$IBPATH'/ibcheckportspeed'"$ca_info"' '$gflags' '$verbose' " lid " " port)) { + if (!'$v' && oldlid != lid) { + print "# Checked " ntype ": nodeguid 0x" nodeguid " with failure" + oldlid = lid + } + pe++; + } +} + +/^ib/ {print $0; next} +/ibpanic:/ {print $0} +/ibwarn:/ {print $0} +/iberror:/ {print $0} + +END { + printf "\n## Summary: %d nodes checked, %d bad nodes 
found\n", nnodes, ne + printf "## %d ports checked, %d ports with 2.5 Gbps speed in error found\n", nports, pe +} +' +exit $rv -- 1.6.4.2 From vlad at lists.openfabrics.org Thu Sep 10 03:09:59 2009 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Thu, 10 Sep 2009 03:09:59 -0700 (PDT) Subject: [ofa-general] ofa_1_5_kernel 20090910-0200 daily build status Message-ID: <20090910100959.96C93E3004F@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git git_branch: ofed_kernel_1_5 Common build parameters: Passed: Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.26 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.27 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18-128.el5 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.27 Passed on x86_64 with linux-2.6.9-78.ELsmp Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.25 Passed on ia64 with linux-2.6.26 Passed on ppc64 with linux-2.6.19 Passed on ppc64 with linux-2.6.18 Failed: From Barry.Mavin at recital.com Thu Sep 10 03:52:17 2009 From: Barry.Mavin at recital.com (Barry Mavin) Date: Thu, 10 Sep 2009 16:22:17 +0530 Subject: [ofa-general] [PATCH 4/4] DAPL v2: ucm: tighten up locking with CM processing, state changes In-Reply-To: Message-ID: When 
trying to mount an nfs share with nfs rdma, what is the mount command that should be used? --- Regards Barry Mavin Recital Corporation Chairman and CEO Website: http://www.recital.com MSN Messenger: Barry_Mavin at msn.com Skype: BarryMavin Direct line worldwide: +1 9785224139
From dorons at voltaire.com Thu Sep 10 04:56:46 2009 From: dorons at voltaire.com (Doron Shoham) Date: Thu, 10 Sep 2009 14:56:46 +0300 Subject: [ofa-general] [PATCH] infiniband-diags/scripts: Add ibcheckroutes to scripts Message-ID: <4AA8E97E.1090109@voltaire.com> ibcheckroutes validates route between all hosts in the fabric. This script finds all leaf switches (switches that are connected to HCAs) and runs ibtracert between them. When using various routing algorithms (e.g. up-down), if fabric topology is not suitable there will be no routes between some nodes. It reports when the route exists between source and destination LIDs.
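The pairwise check described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the submitted script: check_routes and fake_trace are hypothetical names, the LIDs are made up, and the sketch assumes every ordered (source, destination) pair is traced, since a route in one direction does not imply one in the other. fake_trace stands in for ibtracert so the sketch runs without a fabric.

```shell
#!/bin/sh
# Stand-in for ibtracert (hypothetical): pretends there is
# no route from LID 3 to LID 7 and that all other routes exist.
fake_trace() {
    [ "$1" = 3 ] && [ "$2" = 7 ] && return 1
    return 0
}

# check_routes TRACER LID...
# Runs TRACER for every ordered (source, destination) pair of LIDs,
# prints the pairs that fail, and returns non-zero if any pair failed.
check_routes() {
    tracer=$1; shift
    st=0
    for s in "$@"; do
        for d in "$@"; do
            # A LID is not traced against itself.
            [ "$s" = "$d" ] && continue
            if ! "$tracer" "$s" "$d" >/dev/null 2>&1; then
                echo "$s-->$d FAILED"
                st=1
            fi
        done
    done
    return $st
}
```

Calling `check_routes fake_trace 3 7 9` reports only the 3-->7 pair. On a real fabric the tracer argument would presumably be ibtracert, with the LID list taken from ibnetdiscover output as the patch does.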
Signed-off-by: Doron Shoham --- infiniband-diags/Makefile.am | 4 +- infiniband-diags/configure.in | 1 + infiniband-diags/man/ibcheckroutes.8 | 39 +++++++++++ infiniband-diags/scripts/ibcheckroutes.in | 101 +++++++++++++++++++++++++++++ 4 files changed, 143 insertions(+), 2 deletions(-) create mode 100644 infiniband-diags/man/ibcheckroutes.8 create mode 100755 infiniband-diags/scripts/ibcheckroutes.in diff --git a/infiniband-diags/Makefile.am b/infiniband-diags/Makefile.am index 1cdb60e..57363c4 100644 --- a/infiniband-diags/Makefile.am +++ b/infiniband-diags/Makefile.am @@ -33,7 +33,7 @@ sbin_SCRIPTS = scripts/ibcheckerrs scripts/ibchecknet scripts/ibchecknode \ scripts/iblinkinfo.pl scripts/ibprintswitch.pl \ scripts/ibprintca.pl scripts/ibprintrt.pl \ scripts/ibfindnodesusing.pl scripts/ibidsverify.pl \ - scripts/check_lft_balance.pl + scripts/check_lft_balance.pl scripts/ibcheckroutes noinst_LIBRARIES = libcommon.a @@ -76,7 +76,7 @@ man_MANS = man/ibaddr.8 man/ibcheckerrors.8 man/ibcheckerrs.8 \ man/ibprintswitch.8 man/ibprintca.8 man/ibfindnodesusing.8 \ man/ibdatacounts.8 man/ibdatacounters.8 \ man/ibrouters.8 man/ibprintrt.8 man/ibidsverify.8 \ - man/check_lft_balance.8 + man/check_lft_balance.8 man/ibcheckroutes.8 BUILT_SOURCES = ibdiag_version ibdiag_version: diff --git a/infiniband-diags/configure.in b/infiniband-diags/configure.in index 3ef35cc..aa178c5 100644 --- a/infiniband-diags/configure.in +++ b/infiniband-diags/configure.in @@ -158,6 +158,7 @@ AC_CONFIG_FILES([\ scripts/ibcheckportwidth \ scripts/ibcheckstate \ scripts/ibcheckwidth \ + scripts/ibcheckroutes \ scripts/ibclearcounters \ scripts/ibclearerrors \ scripts/ibdatacounts \ diff --git a/infiniband-diags/man/ibcheckroutes.8 b/infiniband-diags/man/ibcheckroutes.8 new file mode 100644 index 0000000..a6a073f --- /dev/null +++ b/infiniband-diags/man/ibcheckroutes.8 @@ -0,0 +1,39 @@ +.TH IBCHECKPORT 8 "September 10, 2009" "OpenIB" "OpenIB Diagnostics" + +.SH NAME +ibcheckroutes \- validates routes 
between all hosts in fabric + +.SH SYNOPSIS +.B ibcheckroutes +[\-h] [\-N] [\-b] [\-e] [\-C ca_name] [\-P ca_port] [\-t(imeout) timeout_ms] + +.SH DESCRIPTION +.PP +ibcheckroutes is a script which uses a full topology file that was created by ibnetdiscover, +scans the network to validate routes between all hosts in the fabric. + +.SH OPTIONS +.PP +\-h Show help. +.PP +\-N Use mono rather than color mode. +.PP +\-b Suppress output. +.PP +\-e Show errors only. +.PP +\-C Use the specified ca_name. +.PP +\-P Use the specified ca_port. +.PP +\-t Override the default timeout for the solicited mads. + +.SH SEE ALSO +.BR ibnetdiscover(8), +.BR ibtracert(8), +.BR ibroute(8) + +.SH AUTHOR +.TP +Doron Shoham +.RI < dorons at voltaire.com > diff --git a/infiniband-diags/scripts/ibcheckroutes.in b/infiniband-diags/scripts/ibcheckroutes.in new file mode 100755 index 0000000..eb3ad30 --- /dev/null +++ b/infiniband-diags/scripts/ibcheckroutes.in @@ -0,0 +1,101 @@ +#!/bin/sh + +IBPATH=${IBPATH:- at IBSCRIPTPATH@} + +function usage() { + echo Usage: `basename $0` "[-h] [-N] [-b] [-e] [-C ca_name] [-P ca_port] [-t(imeout) timeout_ms]" + echo -e " validate routes between all hosts in fabric" + echo -e " -h - Show help" + echo -e " -N - Use mono rather than color mode" + echo -e " -b - Suppress output" + echo -e " -e - Show errors only" + echo -e " -C - Use the specified ca_name" + echo -e " -P - Use the specified ca_port" + echo -e " -t - Override the default timeout for the solicited" + exit -1 +} + +function user_abort() { + echo "Aborted" + exit 1 +} + +function green() { + if [ "$bw" = "yes" ]; then + printf "${res_col}[OK]\n" $1 + return + fi + printf "\033[1;032m${res_col}[OK]\033[0;39m\n" $1 +} + +function red() { + if [ "$bw" = "yes" ]; then + printf "${res_col}[FAILED]\n" "$1" + return + fi + printf "\033[31m${res_col}[FAILED]\033[0m\n" "$1" +} + +trap user_abort SIGINT SIGTERM + +bw="" +brief=0 +error=0 +ca_info="" +st=0 +topofile=/tmp/net +res_col="%-20.20s" + +function 
get_opts() { + while getopts P:C:t:beNh o; do + case "$o" in + h) + usage + ;; + N) + bw="yes" + ;; + b) + brief=1 + ;; + e) + error=1 + ;; + P | C | t | timeout) + ca_info="$ca_info -$o $OPTARG" + ;; + *) + usage + ;; + esac + done +} + +get_opts $* + +$IBPATH/ibnetdiscover $ca_info > $topofile + +# find all leaf switches LIDs +LIDS=($(awk '/# lid /{a[$(NF-1)]=$(NF-1)} END{for(v in a) print v}' $topofile)) +N=${#LIDS[@]} + +if [ $N -lt 2 ]; then + echo "Fabric contains only one switch" + exit 0 +fi + +# check routes between all switches in fabric +[ $brief -eq 0 ] && echo -e "Checking route between:\nSource lid --> Destination lid" +for((s=0; s /dev/null + if [ $? -eq 0 ]; then + [ $brief -eq 0 ] && [ $error -eq 0 ] && green "${LIDS[$s]}-->${LIDS[$d]}" + else + [ $brief -eq 0 ] && red "${LIDS[$s]}-->${LIDS[$d]}" + st=1 + fi + done +done + +exit $st -- 1.5.4 From hal.rosenstock at gmail.com Thu Sep 10 06:02:20 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Thu, 10 Sep 2009 09:02:20 -0400 Subject: [ofa-general] [PATCH] infiniband-diags/scripts: Add ibcheckroutes to scripts In-Reply-To: <4AA8E97E.1090109@voltaire.com> References: <4AA8E97E.1090109@voltaire.com> Message-ID: On Thu, Sep 10, 2009 at 7:56 AM, Doron Shoham wrote: > ibcheckroutes validates route between all hosts in the fabric. > This script finds all leaf switches (switches that are connected to HCAs) > CAs or HCAs ? What about switch port 0s ? > and runs ibtracert between them. > When using various routing algorithms (e.g. up-down), > With which routing algorithms has this been tried ? -- Hal > if fabric topology is not suitable there will be no > routes between some nodes. > It reports when the route exists between source and destination LIDs. > > Signed-off-by: Doron Shoham > -------------- next part -------------- An HTML attachment was scrubbed... 
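The pairwise check described in the patch (collect the leaf-switch LIDs, then run ibtracert between them) can be sketched in portable shell as follows. This is a minimal sketch under stated assumptions, not the patch's own code: the archived copy of the loop is garbled, so whether it tested each pair once or in both directions cannot be recovered, and the sketch tests each unordered pair once. The TRACER variable and the check_routes function name are hypothetical, introduced only so the loop can be exercised without a fabric. On a real system TRACER would be ibtracert, which takes a source and a destination LID.

```shell
#!/bin/sh
# Sketch only: pairwise route check in the spirit of ibcheckroutes.
# TRACER is a hypothetical override; it defaults to ibtracert.
TRACER=${TRACER:-ibtracert}

# check_routes LID...
# Runs "$TRACER src dst" once for every unordered pair of the given LIDs,
# prints one result line per pair, and returns 1 if any pair failed.
check_routes() {
    st=0
    while [ $# -gt 1 ]; do
        src=$1
        shift                      # remaining arguments are the destinations
        for dst in "$@"; do
            if "$TRACER" "$src" "$dst" > /dev/null 2>&1; then
                echo "$src-->$dst OK"
            else
                echo "$src-->$dst FAILED"
                st=1
            fi
        done
    done
    return $st
}
```

For example, `check_routes 3 7 12` runs the tracer for 3-->7, 3-->12 and 7-->12 and returns non-zero if any of them fails. The sketch avoids the `function` keyword and arrays, which bash accepts but POSIX sh does not define, since the script in the patch starts with #!/bin/sh.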
URL: From keshetti.mahesh at gmail.com Thu Sep 10 06:02:08 2009 From: keshetti.mahesh at gmail.com (Keshetti Mahesh) Date: Thu, 10 Sep 2009 18:32:08 +0530 Subject: [ofa-general] [PATCH v2] infiniband-diags/scripts: Add 'ibcheckspeed' and 'ibcheckportspeed' to scripts Message-ID: <829ded920909100602h78614ac0jd4eb1ee8d7a3779b@mail.gmail.com> Added 'ibcheckspeed' and 'ibcheckportspeed': Similar to 'ibcheckwidth/ibcheckportwidth' in functionality and implementation. Reports error/warning messages if the LinkSpeedActive is configured as 2.5 Gbps when the LinkSpeedSupported is more than 2.5 Gbps. Signed-off-by: Keshetti Mahesh < keshetti.mahesh at gmail.com> --- infiniband-diags/scripts/ibcheckportspeed.in | 146 ++++++++++++++++++++++++++ infiniband-diags/scripts/ibcheckportwidth.in | 2 +- infiniband-diags/scripts/ibcheckspeed.in | 135 ++++++++++++++++++++++++ 3 files changed, 282 insertions(+), 1 deletions(-) create mode 100644 infiniband-diags/scripts/ibcheckportspeed.in create mode 100644 infiniband-diags/scripts/ibcheckspeed.in diff --git a/infiniband-diags/scripts/ibcheckportspeed.in b/infiniband-diags/scripts/ibcheckportspeed.in new file mode 100644 index 0000000..538a7a7 --- /dev/null +++ b/infiniband-diags/scripts/ibcheckportspeed.in @@ -0,0 +1,146 @@ +#!/bin/sh + +IBPATH=${IBPATH:- at IBSCRIPTPATH@} + +function usage() { + echo Usage: `basename $0` "[-h] [-v] [-N | -nocolor] [-G]" \ + "[-C ca_name] [-P ca_port] [-t(imeout) timeout_ms] " + exit -1 +} + +function green() { + if [ "$bw" = "yes" ]; then + if [ "$verbose" = "yes" ]; then + echo $1 + fi + return + fi + if [ "$verbose" = "yes" ]; then + echo -e "\\033[1;032m" $1 "\\033[0;39m" + fi +} + +function red() { + if [ "$bw" = "yes" ]; then + echo $1 + return + fi + echo -e "\\033[1;031m" $1 "\\033[0;39m" +} + +function blue() +{ + if [ "$bw" = "yes" ]; then + echo $1 + return + fi + echo -e "\033[1;034m" $1 "\033[0;39m" +} + +guid_addr="" +bw="" +verbose="" +ca_info="" + +while [ "$1" ]; do + case $1 in + -G) 
+ guid_addr=yes + ;; + -nocolor|-N) + bw=yes + ;; + -v) + verbose=yes + ;; + -P | -C | -t | -timeout) + case $2 in + -*) + usage + ;; + esac + if [ x$2 = x ] ; then + usage + fi + ca_info="$ca_info $1 $2" + shift + ;; + -*) + usage + ;; + *) + break + ;; + esac + shift +done + +if [ $# -lt 2 ] +then + usage +fi + +portnum=$2 + +if [ "$guid_addr" ] +then + if ! lid=`$IBPATH/ibaddr $ca_info -G -L $1 | awk '/failed/{exit -1} {print $3}'` + then + echo -n "guid $1 address resolution: " + red "FAILED" + exit -1 + fi + guid=$1 +else + lid=$1 + if ! temp=`$IBPATH/ibaddr $ca_info -L $1 | awk '/failed/{exit -1} {print $3}'` + then + echo -n "lid $1 address resolution: " + red "FAILED" + exit -1 + fi +fi + +text="`eval $IBPATH/smpquery $ca_info portinfo $lid $portnum`" +rv=$? +if echo "$text" | sed 's/[0-9]/#&/;s/ Gbps//g' | awk -v mono=$bw -F '#' ' +function blue(s) +{ + if (mono) + printf s + else if (!quiet) { + printf "\033[1;034m" s + printf "\033[0;39m" + } +} + +# Only check LinkSpeedActive if LinkSpeedSupported is not 2.5 Gbps +/^LinkSpeedSupported/{ if ($2 == "2.5") { exit } } +/^LinkSpeedActive/{ if ($2 == "2.5") warn = warn "#warn: Link configured as 2.5 Gbps lid '$lid' port '$portnum'\n"} + +/^ib/ {print $0; next} +/ibpanic:/ {print $0} +/ibwarn:/ {print $0} +/iberror:/ {print $0} + +END { + if (err != "") { + blue(err) + exit -1 + } + if (warn != "") { + blue(warn) + exit -1 + } + exit 0 +}' 2>&1 && test $rv -eq 0 ; then + if [ "$verbose" = "yes" ]; then + echo -n "Port check lid $lid port $portnum: " + green "OK" + fi + exit 0 +else + echo -n "Port check lid $lid port $portnum: " + red "FAILED" + exit -1 +fi diff --git a/infiniband-diags/scripts/ibcheckportwidth.in b/infiniband-diags/scripts/ibcheckportwidth.in index 32c5c5e..60a0892 100644 --- a/infiniband-diags/scripts/ibcheckportwidth.in +++ b/infiniband-diags/scripts/ibcheckportwidth.in @@ -103,7 +103,7 @@ function blue(s) } # Only check LinkWidthActive if LinkWidthSupported is not 1X 
-/^LinkWidthSupported/{ if ($2 != "1X") { next } } +/^LinkWidthSupported/{ if ($2 == "1X") { exit } } /^LinkWidthActive/{ if ($2 == "1X") warn = warn "#warn: Link configured as 1X lid '$lid' port '$portnum'\n"} /^ib/ {print $0; next} diff --git a/infiniband-diags/scripts/ibcheckspeed.in b/infiniband-diags/scripts/ibcheckspeed.in new file mode 100644 index 0000000..25c2201 --- /dev/null +++ b/infiniband-diags/scripts/ibcheckspeed.in @@ -0,0 +1,135 @@ +#!/bin/sh + +IBPATH=${IBPATH:- at IBSCRIPTPATH@} + +function usage() { + echo Usage: `basename $0` "[-h] [-v] [-N | -nocolor]" \ + "[ \| -C ca_name -P ca_port -t(imeout) timeout_ms]" + exit -1 +} + +function user_abort() { + echo "Aborted" + exit 1 +} + +trap user_abort SIGINT + +gflags="" +verbose="" +v=0 +ntype="" +nodeguid="" +oldlid="" +topofile="" +ca_info="" + +while [ "$1" ]; do + case $1 in + -h) + usage + ;; + -N|-nocolor) + gflags=-N + ;; + -v) + verbose="-v" + v=1 + ;; + -P | -C | -t | -timeout) + case $2 in + -*) + usage + ;; + esac + if [ x$2 = x ] ; then + usage + fi + ca_info="$ca_info $1 $2" + shift + ;; + -*) + usage + ;; + *) + if [ "$topofile" ]; then + usage + fi + topofile="$1" + ;; + esac + shift +done + +if [ "$topofile" ]; then + netcmd="cat $topofile" +else + netcmd="$IBPATH/ibnetdiscover $ca_info" +fi + +text="`eval $netcmd`" +rv=$? 
+echo "$text" | awk ' +BEGIN { + ne=0 + pe=0 +} +function check_node(lid) +{ + nodechecked=1 + if (system("'$IBPATH'/ibchecknode'"$ca_info"' '$gflags' '$verbose' " lid)) { + ne++ + badnode=1 + return + } +} + +/^Ca/ || /^Switch/ || /^Rt/ { + nnodes++ + ntype=$1; nodeguid=substr($3, 4, 16); ports=$2 + if ('$v') + print "\n# Checking " ntype ": nodeguid 0x" nodeguid + + nodechecked=0 + badnode=0 + if (ntype != "Switch") + next + + lid = substr($0, index($0, "port 0 lid ") + 11) + lid = substr(lid, 1, index(lid, " ") - 1) + check_node(lid) + } +/^\[/ { + nports++ + port = $1 + if (!nodechecked) { + lid = substr($0, index($0, " lid ") + 5) + lid = substr(lid, 1, index(lid, " ") - 1) + check_node(lid) + } + if (badnode) { + print "\n# " ntype ": nodeguid 0x" nodeguid " failed" + next + } + sub("\\(.*\\)", "", port) + gsub("[\\[\\]]", "", port) + if (system("'$IBPATH'/ibcheckportspeed'"$ca_info"' '$gflags' '$verbose' " lid " " port)) { + if (!'$v' && oldlid != lid) { + print "# Checked " ntype ": nodeguid 0x" nodeguid " with failure" + oldlid = lid + } + pe++; + } +} + +/^ib/ {print $0; next} +/ibpanic:/ {print $0} +/ibwarn:/ {print $0} +/iberror:/ {print $0} + +END { + printf "\n## Summary: %d nodes checked, %d bad nodes found\n", nnodes, ne + printf "## %d ports checked, %d ports with 2.5 Gbps speed in error found\n", nports, pe +} +' +exit $rv -- 1.6.4.2 >From 76e5f441bac10dff185244139a46124ff4736d56 Mon Sep 17 00:00:00 2001 From: Keshetti Mahesh Date: Thu, 10 Sep 2009 18:24:16 +0530 Subject: [PATCH 2/2] Revert the change made to 'ibcheckportwidth' Add man pages of 'ibcheckportspeed' and 'ibcheckspeed' Integrate 'ibcheckportspeed' and 'ibcheckspeed' into the build system Organization: OFED --- infiniband-diags/Makefile.am | 3 +- infiniband-diags/configure.in | 2 + infiniband-diags/man/ibcheckportspeed.8 | 44 ++++++++++++++++++++++++++ infiniband-diags/man/ibcheckportwidth.8 | 2 +- infiniband-diags/man/ibcheckspeed.8 | 37 +++++++++++++++++++++ 
infiniband-diags/scripts/ibcheckportwidth.in | 2 +- 6 files changed, 87 insertions(+), 3 deletions(-) create mode 100644 infiniband-diags/man/ibcheckportspeed.8 create mode 100644 infiniband-diags/man/ibcheckspeed.8 diff --git a/infiniband-diags/Makefile.am b/infiniband-diags/Makefile.am index 1cdb60e..8c5b773 100644 --- a/infiniband-diags/Makefile.am +++ b/infiniband-diags/Makefile.am @@ -23,6 +23,7 @@ sbin_SCRIPTS = scripts/ibcheckerrs scripts/ibchecknet scripts/ibchecknode \ scripts/ibcheckport scripts/ibhosts scripts/ibstatus \ scripts/ibswitches scripts/ibnodes scripts/ibrouters \ scripts/ibcheckwidth scripts/ibcheckportwidth \ + scripts/ibcheckspeed scripts/ibcheckportspeed \ scripts/ibcheckstate scripts/ibcheckportstate \ scripts/ibcheckerrors scripts/ibclearerrors \ scripts/ibclearcounters scripts/ibdatacounts \ @@ -76,7 +77,7 @@ man_MANS = man/ibaddr.8 man/ibcheckerrors.8 man/ibcheckerrs.8 \ man/ibprintswitch.8 man/ibprintca.8 man/ibfindnodesusing.8 \ man/ibdatacounts.8 man/ibdatacounters.8 \ man/ibrouters.8 man/ibprintrt.8 man/ibidsverify.8 \ - man/check_lft_balance.8 + man/check_lft_balance.8 man/ibcheckportspeed.8 man/ibcheckspeed.8 BUILT_SOURCES = ibdiag_version ibdiag_version: diff --git a/infiniband-diags/configure.in b/infiniband-diags/configure.in index 3ef35cc..c647f40 100644 --- a/infiniband-diags/configure.in +++ b/infiniband-diags/configure.in @@ -156,8 +156,10 @@ AC_CONFIG_FILES([\ scripts/ibcheckport \ scripts/ibcheckportstate \ scripts/ibcheckportwidth \ + scripts/ibcheckportspeed \ scripts/ibcheckstate \ scripts/ibcheckwidth \ + scripts/ibcheckspeed \ scripts/ibclearcounters \ scripts/ibclearerrors \ scripts/ibdatacounts \ diff --git a/infiniband-diags/man/ibcheckportspeed.8 b/infiniband-diags/man/ibcheckportspeed.8 new file mode 100644 index 0000000..36beaac --- /dev/null +++ b/infiniband-diags/man/ibcheckportspeed.8 @@ -0,0 +1,44 @@ +.TH IBCHECKPORTSPEED 8 "Sep 10, 2009" "OpenIB" "OpenIB Diagnostics" + +.SH NAME +ibcheckportspeed \- 
validate IB port for link speed + +.SH SYNOPSIS +.B ibcheckportspeed +[\-h] [\-v] [\-N | \-nocolor] [\-G] [\-C ca_name] [\-P ca_port] +[\-t(imeout) timeout_ms] + +.SH DESCRIPTION +.PP +Check connectivity and check the specified port for link speed. Report warning if the LinkSpeedSupported is greater than 2.5 Gbps and LinkSpeedActive is configured as 2.5 Gbps. + +Port address is a lid unless -G option is used to specify a GUID address. + +.SH OPTIONS +.PP +\-G use GUID address argument. In most cases, it is the Port GUID. + Example: + "0x08f1040023" +.PP +\-v increase the verbosity level +.PP +\-N | \-nocolor use mono rather than color mode +.PP +\-C use the specified ca_name. +.PP +\-P use the specified ca_port. +.PP +\-t override the default timeout for the solicited mads. + +.SH EXAMPLE +.PP +ibcheckportspeed 2 3 # check lid 2 port 3 + +.SH SEE ALSO +.BR smpquery(8), +.BR ibaddr(8) + +.SH AUTHOR +.TP +Keshetti Mahesh +.RI < keshetti.mahesh at gmail.com > diff --git a/infiniband-diags/man/ibcheckportwidth.8 b/infiniband-diags/man/ibcheckportwidth.8 index 85c06fc..c368467 100644 --- a/infiniband-diags/man/ibcheckportwidth.8 +++ b/infiniband-diags/man/ibcheckportwidth.8 @@ -4,7 +4,7 @@ ibcheckportwidth \- validate IB port for 1x link width .SH SYNOPSIS -.B ibcheckport +.B ibcheckportwidth [\-h] [\-v] [\-N | \-nocolor] [\-G] [\-C ca_name] [\-P ca_port] [\-t(imeout) timeout_ms] diff --git a/infiniband-diags/man/ibcheckspeed.8 b/infiniband-diags/man/ibcheckspeed.8 new file mode 100644 index 0000000..29aee37 --- /dev/null +++ b/infiniband-diags/man/ibcheckspeed.8 @@ -0,0 +1,37 @@ +.TH IBCHECKSPEED 8 "Sep 10, 2009" "OpenIB" "OpenIB Diagnostics" + +.SH NAME +ibcheckspeed \- find link speed configuration errors in IB subnet + +.SH SYNOPSIS +.B ibcheckspeed +[\-h] [\-v] [\-N | \-nocolor] [ | \-C ca_name +\-P ca_port \-t(imeout) timeout_ms] + + +.SH DESCRIPTION +.PP +ibcheckspeed is a script which uses a full topology file that was created by +ibnetdiscover, scans the 
network to validate the active link speeds and reports +any links which are configured with less active link speed then the supported +link speed. + +.SH OPTIONS +.PP +\-N | \-nocolor use mono rather than color mode +.PP +\-C use the specified ca_name. +.PP +\-P use the specified ca_port. +.PP +\-t override the default timeout for the solicited mads. + +.SH SEE ALSO +.BR ibnetdiscover(8), +.BR ibchecknode(8), +.BR ibcheckportspeed(8) + +.SH AUTHOR +.TP +Keshetti Mahesh +.RI < keshetti.mahesh at gmail.com > diff --git a/infiniband-diags/scripts/ibcheckportwidth.in b/infiniband-diags/scripts/ibcheckportwidth.in index 60a0892..32c5c5e 100644 --- a/infiniband-diags/scripts/ibcheckportwidth.in +++ b/infiniband-diags/scripts/ibcheckportwidth.in @@ -103,7 +103,7 @@ function blue(s) } # Only check LinkWidthActive if LinkWidthSupported is not 1X -/^LinkWidthSupported/{ if ($2 == "1X") { exit } } +/^LinkWidthSupported/{ if ($2 != "1X") { next } } /^LinkWidthActive/{ if ($2 == "1X") warn = warn "#warn: Link configured as 1X lid '$lid' port '$portnum'\n"} /^ib/ {print $0; next} -- 1.6.4.2 From hal.rosenstock at gmail.com Thu Sep 10 06:23:35 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Thu, 10 Sep 2009 09:23:35 -0400 Subject: [ofa-general] Re: [PATCH v2] infiniband-diags/scripts: Add 'ibcheckspeed' and 'ibcheckportspeed' to scripts In-Reply-To: <829ded920909100602h78614ac0jd4eb1ee8d7a3779b@mail.gmail.com> References: <829ded920909100602h78614ac0jd4eb1ee8d7a3779b@mail.gmail.com> Message-ID: On Thu, Sep 10, 2009 at 9:02 AM, Keshetti Mahesh wrote: > Added 'ibcheckspeed' and 'ibcheckportspeed': Similar to > 'ibcheckwidth/ibcheckportwidth' in functionality and implementation. > Reports error/warning messages if the LinkSpeedActive is configured as > 2.5 Gbps when the LinkSpeedSupported is more than 2.5 Gbps. > ibportstate checks for more than this in terms of speed (and width) anomalies. Would it be better for these scripts to use that tool now ? 
Alternatively, the additional speed/width anomaly checks could be implemented
in these scripts but it does involve checking the peer port so there's a
little more to it.

-- Hal

>
> Signed-off-by: Keshetti Mahesh < keshetti.mahesh at gmail.com>
> ---
> infiniband-diags/scripts/ibcheckportspeed.in | 146
> ++++++++++++++++++++++++++
> infiniband-diags/scripts/ibcheckportwidth.in | 2 +-
> infiniband-diags/scripts/ibcheckspeed.in | 135
> ++++++++++++++++++++++++
> 3 files changed, 282 insertions(+), 1 deletions(-)
> create mode 100644 infiniband-diags/scripts/ibcheckportspeed.in
> create mode 100644 infiniband-diags/scripts/ibcheckspeed.in
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From swise at opengridcomputing.com Thu Sep 10 06:42:35 2009
From: swise at opengridcomputing.com (Steve Wise)
Date: Thu, 10 Sep 2009 08:42:35 -0500
Subject: [ofa-general] [PATCH 4/4] DAPL v2: ucm: tighten up locking with CM processing, state changes
In-Reply-To:
References:
Message-ID: <4AA9024B.5040703@opengridcomputing.com>

I think depending on the kernel you're running on, you might have to add -i.

Barry Mavin wrote:
> I am still having problems mounting an NFS share with RDMA.
>
> [root at senkas1 nfs-utils-1.1.6]# /sbin/mount.nfs 10.10.10.3:/home/nfs
> /mnt/nfs -o rdma,port=20049
> mount.nfs: Unsupported nfs mount option: rdma
>
> This is after updating to 1.1.6 nfs-utils.
>
> ---
> Regards
> Barry Mavin
> Recital Corporation
> Chairman and CEO
> Website: http://www.recital.com
> MSN Messenger: Barry_Mavin at msn.com
> Skype: BarryMavin
> Direct line worldwide: +1 9785224139
>
>
>
>
>> From: Barry Mavin
>> Date: Thu, 10 Sep 2009 06:03:46 +0530
>> To:
>> Cc: "ofw at lists.openfabrics.org" ,
>> "general at lists.openfabrics.org" , "Davis, Arlin
>> R"
>> Subject: Re: [ofa-general] [PATCH 4/4] DAPL v2: ucm: tighten up locking with
>> CM processing, state changes
>>
>> I have specified insecure on the server side.
Is that required on the client
>> mount command also?
>>
>> Looking at the logs, the mount is successful but an I/O error happens
>> afterwards.
>>
>> OFED 1.4.1 installed nfs-utils 1.1.6. Do they need reinstalled again?
>>
>> I did in fact try to build 1.2 and configure fails against OFED 1.4.1 on RH
>> 5.3
>>
>> ---
>> Regards
>> Barry Mavin
>> Recital Corporation
>> Chairman and CEO
>> Website: http://www.recital.com
>> MSN Messenger: Barry_Mavin at msn.com
>> Skype: BarryMavin
>> Direct line worldwide: +1 9785224139
>>
>>
>>
>>
>>> From: Joe Landman
>>> Organization: Scalable Informatics
>>> Reply-To:
>>> Date: Wed, 09 Sep 2009 20:16:34 -0400
>>> To: Barry Mavin
>>> Cc: "Davis, Arlin R" ,
>>> "general at lists.openfabrics.org" ,
>>> "ofw at lists.openfabrics.org"
>>> Subject: Re: [ofa-general] [PATCH 4/4] DAPL v2: ucm: tighten up locking with
>>> CM processing, state changes
>>>
>>> Barry Mavin wrote:
>>>
>>>
>>>> OS is redhat 5.3.
>>>> Mellanox fw is all at 2.6
>>>>
>>>> Has anyone got NFS / RDMA working with redhat 5.3?
>>>>
>>> Yes, but you have to update nfs-tools, and change your mount to use the
>>> 'insecure' option.
>>>
>>> Grab http://www.kernel.org/pub/linux/utils/nfs/nfs-utils-1.1.6.tar.gz
>>> and build it (1.2.0 may work, we haven't played with it).
>>>
>>>
>>>> ---
>>>> Regards
>>>> Barry Mavin
>>>> Recital Corporation
>>>>
>>> --
>>> Joseph Landman, Ph.D
>>> Founder and CEO
>>> Scalable Informatics, Inc.
>>> email: landman at scalableinformatics.com >>> web : http://scalableinformatics.com >>> http://scalableinformatics.com/jackrabbit >>> phone: +1 734 786 8423 x121 >>> fax : +1 866 888 3112 >>> cell : +1 734 612 4615 >>> >> _______________________________________________ >> general mailing list >> general at lists.openfabrics.org >> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general >> >> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general >> > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > From tziporet at mellanox.co.il Thu Sep 10 08:54:06 2009 From: tziporet at mellanox.co.il (Tziporet Koren) Date: Thu, 10 Sep 2009 18:54:06 +0300 Subject: [ofa-general] OFED 1.5 beta release is available In-Reply-To: <2ED289D4E09FBD4D92D911E869B97FDD29F156@mtlexch01.mtl.com> References: <5D49E7A8952DC44FB38C38FA0D758EAD02C12252@mtlexch01.mtl.com> <5D49E7A8952DC44FB38C38FA0D758EAD02D5F39D@mtlexch01.mtl.com> <2ED289D4E09FBD4D92D911E869B97FDD29F156@mtlexch01.mtl.com> Message-ID: <2ED289D4E09FBD4D92D911E869B97FDDAF55AC@mtlexch01.mtl.com> OFED 1.5-beta1 is available Notes: The tarball is available on: http://www.openfabrics.org/downloads/OFED/ofed-1.5/OFED-1.5-beta1.tgz To get BUILD_ID run ofed_info Please report any issues in bugzilla https://bugs.openfabrics.org/ for OFED 1.5 Vladimir & Tziporet ======================================================================== Release information: -------------------- Linux Operating Systems: - RedHat EL4 up6: 2.6.9-67.ELsmp - RedHat EL4 up7: 2.6.9-78.ELsmp - RedHat EL4 up8: 2.6.9-89.ELsmp - RedHat EL5 up2: 2.6.18-92.el5 - RedHat EL5 up3: 2.6.18-128.el5 - SLES10 SP2: 2.6.16.60-0.21-smp - SLES11: 2.6.27.19-5-default - OEL 4 up7 2.6.9-78.ELsmp - OEL 5 up2 2.6.18-92.el5 - CentOS5.2 2.6.18-92.el5 - 
CentOS5.3 2.6.18-128.el5 - Fedora Core12 2.6.29 * - OpenSuSE 11 2.6.25.5-1.1 * - kernel.org: 2.6.29 and 2.6.30 * Minimal QA for these versions Systems: * x86_64 * x86 * ia64 * ppc64 Main changes from 1.5 alpha: ============================ 1. Backports for all kernel modules (almost) 2. Moved to new libraries package scheme. Look at ofed_info to see the changes 3. SDP Zero Copy 4. Bug fixes Tasks that should be completed for the beta2 ============================================ 1. RHEL 5.4 support 2. Bug fixes Limitations: ============ - mthca does not compile on SLES11 on PPC64 - RDS kernel panic on RHEL 4.x - RHEL 5.4 is not supported Changes from OFED-1.4.1 ======================== 1 General changes o Kernel code based on 2.6.30 2 SDP o Performance improvements 3 uDAPL o New library 4 Management o OpenSM - Mesh Analysis for LASH routing algorithm. - Reloadable OpenSM configuration (preliminary implemented) - Routing paths sorted balancing (for UpDown and MinHops) - Weighted Lid Matrices calculation (for UpDown, MinHop and DOR). - I/O nodes connectivity (for FatTree). 5 MPI: - For now same versions as in OFED 1.4.1 -------------- next part -------------- A non-text attachment was scrubbed... Name: ofed_kernel-1.5-apha4_beta1.log Type: application/octet-stream Size: 48042 bytes Desc: ofed_kernel-1.5-apha4_beta1.log URL: From weiny2 at llnl.gov Thu Sep 10 09:02:13 2009 From: weiny2 at llnl.gov (Ira Weiny) Date: Thu, 10 Sep 2009 09:02:13 -0700 Subject: [ofa-general] Re: [PATCH v2] infiniband-diags/scripts: Add 'ibcheckspeed' and 'ibcheckportspeed' to scripts In-Reply-To: References: <829ded920909100602h78614ac0jd4eb1ee8d7a3779b@mail.gmail.com> Message-ID: <20090910090213.6888b7d5.weiny2@llnl.gov> Also, iblinkinfo will report links which it finds capable of either faster or wider operation. iblinkinfo checks both ends of the link as Hal mentions. It reports this with output like. Switch 0x0005ad0000092106 Cisco Switch SFS7000D: ... 
7 8[ ] ==( 4X 2.5 Gbps Active/ LinkUp)==> 8 12[ ] "MT47396 Infiniscale-III Mellanox Technologies" ( Could be 5.0 Gbps) ... Also the portstatus console command in OpenSM will report links which are running at "reduced speed or width". Although this does not check the remote port. OpenSM $ help portstatus portstatus [ca|switch|router] summarize port status [ca|switch|router] -- limit the results to the node type specified OpenSM $ portstatus "ALL" port status: 115 port(s) scanned on 9 nodes in 26 us 85 down 30 active 32 at 4X 22 at 2.5 Gbps 8 at 5.0 Gbps 2 at 10.0 Gbps Possible issues: 2 disabled 0x0008f10400411b18 5 (ISR9024D Voltaire) 0x0005ad0000092106 13 (Cisco Switch SFS7000D) 6 with reduced speed 0x0008f10500200220 33 (Voltaire 4036 - 36 QDR ports switch) 0x0008f10500200220 19 (Voltaire 4036 - 36 QDR ports switch) 0x0005ad0000092106 21 (Cisco Switch SFS7000D) 0x0005ad0000092106 20 (Cisco Switch SFS7000D) 0x0005ad0000092106 9 (Cisco Switch SFS7000D) 0x0005ad0000092106 8 (Cisco Switch SFS7000D) Ira On Thu, 10 Sep 2009 09:23:35 -0400 Hal Rosenstock wrote: > On Thu, Sep 10, 2009 at 9:02 AM, Keshetti Mahesh > wrote: > > > Added 'ibcheckspeed' and 'ibcheckportspeed': Similar to > > 'ibcheckwidth/ibcheckportwidth' in functionality and implementation. > > Reports error/warning messages if the LinkSpeedActive is configured as > > 2.5 Gbps when the LinkSpeedSupported is more than 2.5 Gbps. > > > > ibportstate checks for more than this in terms of speed (and width) > anomalies. > > Would it be better for these scripts to use that tool now ? Alternatively, > the additional speed/width anomaly checks could be implemented in these > scripts but it does involve checking the peer port so there's a little more > to it. 
> > -- Hal > > > > > > Signed-off-by: Keshetti Mahesh < keshetti.mahesh at gmail.com> > > --- > > infiniband-diags/scripts/ibcheckportspeed.in | 146 > > ++++++++++++++++++++++++++ > > infiniband-diags/scripts/ibcheckportwidth.in | 2 +- > > infiniband-diags/scripts/ibcheckspeed.in | 135 > > ++++++++++++++++++++++++ > > 3 files changed, 282 insertions(+), 1 deletions(-) > > create mode 100644 infiniband-diags/scripts/ibcheckportspeed.in > > create mode 100644 infiniband-diags/scripts/ibcheckspeed.in > > > > -- Ira Weiny Math Programmer/Computer Scientist Lawrence Livermore National Lab 925-423-8008 weiny2 at llnl.gov From jenos at ncsa.uiuc.edu Thu Sep 10 11:43:10 2009 From: jenos at ncsa.uiuc.edu (Jeremy Enos) Date: Thu, 10 Sep 2009 13:43:10 -0500 Subject: [ofa-general] Fedora 10 OFED support plans In-Reply-To: <200908311217.43954.jackm@dev.mellanox.co.il> References: <4A8E4854.2060909@ncsa.uiuc.edu> <200908301856.33259.jackm@dev.mellanox.co.il> <4A9AB9AD.80803@ncsa.uiuc.edu> <200908311217.43954.jackm@dev.mellanox.co.il> Message-ID: <4AA948BE.9060806@ncsa.uiuc.edu> An HTML attachment was scrubbed... URL: From jgunthorpe at obsidianresearch.com Thu Sep 10 12:34:54 2009 From: jgunthorpe at obsidianresearch.com (Jason Gunthorpe) Date: Thu, 10 Sep 2009 13:34:54 -0600 Subject: [ofa-general] Fedora 10 OFED support plans In-Reply-To: <4AA948BE.9060806@ncsa.uiuc.edu> References: <4A8E4854.2060909@ncsa.uiuc.edu> <200908301856.33259.jackm@dev.mellanox.co.il> <4A9AB9AD.80803@ncsa.uiuc.edu> <200908311217.43954.jackm@dev.mellanox.co.il> <4AA948BE.9060806@ncsa.uiuc.edu> Message-ID: <20090910193454.GB7552@obsidianresearch.com> On Thu, Sep 10, 2009 at 01:43:10PM -0500, Jeremy Enos wrote: > So I accepted that I'd have to move Fedora version to get OFED > support... and I was ok with that. However, I see now that FC12 is > not released, and won't be until November. I have tested FC11, and > it doesn't work w/ the OFED 1.5 beta1 either. 
It seems OFED has
> skipped out on support for Fedora for like 1.5 years of release.
> Moving back to Fedora9 isn't really an option for me either... so
> it looks like I'm out of options until November unless OFED were to
> target support for the latest /released/ Fedora OS instead of
> alpha/beta pre-releases. :(

What doesn't work on FC11? What would it take to fix this?

Jason

From mashirle at us.ibm.com Thu Sep 10 12:40:49 2009
From: mashirle at us.ibm.com (Shirley Ma)
Date: Thu, 10 Sep 2009 12:40:49 -0700
Subject: [ofa-general] mlx4 second port lro issue
Message-ID: <1252611649.4489.1.camel@localhost.localdomain>

Hello,

Anybody has seen lro doesn't work for port2 but works for port1 before?
We found this issue on RHEL5.

Thanks
Shirley

From jenos at ncsa.uiuc.edu Thu Sep 10 13:00:56 2009
From: jenos at ncsa.uiuc.edu (Jeremy Enos)
Date: Thu, 10 Sep 2009 15:00:56 -0500
Subject: [ofa-general] Fedora 10 OFED support plans
In-Reply-To: <20090910193454.GB7552@obsidianresearch.com>
References: <4A8E4854.2060909@ncsa.uiuc.edu> <200908301856.33259.jackm@dev.mellanox.co.il> <4A9AB9AD.80803@ncsa.uiuc.edu> <200908311217.43954.jackm@dev.mellanox.co.il> <4AA948BE.9060806@ncsa.uiuc.edu> <20090910193454.GB7552@obsidianresearch.com>
Message-ID: <4AA95AF8.9020905@ncsa.uiuc.edu>

An HTML attachment was scrubbed...
URL:

From jenos at ncsa.uiuc.edu Thu Sep 10 13:32:24 2009
From: jenos at ncsa.uiuc.edu (Jeremy Enos)
Date: Thu, 10 Sep 2009 15:32:24 -0500
Subject: [ofa-general] Fedora 10 OFED support plans
In-Reply-To: <4AA95AF8.9020905@ncsa.uiuc.edu>
References: <4A8E4854.2060909@ncsa.uiuc.edu> <200908301856.33259.jackm@dev.mellanox.co.il> <4A9AB9AD.80803@ncsa.uiuc.edu> <200908311217.43954.jackm@dev.mellanox.co.il> <4AA948BE.9060806@ncsa.uiuc.edu> <20090910193454.GB7552@obsidianresearch.com> <4AA95AF8.9020905@ncsa.uiuc.edu>
Message-ID: <4AA96258.7040301@ncsa.uiuc.edu>

An HTML attachment was scrubbed...
URL:

From rdreier at cisco.com Thu Sep 10 13:54:56 2009
From: rdreier at cisco.com (Roland Dreier)
Date: Thu, 10 Sep 2009 13:54:56 -0700
Subject: [ofa-general] mlx4 second port lro issue
In-Reply-To: <1252611649.4489.1.camel@localhost.localdomain> (Shirley Ma's message of "Thu, 10 Sep 2009 12:40:49 -0700")
References: <1252611649.4489.1.camel@localhost.localdomain>
Message-ID:

> Anybody has seen lro doesn't work for port2 but works for port1 before?
> We found this issue on RHEL5.

I haven't seen this. But I haven't really tried.

Are you talking about mlx4_en or ipoib with mlx4_ib? What do you mean
by lro not working -- the interface doesn't work with lro enabled, or
you just don't see aggregation of receives?

By the way, both mlx4_en and ipoib should probably move from lro to
gro -- patches welcome.

From mashirle at us.ibm.com Thu Sep 10 14:10:15 2009
From: mashirle at us.ibm.com (Shirley Ma)
Date: Thu, 10 Sep 2009 14:10:15 -0700
Subject: [ofa-general] mlx4 second port lro issue
In-Reply-To:
References: <1252611649.4489.1.camel@localhost.localdomain>
Message-ID: <1252617015.4489.6.camel@localhost.localdomain>

Hello Roland,

On Thu, 2009-09-10 at 13:54 -0700, Roland Dreier wrote:
> haven't seen this. But I haven't really tried.
>
> Are you talking about mlx4_en or ipoib with mlx4_ib? What do you mean
> by lro not working -- the interface doesn't work with lro enabled, or
> you just don't see aggregation of receives?
>
> By the way, both mlx4_en and ipoib should probably move from lro to
> gro -- patches welcome.

Thanks for your prompt response.

The problem was seen on mlx4_en QDR. When LRO enabled, for large packet
than 1500 mtu, port1 can process this packet, but port2 doesn't see this
big packet at all from tcpdump trace. I wonder whether there is any FW
issue since I don't think the driver handle two ports differently.
Shirley

From boris at mellanox.com Thu Sep 10 14:16:35 2009
From: boris at mellanox.com (Boris Shpolyansky)
Date: Thu, 10 Sep 2009 14:16:35 -0700
Subject: [ofa-general] mlx4 second port lro issue
References: <1252611649.4489.1.camel@localhost.localdomain> <1252617015.4489.6.camel@localhost.localdomain>
Message-ID: <1E3DCD1C63492545881FACB6063A57C10459D4DA@mtiexch01.mti.com>

Dumb question: what was the MTU setting of the eth interface associated
with port 2?
Dropping jumbo frames has nothing to do with LRO - it is plain layer 2
functionality.

Boris Shpolyansky
Sr. Member of Technical Staff, Applications
Mellanox Technologies Inc.
350 Oakmead Parkway, Suite 100
Sunnyvale, CA 94085
Tel.: (408) 916 0014
Fax: (408) 585 0314
Cell: (408) 834 9365
www.mellanox.com

-----Original Message-----
From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Shirley Ma
Sent: Thursday, September 10, 2009 2:10 PM
To: Roland Dreier
Cc: general at lists.openfabrics.org
Subject: Re: [ofa-general] mlx4 second port lro issue

Hello Roland,

On Thu, 2009-09-10 at 13:54 -0700, Roland Dreier wrote:
> haven't seen this. But I haven't really tried.
>
> Are you talking about mlx4_en or ipoib with mlx4_ib? What do you mean
> by lro not working -- the interface doesn't work with lro enabled, or
> you just don't see aggregation of receives?
>
> By the way, both mlx4_en and ipoib should probably move from lro to
> gro -- patches welcome.

Thanks for your prompt response.

The problem was seen on mlx4_en QDR. When LRO enabled, for large packet
than 1500 mtu, port1 can process this packet, but port2 doesn't see this
big packet at all from tcpdump trace. I wonder whether there is any FW
issue since I don't think the driver handle two ports differently.
Shirley _______________________________________________ general mailing list general at lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From Barry.Mavin at recital.com Thu Sep 10 16:38:58 2009 From: Barry.Mavin at recital.com (Barry Mavin) Date: Fri, 11 Sep 2009 05:08:58 +0530 Subject: [ofa-general] [PATCH 4/4] DAPL v2: ucm: tighten up locking with CM processing, state changes In-Reply-To: <4AA9024B.5040703@opengridcomputing.com> Message-ID: The -I option is not recognized with nfs-utils 1.1.6, neither is the -o rdma option. Is there any special option that should be used on the client mount command to tell it to use rdma? --- Regards Barry Mavin Recital Corporation > From: Steve Wise > Date: Thu, 10 Sep 2009 08:42:35 -0500 > To: Barry Mavin > Cc: , "ofw at lists.openfabrics.org" > , "general at lists.openfabrics.org" > , "Davis, Arlin R" > Subject: Re: [ofa-general] [PATCH 4/4] DAPL v2: ucm: tighten up locking with > CM processing, state changes > > I think depending on the kernel you're running on, you might have to add -i. > > Barry Mavin wrote: >> I am still having problems mounting an NFS share with RDMA. >> >> [root at senkas1 nfs-utils-1.1.6]# /sbin/mount.nfs 10.10.10.3:/home/nfs >> /mnt/nfs -o rdma,port=20049 >> mount.nfs: Unsupported nfs mount option: rdma >> >> This is after updating to 1.1.6 nfs-utils. 
>> >> --- >> Regards >> Barry Mavin >> Recital Corporation >> Chairman and CEO >> Website: http://www.recital.com >> MSN Messenger: Barry_Mavin at msn.com >> Skype: BarryMavin >> Direct line worldwide: +1 9785224139 >> >> >> >> >>> From: Barry Mavin >>> Date: Thu, 10 Sep 2009 06:03:46 +0530 >>> To: >>> Cc: "ofw at lists.openfabrics.org" , >>> "general at lists.openfabrics.org" , "Davis, >>> Arlin >>> R" >>> Subject: Re: [ofa-general] [PATCH 4/4] DAPL v2: ucm: tighten up locking with >>> CM processing, state changes >>> >>> I have specified insecure on the server side. Is that required on the client >>> mount command also? >>> >>> Looking at the logs, the mount is successful but an I/O error happens >>> afterwards. >>> >>> OFED 1.4.1 installed nfs-utils 1.1.6. Do they need reinstalled again? >>> >>> I did in fact try to build 1.2 and configure fails against OFED 1.4.1 on RH >>> 5.3 >>> >>> --- >>> Regards >>> Barry Mavin >>> Recital Corporation >>> Chairman and CEO >>> Website: http://www.recital.com >>> MSN Messenger: Barry_Mavin at msn.com >>> Skype: BarryMavin >>> Direct line worldwide: +1 9785224139 >>> >>> >>> >>> >>>> From: Joe Landman >>>> Organization: Scalable Informatics >>>> Reply-To: >>>> Date: Wed, 09 Sep 2009 20:16:34 -0400 >>>> To: Barry Mavin >>>> Cc: "Davis, Arlin R" , >>>> "general at lists.openfabrics.org" , >>>> "ofw at lists.openfabrics.org" >>>> Subject: Re: [ofa-general] [PATCH 4/4] DAPL v2: ucm: tighten up locking >>>> with >>>> CM processing, state changes >>>> >>>> Barry Mavin wrote: >>>> >>>> >>>>> OS is redhat 5.3. >>>>> Mellanox fw is all at 2.6 >>>>> >>>>> Has anyone got NFS / RDMA working with redhat 5.3? >>>>> >>>> Yes, but you have to update nfs-tools, and change your mount to use the >>>> 'insecure' option. >>>> >>>> Grab http://www.kernel.org/pub/linux/utils/nfs/nfs-utils-1.1.6.tar.gz >>>> and build it (1.2.0 may work, we haven't played with it). 
>>>> >>>> >>>>> --- >>>>> Regards >>>>> Barry Mavin >>>>> Recital Corporation >>>>> >>>> -- >>>> Joseph Landman, Ph.D >>>> Founder and CEO >>>> Scalable Informatics, Inc. >>>> email: landman at scalableinformatics.com >>>> web : http://scalableinformatics.com >>>> http://scalableinformatics.com/jackrabbit >>>> phone: +1 734 786 8423 x121 >>>> fax : +1 866 888 3112 >>>> cell : +1 734 612 4615 >>>> From jgunthorpe at obsidianresearch.com Thu Sep 10 16:54:59 2009 From: jgunthorpe at obsidianresearch.com (Jason Gunthorpe) Date: Thu, 10 Sep 2009 17:54:59 -0600 Subject: [ofa-general] Fedora 10 OFED support plans In-Reply-To: <4AA95AF8.9020905@ncsa.uiuc.edu> References: <4A8E4854.2060909@ncsa.uiuc.edu> <200908301856.33259.jackm@dev.mellanox.co.il> <4A9AB9AD.80803@ncsa.uiuc.edu> <200908311217.43954.jackm@dev.mellanox.co.il> <4AA948BE.9060806@ncsa.uiuc.edu> <20090910193454.GB7552@obsidianresearch.com> <4AA95AF8.9020905@ncsa.uiuc.edu> Message-ID: <20090910235459.GD7552@obsidianresearch.com> On Thu, Sep 10, 2009 at 03:00:56PM -0500, Jeremy Enos wrote: > Fails w/ ofa_kernel like the others have... I didn't test excluding this rpm > with FC11, but the others also fail elsewhere w/ this rpm excluded- so I'm > guessing FC11 would as well. I included the output (and last 50 lines of log) > in case you can glean something useful from it. Thank you- Hmm.. looks like a scripting problem to me, no compiler errors hidden someplace in that log?
> if [ ! -n "/var/tmp/OFED_topdir/BUILDROOT/ofa_kernel-1.5-ofed1.5.beta1.x86_64" > ]; then /sbin/depmod -r -ae 2.6.30.5-43.fc11.x86_64;fi; > ++ find /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.5 -name Module.symvers -o -name > Modules.symvers > + modsyms=/var/tmp/OFED_topdir/BUILD/ofa_kernel-1.5/Module.symvers > + for modsym in '$modsyms' > + cat /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.5/Module.symvers > /var/tmp/rpm-tmp.7Js3WE: line 37: /var/tmp/OFED_topdir/BUILDROOT/ > ofa_kernel-1.5-ofed1.5.beta1.x86_64//usr/src/ofa_kernel/Module.symvers: No such > file or directory > error: Bad exit status from /var/tmp/rpm-tmp.7Js3WE (%install) Funky. Created it in the wrong place? Jason From cl at linux-foundation.org Thu Sep 10 16:54:04 2009 From: cl at linux-foundation.org (Christoph Lameter) Date: Thu, 10 Sep 2009 19:54:04 -0400 (EDT) Subject: [ofa-general] ibpanic In-Reply-To: <20090909102837.GE9156@cefeid.wcss.wroc.pl> References: <20090807112526.GD21691@cefeid.wcss.wroc.pl> <20090909102837.GE9156@cefeid.wcss.wroc.pl> Message-ID: > kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11) > perfquery: ibpanic: [20876] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory) It also happens with mthca? Isnt that the same bug that we had with mlx4? From mashirle at us.ibm.com Thu Sep 10 19:07:35 2009 From: mashirle at us.ibm.com (Shirley Ma) Date: Thu, 10 Sep 2009 19:07:35 -0700 Subject: [ofa-general] mlx4 second port lro issue In-Reply-To: <1E3DCD1C63492545881FACB6063A57C10459D4DA@mtiexch01.mti.com> References: <1252611649.4489.1.camel@localhost.localdomain> <1252617015.4489.6.camel@localhost.localdomain> <1E3DCD1C63492545881FACB6063A57C10459D4DA@mtiexch01.mti.com> Message-ID: <1252634855.4712.4.camel@localhost.localdomain> Hello Boris, On Thu, 2009-09-10 at 14:16 -0700, Boris Shpolyansky wrote: > Dumb question: what was the MTU setting of the eth interface > associated > with port 2? 
Dropping jumbo frames has nothing to do with LRO - it is > plain layer 2 functionality. ifconfig shows that the MTU of both port1 and port2 is 1500. port2 doesn't drop jumbo frames. The problem is that LRO is on for both interfaces, so the interface will get large packets (packet size 1848). port1 can receive and process them, but port2 cannot. If I disable LRO by reloading the module with num_lro=0, then it gets small packets, and port2 works fine. My question here is why LRO works well on port1 but not on port2. Thanks Shirley From boris at mellanox.com Thu Sep 10 19:31:35 2009 From: boris at mellanox.com (Boris Shpolyansky) Date: Thu, 10 Sep 2009 19:31:35 -0700 Subject: [ofa-general] mlx4 second port lro issue References: <1252611649.4489.1.camel@localhost.localdomain> <1252617015.4489.6.camel@localhost.localdomain> <1E3DCD1C63492545881FACB6063A57C10459D4DA@mtiexch01.mti.com> <1252634855.4712.4.camel@localhost.localdomain> Message-ID: <1E3DCD1C63492545881FACB6063A57C10459D597@mtiexch01.mti.com> Shirley, Are you referring to actual Ethernet frame size or to TCP message size? If the port MTU is set to 1500, it will reject Ethernet frames larger than this size; this has nothing to do with LRO. LRO is a TCP offload that improves CPU utilization on the TCP receiver by combining multiple packets belonging to the same TCP stream into a single buffer and passing it to the TCP stack as a single large packet. Boris Shpolyansky Sr. Member of Technical Staff, Applications Mellanox Technologies Inc.
350 Oakmead Parkway, Suite 100 Sunnyvale, CA 94085 Tel.: (408) 916 0014 Fax: (408) 585 0314 Cell: (408) 834 9365 www.mellanox.com From mashirle at us.ibm.com Thu Sep 10 20:02:47 2009 From: mashirle at us.ibm.com (Shirley Ma) Date: Thu, 10 Sep 2009 20:02:47 -0700 Subject: [ofa-general] mlx4 second port lro issue In-Reply-To: <1E3DCD1C63492545881FACB6063A57C10459D597@mtiexch01.mti.com> References: <1252611649.4489.1.camel@localhost.localdomain> <1252617015.4489.6.camel@localhost.localdomain> <1E3DCD1C63492545881FACB6063A57C10459D4DA@mtiexch01.mti.com> <1252634855.4712.4.camel@localhost.localdomain> <1E3DCD1C63492545881FACB6063A57C10459D597@mtiexch01.mti.com> Message-ID: <1252638167.4712.15.camel@localhost.localdomain> Hello Boris, On Thu, 2009-09-10 at 19:31 -0700, Boris Shpolyansky wrote: > Are you referring to actual Ethernet frame size or to TCP message > size? > If the port MTU set to 1500 it will reject Ethernet frames larger than > this size, this has nothing to do with the LRO.
> LRO is a TCP offload that improves CPU utilization on the TCP receiver > by combining multiple packets belonging to the same TCP stream to a > single buffer and transferring it to the TCP stack as a single large > packet. Because LRO is enabled at the device driver level, when the driver receives multiple TCP fragment packets, it merges them into one large packet and delivers it to the upper-layer protocol for processing. However, this works well on port1 but not on port2 when LRO is enabled. I don't know at what point the packet is dropped, since I don't have any tool other than tcpdump. In tcpdump I can see the large packet when I use port1, but I can't see the large packet when I use port2 with LRO enabled. If I disable LRO, all packets in tcpdump are smaller than the 1500 MTU and everything works fine on both ports. I hope that's clear. Thanks Shirley From Barry.Mavin at recital.com Thu Sep 10 20:07:14 2009 From: Barry.Mavin at recital.com (Barry Mavin) Date: Fri, 11 Sep 2009 08:37:14 +0530 Subject: [ofa-general] mlx4 second port lro issue In-Reply-To: <1E3DCD1C63492545881FACB6063A57C10459D597@mtiexch01.mti.com> Message-ID: Is LRO on by default? If not, how can we enable it? --- Regards Barry Mavin Recital Corporation From boris at mellanox.com Thu Sep 10 20:12:18 2009 From: boris at mellanox.com (Boris Shpolyansky) Date: Thu, 10 Sep 2009 20:12:18 -0700 Subject: [ofa-general] mlx4 second port lro issue Message-ID: <1E3DCD1C63492545881FACB6063A57C1D52ABE@mtiexch01.mti.com> It should be. There is an mlx4_en module parameter (num_lro) that controls it. Boris Shpolyansky Sr. Member of Technical Staff, Applications Mellanox Technologies Inc.
350 Oakmead Parkway, Suite 100 Sunnyvale, CA 94085 Tel.: (408) 916 0014 Fax: (408) 585 0314 Cell: (408) 834 9365 www.mellanox.com From mashirle at us.ibm.com Thu Sep 10 20:14:25 2009 From: mashirle at us.ibm.com (Shirley Ma) Date: Thu, 10 Sep 2009 20:14:25 -0700 Subject: [ofa-general] mlx4 second port lro issue In-Reply-To: References: Message-ID: <1252638865.4712.22.camel@localhost.localdomain> Hello Barry On Fri, 2009-09-11 at 08:37 +0530, Barry Mavin wrote: > Is LRO on by default? If not how can we enable it? > It is an mlx4_en module parameter; by default it is set to MLX4_EN_MAX_LRO_DESCRIPTORS, which is 32. Reloading the module with num_lro=0 will disable it. Disabling it will impact performance. Is there any possible reason why port1 works but port2 doesn't? Shirley From Barry.Mavin at recital.com Thu Sep 10 20:25:57 2009 From: Barry.Mavin at recital.com (Barry Mavin) Date: Fri, 11 Sep 2009 08:55:57 +0530 Subject: [ofa-general] mlx4 second port lro issue In-Reply-To: <1252638167.4712.15.camel@localhost.localdomain> Message-ID: Is this a subnet manager issue? Configure your Subnet Manager (SM) to set the MTU value in the configuration file. The SM configuration for MTU value is per Partition Key (PKey). For example, to enable 4052-byte MTUs on a default PKey using the OpenSM SM, log into the Linux machine (running OpenSM) and perform the following commands: 1. 
Edit the file: /usr/local/ofed/etc/opensm/partitions.conf and include the line: key0=0x7fff,ipoib,mtu=5 : ALL=full; --- Regards Barry Mavin Recital Corporation Chairman and CEO Website: http://www.recital.com MSN Messenger: Barry_Mavin at msn.com Skype: BarryMavin Direct line worldwide: +1 9785224139 > From: Shirley Ma > Date: Thu, 10 Sep 2009 20:02:47 -0700 > To: Boris Shpolyansky > Cc: Roland Dreier , > Subject: RE: [ofa-general] mlx4 second port lro issue > > Hello Boris, > > On Thu, 2009-09-10 at 19:31 -0700, Boris Shpolyansky wrote: >> Are you referring to actual Ethernet frame size or to TCP message >> size? >> If the port MTU set to 1500 it will reject Ethernet frames larger than >> this size, this has nothing to do with the LRO. >> LRO is a TCP offload that improves CPU utilization on the TCP receiver >> by combining multiple packets belonging to the same TCP stream to a >> single buffer and transferring it to the TCP stack as a single large >> packet. > > Because the LRO is enabled in device driver level, when it receives > multiple tcp fragmentation packets, it will merge to a large packet to > deliver to upper layer protocol to process. However, the port1 works > well, but not for port2 when LRO enabled. > > I don't know when the packet drops since I don't have any tool rather > than tcpdump. From tcpdump I can see the large packet when I use port1, > but I can't see the large packet when I use port2 with LRO enabled. If I > disabled LRO, all packets are smaller than 1500 mtu from tcpdump, > everything works fine for both ports. I hope it's clear. 
> > Thanks > Shirley > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From mashirle at us.ibm.com Thu Sep 10 20:28:40 2009 From: mashirle at us.ibm.com (Shirley Ma) Date: Thu, 10 Sep 2009 20:28:40 -0700 Subject: [ofa-general] mlx4 second port lro issue In-Reply-To: References: Message-ID: <1252639720.4712.24.camel@localhost.localdomain> Hello Barry, On Fri, 2009-09-11 at 08:55 +0530, Barry Mavin wrote: > Is this a subnet manager issue? It's mlx4_en 10GbE, not IPoIB. It's not a subnet manager issue here. Thanks Shirley From keshetti.mahesh at gmail.com Thu Sep 10 21:02:39 2009 From: keshetti.mahesh at gmail.com (Keshetti Mahesh) Date: Fri, 11 Sep 2009 09:32:39 +0530 Subject: [ofa-general] Re: [PATCH v2] infiniband-diags/scripts: Add 'ibcheckspeed' and 'ibcheckportspeed' to scripts In-Reply-To: <20090910090213.6888b7d5.weiny2@llnl.gov> References: <829ded920909100602h78614ac0jd4eb1ee8d7a3779b@mail.gmail.com> <20090910090213.6888b7d5.weiny2@llnl.gov> Message-ID: <829ded920909102102o49f037cbhc53a849f1fcfdaa4@mail.gmail.com> My badness. I have not used 'iblinkinfo' before. So, I guess there is no need for the above script. Apart from that, I feel there should be a program/script which will first scan the fabric to find the maximum common supported width/speed and then report the warning messages of the links/ports which are configured with active width/speed less than the found value. Is there any tool already exists which does the same ? - Keshetti Mahesh On Thu, Sep 10, 2009 at 9:32 PM, Ira Weiny wrote: > Also, iblinkinfo will report links which it finds capable of either faster or wider operation.  iblinkinfo checks both ends of the link as Hal mentions.  It reports this with output like. > > Switch 0x0005ad0000092106 Cisco Switch SFS7000D: > ... 
>           7    8[  ] ==( 4X 2.5 Gbps Active/  LinkUp)==>       8   12[  ] "MT47396 Infiniscale-III Mellanox Technologies" ( Could be 5.0 Gbps) > ... > > Also the portstatus console command in OpenSM will report links which are running at "reduced speed or width".  Although this does not check the remote port. > > OpenSM $ help portstatus > portstatus [ca|switch|router] > summarize port status >   [ca|switch|router] -- limit the results to the node type specified > OpenSM $ portstatus > "ALL" port status: >   115 port(s) scanned on 9 nodes in 26 us >   85 down >   30 active >   32 at 4X >   22 at 2.5 Gbps >   8 at 5.0 Gbps >   2 at 10.0 Gbps > > Possible issues: >   2 disabled >      0x0008f10400411b18 5 (ISR9024D Voltaire) >      0x0005ad0000092106 13 (Cisco Switch SFS7000D) >   6 with reduced speed >      0x0008f10500200220 33 (Voltaire 4036 - 36 QDR ports switch) >      0x0008f10500200220 19 (Voltaire 4036 - 36 QDR ports switch) >      0x0005ad0000092106 21 (Cisco Switch SFS7000D) >      0x0005ad0000092106 20 (Cisco Switch SFS7000D) >      0x0005ad0000092106 9 (Cisco Switch SFS7000D) >      0x0005ad0000092106 8 (Cisco Switch SFS7000D) > > > Ira > > On Thu, 10 Sep 2009 09:23:35 -0400 > Hal Rosenstock wrote: > >> On Thu, Sep 10, 2009 at 9:02 AM, Keshetti Mahesh >> wrote: >> >> > Added 'ibcheckspeed' and 'ibcheckportspeed': Similar to >> > 'ibcheckwidth/ibcheckportwidth' in functionality and implementation. >> > Reports error/warning messages if the LinkSpeedActive is configured as >> > 2.5 Gbps when the LinkSpeedSupported is more than 2.5 Gbps. >> > >> >> ibportstate checks for more than this in terms of speed (and width) >> anomalies. >> >> Would it be better for these scripts to use that tool now ? Alternatively, >> the additional speed/width anomaly checks could be implemented in these >> scripts but it does involve checking the peer port so there's a little more >> to it. 
>> >> -- Hal >> >> >> > >> > Signed-off-by: Keshetti Mahesh < keshetti.mahesh at gmail.com> >> > --- >> >  infiniband-diags/scripts/ibcheckportspeed.in |  146 >> > ++++++++++++++++++++++++++ >> >  infiniband-diags/scripts/ibcheckportwidth.in |    2 +- >> >  infiniband-diags/scripts/ibcheckspeed.in     |  135 >> > ++++++++++++++++++++++++ >> >  3 files changed, 282 insertions(+), 1 deletions(-) >> >  create mode 100644 infiniband-diags/scripts/ibcheckportspeed.in >> >  create mode 100644 infiniband-diags/scripts/ibcheckspeed.in >> > >> >> > > > -- > Ira Weiny > Math Programmer/Computer Scientist > Lawrence Livermore National Lab > 925-423-8008 > weiny2 at llnl.gov > From rdreier at cisco.com Thu Sep 10 21:23:26 2009 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 10 Sep 2009 21:23:26 -0700 Subject: [ofa-general] [GIT PULL] please pull infiniband.git Message-ID: Linus, please pull from master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git for-linus This tree is also available from kernel.org mirrors at: git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git for-linus This will get the batch of RDMA/InfiniBand changes for the 2.6.32 merge window: Alexander Schmidt (1): IB/ehca: Make port autodetect mode the default Arputham Benjamin (2): mlx4_core: Distinguish multiple devices in /proc/interrupts IB/mthca: Distinguish multiple devices in /proc/interrupts Chien Tung (1): RDMA/nes: Map MTU to IB_MTU_* and correctly report link state Don Wood (10): RDMA/nes: Update refcnt during disconnect RDMA/nes: Allocate work item for disconnect event handling RDMA/nes: Change memory allocation for cqp request to GFP_ATOMIC RDMA/nes: Clean out CQ completions when QP is destroyed RDMA/nes: Add CQ error handling RDMA/nes: Implement Terminate Packet RDMA/nes: Use flush mechanism to set status for wqe in error RDMA/nes: Make poll_cq return correct number of wqes during flush RDMA/nes: Use the flush code to fill in cqe error RDMA/nes: Rework the disconn 
routine for terminate and flushing Hal Rosenstock (1): IB/mad: Allow tuning of QP0 and QP1 sizes Jack Morgenstein (3): IB/uverbs: Return ENOSYS for unimplemented commands (not EINVAL) IB/mlx4: Don't allow userspace open while recovering from catastrophic error IB/mthca: Don't allow userspace open while recovering from catastrophic error Jason Gunthorpe (1): IPoIB: Check multicast address format Joachim Fenkes (2): IB/ehca: Construct MAD redirect replies from request MAD IB/ehca: Fix CQE flags reporting Marcin Slusarz (1): IB: Use printk_once() for driver versions Roel Kluin (2): IB/ipath: strncpy() doesn't always NUL-terminate RDMA/amso1100: Check kmalloc() result in c2_register_device() Roland Dreier (15): IPoIB: Remove unused includes IPoIB: Drop priv->lock before calling ipoib_send() IB/mad: Check hop count field in directed route MAD to avoid array overflow IB: Use DEFINE_SPINLOCK() for static spinlocks mlx4_core: Use pci_request_regions() mlx4_core: Remove unnecessary includes of IB/mlx4: Annotate CQ locking mlx4_core: Allocate and map sufficient ICM memory for EQ context IB/mthca: Remove unnecessary include of IB/mthca: Remove unnecessary include of IB/mthca: Annotate CQ locking IB/mad: Fix possible lock-lock-timer deadlock MAINTAINERS: InfiniBand/RDMA mailing list transition to vger Merge branches 'cxgb3', 'ehca', 'ipath', 'ipoib', 'misc', 'mlx4', 'mthca' and 'nes' into for-linus Merge branch 'mad' into for-linus Steve Wise (8): RDMA/cxgb3: iwch_unregister_device leaks memory RDMA/cxgb3: Set the appropriate IO channel in rdma_init work requests RDMA/cxgb3: Handle port events properly RDMA/cxgb3: Don't free endpoints early RDMA/cxgb3: Wake up any waiters on peer close/abort RDMA/cxgb3: Don't ignore insert_handle() failures RDMA/cxgb3: Clean up properly on FW mismatch failures RDMA/iwcm: Reject the connection when the cm_id is destroyed Tobias Klauser (1): RDMA/amso1100: Use %pM conversion specifier Yevgeny Petrilin (1): mlx4_core: Avoid double free_icms Yossi 
Etigin (1): IB/core: Fix send multicast group leave retry MAINTAINERS | 12 +- drivers/infiniband/core/iwcm.c | 1 + drivers/infiniband/core/mad.c | 35 +- drivers/infiniband/core/mad_priv.h | 3 + drivers/infiniband/core/multicast.c | 10 +- drivers/infiniband/core/sa_query.c | 7 +- drivers/infiniband/core/smi.c | 8 + drivers/infiniband/core/uverbs_main.c | 10 +- drivers/infiniband/hw/amso1100/c2.c | 6 +- drivers/infiniband/hw/amso1100/c2_provider.c | 24 +- drivers/infiniband/hw/cxgb3/cxio_hal.c | 5 +- drivers/infiniband/hw/cxgb3/cxio_wr.h | 6 + drivers/infiniband/hw/cxgb3/iwch.c | 37 +- drivers/infiniband/hw/cxgb3/iwch_cm.c | 68 ++- drivers/infiniband/hw/cxgb3/iwch_cm.h | 9 +- drivers/infiniband/hw/cxgb3/iwch_mem.c | 21 +- drivers/infiniband/hw/cxgb3/iwch_provider.c | 52 ++- drivers/infiniband/hw/cxgb3/iwch_qp.c | 1 + drivers/infiniband/hw/ehca/ehca_main.c | 8 +- drivers/infiniband/hw/ehca/ehca_reqs.c | 6 +- drivers/infiniband/hw/ehca/ehca_sqp.c | 47 ++- drivers/infiniband/hw/ipath/ipath_file_ops.c | 2 +- drivers/infiniband/hw/ipath/ipath_mad.c | 2 +- drivers/infiniband/hw/mlx4/main.c | 12 +- drivers/infiniband/hw/mlx4/mlx4_ib.h | 1 + drivers/infiniband/hw/mlx4/qp.c | 12 +- drivers/infiniband/hw/mthca/mthca_catas.c | 1 + drivers/infiniband/hw/mthca/mthca_config_reg.h | 2 - drivers/infiniband/hw/mthca/mthca_dev.h | 1 + drivers/infiniband/hw/mthca/mthca_eq.c | 17 +- drivers/infiniband/hw/mthca/mthca_main.c | 8 +- drivers/infiniband/hw/mthca/mthca_provider.c | 3 + drivers/infiniband/hw/mthca/mthca_provider.h | 1 + drivers/infiniband/hw/mthca/mthca_qp.c | 12 +- drivers/infiniband/hw/mthca/mthca_reset.c | 1 - drivers/infiniband/hw/nes/nes.h | 2 +- drivers/infiniband/hw/nes/nes_cm.c | 128 ++-- drivers/infiniband/hw/nes/nes_cm.h | 2 - drivers/infiniband/hw/nes/nes_hw.c | 767 +++++++++++++++++------- drivers/infiniband/hw/nes/nes_hw.h | 103 ++++ drivers/infiniband/hw/nes/nes_utils.c | 5 +- drivers/infiniband/hw/nes/nes_verbs.c | 204 +++++-- 
drivers/infiniband/hw/nes/nes_verbs.h | 16 +- drivers/infiniband/ulp/ipoib/ipoib_cm.c | 1 - drivers/infiniband/ulp/ipoib/ipoib_ib.c | 1 - drivers/infiniband/ulp/ipoib/ipoib_main.c | 7 +- drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 21 + drivers/net/cxgb3/cxgb3_main.c | 6 +- drivers/net/cxgb3/cxgb3_offload.c | 6 +- drivers/net/cxgb3/cxgb3_offload.h | 8 +- drivers/net/mlx4/cq.c | 1 - drivers/net/mlx4/eq.c | 77 +-- drivers/net/mlx4/icm.c | 1 - drivers/net/mlx4/main.c | 37 +- drivers/net/mlx4/mcg.c | 1 - drivers/net/mlx4/mlx4.h | 7 +- drivers/net/mlx4/mr.c | 1 - drivers/net/mlx4/pd.c | 1 - drivers/net/mlx4/profile.c | 2 - drivers/net/mlx4/qp.c | 2 - drivers/net/mlx4/reset.c | 1 - drivers/net/mlx4/srq.c | 2 - drivers/scsi/cxgb3i/cxgb3i_init.c | 12 +- 63 files changed, 1278 insertions(+), 595 deletions(-) From rdreier at cisco.com Thu Sep 10 21:38:22 2009 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 10 Sep 2009 21:38:22 -0700 Subject: [ofa-general] [GIT PULL] please pull ummunotify Message-ID: Linus, please consider pulling from master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git ummunotify This tree is also available from kernel.org mirrors at: git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git ummunotify This will get "ummunotify," a new character device that allows a userspace library to register for MMU notifications; this is particularly useful for MPI implementions (message passing libraries used in HPC) to be able to keep track of what wacky things consumers do to their memory mappings. My colleague Jeff Squyres from the Open MPI project posted a blog entry about why MPI wants this: http://blogs.cisco.com/ciscotalk/performance/comments/better_linux_memory_tracking/ His summary of ummunotify: "It’s elegant, doesn’t require strange linker tricks, and seems to work in all cases. Yay!" This code went through several review iterations on lkml and was in -mm and -next for quite a few weeks. 
Andrew is OK with merging it (I think -- Andrew please correct me if I misunderstood you). Roland Dreier (1): ummunotify: Userspace support for MMU notifications Documentation/Makefile | 3 +- Documentation/ummunotify/Makefile | 7 + Documentation/ummunotify/ummunotify.txt | 150 ++++++++ Documentation/ummunotify/umn-test.c | 200 +++++++++++ drivers/char/Kconfig | 12 + drivers/char/Makefile | 1 + drivers/char/ummunotify.c | 566 +++++++++++++++++++++++++++++++ include/linux/Kbuild | 1 + include/linux/ummunotify.h | 121 +++++++ 9 files changed, 1060 insertions(+), 1 deletions(-) create mode 100644 Documentation/ummunotify/Makefile create mode 100644 Documentation/ummunotify/ummunotify.txt create mode 100644 Documentation/ummunotify/umn-test.c create mode 100644 drivers/char/ummunotify.c create mode 100644 include/linux/ummunotify.h From Barry.Mavin at recital.com Thu Sep 10 22:38:15 2009 From: Barry.Mavin at recital.com (Barry Mavin) Date: Fri, 11 Sep 2009 11:08:15 +0530 Subject: [ofa-general] Re: [PATCH v2] infiniband-diags/scripts: Add 'ibcheckspeed' and 'ibcheckportspeed' to scripts In-Reply-To: <829ded920909102102o49f037cbhc53a849f1fcfdaa4@mail.gmail.com> Message-ID: I use mellanox connectx cards and switches in a cluster. When I try and use ibtracert I get this output. # ibtracert 10.10.10.1 10.10.10.3 ibwarn: [6998] _do_madrpc: recv failed: Connection timed out ibwarn: [6998] mad_rpc: _do_madrpc failed; dport (Lid 10) ibwarn: [6998] find_route: can't reach to/from ports ibtracert: iberror: failed: can't find a route to the src port Does anyone have any idea why this would be happening? --- Regards Barry Mavin Recital Corporation > From: Keshetti Mahesh > Date: Fri, 11 Sep 2009 09:32:39 +0530 > To: Ira Weiny > Cc: OFED mailing list , OFED mailing list > > Subject: Re: [ofa-general] Re: [PATCH v2] infiniband-diags/scripts: Add > 'ibcheckspeed' and 'ibcheckportspeed' to scripts > > My badness. I have not used 'iblinkinfo' before. 
> So, I guess there is no need for the above script. Apart from that, I feel > there should be a program/script which will first scan the fabric to find the > maximum common supported width/speed and then report the warning messages > of the links/ports which are configured with active width/speed less > than the found > value. Is there any tool already exists which does the same ? > > - > Keshetti Mahesh > > On Thu, Sep 10, 2009 at 9:32 PM, Ira Weiny wrote: >> Also, iblinkinfo will report links which it finds capable of either faster or >> wider operation.  iblinkinfo checks both ends of the link as Hal mentions. >>  It reports this with output like. >> >> Switch 0x0005ad0000092106 Cisco Switch SFS7000D: >> ... >>           7    8[  ] ==( 4X 2.5 Gbps Active/  LinkUp)==>       8   12[  ] >> "MT47396 Infiniscale-III Mellanox Technologies" ( Could be 5.0 Gbps) >> ... >> >> Also the portstatus console command in OpenSM will report links which are >> running at "reduced speed or width".  Although this does not check the remote >> port. 
>> >> OpenSM $ help portstatus >> portstatus [ca|switch|router] >> summarize port status >>   [ca|switch|router] -- limit the results to the node type specified >> OpenSM $ portstatus >> "ALL" port status: >>   115 port(s) scanned on 9 nodes in 26 us >>   85 down >>   30 active >>   32 at 4X >>   22 at 2.5 Gbps >>   8 at 5.0 Gbps >>   2 at 10.0 Gbps >> >> Possible issues: >>   2 disabled >>      0x0008f10400411b18 5 (ISR9024D Voltaire) >>      0x0005ad0000092106 13 (Cisco Switch SFS7000D) >>   6 with reduced speed >>      0x0008f10500200220 33 (Voltaire 4036 - 36 QDR ports switch) >>      0x0008f10500200220 19 (Voltaire 4036 - 36 QDR ports switch) >>      0x0005ad0000092106 21 (Cisco Switch SFS7000D) >>      0x0005ad0000092106 20 (Cisco Switch SFS7000D) >>      0x0005ad0000092106 9 (Cisco Switch SFS7000D) >>      0x0005ad0000092106 8 (Cisco Switch SFS7000D) >> >> >> Ira >> >> On Thu, 10 Sep 2009 09:23:35 -0400 >> Hal Rosenstock wrote: >> >>> On Thu, Sep 10, 2009 at 9:02 AM, Keshetti Mahesh >>> wrote: >>> >>>> Added 'ibcheckspeed' and 'ibcheckportspeed': Similar to >>>> 'ibcheckwidth/ibcheckportwidth' in functionality and implementation. >>>> Reports error/warning messages if the LinkSpeedActive is configured as >>>> 2.5 Gbps when the LinkSpeedSupported is more than 2.5 Gbps. >>>> >>> >>> ibportstate checks for more than this in terms of speed (and width) >>> anomalies. >>> >>> Would it be better for these scripts to use that tool now ? Alternatively, >>> the additional speed/width anomaly checks could be implemented in these >>> scripts but it does involve checking the peer port so there's a little more >>> to it. 
>>> >>> -- Hal >>> >>> >>>> >>>> Signed-off-by: Keshetti Mahesh < keshetti.mahesh at gmail.com> >>>> --- >>>>  infiniband-diags/scripts/ibcheckportspeed.in |  146 >>>> ++++++++++++++++++++++++++ >>>>  infiniband-diags/scripts/ibcheckportwidth.in |    2 +- >>>>  infiniband-diags/scripts/ibcheckspeed.in     |  135 >>>> ++++++++++++++++++++++++ >>>>  3 files changed, 282 insertions(+), 1 deletions(-) >>>>  create mode 100644 infiniband-diags/scripts/ibcheckportspeed.in >>>>  create mode 100644 infiniband-diags/scripts/ibcheckspeed.in >>>> >>> >>> >> >> >> -- >> Ira Weiny >> Math Programmer/Computer Scientist >> Lawrence Livermore National Lab >> 925-423-8008 >> weiny2 at llnl.gov >> > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From keshetti.mahesh at gmail.com Thu Sep 10 22:46:44 2009 From: keshetti.mahesh at gmail.com (Keshetti Mahesh) Date: Fri, 11 Sep 2009 11:16:44 +0530 Subject: [ofa-general] Re: [PATCH v2] infiniband-diags/scripts: Add 'ibcheckspeed' and 'ibcheckportspeed' to scripts In-Reply-To: <829ded920909102102o49f037cbhc53a849f1fcfdaa4@mail.gmail.com> References: <829ded920909100602h78614ac0jd4eb1ee8d7a3779b@mail.gmail.com> <20090910090213.6888b7d5.weiny2@llnl.gov> <829ded920909102102o49f037cbhc53a849f1fcfdaa4@mail.gmail.com> Message-ID: <829ded920909102246o18b25caagf84bfe352f699a82@mail.gmail.com> > # ibtracert 10.10.10.1 10.10.10.3 ibtracert only supports source/destination addresses to be specified in LID/GUID format. See man page of ibtracert. 
-
Keshetti Mahesh

From rdreier at cisco.com Thu Sep 10 23:03:25 2009
From: rdreier at cisco.com (Roland Dreier)
Date: Thu, 10 Sep 2009 23:03:25 -0700
Subject: [ofa-general] Re: [GIT PULL] please pull ummunotify
In-Reply-To: <20090911145036.DB65.A69D9226@jp.fujitsu.com> (KOSAKI Motohiro's message of "Fri, 11 Sep 2009 14:56:13 +0900 (JST)")
References: <20090911145036.DB65.A69D9226@jp.fujitsu.com>
Message-ID:

> Has this version already solved the fork() + COW issue? If so, could you
> please explain what happens at fork? Obviously RDMA points to either the
> parent or the child page, not both, but the current COW rule is that the
> first process to touch the page gets the copied page and the other process
> still owns the original page. I think that's unexpected behavior from RDMA.

No, ummunotify doesn't really help that much with fork() + COW. If a
parent forks and then touches pages that are actively in use for RDMA,
then of course they get COWed and RDMA goes to the wrong memory (from
the point of view of the parent).

ummunotify does deal with the case where a process forks and touches
memory that was used for RDMA but no longer is -- in that case, the MPI
library has a chance to flush its registration cache because it will
get a ummunotify event invalidating the old mapping.

The real purpose of ummunotify is to allow MPI implementations to cache
registrations, even when the MPI library is used with an application
that does funny things for allocation (mmap()/munmap() or brk(), etc).

 - Roland

From Barry.Mavin at recital.com Thu Sep 10 23:13:40 2009
From: Barry.Mavin at recital.com (Barry Mavin)
Date: Fri, 11 Sep 2009 11:43:40 +0530
Subject: [ofa-general] Re: [PATCH v2] infiniband-diags/scripts: Add 'ibcheckspeed' and 'ibcheckportspeed' to scripts
In-Reply-To: <829ded920909102246o18b25caagf84bfe352f699a82@mail.gmail.com>
Message-ID:

When I start the subnet manager on redhat 5.3 with:

# service opensm restart

I get the following messages in the log file.
Sep 11 11:41:46 252576 [4EF83940] 0x01 -> __osm_mcmr_rcv_join_mgrp: ERR 1B12: __validate_more_comp_fields, __validate_port_caps, or JoinState = 0 failed from port 0x0002c90300047b91 (ibas1 HCA-1), sending IB_SA_MAD_STATUS_REQ_INVALID Sep 11 11:41:46 252688 [4EF83940] 0x01 -> __osm_mcmr_rcv_join_mgrp: ERR 1B12: __validate_more_comp_fields, __validate_port_caps, or JoinState = 0 failed from port 0x0002c90300044c61 (ibds2 HCA-1), sending IB_SA_MAD_STATUS_REQ_INVALID Sep 11 11:41:46 252731 [4EF83940] 0x01 -> __osm_mcmr_rcv_join_mgrp: ERR 1B12: __validate_more_comp_fields, __validate_port_caps, or JoinState = 0 failed from port 0x0002c90300047b7d (ibds1 HCA-1), sending IB_SA_MAD_STATUS_REQ_INVALID Sep 11 11:41:46 252929 [43170940] 0x01 -> __osm_mcmr_rcv_join_mgrp: ERR 1B12: __validate_more_comp_fields, __validate_port_caps, or JoinState = 0 failed from port 0x0002c90300047b75 (ibas2 HCA-1), sending IB_SA_MAD_STATUS_REQ_INVALID What is the cause of these? --- Regards Barry Mavin Recital Corporation Chairman and CEO Website: http://www.recital.com MSN Messenger: Barry_Mavin at msn.com Skype: BarryMavin Direct line worldwide: +1 9785224139 > From: Keshetti Mahesh > Date: Fri, 11 Sep 2009 11:16:44 +0530 > To: Barry Mavin > Cc: OFED mailing list , OFED mailing list > > Subject: Re: [ofa-general] Re: [PATCH v2] infiniband-diags/scripts: Add > 'ibcheckspeed' and 'ibcheckportspeed' to scripts > >> # ibtracert 10.10.10.1 10.10.10.3 > > ibtracert only supports source/destination addresses to be specified > in LID/GUID format. See man page of ibtracert. 
>
> -
> Keshetti Mahesh

From rdreier at cisco.com Thu Sep 10 23:22:20 2009
From: rdreier at cisco.com (Roland Dreier)
Date: Thu, 10 Sep 2009 23:22:20 -0700
Subject: [ofa-general] Re: [GIT PULL] please pull ummunotify
In-Reply-To: <4AA9EAF7.5010401@inria.fr> (Brice Goglin's message of "Fri, 11 Sep 2009 08:15:19 +0200")
References: <20090911145036.DB65.A69D9226@jp.fujitsu.com> <4AA9EAF7.5010401@inria.fr>
Message-ID:

> My understanding of the code is that fork will end-up calling
> copy_page_range() on all VMA, and copy_page_range() calls
> mmu_notifier_invalidate_range_start() if is_cow_mapping() is true,
> which should be the case here. So you should get some invalidate events
> on fork.

Yes, I agree (that's what the second half of my email tried to say).

However, that doesn't help if the parent process is actively doing RDMA
on the range being invalidated -- the MPI library or whatever will get
the invalidate event via ummunotify, but what can it do? The event is
basically saying "your data is going to the wrong place" and I don't
see what useful thing MPI could do with that.

As I said, it does mean that MPI can invalidate cached registrations for
COWed memory, which might be useful in case a parent forks and then
touches memory it used to use for RDMA, but I think that's the easier
part of the fork/COW problem.

 - R.
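
[Illustration, not part of the original thread.] The registration-cache
use case discussed above can be sketched in user space. This is only a
minimal sketch under stated assumptions: it does not use the ummunotify
interface and is not Open MPI's code. The names RegistrationCache,
lookup_or_register and invalidate are hypothetical, and the integer
handle merely stands in for a real memory registration. It shows just
the bookkeeping: an invalidate event for an address range drops every
overlapping cached entry, so the next use registers again instead of
reusing a stale mapping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Registration:
    start: int    # first byte of the registered range
    length: int   # length of the range in bytes
    handle: int   # hypothetical stand-in for a real registration handle

    @property
    def end(self):
        # Exclusive end address of the range.
        return self.start + self.length


class RegistrationCache:
    """Hypothetical cache keyed by (start, length); illustration only."""

    def __init__(self):
        self._entries = {}
        self._next_handle = 1
        self.registrations_made = 0

    def lookup_or_register(self, start, length):
        # Reuse a cached registration when one exists, else "register".
        key = (start, length)
        reg = self._entries.get(key)
        if reg is None:
            reg = Registration(start, length, self._next_handle)
            self._next_handle += 1
            self.registrations_made += 1
            self._entries[key] = reg
        return reg

    def invalidate(self, start, end):
        # Drop every cached entry that overlaps [start, end).
        stale = [key for key, reg in self._entries.items()
                 if reg.start < end and start < reg.end]
        for key in stale:
            del self._entries[key]
        return len(stale)


def demo():
    cache = RegistrationCache()
    a = cache.lookup_or_register(0x1000, 0x2000)   # covers 0x1000..0x3000
    b = cache.lookup_or_register(0x8000, 0x1000)   # covers 0x8000..0x9000
    assert cache.lookup_or_register(0x1000, 0x2000) is a   # cache hit
    dropped = cache.invalidate(0x2000, 0x4000)     # overlaps only `a`
    c = cache.lookup_or_register(0x1000, 0x2000)   # registered again
    return dropped, a.handle, b.handle, c.handle, cache.registrations_made
```

In this sketch the invalidated range is treated as half-open, and any
overlap at all discards the whole cached entry; a real library might
instead split or shrink entries, which is a design choice this example
does not attempt.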
From jgunthorpe at obsidianresearch.com Thu Sep 10 23:40:19 2009 From: jgunthorpe at obsidianresearch.com (Jason Gunthorpe) Date: Fri, 11 Sep 2009 00:40:19 -0600 Subject: [ofa-general] Re: [GIT PULL] please pull ummunotify In-Reply-To: References: <20090911145036.DB65.A69D9226@jp.fujitsu.com> <4AA9EAF7.5010401@inria.fr> Message-ID: <20090911064019.GZ4973@obsidianresearch.com> On Thu, Sep 10, 2009 at 11:22:20PM -0700, Roland Dreier wrote: > As I said, it does mean that MPI can invalidate cached registrations for > COWed memory, which might be useful in case a parent forks and then > touches memory it used to use for RDMA, but I think that's the easier > part of the fork/COW problem. What happens to all the other IB resources (PD, CQ, QP, etc) on fork? AFAIK, pretty much by design the IB stack cannot/does not duplicate these objects. The natural consequence is that a PD is always associated with a single process at a time, thus a memory registration which is associated with a PD must also be associated with a single process. So.. What is the problem with fork? The semantics of what should happen seem natural enough to me, the PD doesn't get copied to the child, so the MR stays with the parent. COW events on the pinned region must be resolved so that the physical page stays with the process that has pinned it - the pin is logically released in the child because the MR doesn't exist because the PD doesn't exist. Is this a general problem with the MR mechanism? If I mmap(MAP_SHARED|MAP_READONLY) and someone mmaps(MAP_PRIVATE|MAP_WRITE) on the same file I can generate COW events - will this make RDMAs go randomly too?? 
Jason From vlad at lists.openfabrics.org Fri Sep 11 03:08:24 2009 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Fri, 11 Sep 2009 03:08:24 -0700 (PDT) Subject: [ofa-general] ofa_1_5_kernel 20090911-0200 daily build status Message-ID: <20090911100824.DA94EE60AA0@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git git_branch: ofed_kernel_1_5 Common build parameters: Passed: Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.26 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.27 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18-128.el5 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.27 Passed on x86_64 with linux-2.6.9-78.ELsmp Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.25 Passed on ia64 with linux-2.6.26 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Failed: From hnrose at comcast.net Fri Sep 11 04:34:06 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Fri, 11 Sep 2009 07:34:06 -0400 Subject: [ofa-general] [PATCH] opensm/opensm.8.in: Cosmetic formatting change Message-ID: <20090911113406.GA14150@comcast.net> Signed-off-by: Hal Rosenstock --- diff --git a/opensm/man/opensm.8.in b/opensm/man/opensm.8.in index 5ad7631..fcdc168 100644 --- 
a/opensm/man/opensm.8.in +++ b/opensm/man/opensm.8.in @@ -257,7 +257,6 @@ This option provides the means to define a set of ports equalization algorithm. .TP \fB\-w\fR, \fB\-\-hop_weights_file\fR - This option provides weighting factors per port representing a hop cost in computing the lid matrix. The file consists of lines containing a switch port GUID (specified as a 64 bit hex number, with leading 0x), output port number, @@ -265,7 +264,6 @@ and weighting factor. Any port not listed in the file defaults to a weighting factor of 1. Lines starting with # are comments. Weights affect only the output route from the port, so many useful configurations will require weights to be specified in pairs. - .TP \fB\-x\fR, \fB\-\-honor_guid2lid\fR This option forces OpenSM to honor the guid2lid file, From hal.rosenstock at gmail.com Fri Sep 11 06:20:58 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Fri, 11 Sep 2009 09:20:58 -0400 Subject: [ofa-general] Re: [PATCH v2] infiniband-diags/scripts: Add 'ibcheckspeed' and 'ibcheckportspeed' to scripts In-Reply-To: References: <829ded920909102246o18b25caagf84bfe352f699a82@mail.gmail.com> Message-ID: On Fri, Sep 11, 2009 at 2:13 AM, Barry Mavin wrote: > When I start the subnet manager on redhat 5.3 with: > > # service opensm restart > > I get the following messages in the log file. 
> > Sep 11 11:41:46 252576 [4EF83940] 0x01 -> __osm_mcmr_rcv_join_mgrp: ERR > 1B12: __validate_more_comp_fields, __validate_port_caps, or JoinState = 0 > failed from port 0x0002c90300047b91 (ibas1 HCA-1), sending > IB_SA_MAD_STATUS_REQ_INVALID > Sep 11 11:41:46 252688 [4EF83940] 0x01 -> __osm_mcmr_rcv_join_mgrp: ERR > 1B12: __validate_more_comp_fields, __validate_port_caps, or JoinState = 0 > failed from port 0x0002c90300044c61 (ibds2 HCA-1), sending > IB_SA_MAD_STATUS_REQ_INVALID > Sep 11 11:41:46 252731 [4EF83940] 0x01 -> __osm_mcmr_rcv_join_mgrp: ERR > 1B12: __validate_more_comp_fields, __validate_port_caps, or JoinState = 0 > failed from port 0x0002c90300047b7d (ibds1 HCA-1), sending > IB_SA_MAD_STATUS_REQ_INVALID > Sep 11 11:41:46 252929 [43170940] 0x01 -> __osm_mcmr_rcv_join_mgrp: ERR > 1B12: __validate_more_comp_fields, __validate_port_caps, or JoinState = 0 > failed from port 0x0002c90300047b75 (ibas2 HCA-1), sending > IB_SA_MAD_STATUS_REQ_INVALID > > What is the cause of these? > Those ports are unable to join some multicast group likely due to rate or MTU mismatch with the group. What are their rates/MTUs ? See opensm man page on partition configuration for the default partition for information on how to change the MTU/rate. -- Hal > > --- > Regards > Barry Mavin > Recital Corporation > Chairman and CEO > Website: http://www.recital.com > MSN Messenger: Barry_Mavin at msn.com > Skype: BarryMavin > Direct line worldwide: +1 9785224139 > > > > > From: Keshetti Mahesh > > Date: Fri, 11 Sep 2009 11:16:44 +0530 > > To: Barry Mavin > > Cc: OFED mailing list , OFED mailing list > > > > Subject: Re: [ofa-general] Re: [PATCH v2] infiniband-diags/scripts: Add > > 'ibcheckspeed' and 'ibcheckportspeed' to scripts > > > >> # ibtracert 10.10.10.1 10.10.10.3 > > > > ibtracert only supports source/destination addresses to be specified > > in LID/GUID format. See man page of ibtracert. 
> > > > - > > Keshetti Mahesh > > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo at vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Barry.Mavin at recital.com Fri Sep 11 06:34:40 2009 From: Barry.Mavin at recital.com (Barry Mavin) Date: Fri, 11 Sep 2009 19:04:40 +0530 Subject: [ofa-general] Re: [PATCH v2] infiniband-diags/scripts: Add 'ibcheckspeed' and 'ibcheckportspeed' to scripts In-Reply-To: Message-ID: Thanks that resolved it --- Regards Barry Mavin Recital Corporation From: Hal Rosenstock Date: Fri, 11 Sep 2009 09:20:58 -0400 To: Barry Mavin Cc: Keshetti Mahesh , OFED mailing list , OFED mailing list Subject: Re: [ofa-general] Re: [PATCH v2] infiniband-diags/scripts: Add 'ibcheckspeed' and 'ibcheckportspeed' to scripts On Fri, Sep 11, 2009 at 2:13 AM, Barry Mavin wrote: > When I start the subnet manager on redhat 5.3 with: > > # service opensm restart > > I get the following messages in the log file. 
> > Sep 11 11:41:46 252576 [4EF83940] 0x01 -> __osm_mcmr_rcv_join_mgrp: ERR > 1B12: __validate_more_comp_fields, __validate_port_caps, or JoinState = 0 > failed from port 0x0002c90300047b91 (ibas1 HCA-1), sending > IB_SA_MAD_STATUS_REQ_INVALID > Sep 11 11:41:46 252688 [4EF83940] 0x01 -> __osm_mcmr_rcv_join_mgrp: ERR > 1B12: __validate_more_comp_fields, __validate_port_caps, or JoinState = 0 > failed from port 0x0002c90300044c61 (ibds2 HCA-1), sending > IB_SA_MAD_STATUS_REQ_INVALID > Sep 11 11:41:46 252731 [4EF83940] 0x01 -> __osm_mcmr_rcv_join_mgrp: ERR > 1B12: __validate_more_comp_fields, __validate_port_caps, or JoinState = 0 > failed from port 0x0002c90300047b7d (ibds1 HCA-1), sending > IB_SA_MAD_STATUS_REQ_INVALID > Sep 11 11:41:46 252929 [43170940] 0x01 -> __osm_mcmr_rcv_join_mgrp: ERR > 1B12: __validate_more_comp_fields, __validate_port_caps, or JoinState = 0 > failed from port 0x0002c90300047b75 (ibas2 HCA-1), sending > IB_SA_MAD_STATUS_REQ_INVALID > > What is the cause of these?    Those ports are unable to join some multicast group likely due to rate or MTU mismatch with the group. What are their rates/MTUs ? See opensm man page on partition configuration for the default partition for information on how to change the MTU/rate.   -- Hal   > > --- > Regards > Barry Mavin > Recital Corporation > Chairman and CEO > Website: http://www.recital.com > MSN Messenger: Barry_Mavin at msn.com > Skype: BarryMavin > Direct line worldwide: +1 9785224139 > > > >> > From: Keshetti Mahesh >> > Date: Fri, 11 Sep 2009 11:16:44 +0530 >> > To: Barry Mavin >> > Cc: OFED mailing list , OFED mailing list >> > >> > Subject: Re: [ofa-general] Re: [PATCH v2] infiniband-diags/scripts: Add >> > 'ibcheckspeed' and 'ibcheckportspeed' to scripts >> > >>> >> # ibtracert 10.10.10.1 10.10.10.3 >> > >> > ibtracert only supports source/destination addresses to be specified >> > in LID/GUID format. See man page of ibtracert. 
>> > >> > - >> > Keshetti Mahesh > > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo at vger.kernel.org > More majordomo info at  http://vger.kernel.org/majordomo-info.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From sashak at voltaire.com Fri Sep 11 08:18:37 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Fri, 11 Sep 2009 18:18:37 +0300 Subject: [ofa-general] question about partitioning IB networks In-Reply-To: References: <6203933669E90E4AB42B5BC4EDE38D350C7D048C32@orsmsx510.amr.corp.intel.com> Message-ID: <20090911151837.GB17481@me> On 09:17 Thu 03 Sep , Hal Rosenstock wrote: > > Also it says the default partition will be created “unconditionally even > > when partition configuration file does not exist or cannot be accessed.” > > Will it also be created if the partition configuration file exists but does > > not have a default partition defined? > > > No. AFAIR OpenSM will create the default partition even before partitions config file parsing, when the file exists it will be: Default=0x7fff: ALL=limited, SELF=full; (no IPoIB there). And this can be overwritten by configuration. Sasha From sashak at voltaire.com Fri Sep 11 08:21:40 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Fri, 11 Sep 2009 18:21:40 +0300 Subject: [ofa-general] question about partitioning IB networks In-Reply-To: <4A9FC81F.4040009@dev.mellanox.co.il> References: <6203933669E90E4AB42B5BC4EDE38D350C7D048C32@orsmsx510.amr.corp.intel.com> <4A9FC81F.4040009@dev.mellanox.co.il> Message-ID: <20090911152140.GC17481@me> On 16:43 Thu 03 Sep , Yevgeny Kliteynik wrote: > > > > I've never done it this way but it does look like the partition create > > code will detect the duplicated partitions (0x111 and 0x112) and merge > > ports from rack2 with rack1 and rack4 with rack3. > > It will. > Note that partition names are meaningless in terms of IB management. 
> Basically they are used just for logging.

Not only for logging: OpenSM will also try to preserve the same PKey
value for a partition name across reconfiguration, and I guess that is all.

Sasha

From rdreier at cisco.com Fri Sep 11 09:58:10 2009
From: rdreier at cisco.com (Roland Dreier)
Date: Fri, 11 Sep 2009 09:58:10 -0700
Subject: [ofa-general] Re: [GIT PULL] please pull ummunotify
In-Reply-To: <20090911064019.GZ4973@obsidianresearch.com> (Jason Gunthorpe's message of "Fri, 11 Sep 2009 00:40:19 -0600")
References: <20090911145036.DB65.A69D9226@jp.fujitsu.com> <4AA9EAF7.5010401@inria.fr> <20090911064019.GZ4973@obsidianresearch.com>
Message-ID:

> So.. What is the problem with fork? The semantics of what should
> happen seem natural enough to me, the PD doesn't get copied to the
> child, so the MR stays with the parent. COW events on the pinned
> region must be resolved so that the physical page stays with the
> process that has pinned it - the pin is logically released in the
> child because the MR doesn't exist because the PD doesn't exist.

This is getting away from the problem that ummunotify is solving, but
handling a COW fault generated by the parent by doing the copy in the
child seems like a pretty major, tricky change to make. The child may
have forked 100 more times in the meantime, meaning we now have to
change 101 memory maps ... the cost of page faults goes through the
roof probably...

 - R.

From gleb at redhat.com Fri Sep 11 09:42:47 2009
From: gleb at redhat.com (Gleb Natapov)
Date: Fri, 11 Sep 2009 19:42:47 +0300
Subject: [ofa-general] Re: [GIT PULL] please pull ummunotify
In-Reply-To: <20090911150552.DB68.A69D9226@jp.fujitsu.com>
References: <20090911145036.DB65.A69D9226@jp.fujitsu.com> <20090911150552.DB68.A69D9226@jp.fujitsu.com>
Message-ID: <20090911164247.GA6736@redhat.com>

On Fri, Sep 11, 2009 at 03:11:36PM +0900, KOSAKI Motohiro wrote:
> Hi > > Thank you explanation. > > > > > > Can I this version already solved fork() + COW issue?
if so, could you > > > please explain what happen at fork. Obviously RDMA point to either parent > > > or child page, not both. but Corrent COW rule is, first touch process > > > get copyed page and other process still own original page. I think it's > > > unpecected behavior form RDMA. > > > > No, ummunotify doesn't really help that much with fork() + COW. If a > > parent forks and then touches pages that are actively in use for RDMA, > > then of course they get COWed and RDMA goes to the wrong memory (from > > the point of view of the parent). > > So, Can we assume OpenMPI user process doesn't such thing? > > Parhaps, madvise(DONTFORK) or vfork() avoid this issue. but I'm not > sure all program in the world do that. > MPI (or is it libibverbs?) marks all registered memory as DONTFORK. -- Gleb. From hal.rosenstock at gmail.com Fri Sep 11 11:17:34 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Fri, 11 Sep 2009 14:17:34 -0400 Subject: [ofa-general] question about partitioning IB networks In-Reply-To: <20090911151837.GB17481@me> References: <6203933669E90E4AB42B5BC4EDE38D350C7D048C32@orsmsx510.amr.corp.intel.com> <20090911151837.GB17481@me> Message-ID: On Fri, Sep 11, 2009 at 11:18 AM, Sasha Khapyorsky wrote: > On 09:17 Thu 03 Sep , Hal Rosenstock wrote: > > > Also it says the default partition will be created “unconditionally > even > > > when partition configuration file does not exist or cannot be > accessed.” > > > Will it also be created if the partition configuration file exists but > does > > > not have a default partition defined? > > > > > No. > > AFAIR OpenSM will create the default partition even before partitions > config file parsing, when the file exists it will be: > > Default=0x7fff: ALL=limited, SELF=full; > > (no IPoIB there). And this can be overwritten by configuration. > Indeed it does (since the default partition is always required). The man page (and partition doc) should be updated to clarify this. 
-- Hal

>
> Sasha
>

From jenos at ncsa.uiuc.edu Fri Sep 11 11:57:59 2009
From: jenos at ncsa.uiuc.edu (Jeremy Enos)
Date: Fri, 11 Sep 2009 13:57:59 -0500
Subject: [ofa-general] Fedora 10 OFED support plans
In-Reply-To: <4AA96258.7040301@ncsa.uiuc.edu>
References: <4A8E4854.2060909@ncsa.uiuc.edu> <200908301856.33259.jackm@dev.mellanox.co.il> <4A9AB9AD.80803@ncsa.uiuc.edu> <200908311217.43954.jackm@dev.mellanox.co.il> <4AA948BE.9060806@ncsa.uiuc.edu> <20090910193454.GB7552@obsidianresearch.com> <4AA95AF8.9020905@ncsa.uiuc.edu> <4AA96258.7040301@ncsa.uiuc.edu>
Message-ID: <4AAA9DB7.9010105@ncsa.uiuc.edu>

An HTML attachment was scrubbed...
URL:

From worleys at gmail.com Fri Sep 11 12:50:21 2009
From: worleys at gmail.com (Chris Worley)
Date: Fri, 11 Sep 2009 21:50:21 +0200
Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs
In-Reply-To:
References: <4A9FA945.4070408@vlnb.net> <4AA4F561.504@vlnb.net>
Message-ID:

On Wed, Sep 9, 2009 at 6:38 PM, Bart Van Assche wrote:
> On Wed, Sep 9, 2009 at 12:29 AM, Chris Worley wrote:
>> I'm wondering why it's so easily repeatable by me, and those I work
>> with, and nobody else? I have another completely different
>> configuration w/ the same issue...
>
> It would help if you could run the following test:
> * Connect an SRP initiator back-to-back to the SRP target.

Many of my tests have been back-to-back, and using different switches,
with no effect.

> * Install an operating system combination on initiator and target with
> which SRP did not work properly under heavy load.

I'm not sure what test is being requested. I have tried many tests
using the same OS/Distro/etc... combination on both target/initiator
sides.

> * Install and run OpenSM on the target if this software is not yet
> running on the target.

I do generally avoid running it in the target, so will try this test.
> * Repeat the SRP stress test on the initiator system that is connected > back-to-back to the SRP target. Does "SRP Stress Test" mean some specific test included w/ SCST? > > This will tell us whether or not the IB switch or its firmware is > causing the SRP issues. I've definitely removed the switch/firmware from being the cause. I'm thinking the reason you can't repeat the test may be latency related. We get ~50usecs average latency (on small block sizes), which can't be achieved using regular SSD's (and rotating drives are nowhere close). Maybe a ramdisk would help repeat the issue. Thanks, Chris > > Bart. > From jenos at ncsa.uiuc.edu Fri Sep 11 21:07:38 2009 From: jenos at ncsa.uiuc.edu (Jeremy Enos) Date: Fri, 11 Sep 2009 23:07:38 -0500 Subject: [ofa-general] Fedora 10 OFED support plans In-Reply-To: <20090910235459.GD7552@obsidianresearch.com> References: <4A8E4854.2060909@ncsa.uiuc.edu> <200908301856.33259.jackm@dev.mellanox.co.il> <4A9AB9AD.80803@ncsa.uiuc.edu> <200908311217.43954.jackm@dev.mellanox.co.il> <4AA948BE.9060806@ncsa.uiuc.edu> <20090910193454.GB7552@obsidianresearch.com> <4AA95AF8.9020905@ncsa.uiuc.edu> <20090910235459.GD7552@obsidianresearch.com> Message-ID: <4AAB1E8A.3070609@ncsa.uiuc.edu> If created in the wrong place though, why isn't this a universal problem? Or is it? thx- Jeremy Jason Gunthorpe wrote: > On Thu, Sep 10, 2009 at 03:00:56PM -0500, Jeremy Enos wrote: > >> Fails w/ ofa_kernel like the others have... I didn't test excluding this rpm >> with FC11, but the others also fail elsewhere w/ this rpm excluded- so I'm >> guessing FC11 would as well. I included the output (and last 50 lines of log) >> in case you can glean something useful from it. Thank you- >> > > Hmm.. looks like a scripting problem to me, no compiler errors hidden > someplace in that log? > > >> if [ ! 
-n "/var/tmp/OFED_topdir/BUILDROOT/ofa_kernel-1.5-ofed1.5.beta1.x86_64" >> ]; then /sbin/depmod -r -ae 2.6.30.5-43.fc11.x86_64;fi; >> ++ find /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.5 -name Module.symvers -o -name >> Modules.symvers >> + modsyms=/var/tmp/OFED_topdir/BUILD/ofa_kernel-1.5/Module.symvers >> + for modsym in '$modsyms' >> + cat /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.5/Module.symvers >> /var/tmp/rpm-tmp.7Js3WE: line 37: /var/tmp/OFED_topdir/BUILDROOT/ >> ofa_kernel-1.5-ofed1.5.beta1.x86_64//usr/src/ofa_kernel/Module.symvers: No such >> file or directory >> error: Bad exit status from /var/tmp/rpm-tmp.7Js3WE (%install) >> > > Funky. Created it in the wrong place? > > Jason > > From vlad at lists.openfabrics.org Sat Sep 12 03:06:23 2009 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Sat, 12 Sep 2009 03:06:23 -0700 (PDT) Subject: [ofa-general] ofa_1_5_kernel 20090912-0200 daily build status Message-ID: <20090912100623.947BAE28238@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git git_branch: ofed_kernel_1_5 Common build parameters: Passed: Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.26 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.27 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18-128.el5 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.27 Passed on x86_64 with linux-2.6.9-78.ELsmp Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on ia64 with linux-2.6.18 Passed on ia64 with 
linux-2.6.21.1 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.25 Passed on ia64 with linux-2.6.26 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Failed: From bart.vanassche at gmail.com Sat Sep 12 05:18:09 2009 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Sat, 12 Sep 2009 14:18:09 +0200 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: <4AA7F626.7060804@vlnb.net> References: <4A9FA945.4070408@vlnb.net> <4AA4F561.504@vlnb.net> <4AA7F626.7060804@vlnb.net> Message-ID: On Wed, Sep 9, 2009 at 8:38 PM, Vladislav Bolkhovitin wrote: > Chris Worley, on 09/09/2009 02:29 AM wrote: >> [ ... ] >> Sep  8 22:40:39 nameme kernel: end_request: I/O error, dev sde, sector >> 123559464 >> Sep  8 22:40:39 nameme kernel: device-mapper: multipath: Failing path >> 8:64. >> Sep  8 22:41:32 nameme kernel:  host27: ib_srp: DREQ received - >> connection closed >> Sep  8 22:41:32 nameme kernel:  host28: ib_srp: DREQ received - >> connection closed >> Sep  8 22:41:32 nameme kernel:  host29: ib_srp: DREQ received - >> connection closed >> Sep  8 22:41:32 nameme kernel:  host30: ib_srp: DREQ received - >> connection closed >> Sep  8 22:41:32 nameme kernel:  host31: ib_srp: DREQ received - >> connection closed >> Sep  8 22:41:32 nameme kernel:  host32: ib_srp: DREQ received - >> connection closed >> Sep  8 22:41:34 nameme kernel:  host27: ib_srp: connection closed >> Sep  8 22:41:34 nameme kernel: ib_srp:  host27: add qp_in_err timer >> Sep  8 22:41:34 nameme kernel:  host28: ib_srp: connection closed >> Sep  8 22:41:34 nameme kernel: ib_srp:  host28: add qp_in_err timer >> Sep  8 22:41:34 nameme kernel:  host29: ib_srp: connection closed >> Sep  8 22:41:34 nameme kernel: ib_srp:  host29: add qp_in_err timer >> Sep  8 22:41:34 nameme kernel:  host30: ib_srp: connection closed >> Sep  8 22:41:34 
nameme kernel: ib_srp:  host30: add qp_in_err timer >> Sep  8 22:41:34 nameme kernel:  host31: ib_srp: connection closed >> Sep  8 22:41:34 nameme kernel: ib_srp:  host31: add qp_in_err timer >> Sep  8 22:41:34 nameme kernel:  host32: ib_srp: connection closed >> Sep  8 22:41:34 nameme kernel: ib_srp:  host32: add qp_in_err timer >> Sep  8 22:41:59 nameme kernel:  host27: ib_srp: srp_qp_in_err_timer called >> Sep  8 22:41:59 nameme kernel:  host27: ib_srp: srp_qp_in_err_timer >> flushed reset - done >> Sep  8 22:41:59 nameme kernel:  host28: ib_srp: srp_qp_in_err_timer called >> Sep  8 22:41:59 nameme kernel:  host28: ib_srp: srp_qp_in_err_timer >> flushed reset - done >> Sep  8 22:41:59 nameme kernel:  host29: ib_srp: srp_qp_in_err_timer called >> Sep  8 22:41:59 nameme kernel:  host29: ib_srp: srp_qp_in_err_timer >> flushed reset - done >> Sep  8 22:41:59 nameme kernel:  host30: ib_srp: srp_qp_in_err_timer called >> Sep  8 22:41:59 nameme kernel:  host30: ib_srp: srp_qp_in_err_timer >> flushed reset - done >> Sep  8 22:41:59 nameme kernel:  host31: ib_srp: srp_qp_in_err_timer called >> Sep  8 22:41:59 nameme kernel:  host31: ib_srp: srp_qp_in_err_timer >> flushed reset - done >> Sep  8 22:41:59 nameme kernel:  host32: ib_srp: srp_qp_in_err_timer called >> Sep  8 22:41:59 nameme kernel:  host32: ib_srp: srp_qp_in_err_timer > > Those messages should be analyzed and can be a key. The above messages mean that the SRP initiator correctly detected that the SRP target closed six SRP connections. But how did you get six SRP connections ? Are there six HCA's in the target system or are there six different target systems ? And how has the device mapper been configured on the initiator ? Bart. 
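One way to answer the "how many connections" question directly from the log is to count the distinct SCSI host numbers that logged a DREQ. The following is a minimal sketch, not part of ib_srp or any tool in this thread: the function name `count_dreq_hosts`, the `MAX_HOSTS` limit and the `sample_log` array are assumptions made for illustration. The sample lines are copied from the excerpt quoted above, with the quote markers removed.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAX_HOSTS 64	/* arbitrary illustrative limit */

/* A few lines taken from the quoted log excerpt (quote markers stripped). */
static const char *const sample_log[] = {
	"Sep  8 22:41:32 nameme kernel:  host27: ib_srp: DREQ received - connection closed",
	"Sep  8 22:41:32 nameme kernel:  host28: ib_srp: DREQ received - connection closed",
	"Sep  8 22:41:32 nameme kernel:  host29: ib_srp: DREQ received - connection closed",
	"Sep  8 22:41:34 nameme kernel:  host27: ib_srp: connection closed",
	"Sep  8 22:41:34 nameme kernel: ib_srp:  host27: add qp_in_err timer",
};

/* Hypothetical helper: count distinct "hostNN" numbers among the lines
 * that contain "ib_srp: DREQ received". Lines without that marker, or
 * without a "host" token followed by a digit, are ignored. */
static int count_dreq_hosts(const char *const *lines, int nlines)
{
	int seen[MAX_HOSTS];
	int nseen = 0;

	for (int i = 0; i < nlines; i++) {
		const char *p;
		int host, known = 0;

		if (!strstr(lines[i], "ib_srp: DREQ received"))
			continue;

		/* Find "host" immediately followed by a digit, so that a
		 * machine name containing "host" is not mistaken for it. */
		for (p = strstr(lines[i], "host"); p; p = strstr(p + 4, "host"))
			if (isdigit((unsigned char)p[4]))
				break;
		if (!p)
			continue;
		host = atoi(p + 4);

		for (int j = 0; j < nseen; j++)
			if (seen[j] == host)
				known = 1;
		if (!known && nseen < MAX_HOSTS)
			seen[nseen++] = host;
	}
	return nseen;
}
```

Calling `count_dreq_hosts(sample_log, 5)` on the abbreviated sample counts hosts 27, 28 and 29. Run over the full excerpt, the same logic would be expected to report the six hosts (27 through 32) discussed above.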
From bart.vanassche at gmail.com Sat Sep 12 08:24:33 2009 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Sat, 12 Sep 2009 17:24:33 +0200 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4AA4F561.504@vlnb.net> Message-ID: On Fri, Sep 11, 2009 at 9:50 PM, Chris Worley wrote: > I'm thinking the reason you can't repeat the test may be latency > related.  We get ~50usecs average latency (on small block sizes), > which can't be achieved using regular SSD's (and rotating drives are > nowhere close).  Maybe a ramdisk would help repeat the issue. SRPT is being tested routinely with a RAM disk as target and this works fine. Bart. From dorons at voltaire.com Sun Sep 13 00:55:30 2009 From: dorons at voltaire.com (Doron Shoham) Date: Sun, 13 Sep 2009 10:55:30 +0300 Subject: [ofa-general] [PATCH] infiniband-diags/scripts: Add ibcheckroutes to scripts In-Reply-To: References: <4AA8E97E.1090109@voltaire.com> Message-ID: <4AACA572.2000603@voltaire.com> Hal Rosenstock wrote: > > > On Thu, Sep 10, 2009 at 7:56 AM, Doron Shoham > wrote: > > ibcheckroutes validates route between all hosts in the fabric. > This script finds all leaf switches (switches that are connected to > HCAs) > This script parses the output of ibnetdiscover. It finds all leaf switches (from the topology file generated by ibnetdiscover). Then it checks if a route exists between all leaf switches using ibtracert. > > CAs or HCAs ? CAs > > What about switch port 0s ? It checks connectivity only between leaf switches (not all switches). I assume that traffic is generated only between CAs and therefore connectivity between other switches (not leaf switches) is not important. > > > and runs ibtracert between them. > When using various routing algorithms (e.g. up-down), > > > With which routing algorithms has this been tried ?
I assume that, from a complexity perspective, the routing algorithms calculate routes only between leaf switches and not between all CAs. Then it adds one hop for all CAs connected to the leaf switches. I've tested it with up-down but it really doesn't matter which routing algorithm you are using. It just checks the routes between leaf switches (and if the routing algorithm behaves as above, it means that it checks connectivity between all CAs). > > -- Hal > > > if fabric topology is not suitable there will be no > routes between some nodes. > It reports when the route exists between source and destination LIDs. > > Signed-off-by: Doron Shoham > > > > From jackm at dev.mellanox.co.il Sun Sep 13 02:29:29 2009 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Sun, 13 Sep 2009 12:29:29 +0300 Subject: [ofa-general] Fedora 10 OFED support plans In-Reply-To: <4AA95AF8.9020905@ncsa.uiuc.edu> References: <4A8E4854.2060909@ncsa.uiuc.edu> <20090910193454.GB7552@obsidianresearch.com> <4AA95AF8.9020905@ncsa.uiuc.edu> Message-ID: <200909131229.29887.jackm@dev.mellanox.co.il> On Thursday 10 September 2009 23:00, Jeremy Enos wrote: > Fails w/ ofa_kernel like the others have... I didn't test excluding this rpm with FC11, but the others also fail elsewhere w/ this rpm excluded- so I'm guessing FC11 would as well.  I included the output (and last 50 lines of log) in case you can glean something useful from it.  
Thank you- > >     Jeremy > > Build ofa_kernel RPM > Running rpmbuild --rebuild  --define '_topdir /var/tmp/OFED_topdir' --define 'configure_options   --with-core-mod --with-user_mad-mod --with-user_access-mod --with-addr_trans-mod --with-mthca-mod --with-mlx4-mod --with-mlx4_en-mod --with-cxgb3-mod --with-nes-mod --with-qib-mod --with-ipoib-mod --with-sdp-mod --with-rds-mod --with-qlgc_vnic-mod --with-iser-mod --with-nfsrdma-mod' --define 'build_kernel_ib 1' --define 'build_kernel_ib_devel 1' --define 'KVERSION 2.6.30.5-43.fc11.x86_64' --define 'K_SRC /lib/modules/2.6.30.5-43.fc11.x86_64/build' --define 'network_dir /etc/sysconfig/network-scripts' --define '_prefix /usr' --define '__arch_install_post %{nil}' /home-ib/ac/jenos/ofed/OFED-1.5-beta1/SRPMS/ofa_kernel-1.5-ofed1.5.beta1.src.rpm > Failed to build ofa_kernel RPM > See /tmp/OFED.1584.logs/ofa_kernel.rpmbuild.log > [root at ac32 OFED-1.5-beta1]# tail -50 /tmp/OFED.1584.logs/ofa_kernel.rpmbuild.log >                 mkdir -p /var/tmp/OFED_topdir/BUILDROOT/ofa_kernel-1.5-ofed1.5.beta1.x86_64//lib/modules/2.6.30.5-43.fc11.x86_64/updates/kernel/net/rds; \ >                 mv /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.5/lib/modules/2.6.30.5-43.fc11.x86_64/extra/net/rds/rds*.ko /var/tmp/OFED_topdir/BUILDROOT/ofa_kernel-1.5-ofed1.5.beta1.x86_64//lib/modules/2.6.30.5-43.fc11.x86_64/updates/kernel/net/rds/ ; \ >         elif [ -d /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.5/lib/modules/2.6.30.5-43.fc11.x86_64/extra ]; then \ >                 mkdir -p /var/tmp/OFED_topdir/BUILDROOT/ofa_kernel-1.5-ofed1.5.beta1.x86_64//lib/modules/2.6.30.5-43.fc11.x86_64/updates/kernel/net/rds; \ >                 mv /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.5/lib/modules/2.6.30.5-43.fc11.x86_64/extra/rds*.ko /var/tmp/OFED_topdir/BUILDROOT/ofa_kernel-1.5-ofed1.5.beta1.x86_64//lib/modules/2.6.30.5-43.fc11.x86_64/updates/kernel/net/rds/ ; \ >         fi; I notice that you have a 2.6.30-based kernel. 
The FC11 that we have here is 2.6.29-based: 2.6.29.4-167.fc11.x86_64 Has there been some change in FC11 that we are not aware of? (the release notes for FC11 also state that it is 2.6.29-based). That said, I notice that there is a bug in rds on our Fedora Core11 (2.6.29-based). Andy, you need to put rds_to_2_6_28.patch into a 2.6.29 kernel_patches/backports directory. There isn't one yet. I can do this for you, but I suggest a rename to rds_to_2_6_29.patch for the file (in this directory only). Summary: I will create kernel_patches/backport/2.6.29 directory, and put rds_to_2_6_29.patch inside. rds_to_2_6_29.patch is identical to rds_to_2_6_28.patch NOTE: as of now, there is no need for a separate 2.6.29_FC11 backports directory. If the need arises, we will create it. If you ACK, I will take care of it. -Jack From vlad at lists.openfabrics.org Sun Sep 13 03:07:47 2009 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Sun, 13 Sep 2009 03:07:47 -0700 (PDT) Subject: [ofa-general] ofa_1_5_kernel 20090913-0200 daily build status Message-ID: <20090913100747.AEF17E61F1B@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git git_branch: ofed_kernel_1_5 Common build parameters: Passed: Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.26 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.27 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18-128.el5 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.27 Passed on x86_64 
with linux-2.6.9-67.ELsmp Passed on x86_64 with linux-2.6.9-78.ELsmp Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.25 Passed on ia64 with linux-2.6.26 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Failed: From sashak at voltaire.com Sun Sep 13 06:11:10 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 13 Sep 2009 16:11:10 +0300 Subject: [ofa-general] Re: [PATCH] opensm/opensm.8.in: Cosmetic formatting change In-Reply-To: <20090911113406.GA14150@comcast.net> References: <20090911113406.GA14150@comcast.net> Message-ID: <20090913131110.GG17481@me> On 07:34 Fri 11 Sep , Hal Rosenstock wrote: > > Signed-off-by: Hal Rosenstock Applied. Thanks. Sasha From sashak at voltaire.com Sun Sep 13 06:26:21 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 13 Sep 2009 16:26:21 +0300 Subject: [ofa-general] [PATCH] opensm: improve multicast re-routing requests processing In-Reply-To: References: <20090906154901.GF25241@me> <20090906154931.GG25241@me> <20090906173900.GK25241@me> Message-ID: <20090913132621.GH17481@me> Hi Hal, On 18:23 Tue 08 Sep , Hal Rosenstock wrote: > On Sun, Sep 6, 2009 at 1:39 PM, Sasha Khapyorsky wrote: > > > > > When we have two or more changes in a same multicast group multiple > > multicast rerouting requests will be created and processed. To prevent > > this we will use array of requests indexed by mlid value minus > > IB_LID_MCAST_START_HO and for each multicast group change we will just > > mark that specific mlid requires re-routing and "duplicated" requests > > will be merged there. 
> > > > Also in this way we will be able to process multicast group routing > > entries deletion for already removed groups by just knowing its MLID > > and not using its content - this will let us to not delay mutlicast > > groups deletion ('to_be_deleted' flag) and will simplify many multicast > > related code flows. > > > > While the delay adds complexity, it is a feature. Delayed deletion (and > join) is allowed by IBA and is needed in a fast changing subnet when there > are a lot of groups changing. This was seen quite a while ago and was how > OpenSM evolved based on field experience and other testing. Eitan is the > expert here. IMO support for this needs to be added (back in). Maybe my changelog message is unclear (although I don't see this immediately). This patch does nothing with multicast re-routing scheduling, but with multicast group object deletion. Sasha From tziporet at dev.mellanox.co.il Sun Sep 13 06:45:01 2009 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Sun, 13 Sep 2009 16:45:01 +0300 Subject: [ofa-general] mlx4 second port lro issue In-Reply-To: <1252638865.4712.22.camel@localhost.localdomain> References: <1252638865.4712.22.camel@localhost.localdomain> Message-ID: <4AACF75D.8020706@mellanox.co.il> Shirley Ma wrote: > Hello Barry > > On Fri, 2009-09-11 at 08:37 +0530, Barry Mavin wrote: > >> Is LRO on by default? If not how can we enable it? >> >> > > It is a mlx4_en module parameter, by default it is set to > MLX4_EN_MAX_LRO_DESCRIPTORS which is 32, reloading the module with > num_lro=0 will disable it. It will impact performance when disabling it. > > Any possible reason for port1 works but not port2? 
> > Shirley > > > Yevgeny - our maintainer of mlx4_en driver is on vacation He will look into it when he is back Tziporet From sashak at voltaire.com Sun Sep 13 06:47:59 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 13 Sep 2009 16:47:59 +0300 Subject: [ofa-general] [PATCH] opensm: improve multicast re-routing requests processing In-Reply-To: References: <20090906154901.GF25241@me> <20090906154931.GG25241@me> <20090906173900.GK25241@me> Message-ID: <20090913134759.GJ17481@me> On 21:27 Wed 09 Sep , Eitan Zahavi wrote: > > Sorry I am not following the exact details of the changes made. Yes. The introduced patches don't reduce multicast routing scalability. Sasha From worleys at gmail.com Sun Sep 13 07:57:03 2009 From: worleys at gmail.com (Chris Worley) Date: Sun, 13 Sep 2009 16:57:03 +0200 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4AA4F561.504@vlnb.net> <4AA7F626.7060804@vlnb.net> Message-ID: On Sat, Sep 12, 2009 at 2:18 PM, Bart Van Assche wrote: > On Wed, Sep 9, 2009 at 8:38 PM, Vladislav Bolkhovitin wrote: >> Chris Worley, on 09/09/2009 02:29 AM wrote: >>> [ ... ] >>> Sep  8 22:40:39 nameme kernel: end_request: I/O error, dev sde, sector >>> 123559464 >>> Sep  8 22:40:39 nameme kernel: device-mapper: multipath: Failing path >>> 8:64. 
>>> Sep  8 22:41:32 nameme kernel:  host27: ib_srp: DREQ received - >>> connection closed >>> Sep  8 22:41:32 nameme kernel:  host28: ib_srp: DREQ received - >>> connection closed >>> Sep  8 22:41:32 nameme kernel:  host29: ib_srp: DREQ received - >>> connection closed >>> Sep  8 22:41:32 nameme kernel:  host30: ib_srp: DREQ received - >>> connection closed >>> Sep  8 22:41:32 nameme kernel:  host31: ib_srp: DREQ received - >>> connection closed >>> Sep  8 22:41:32 nameme kernel:  host32: ib_srp: DREQ received - >>> connection closed >>> Sep  8 22:41:34 nameme kernel:  host27: ib_srp: connection closed >>> Sep  8 22:41:34 nameme kernel: ib_srp:  host27: add qp_in_err timer >>> Sep  8 22:41:34 nameme kernel:  host28: ib_srp: connection closed >>> Sep  8 22:41:34 nameme kernel: ib_srp:  host28: add qp_in_err timer >>> Sep  8 22:41:34 nameme kernel:  host29: ib_srp: connection closed >>> Sep  8 22:41:34 nameme kernel: ib_srp:  host29: add qp_in_err timer >>> Sep  8 22:41:34 nameme kernel:  host30: ib_srp: connection closed >>> Sep  8 22:41:34 nameme kernel: ib_srp:  host30: add qp_in_err timer >>> Sep  8 22:41:34 nameme kernel:  host31: ib_srp: connection closed >>> Sep  8 22:41:34 nameme kernel: ib_srp:  host31: add qp_in_err timer >>> Sep  8 22:41:34 nameme kernel:  host32: ib_srp: connection closed >>> Sep  8 22:41:34 nameme kernel: ib_srp:  host32: add qp_in_err timer >>> Sep  8 22:41:59 nameme kernel:  host27: ib_srp: srp_qp_in_err_timer called >>> Sep  8 22:41:59 nameme kernel:  host27: ib_srp: srp_qp_in_err_timer >>> flushed reset - done >>> Sep  8 22:41:59 nameme kernel:  host28: ib_srp: srp_qp_in_err_timer called >>> Sep  8 22:41:59 nameme kernel:  host28: ib_srp: srp_qp_in_err_timer >>> flushed reset - done >>> Sep  8 22:41:59 nameme kernel:  host29: ib_srp: srp_qp_in_err_timer called >>> Sep  8 22:41:59 nameme kernel:  host29: ib_srp: srp_qp_in_err_timer >>> flushed reset - done >>> Sep  8 22:41:59 nameme kernel:  host30: ib_srp: srp_qp_in_err_timer called 
>>> Sep  8 22:41:59 nameme kernel:  host30: ib_srp: srp_qp_in_err_timer >>> flushed reset - done >>> Sep  8 22:41:59 nameme kernel:  host31: ib_srp: srp_qp_in_err_timer called >>> Sep  8 22:41:59 nameme kernel:  host31: ib_srp: srp_qp_in_err_timer >>> flushed reset - done >>> Sep  8 22:41:59 nameme kernel:  host32: ib_srp: srp_qp_in_err_timer called >>> Sep  8 22:41:59 nameme kernel:  host32: ib_srp: srp_qp_in_err_timer >> >> Those messages should be analyzed and can be a key. > > The above messages mean that the SRP initiator correctly detected that > the SRP target closed six SRP connections. But how did you get six SRP > connections ? Are there six HCA's in the target system or are there > six different target systems ? And how has the device mapper been > configured on the initiator ? The only thing there were six of are LVM volumes. One HCA, using both ports; 10 target drives being exported from one target. Chris > > Bart. > From dorons at voltaire.com Sun Sep 13 09:23:13 2009 From: dorons at voltaire.com (Doron Shoham) Date: Sun, 13 Sep 2009 19:23:13 +0300 Subject: [ofa-general] [PATCH] ibportstate: add width option Message-ID: <4AAD1C71.4010702@voltaire.com> ibportstate: add width option. Similar to the speed option, this option can explicitly set the port's LinkWidthEnable value. It supports values from 0-15 and 255. 
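The accepted range stated above (0 through 15, plus 255) can be sketched as a small standalone check. This is only an illustration of that rule under stated assumptions: the helper name `width_value_is_valid` is hypothetical and does not exist in infiniband-diags, and unlike a bare strtoul() call it also rejects empty strings and trailing garbage.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper, not part of ibportstate: returns 1 when 'arg'
 * parses completely as a number in 0..15 or exactly 255, otherwise 0.
 * On success the parsed value is stored in *out when 'out' is non-NULL. */
static int width_value_is_valid(const char *arg, unsigned long *out)
{
	char *end;
	unsigned long v;

	if (arg == NULL || *arg == '\0')
		return 0;

	errno = 0;
	v = strtoul(arg, &end, 0);	/* base 0: decimal, hex (0x) or octal */
	if (errno != 0 || *end != '\0')
		return 0;		/* overflow or trailing characters */

	if (v > 15 && v != 255)
		return 0;		/* outside the accepted range */

	if (out)
		*out = v;
	return 1;
}
```

With this rule, "1", "0x0f" and "255" are accepted, while "16", "abc" and an empty string are rejected. Whether every value in 0 through 15 is meaningful for a given port is a separate question that this range check does not address.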
Signed-off-by: Doron Shoham --- infiniband-diags/man/ibportstate.8 | 11 +++++++---- infiniband-diags/src/ibportstate.c | 17 ++++++++++++++++- 2 files changed, 23 insertions(+), 5 deletions(-) diff --git a/infiniband-diags/man/ibportstate.8 b/infiniband-diags/man/ibportstate.8 index 9b5e618..1dc3724 100644 --- a/infiniband-diags/man/ibportstate.8 +++ b/infiniband-diags/man/ibportstate.8 @@ -15,7 +15,7 @@ ibportstate allows the port state and port physical state of an IB port to be queried (in addition to link width and speed being validated relative to the peer port when the port queried is a switch port), or a switch port to be disabled, enabled, or reset. It -also allows the link speed enabled on any IB port to be adjusted. +also allows the link speed/width enabled on any IB port to be adjusted. .SH OPTIONS @@ -23,16 +23,17 @@ also allows the link speed enabled on any IB port to be adjusted. .TP op Port operations allowed - supported ops: enable, disable, reset, speed, query + supported ops: enable, disable, reset, speed, width, query Default is query .PP ops enable, disable, and reset are only allowed on switch ports (An error is indicated if attempted on CA or router ports) - speed op is allowed on any port + speed/width ops are allowed on any port speed values are legal values for PortInfo:LinkSpeedEnabled (An error is indicated if PortInfo:LinkSpeedSupported does not support this setting) - (NOTE: Speed changes are not effected until the port goes through + width values are legal values for PortInfo:LinkWidthEnabled + (NOTE: Speed/Width changes are not effected until the port goes through link renegotiation) query also validates port characteristics (link width and speed) based on the peer port. 
This checking is done when the port @@ -108,6 +109,8 @@ ibportstate -D 0 1 # (query) by direct route ibportstate 3 1 reset # by lid .PP ibportstate 3 1 speed 1 # by lid +.PP +ibportstate 3 1 width 1 # by lid .SH AUTHOR .TP diff --git a/infiniband-diags/src/ibportstate.c b/infiniband-diags/src/ibportstate.c index 76e74f7..4ed8e82 100644 --- a/infiniband-diags/src/ibportstate.c +++ b/infiniband-diags/src/ibportstate.c @@ -216,13 +216,14 @@ int main(int argc, char **argv) int selfport = 0; char usage_args[] = " []\n" - "\nSupported ops: enable, disable, reset, speed, query"; + "\nSupported ops: enable, disable, reset, speed, width, query"; const char *usage_examples[] = { "3 1 disable\t\t\t# by lid", "-G 0x2C9000100D051 1 enable\t# by guid", "-D 0 1\t\t\t# (query) by direct route", "3 1 reset\t\t\t# by lid", "3 1 speed 1\t\t\t# by lid", + "3 1 width 1\t\t\t# by lid", NULL }; @@ -263,6 +264,15 @@ int main(int argc, char **argv) speed = strtoul(argv[3], 0, 0); if (speed > 15) IBERROR("invalid speed value %d", speed); + } else if (!strcmp(argv[2], "width")) { + if (argc < 4) + IBERROR + ("width requires an additional parameter"); + port_op = 5; + /* Parse width value */ + width = strtoul(argv[3], 0, 0); + if (width > 15 && width != 255) + IBERROR("invalid width value %d", width); } } @@ -298,6 +308,11 @@ int main(int argc, char **argv) speed); mad_set_field(data, 0, IB_PORT_STATE_F, 0); mad_set_field(data, 0, IB_PORT_PHYS_STATE_F, 0); + } else if (port_op == 5) { /* Set width */ + mad_set_field(data, 0, IB_PORT_LINK_WIDTH_ENABLED_F, + width); + mad_set_field(data, 0, IB_PORT_STATE_F, 0); + mad_set_field(data, 0, IB_PORT_PHYS_STATE_F, 0); } err = set_port_info(&portid, data, portnum, port_op); -- 1.5.4 From jenos at ncsa.uiuc.edu Sun Sep 13 13:34:20 2009 From: jenos at ncsa.uiuc.edu (Jeremy Enos) Date: Sun, 13 Sep 2009 15:34:20 -0500 Subject: [ofa-general] Fedora 10 OFED support plans In-Reply-To: <200909131229.29887.jackm@dev.mellanox.co.il> References: 
<4A8E4854.2060909@ncsa.uiuc.edu> <20090910193454.GB7552@obsidianresearch.com> <4AA95AF8.9020905@ncsa.uiuc.edu> <200909131229.29887.jackm@dev.mellanox.co.il> Message-ID: <4AAD574C.8070502@ncsa.uiuc.edu> Jack Morgenstein wrote: > On Thursday 10 September 2009 23:00, Jeremy Enos wrote: > >> Fails w/ ofa_kernel like the others have... I didn't test excluding this rpm with FC11, but the others also fail elsewhere w/ this rpm excluded- so I'm guessing FC11 would as well. I included the output (and last 50 lines of log) in case you can glean something useful from it. Thank you- >> >> Jeremy >> >> Build ofa_kernel RPM >> Running rpmbuild --rebuild --define '_topdir /var/tmp/OFED_topdir' --define 'configure_options --with-core-mod --with-user_mad-mod --with-user_access-mod --with-addr_trans-mod --with-mthca-mod --with-mlx4-mod --with-mlx4_en-mod --with-cxgb3-mod --with-nes-mod --with-qib-mod --with-ipoib-mod --with-sdp-mod --with-rds-mod --with-qlgc_vnic-mod --with-iser-mod --with-nfsrdma-mod' --define 'build_kernel_ib 1' --define 'build_kernel_ib_devel 1' --define 'KVERSION 2.6.30.5-43.fc11.x86_64' --define 'K_SRC /lib/modules/2.6.30.5-43.fc11.x86_64/build' --define 'network_dir /etc/sysconfig/network-scripts' --define '_prefix /usr' --define '__arch_install_post %{nil}' /home-ib/ac/jenos/ofed/OFED-1.5-beta1/SRPMS/ofa_kernel-1.5-ofed1.5.beta1.src.rpm >> Failed to build ofa_kernel RPM >> See /tmp/OFED.1584.logs/ofa_kernel.rpmbuild.log >> [root at ac32 OFED-1.5-beta1]# tail -50 /tmp/OFED.1584.logs/ofa_kernel.rpmbuild.log >> mkdir -p /var/tmp/OFED_topdir/BUILDROOT/ofa_kernel-1.5-ofed1.5.beta1.x86_64//lib/modules/2.6.30.5-43.fc11.x86_64/updates/kernel/net/rds; \ >> mv /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.5/lib/modules/2.6.30.5-43.fc11.x86_64/extra/net/rds/rds*.ko /var/tmp/OFED_topdir/BUILDROOT/ofa_kernel-1.5-ofed1.5.beta1.x86_64//lib/modules/2.6.30.5-43.fc11.x86_64/updates/kernel/net/rds/ ; \ >> elif [ -d 
/var/tmp/OFED_topdir/BUILD/ofa_kernel-1.5/lib/modules/2.6.30.5-43.fc11.x86_64/extra ]; then \ >> mkdir -p /var/tmp/OFED_topdir/BUILDROOT/ofa_kernel-1.5-ofed1.5.beta1.x86_64//lib/modules/2.6.30.5-43.fc11.x86_64/updates/kernel/net/rds; \ >> mv /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.5/lib/modules/2.6.30.5-43.fc11.x86_64/extra/rds*.ko /var/tmp/OFED_topdir/BUILDROOT/ofa_kernel-1.5-ofed1.5.beta1.x86_64//lib/modules/2.6.30.5-43.fc11.x86_64/updates/kernel/net/rds/ ; \ >> fi; >> > > I notice that you have a 2.6.30-based kernel. The FC11 that we have here is 2.6.29-based: > 2.6.29.4-167.fc11.x86_64 > > Has there been some change in FC11 that we are not aware of? > (the release notes for FC11 also state that it is 2.6.29-based). > > Yep... your FC11 is not up to date. An up to date FC11 runs kernel 2.6.30.5-43.fc11.x86_64. (This is critical given recent kernel exploit vulnerabilities) thx- Jeremy > That said, I notice that there is a bug in rds on our Fedora Core11 (2.6.29-based). > Andy, you need to put rds_to_2_6_28.patch into a 2.6.29 kernel_patches/backports directory. > There isn't one yet. > > I can do this for you, but I suggest a rename to rds_to_2_6_29.patch for the file (in this directory > only). > Summary: I will create kernel_patches/backport/2.6.29 directory, and put rds_to_2_6_29.patch inside. > rds_to_2_6_29.patch is identical to rds_to_2_6_28.patch > NOTE: as of now, there is no need for a separate 2.6.29_FC11 backports directory. If the need > arises, we will create it. > > If you ACK, I will take care of it. 
> > -Jack > > From keshetti.mahesh at gmail.com Sun Sep 13 23:20:33 2009 From: keshetti.mahesh at gmail.com (Keshetti Mahesh) Date: Mon, 14 Sep 2009 11:50:33 +0530 Subject: [ofa-general] Re: [PATCH v2] infiniband-diags/scripts: Add 'ibcheckspeed' and 'ibcheckportspeed' to scripts In-Reply-To: <20090910090213.6888b7d5.weiny2@llnl.gov> References: <829ded920909100602h78614ac0jd4eb1ee8d7a3779b@mail.gmail.com> <20090910090213.6888b7d5.weiny2@llnl.gov> Message-ID: <829ded920909132320m7e33d79cx959fba7e6addb513@mail.gmail.com> I have a small question. If there are all 5 Gbps (maximum supported speed) ports except one with 10 Gbps in a subnet then what is the expected behavior of OpenSM while setting active link speed ? Does OpenSM force the port with 10 Gbps to operate at 5 Gbps or not ? -- Keshetti Mahesh On Thu, Sep 10, 2009 at 9:32 PM, Ira Weiny wrote: > Also, iblinkinfo will report links which it finds capable of either faster or wider operation.  iblinkinfo checks both ends of the link as Hal mentions.  It reports this with output like. > > Switch 0x0005ad0000092106 Cisco Switch SFS7000D: > ... >           7    8[  ] ==( 4X 2.5 Gbps Active/  LinkUp)==>       8   12[  ] "MT47396 Infiniscale-III Mellanox Technologies" ( Could be 5.0 Gbps) > ... > > Also the portstatus console command in OpenSM will report links which are running at "reduced speed or width".  Although this does not check the remote port. 
> > OpenSM $ help portstatus > portstatus [ca|switch|router] > summarize port status >   [ca|switch|router] -- limit the results to the node type specified > OpenSM $ portstatus > "ALL" port status: >   115 port(s) scanned on 9 nodes in 26 us >   85 down >   30 active >   32 at 4X >   22 at 2.5 Gbps >   8 at 5.0 Gbps >   2 at 10.0 Gbps > > Possible issues: >   2 disabled >      0x0008f10400411b18 5 (ISR9024D Voltaire) >      0x0005ad0000092106 13 (Cisco Switch SFS7000D) >   6 with reduced speed >      0x0008f10500200220 33 (Voltaire 4036 - 36 QDR ports switch) >      0x0008f10500200220 19 (Voltaire 4036 - 36 QDR ports switch) >      0x0005ad0000092106 21 (Cisco Switch SFS7000D) >      0x0005ad0000092106 20 (Cisco Switch SFS7000D) >      0x0005ad0000092106 9 (Cisco Switch SFS7000D) >      0x0005ad0000092106 8 (Cisco Switch SFS7000D) > > > Ira > > On Thu, 10 Sep 2009 09:23:35 -0400 > Hal Rosenstock wrote: > >> On Thu, Sep 10, 2009 at 9:02 AM, Keshetti Mahesh >> wrote: >> >> > Added 'ibcheckspeed' and 'ibcheckportspeed': Similar to >> > 'ibcheckwidth/ibcheckportwidth' in functionality and implementation. >> > Reports error/warning messages if the LinkSpeedActive is configured as >> > 2.5 Gbps when the LinkSpeedSupported is more than 2.5 Gbps. >> > >> >> ibportstate checks for more than this in terms of speed (and width) >> anomalies. >> >> Would it be better for these scripts to use that tool now ? Alternatively, >> the additional speed/width anomaly checks could be implemented in these >> scripts but it does involve checking the peer port so there's a little more >> to it. 
>> >> -- Hal >> >> >> > >> > Signed-off-by: Keshetti Mahesh < keshetti.mahesh at gmail.com> >> > --- >> >  infiniband-diags/scripts/ibcheckportspeed.in |  146 >> > ++++++++++++++++++++++++++ >> >  infiniband-diags/scripts/ibcheckportwidth.in |    2 +- >> >  infiniband-diags/scripts/ibcheckspeed.in     |  135 >> > ++++++++++++++++++++++++ >> >  3 files changed, 282 insertions(+), 1 deletions(-) >> >  create mode 100644 infiniband-diags/scripts/ibcheckportspeed.in >> >  create mode 100644 infiniband-diags/scripts/ibcheckspeed.in >> > >> >> > > > -- > Ira Weiny > Math Programmer/Computer Scientist > Lawrence Livermore National Lab > 925-423-8008 > weiny2 at llnl.gov > From vlad at lists.openfabrics.org Mon Sep 14 03:18:55 2009 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Mon, 14 Sep 2009 03:18:55 -0700 (PDT) Subject: [ofa-general] ofa_1_5_kernel 20090914-0200 daily build status Message-ID: <20090914101855.885DFE61F21@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git git_branch: ofed_kernel_1_5 Common build parameters: Passed: Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.26 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.27 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18-128.el5 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.27 Passed on x86_64 with linux-2.6.9-78.ELsmp Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.19 Passed on 
ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.25 Passed on ia64 with linux-2.6.26 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Failed: From sebastien.dugue at bull.net Mon Sep 14 03:59:33 2009 From: sebastien.dugue at bull.net (sebastien dugue) Date: Mon, 14 Sep 2009 12:59:33 +0200 Subject: [ofa-general] [PATCH] infiniband-diags/ibnetdiscover: Add separator when printing chassis type Message-ID: <20090914125933.0755a2dc@frecb007965> When grouping is enabled add a '#' separator between the switch guid and the chassis type and slot. Signed-off-by: Sebastien Dugue --- infiniband-diags/src/ibnetdiscover.c | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/infiniband-diags/src/ibnetdiscover.c b/infiniband-diags/src/ibnetdiscover.c index 2aa29c8..ce9affc 100644 --- a/infiniband-diags/src/ibnetdiscover.c +++ b/infiniband-diags/src/ibnetdiscover.c @@ -226,6 +226,7 @@ void out_switch(ibnd_node_t * node, int group, char *chname) fprintf(f, "(%" PRIx64 ")", mad_get_field64(node->info, 0, IB_NODE_PORT_GUID_F)); if (group) { + fprintf(f, "\t# "); str = ibnd_get_chassis_type(node); if (str) fprintf(f, "%s ", str); -- 1.6.3.1 From hnrose at comcast.net Mon Sep 14 04:18:06 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Mon, 14 Sep 2009 07:18:06 -0400 Subject: [ofa-general] [PATCH] opensm/opensm.8.in: Indicate default rule for Default partition Message-ID: <20090914111805.GA1996@comcast.net> Also, similar change to doc/partition-config.txt Signed-off-by: Hal Rosenstock --- diff --git a/opensm/doc/partition-config.txt b/opensm/doc/partition-config.txt index f855268..ae3d6f6 100644 --- a/opensm/doc/partition-config.txt +++ b/opensm/doc/partition-config.txt @@ -3,10 +3,13 @@ OpenSM Partition configuration The default name of OpenSM partitions configuration file is '/etc/opensm/partitions.conf'. 
The default may be changed by -using --Pconfig (-P) option with OpenSM. +using the --Pconfig (-P) option with OpenSM. The default partition will be created by OpenSM unconditionally even when partition configuration file does not exist or cannot be accessed. +Effectively, this amounts to the same as if the following line appears in the +partitions config file: +Default=0x7fff : ALL=limited, SELF=full ; The default partition has P_Key value 0x7fff. OpenSM's port will have full membership in default partition. All other end ports will have diff --git a/opensm/man/opensm.8.in b/opensm/man/opensm.8.in index fcdc168..caee2ef 100644 --- a/opensm/man/opensm.8.in +++ b/opensm/man/opensm.8.in @@ -1,4 +1,4 @@ -.TH OPENSM 8 "September 3, 2009" "OpenIB" "OpenIB Management" +.TH OPENSM 8 "September 13, 2009" "OpenIB" "OpenIB Management" .SH NAME opensm \- InfiniBand subnet manager and administration (SM/SA) @@ -418,11 +418,15 @@ logrotate purposes. .SH PARTITION CONFIGURATION .PP The default name of OpenSM partitions configuration file is -\fB\%@OPENSM_CONFIG_DIR@/@PARTITION_CONFIG_FILE@\fP. The default may be changed by using ---Pconfig (-P) option with OpenSM. +\fB\%@OPENSM_CONFIG_DIR@/@PARTITION_CONFIG_FILE@\fP. The default may be changed +by using the --Pconfig (-P) option with OpenSM. The default partition will be created by OpenSM unconditionally even when partition configuration file does not exist or cannot be accessed. +Effectively, this amounts to the same as if the following line appears in the +partitions config file: + +Default=0x7fff : ALL=limited, SELF=full ; The default partition has P_Key value 0x7fff. OpenSM\'s port will have full membership in default partition. 
All other end ports will have From hal.rosenstock at gmail.com Mon Sep 14 05:45:39 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Mon, 14 Sep 2009 08:45:39 -0400 Subject: [ofa-general] [PATCH] opensm: use mgrp pointer in port mcm_info In-Reply-To: <20090906154901.GF25241@me> References: <20090906154901.GF25241@me> Message-ID: On Sun, Sep 6, 2009 at 11:49 AM, Sasha Khapyorsky wrote: > > Port needs to access multicast groups where it is joined to. Now it is > implemented by keeping list of list of mcm_info elements where MLID of > each multicast group is stored. Obviously this assumes single MGID to > MLID mapping model. > Does this mean consolidate_ipv6_snm_req does not work now ? If so, did OFED 1.5 Beta go out this way ? Also, what is the plan/timeframe to restore this functionality ? -- Hal > > This patch changes this so that instead of MLID mcm_info stores pointer > to multicast group object (mgrp). Such model makes it possible to > have MGIDs to MLID compression. > > Signed-off-by: Sasha Khapyorsky > --- > opensm/include/opensm/osm_mcm_info.h | 13 +++++++------ > opensm/include/opensm/osm_port.h | 13 +++++++------ > opensm/opensm/osm_drop_mgr.c | 10 +++------- > opensm/opensm/osm_mcm_info.c | 8 ++++---- > opensm/opensm/osm_port.c | 10 +++++----- > opensm/opensm/osm_sm.c | 6 +++--- > 6 files changed, 29 insertions(+), 31 deletions(-) > > diff --git a/opensm/include/opensm/osm_mcm_info.h > b/opensm/include/opensm/osm_mcm_info.h > index dec607f..62ae326 100644 > --- a/opensm/include/opensm/osm_mcm_info.h > +++ b/opensm/include/opensm/osm_mcm_info.h > @@ -47,6 +47,7 @@ > #include > #include > #include > +#include > > #ifdef __cplusplus > # define BEGIN_C_DECLS extern "C" { > @@ -73,15 +74,15 @@ BEGIN_C_DECLS > */ > typedef struct osm_mcm_info { > cl_list_item_t list_item; > - ib_net16_t mlid; > + osm_mgrp_t *mgrp; > } osm_mcm_info_t; > /* > * FIELDS > * list_item > * Linkage structure for cl_qlist. MUST BE FIRST MEMBER! 
> * > -* mlid > -* MLID of this multicast group. > +* mgrp > +* The pointer to multicast group where this port is member of > * > * SEE ALSO > *********/ > @@ -95,11 +96,11 @@ typedef struct osm_mcm_info { > * > * SYNOPSIS > */ > -osm_mcm_info_t *osm_mcm_info_new(IN const ib_net16_t mlid); > +osm_mcm_info_t *osm_mcm_info_new(IN osm_mgrp_t *mgrp); > /* > * PARAMETERS > -* mlid > -* [in] MLID value for this multicast group. > +* mgrp > +* [in] the pointer to multicast group. > * > * RETURN VALUES > * Pointer to an initialized tree node. > diff --git a/opensm/include/opensm/osm_port.h > b/opensm/include/opensm/osm_port.h > index 7079e74..0e0d3d2 100644 > --- a/opensm/include/opensm/osm_port.h > +++ b/opensm/include/opensm/osm_port.h > @@ -65,6 +65,7 @@ BEGIN_C_DECLS > */ > struct osm_port; > struct osm_node; > +struct osm_mgrp; > > /****h* OpenSM/Physical Port > * NAME > @@ -1420,14 +1421,14 @@ osm_get_port_by_base_lid(IN const osm_subn_t * > const p_subn, > * SYNOPSIS > */ > ib_api_status_t > -osm_port_add_mgrp(IN osm_port_t * const p_port, IN const ib_net16_t mlid); > +osm_port_add_mgrp(IN osm_port_t * const p_port, IN struct osm_mgrp *mgrp); > /* > * PARAMETERS > * p_port > * [in] Pointer to an osm_port_t object. > * > -* mlid > -* [in] MLID of the multicast group. > +* mgrp > +* [in] Pointer to the multicast group. > * > * RETURN VALUES > * IB_SUCCESS > @@ -1449,14 +1450,14 @@ osm_port_add_mgrp(IN osm_port_t * const p_port, IN > const ib_net16_t mlid); > * SYNOPSIS > */ > void > -osm_port_remove_mgrp(IN osm_port_t * const p_port, IN const ib_net16_t > mlid); > +osm_port_remove_mgrp(IN osm_port_t * const p_port, IN struct osm_mgrp > *mgrp); > /* > * PARAMETERS > * p_port > * [in] Pointer to an osm_port_t object. > * > -* mlid > -* [in] MLID of the multicast group. > +* mgrp > +* [in] Pointer to the multicast group. > * > * RETURN VALUES > * None. 
> diff --git a/opensm/opensm/osm_drop_mgr.c b/opensm/opensm/osm_drop_mgr.c > index c9a4f33..4891bb8 100644 > --- a/opensm/opensm/osm_drop_mgr.c > +++ b/opensm/opensm/osm_drop_mgr.c > @@ -158,7 +158,6 @@ static void drop_mgr_remove_port(osm_sm_t * sm, IN > osm_port_t * p_port) > osm_port_t *p_port_check; > cl_qmap_t *p_sm_guid_tbl; > osm_mcm_info_t *p_mcm; > - osm_mgrp_t *p_mgrp; > cl_ptr_vector_t *p_port_lid_tbl; > uint16_t min_lid_ho; > uint16_t max_lid_ho; > @@ -212,12 +211,9 @@ static void drop_mgr_remove_port(osm_sm_t * sm, IN > osm_port_t * p_port) > > p_mcm = (osm_mcm_info_t *) cl_qlist_remove_head(&p_port->mcm_list); > while (p_mcm != (osm_mcm_info_t *) cl_qlist_end(&p_port->mcm_list)) > { > - p_mgrp = osm_get_mgrp_by_mlid(sm->p_subn, p_mcm->mlid); > - if (p_mgrp) { > - osm_mgrp_delete_port(sm->p_subn, sm->p_log, > - p_mgrp, p_port->guid); > - osm_mcm_info_delete((osm_mcm_info_t *) p_mcm); > - } > + osm_mgrp_delete_port(sm->p_subn, sm->p_log, p_mcm->mgrp, > + p_port->guid); > + osm_mcm_info_delete(p_mcm); > p_mcm = > (osm_mcm_info_t *) > cl_qlist_remove_head(&p_port->mcm_list); > } > diff --git a/opensm/opensm/osm_mcm_info.c b/opensm/opensm/osm_mcm_info.c > index 0325a34..c07c70b 100644 > --- a/opensm/opensm/osm_mcm_info.c > +++ b/opensm/opensm/osm_mcm_info.c > @@ -49,17 +49,17 @@ > > /********************************************************************** > **********************************************************************/ > -osm_mcm_info_t *osm_mcm_info_new(IN const ib_net16_t mlid) > +osm_mcm_info_t *osm_mcm_info_new(IN osm_mgrp_t *mgrp) > { > osm_mcm_info_t *p_mcm; > > - p_mcm = (osm_mcm_info_t *) malloc(sizeof(*p_mcm)); > + p_mcm = malloc(sizeof(*p_mcm)); > if (p_mcm) { > memset(p_mcm, 0, sizeof(*p_mcm)); > - p_mcm->mlid = mlid; > + p_mcm->mgrp = mgrp; > } > > - return (p_mcm); > + return p_mcm; > } > > /********************************************************************** > diff --git a/opensm/opensm/osm_port.c b/opensm/opensm/osm_port.c > index 
751c0f0..3470381 100644 > --- a/opensm/opensm/osm_port.c > +++ b/opensm/opensm/osm_port.c > @@ -223,12 +223,12 @@ Found: > > /********************************************************************** > **********************************************************************/ > -ib_api_status_t osm_port_add_mgrp(IN osm_port_t * p_port, IN ib_net16_t > mlid) > +ib_api_status_t osm_port_add_mgrp(IN osm_port_t * p_port, IN osm_mgrp_t > *mgrp) > { > ib_api_status_t status = IB_SUCCESS; > osm_mcm_info_t *p_mcm; > > - p_mcm = osm_mcm_info_new(mlid); > + p_mcm = osm_mcm_info_new(mgrp); > if (p_mcm) > cl_qlist_insert_tail(&p_port->mcm_list, > (cl_list_item_t *) p_mcm); > @@ -243,7 +243,7 @@ ib_api_status_t osm_port_add_mgrp(IN osm_port_t * > p_port, IN ib_net16_t mlid) > static cl_status_t port_mgrp_find_func(IN const cl_list_item_t * > p_list_item, > IN void *context) > { > - if (*((ib_net16_t *) context) == ((osm_mcm_info_t *) > p_list_item)->mlid) > + if (context == ((osm_mcm_info_t *) p_list_item)->mgrp) > return CL_SUCCESS; > else > return CL_NOT_FOUND; > @@ -251,12 +251,12 @@ static cl_status_t port_mgrp_find_func(IN const > cl_list_item_t * p_list_item, > > /********************************************************************** > **********************************************************************/ > -void osm_port_remove_mgrp(IN osm_port_t * p_port, IN const ib_net16_t > mlid) > +void osm_port_remove_mgrp(IN osm_port_t * p_port, IN osm_mgrp_t *mgrp) > { > cl_list_item_t *p_mcm; > > p_mcm = cl_qlist_find_from_head(&p_port->mcm_list, > port_mgrp_find_func, > - &mlid); > + mgrp); > > if (p_mcm != cl_qlist_end(&p_port->mcm_list)) { > cl_qlist_remove_item(&p_port->mcm_list, p_mcm); > diff --git a/opensm/opensm/osm_sm.c b/opensm/opensm/osm_sm.c > index b3ce69a..2794775 100644 > --- a/opensm/opensm/osm_sm.c > +++ b/opensm/opensm/osm_sm.c > @@ -520,7 +520,7 @@ ib_api_status_t osm_sm_mcgrp_join(IN osm_sm_t * p_sm, > IN const ib_net16_t mlid, > */ > p_mcm = (osm_mcm_info_t *) 
cl_qlist_head(&p_port->mcm_list); > while (p_mcm != (osm_mcm_info_t *) cl_qlist_end(&p_port->mcm_list)) > { > - if (p_mcm->mlid == mlid) { > + if (p_mcm->mgrp->mlid == mlid) { > OSM_LOG(p_sm->p_log, OSM_LOG_DEBUG, > "Found mlid object for Port:" > "0x%016" PRIx64 " lid:0x%X\n", > @@ -530,7 +530,7 @@ ib_api_status_t osm_sm_mcgrp_join(IN osm_sm_t * p_sm, > IN const ib_net16_t mlid, > p_mcm = (osm_mcm_info_t *) cl_qlist_next(&p_mcm->list_item); > } > > - status = osm_port_add_mgrp(p_port, mlid); > + status = osm_port_add_mgrp(p_port, p_mgrp); > if (status != IB_SUCCESS) { > OSM_LOG(p_sm->p_log, OSM_LOG_ERROR, "ERR 2E03: " > "Unable to associate port 0x%" PRIx64 " to mlid > 0x%X\n", > @@ -590,7 +590,7 @@ ib_api_status_t osm_sm_mcgrp_leave(IN osm_sm_t * p_sm, > IN const ib_net16_t mlid, > /* > * Walk the list of ports in the group, and remove the appropriate > one. > */ > - osm_port_remove_mgrp(p_port, mlid); > + osm_port_remove_mgrp(p_port, p_mgrp); > > status = sm_mgrp_process(p_sm, p_mgrp); > Exit: > -- > 1.6.4.2 > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit > http://openib.org/mailman/listinfo/openib-general > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hal.rosenstock at gmail.com Mon Sep 14 06:12:35 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Mon, 14 Sep 2009 09:12:35 -0400 Subject: [ofa-general] [PATCH] infiniband-diags/scripts: Add ibcheckroutes to scripts In-Reply-To: <4AACA572.2000603@voltaire.com> References: <4AA8E97E.1090109@voltaire.com> <4AACA572.2000603@voltaire.com> Message-ID: On Sun, Sep 13, 2009 at 3:55 AM, Doron Shoham wrote: > Hal Rosenstock wrote: > > > > > > On Thu, Sep 10, 2009 at 7:56 AM, Doron Shoham > > wrote: > > > > ibcheckroutes validates route between all hosts in the fabric. 
> > This script finds all leaf switches (switches that are connected to > > HCAs) > > > > This script parses the output of ibnetdiscoer. > It finds all leaf switches (from the topology file > generated by ibnetdiscover). > The it checks if a route exists between all leaf switches > using ibtracert. > Why leaf switches (and not CAs) ? How are they determined (from the ibnetdiscover output) ? > > > > > CAs or HCAs ? > CAs > > > > What about switch port 0s ? > It checks connectivity only between leaf switches (not all switches). > I assume that traffic is generated only between CAs and therefor > connectivity between other switches (not leaf switches) does not important. > It's important for a couple of reasons: first PMA access on switches and secondly it's an IBA requirement although some OpenSM routing protocols ignore this. IMO it should be an option (not the default) to add these LIDs in too to the ones checked. > > > > > > > and runs ibtracert between them. > > When using various routing algorithms (e.g. up-down), > > > > > > With which routing algorithms has this been tried ? > I assume that from complexity perspective, the routing algorithms calculate > routes only between leaf switches and not between all CAs. > Then it adds one hop for all CAs connected to the leaf switches. > It depends on the routing algorithm (some violate this) but the basic IBA requirement is: * C14-62.1.4: *From every endport within the subnet, the SM *shall *provide at least one reversible path to every other endport. -- Hal > > I've tested it with up-down but it really doesn't matter which > routing algorithm you are using. > It just check the routes between leaf switches (and if the routing > algorithm behave as above, it means that it checks all CAs connectivity). > > > > > -- Hal > > > > > > if fabric topology is not suitable there will be no > > routes between some nodes. > > It reports when the route exists between source and destination LIDs. 
> > > > Signed-off-by: Doron Shoham
> > > >

From hal.rosenstock at gmail.com Mon Sep 14 06:19:06 2009
From: hal.rosenstock at gmail.com (Hal Rosenstock)
Date: Mon, 14 Sep 2009 09:19:06 -0400
Subject: [ofa-general] Re: [PATCH v2] infiniband-diags/scripts: Add 'ibcheckspeed' and 'ibcheckportspeed' to scripts
In-Reply-To: <829ded920909132320m7e33d79cx959fba7e6addb513@mail.gmail.com>
References: <829ded920909100602h78614ac0jd4eb1ee8d7a3779b@mail.gmail.com> <20090910090213.6888b7d5.weiny2@llnl.gov> <829ded920909132320m7e33d79cx959fba7e6addb513@mail.gmail.com>
Message-ID:

On Mon, Sep 14, 2009 at 2:20 AM, Keshetti Mahesh wrote:

> I have a small question. If there are all 5 Gbps (maximum supported speed) ports
> except one with 10 Gbps in a subnet then what is the expected behavior of OpenSM
> while setting active link speed ?

It depends on the peer port and the link between them.

> Does OpenSM force the port with 10 Gbps to operate at 5 Gbps or not ?

SM (including OpenSM) sets PortInfo enabled components based on peer ports'
supported components and link negotiation determines the active components.
So in the case where one port supports 10 Gbps speed and its peer port only
supports 5, the SM sets LinkSpeedEnabled components on the peer ports to
5 Gbps (encoded as 3). In the case where the peer port supports 10 Gbps, it
is set to 10 Gbps (encoded as 5 or 7 depending on what is supported). The
link then negotiates to one of the enabled speeds and sets LinkSpeedActive
accordingly.

-- Hal

> --
> Keshetti Mahesh
>
> On Thu, Sep 10, 2009 at 9:32 PM, Ira Weiny wrote:
> > Also, iblinkinfo will report links which it finds capable of either faster or
> > wider operation. iblinkinfo checks both ends of the link as Hal mentions.
> > It reports this with output like.
> >
> > Switch 0x0005ad0000092106 Cisco Switch SFS7000D:
> > ...
> > 7 8[ ] ==( 4X 2.5 Gbps Active/ LinkUp)==> 8 12[ ] > "MT47396 Infiniscale-III Mellanox Technologies" ( Could be 5.0 Gbps) > > ... > > > > Also the portstatus console command in OpenSM will report links which are > running at "reduced speed or width". Although this does not check the > remote port. > > > > OpenSM $ help portstatus > > portstatus [ca|switch|router] > > summarize port status > > [ca|switch|router] -- limit the results to the node type specified > > OpenSM $ portstatus > > "ALL" port status: > > 115 port(s) scanned on 9 nodes in 26 us > > 85 down > > 30 active > > 32 at 4X > > 22 at 2.5 Gbps > > 8 at 5.0 Gbps > > 2 at 10.0 Gbps > > > > Possible issues: > > 2 disabled > > 0x0008f10400411b18 5 (ISR9024D Voltaire) > > 0x0005ad0000092106 13 (Cisco Switch SFS7000D) > > 6 with reduced speed > > 0x0008f10500200220 33 (Voltaire 4036 - 36 QDR ports switch) > > 0x0008f10500200220 19 (Voltaire 4036 - 36 QDR ports switch) > > 0x0005ad0000092106 21 (Cisco Switch SFS7000D) > > 0x0005ad0000092106 20 (Cisco Switch SFS7000D) > > 0x0005ad0000092106 9 (Cisco Switch SFS7000D) > > 0x0005ad0000092106 8 (Cisco Switch SFS7000D) > > > > > > Ira > > > > On Thu, 10 Sep 2009 09:23:35 -0400 > > Hal Rosenstock wrote: > > > >> On Thu, Sep 10, 2009 at 9:02 AM, Keshetti Mahesh > >> wrote: > >> > >> > Added 'ibcheckspeed' and 'ibcheckportspeed': Similar to > >> > 'ibcheckwidth/ibcheckportwidth' in functionality and implementation. > >> > Reports error/warning messages if the LinkSpeedActive is configured as > >> > 2.5 Gbps when the LinkSpeedSupported is more than 2.5 Gbps. > >> > > >> > >> ibportstate checks for more than this in terms of speed (and width) > >> anomalies. > >> > >> Would it be better for these scripts to use that tool now ? > Alternatively, > >> the additional speed/width anomaly checks could be implemented in these > >> scripts but it does involve checking the peer port so there's a little > more > >> to it. 
> >> > >> -- Hal > >> > >> > >> > > >> > Signed-off-by: Keshetti Mahesh < keshetti.mahesh at gmail.com> > >> > --- > >> > infiniband-diags/scripts/ibcheckportspeed.in | 146 > >> > ++++++++++++++++++++++++++ > >> > infiniband-diags/scripts/ibcheckportwidth.in | 2 +- > >> > infiniband-diags/scripts/ibcheckspeed.in | 135 > >> > ++++++++++++++++++++++++ > >> > 3 files changed, 282 insertions(+), 1 deletions(-) > >> > create mode 100644 infiniband-diags/scripts/ibcheckportspeed.in > >> > create mode 100644 infiniband-diags/scripts/ibcheckspeed.in > >> > > >> > >> > > > > > > -- > > Ira Weiny > > Math Programmer/Computer Scientist > > Lawrence Livermore National Lab > > 925-423-8008 > > weiny2 at llnl.gov > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dorfman.eli at gmail.com Mon Sep 14 07:32:10 2009 From: dorfman.eli at gmail.com (Eli Dorfman (Voltaire)) Date: Mon, 14 Sep 2009 17:32:10 +0300 Subject: [ofa-general] [PATCH] infiniband-diags/scripts: Add ibcheckroutes to scripts In-Reply-To: References: <4AA8E97E.1090109@voltaire.com> <4AACA572.2000603@voltaire.com> Message-ID: <4AAE53EA.1030009@gmail.com> Hal Rosenstock wrote: > > > On Sun, Sep 13, 2009 at 3:55 AM, Doron Shoham > wrote: > > Hal Rosenstock wrote: > > > > > > On Thu, Sep 10, 2009 at 7:56 AM, Doron Shoham > > >> wrote: > > > > ibcheckroutes validates route between all hosts in the fabric. > > This script finds all leaf switches (switches that are > connected to > > HCAs) > > > > This script parses the output of ibnetdiscoer. > It finds all leaf switches (from the topology file > generated by ibnetdiscover). > The it checks if a route exists between all leaf switches > using ibtracert. > > > Why leaf switches (and not CAs) ? How are they determined (from the > ibnetdiscover output) ? because there are much less combinations (routes) of leaf switches than CAs. 
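As a rough illustration of the difference: checking every unordered pair of n endpoints once takes n(n-1)/2 ibtracert runs. A minimal sketch of that arithmetic follows; the fabric size (24 leaf switches with 12 CAs each) and the helper name pair_count are made up for the example and do not come from the patch.

```shell
# pair_count N: number of unordered pairs among N endpoints, N*(N-1)/2.
# Hypothetical helper, used only to compare the two approaches.
pair_count() {
    echo $(( $1 * ($1 - 1) / 2 ))
}

# Assumed fabric: 24 leaf switches, 12 CAs on each (288 CAs in total).
echo "leaf-switch pairs: $(pair_count 24)"
echo "CA pairs:          $(pair_count 288)"
```

With these assumed numbers that is 276 switch-to-switch checks instead of 41328 CA-to-CA checks.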
And since we assume that opensm routing builds lid matrix based on switch connectivity than if two switches have route between each other then all CAs that are connected to them will have route to each other. In ibnetdiscover you can see to which switch (LID) each CA is connected. > > > > > > > CAs or HCAs ? > CAs > > > > What about switch port 0s ? > It checks connectivity only between leaf switches (not all switches). > I assume that traffic is generated only between CAs and therefor > connectivity between other switches (not leaf switches) does not > important. > > > It's important for a couple of reasons: first PMA access on switches and > secondly it's an IBA requirement although some OpenSM routing protocols > ignore this. IMO it should be an option (not the default) to add these > LIDs in too to the ones checked. Ok, we can this option once this patch is applied. Also it may be better to provide the switch LID(s) from which PM is running to reduce number of tested routes. Eli > > > > > > > > > > and runs ibtracert between them. > > When using various routing algorithms (e.g. up-down), > > > > > > With which routing algorithms has this been tried ? > I assume that from complexity perspective, the routing algorithms > calculate > routes only between leaf switches and not between all CAs. > Then it adds one hop for all CAs connected to the leaf switches. > > > It depends on the routing algorithm (some violate this) but the basic > IBA requirement is: > * > > C14-62.1.4: > > *From every endport within the subnet, the SM *shall *provide at least > one reversible path to every other endport. > > -- Hal > > > I've tested it with up-down but it really doesn't matter which > routing algorithm you are using. > It just check the routes between leaf switches (and if the routing > algorithm behave as above, it means that it checks all CAs > connectivity). > > > > > -- Hal > > > > > > if fabric topology is not suitable there will be no > > routes between some nodes. 
> > It reports when the route exists between source and destination LIDs.
> >
> > Signed-off-by: Doron Shoham
> >

From dorfman.eli at gmail.com Mon Sep 14 07:47:02 2009
From: dorfman.eli at gmail.com (Eli Dorfman (Voltaire))
Date: Mon, 14 Sep 2009 17:47:02 +0300
Subject: [ofa-general] [PATCH] opensm/opensm.8.in: Indicate default rule for Default partition
In-Reply-To: <20090914111805.GA1996@comcast.net>
References: <20090914111805.GA1996@comcast.net>
Message-ID: <4AAE5766.9090504@gmail.com>

Hal Rosenstock wrote:
> Also, similar change to doc/partition-config.txt
>
> Signed-off-by: Hal Rosenstock
> ---
> diff --git a/opensm/doc/partition-config.txt b/opensm/doc/partition-config.txt
> index f855268..ae3d6f6 100644
> --- a/opensm/doc/partition-config.txt
> +++ b/opensm/doc/partition-config.txt
> @@ -3,10 +3,13 @@ OpenSM Partition configuration
>
>  The default name of OpenSM partitions configuration file is
>  '/etc/opensm/partitions.conf'. The default may be changed by
> -using --Pconfig (-P) option with OpenSM.
> +using the --Pconfig (-P) option with OpenSM.
>
>  The default partition will be created by OpenSM unconditionally even
>  when partition configuration file does not exist or cannot be accessed.
> +Effectively, this amounts to the same as if the following line appears in the
> +partitions config file:
> +Default=0x7fff : ALL=limited, SELF=full ;

When the partition configuration file is missing (is_config=0), opensm sets
all ports to full membership. See opensm/osm_prtn.c:

	status = osm_prtn_make_default(p_log, p_subn, !is_config);

>
>  The default partition has P_Key value 0x7fff.
OpenSM's port will have > full membership in default partition. All other end ports will have > diff --git a/opensm/man/opensm.8.in b/opensm/man/opensm.8.in > index fcdc168..caee2ef 100644 > --- a/opensm/man/opensm.8.in > +++ b/opensm/man/opensm.8.in > @@ -1,4 +1,4 @@ > -.TH OPENSM 8 "September 3, 2009" "OpenIB" "OpenIB Management" > +.TH OPENSM 8 "September 13, 2009" "OpenIB" "OpenIB Management" > > .SH NAME > opensm \- InfiniBand subnet manager and administration (SM/SA) > @@ -418,11 +418,15 @@ logrotate purposes. > .SH PARTITION CONFIGURATION > .PP > The default name of OpenSM partitions configuration file is > -\fB\%@OPENSM_CONFIG_DIR@/@PARTITION_CONFIG_FILE@\fP. The default may be changed by using > ---Pconfig (-P) option with OpenSM. > +\fB\%@OPENSM_CONFIG_DIR@/@PARTITION_CONFIG_FILE@\fP. The default may be changed > +by using the --Pconfig (-P) option with OpenSM. > > The default partition will be created by OpenSM unconditionally even > when partition configuration file does not exist or cannot be accessed. > +Effectively, this amounts to the same as if the following line appears in the > +partitions config file: > + > +Default=0x7fff : ALL=limited, SELF=full ; > > The default partition has P_Key value 0x7fff. OpenSM\'s port will have > full membership in default partition. 
All other end ports will have > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From hal.rosenstock at gmail.com Mon Sep 14 07:41:55 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Mon, 14 Sep 2009 10:41:55 -0400 Subject: [ofa-general] [PATCH] infiniband-diags/scripts: Add ibcheckroutes to scripts In-Reply-To: <4AAE53EA.1030009@gmail.com> References: <4AA8E97E.1090109@voltaire.com> <4AACA572.2000603@voltaire.com> <4AAE53EA.1030009@gmail.com> Message-ID: On Mon, Sep 14, 2009 at 10:32 AM, Eli Dorfman (Voltaire) < dorfman.eli at gmail.com> wrote: > Hal Rosenstock wrote: > > > > > > On Sun, Sep 13, 2009 at 3:55 AM, Doron Shoham > > wrote: > > > > Hal Rosenstock wrote: > > > > > > > > > On Thu, Sep 10, 2009 at 7:56 AM, Doron Shoham > > > > >> wrote: > > > > > > ibcheckroutes validates route between all hosts in the fabric. > > > This script finds all leaf switches (switches that are > > connected to > > > HCAs) > > > > > > > This script parses the output of ibnetdiscoer. > > It finds all leaf switches (from the topology file > > generated by ibnetdiscover). > > The it checks if a route exists between all leaf switches > > using ibtracert. > > > > > > Why leaf switches (and not CAs) ? How are they determined (from the > > ibnetdiscover output) ? > How are the leaf switches determined (from core switches) in the ibnetdiscover output ? Is it any switch which has an attached CA versus any switch which has no attached CAs ? > > because there are much less combinations (routes) of leaf switches than > CAs. > So is the check is that there are routes between all the leaf switches ? 
> And since we assume that opensm routing builds lid matrix based on switch > connectivity than if two switches have route between each other then all CAs > that are connected to them will have route to each other. > I can't parse this sentence. Also, this should have nothing to do with OpenSM as it is SM independent AFAIT. > In ibnetdiscover you can see to which switch (LID) each CA is connected. > Sure. > > > > > > > > > > > > > CAs or HCAs ? > > CAs > > > > > > What about switch port 0s ? > > It checks connectivity only between leaf switches (not all switches). > > I assume that traffic is generated only between CAs and therefor > > connectivity between other switches (not leaf switches) does not > > important. > > > > > > It's important for a couple of reasons: first PMA access on switches and > > secondly it's an IBA requirement although some OpenSM routing protocols > > ignore this. IMO it should be an option (not the default) to add these > > LIDs in too to the ones checked. > > Ok, we can this option once this patch is applied. I have some other specific comments on the patch. > Also it may be better to provide the switch LID(s) from which PM is running > to reduce number of tested routes. This is in the vein of only checking leaf switch connectivity but is not the IBA general requirement. -- Hal > > Eli > > > > > > > > > > > > > > > > > > and runs ibtracert between them. > > > When using various routing algorithms (e.g. up-down), > > > > > > > > > With which routing algorithms has this been tried ? > > I assume that from complexity perspective, the routing algorithms > > calculate > > routes only between leaf switches and not between all CAs. > > Then it adds one hop for all CAs connected to the leaf switches. 
> > > > > > It depends on the routing algorithm (some violate this) but the basic > > IBA requirement is: > > * > > > > C14-62.1.4: > > > > *From every endport within the subnet, the SM *shall *provide at least > > one reversible path to every other endport. > > > > -- Hal > > > > > > I've tested it with up-down but it really doesn't matter which > > routing algorithm you are using. > > It just check the routes between leaf switches (and if the routing > > algorithm behave as above, it means that it checks all CAs > > connectivity). > > > > > > > > -- Hal > > > > > > > > > if fabric topology is not suitable there will be no > > > routes between some nodes. > > > It reports when the route exists between source and > > destination LIDs. > > > > > > Signed-off-by: Doron Shoham > > > > >> > > > > > > > > > > > > > > > > > ------------------------------------------------------------------------ > > > > _______________________________________________ > > general mailing list > > general at lists.openfabrics.org > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > To unsubscribe, please visit > http://openib.org/mailman/listinfo/openib-general > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hal.rosenstock at gmail.com Mon Sep 14 07:58:10 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Mon, 14 Sep 2009 10:58:10 -0400 Subject: [ofa-general] [PATCH] infiniband-diags/scripts: Add ibcheckroutes to scripts In-Reply-To: <4AA8E97E.1090109@voltaire.com> References: <4AA8E97E.1090109@voltaire.com> Message-ID: On Thu, Sep 10, 2009 at 7:56 AM, Doron Shoham wrote: > ibcheckroutes validates route between all hosts in the fabric. > This script finds all leaf switches (switches that are connected to HCAs) > and runs ibtracert between them. > When using various routing algorithms (e.g. up-down), > if fabric topology is not suitable there will be no > routes between some nodes. 
> It reports when the route exists between source and destination LIDs. > > Signed-off-by: Doron Shoham > --- > infiniband-diags/Makefile.am | 4 +- > infiniband-diags/configure.in | 1 + > infiniband-diags/man/ibcheckroutes.8 | 39 +++++++++++ > infiniband-diags/scripts/ibcheckroutes.in | 101 > +++++++++++++++++++++++++++++ > 4 files changed, 143 insertions(+), 2 deletions(-) > create mode 100644 infiniband-diags/man/ibcheckroutes.8 > create mode 100755 infiniband-diags/scripts/ibcheckroutes.in > > diff --git a/infiniband-diags/Makefile.am b/infiniband-diags/Makefile.am > index 1cdb60e..57363c4 100644 > --- a/infiniband-diags/Makefile.am > +++ b/infiniband-diags/Makefile.am > @@ -33,7 +33,7 @@ sbin_SCRIPTS = scripts/ibcheckerrs scripts/ibchecknet > scripts/ibchecknode \ > scripts/iblinkinfo.pl scripts/ibprintswitch.pl \ > scripts/ibprintca.pl scripts/ibprintrt.pl \ > scripts/ibfindnodesusing.pl scripts/ibidsverify.pl \ > - scripts/check_lft_balance.pl > + scripts/check_lft_balance.pl scripts/ibcheckroutes > > noinst_LIBRARIES = libcommon.a > > @@ -76,7 +76,7 @@ man_MANS = man/ibaddr.8 man/ibcheckerrors.8 > man/ibcheckerrs.8 \ > man/ibprintswitch.8 man/ibprintca.8 man/ibfindnodesusing.8 \ > man/ibdatacounts.8 man/ibdatacounters.8 \ > man/ibrouters.8 man/ibprintrt.8 man/ibidsverify.8 \ > - man/check_lft_balance.8 > + man/check_lft_balance.8 man/ibcheckroutes.8 > > BUILT_SOURCES = ibdiag_version > ibdiag_version: > diff --git a/infiniband-diags/configure.in b/infiniband-diags/configure.in > index 3ef35cc..aa178c5 100644 > --- a/infiniband-diags/configure.in > +++ b/infiniband-diags/configure.in > @@ -158,6 +158,7 @@ AC_CONFIG_FILES([\ > scripts/ibcheckportwidth \ > scripts/ibcheckstate \ > scripts/ibcheckwidth \ > + scripts/ibcheckroutes \ > scripts/ibclearcounters \ > scripts/ibclearerrors \ > scripts/ibdatacounts \ > diff --git a/infiniband-diags/man/ibcheckroutes.8 > b/infiniband-diags/man/ibcheckroutes.8 > new file mode 100644 > index 0000000..a6a073f > --- 
/dev/null > +++ b/infiniband-diags/man/ibcheckroutes.8 > @@ -0,0 +1,39 @@ > +.TH IBCHECKPORT 8 "September 10, 2009" "OpenIB" "OpenIB Diagnostics" > + > +.SH NAME > +ibcheckroutes \- validates routes between all hosts in fabric > + > +.SH SYNOPSIS > +.B ibcheckroutes > +[\-h] [\-N] [\-b] [\-e] [\-C ca_name] [\-P ca_port] [\-t(imeout) > timeout_ms] > + > +.SH DESCRIPTION > +.PP > +ibcheckroutes is a script which uses a full topology file that was created > by ibnetdiscover, > +scans the network to validate routes between all hosts in the fabric. > Based on what has been discussed, this really isn't the case. It only validates routes between leaf switches (at least currently). > + > +.SH OPTIONS > +.PP > +\-h Show help. > +.PP > +\-N Use mono rather than color mode. > +.PP > +\-b Suppress output. > +.PP > +\-e Show errors only. > +.PP > +\-C Use the specified ca_name. > +.PP > +\-P Use the specified ca_port. > +.PP > +\-t Override the default timeout for the solicited mads. > + > +.SH SEE ALSO > +.BR ibnetdiscover(8), > +.BR ibtracert(8), > +.BR ibroute(8) > Is ibroute used ? 
> + > +.SH AUTHOR > +.TP > +Doron Shoham > +.RI < dorons at voltaire.com > > diff --git a/infiniband-diags/scripts/ibcheckroutes.inb/infiniband-diags/scripts/ > ibcheckroutes.in > new file mode 100755 > index 0000000..eb3ad30 > --- /dev/null > +++ b/infiniband-diags/scripts/ibcheckroutes.in > @@ -0,0 +1,101 @@ > +#!/bin/sh > + > +IBPATH=${IBPATH:- at IBSCRIPTPATH@} > + > +function usage() { > + echo Usage: `basename $0` "[-h] [-N] [-b] [-e] [-C ca_name] [-P > ca_port] [-t(imeout) timeout_ms]" > + echo -e " validate routes between all hosts in fabric" > + echo -e " -h - Show help" > + echo -e " -N - Use mono rather than color mode" > + echo -e " -b - Suppress output" > + echo -e " -e - Show errors only" > + echo -e " -C - Use the specified ca_name" > + echo -e " -P - Use the specified ca_port" > + echo -e " -t - Override the default timeout for the solicited" > add " mads" to the end of this > + exit -1 > +} > + > +function user_abort() { > + echo "Aborted" > + exit 1 > +} > + > +function green() { > + if [ "$bw" = "yes" ]; then > + printf "${res_col}[OK]\n" $1 > + return > + fi > + printf "\033[1;032m${res_col}[OK]\033[0;39m\n" $1 > +} > + > +function red() { > + if [ "$bw" = "yes" ]; then > + printf "${res_col}[FAILED]\n" "$1" > + return > + fi > + printf "\033[31m${res_col}[FAILED]\033[0m\n" "$1" > +} > + > +trap user_abort SIGINT SIGTERM > + > +bw="" > +brief=0 > +error=0 > +ca_info="" > +st=0 > +topofile=/tmp/net > +res_col="%-20.20s" > + > +function get_opts() { > + while getopts P:C:t:beNh o; do > + case "$o" in > + h) > + usage > + ;; > + N) > + bw="yes" > + ;; > + b) > + brief=1 > + ;; > + e) > + error=1 > + ;; > + P | C | t | timeout) > + ca_info="$ca_info -$o $OPTARG" > + ;; > + *) > + usage > + ;; > + esac > + done > +} > + > +get_opts $* > + > +$IBPATH/ibnetdiscover $ca_info > $topofile > How about allowing an already saved ibnetdiscover file to be used as well as a "fresh" ibnetdiscover output ? 
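One way this could look, as a sketch only: reuse a saved topology file when one is given and fall back to a fresh ibnetdiscover run otherwise. The function name choose_topology and the DISCOVER_CMD override are hypothetical and not part of the posted patch; the script itself would run $IBPATH/ibnetdiscover $ca_info at that point.

```shell
# choose_topology FILE: print the path of the topology file to parse.
# If FILE names an existing, non-empty file it is reused as-is;
# otherwise the discovery command is run and its output is saved
# to a temporary file whose path is printed instead.
# DISCOVER_CMD is a hypothetical override so the function can be
# exercised without a fabric; it defaults to plain ibnetdiscover.
choose_topology() {
    saved=$1
    if [ -n "$saved" ] && [ -s "$saved" ]; then
        echo "$saved"
        return 0
    fi
    fresh=$(mktemp /tmp/ibcheckroutes.XXXXXX) || return 1
    ${DISCOVER_CMD:-ibnetdiscover} > "$fresh" || { rm -f "$fresh"; return 1; }
    echo "$fresh"
}
```

The caller would then assign the printed path to topofile and leave the awk parsing unchanged, so both the saved and the fresh case go through the same code path.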
> + > +# find all leaf switches LIDs > +LIDS=($(awk '/# lid /{a[$(NF-1)]=$(NF-1)} END{for(v in a) print v}' > $topofile)) > +N=${#LIDS[@]} > + > +if [ $N -lt 2 ]; then > + echo "Fabric contains only one switch" one leaf switch ? > + exit 0 > +fi > + > +# check routes between all switches in fabric > all leaf switches ? > +[ $brief -eq 0 ] && echo -e "Checking route between:\nSource lid --> > Destination lid" > +for((s=0; s + for ((d=s+1; d + $IBPATH/ibtracert $ca_info ${LIDS[$s]} ${LIDS[$d]} > > /dev/null > Is LMC > 0 handled ? -- Hal > + if [ $? -eq 0 ]; then > + [ $brief -eq 0 ] && [ $error -eq 0 ] && green > "${LIDS[$s]}-->${LIDS[$d]}" > + else > + [ $brief -eq 0 ] && red "${LIDS[$s]}-->${LIDS[$d]}" > + st=1 > + fi > + done > +done > + > +exit $st > -- > 1.5.4 > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit > http://openib.org/mailman/listinfo/openib-general > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dorfman.eli at gmail.com Mon Sep 14 08:05:07 2009 From: dorfman.eli at gmail.com (Eli Dorfman (Voltaire)) Date: Mon, 14 Sep 2009 18:05:07 +0300 Subject: [ofa-general] [PATCH] infiniband-diags/scripts: Add ibcheckroutes to scripts In-Reply-To: References: <4AA8E97E.1090109@voltaire.com> <4AACA572.2000603@voltaire.com> <4AAE53EA.1030009@gmail.com> Message-ID: <4AAE5BA3.20101@gmail.com> Hal Rosenstock wrote: > > > On Mon, Sep 14, 2009 at 10:32 AM, Eli Dorfman (Voltaire) > > wrote: > > Hal Rosenstock wrote: > > > > > > On Sun, Sep 13, 2009 at 3:55 AM, Doron Shoham > > >> wrote: > > > > Hal Rosenstock wrote: > > > > > > > > > On Thu, Sep 10, 2009 at 7:56 AM, Doron Shoham > > > > > > > > >>> wrote: > > > > > > ibcheckroutes validates route between all hosts in the > fabric. 
> > > This script finds all leaf switches (switches that are > > connected to > > > HCAs) > > > > > > > This script parses the output of ibnetdiscoer. > > It finds all leaf switches (from the topology file > > generated by ibnetdiscover). > > The it checks if a route exists between all leaf switches > > using ibtracert. > > > > > > Why leaf switches (and not CAs) ? How are they determined (from the > > ibnetdiscover output) ? > > > > How are the leaf switches determined (from core switches) in the > ibnetdiscover output ? Is it any switch which has an attached CA versus > any switch which has no attached CAs ? yes > > > > because there are much less combinations (routes) of leaf switches > than CAs. > > > So is the check is that there are routes between all the leaf switches ? yes > > > And since we assume that opensm routing builds lid matrix based on > switch connectivity than if two switches have route between each > other then all CAs that are connected to them will have route to > each other. > > > I can't parse this sentence. Also, this should have nothing to do with > OpenSM as it is SM independent AFAIT. since we check routes between leaf switches LIDs, for opensm this also assures that we have route between CAs that are attached to them. > > > In ibnetdiscover you can see to which switch (LID) each CA is connected. > > > Sure. > > > > > > > > > > > > > > > CAs or HCAs ? > > CAs > > > > > > What about switch port 0s ? > > It checks connectivity only between leaf switches (not all > switches). > > I assume that traffic is generated only between CAs and therefor > > connectivity between other switches (not leaf switches) does not > > important. > > > > > > It's important for a couple of reasons: first PMA access on > switches and > > secondly it's an IBA requirement although some OpenSM routing > protocols > > ignore this. IMO it should be an option (not the default) to add these > > LIDs in too to the ones checked. 
> > Ok, we can this option once this patch is applied. > > > I have some other specific comments on the patch. > > > Also it may be better to provide the switch LID(s) from which PM is > running to reduce number of tested routes. > > > This is in the vein of only checking leaf switch connectivity but is not > the IBA general requirement. > > -- Hal > > > > Eli > > > > > > > > > > > > > > > > > > and runs ibtracert between them. > > > When using various routing algorithms (e.g. up-down), > > > > > > > > > With which routing algorithms has this been tried ? > > I assume that from complexity perspective, the routing algorithms > > calculate > > routes only between leaf switches and not between all CAs. > > Then it adds one hop for all CAs connected to the leaf switches. > > > > > > It depends on the routing algorithm (some violate this) but the basic > > IBA requirement is: > > * > > > > C14-62.1.4: > > > > *From every endport within the subnet, the SM *shall *provide at least > > one reversible path to every other endport. > > > > -- Hal > > > > > > I've tested it with up-down but it really doesn't matter which > > routing algorithm you are using. > > It just check the routes between leaf switches (and if the routing > > algorithm behave as above, it means that it checks all CAs > > connectivity). > > > > > > > > -- Hal > > > > > > > > > if fabric topology is not suitable there will be no > > > routes between some nodes. > > > It reports when the route exists between source and > > destination LIDs. 
> > > > > > Signed-off-by: Doron Shoham > > > > > > > >>> > > > > > > > > > > > > > > > > > > ------------------------------------------------------------------------ > > > > _______________________________________________ > > general mailing list > > general at lists.openfabrics.org > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > To unsubscribe, please visit > http://openib.org/mailman/listinfo/openib-general > > From hal.rosenstock at gmail.com Mon Sep 14 08:01:12 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Mon, 14 Sep 2009 11:01:12 -0400 Subject: [ofa-general] [PATCH] infiniband-diags/scripts: Add ibcheckroutes to scripts In-Reply-To: <4AAE5BA3.20101@gmail.com> References: <4AA8E97E.1090109@voltaire.com> <4AACA572.2000603@voltaire.com> <4AAE53EA.1030009@gmail.com> <4AAE5BA3.20101@gmail.com> Message-ID: On Mon, Sep 14, 2009 at 11:05 AM, Eli Dorfman (Voltaire) < dorfman.eli at gmail.com> wrote: > Hal Rosenstock wrote: > > > > > > On Mon, Sep 14, 2009 at 10:32 AM, Eli Dorfman (Voltaire) > > > wrote: > > > > Hal Rosenstock wrote: > > > > > > > > > On Sun, Sep 13, 2009 at 3:55 AM, Doron Shoham > > > > >> wrote: > > > > > > Hal Rosenstock wrote: > > > > > > > > > > > > On Thu, Sep 10, 2009 at 7:56 AM, Doron Shoham > > > > > > > > > > > > >>> wrote: > > > > > > > > ibcheckroutes validates route between all hosts in the > > fabric. > > > > This script finds all leaf switches (switches that are > > > connected to > > > > HCAs) > > > > > > > > > > This script parses the output of ibnetdiscoer. > > > It finds all leaf switches (from the topology file > > > generated by ibnetdiscover). > > > The it checks if a route exists between all leaf switches > > > using ibtracert. > > > > > > > > > Why leaf switches (and not CAs) ? How are they determined (from the > > > ibnetdiscover output) ? > > > > > > > > How are the leaf switches determined (from core switches) in the > > ibnetdiscover output ? 
Is it any switch which has an attached CA versus > > any switch which has no attached CAs ? > yes > > > > > > > > > because there are much less combinations (routes) of leaf switches > > than CAs. > > > > > > So is the check is that there are routes between all the leaf switches ? > yes > > > > > > And since we assume that opensm routing builds lid matrix based on > > switch connectivity than if two switches have route between each > > other then all CAs that are connected to them will have route to > > each other. > > > > > > I can't parse this sentence. Also, this should have nothing to do with > > OpenSM as it is SM independent AFAIT. > since we check routes between leaf switches LIDs, for opensm this also > assures that we have route between CAs that are attached to them. > Not quite. It assures there should be a path but not necessarily a route as all the LFTs are not checked with the CA port LIDs. -- Hal > > > > > > > In ibnetdiscover you can see to which switch (LID) each CA is > connected. > > > > > > Sure. > > > > > > > > > > > > > > > > > > > > > > > CAs or HCAs ? > > > CAs > > > > > > > > What about switch port 0s ? > > > It checks connectivity only between leaf switches (not all > > switches). > > > I assume that traffic is generated only between CAs and > therefor > > > connectivity between other switches (not leaf switches) does > not > > > important. > > > > > > > > > It's important for a couple of reasons: first PMA access on > > switches and > > > secondly it's an IBA requirement although some OpenSM routing > > protocols > > > ignore this. IMO it should be an option (not the default) to add > these > > > LIDs in too to the ones checked. > > > > Ok, we can this option once this patch is applied. > > > > > > I have some other specific comments on the patch. > > > > > > Also it may be better to provide the switch LID(s) from which PM is > > running to reduce number of tested routes. 
> > > > > > This is in the vein of only checking leaf switch connectivity but is not > > the IBA general requirement. > > > > -- Hal > > > > > > > > Eli > > > > > > > > > > > > > > > > > > > > > > > > > > and runs ibtracert between them. > > > > When using various routing algorithms (e.g. up-down), > > > > > > > > > > > > With which routing algorithms has this been tried ? > > > I assume that from complexity perspective, the routing > algorithms > > > calculate > > > routes only between leaf switches and not between all CAs. > > > Then it adds one hop for all CAs connected to the leaf > switches. > > > > > > > > > It depends on the routing algorithm (some violate this) but the > basic > > > IBA requirement is: > > > * > > > > > > C14-62.1.4: > > > > > > *From every endport within the subnet, the SM *shall *provide at > least > > > one reversible path to every other endport. > > > > > > -- Hal > > > > > > > > > I've tested it with up-down but it really doesn't matter which > > > routing algorithm you are using. > > > It just check the routes between leaf switches (and if the > routing > > > algorithm behave as above, it means that it checks all CAs > > > connectivity). > > > > > > > > > > > -- Hal > > > > > > > > > > > > if fabric topology is not suitable there will be no > > > > routes between some nodes. > > > > It reports when the route exists between source and > > > destination LIDs. 
> > > > > > > > Signed-off-by: Doron Shoham > > > > > > > > > > > >>> > > > > > > > > > > > > > > > > > > > > > > > > > > > ------------------------------------------------------------------------ > > > > > > _______________________________________________ > > > general mailing list > > > general at lists.openfabrics.org general at lists.openfabrics.org> > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > To unsubscribe, please visit > > http://openib.org/mailman/listinfo/openib-general > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hal.rosenstock at gmail.com Mon Sep 14 08:18:46 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Mon, 14 Sep 2009 11:18:46 -0400 Subject: [ofa-general] [PATCH] opensm/opensm.8.in: Indicate default rule for Default partition In-Reply-To: <4AAE5766.9090504@gmail.com> References: <20090914111805.GA1996@comcast.net> <4AAE5766.9090504@gmail.com> Message-ID: On Mon, Sep 14, 2009 at 10:47 AM, Eli Dorfman (Voltaire) < dorfman.eli at gmail.com> wrote: > Hal Rosenstock wrote: > > Also, similar change to doc/partition-config.txt > > > > Signed-off-by: Hal Rosenstock > > --- > > diff --git a/opensm/doc/partition-config.txt > b/opensm/doc/partition-config.txt > > index f855268..ae3d6f6 100644 > > --- a/opensm/doc/partition-config.txt > > +++ b/opensm/doc/partition-config.txt > > @@ -3,10 +3,13 @@ OpenSM Partition configuration > > > > The default name of OpenSM partitions configuration file is > > '/etc/opensm/partitions.conf'. The default may be changed by > > -using --Pconfig (-P) option with OpenSM. > > +using the --Pconfig (-P) option with OpenSM. > > > > The default partition will be created by OpenSM unconditionally even > > when partition configuration file does not exist or cannot be accessed. 
> > +Effectively, this amounts to the same as if the following line appears > in the > > +partitions config file: > > +Default=0x7fff : ALL=limited, SELF=full ; > > When partition configuration file is missing (is_config=0) the opensm sets > all ports to full membership. > See opensm/osm_prtn.c > status = osm_prtn_make_default(p_log, p_subn, !is_config); Thanks; I'll update the patch accordingly and repost. -- Hal > > > > > The default partition has P_Key value 0x7fff. OpenSM's port will have > > full membership in default partition. All other end ports will have > > diff --git a/opensm/man/opensm.8.in b/opensm/man/opensm.8.in > > index fcdc168..caee2ef 100644 > > --- a/opensm/man/opensm.8.in > > +++ b/opensm/man/opensm.8.in > > @@ -1,4 +1,4 @@ > > -.TH OPENSM 8 "September 3, 2009" "OpenIB" "OpenIB Management" > > +.TH OPENSM 8 "September 13, 2009" "OpenIB" "OpenIB Management" > > > > .SH NAME > > opensm \- InfiniBand subnet manager and administration (SM/SA) > > @@ -418,11 +418,15 @@ logrotate purposes. > > .SH PARTITION CONFIGURATION > > .PP > > The default name of OpenSM partitions configuration file is > > -\fB\%@OPENSM_CONFIG_DIR@/@PARTITION_CONFIG_FILE@\fP. The default may be > changed by using > > ---Pconfig (-P) option with OpenSM. > > +\fB\%@OPENSM_CONFIG_DIR@/@PARTITION_CONFIG_FILE@\fP. The default may be > changed > > +by using the --Pconfig (-P) option with OpenSM. > > > > The default partition will be created by OpenSM unconditionally even > > when partition configuration file does not exist or cannot be accessed. > > +Effectively, this amounts to the same as if the following line appears > in the > > +partitions config file: > > + > > +Default=0x7fff : ALL=limited, SELF=full ; > > > > The default partition has P_Key value 0x7fff. OpenSM\'s port will have > > full membership in default partition. 
All other end ports will have > > _______________________________________________ > > general mailing list > > general at lists.openfabrics.org > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > To unsubscribe, please visit > http://openib.org/mailman/listinfo/openib-general > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit > http://openib.org/mailman/listinfo/openib-general > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hnrose at comcast.net Mon Sep 14 08:42:37 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Mon, 14 Sep 2009 11:42:37 -0400 Subject: [ofa-general] [PATCHv2] opensm/opensm.8.in: Indicate default rule for Default partition Message-ID: <20090914154236.GA11092@comcast.net> Also, similar change to doc/partition-config.txt Signed-off-by: Hal Rosenstock --- Changes since v1: Fixed Default rule for non SM ports based on comment from Eli diff --git a/opensm/doc/partition-config.txt b/opensm/doc/partition-config.txt index f855268..cb3bcf7 100644 --- a/opensm/doc/partition-config.txt +++ b/opensm/doc/partition-config.txt @@ -3,14 +3,23 @@ OpenSM Partition configuration The default name of OpenSM partitions configuration file is '/etc/opensm/partitions.conf'. The default may be changed by -using --Pconfig (-P) option with OpenSM. +using the --Pconfig (-P) option with OpenSM. The default partition will be created by OpenSM unconditionally even when partition configuration file does not exist or cannot be accessed. -The default partition has P_Key value 0x7fff. OpenSM's port will have -full membership in default partition. All other end ports will have -limited membership. +The default partition has P_Key value 0x7fff. OpenSM's port will always +have full membership in default partition. 
All other end ports will have +full membership if the partition configuration file is not found or cannot +be accessed, or limited membership if the file exists and can be accessed +but there is no rule for the Default partition. + +Effectively, this amounts to the same as if one of the following rules +below appear in the partition configuration file: +In the case of no rule for the Default partition: +Default=0x7fff : ALL=limited, SELF=full ; +In the case of no partition configuration file or file cannot be accessed: +Default=0x7fff : ALL=full ; File Format diff --git a/opensm/man/opensm.8.in b/opensm/man/opensm.8.in index fcdc168..03002c0 100644 --- a/opensm/man/opensm.8.in +++ b/opensm/man/opensm.8.in @@ -1,4 +1,4 @@ -.TH OPENSM 8 "September 3, 2009" "OpenIB" "OpenIB Management" +.TH OPENSM 8 "September 14, 2009" "OpenIB" "OpenIB Management" .SH NAME opensm \- InfiniBand subnet manager and administration (SM/SA) @@ -418,15 +418,29 @@ logrotate purposes. .SH PARTITION CONFIGURATION .PP The default name of OpenSM partitions configuration file is -\fB\%@OPENSM_CONFIG_DIR@/@PARTITION_CONFIG_FILE@\fP. The default may be changed by using ---Pconfig (-P) option with OpenSM. +\fB\%@OPENSM_CONFIG_DIR@/@PARTITION_CONFIG_FILE@\fP. The default may be changed +by using the --Pconfig (-P) option with OpenSM. The default partition will be created by OpenSM unconditionally even when partition configuration file does not exist or cannot be accessed. -The default partition has P_Key value 0x7fff. OpenSM\'s port will have -full membership in default partition. All other end ports will have -limited membership. +The default partition has P_Key value 0x7fff. OpenSM\'s port will always +have full membership in default partition. All other end ports will have +full membership if the partition configuration file is not found or cannot +be accessed, or limited membership if the file exists and can be accessed +but there is no rule for the Default partition. 
+
+Effectively, this amounts to the same as if one of the following rules
+below appear in the partition configuration file.
+
+In the case of no rule for the Default partition:
+
+Default=0x7fff : ALL=limited, SELF=full ;
+
+In the case of no partition configuration file or file cannot be accessed:
+
+Default=0x7fff : ALL=full ;
+
 File Format

From rpearson at systemfabricworks.com Mon Sep 14 09:49:15 2009
From: rpearson at systemfabricworks.com (Robert Pearson)
Date: Mon, 14 Sep 2009 11:49:15 -0500
Subject: [ofa-general] bug in ibv_rc_pingpong.c and ibv_uc_pingpong.c in ofed_1_4/libibverbs.git
Message-ID: <023301ca355b$4b86c520$e2944f60$@com>

The following fragment occurs in both programs:

                case 's':
                        size = strtol(optarg, NULL, 0);
                        break;

                case 'm':
                        mtu = pp_mtu_to_enum(strtol(optarg, NULL, 0));
                        if (mtu < 0) {
                                usage(argv[0]);
                                return 1;
                        }

                case 'r':
                        rx_depth = strtol(optarg, NULL, 0);
                        break;

Case m is missing a handy break statement. Setting mtu also sets rx_depth.
Can workaround by following -m by -r which resets rx_depth.

From rdreier at cisco.com Mon Sep 14 09:52:10 2009
From: rdreier at cisco.com (Roland Dreier)
Date: Mon, 14 Sep 2009 09:52:10 -0700
Subject: [ofa-general] bug in ibv_rc_pingpong.c and ibv_uc_pingpong.c in ofed_1_4/libibverbs.git
In-Reply-To: <023301ca355b$4b86c520$e2944f60$@com> (Robert Pearson's message of "Mon, 14 Sep 2009 11:49:15 -0500")
References: <023301ca355b$4b86c520$e2944f60$@com>
Message-ID:

 > Case m is missing a handy break statement. Setting mtu also sets rx_depth.
 > Can workaround by following -m by -r which resets rx_depth.

Yes, looks that way. Care to send a patch?

 - R.
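
For reference, the fall-through reported in this thread can be sketched as a small standalone C helper with the missing break added. This is only an illustration under stated assumptions: struct pp_opts, apply_option() and demo_mtu_to_enum() are hypothetical names that do not appear in libibverbs, the MTU codes 1..5 are assumed for the demo, and the real examples handle these options inline in main() using optarg.

```c
#include <stdlib.h>

/* Hypothetical container for the three options discussed in the thread. */
struct pp_opts {
	long size;      /* -s: message size */
	int  mtu;       /* -m: MTU code, negative if the value is invalid */
	long rx_depth;  /* -r: receive queue depth */
};

/* Illustrative stand-in for pp_mtu_to_enum(); the codes 1..5 are an assumption. */
static int demo_mtu_to_enum(long mtu)
{
	switch (mtu) {
	case 256:  return 1;
	case 512:  return 2;
	case 1024: return 3;
	case 2048: return 4;
	case 4096: return 5;
	default:   return -1;
	}
}

/* Apply one already-parsed option. Returns 0 on success, 1 on a usage error. */
static int apply_option(struct pp_opts *o, int c, const char *arg)
{
	switch (c) {
	case 's':
		o->size = strtol(arg, NULL, 0);
		break;

	case 'm':
		o->mtu = demo_mtu_to_enum(strtol(arg, NULL, 0));
		if (o->mtu < 0)
			return 1;
		break;  /* without this break, control falls into case 'r' */

	case 'r':
		o->rx_depth = strtol(arg, NULL, 0);
		break;

	default:
		return 1;
	}
	return 0;
}
```

With the break in place, apply_option(&o, 'm', "2048") changes only o.mtu. Without it, the same call would also reparse "2048" as the receive depth, which matches the reported symptom that setting the MTU also sets rx_depth.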
From rpearson at systemfabricworks.com Mon Sep 14 09:59:23 2009
From: rpearson at systemfabricworks.com (Robert Pearson)
Date: Mon, 14 Sep 2009 11:59:23 -0500
Subject: [ofa-general] bug in ibv_rc_pingpong.c and ibv_uc_pingpong.c in ofed_1_4/libibverbs.git
In-Reply-To:
References: <023301ca355b$4b86c520$e2944f60$@com>
Message-ID: <023a01ca355c$b61010d0$22303270$@com>

Will do.

-----Original Message-----
From: Roland Dreier [mailto:rdreier at cisco.com]
Sent: Monday, September 14, 2009 11:52 AM
To: Robert Pearson
Cc: general at lists.openfabrics.org
Subject: Re: [ofa-general] bug in ibv_rc_pingpong.c and ibv_uc_pingpong.c in ofed_1_4/libibverbs.git

 > Case m is missing a handy break statement. Setting mtu also sets rx_depth.
 > Can workaround by following -m by -r which resets rx_depth.

Yes, looks that way. Care to send a patch?

 - R.

From rpearson at systemfabricworks.com Mon Sep 14 10:39:20 2009
From: rpearson at systemfabricworks.com (Robert Pearson)
Date: Mon, 14 Sep 2009 12:39:20 -0500
Subject: [ofa-general] [PATCH] bug in ibv_rc_pingpong.c and ibv_uc_pingpong.c in ofed_1_4/libibverbs.git
Message-ID: <025801ca3562$4a8d93e0$dfa8bba0$@com>

Here you go.
Fix missing break statements in rc_pingpong.c and uc_pingpong.c Signed-off-by: Bob Pearson --- diff --git a/examples/rc_pingpong.c b/examples/rc_pingpong.c index 26fa45c..d4115e4 100644 --- a/examples/rc_pingpong.c +++ b/examples/rc_pingpong.c @@ -558,6 +558,7 @@ int main(int argc, char *argv[]) usage(argv[0]); return 1; } + break; case 'r': rx_depth = strtol(optarg, NULL, 0); diff --git a/examples/uc_pingpong.c b/examples/uc_pingpong.c index c09c8c1..404b059 100644 --- a/examples/uc_pingpong.c +++ b/examples/uc_pingpong.c @@ -546,6 +546,7 @@ int main(int argc, char *argv[]) usage(argv[0]); return 1; } + break; case 'r': rx_depth = strtol(optarg, NULL, 0); From weiny2 at llnl.gov Mon Sep 14 10:48:34 2009 From: weiny2 at llnl.gov (Ira Weiny) Date: Mon, 14 Sep 2009 10:48:34 -0700 Subject: [ofa-general] Re: [PATCH v2] infiniband-diags/scripts: Add 'ibcheckspeed' and 'ibcheckportspeed' to scripts In-Reply-To: References: <829ded920909100602h78614ac0jd4eb1ee8d7a3779b@mail.gmail.com> <20090910090213.6888b7d5.weiny2@llnl.gov> <829ded920909132320m7e33d79cx959fba7e6addb513@mail.gmail.com> Message-ID: <20090914104834.1da077f6.weiny2@llnl.gov> On Mon, 14 Sep 2009 09:19:06 -0400 Hal Rosenstock wrote: > On Mon, Sep 14, 2009 at 2:20 AM, Keshetti Mahesh > wrote: > > > I have a small question. If there are all 5 Gbps (maximum supported > > speed) ports > > except one with 10 Gbps in a subnet then what is the expected behavior > > of OpenSM > > while setting active link speed ? > > > It depends on the peer port and the link between them. > > > > Does OpenSM force the port with 10 > > Gbps to operate > > at 5 Gbps or not ? > > > > SM (including OpenSM) sets PortInfo enabled components based on peer ports' > supported components and link negotiation determines the active components. > > So in the case where one port supports 10 Gbps speed and it's peer port only > supports 5, the SM sets LinkSpeedEnabled components on the peer ports to 5 > Gbps (encoded as 3). 
> In the case where the peer port supports 10 Gbps, it is
> set to 10 Gbps (encoded as 5 or 7 depending on what is supported). The link
> then negotiates to one of the enabled speeds and sets LinkSpeedActive
> accordingly.

Hal is correct. Just to be clear, the examples below are running in a cluster
which I "forced" to SDR using OpenSM's "force_link_speed" option. This just
happened to be the easiest way for me to show you the output of a link which is
running at sub-optimal speeds.

Ira

>
> -- Hal
>
> >
> > --
> > Keshetti Mahesh
> >
> > On Thu, Sep 10, 2009 at 9:32 PM, Ira Weiny wrote:
> > > Also, iblinkinfo will report links which it finds capable of either
> > > faster or wider operation. iblinkinfo checks both ends of the link as Hal
> > > mentions. It reports this with output like.
> > >
> > > Switch 0x0005ad0000092106 Cisco Switch SFS7000D:
> > > ...
> > > 7 8[ ] ==( 4X 2.5 Gbps Active/ LinkUp)==> 8 12[ ] "MT47396 Infiniscale-III Mellanox Technologies" ( Could be 5.0 Gbps)
> > > ...
> > >
> > > Also the portstatus console command in OpenSM will report links which are
> > > running at "reduced speed or width". Although this does not check the
> > > remote port.
> > > > > > OpenSM $ help portstatus > > > portstatus [ca|switch|router] > > > summarize port status > > > [ca|switch|router] -- limit the results to the node type specified > > > OpenSM $ portstatus > > > "ALL" port status: > > > 115 port(s) scanned on 9 nodes in 26 us > > > 85 down > > > 30 active > > > 32 at 4X > > > 22 at 2.5 Gbps > > > 8 at 5.0 Gbps > > > 2 at 10.0 Gbps > > > > > > Possible issues: > > > 2 disabled > > > 0x0008f10400411b18 5 (ISR9024D Voltaire) > > > 0x0005ad0000092106 13 (Cisco Switch SFS7000D) > > > 6 with reduced speed > > > 0x0008f10500200220 33 (Voltaire 4036 - 36 QDR ports switch) > > > 0x0008f10500200220 19 (Voltaire 4036 - 36 QDR ports switch) > > > 0x0005ad0000092106 21 (Cisco Switch SFS7000D) > > > 0x0005ad0000092106 20 (Cisco Switch SFS7000D) > > > 0x0005ad0000092106 9 (Cisco Switch SFS7000D) > > > 0x0005ad0000092106 8 (Cisco Switch SFS7000D) > > > > > > > > > Ira > > > > > > On Thu, 10 Sep 2009 09:23:35 -0400 > > > Hal Rosenstock wrote: > > > > > >> On Thu, Sep 10, 2009 at 9:02 AM, Keshetti Mahesh > > >> wrote: > > >> > > >> > Added 'ibcheckspeed' and 'ibcheckportspeed': Similar to > > >> > 'ibcheckwidth/ibcheckportwidth' in functionality and implementation. > > >> > Reports error/warning messages if the LinkSpeedActive is configured as > > >> > 2.5 Gbps when the LinkSpeedSupported is more than 2.5 Gbps. > > >> > > > >> > > >> ibportstate checks for more than this in terms of speed (and width) > > >> anomalies. > > >> > > >> Would it be better for these scripts to use that tool now ? > > Alternatively, > > >> the additional speed/width anomaly checks could be implemented in these > > >> scripts but it does involve checking the peer port so there's a little > > more > > >> to it. 
> > >> > > >> -- Hal > > >> > > >> > > >> > > > >> > Signed-off-by: Keshetti Mahesh < keshetti.mahesh at gmail.com> > > >> > --- > > >> > infiniband-diags/scripts/ibcheckportspeed.in | 146 > > >> > ++++++++++++++++++++++++++ > > >> > infiniband-diags/scripts/ibcheckportwidth.in | 2 +- > > >> > infiniband-diags/scripts/ibcheckspeed.in | 135 > > >> > ++++++++++++++++++++++++ > > >> > 3 files changed, 282 insertions(+), 1 deletions(-) > > >> > create mode 100644 infiniband-diags/scripts/ibcheckportspeed.in > > >> > create mode 100644 infiniband-diags/scripts/ibcheckspeed.in > > >> > > > >> > > >> > > > > > > > > > -- > > > Ira Weiny > > > Math Programmer/Computer Scientist > > > Lawrence Livermore National Lab > > > 925-423-8008 > > > weiny2 at llnl.gov > > > > > > -- Ira Weiny Math Programmer/Computer Scientist Lawrence Livermore National Lab 925-423-8008 weiny2 at llnl.gov From weiny2 at llnl.gov Mon Sep 14 11:02:21 2009 From: weiny2 at llnl.gov (Ira Weiny) Date: Mon, 14 Sep 2009 11:02:21 -0700 Subject: [ofa-general] Re: [PATCH v2] infiniband-diags/scripts: Add 'ibcheckspeed' and 'ibcheckportspeed' to scripts In-Reply-To: <829ded920909102102o49f037cbhc53a849f1fcfdaa4@mail.gmail.com> References: <829ded920909100602h78614ac0jd4eb1ee8d7a3779b@mail.gmail.com> <20090910090213.6888b7d5.weiny2@llnl.gov> <829ded920909102102o49f037cbhc53a849f1fcfdaa4@mail.gmail.com> Message-ID: <20090914110221.7e33b737.weiny2@llnl.gov> On Fri, 11 Sep 2009 09:32:39 +0530 Keshetti Mahesh wrote: > My badness. I have not used 'iblinkinfo' before. > So, I guess there is no need for the above script. Apart from that, I feel > there should be a program/script which will first scan the fabric to find the > maximum common supported width/speed and then report the warning messages > of the links/ports which are configured with active width/speed less > than the found > value. Is there any tool already exists which does the same ? Not that I know of. 
While I could see the usefulness of such a tool in some environments I have
gone down the path of making the OFED diags more generic and then writing some
wrappers for our local needs. Currently I have a script which runs iblinkinfo
with the "-l" option and then returns total number of links at SDR, DDR, QDR as
well as the number of links at 1, 4, or 12X. I then leave it up to the sys
admin to know if their cluster is homogeneous or heterogeneous and how many
links should be at what speeds. They can then use iblinkinfo to identify which
links are incorrect for their particular installation.

Ira

>
> -
> Keshetti Mahesh
>
> On Thu, Sep 10, 2009 at 9:32 PM, Ira Weiny wrote:
> > Also, iblinkinfo will report links which it finds capable of either faster or wider operation. iblinkinfo checks both ends of the link as Hal mentions. It reports this with output like.
> >
> > Switch 0x0005ad0000092106 Cisco Switch SFS7000D:
> > ...
> > 7 8[ ] ==( 4X 2.5 Gbps Active/ LinkUp)==> 8 12[ ] "MT47396 Infiniscale-III Mellanox Technologies" ( Could be 5.0 Gbps)
> > ...
> >
> > Also the portstatus console command in OpenSM will report links which are running at "reduced speed or width". Although this does not check the remote port.
> > > > OpenSM $ help portstatus > > portstatus [ca|switch|router] > > summarize port status > >   [ca|switch|router] -- limit the results to the node type specified > > OpenSM $ portstatus > > "ALL" port status: > >   115 port(s) scanned on 9 nodes in 26 us > >   85 down > >   30 active > >   32 at 4X > >   22 at 2.5 Gbps > >   8 at 5.0 Gbps > >   2 at 10.0 Gbps > > > > Possible issues: > >   2 disabled > >      0x0008f10400411b18 5 (ISR9024D Voltaire) > >      0x0005ad0000092106 13 (Cisco Switch SFS7000D) > >   6 with reduced speed > >      0x0008f10500200220 33 (Voltaire 4036 - 36 QDR ports switch) > >      0x0008f10500200220 19 (Voltaire 4036 - 36 QDR ports switch) > >      0x0005ad0000092106 21 (Cisco Switch SFS7000D) > >      0x0005ad0000092106 20 (Cisco Switch SFS7000D) > >      0x0005ad0000092106 9 (Cisco Switch SFS7000D) > >      0x0005ad0000092106 8 (Cisco Switch SFS7000D) > > > > > > Ira > > > > On Thu, 10 Sep 2009 09:23:35 -0400 > > Hal Rosenstock wrote: > > > >> On Thu, Sep 10, 2009 at 9:02 AM, Keshetti Mahesh > >> wrote: > >> > >> > Added 'ibcheckspeed' and 'ibcheckportspeed': Similar to > >> > 'ibcheckwidth/ibcheckportwidth' in functionality and implementation. > >> > Reports error/warning messages if the LinkSpeedActive is configured as > >> > 2.5 Gbps when the LinkSpeedSupported is more than 2.5 Gbps. > >> > > >> > >> ibportstate checks for more than this in terms of speed (and width) > >> anomalies. > >> > >> Would it be better for these scripts to use that tool now ? Alternatively, > >> the additional speed/width anomaly checks could be implemented in these > >> scripts but it does involve checking the peer port so there's a little more > >> to it. 
> >> > >> -- Hal > >> > >> > >> > > >> > Signed-off-by: Keshetti Mahesh < keshetti.mahesh at gmail.com> > >> > --- > >> >  infiniband-diags/scripts/ibcheckportspeed.in |  146 > >> > ++++++++++++++++++++++++++ > >> >  infiniband-diags/scripts/ibcheckportwidth.in |    2 +- > >> >  infiniband-diags/scripts/ibcheckspeed.in     |  135 > >> > ++++++++++++++++++++++++ > >> >  3 files changed, 282 insertions(+), 1 deletions(-) > >> >  create mode 100644 infiniband-diags/scripts/ibcheckportspeed.in > >> >  create mode 100644 infiniband-diags/scripts/ibcheckspeed.in > >> > > >> > >> > > > > > > -- > > Ira Weiny > > Math Programmer/Computer Scientist > > Lawrence Livermore National Lab > > 925-423-8008 > > weiny2 at llnl.gov > > -- Ira Weiny Math Programmer/Computer Scientist Lawrence Livermore National Lab 925-423-8008 weiny2 at llnl.gov From rdreier at cisco.com Mon Sep 14 11:20:41 2009 From: rdreier at cisco.com (Roland Dreier) Date: Mon, 14 Sep 2009 11:20:41 -0700 Subject: [ofa-general] Re: [PATCH] Do not use enum object types for bitfields In-Reply-To: <20090730225816.GA32073@obsidianresearch.com> (Jason Gunthorpe's message of "Thu, 30 Jul 2009 16:58:16 -0600") References: <20090730225816.GA32073@obsidianresearch.com> Message-ID: OK, let's try this out and see how it goes... applied, thanks. From rdreier at cisco.com Mon Sep 14 11:23:27 2009 From: rdreier at cisco.com (Roland Dreier) Date: Mon, 14 Sep 2009 11:23:27 -0700 Subject: [ofa-general] Re: [PATCH mthca] Update function prototypes to match ibverbs In-Reply-To: <20090723160429.GC22400@obsidianresearch.com> (Jason Gunthorpe's message of "Thu, 23 Jul 2009 10:04:29 -0600") References: <20090723160429.GC22400@obsidianresearch.com> Message-ID: thanks, applied both this and mlx4 version. seems that libcxgb3, libipathverbs and libnes would want similar treatment? - R. 
From rdreier at cisco.com Mon Sep 14 11:25:38 2009 From: rdreier at cisco.com (Roland Dreier) Date: Mon, 14 Sep 2009 11:25:38 -0700 Subject: [ofa-general] Re: [PATCH] bug in ibv_rc_pingpong.c and ibv_uc_pingpong.c in ofed_1_4/libibverbs.git In-Reply-To: <025801ca3562$4a8d93e0$dfa8bba0$@com> (Robert Pearson's message of "Mon, 14 Sep 2009 12:39:20 -0500") References: <025801ca3562$4a8d93e0$dfa8bba0$@com> Message-ID: your mailer seems to have mangled whitespace but I applied it by hand, thanks. From andy.grover at oracle.com Mon Sep 14 11:28:48 2009 From: andy.grover at oracle.com (Andy Grover) Date: Mon, 14 Sep 2009 11:28:48 -0700 Subject: [ofa-general] Fedora 10 OFED support plans In-Reply-To: <200909131229.29887.jackm@dev.mellanox.co.il> References: <4A8E4854.2060909@ncsa.uiuc.edu> <20090910193454.GB7552@obsidianresearch.com> <4AA95AF8.9020905@ncsa.uiuc.edu> <200909131229.29887.jackm@dev.mellanox.co.il> Message-ID: <4AAE8B60.2060001@oracle.com> Jack Morgenstein wrote: > That said, I notice that there is a bug in rds on our Fedora Core11 (2.6.29-based). > Andy, you need to put rds_to_2_6_28.patch into a 2.6.29 kernel_patches/backports directory. > There isn't one yet. > > I can do this for you, but I suggest a rename to rds_to_2_6_29.patch for the file (in this directory > only). > Summary: I will create kernel_patches/backport/2.6.29 directory, and put rds_to_2_6_29.patch inside. > rds_to_2_6_29.patch is identical to rds_to_2_6_28.patch > NOTE: as of now, there is no need for a separate 2.6.29_FC11 backports directory. If the need > arises, we will create it. > > If you ACK, I will take care of it. Sounds great, ACK, thanks. 
Regards -- Andy

From jgunthorpe at obsidianresearch.com Mon Sep 14 11:35:36 2009
From: jgunthorpe at obsidianresearch.com (Jason Gunthorpe)
Date: Mon, 14 Sep 2009 12:35:36 -0600
Subject: [ofa-general] Re: [PATCH mthca] Update function prototypes to match ibverbs
In-Reply-To:
References: <20090723160429.GC22400@obsidianresearch.com>
Message-ID: <20090914183536.GB25981@obsidianresearch.com>

On Mon, Sep 14, 2009 at 11:23:27AM -0700, Roland Dreier wrote:
> thanks, applied both this and mlx4 version. seems that libcxgb3,
> libipathverbs and libnes would want similar treatment?

Hmm, yes..

Say, have you thought much about bringing libibverbs, libmlx4,
libmthca, libcxgb3, libnes and libibcm into a single
package/repository, with a single configure script, etc?

I have to say the current arrangement seems to be a big pain for
everyone. It is a PITA to develop on all of them, and PITA to compile
all of them..

The rpms/debs can still come out as separate libs like they should but
keeping the source together would be really nice - little things like
using shave in configure would apply more broadly/etc.

Plus I think such a thing would help reduce the need for OFED if
installation from source was easier! :)

If you are interested I would be happy to throw up a prototype git
repository.

Regards,
Jason

From Rafael.Tinoco at Sun.COM Mon Sep 14 11:47:15 2009
From: Rafael.Tinoco at Sun.COM (Rafael David Tinoco)
Date: Mon, 14 Sep 2009 15:47:15 -0300
Subject: [ofa-general] Has anyone seen these errors in openmpi running linpack with big N problem sizes ?
Message-ID: <00e201ca356b$c88a4c80$599ee580$%Tinoco@Sun.COM>

This seems to be happening only with a high number of nodes and big problem sizes (16G per host).

-----Original Message-----
From: Rafael David Tinoco [mailto:Rafael.Tinoco at Sun.COM]
Sent: Monday, September 14, 2009 3:06 PM
To: 'Rafael David Tinoco'
Subject: Has anyone seen these errors in openmpi running linpack with big N problem sizes ?
Hello,

I'm getting weird MPI problems running linpack with big N problem sizes. Has anyone seen this ?

I'm running 2.6.18-128.el5 + OFED 1.4.1 on all my nodes, testing with linpack on 85 nodes so far.
My nodes are in one C48 in mesh topology.

Tks

Tinoco

--------------------------------------------------------------------------
The InfiniBand retry count between two MPI processes has been
exceeded. "Retry count" is defined in the InfiniBand spec 1.2
(section 12.7.38):

    The total number of times that the sender wishes the receiver to
    retry timeout, packet sequence, etc. errors before posting a
    completion error.

This error typically means that there is something awry within the
InfiniBand fabric itself. You should note the hosts on which this
error has occurred; it has been observed that rebooting or removing a
particular host from the job can sometimes resolve this issue.

Two MCA parameters can be used to control Open MPI's behavior with
respect to the retry count:

* btl_openib_ib_retry_count - The number of times the sender will
  attempt to retry (defaulted to 7, the maximum value).
* btl_openib_ib_timeout - The local ACK timeout parameter (defaulted
  to 10). The actual timeout value used is calculated as:

     4.096 microseconds * (2^btl_openib_ib_timeout)

  See the InfiniBand spec 1.2 (section 12.7.34) for more details.

Below is some information about the host that raised the error and the
peer to which it was connected:

  Local host:   b03n10
  Local device: mlx4_0
  Peer host:    b02n05

You may need to consult with your system administrator to get this
problem fixed.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec has exited due to process rank 499 with PID 5570 on
node b03n10 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpiexec (as reported here).
-------------------------------------------------------------------------- [[59611,1],499][btl_openib_component.c:2929:handle_wc] from b03n10 to: b02n05 error polling LP CQ with status RETRY EXCEEDED ERROR status number 12 for wr_id 348242776 opcode 0 vendor error 129 qp_idx 3 Rafael David Tinoco - Sun Microsystems Systems Engineer - High Performance Computing Rafael.Tinoco at Sun.COM - 55.11.5187.2194 From vst at vlnb.net Mon Sep 14 11:51:11 2009 From: vst at vlnb.net (Vladislav Bolkhovitin) Date: Mon, 14 Sep 2009 22:51:11 +0400 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4A9FA945.4070408@vlnb.net> <4AA4F561.504@vlnb.net> Message-ID: <4AAE909F.6030202@vlnb.net> Chris Worley, on 09/11/2009 11:50 PM wrote: > I've definitely removed the switch/firmware from being the cause. > > I'm thinking the reason you can't repeat the test may be latency > related. We get ~50usecs average latency (on small block sizes), > which can't be achieved using regular SSD's (and rotating drives are > nowhere close). Maybe a ramdisk would help repeat the issue. I think you should try to reproduce the problem with ramdisk or nullio. By so you will eliminate possible influence of the SSD backend. Vlad From hal.rosenstock at gmail.com Mon Sep 14 11:51:03 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Mon, 14 Sep 2009 14:51:03 -0400 Subject: [ofa-general] Re: [PATCH v2] infiniband-diags/scripts: Add 'ibcheckspeed' and 'ibcheckportspeed' to scripts In-Reply-To: <20090914110221.7e33b737.weiny2@llnl.gov> References: <829ded920909100602h78614ac0jd4eb1ee8d7a3779b@mail.gmail.com> <20090910090213.6888b7d5.weiny2@llnl.gov> <829ded920909102102o49f037cbhc53a849f1fcfdaa4@mail.gmail.com> <20090914110221.7e33b737.weiny2@llnl.gov> Message-ID: On Mon, Sep 14, 2009 at 2:02 PM, Ira Weiny wrote: > On Fri, 11 Sep 2009 09:32:39 +0530 > Keshetti Mahesh wrote: > > > My badness. I have not used 'iblinkinfo' before. 
> > So, I guess there is no need for the above script. Apart from that, I > feel > > there should be a program/script which will first scan the fabric to find > the > > maximum common supported width/speed and then report the warning messages > > of the links/ports which are configured with active width/speed less > > than the found > > value. Is there any tool already exists which does the same ? > > Not that I know of. > ibportstate does this but is on a per port basis. This could be readily scripted (ad hoc or in tree) for this purpose. -- Hal > > While I could see the usefulness of such a tool in some environments I have > gone down the path of making the OFED diags more generic and then writing > some wrappers for our local needs. Currently I have a script which runs > iblinkinfo with the "-l" option and then returns total number of links at > SDR, DDR, QDR as well as the number of links at 1, 4, or 12X. I then leave > it up to the sys admin to know if their cluster is homo or heterogenious and > how many links should be at what speeds. They can then use iblinkinfo to > identify which links are incorrect for their particular installation. > > Ira > > > > > - > > Keshetti Mahesh > > > > On Thu, Sep 10, 2009 at 9:32 PM, Ira Weiny wrote: > > > Also, iblinkinfo will report links which it finds capable of either > faster or wider operation. iblinkinfo checks both ends of the link as Hal > mentions. It reports this with output like. > > > > > > Switch 0x0005ad0000092106 Cisco Switch SFS7000D: > > > ... > > > 7 8[ ] ==( 4X 2.5 Gbps Active/ LinkUp)==> 8 12[ > ] "MT47396 Infiniscale-III Mellanox Technologies" ( Could be 5.0 Gbps) > > > ... > > > > > > Also the portstatus console command in OpenSM will report links which > are running at "reduced speed or width". Although this does not check the > remote port. 
> > > > > > OpenSM $ help portstatus > > > portstatus [ca|switch|router] > > > summarize port status > > > [ca|switch|router] -- limit the results to the node type specified > > > OpenSM $ portstatus > > > "ALL" port status: > > > 115 port(s) scanned on 9 nodes in 26 us > > > 85 down > > > 30 active > > > 32 at 4X > > > 22 at 2.5 Gbps > > > 8 at 5.0 Gbps > > > 2 at 10.0 Gbps > > > > > > Possible issues: > > > 2 disabled > > > 0x0008f10400411b18 5 (ISR9024D Voltaire) > > > 0x0005ad0000092106 13 (Cisco Switch SFS7000D) > > > 6 with reduced speed > > > 0x0008f10500200220 33 (Voltaire 4036 - 36 QDR ports switch) > > > 0x0008f10500200220 19 (Voltaire 4036 - 36 QDR ports switch) > > > 0x0005ad0000092106 21 (Cisco Switch SFS7000D) > > > 0x0005ad0000092106 20 (Cisco Switch SFS7000D) > > > 0x0005ad0000092106 9 (Cisco Switch SFS7000D) > > > 0x0005ad0000092106 8 (Cisco Switch SFS7000D) > > > > > > > > > Ira > > > > > > On Thu, 10 Sep 2009 09:23:35 -0400 > > > Hal Rosenstock wrote: > > > > > >> On Thu, Sep 10, 2009 at 9:02 AM, Keshetti Mahesh > > >> wrote: > > >> > > >> > Added 'ibcheckspeed' and 'ibcheckportspeed': Similar to > > >> > 'ibcheckwidth/ibcheckportwidth' in functionality and implementation. > > >> > Reports error/warning messages if the LinkSpeedActive is configured > as > > >> > 2.5 Gbps when the LinkSpeedSupported is more than 2.5 Gbps. > > >> > > > >> > > >> ibportstate checks for more than this in terms of speed (and width) > > >> anomalies. > > >> > > >> Would it be better for these scripts to use that tool now ? > Alternatively, > > >> the additional speed/width anomaly checks could be implemented in > these > > >> scripts but it does involve checking the peer port so there's a little > more > > >> to it. 
> > >>
> > >> -- Hal
> > >>
> > >>
> > >> >
> > >> > Signed-off-by: Keshetti Mahesh < keshetti.mahesh at gmail.com>
> > >> > ---
> > >> >  infiniband-diags/scripts/ibcheckportspeed.in |  146
> > >> > ++++++++++++++++++++++++++
> > >> >  infiniband-diags/scripts/ibcheckportwidth.in |    2 +-
> > >> >  infiniband-diags/scripts/ibcheckspeed.in     |  135
> > >> > ++++++++++++++++++++++++
> > >> >  3 files changed, 282 insertions(+), 1 deletions(-)
> > >> >  create mode 100644 infiniband-diags/scripts/ibcheckportspeed.in
> > >> >  create mode 100644 infiniband-diags/scripts/ibcheckspeed.in
> > >> >
> > >>
> > >>
> > >
> > >
> > > --
> > > Ira Weiny
> > > Math Programmer/Computer Scientist
> > > Lawrence Livermore National Lab
> > > 925-423-8008
> > > weiny2 at llnl.gov
> >
> >
>
> --
> Ira Weiny
> Math Programmer/Computer Scientist
> Lawrence Livermore National Lab
> 925-423-8008
> weiny2 at llnl.gov
>

From tziporet at dev.mellanox.co.il Mon Sep 14 12:42:21 2009
From: tziporet at dev.mellanox.co.il (Tziporet Koren)
Date: Mon, 14 Sep 2009 22:42:21 +0300
Subject: [ofa-general] Does the CMA user space support join multicast for IPv6 too?
Message-ID: <4AAE9C9D.3010508@mellanox.co.il>

Hi Sean,

Does rdma_join_multicast support IPv6 addresses?
If yes, starting from which version of librdmacm?
Thanks Tziporet From kliteyn at dev.mellanox.co.il Mon Sep 14 15:25:12 2009 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Tue, 15 Sep 2009 01:25:12 +0300 Subject: [ofa-general] [PATCH] infiniband-diags/scripts: Add ibcheckroutes to scripts In-Reply-To: References: <4AA8E97E.1090109@voltaire.com> <4AACA572.2000603@voltaire.com> <4AAE53EA.1030009@gmail.com> <4AAE5BA3.20101@gmail.com> Message-ID: <4AAEC2C8.6020204@dev.mellanox.co.il> Hal Rosenstock wrote: > > > On Mon, Sep 14, 2009 at 11:05 AM, Eli Dorfman (Voltaire) > > wrote: > > Hal Rosenstock wrote: > > > > > > On Mon, Sep 14, 2009 at 10:32 AM, Eli Dorfman (Voltaire) > > > >> wrote: > > > > Hal Rosenstock wrote: > > > > > > > > > On Sun, Sep 13, 2009 at 3:55 AM, Doron Shoham > > > > > > > > >>> wrote: > > > > > > Hal Rosenstock wrote: > > > > > > > > > > > > On Thu, Sep 10, 2009 at 7:56 AM, Doron Shoham > > > > > > > >> > > > > > > > > >>>> wrote: > > > > > > > > ibcheckroutes validates route between all hosts > in the > > fabric. > > > > This script finds all leaf switches (switches > that are > > > connected to > > > > HCAs) > > > > > > > > > > This script parses the output of ibnetdiscoer. > > > It finds all leaf switches (from the topology file > > > generated by ibnetdiscover). > > > The it checks if a route exists between all leaf switches > > > using ibtracert. > > > > > > > > > Why leaf switches (and not CAs) ? How are they determined > (from the > > > ibnetdiscover output) ? > > > > > > > > How are the leaf switches determined (from core switches) in the > > ibnetdiscover output ? Is it any switch which has an attached CA > versus > > any switch which has no attached CAs ? > yes > > > > > > > > > because there are much less combinations (routes) of leaf > switches > > than CAs. > > > > > > So is the check is that there are routes between all the leaf > switches ? 
> yes > > > > > > And since we assume that opensm routing builds lid matrix > based on > > switch connectivity than if two switches have route between each > > other then all CAs that are connected to them will have route to > > each other. > > > > > > I can't parse this sentence. Also, this should have nothing to do > with > > OpenSM as it is SM independent AFAIT. > since we check routes between leaf switches LIDs, for opensm this > also assures that we have route between CAs that are attached to them. > > > Not quite. It assures there should be a path but not necessarily a route > as all the LFTs are not checked with the CA port LIDs. One such example is ftree routing. First it creates all the CA-2-CA routes, and only after this it creates switch-2-switch routing. So in case of ftree routing, having full connectivity between leaf switches doesn't imply anything w.r.t. CA-2-CA connectivity. -- Yevgeny > -- Hal > > > > > > > > > In ibnetdiscover you can see to which switch (LID) each CA is > connected. > > > > > > Sure. > > > > > > > > > > > > > > > > > > > > > > > CAs or HCAs ? > > > CAs > > > > > > > > What about switch port 0s ? > > > It checks connectivity only between leaf switches (not all > > switches). > > > I assume that traffic is generated only between CAs and > therefor > > > connectivity between other switches (not leaf switches) > does not > > > important. > > > > > > > > > It's important for a couple of reasons: first PMA access on > > switches and > > > secondly it's an IBA requirement although some OpenSM routing > > protocols > > > ignore this. IMO it should be an option (not the default) > to add these > > > LIDs in too to the ones checked. > > > > Ok, we can this option once this patch is applied. > > > > > > I have some other specific comments on the patch. > > > > > > Also it may be better to provide the switch LID(s) from which > PM is > > running to reduce number of tested routes. 
> > > > > > This is in the vein of only checking leaf switch connectivity but > is not > > the IBA general requirement. > > > > -- Hal > > > > > > > > Eli > > > > > > > > > > > > > > > > > > > > > > > > > > and runs ibtracert between them. > > > > When using various routing algorithms (e.g. up-down), > > > > > > > > > > > > With which routing algorithms has this been tried ? > > > I assume that from complexity perspective, the routing > algorithms > > > calculate > > > routes only between leaf switches and not between all CAs. > > > Then it adds one hop for all CAs connected to the leaf > switches. > > > > > > > > > It depends on the routing algorithm (some violate this) but > the basic > > > IBA requirement is: > > > * > > > > > > C14-62.1.4: > > > > > > *From every endport within the subnet, the SM *shall > *provide at least > > > one reversible path to every other endport. > > > > > > -- Hal > > > > > > > > > I've tested it with up-down but it really doesn't > matter which > > > routing algorithm you are using. > > > It just check the routes between leaf switches (and if > the routing > > > algorithm behave as above, it means that it checks all CAs > > > connectivity). > > > > > > > > > > > -- Hal > > > > > > > > > > > > if fabric topology is not suitable there will be no > > > > routes between some nodes. > > > > It reports when the route exists between source and > > > destination LIDs. 
> > > > > > > > Signed-off-by: Doron Shoham > > > > > > >> > > > > > > > > >>>> > > > > > > > > > > > > > > > > > > > > > > > > > > > ------------------------------------------------------------------------ > > > > > > _______________________________________________ > > > general mailing list > > > general at lists.openfabrics.org > > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > To unsubscribe, please visit > > http://openib.org/mailman/listinfo/openib-general > > > > > > > > ------------------------------------------------------------------------ > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From worleys at gmail.com Mon Sep 14 16:03:04 2009 From: worleys at gmail.com (Chris Worley) Date: Mon, 14 Sep 2009 17:03:04 -0600 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: <4AAE909F.6030202@vlnb.net> References: <4AA4F561.504@vlnb.net> <4AAE909F.6030202@vlnb.net> Message-ID: On Mon, Sep 14, 2009 at 12:51 PM, Vladislav Bolkhovitin wrote: > Chris Worley, on 09/11/2009 11:50 PM wrote: >> >> I've definitely removed the switch/firmware from being the cause. >> >> I'm thinking the reason you can't repeat the test may be latency >> related.  We get ~50usecs average latency (on small block sizes), >> which can't be achieved using regular SSD's (and rotating drives are >> nowhere close).  Maybe a ramdisk would help repeat the issue. > > I think you should try to reproduce the problem with ramdisk or nullio. By > so you will eliminate possible influence of the SSD backend. 
W/ 12GB RAM in the target, I created a 7GB ramdisk:

mount -t ramfs -o size=7g ramfs /mnt/
dd if=/dev/zero of=/mnt/foo bs=1024k count=7000
echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk
echo "add ramdisk 2" >/proc/scsi_tgt/groups/Default/devices

Then, on the initiator, I tested it... and it hung during sequential
8KB block reads:

fio --rw=read --bs=8k --numjobs=64 --iodepth=64 --sync=0 --direct=1 --randrepeat=0 \
  --group_reporting --ioengine=libaio --filename=/dev/sde --name=test --loops=10000 --runtime=600

Note that I was running the SM on the target this time too.

Thanks,

Chris

>
> Vlad
>

From Rafael.Tinoco at Sun.COM Mon Sep 14 17:03:40 2009
From: Rafael.Tinoco at Sun.COM (Rafael David Tinoco)
Date: Mon, 14 Sep 2009 21:03:40 -0300
Subject: [ofa-general] OpenSM with MESH topology
Message-ID: <015f01ca3597$fc498190$f4dc84b0$%Tinoco@Sun.COM>

Hello

I'm having some trouble with opensm making my ports congested, running linpack with big problem sizes.
Does anybody have an opensm.conf for MESH topology ?

Tks

Tinoco

From Rafael.Tinoco at Sun.COM Mon Sep 14 18:01:10 2009
From: Rafael.Tinoco at Sun.COM (Rafael David Tinoco)
Date: Mon, 14 Sep 2009 22:01:10 -0300
Subject: [ofa-general] OpenSM with MESH topology
In-Reply-To: <015f01ca3597$fc498190$f4dc84b0$%Tinoco@Sun.COM>
References: <015f01ca3597$fc498190$f4dc84b0$%Tinoco@Sun.COM>
Message-ID: <016501ca35a0$0485af70$0d910e50$%Tinoco@Sun.COM>

Never mind, problem solved.

Tks

-----Original Message-----
From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Rafael David Tinoco
Sent: Monday, September 14, 2009 9:04 PM
To: 'OFED mailing list'
Subject: [ofa-general] OpenSM with MESH topology

Hello

I'm having some trouble with opensm making my ports congested, running linpack with big problem sizes.
Does anybody have an opensm.conf for MESH topology ?
Tks

Tinoco

From bart.vanassche at gmail.com Mon Sep 14 23:10:36 2009
From: bart.vanassche at gmail.com (Bart Van Assche)
Date: Tue, 15 Sep 2009 08:10:36 +0200
Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs
In-Reply-To:
References: <4AA4F561.504@vlnb.net> <4AAE909F.6030202@vlnb.net>
Message-ID:

On Tue, Sep 15, 2009 at 1:03 AM, Chris Worley wrote:
>
> On Mon, Sep 14, 2009 at 12:51 PM, Vladislav Bolkhovitin wrote:
> > Chris Worley, on 09/11/2009 11:50 PM wrote:
> >>
> >> I've definitely removed the switch/firmware from being the cause.
> >>
> >> I'm thinking the reason you can't repeat the test may be latency
> >> related.  We get ~50usecs average latency (on small block sizes),
> >> which can't be achieved using regular SSD's (and rotating drives are
> >> nowhere close).  Maybe a ramdisk would help repeat the issue.
> >
> > I think you should try to reproduce the problem with ramdisk or nullio. By
> > so you will eliminate possible influence of the SSD backend.
>
> W/ 12GB RAM in the target, I created a 7GB ramdisk:
>
> mount -t ramfs -o size=7g ramfs /mnt/
> dd if=/dev/zero of=/mnt/foo bs=1024k count=7000
> echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk
> echo "add ramdisk 2" >/proc/scsi_tgt/groups/Default/devices
>
> Then, on the initiator, I tested it... and it hung during sequential
> 8KB block reads:
>
> fio --rw=read --bs=8k --numjobs=64 --iodepth=64 --sync=0 --direct=1
> --randrepeat=0 \
>   --group_reporting --ioengine=libaio --filename=/dev/sde --name=test
> --loops=10000 --runtime=600
>
> Note that I was running the SM on the target this time too.

Which Linux distro was installed on the initiator and on the target ?
And if applicable, which OFED version ?
Which kernel messages were logged by SRPT around the time the issue
occurred (after having enabled SRPT logging first) ?

Bart.

From rdreier at cisco.com Tue Sep 15 01:27:57 2009
From: rdreier at cisco.com (Roland Dreier)
Date: Tue, 15 Sep 2009 01:27:57 -0700
Subject: [ofa-general] Re: [GIT PULL] please pull ummunotify
In-Reply-To: <20090915155231.DB86.A69D9226@jp.fujitsu.com> (KOSAKI Motohiro's message of "Tue, 15 Sep 2009 16:03:04 +0900 (JST)")
References: <20090911064019.GZ4973@obsidianresearch.com> <20090915155231.DB86.A69D9226@jp.fujitsu.com>
Message-ID:

 > - I guess you have your MPI implementation w/ ummunotify, right?

Yes, Jeff Squyres (cc'ed) has an Open MPI prototype (mercurial tree at
http://bitbucket.org/jsquyres/ummunot/).

 > - I guess you have test several pattern, right?
 >   if so, can we see your test result?

Open MPI has a pretty extensive automated test fabric -- I don't have a
link handy but I believe all the tests that pass with unmodified Open
MPI currently still pass with ummunotify. Maybe Jeff has a link.

 > - I think you can explain your MPI advantage/disadvantage against
 >   current OpenMPI (or mpich et al).

The advantage is as Jeff explained in his blog post
(http://blogs.cisco.com/ciscotalk/performance/comments/better_linux_memory_tracking/),
namely the performance improvement of memory registration caching
without the reliability problems caused by previous approaches to
caching such as trying to hook malloc etc (which are fragile because the
great diversity of MPI-using codes find ways to mess up all previous
userspace-only approaches).

 > - I guess your patch dramatically improve MPI implementation, but
 >   it's not free. it request some limitation to MPI application, right?

Not that I know of, beyond already existing limitations.

 > - I imagine multi thread and fork. Is there another limitation?

There are no new limitations on multi-threaded codes or on use of fork
that I know of.
Of course, buggy code that does something like passing a
buffer to MPI in one thread and then freeing that buffer from another
thread before MPI is done with it is still buggy; but ummunotify
actually increases the ability of the MPI implementation to detect such
bugs and give useful diagnostic information.

 > - In past discussion, you said ummunotify user should not use
 >   multi threading. you also think user should not fork?

I don't recall where I said ummunotify users should not be
multithreaded. I don't know of any problem with that. Also code using
ummunotify can fork -- ummunotify simply does not fix issues with
copy-on-write for buffers that are in use, just as it does not fix
multithreaded code that has a race between using a buffer and freeing
the same buffer.

Hope this clarifies things.

 - Roland

From dorfman.eli at gmail.com Tue Sep 15 02:35:52 2009
From: dorfman.eli at gmail.com (Eli Dorfman (Voltaire))
Date: Tue, 15 Sep 2009 12:35:52 +0300
Subject: [ofa-general] [PATCH] infiniband-diags/scripts: Add ibcheckroutes to scripts
In-Reply-To: <4AAEC2C8.6020204@dev.mellanox.co.il>
References: <4AA8E97E.1090109@voltaire.com> <4AACA572.2000603@voltaire.com> <4AAE53EA.1030009@gmail.com> <4AAE5BA3.20101@gmail.com> <4AAEC2C8.6020204@dev.mellanox.co.il>
Message-ID: <4AAF5FF8.9030105@gmail.com>

Yevgeny Kliteynik wrote:
>>
>> Not quite. It assures there should be a path but not necessarily a
>> route as all the LFTs are not checked with the CA port LIDs.
>
> One such example is ftree routing.
> First it creates all the CA-2-CA routes, and only after this
> it creates switch-2-switch routing. So in case of ftree routing,
> having full connectivity between leaf switches doesn't imply
> anything w.r.t. CA-2-CA connectivity.

If that is the case then I think ftree routing may be dramatically improved.
This was also the implementation for minhop a long time ago and was improved
to calculate minhops between switches and then extend by 1 for all CAs that are
attached to the switch.

We can add a flag to check routes between
  ALL-2-ALL
  CA-2-CA
  SW-2-SW
  LeafSW-2-LeafSW

From vlad at lists.openfabrics.org Tue Sep 15 03:12:08 2009
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Tue, 15 Sep 2009 03:12:08 -0700 (PDT)
Subject: [ofa-general] ofa_1_5_kernel 20090915-0200 daily build status
Message-ID: <20090915101208.763FEE61FB4@openfabrics.org>

This email was generated automatically, please do not reply

git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git
git_branch: ofed_kernel_1_5

Common build parameters:

Passed:
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.24
Passed on i686 with linux-2.6.26
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.27
Passed on x86_64 with linux-2.6.16.60-0.21-smp
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.18-128.el5
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.18-93.el5
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.26
Passed on x86_64 with linux-2.6.24
Passed on x86_64 with linux-2.6.25
Passed on x86_64 with linux-2.6.27
Passed on x86_64 with linux-2.6.9-78.ELsmp
Passed on x86_64 with linux-2.6.9-67.ELsmp
Passed on ia64 with linux-2.6.18
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.22
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.24
Passed on ia64 with linux-2.6.25
Passed on ia64 with linux-2.6.26
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.19

Failed:

From sashak at voltaire.com Tue Sep 15 03:08:13 2009
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 15 Sep 2009 13:08:13 +0300
Subject:
[ofa-general] [PATCH] opensm: use mgrp pointer in port mcm_info In-Reply-To: References: <20090906154901.GF25241@me> Message-ID: <20090915100813.GL17481@me> On 08:45 Mon 14 Sep , Hal Rosenstock wrote: > > Does this mean consolidate_ipv6_snm_req does not work now ? No, it doesn't. As you may remember 'consolidate_ipv6_snm_req' workaround does nothing with MGIDs to MLID mapping, but instead enforces all IPv6 SNM matching requests to join a single multicast group (MGID). Sasha From sashak at voltaire.com Tue Sep 15 03:31:29 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 15 Sep 2009 13:31:29 +0300 Subject: [ofa-general] [PATCH v2] opensm: port object reference in mcm ports list In-Reply-To: <20090907154747.GO25241@me> References: <20090907154747.GO25241@me> Message-ID: <20090915103129.GM17481@me> This adds port object reference to port related structures list in multicast group. In this way we are saving couple of lookups over port guid table and simplifying some interfaces. 
Signed-off-by: Sasha Khapyorsky --- Changes from v1: Unify osm_mgrp_remove_port() and osm_mgrp_add_port() function calls opensm/include/opensm/osm_mcm_port.h | 23 ++++++++----- opensm/include/opensm/osm_multicast.h | 23 +++++++------ opensm/opensm/osm_mcast_mgr.c | 54 ++----------------------------- opensm/opensm/osm_mcm_port.c | 10 +++--- opensm/opensm/osm_multicast.c | 30 +++++++++--------- opensm/opensm/osm_sa.c | 24 +++++++------- opensm/opensm/osm_sa_mcmember_record.c | 26 +++++---------- 7 files changed, 70 insertions(+), 120 deletions(-) diff --git a/opensm/include/opensm/osm_mcm_port.h b/opensm/include/opensm/osm_mcm_port.h index c2b18de..74b6615 100644 --- a/opensm/include/opensm/osm_mcm_port.h +++ b/opensm/include/opensm/osm_mcm_port.h @@ -46,6 +46,7 @@ #include #include #include +#include #ifdef __cplusplus # define BEGIN_C_DECLS extern "C" { @@ -71,6 +72,7 @@ BEGIN_C_DECLS */ typedef struct osm_mcm_port { cl_map_item_t map_item; + osm_port_t *port; ib_gid_t port_gid; uint8_t scope_state; boolean_t proxy_join; @@ -80,6 +82,9 @@ typedef struct osm_mcm_port { * map_item * Map Item for qmap linkage. Must be first element!! * +* port +* Reference to the parent port. +* * port_gid * GID of the member port. * @@ -106,19 +111,19 @@ typedef struct osm_mcm_port { * * SYNOPSIS */ -osm_mcm_port_t *osm_mcm_port_new(IN const ib_gid_t * const p_port_gid, - IN const uint8_t scope_state, - IN const boolean_t proxy_join); +osm_mcm_port_t *osm_mcm_port_new(IN osm_port_t * port, IN ib_member_rec_t *mcmr, + IN boolean_t proxy_join); /* * PARAMETERS -* p_port_gid -* [in] Pointer to the GID of the port to add to the multicast group. +* port +* [in] Pointer to the port object. +* GID of the port to add to the multicast group. 
* -* scope_state -* [in] scope state of the join request +* mcmr +* [in] Pointer to MCMember record of the join request * -* proxy_join -* [in] proxy_join state analyzed from the request +* proxy_join +* [in] proxy_join state analyzed from the request * * RETURN VALUES * Pointer to the allocated and initialized MCM Port object. diff --git a/opensm/include/opensm/osm_multicast.h b/opensm/include/opensm/osm_multicast.h index e6221cb..7fcbb1a 100644 --- a/opensm/include/opensm/osm_multicast.h +++ b/opensm/include/opensm/osm_multicast.h @@ -309,20 +309,21 @@ static inline ib_net16_t osm_mgrp_get_mlid(IN const osm_mgrp_t * const p_mgrp) * SYNOPSIS */ osm_mcm_port_t *osm_mgrp_add_port(osm_subn_t *subn, osm_log_t *log, - IN osm_mgrp_t * mgrp, - IN const ib_gid_t * port_gid, - IN const uint8_t join_state, - IN boolean_t proxy); + IN osm_mgrp_t * mgrp, IN osm_port_t *port, + IN ib_member_rec_t *mcmr, IN boolean_t proxy); /* * PARAMETERS * mgrp * [in] Pointer to an osm_mgrp_t object to initialize. * -* port_gid -* [in] Pointer to the GID of the port to add to the multicast group. +* port +* [in] Pointer to an osm_port_t object * -* join_state -* [in] The join state for this port in the group. +* mcmr +* [in] Pointer to MCMember record received for the join +* +* proxy +* [in] The proxy join state for this port in the group. 
* * RETURN VALUES * IB_SUCCESS @@ -402,9 +403,9 @@ void osm_mgrp_delete_port(IN osm_subn_t * subn, IN osm_log_t * log, * SEE ALSO *********/ -int osm_mgrp_remove_port(osm_subn_t *subn, osm_log_t *log, osm_mgrp_t *mgrp, - osm_mcm_port_t *mcm, uint8_t join_state); -void osm_mgrp_cleanup(osm_subn_t *subn, osm_mgrp_t *mpgr); +int osm_mgrp_remove_port(osm_subn_t * subn, osm_log_t * log, osm_mgrp_t * mgrp, + osm_mcm_port_t * mcm_port, ib_member_rec_t * mcmr); +void osm_mgrp_cleanup(osm_subn_t * subn, osm_mgrp_t * mpgr); END_C_DECLS #endif /* _OSM_MULTICAST_H_ */ diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c index c1d1916..3894677 100644 --- a/opensm/opensm/osm_mcast_mgr.c +++ b/opensm/opensm/osm_mcast_mgr.c @@ -132,7 +132,6 @@ static float osm_mcast_mgr_compute_avg_hops(osm_sm_t * sm, float avg_hops = 0; uint32_t hops = 0; uint32_t num_ports = 0; - const osm_port_t *p_port; const osm_mcm_port_t *p_mcm_port; const cl_qmap_t *p_mcm_tbl; @@ -148,23 +147,7 @@ static float osm_mcast_mgr_compute_avg_hops(osm_sm_t * sm, p_mcm_port != (osm_mcm_port_t *) cl_qmap_end(p_mcm_tbl); p_mcm_port = (osm_mcm_port_t *) cl_qmap_next(&p_mcm_port->map_item)) { - /* - Acquire the port object for this port guid, then create - the new worker object to build the list. 
- */ - p_port = osm_get_port_by_guid(sm->p_subn, - ib_gid_get_guid(&p_mcm_port-> - port_gid)); - - if (!p_port) { - OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A18: " - "No port object for port 0x%016" PRIx64 "\n", - cl_ntoh64(ib_gid_get_guid - (&p_mcm_port->port_gid))); - continue; - } - - hops += osm_switch_get_port_least_hops(p_sw, p_port); + hops += osm_switch_get_port_least_hops(p_sw, p_mcm_port->port); num_ports++; } @@ -190,7 +173,6 @@ static float osm_mcast_mgr_compute_max_hops(osm_sm_t * sm, { uint32_t max_hops = 0; uint32_t hops = 0; - const osm_port_t *p_port; const osm_mcm_port_t *p_mcm_port; const cl_qmap_t *p_mcm_tbl; @@ -206,23 +188,7 @@ static float osm_mcast_mgr_compute_max_hops(osm_sm_t * sm, p_mcm_port != (osm_mcm_port_t *) cl_qmap_end(p_mcm_tbl); p_mcm_port = (osm_mcm_port_t *) cl_qmap_next(&p_mcm_port->map_item)) { - /* - Acquire the port object for this port guid, then create - the new worker object to build the list. - */ - p_port = osm_get_port_by_guid(sm->p_subn, - ib_gid_get_guid(&p_mcm_port-> - port_gid)); - - if (!p_port) { - OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A1A: " - "No port object for port 0x%016" PRIx64 "\n", - cl_ntoh64(ib_gid_get_guid - (&p_mcm_port->port_gid))); - continue; - } - - hops = osm_switch_get_port_least_hops(p_sw, p_port); + hops = osm_switch_get_port_least_hops(p_sw, p_mcm_port->port); if (hops > max_hops) max_hops = hops; } @@ -714,7 +680,6 @@ static ib_api_status_t mcast_mgr_build_spanning_tree(osm_sm_t * sm, osm_mgrp_t * p_mgrp) { const cl_qmap_t *p_mcm_tbl; - const osm_port_t *p_port; const osm_mcm_port_t *p_mcm_port; uint32_t num_ports; cl_qlist_t port_list; @@ -781,23 +746,12 @@ static ib_api_status_t mcast_mgr_build_spanning_tree(osm_sm_t * sm, Acquire the port object for this port guid, then create the new worker object to build the list. 
*/ - p_port = osm_get_port_by_guid(sm->p_subn, - ib_gid_get_guid(&p_mcm_port-> - port_gid)); - if (!p_port) { - OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A09: " - "No port object for port 0x%016" PRIx64 "\n", - cl_ntoh64(ib_gid_get_guid - (&p_mcm_port->port_gid))); - continue; - } - - p_wobj = mcast_work_obj_new(p_port); + p_wobj = mcast_work_obj_new(p_mcm_port->port); if (p_wobj == NULL) { OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A10: " "Insufficient memory to route port 0x%016" PRIx64 "\n", - cl_ntoh64(osm_port_get_guid(p_port))); + cl_ntoh64(osm_port_get_guid(p_mcm_port->port))); continue; } diff --git a/opensm/opensm/osm_mcm_port.c b/opensm/opensm/osm_mcm_port.c index b6b6149..9381bff 100644 --- a/opensm/opensm/osm_mcm_port.c +++ b/opensm/opensm/osm_mcm_port.c @@ -50,17 +50,17 @@ /********************************************************************** **********************************************************************/ -osm_mcm_port_t *osm_mcm_port_new(IN const ib_gid_t * const p_port_gid, - IN const uint8_t scope_state, - IN const boolean_t proxy_join) +osm_mcm_port_t *osm_mcm_port_new(IN osm_port_t *port, IN ib_member_rec_t *mcmr, + IN boolean_t proxy_join) { osm_mcm_port_t *p_mcm; p_mcm = malloc(sizeof(*p_mcm)); if (p_mcm) { memset(p_mcm, 0, sizeof(*p_mcm)); - p_mcm->port_gid = *p_port_gid; - p_mcm->scope_state = scope_state; + p_mcm->port = port; + p_mcm->port_gid = mcmr->port_gid; + p_mcm->scope_state = mcmr->scope_state; p_mcm->proxy_join = proxy_join; } diff --git a/opensm/opensm/osm_multicast.c b/opensm/opensm/osm_multicast.c index bcf8b0b..f548990 100644 --- a/opensm/opensm/osm_multicast.c +++ b/opensm/opensm/osm_multicast.c @@ -131,23 +131,18 @@ static void mgrp_send_notice(osm_subn_t * subn, osm_log_t * log, /********************************************************************** **********************************************************************/ osm_mcm_port_t *osm_mgrp_add_port(IN osm_subn_t * subn, osm_log_t * log, - IN osm_mgrp_t * mgrp, - IN 
const ib_gid_t * port_gid, - IN const uint8_t join_state, - IN boolean_t proxy) + IN osm_mgrp_t * mgrp, osm_port_t *port, + IN ib_member_rec_t *mcmr, IN boolean_t proxy) { - ib_net64_t port_guid; osm_mcm_port_t *mcm_port; cl_map_item_t *prev_item; - uint8_t prev_join_state = 0; + uint8_t prev_join_state = 0, join_state = mcmr->scope_state; uint8_t prev_scope; - mcm_port = osm_mcm_port_new(port_gid, join_state, proxy); + mcm_port = osm_mcm_port_new(port, mcmr, proxy); if (!mcm_port) return NULL; - port_guid = port_gid->unicast.interface_id; - /* prev_item = cl_qmap_insert(...) Pointer to the item in the map with the specified key. If insertion @@ -155,7 +150,7 @@ osm_mcm_port_t *osm_mgrp_add_port(IN osm_subn_t * subn, osm_log_t * log, specified key already exists in the map, the pointer to that item is returned. */ - prev_item = cl_qmap_insert(&mgrp->mcm_port_tbl, port_guid, + prev_item = cl_qmap_insert(&mgrp->mcm_port_tbl, port->guid, &mcm_port->map_item); /* if already exists - revert the insertion and only update join state */ @@ -186,11 +181,11 @@ osm_mcm_port_t *osm_mgrp_add_port(IN osm_subn_t * subn, osm_log_t * log, /********************************************************************** **********************************************************************/ int osm_mgrp_remove_port(osm_subn_t * subn, osm_log_t * log, osm_mgrp_t * mgrp, - osm_mcm_port_t * mcm_port, uint8_t join_state) + osm_mcm_port_t * mcm_port, ib_member_rec_t *mcmr) { int ret; - uint8_t port_join_state; - uint8_t new_join_state; + uint8_t join_state = mcmr->scope_state & 0xf; + uint8_t port_join_state, new_join_state; /* * according to the same o15-0.1.14 we get the stored @@ -207,8 +202,10 @@ int osm_mgrp_remove_port(osm_subn_t * subn, osm_log_t * log, osm_mgrp_t * mgrp, "updating port 0x%" PRIx64 " JoinState 0x%x -> 0x%x\n", cl_ntoh64(mcm_port->port_gid.unicast.interface_id), port_join_state, new_join_state); + mcmr->scope_state = mcm_port->scope_state; ret = 0; } else { + 
mcmr->scope_state = mcm_port->scope_state; cl_qmap_remove_item(&mgrp->mcm_port_tbl, &mcm_port->map_item); OSM_LOG(log, OSM_LOG_DEBUG, "removing port 0x%" PRIx64 "\n", cl_ntoh64(mcm_port->port_gid.unicast.interface_id)); @@ -228,11 +225,14 @@ int osm_mgrp_remove_port(osm_subn_t * subn, osm_log_t * log, osm_mgrp_t * mgrp, void osm_mgrp_delete_port(osm_subn_t * subn, osm_log_t * log, osm_mgrp_t * mgrp, ib_net64_t port_guid) { + ib_member_rec_t mcmrec; cl_map_item_t *item = cl_qmap_get(&mgrp->mcm_port_tbl, port_guid); - if (item != cl_qmap_end(&mgrp->mcm_port_tbl)) + if (item != cl_qmap_end(&mgrp->mcm_port_tbl)) { + mcmrec.scope_state = 0xf; osm_mgrp_remove_port(subn, log, mgrp, (osm_mcm_port_t *) item, - 0xf); + &mcmrec); + } osm_mgrp_cleanup(subn, mgrp); } diff --git a/opensm/opensm/osm_sa.c b/opensm/opensm/osm_sa.c index fcc3f27..02737c2 100644 --- a/opensm/opensm/osm_sa.c +++ b/opensm/opensm/osm_sa.c @@ -995,26 +995,26 @@ int osm_sa_db_file_load(osm_opensm_t * p_osm) if (!p_mgrp) rereg_clients = 1; } else if (p_mgrp && !strncmp(p, "mcm_port", 8)) { - ib_gid_t port_gid; + ib_member_rec_t mcmr; ib_net64_t guid; - uint8_t scope_state; - boolean_t proxy_join; + osm_port_t *port; + boolean_t proxy; PARSE_AHEAD(p, net64, " port_gid=0x", - &port_gid.unicast.prefix); + &mcmr.port_gid.unicast.prefix); PARSE_AHEAD(p, net64, ":0x", - &port_gid.unicast.interface_id); - PARSE_AHEAD(p, net8, " scope_state=0x", &scope_state); + &mcmr.port_gid.unicast.interface_id); + PARSE_AHEAD(p, net8, " scope_state=0x", &mcmr.scope_state); PARSE_AHEAD(p, net8, " proxy_join=0x", &val); - proxy_join = val; + proxy = val; - guid = port_gid.unicast.interface_id; - if (cl_qmap_get(&p_mgrp->mcm_port_tbl, - port_gid.unicast.interface_id) == + guid = mcmr.port_gid.unicast.interface_id; + port = osm_get_port_by_guid(&p_osm->subn, guid); + if (port && + cl_qmap_get(&p_mgrp->mcm_port_tbl, guid) == cl_qmap_end(&p_mgrp->mcm_port_tbl)) osm_mgrp_add_port(&p_osm->subn, &p_osm->log, - p_mgrp, &port_gid, - 
scope_state, proxy_join); + p_mgrp, port, &mcmr, proxy); } else if (!strncmp(p, "Service Record:", 15)) { ib_service_record_t s_rec; uint32_t modified_time, lease_period; diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c index 5fc1064..a51c839 100644 --- a/opensm/opensm/osm_sa_mcmember_record.c +++ b/opensm/opensm/osm_sa_mcmember_record.c @@ -137,6 +137,7 @@ static ib_net16_t get_new_mlid(osm_sa_t * sa, ib_net16_t requested_mlid) requester gids. **********************************************************************/ static ib_api_status_t add_new_mgrp_port(osm_sa_t * sa, IN osm_mgrp_t * p_mgrp, + IN osm_port_t *port, IN ib_member_rec_t * p_recvd_mcmember_rec, IN osm_mad_addr_t * p_mad_addr, @@ -172,10 +173,8 @@ static ib_api_status_t add_new_mgrp_port(osm_sa_t * sa, IN osm_mgrp_t * p_mgrp, "Create new port with proxy_join TRUE\n"); } - *pp_mcmr_port = osm_mgrp_add_port(sa->p_subn, sa->p_log, p_mgrp, - &p_recvd_mcmember_rec->port_gid, - p_recvd_mcmember_rec->scope_state, - proxy_join); + *pp_mcmr_port = osm_mgrp_add_port(sa->p_subn, sa->p_log, p_mgrp, port, + p_recvd_mcmember_rec, proxy_join); if (*pp_mcmr_port == NULL) { OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 1B06: " "osm_mgrp_add_port failed\n"); @@ -410,7 +409,6 @@ static boolean_t validate_modify(IN osm_sa_t * sa, IN osm_mgrp_t * p_mgrp, if the requester GID == PortGID */ res = osm_get_gid_by_mad_addr(sa->p_log, sa->p_subn, p_mad_addr, &request_gid); - if (res != IB_SUCCESS) { OSM_LOG(sa->p_log, OSM_LOG_DEBUG, "Could not find port for requested address\n"); @@ -443,8 +441,7 @@ static boolean_t validate_modify(IN osm_sa_t * sa, IN osm_mgrp_t * p_mgrp, OSM_LOG(sa->p_log, OSM_LOG_DEBUG, "ProxyJoin but port not in partition. stored:" "0x%016" PRIx64 " request:0x%016" PRIx64 "\n", - cl_ntoh64((*pp_mcm_port)->port_gid.unicast. - interface_id), + cl_ntoh64((*pp_mcm_port)->port->guid), cl_ntoh64(p_mad_addr->addr_type.gsi.grh_info. 
src_gid.unicast.interface_id)); return FALSE; @@ -1018,16 +1015,9 @@ static void mcmr_rcv_leave_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) goto Exit; } - /* store state - we'll need it if the port is removed */ - mcmember_rec.scope_state = p_mcm_port->scope_state; - - /* remove port or update join state */ - removed = - osm_mgrp_remove_port(sa->p_subn, sa->p_log, p_mgrp, p_mcm_port, - p_recvd_mcmember_rec->scope_state & 0x0F); - if (!removed) - mcmember_rec.scope_state = p_mcm_port->scope_state; - + /* remove port and/or update join state */ + removed = osm_mgrp_remove_port(sa->p_subn, sa->p_log, p_mgrp, + p_mcm_port, &mcmember_rec); CL_PLOCK_RELEASE(sa->p_lock); /* we can leave if port was deleted from MCG */ @@ -1225,7 +1215,7 @@ static void mcmr_rcv_join_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) } /* create or update existing port (join-state will be updated) */ - status = add_new_mgrp_port(sa, p_mgrp, p_recvd_mcmember_rec, + status = add_new_mgrp_port(sa, p_mgrp, p_port, p_recvd_mcmember_rec, osm_madw_get_mad_addr_ptr(p_madw), &p_mcmr_port); if (status != IB_SUCCESS) { -- 1.6.4.2 From hal.rosenstock at gmail.com Tue Sep 15 04:26:30 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Tue, 15 Sep 2009 07:26:30 -0400 Subject: [ofa-general] [PATCH] opensm: use mgrp pointer in port mcm_info In-Reply-To: <20090915100813.GL17481@me> References: <20090906154901.GF25241@me> <20090915100813.GL17481@me> Message-ID: On Tue, Sep 15, 2009 at 6:08 AM, Sasha Khapyorsky wrote: > On 08:45 Mon 14 Sep , Hal Rosenstock wrote: > > > > Does this mean consolidate_ipv6_snm_req does not work now ? > > No, it doesn't. As you may remember 'consolidate_ipv6_snm_req' > workaround does nothing with MGIDs to MLID mapping, but instead > enforces all IPv6 SNM matching requests to join a single multicast > group (MGID). > Is consolidate_ipv6_snm_req working for you ? -- Hal > > Sasha > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From hnrose at comcast.net Tue Sep 15 04:24:56 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Tue, 15 Sep 2009 07:24:56 -0400 Subject: [ofa-general] [PATCH] perftest: Remove unneeded executable permissions Message-ID: <20090915112456.GB14060@comcast.net> Signed-off-by: Hal Rosenstock --- diff --git a/COPYING b/COPYING old mode 100755 new mode 100644 diff --git a/Makefile b/Makefile old mode 100755 new mode 100644 diff --git a/README b/README old mode 100755 new mode 100644 diff --git a/clock_test.c b/clock_test.c old mode 100755 new mode 100644 diff --git a/get_clock.c b/get_clock.c old mode 100755 new mode 100644 diff --git a/get_clock.h b/get_clock.h old mode 100755 new mode 100644 diff --git a/perftest.spec b/perftest.spec old mode 100755 new mode 100644 diff --git a/rdma_bw.c b/rdma_bw.c old mode 100755 new mode 100644 diff --git a/rdma_lat.c b/rdma_lat.c old mode 100755 new mode 100644 diff --git a/read_bw.c b/read_bw.c old mode 100755 new mode 100644 diff --git a/read_lat.c b/read_lat.c old mode 100755 new mode 100644 diff --git a/send_bw.c b/send_bw.c old mode 100755 new mode 100644 diff --git a/send_lat.c b/send_lat.c old mode 100755 new mode 100644 diff --git a/write_bw.c b/write_bw.c old mode 100755 new mode 100644 diff --git a/write_bw_postlist.c b/write_bw_postlist.c old mode 100755 new mode 100644 diff --git a/write_lat.c b/write_lat.c old mode 100755 new mode 100644 From hnrose at comcast.net Tue Sep 15 04:24:05 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Tue, 15 Sep 2009 07:24:05 -0400 Subject: [ofa-general] [PATCH] perftest: Make rdma_lat, rdma_bw, and clock_test executable names rdma neutral Message-ID: <20090915112405.GA14060@comcast.net> Since rdma_lat and rdma_bw use RDMA CM, they can be used with both IB and iWARP so make their executable names neutral (by removing ib_) IB only tests only require linking with libibverbs Also, spec file change for executable name changes Signed-off-by: Hal Rosenstock --- diff --git 
a/Makefile b/Makefile index 8042531..83c22c3 100755 --- a/Makefile +++ b/Makefile @@ -1,7 +1,8 @@ -TESTS = write_bw_postlist rdma_lat rdma_bw send_lat send_bw write_lat write_bw read_lat read_bw +RDMACM_TESTS = rdma_lat rdma_bw +TESTS = write_bw_postlist send_lat send_bw write_lat write_bw read_lat read_bw UTILS = clock_test -all: ${TESTS} ${UTILS} +all: ${RDMACM_TESTS} ${TESTS} ${UTILS} CFLAGS += -Wall -g -D_GNU_SOURCE -O2 EXTRA_FILES = get_clock.c @@ -10,11 +11,18 @@ EXTRA_HEADERS = get_clock.h LOADLIBES += LDFLAGS += -${TESTS}: LOADLIBES += -libverbs -lrdmacm +${RDMACM_TESTS} ${UTILS}: LOADLIBES += -libverbs -lrdmacm +${TESTS}: LOADLIBES += -libverbs -${TESTS} ${UTILS}: %: %.c ${EXTRA_FILES} ${EXTRA_HEADERS} +${RDMACM_TESTS}: %: %.c ${EXTRA_FILES} ${EXTRA_HEADERS} + $(CC) $(CPPFLAGS) $(CFLAGS) $(LDFLAGS) $< ${EXTRA_FILES} $(LOADLIBES) $(LDLIBS) -o $@ +${TESTS}: %: %.c ${EXTRA_FILES} ${EXTRA_HEADERS} $(CC) $(CPPFLAGS) $(CFLAGS) $(LDFLAGS) $< ${EXTRA_FILES} $(LOADLIBES) $(LDLIBS) -o ib_$@ +${UTILS}: %: %.c ${EXTRA_FILES} ${EXTRA_HEADERS} + $(CC) $(CPPFLAGS) $(CFLAGS) $(LDFLAGS) $< ${EXTRA_FILES} $(LOADLIBES) $(LDLIBS) -o rdma_$@ + clean: - $(foreach fname,${TESTS} ${UTILS}, rm -f ib_${fname}) + $(foreach fname,${RDMACM_TESTS} ${UTILS}, rm -f ${fname}) + $(foreach fname,${TESTS}, rm -f ib_${fname}) .DELETE_ON_ERROR: .PHONY: all clean diff --git a/perftest.spec b/perftest.spec index bd234e1..81ca90a 100755 --- a/perftest.spec +++ b/perftest.spec @@ -23,8 +23,8 @@ export CFLAGS="$RPM_OPT_FLAGS" chmod -x runme %install -install -D -m 0755 ib_rdma_lat $RPM_BUILD_ROOT%{_bindir}/ib_rdma_lat -install -D -m 0755 ib_rdma_bw $RPM_BUILD_ROOT%{_bindir}/ib_rdma_bw +install -D -m 0755 rdma_lat $RPM_BUILD_ROOT%{_bindir}/rdma_lat +install -D -m 0755 rdma_bw $RPM_BUILD_ROOT%{_bindir}/rdma_bw install -D -m 0755 ib_write_lat $RPM_BUILD_ROOT%{_bindir}/ib_write_lat install -D -m 0755 ib_write_bw $RPM_BUILD_ROOT%{_bindir}/ib_write_bw install -D -m 0755 ib_send_lat 
$RPM_BUILD_ROOT%{_bindir}/ib_send_lat @@ -32,7 +32,7 @@ install -D -m 0755 ib_send_bw $RPM_BUILD_ROOT%{_bindir}/ib_send_bw install -D -m 0755 ib_read_lat $RPM_BUILD_ROOT%{_bindir}/ib_read_lat install -D -m 0755 ib_read_bw $RPM_BUILD_ROOT%{_bindir}/ib_read_bw install -D -m 0755 ib_write_bw_postlist $RPM_BUILD_ROOT%{_bindir}/ib_write_bw_postlist -install -D -m 0755 ib_clock_test $RPM_BUILD_ROOT%{_bindir}/ib_clock_test +install -D -m 0755 rdma_clock_test $RPM_BUILD_ROOT%{_bindir}/rdma_clock_test %clean rm -rf ${RPM_BUILD_ROOT} @@ -43,6 +43,8 @@ rm -rf ${RPM_BUILD_ROOT} %_bindir/* %changelog +* Sat Apr 18 2009 - hal.rosenstock at gmail.com +- Change executable names for rdma_lat, rdma_bw, and clock_test * Mon Jul 09 2007 - hvogel at suse.de - Use correct version * Wed Jul 04 2007 - hvogel at suse.de From keshetti.mahesh at gmail.com Tue Sep 15 04:39:29 2009 From: keshetti.mahesh at gmail.com (Keshetti Mahesh) Date: Tue, 15 Sep 2009 17:09:29 +0530 Subject: [ofa-general] [PATCH][TRIVIAL] infiniband-diags/: Cosmetic changes, mostly typos Message-ID: <829ded920909150439h585f05e3hf0174713ecd2faf4@mail.gmail.com> Cosmetic changes Signed-off-by: Keshetti Mahesh < keshetti.mahesh at gmail.com> --- diff --git a/infiniband-diags/scripts/ibcheckwidth.in b/infiniband-diags/scripts/ibcheckwidth.in index 6b723c5..cbb154c 100644 --- a/infiniband-diags/scripts/ibcheckwidth.in +++ b/infiniband-diags/scripts/ibcheckwidth.in @@ -77,7 +77,7 @@ BEGIN { function check_node(lid) { nodechecked=1 - if (system("'$IBPATH'/ibchecknode'"$ca_info"' '$gflags' '$verbose' " lid)) { + if (system("'$IBPATH'/ibchecknode '"$ca_info"' '$gflags' '$verbose' " lid)) { ne++ badnode=1 return @@ -113,7 +113,7 @@ function check_node(lid) } sub("\\(.*\\)", "", port) gsub("[\\[\\]]", "", port) - if (system("'$IBPATH'/ibcheckportwidth'"$ca_info"' '$gflags' '$verbose' " lid " " port)) { + if (system("'$IBPATH'/ibcheckportwidth '"$ca_info"' '$gflags' '$verbose' " lid " " port)) { if (!'$v' && oldlid != lid) { print "# 
Checked " ntype ": nodeguid 0x" nodeguid " with failure" oldlid = lid diff --git a/infiniband-diags/src/ibportstate.c b/infiniband-diags/src/ibportstate.c index 76e74f7..018bc9a 100644 --- a/infiniband-diags/src/ibportstate.c +++ b/infiniband-diags/src/ibportstate.c @@ -349,7 +349,7 @@ int main(int argc, char **argv) get_port_info(&peerportid, data, peerlocalportnum, port_op); if (err < 0) - IBERROR("smp query peer portinfofailed"); + IBERROR("smp query peer portinfo failed"); mad_decode_field(data, IB_PORT_LINK_WIDTH_ENABLED_F, &peerlwe); diff --git a/infiniband-diags/man/ibcheckportwidth.8 b/infiniband-diags/man/ibcheckportwidth.8 index 85c06fc..c368467 100644 --- a/infiniband-diags/man/ibcheckportwidth.8 +++ b/infiniband-diags/man/ibcheckportwidth.8 @@ -4,7 +4,7 @@ ibcheckportwidth \- validate IB port for 1x link width .SH SYNOPSIS -.B ibcheckport +.B ibcheckportwidth [\-h] [\-v] [\-N | \-nocolor] [\-G] [\-C ca_name] [\-P ca_port] [\-t(imeout) timeout_ms] -- 1.6.4.2 -- Keshetti Mahesh From dorfman.eli at gmail.com Tue Sep 15 05:28:51 2009 From: dorfman.eli at gmail.com (Eli Dorfman (Voltaire)) Date: Tue, 15 Sep 2009 15:28:51 +0300 Subject: [ofa-general] Re: [PATCH v2] infiniband-diags/scripts: Add 'ibcheckspeed' and 'ibcheckportspeed' to scripts In-Reply-To: References: <829ded920909100602h78614ac0jd4eb1ee8d7a3779b@mail.gmail.com> <20090910090213.6888b7d5.weiny2@llnl.gov> <829ded920909102102o49f037cbhc53a849f1fcfdaa4@mail.gmail.com> <20090914110221.7e33b737.weiny2@llnl.gov> Message-ID: <4AAF8883.1020900@gmail.com> Hal Rosenstock wrote: > > > On Mon, Sep 14, 2009 at 2:02 PM, Ira Weiny > wrote: > > On Fri, 11 Sep 2009 09:32:39 +0530 > Keshetti Mahesh > wrote: > > > My badness. I have not used 'iblinkinfo' before. > > So, I guess there is no need for the above script. 
Apart from > that, I feel > > there should be a program/script which will first scan the fabric > to find the > > maximum common supported width/speed and then report the warning > messages > > of the links/ports which are configured with active width/speed less > > than the found > > value. Is there any tool already exists which does the same ? > > Not that I know of. > > > ibportstate does this but is on a per port basis. This could be readily > scripted (ad hoc or in tree) for this purpose. > But it would be very slow for large fabrics. I think it would be better to add this option to iblinkinfo code. Also it would be useful to find all ports in Disable state. Eli From jsquyres at cisco.com Tue Sep 15 05:38:08 2009 From: jsquyres at cisco.com (Jeff Squyres) Date: Tue, 15 Sep 2009 08:38:08 -0400 Subject: [ofa-general] Re: [GIT PULL] please pull ummunotify In-Reply-To: <20090915155231.DB86.A69D9226@jp.fujitsu.com> References: <20090911064019.GZ4973@obsidianresearch.com> <20090915155231.DB86.A69D9226@jp.fujitsu.com> Message-ID: <003720AA-6630-4718-B7F9-8392A8422220@cisco.com> On Sep 15, 2009, at 3:03 AM, KOSAKI Motohiro wrote: > - I guess you have your MPI implementation w/ ummunotify, right? > - I guess you have tested several patterns, right? > If so, can we see your test results? > Roland's answers to the rest of these questions were spot-on, so I thought I'd just throw in a quick reply to the above questions: yes, we have a prototype Open MPI implementation with code that uses ummunotify (http://bitbucket.org/jsquyres/ummunot/). I just finished fixing a high-priority (but unrelated) bug in Open MPI, so merging the prototype ummunotify code into the upstream Open MPI repository is now at the top of my priority list. We have done quite a bit of testing with ummunotify, but since the code is not yet in the Open MPI mainline, most of the testing has been manual (not through our automated testing system).
As far as we can tell, everything is working properly with Open MPI + ummunotify. We also anticipate that other MPI implementations will be able to use ummunotify, potentially using Open MPI as a reference ummunotify implementation. FWIW: we went through a bunch of design and implementation iterations with Roland to get code that everyone was happy with: - Roland likes it (and anticipated that the kernel community would be receptive to) - we like it - performs correctly Hope that helps. -- Jeff Squyres jsquyres at cisco.com From keshetti.mahesh at gmail.com Tue Sep 15 05:41:23 2009 From: keshetti.mahesh at gmail.com (Keshetti Mahesh) Date: Tue, 15 Sep 2009 18:11:23 +0530 Subject: [ofa-general] Re: [PATCH v2] infiniband-diags/scripts: Add 'ibcheckspeed' and 'ibcheckportspeed' to scripts In-Reply-To: <4AAF8883.1020900@gmail.com> References: <829ded920909100602h78614ac0jd4eb1ee8d7a3779b@mail.gmail.com> <20090910090213.6888b7d5.weiny2@llnl.gov> <829ded920909102102o49f037cbhc53a849f1fcfdaa4@mail.gmail.com> <20090914110221.7e33b737.weiny2@llnl.gov> <4AAF8883.1020900@gmail.com> Message-ID: <829ded920909150541l5505ac0etf18d1937cb54eff1@mail.gmail.com> > Also it would be useful to find all ports in Disable state. iblinkinfo already does this with '-d' option. 
-- Keshetti Mahesh From dorfman.eli at gmail.com Tue Sep 15 06:04:24 2009 From: dorfman.eli at gmail.com (Eli Dorfman (Voltaire)) Date: Tue, 15 Sep 2009 16:04:24 +0300 Subject: [ofa-general] Re: [PATCH v2] infiniband-diags/scripts: Add 'ibcheckspeed' and 'ibcheckportspeed' to scripts In-Reply-To: <829ded920909150541l5505ac0etf18d1937cb54eff1@mail.gmail.com> References: <829ded920909100602h78614ac0jd4eb1ee8d7a3779b@mail.gmail.com> <20090910090213.6888b7d5.weiny2@llnl.gov> <829ded920909102102o49f037cbhc53a849f1fcfdaa4@mail.gmail.com> <20090914110221.7e33b737.weiny2@llnl.gov> <4AAF8883.1020900@gmail.com> <829ded920909150541l5505ac0etf18d1937cb54eff1@mail.gmail.com> Message-ID: <4AAF90D8.6070403@gmail.com> Keshetti Mahesh wrote: >> Also it would be useful to find all ports in Disable state. > > iblinkinfo already does this with '-d' option. > It shows all the ports that are in Down state - either cable disconnected or port disabled (by ibportstate). Eli From kosaki.motohiro at jp.fujitsu.com Tue Sep 15 00:03:04 2009 From: kosaki.motohiro at jp.fujitsu.com (KOSAKI Motohiro) Date: Tue, 15 Sep 2009 16:03:04 +0900 (JST) Subject: [ofa-general] Re: [GIT PULL] please pull ummunotify In-Reply-To: References: <20090911064019.GZ4973@obsidianresearch.com> Message-ID: <20090915155231.DB86.A69D9226@jp.fujitsu.com> > > > So.. What is the problem with fork? The semantics of what should > > happen seem natural enough to me, the PD doesn't get copied to the > > child, so the MR stays with the parent. COW events on the pinned > > region must be resolved so that the physical page stays with the > > process that has pinned it - the pin is logically released in the > > child because the MR doesn't exist because the PD doesn't exist. > > This is getting away from the problem that ummunotify is solving, but > handling a COW fault generated by the parent by doing the copy in the > child seems like a pretty major, tricky change to make.
The child may > have forked 100 more times in the meantime, meaning we now have to > change 101 memory maps ... the cost of page faults goes through the roof > probably... Ummm... Perhaps my first question was wrong. I don't intend to NAK your patch. I merely want to understand your patch in more detail... OK, let me ask you again in other words. - I guess you have your MPI implementation w/ ummunotify, right? - I guess you have tested several patterns, right? If so, can we see your test results? - I think you can explain the advantages/disadvantages of your MPI against current OpenMPI (or mpich et al). - I guess your patch dramatically improves the MPI implementation, but it's not free. It imposes some limitations on MPI applications, right? - I imagine multi-threading and fork. Are there other limitations? - In a past discussion, you said ummunotify users should not use multi-threading. Do you also think users should not fork? From pavel at ucw.cz Tue Sep 15 04:34:34 2009 From: pavel at ucw.cz (Pavel Machek) Date: Tue, 15 Sep 2009 13:34:34 +0200 Subject: [ofa-general] Re: [GIT PULL] please pull ummunotify In-Reply-To: References: Message-ID: <20090915113434.GF1328@ucw.cz> Hi! > Linus, please consider pulling from > > master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git ummunotify > > This tree is also available from kernel.org mirrors at: > > git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git ummunotify > > This will get "ummunotify," a new character device that allows a > userspace library to register for MMU notifications; this is > particularly useful for MPI implementions (message passing libraries > used in HPC) to be able to keep track of what wacky things consumers > do to their memory mappings.
My colleague Jeff Squyres from the Open > MPI project posted a blog entry about why MPI wants this: > > http://blogs.cisco.com/ciscotalk/performance/comments/better_linux_memory_tracking/ > > His summary of ummunotify: > > "It's elegant, doesn't require strange linker tricks, and seems to > work in all cases. Yay!" > > This code went through several review iterations on lkml and was in > -mm and -next for quite a few weeks. Andrew is OK with merging it (I > think -- Andrew please correct me if I misunderstood you). I don't remember seeing discussion of this on lkml. Yes it is in -next... Basically it allows app to 'trace itself'? ...with interesting mmap() interface, exporting int to userspace, hoping it behaves atomically...? Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html From rdreier at cisco.com Tue Sep 15 07:57:56 2009 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 15 Sep 2009 07:57:56 -0700 Subject: [ofa-general] Re: [GIT PULL] please pull ummunotify In-Reply-To: <20090915113434.GF1328@ucw.cz> (Pavel Machek's message of "Tue, 15 Sep 2009 13:34:34 +0200") References: <20090915113434.GF1328@ucw.cz> Message-ID: > I don't remember seeing discussion of this on lkml. Yes it is in > -next... eg http://lkml.org/lkml/2009/7/31/197 and followups, or search for v2 and earlier patches. > Basically it allows app to 'trace itself'? ...with interesting mmap() > interface, exporting int to userspace, hoping it behaves atomically...? Yes, it allows app to trace what the kernel does to memory mappings. I don't believe there's any real issue to atomicity of mmap'ed memory, since userspace really just tests whether read value is == to old read value or not. - R.
From worleys at gmail.com Tue Sep 15 08:50:55 2009 From: worleys at gmail.com (Chris Worley) Date: Tue, 15 Sep 2009 09:50:55 -0600 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4AA4F561.504@vlnb.net> <4AAE909F.6030202@vlnb.net> Message-ID: On Tue, Sep 15, 2009 at 12:10 AM, Bart Van Assche wrote: > On Tue, Sep 15, 2009 at 1:03 AM, Chris Worley wrote: >> >> On Mon, Sep 14, 2009 at 12:51 PM, Vladislav Bolkhovitin wrote: >> > Chris Worley, on 09/11/2009 11:50 PM wrote: >> >> >> >> I've definitely removed the switch/firmware from being the cause. >> >> >> >> I'm thinking the reason you can't repeat the test may be latency >> >> related.  We get ~50usecs average latency (on small block sizes), >> >> which can't be achieved using regular SSD's (and rotating drives are >> >> nowhere close).  Maybe a ramdisk would help repeat the issue. >> > >> > I think you should try to reproduce the problem with ramdisk or nullio. By >> > so you will eliminate possible influence of the SSD backend. >> >> W/ 12GB RAM in the target, I created a 7GB ramdisk: >> >> mount -t ramfs -o size=7g ramfs /mnt/ >> dd if=/dev/zero of=/mnt/foo bs=1024k count=7000 >> echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk >> echo "add ramdisk 2" >/proc/scsi_tgt/groups/Default/devices >> >> Then, on the initiator, I tested it... and it hung during sequential >> 8KB block reads: >> >> fio --rw=read --bs=8k --numjobs=64 --iodepth=64 --sync=0 --direct=1 >> --randrepeat=0 \ >>   --group_reporting --ioengine=libaio --filename=/dev/sde --name=test >> --loops=10000 --runtime=600 >> >> Note that I was running the SM on the target this time too. > > Which Linux distro was installed on the inititiator and on the target > ? And if applicable, which OFED version ? Which kernel messages were > logged by SRPT around the time the issue occurred (after having > enabled SRPT logging first) ? 
As logging hadn't helped this issue previously, I've not been enabling it. That plus the kernel hacks needed to invoke logging, it's not worth enabling. This was with Ubuntu 8.10, built-in IB on the 2.6.27-14-server kernel. I couldn't get ramdisks working w/ SCST in RHEL5.2. When running: echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk I get the error: dev_vdisk: ***ERROR***: Wrong f_op or FS doesn't have required capabilities ... which doesn't occur in the Ubuntu kernel, so I've been unable to test RHEL kernels w/ ramdisks. In general, this problem occurs w/ 8KB and smaller blocks w/ the Ubuntu kernels, and 2KB and smaller blocks w/ RHEL kernels. Chris > > Bart. > From vst at vlnb.net Tue Sep 15 09:39:06 2009 From: vst at vlnb.net (Vladislav Bolkhovitin) Date: Tue, 15 Sep 2009 20:39:06 +0400 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4AA4F561.504@vlnb.net> <4AAE909F.6030202@vlnb.net> Message-ID: <4AAFC32A.5050003@vlnb.net> Chris Worley, on 09/15/2009 03:03 AM wrote: > On Mon, Sep 14, 2009 at 12:51 PM, Vladislav Bolkhovitin wrote: >> Chris Worley, on 09/11/2009 11:50 PM wrote: >>> I've definitely removed the switch/firmware from being the cause. >>> >>> I'm thinking the reason you can't repeat the test may be latency >>> related. We get ~50usecs average latency (on small block sizes), >>> which can't be achieved using regular SSD's (and rotating drives are >>> nowhere close). Maybe a ramdisk would help repeat the issue. >> I think you should try to reproduce the problem with ramdisk or nullio. By >> so you will eliminate possible influence of the SSD backend. > > W/ 12GB RAM in the target, I created a 7GB ramdisk: > > mount -t ramfs -o size=7g ramfs /mnt/ > dd if=/dev/zero of=/mnt/foo bs=1024k count=7000 > echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk > echo "add ramdisk 2" >/proc/scsi_tgt/groups/Default/devices > > Then, on the initiator, I tested it... 
and it hung during sequential > 8KB block reads: > > fio --rw=read --bs=8k --numjobs=64 --iodepth=64 --sync=0 --direct=1 > --randrepeat=0 \ > --group_reporting --ioengine=libaio --filename=/dev/sde --name=test > --loops=10000 --runtime=600 > > Note that I was running the SM on the target this time too. Should you try then with iSCSI with IPoIB? It will eliminate the SRP stack. > Thanks, > > Chris >> Vlad >> > > ------------------------------------------------------------------------------ > Come build with us! The BlackBerry® Developer Conference in SF, CA > is the only developer event you need to attend this year. Jumpstart your > developing skills, take BlackBerry mobile applications to market and stay > ahead of the curve. Join us from November 9-12, 2009. Register now! > http://p.sf.net/sfu/devconf > _______________________________________________ > Scst-devel mailing list > Scst-devel at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/scst-devel > From vst at vlnb.net Tue Sep 15 09:43:25 2009 From: vst at vlnb.net (Vladislav Bolkhovitin) Date: Tue, 15 Sep 2009 20:43:25 +0400 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4AA4F561.504@vlnb.net> <4AAE909F.6030202@vlnb.net> Message-ID: <4AAFC42D.4030708@vlnb.net> Chris Worley, on 09/15/2009 07:50 PM wrote: > On Tue, Sep 15, 2009 at 12:10 AM, Bart Van Assche > wrote: >> On Tue, Sep 15, 2009 at 1:03 AM, Chris Worley wrote: >>> On Mon, Sep 14, 2009 at 12:51 PM, Vladislav Bolkhovitin wrote: >>>> Chris Worley, on 09/11/2009 11:50 PM wrote: >>>>> I've definitely removed the switch/firmware from being the cause. >>>>> >>>>> I'm thinking the reason you can't repeat the test may be latency >>>>> related. We get ~50usecs average latency (on small block sizes), >>>>> which can't be achieved using regular SSD's (and rotating drives are >>>>> nowhere close). Maybe a ramdisk would help repeat the issue. 
>>>> I think you should try to reproduce the problem with ramdisk or nullio. By >>>> so you will eliminate possible influence of the SSD backend. >>> W/ 12GB RAM in the target, I created a 7GB ramdisk: >>> >>> mount -t ramfs -o size=7g ramfs /mnt/ >>> dd if=/dev/zero of=/mnt/foo bs=1024k count=7000 >>> echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk >>> echo "add ramdisk 2" >/proc/scsi_tgt/groups/Default/devices >>> >>> Then, on the initiator, I tested it... and it hung during sequential >>> 8KB block reads: >>> >>> fio --rw=read --bs=8k --numjobs=64 --iodepth=64 --sync=0 --direct=1 >>> --randrepeat=0 \ >>> --group_reporting --ioengine=libaio --filename=/dev/sde --name=test >>> --loops=10000 --runtime=600 >>> >>> Note that I was running the SM on the target this time too. >> Which Linux distro was installed on the inititiator and on the target >> ? And if applicable, which OFED version ? Which kernel messages were >> logged by SRPT around the time the issue occurred (after having >> enabled SRPT logging first) ? > > As logging hadn't helped this issue previously, I've not been enabling > it. That plus the kernel hacks needed to invoke logging, it's not > worth enabling. > > This was with Ubuntu 8.10, built-in IB on the 2.6.27-14-server kernel. > > I couldn't get ramdisks working w/ SCST in RHEL5.2. When running: > > echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk > > I get the error: > > dev_vdisk: ***ERROR***: Wrong f_op or FS doesn't have required capabilities > > ... which doesn't occur in the Ubuntu kernel, so I've been unable to > test RHEL kernels w/ ramdisks. In general, this problem occurs w/ 8KB > and smaller blocks w/ the Ubuntu kernels, and 2KB and smaller blocks > w/ RHEL kernels. Use ramfs instead. > Chris >> Bart. >>
From weiny2 at llnl.gov Tue Sep 15 09:41:57 2009 From: weiny2 at llnl.gov (Ira Weiny) Date: Tue, 15 Sep 2009 09:41:57 -0700 Subject: [ofa-general] Re: [PATCH v2] infiniband-diags/scripts: Add 'ibcheckspeed' and 'ibcheckportspeed' to scripts In-Reply-To: <4AAF90D8.6070403@gmail.com> References: <829ded920909100602h78614ac0jd4eb1ee8d7a3779b@mail.gmail.com> <20090910090213.6888b7d5.weiny2@llnl.gov> <829ded920909102102o49f037cbhc53a849f1fcfdaa4@mail.gmail.com> <20090914110221.7e33b737.weiny2@llnl.gov> <4AAF8883.1020900@gmail.com> <829ded920909150541l5505ac0etf18d1937cb54eff1@mail.gmail.com> <4AAF90D8.6070403@gmail.com> Message-ID: <20090915094157.1bd6d06b.weiny2@llnl.gov> On Tue, 15 Sep 2009 16:04:24 +0300 "Eli Dorfman (Voltaire)" wrote: > Keshetti Mahesh wrote: > >> Also it would be useful to find all ports in Disable state. > > > > iblinkinfo already does this with '-d' option. > > > It shows all the port that are in Down state - either cable disconnected or port disabled (by ibportstate). > > Eli > The -l option was designed to show all the info for each link on a single line so that the output can be grepped. For example, looking for Disabled links on a 1152-node ftree cluster takes about 3 seconds:

09:37:31 > time ./iblinkinfo -l | grep Disable
real 0m3.041s
user 0m0.026s
sys 0m0.039s

Eli, if you have data showing this to be slow on a larger cluster, I would be interested to know.
Ira -- Ira Weiny Math Programmer/Computer Scientist Lawrence Livermore National Lab 925-423-8008 weiny2 at llnl.gov From worleys at gmail.com Tue Sep 15 09:52:05 2009 From: worleys at gmail.com (Chris Worley) Date: Tue, 15 Sep 2009 10:52:05 -0600 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: <4AAFC32A.5050003@vlnb.net> References: <4AA4F561.504@vlnb.net> <4AAE909F.6030202@vlnb.net> <4AAFC32A.5050003@vlnb.net> Message-ID: On Tue, Sep 15, 2009 at 10:39 AM, Vladislav Bolkhovitin wrote: > Chris Worley, on 09/15/2009 03:03 AM wrote: >> >> On Mon, Sep 14, 2009 at 12:51 PM, Vladislav Bolkhovitin >> wrote: >>> >>> Chris Worley, on 09/11/2009 11:50 PM wrote: >>>> >>>> I've definitely removed the switch/firmware from being the cause. >>>> >>>> I'm thinking the reason you can't repeat the test may be latency >>>> related.  We get ~50usecs average latency (on small block sizes), >>>> which can't be achieved using regular SSD's (and rotating drives are >>>> nowhere close).  Maybe a ramdisk would help repeat the issue. >>> >>> I think you should try to reproduce the problem with ramdisk or nullio. >>> By >>> so you will eliminate possible influence of the SSD backend. >> >> W/ 12GB RAM in the target, I created a 7GB ramdisk: >> >> mount -t ramfs -o size=7g ramfs /mnt/ >> dd if=/dev/zero of=/mnt/foo bs=1024k count=7000 >> echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk >> echo "add ramdisk 2" >/proc/scsi_tgt/groups/Default/devices >> >> Then, on the initiator, I tested it... and it hung during sequential >> 8KB block reads: >> >> fio --rw=read --bs=8k --numjobs=64 --iodepth=64 --sync=0 --direct=1 >> --randrepeat=0 \ >>   --group_reporting --ioengine=libaio --filename=/dev/sde --name=test >> --loops=10000 --runtime=600 >> >> Note that I was running the SM on the target this time too. > > Should you try then with iSCSI with IPoIB? It will eliminate the SRP stack. I don't see this issue w/ iSCSI over IPoIB... 
I also don't see the performance. Chris > > >> Thanks, >> >> Chris >>> >>> Vlad >>>
From worleys at gmail.com Tue Sep 15 09:53:21 2009 From: worleys at gmail.com (Chris Worley) Date: Tue, 15 Sep 2009 10:53:21 -0600 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: <4AAFC42D.4030708@vlnb.net> References: <4AA4F561.504@vlnb.net> <4AAE909F.6030202@vlnb.net> <4AAFC42D.4030708@vlnb.net> Message-ID: On Tue, Sep 15, 2009 at 10:43 AM, Vladislav Bolkhovitin wrote: > Chris Worley, on 09/15/2009 07:50 PM wrote: >> >> On Tue, Sep 15, 2009 at 12:10 AM, Bart Van Assche >> wrote: >>> >>> On Tue, Sep 15, 2009 at 1:03 AM, Chris Worley wrote: >>>> >>>> On Mon, Sep 14, 2009 at 12:51 PM, Vladislav Bolkhovitin >>>> wrote: >>>>> >>>>> Chris Worley, on 09/11/2009 11:50 PM wrote: >>>>>> >>>>>> I've definitely removed the switch/firmware from being the cause. >>>>>> >>>>>> I'm thinking the reason you can't repeat the test may be latency >>>>>> related.  We get ~50usecs average latency (on small block sizes), >>>>>> which can't be achieved using regular SSD's (and rotating drives are >>>>>> nowhere close).  Maybe a ramdisk would help repeat the issue. >>>>> >>>>> I think you should try to reproduce the problem with ramdisk or nullio. >>>>> By >>>>> so you will eliminate possible influence of the SSD backend.
>>>> >>>> W/ 12GB RAM in the target, I created a 7GB ramdisk: >>>> >>>> mount -t ramfs -o size=7g ramfs /mnt/ >>>> dd if=/dev/zero of=/mnt/foo bs=1024k count=7000 >>>> echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk >>>> echo "add ramdisk 2" >/proc/scsi_tgt/groups/Default/devices >>>> >>>> Then, on the initiator, I tested it... and it hung during sequential >>>> 8KB block reads: >>>> >>>> fio --rw=read --bs=8k --numjobs=64 --iodepth=64 --sync=0 --direct=1 >>>> --randrepeat=0 \ >>>>  --group_reporting --ioengine=libaio --filename=/dev/sde --name=test >>>> --loops=10000 --runtime=600 >>>> >>>> Note that I was running the SM on the target this time too. >>> >>> Which Linux distro was installed on the inititiator and on the target >>> ? And if applicable, which OFED version ? Which kernel messages were >>> logged by SRPT around the time the issue occurred (after having >>> enabled SRPT logging first) ? >> >> As logging hadn't helped this issue previously, I've not been enabling >> it.  That plus the kernel hacks needed to invoke logging, it's not >> worth enabling. >> >> This was with Ubuntu 8.10, built-in IB on the 2.6.27-14-server kernel. >> >> I couldn't get ramdisks working w/ SCST in RHEL5.2.  When running: >> >> echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk >> >> I get the error: >> >> dev_vdisk: ***ERROR***: Wrong f_op or FS doesn't have required >> capabilities >> >> ... which doesn't occur in the Ubuntu kernel, so I've been unable to >> test RHEL kernels w/ ramdisks.  In general, this problem occurs w/ 8KB >> and smaller blocks w/ the Ubuntu kernels, and 2KB and smaller blocks >> w/ RHEL kernels. > > Use ramfs instead. Do you mean: mount -t ramfs -o size=7g ramfs /mnt/ ? That's what I'm doing. Chris > >> Chris >>> >>> Bart. >>>
From vst at vlnb.net Tue Sep 15 09:57:56 2009 From: vst at vlnb.net (Vladislav Bolkhovitin) Date: Tue, 15 Sep 2009 20:57:56 +0400 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4AA4F561.504@vlnb.net> <4AAE909F.6030202@vlnb.net> <4AAFC42D.4030708@vlnb.net> Message-ID: <4AAFC794.7090205@vlnb.net> Chris Worley, on 09/15/2009 08:53 PM wrote: > On Tue, Sep 15, 2009 at 10:43 AM, Vladislav Bolkhovitin wrote: >> Chris Worley, on 09/15/2009 07:50 PM wrote: >>> On Tue, Sep 15, 2009 at 12:10 AM, Bart Van Assche >>> wrote: >>>> On Tue, Sep 15, 2009 at 1:03 AM, Chris Worley wrote: >>>>> On Mon, Sep 14, 2009 at 12:51 PM, Vladislav Bolkhovitin >>>>> wrote: >>>>>> Chris Worley, on 09/11/2009 11:50 PM wrote: >>>>>>> I've definitely removed the switch/firmware from being the cause. >>>>>>> >>>>>>> I'm thinking the reason you can't repeat the test may be latency >>>>>>> related. We get ~50usecs average latency (on small block sizes), >>>>>>> which can't be achieved using regular SSD's (and rotating drives are >>>>>>> nowhere close). Maybe a ramdisk would help repeat the issue. >>>>>> I think you should try to reproduce the problem with ramdisk or nullio. >>>>>> By >>>>>> so you will eliminate possible influence of the SSD backend.
>>>>> W/ 12GB RAM in the target, I created a 7GB ramdisk: >>>>> >>>>> mount -t ramfs -o size=7g ramfs /mnt/ >>>>> dd if=/dev/zero of=/mnt/foo bs=1024k count=7000 >>>>> echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk >>>>> echo "add ramdisk 2" >/proc/scsi_tgt/groups/Default/devices >>>>> >>>>> Then, on the initiator, I tested it... and it hung during sequential >>>>> 8KB block reads: >>>>> >>>>> fio --rw=read --bs=8k --numjobs=64 --iodepth=64 --sync=0 --direct=1 >>>>> --randrepeat=0 \ >>>>> --group_reporting --ioengine=libaio --filename=/dev/sde --name=test >>>>> --loops=10000 --runtime=600 >>>>> >>>>> Note that I was running the SM on the target this time too. >>>> Which Linux distro was installed on the inititiator and on the target >>>> ? And if applicable, which OFED version ? Which kernel messages were >>>> logged by SRPT around the time the issue occurred (after having >>>> enabled SRPT logging first) ? >>> As logging hadn't helped this issue previously, I've not been enabling >>> it. That plus the kernel hacks needed to invoke logging, it's not >>> worth enabling. >>> >>> This was with Ubuntu 8.10, built-in IB on the 2.6.27-14-server kernel. >>> >>> I couldn't get ramdisks working w/ SCST in RHEL5.2. When running: >>> >>> echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk >>> >>> I get the error: >>> >>> dev_vdisk: ***ERROR***: Wrong f_op or FS doesn't have required >>> capabilities >>> >>> ... which doesn't occur in the Ubuntu kernel, so I've been unable to >>> test RHEL kernels w/ ramdisks. In general, this problem occurs w/ 8KB >>> and smaller blocks w/ the Ubuntu kernels, and 2KB and smaller blocks >>> w/ RHEL kernels. >> Use ramfs instead. > > Do you mean: > > mount -t ramfs -o size=7g ramfs /mnt/ You should then create a file on it and use it. > ? > > That's what I'm doing. > > Chris >>> Chris >>>> Bart. >>>> >> > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general >
From worleys at gmail.com Tue Sep 15 10:01:02 2009 From: worleys at gmail.com (Chris Worley) Date: Tue, 15 Sep 2009 11:01:02 -0600 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: <4AAFC794.7090205@vlnb.net> References: <4AAE909F.6030202@vlnb.net> <4AAFC42D.4030708@vlnb.net> <4AAFC794.7090205@vlnb.net> Message-ID: On Tue, Sep 15, 2009 at 10:57 AM, Vladislav Bolkhovitin wrote: > Chris Worley, on 09/15/2009 08:53 PM wrote: >> >> On Tue, Sep 15, 2009 at 10:43 AM, Vladislav Bolkhovitin >> wrote: >>> >>> Chris Worley, on 09/15/2009 07:50 PM wrote: >>>> >>>> On Tue, Sep 15, 2009 at 12:10 AM, Bart Van Assche >>>> wrote: >>>>> >>>>> On Tue, Sep 15, 2009 at 1:03 AM, Chris Worley >>>>> wrote: >>>>>> >>>>>> On Mon, Sep 14, 2009 at 12:51 PM, Vladislav Bolkhovitin >>>>>> wrote: >>>>>>> >>>>>>> Chris Worley, on 09/11/2009 11:50 PM wrote: >>>>>>>> >>>>>>>> I've definitely removed the switch/firmware from being the cause. >>>>>>>> >>>>>>>> I'm thinking the reason you can't repeat the test may be latency >>>>>>>> related.  We get ~50usecs average latency (on small block sizes), >>>>>>>> which can't be achieved using regular SSD's (and rotating drives are >>>>>>>> nowhere close).
Maybe a ramdisk would help repeat the issue. >>>>>>> >>>>>>> I think you should try to reproduce the problem with ramdisk or >>>>>>> nullio. >>>>>>> By >>>>>>> so you will eliminate possible influence of the SSD backend. >>>>>> >>>>>> W/ 12GB RAM in the target, I created a 7GB ramdisk: >>>>>> >>>>>> mount -t ramfs -o size=7g ramfs /mnt/ >>>>>> dd if=/dev/zero of=/mnt/foo bs=1024k count=7000 >>>>>> echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk >>>>>> echo "add ramdisk 2" >/proc/scsi_tgt/groups/Default/devices >>>>>> >>>>>> Then, on the initiator, I tested it... and it hung during sequential >>>>>> 8KB block reads: >>>>>> >>>>>> fio --rw=read --bs=8k --numjobs=64 --iodepth=64 --sync=0 --direct=1 >>>>>> --randrepeat=0 \ >>>>>>  --group_reporting --ioengine=libaio --filename=/dev/sde --name=test >>>>>> --loops=10000 --runtime=600 >>>>>> >>>>>> Note that I was running the SM on the target this time too. >>>>> >>>>> Which Linux distro was installed on the inititiator and on the target >>>>> ? And if applicable, which OFED version ? Which kernel messages were >>>>> logged by SRPT around the time the issue occurred (after having >>>>> enabled SRPT logging first) ? >>>> >>>> As logging hadn't helped this issue previously, I've not been enabling >>>> it.  That plus the kernel hacks needed to invoke logging, it's not >>>> worth enabling. >>>> >>>> This was with Ubuntu 8.10, built-in IB on the 2.6.27-14-server kernel. >>>> >>>> I couldn't get ramdisks working w/ SCST in RHEL5.2.  When running: >>>> >>>> echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk >>>> >>>> I get the error: >>>> >>>> dev_vdisk: ***ERROR***: Wrong f_op or FS doesn't have required >>>> capabilities >>>> >>>> ... which doesn't occur in the Ubuntu kernel, so I've been unable to >>>> test RHEL kernels w/ ramdisks.  In general, this problem occurs w/ 8KB >>>> and smaller blocks w/ the Ubuntu kernels, and 2KB and smaller blocks >>>> w/ RHEL kernels. >>> >>> Use ramfs instead. 
>> >> Do you mean: >> >> mount -t ramfs -o size=7g ramfs /mnt/ > > You should then create a file on it and use it. That's what I'm doing, I believe. From above: >>>>>> mount -t ramfs -o size=7g ramfs /mnt/ >>>>>> dd if=/dev/zero of=/mnt/foo bs=1024k count=7000 >>>>>> echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk >>>>>> echo "add ramdisk 2" >/proc/scsi_tgt/groups/Default/devices ... but the "open", on RHEL5.2 kernel 2.6.18-92.el5, generates the following kernel messages: dev_vdisk: Registering virtual FILEIO device ramdisk scst: Processing thread started, PID 9629 scst: Processing thread started, PID 9630 scst: Processing thread started, PID 9631 scst: Processing thread started, PID 9632 scst: Processing thread started, PID 9633 dev_vdisk: ***ERROR***: Wrong f_op or FS doesn't have required capabilities scst: ***ERROR***: New device handler's vdisk attach() failed: -22 scst: Processing thread PID 9629 finished scst: Processing thread PID 9630 finished scst: Processing thread PID 9631 finished scst: Processing thread PID 9632 finished scst: Processing thread PID 9633 finished scst: Failed to attach to virtual device ramdisk Chris > >> ? >> >> That's what I'm doing. >> >> Chris >>>> >>>> Chris >>>>> >>>>> Bart. >>>>>
From sean.hefty at intel.com Tue Sep 15 10:08:15 2009 From: sean.hefty at intel.com (Sean Hefty) Date: Tue, 15 Sep 2009 10:08:15 -0700 Subject: [ofa-general] RE: Does the CMA user space support join multicast for IPv6 too? In-Reply-To: <4AAE9C9D.3010508@mellanox.co.il> References: <4AAE9C9D.3010508@mellanox.co.il> Message-ID: >Does rdma_join_multicast supports IPv6 addresses? >If yes from which version on the librdmacm? Hmm... I don't think so. It looks like the librdmacm and rdma_cm kernel modules could support it with a small change. The kernel module calls ip_ib_mc_map() to map IP addresses to MGIDs, which only works with IPv4. Does ipoib map IPv6 multicast addresses to MGIDs directly?
- Sean From vst at vlnb.net Tue Sep 15 10:10:15 2009 From: vst at vlnb.net (Vladislav Bolkhovitin) Date: Tue, 15 Sep 2009 21:10:15 +0400 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4AAE909F.6030202@vlnb.net> <4AAFC42D.4030708@vlnb.net> <4AAFC794.7090205@vlnb.net> Message-ID: <4AAFCA77.6050305@vlnb.net> Chris Worley, on 09/15/2009 09:01 PM wrote: > On Tue, Sep 15, 2009 at 10:57 AM, Vladislav Bolkhovitin wrote: >> Chris Worley, on 09/15/2009 08:53 PM wrote: >>> On Tue, Sep 15, 2009 at 10:43 AM, Vladislav Bolkhovitin >>> wrote: >>>> Chris Worley, on 09/15/2009 07:50 PM wrote: >>>>> On Tue, Sep 15, 2009 at 12:10 AM, Bart Van Assche >>>>> wrote: >>>>>> On Tue, Sep 15, 2009 at 1:03 AM, Chris Worley >>>>>> wrote: >>>>>>> On Mon, Sep 14, 2009 at 12:51 PM, Vladislav Bolkhovitin >>>>>>> wrote: >>>>>>>> Chris Worley, on 09/11/2009 11:50 PM wrote: >>>>>>>>> I've definitely removed the switch/firmware from being the cause. >>>>>>>>> >>>>>>>>> I'm thinking the reason you can't repeat the test may be latency >>>>>>>>> related. We get ~50usecs average latency (on small block sizes), >>>>>>>>> which can't be achieved using regular SSD's (and rotating drives are >>>>>>>>> nowhere close). Maybe a ramdisk would help repeat the issue. >>>>>>>> I think you should try to reproduce the problem with ramdisk or >>>>>>>> nullio. >>>>>>>> By >>>>>>>> so you will eliminate possible influence of the SSD backend. >>>>>>> W/ 12GB RAM in the target, I created a 7GB ramdisk: >>>>>>> >>>>>>> mount -t ramfs -o size=7g ramfs /mnt/ >>>>>>> dd if=/dev/zero of=/mnt/foo bs=1024k count=7000 >>>>>>> echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk >>>>>>> echo "add ramdisk 2" >/proc/scsi_tgt/groups/Default/devices >>>>>>> >>>>>>> Then, on the initiator, I tested it... 
and it hung during sequential >>>>>>> 8KB block reads: >>>>>>> >>>>>>> fio --rw=read --bs=8k --numjobs=64 --iodepth=64 --sync=0 --direct=1 >>>>>>> --randrepeat=0 \ >>>>>>> --group_reporting --ioengine=libaio --filename=/dev/sde --name=test >>>>>>> --loops=10000 --runtime=600 >>>>>>> >>>>>>> Note that I was running the SM on the target this time too. >>>>>> Which Linux distro was installed on the inititiator and on the target >>>>>> ? And if applicable, which OFED version ? Which kernel messages were >>>>>> logged by SRPT around the time the issue occurred (after having >>>>>> enabled SRPT logging first) ? >>>>> As logging hadn't helped this issue previously, I've not been enabling >>>>> it. That plus the kernel hacks needed to invoke logging, it's not >>>>> worth enabling. >>>>> >>>>> This was with Ubuntu 8.10, built-in IB on the 2.6.27-14-server kernel. >>>>> >>>>> I couldn't get ramdisks working w/ SCST in RHEL5.2. When running: >>>>> >>>>> echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk >>>>> >>>>> I get the error: >>>>> >>>>> dev_vdisk: ***ERROR***: Wrong f_op or FS doesn't have required >>>>> capabilities >>>>> >>>>> ... which doesn't occur in the Ubuntu kernel, so I've been unable to >>>>> test RHEL kernels w/ ramdisks. In general, this problem occurs w/ 8KB >>>>> and smaller blocks w/ the Ubuntu kernels, and 2KB and smaller blocks >>>>> w/ RHEL kernels. >>>> Use ramfs instead. >>> Do you mean: >>> >>> mount -t ramfs -o size=7g ramfs /mnt/ >> You should then create a file on it and use it. > > That's what I'm doing, I believe. From above: > >>>>>>> mount -t ramfs -o size=7g ramfs /mnt/ >>>>>>> dd if=/dev/zero of=/mnt/foo bs=1024k count=7000 >>>>>>> echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk >>>>>>> echo "add ramdisk 2" >/proc/scsi_tgt/groups/Default/devices > > ... 
but the "open", on RHEL5.2 kernel 2.6.18-92.el5, generates the > following kernel messages: > > dev_vdisk: Registering virtual FILEIO device ramdisk > scst: Processing thread started, PID 9629 > scst: Processing thread started, PID 9630 > scst: Processing thread started, PID 9631 > scst: Processing thread started, PID 9632 > scst: Processing thread started, PID 9633 > dev_vdisk: ***ERROR***: Wrong f_op or FS doesn't have required capabilities > scst: ***ERROR***: New device handler's vdisk attach() failed: -22 > scst: Processing thread PID 9629 finished > scst: Processing thread PID 9630 finished > scst: Processing thread PID 9631 finished > scst: Processing thread PID 9632 finished > scst: Processing thread PID 9633 finished > scst: Failed to attach to virtual device ramdisk > > Chris >>> ? >>> >>> That's what I'm doing. That's strange. I do this all the time, although not with kernels as old as 2.6.18. >>> Chris >>>>> Chris >>>>>> Bart.
From rdreier at cisco.com Tue Sep 15 10:44:41 2009 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 15 Sep 2009 10:44:41 -0700 Subject: [ofa-general] RE: Does the CMA user space support join multicast for IPv6 too? In-Reply-To: (Sean Hefty's message of "Tue, 15 Sep 2009 10:08:15 -0700") References: <4AAE9C9D.3010508@mellanox.co.il> Message-ID: > Does ipoib map IPv6 multicast addresses to MGIDs directly? No, the IPv6 network stack is responsible for doing the mapping and passing the HW (i.e. IPoIB) address in... cf. ipv6_ib_mc_map() in include/net/if_inet6.h
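For readers following the multicast question, the mapping Roland refers to can be sketched in user space. This is an illustration only: it is written in Python rather than kernel C, and the function name ipv6_mc_to_ipoib_hw is made up for this note. It assumes the 20-byte IPoIB hardware address layout from RFC 4391 (one reserved byte, the 3-byte multicast QPN, then the 16-byte MGID), with the scope and P_Key taken from the interface broadcast address and the low 80 bits taken from the IPv6 group address. Please check it against ipv6_ib_mc_map() in your own kernel tree before relying on it.

```python
import ipaddress


def ipv6_mc_to_ipoib_hw(group: str, broadcast: bytes) -> bytes:
    """Build a 20-byte IPoIB hardware address for an IPv6 multicast group.

    Hypothetical helper for illustration; the layout is assumed from RFC 4391.
    `broadcast` is the 20-byte IPoIB broadcast address of the interface.
    """
    addr = ipaddress.IPv6Address(group)
    if not addr.is_multicast:
        raise ValueError("not an IPv6 multicast address: %s" % group)
    if len(broadcast) != 20:
        raise ValueError("IPoIB broadcast address must be 20 bytes")

    scope = broadcast[5] & 0x0F          # scope bits come from the broadcast MGID

    hw = bytearray(20)
    hw[0] = 0x00                         # reserved
    hw[1:4] = b"\xff\xff\xff"            # multicast QPN
    hw[4] = 0xFF                         # MGID prefix
    hw[5] = 0x10 | scope                 # flags nibble plus scope
    hw[6] = 0x60                         # IPv6 signature, high byte (0x601b)
    hw[7] = 0x1B                         # IPv6 signature, low byte
    hw[8:10] = broadcast[8:10]           # P_Key copied from the broadcast address
    hw[10:20] = addr.packed[6:16]        # low 80 bits of the IPv6 group address
    return bytes(hw)


if __name__ == "__main__":
    # Example broadcast address: default P_Key 0xffff, link-local scope (2).
    bcast = bytes.fromhex("00ffffff" "ff12401b" "ffff" + "00" * 6 + "ffffffff")
    print(ipv6_mc_to_ipoib_hw("ff02::1", bcast).hex())
```

The last 16 bytes of the result are the MGID that a join for that group would use under these assumptions; the IPv4 case differs in the signature (0x401b) and carries only the low 28 bits of the group address.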
cf ipv6_ib_mc_map() in include/net/if_inet6.h From worleys at gmail.com Tue Sep 15 13:51:17 2009 From: worleys at gmail.com (Chris Worley) Date: Tue, 15 Sep 2009 14:51:17 -0600 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: <4AAFCA77.6050305@vlnb.net> References: <4AAE909F.6030202@vlnb.net> <4AAFC42D.4030708@vlnb.net> <4AAFC794.7090205@vlnb.net> <4AAFCA77.6050305@vlnb.net> Message-ID: On Tue, Sep 15, 2009 at 11:10 AM, Vladislav Bolkhovitin wrote: > Chris Worley, on 09/15/2009 09:01 PM wrote: >> >> On Tue, Sep 15, 2009 at 10:57 AM, Vladislav Bolkhovitin >> wrote: >>> >>> Chris Worley, on 09/15/2009 08:53 PM wrote: >>>> >>>> On Tue, Sep 15, 2009 at 10:43 AM, Vladislav Bolkhovitin >>>> wrote: >>>>> >>>>> Chris Worley, on 09/15/2009 07:50 PM wrote: >>>>>> >>>>>> On Tue, Sep 15, 2009 at 12:10 AM, Bart Van Assche >>>>>> wrote: >>>>>>> >>>>>>> On Tue, Sep 15, 2009 at 1:03 AM, Chris Worley >>>>>>> wrote: >>>>>>>> >>>>>>>> On Mon, Sep 14, 2009 at 12:51 PM, Vladislav Bolkhovitin >>>>>>>> >>>>>>>> wrote: >>>>>>>>> >>>>>>>>> Chris Worley, on 09/11/2009 11:50 PM wrote: >>>>>>>>>> >>>>>>>>>> I've definitely removed the switch/firmware from being the cause. >>>>>>>>>> >>>>>>>>>> I'm thinking the reason you can't repeat the test may be latency >>>>>>>>>> related.  We get ~50usecs average latency (on small block sizes), >>>>>>>>>> which can't be achieved using regular SSD's (and rotating drives >>>>>>>>>> are >>>>>>>>>> nowhere close).  Maybe a ramdisk would help repeat the issue. >>>>>>>>> >>>>>>>>> I think you should try to reproduce the problem with ramdisk or >>>>>>>>> nullio. >>>>>>>>> By >>>>>>>>> so you will eliminate possible influence of the SSD backend. 
>>>>>>>> >>>>>>>> W/ 12GB RAM in the target, I created a 7GB ramdisk: >>>>>>>> >>>>>>>> mount -t ramfs -o size=7g ramfs /mnt/ >>>>>>>> dd if=/dev/zero of=/mnt/foo bs=1024k count=7000 >>>>>>>> echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk >>>>>>>> echo "add ramdisk 2" >/proc/scsi_tgt/groups/Default/devices >>>>>>>> >>>>>>>> Then, on the initiator, I tested it... and it hung during sequential >>>>>>>> 8KB block reads: >>>>>>>> >>>>>>>> fio --rw=read --bs=8k --numjobs=64 --iodepth=64 --sync=0 --direct=1 >>>>>>>> --randrepeat=0 \ >>>>>>>>  --group_reporting --ioengine=libaio --filename=/dev/sde --name=test >>>>>>>> --loops=10000 --runtime=600 >>>>>>>> >>>>>>>> Note that I was running the SM on the target this time too. >>>>>>> >>>>>>> Which Linux distro was installed on the inititiator and on the target >>>>>>> ? And if applicable, which OFED version ? Which kernel messages were >>>>>>> logged by SRPT around the time the issue occurred (after having >>>>>>> enabled SRPT logging first) ? >>>>>> >>>>>> As logging hadn't helped this issue previously, I've not been enabling >>>>>> it.  That plus the kernel hacks needed to invoke logging, it's not >>>>>> worth enabling. >>>>>> >>>>>> This was with Ubuntu 8.10, built-in IB on the 2.6.27-14-server kernel. >>>>>> >>>>>> I couldn't get ramdisks working w/ SCST in RHEL5.2.  When running: >>>>>> >>>>>> echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk >>>>>> >>>>>> I get the error: >>>>>> >>>>>> dev_vdisk: ***ERROR***: Wrong f_op or FS doesn't have required >>>>>> capabilities >>>>>> >>>>>> ... which doesn't occur in the Ubuntu kernel, so I've been unable to >>>>>> test RHEL kernels w/ ramdisks.  In general, this problem occurs w/ 8KB >>>>>> and smaller blocks w/ the Ubuntu kernels, and 2KB and smaller blocks >>>>>> w/ RHEL kernels. >>>>> >>>>> Use ramfs instead. >>>> >>>> Do you mean: >>>> >>>> mount -t ramfs -o size=7g ramfs /mnt/ >>> >>> You should then create a file on it and use it. 
>> >> That's what I'm doing, I believe.  From above: >> >>>>>>>> mount -t ramfs -o size=7g ramfs /mnt/ >>>>>>>> dd if=/dev/zero of=/mnt/foo bs=1024k count=7000 >>>>>>>> echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk >>>>>>>> echo "add ramdisk 2" >/proc/scsi_tgt/groups/Default/devices >> >> ... but the "open", on RHEL5.2 kernel 2.6.18-92.el5, generates the >> following kernel messages: >> >> dev_vdisk: Registering virtual FILEIO device ramdisk >> scst: Processing thread started, PID 9629 >> scst: Processing thread started, PID 9630 >> scst: Processing thread started, PID 9631 >> scst: Processing thread started, PID 9632 >> scst: Processing thread started, PID 9633 >> dev_vdisk: ***ERROR***: Wrong f_op or FS doesn't have required >> capabilities >> scst: ***ERROR***: New device handler's vdisk attach() failed: -22 >> scst: Processing thread PID 9629 finished >> scst: Processing thread PID 9630 finished >> scst: Processing thread PID 9631 finished >> scst: Processing thread PID 9632 finished >> scst: Processing thread PID 9633 finished >> scst: Failed to attach to virtual device ramdisk >> >> Chris >>>> >>>> ? >>>> >>>> That's what I'm doing. > > That's strange. I'm doing it all the time, although with not so old kernels > as 2.6.18. In lots of testing today, I've seen this panic twice on the Ubuntu 8.10 targets: [ 330.155992] ib_srpt: disconnected session 0x00247100000000460024710000000046 because a new SRP_LOGIN_REQ has been received. [ 357.207046] ib_srpt: srpt_xmit_response: tag= 17 channel in bad state 2 [ 357.207052] ib_srpt: disconnected session 0x00247100000000460024710000000046 because a new SRP_LOGIN_REQ has been received. 
[ 357.207100] ib_srpt: srpt_xmit_response: tag= 47 channel in bad state 2 [ 357.207104] scst: ***ERROR***: Target driver ib_srpt xmit_response() returned fatal error [ 357.241429] scst: ***ERROR***: Target driver ib_srpt xmit_response() returned fatal error [ 357.250234] ------------[ cut here ]------------ [ 357.250537] ib_srpt: srpt_xmit_response: tag= 26 channel in bad state 2 [ 357.250539] scst: ***ERROR***: Target driver ib_srpt xmit_response() returned fatal error [ 357.250550] ib_srpt: srpt_xmit_response: tag= 38 channel in bad state 2 [ 357.250553] scst: ***ERROR***: Target driver ib_srpt xmit_response() returned fatal error [ 357.250560] ib_srpt: srpt_xmit_response: tag= 27 channel in bad state 2 [ 357.301253] kernel BUG at /root/scst/scst/src/scst_targ.c:3089! [ 357.301253] invalid opcode: 0000 [1] SMP [ 357.301253] CPU 0 ... [ 357.301253] RIP: 0010:[] [] scst_tgt_cmd_done+0x26/0x30 [scst] [ 357.301253] RSP: 0018:ffff88039ad27b50 EFLAGS: 00010297 [ 357.301253] RAX: 0000000000000200 RBX: ffff8803ad9c68f8 RCX: 0000000000000000 [ 357.301253] RDX: 00000000ffffffff RSI: 0000000000000000 RDI: ffff8803ad9c68f8 [ 357.301253] RBP: ffff88039ad27b50 R08: 0000000000000000 R09: 0000000000000000 [ 357.301253] R10: ffff88039ad277c0 R11: ffff88041ad278cf R12: ffff8803c2972180 [ 357.301253] R13: ffff88039ada0000 R14: 0000000000000001 R15: ffff8803fb00c2b0 [ 357.301253] FS: 0000000000000000(0000) GS:ffffffff807dd000(0000) knlGS:0000000000000000 [ 357.301253] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b [ 357.301253] CR2: 00007f9281e64000 CR3: 0000000000201000 CR4: 00000000000006e0 [ 357.301253] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 357.301253] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 357.301253] Process ib_cm/0 (pid: 8299, threadinfo ffff88039ad26000, task ffff88039ad40000) [ 357.301253] Stack: ffff88039ad27b80 ffffffffa04c0c47 ffff88039a8db900 ffff8803c2972180 [ 357.301253] ffff8803fb00c240 ffff8803fb00c284 
ffff88039ad27bc0 ffffffffa04c0d93 [ 357.301253] ffff88042a4959c0 ffff88042a9d7800 ffff88042544da00 ffff88042a9d7898 [ 357.301253] Call Trace: [ 357.301253] [] srpt_abort_scst_cmd+0xd7/0x160 [ib_srpt] [ 357.301253] [] srpt_release_channel+0xc3/0x190 [ib_srpt] [ 357.301253] [] srpt_find_and_release_channel+0x22/0x30 [ib_srpt] [ 357.301253] [] srpt_cm_handler+0x6d/0xbb8 [ib_srpt] [ 357.301253] [] ? try_to_wake_up+0x126/0x2f0 [ 357.301253] [] ? default_wake_function+0xd/0x10 [ 357.301253] [] ? autoremove_wake_function+0x16/0x40 [ 357.301253] [] ? __wake_up_common+0x5a/0x90 [ 357.301253] [] ? __wake_up+0x4e/0x70 [ 357.301253] [] ? __queue_work+0x41/0x50 [ 357.301253] [] ? queue_work_on+0x4d/0x60 [ 357.301253] [] ? queue_work+0x1f/0x30 [ 357.301253] [] ? queue_delayed_work+0x2d/0x40 [ 357.301253] [] ? wait_for_response+0xd5/0xe0 [ib_mad] [ 357.301253] [] cm_process_work+0x27/0x130 [ib_cm] [ 357.301253] [] cm_drep_handler+0xf1/0x180 [ib_cm] [ 357.301253] [] ? cm_work_handler+0x0/0x1b8 [ib_cm] [ 357.301253] [] cm_work_handler+0x105/0x1b8 [ib_cm] [ 357.301253] [] ? cm_work_handler+0x0/0x1b8 [ib_cm] [ 357.301253] [] run_workqueue+0xc2/0x1a0 [ 357.301253] [] worker_thread+0xaf/0x130 [ 357.301253] [] ? autoremove_wake_function+0x0/0x40 [ 357.301253] [] ? worker_thread+0x0/0x130 [ 357.301253] [] kthread+0x4e/0x90 [ 357.301253] [] child_rip+0xa/0x11 [ 357.301253] [] ? kthread+0x0/0x90 [ 357.301253] [] ? 
child_rip+0x0/0x11 [ 357.301253] [ 357.301253] [ 357.301253] Code: 00 00 00 00 00 55 48 89 e5 e8 a7 cc d9 df 83 7f 28 78 75 17 80 67 2d f7 c7 47 28 0d 00 00 00 ba 01 00 00 00 e8 8c fc ff ff c9 c3 <0f> 0b eb fe 66 0f 1f 44 00 00 55 48 89 e5 41 54 53 e8 74 cc d9 [ 357.301253] RIP [] scst_tgt_cmd_done+0x26/0x30 [scst] [ 357.301253] RSP [ 358.302745] ---[ end trace a7f20725e9471e16 ]--- [ 384.258076] ib_srpt: received SRP_LOGIN_REQ with i_port_id 0x24710000000046:0x24710000000046, t_port_id 0x24717124000028:0x24717124000028 and length 260 on port 1 (guid=0xfe80000000000000:0x24717124000029) [ 384.276974] ib_srpt: disconnected session 0x00247100000000460024710000000046 because a new SRP_LOGIN_REQ has been received. [ 384.290055] scst: Using security group "Default" for initiator "0x00247100000000460024710000000046" [ 411.329136] ib_srpt: received SRP_LOGIN_REQ with i_port_id 0x24710000000046:0x24710000000046, t_port_id 0x24717124000028:0x24717124000028 and length 260 on port 1 (guid=0xfe80000000000000:0x24717124000029) [ 411.348297] ib_srpt: srpt_xmit_response: tag= 24 channel in bad state 2 [ 411.348301] ib_srpt: disconnected session 0x00247100000000460024710000000046 because a new SRP_LOGIN_REQ has been received. [ 411.348323] ib_srpt: srpt_xmit_response: tag= 20 channel in bad state 2 [ 411.348326] scst: ***ERROR***: Target driver ib_srpt xmit_response() returned fatal error [ 411.348331] ib_srpt: srpt_xmit_response: tag= 42 channel in bad state 2 [ 411.349705] scst: Using security group "Default" for initiator "0x00247100000000460024710000000046" [ 411.621319] scst: ***ERROR***: Target driver ib_srpt xmit_response() returned fatal error [ 411.629805] ib_srpt: srpt_xmit_response: tag= 38 channel in bad state 2 [ 411.636690] scst: ***ERROR***: Target driver ib_srpt xmit_response() returned fatal error [ 411.636698] ------------[ cut here ]------------ [ 411.636699] WARNING: at /root/scst/srpt/src/ib_srpt.c:924 srpt_abort_scst_cmd+0xac/0x160 [ib_srpt]() ... 
[ 411.636735] [ 411.636736] Call Trace: [ 411.636742] [] warn_on_slowpath+0x64/0x90 [ 411.636747] [] ? free_hot_page+0x10/0x20 [ 411.636750] [] ? __slab_free+0x10/0x120 [ 411.636753] [] ? __phys_addr+0x9/0x50 [ 411.636757] [] ? swiotlb_unmap_sg_attrs+0x9c/0xc0 [ 411.636759] [] srpt_abort_scst_cmd+0xac/0x160 [ib_srpt] [ 411.636762] [] srpt_release_channel+0xc3/0x190 [ib_srpt] [ 411.636764] [] srpt_find_and_release_channel+0x22/0x30 [ib_srpt] [ 411.636766] [] srpt_cm_handler+0x6d/0xbb8 [ib_srpt] [ 411.636769] [] ? try_to_wake_up+0x126/0x2f0 [ 411.636771] [] ? default_wake_function+0xd/0x10 [ 411.636774] [] ? autoremove_wake_function+0x16/0x40 [ 411.636776] [] ? __wake_up_common+0x5a/0x90 [ 411.636778] [] ? __wake_up+0x4e/0x70 [ 411.636781] [] ? __queue_work+0x41/0x50 [ 411.636783] [] ? queue_work_on+0x4d/0x60 [ 411.636784] [] ? queue_work+0x1f/0x30 [ 411.636785] [] ? queue_delayed_work+0x2d/0x40 [ 411.636790] [] ? wait_for_response+0xd5/0xe0 [ib_mad] [ 411.636794] [] cm_process_work+0x27/0x130 [ib_cm] [ 411.636797] [] cm_drep_handler+0xf1/0x180 [ib_cm] [ 411.636799] [] ? cm_work_handler+0x0/0x1b8 [ib_cm] [ 411.636802] [] cm_work_handler+0x105/0x1b8 [ib_cm] [ 411.636804] [] ? cm_work_handler+0x0/0x1b8 [ib_cm] [ 411.636806] [] run_workqueue+0xc2/0x1a0 [ 411.636808] [] worker_thread+0xaf/0x130 [ 411.636809] [] ? autoremove_wake_function+0x0/0x40 [ 411.636811] [] ? worker_thread+0x0/0x130 [ 411.636812] [] kthread+0x4e/0x90 [ 411.636815] [] child_rip+0xa/0x11 [ 411.636816] [] ? kthread+0x0/0x90 [ 411.636817] [] ? child_rip+0x0/0x11 [ 411.636818] [ 411.636819] ---[ end trace a7f20725e9471e16 ]--- [ 411.636838] ------------[ cut here ]------------ [ 411.636839] kernel BUG at /root/scst/scst/src/scst_targ.c:3089! [ 411.636840] invalid opcode: 0000 [2] SMP [ 411.636841] CPU 3 ... 
[ 411.636863] RIP: 0010:[] [] scst_tgt_cmd_done+0x26/0x30 [scst] [ 411.636874] RSP: 0018:ffff88039ad4fb50 EFLAGS: 00010297 [ 411.636875] RAX: 0000000000000200 RBX: ffff880071d34558 RCX: 00000000000213e2 [ 411.636876] RDX: 00000000ffffffff RSI: 0000000000000000 RDI: ffff880071d34558 [ 411.636877] RBP: ffff88039ad4fb50 R08: 0000000000000006 R09: 0000000000000000 [ 411.636878] R10: ffff88039ad4f7c0 R11: ffff88041ad4f8cf R12: ffff88039af49240 [ 411.636879] R13: ffff88039ada0000 R14: 0000000000000001 R15: ffff8803ad8b97f0 [ 411.636881] FS: 0000000000000000(0000) GS:ffff88042e4f7080(0000) knlGS:0000000000000000 [ 411.636882] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b [ 411.636883] CR2: 00007f9281e64000 CR3: 0000000000201000 CR4: 00000000000006e0 [ 411.636884] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 411.636885] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 411.636887] Process ib_cm/3 (pid: 8302, threadinfo ffff88039ad4e000, task ffff88039ad44350) [ 411.636888] Stack: ffff88039ad4fb80 ffffffffa04c0c47 ffff88039a995a80 ffff88039af49240 [ 411.636890] ffff8803ad8b9780 ffff8803ad8b97c4 ffff88039ad4fbc0 ffffffffa04c0d93 [ 411.636891] ffff88042a4959c0 ffff88042896f300 ffff88041801d800 ffff88042896f398 [ 411.636893] Call Trace: [ 411.636895] [] srpt_abort_scst_cmd+0xd7/0x160 [ib_srpt] [ 411.636897] [] srpt_release_channel+0xc3/0x190 [ib_srpt] [ 411.636899] [] srpt_find_and_release_channel+0x22/0x30 [ib_srpt] [ 411.636901] [] srpt_cm_handler+0x6d/0xbb8 [ib_srpt] [ 411.636903] [] ? try_to_wake_up+0x126/0x2f0 [ 411.636905] [] ? default_wake_function+0xd/0x10 [ 411.636906] [] ? autoremove_wake_function+0x16/0x40 [ 411.636908] [] ? __wake_up_common+0x5a/0x90 [ 411.636909] [] ? __wake_up+0x4e/0x70 [ 411.636911] [] ? __queue_work+0x41/0x50 [ 411.636912] [] ? queue_work_on+0x4d/0x60 [ 411.636914] [] ? queue_work+0x1f/0x30 [ 411.636915] [] ? queue_delayed_work+0x2d/0x40 [ 411.636918] [] ? 
wait_for_response+0xd5/0xe0 [ib_mad] [ 411.636921] [] cm_process_work+0x27/0x130 [ib_cm] [ 411.636923] [] cm_drep_handler+0xf1/0x180 [ib_cm] [ 411.636926] [] ? cm_work_handler+0x0/0x1b8 [ib_cm] [ 411.636928] [] cm_work_handler+0x105/0x1b8 [ib_cm] [ 411.636930] [] ? cm_work_handler+0x0/0x1b8 [ib_cm] [ 411.636932] [] run_workqueue+0xc2/0x1a0 [ 411.636934] [] worker_thread+0xaf/0x130 [ 411.636935] [] ? autoremove_wake_function+0x0/0x40 [ 411.636937] [] ? worker_thread+0x0/0x130 [ 411.636938] [] kthread+0x4e/0x90 [ 411.636939] [] child_rip+0xa/0x11 [ 411.636941] [] ? kthread+0x0/0x90 [ 411.636942] [] ? child_rip+0x0/0x11 [ 411.636943] [ 411.636943] [ 411.636943] Code: 00 00 00 00 00 55 48 89 e5 e8 a7 cc d9 df 83 7f 28 78 75 17 80 67 2d f7 c7 47 28 0d 00 00 00 ba 01 00 00 00 e8 8c fc ff ff c9 c3 <0f> 0b eb fe 66 0f 1f 44 00 00 55 48 89 e5 41 54 53 e8 74 cc d9 [ 411.636954] RIP [] scst_tgt_cmd_done+0x26/0x30 [scst] [ 411.636960] RSP [ 411.636966] ---[ end trace a7f20725e9471e16 ]--- [ 412.361070] ib_srpt: srpt_xmit_response: tag= 8 channel in bad state 2 [ 412.367835] scst: ***ERROR***: Target driver ib_srpt xmit_response() returned fatal error [ 412.376335] ib_srpt: srpt_xmit_response: tag= 16 channel in bad state 2 [ 412.383183] scst: ***ERROR***: Target driver ib_srpt xmit_response() returned fatal error [ 412.391677] ib_srpt: srpt_xmit_response: tag= 47 channel in bad state 2 [ 456.831960] ib_srpt: received SRP_LOGIN_REQ with i_port_id 0x24710000000046:0x24710000000046, t_port_id 0x24717124000028:0x24717124000028 and length 260 on port 1 (guid=0xfe80000000000000:0x24717124000029) [ 456.850851] ib_srpt: disconnected session 0x00247100000000460024710000000046 because a new SRP_LOGIN_REQ has been received. 
[ 456.863891] scst: Using security group "Default" for initiator "0x00247100000000460024710000000046" [ 483.902158] ib_srpt: received SRP_LOGIN_REQ with i_port_id 0x24710000000046:0x24710000000046, t_port_id 0x24717124000028:0x24717124000028 and length 260 on port 1 (guid=0xfe80000000000000:0x24717124000029) [ 483.921040] ib_srpt: disconnected session 0x00247100000000460024710000000046 because a new SRP_LOGIN_REQ has been received. [ 483.934101] scst: Using security group "Default" for initiator "0x00247100000000460024710000000046" [ 510.971720] ib_srpt: received SRP_LOGIN_REQ with i_port_id 0x24710000000046:0x24710000000046, t_port_id 0x24717124000028:0x24717124000028 and length 260 on port 1 (guid=0xfe80000000000000:0x24717124000029) [ 510.990587] ib_srpt: disconnected session 0x00247100000000460024710000000046 because a new SRP_LOGIN_REQ has been received. [ 511.003643] scst: Using security group "Default" for initiator "0x00247100000000460024710000000046" [ 543.542817] ib_srpt: received SRP_LOGIN_REQ with i_port_id 0x24710000000055:0x24710000000055, t_port_id 0x24717124000028:0x24717124000028 and length 260 on port 1 (guid=0xfe80000000000000:0x24717124000029) [ 543.563208] scst: Using security group "Default" for initiator "0x00247100000000550024710000000055" [ 544.186067] ib_srpt: received SRP_LOGIN_REQ with i_port_id 0x24710000000046:0x24710000000046, t_port_id 0x24717124000028:0x24717124000028 and length 260 on port 1 (guid=0xfe80000000000000:0x24717124000029) [ 544.204962] ib_srpt: disconnected session 0x00247100000000460024710000000046 because a new SRP_LOGIN_REQ has been received. 
[ 544.218003] scst: Using security group "Default" for initiator "0x00247100000000460024710000000046" [ 571.256063] ib_srpt: received SRP_LOGIN_REQ with i_port_id 0x24710000000046:0x24710000000046, t_port_id 0x24717124000028:0x24717124000028 and length 260 on port 1 (guid=0xfe80000000000000:0x24717124000029) [ 571.274934] ib_srpt: disconnected session 0x00247100000000460024710000000046 because a new SRP_LOGIN_REQ has been received. [ 571.287930] scst: Using security group "Default" for initiator "0x00247100000000460024710000000046" [ 598.330955] ib_srpt: received SRP_LOGIN_REQ with i_port_id 0x24710000000046:0x24710000000046, t_port_id 0x24717124000028:0x24717124000028 and length 260 on port 1 (guid=0xfe80000000000000:0x24717124000029) [ 598.349836] ib_srpt: disconnected session 0x00247100000000460024710000000046 because a new SRP_LOGIN_REQ has been received. [ 598.362854] scst: Using security group "Default" for initiator "0x00247100000000460024710000000046" [ 625.400865] ib_srpt: received SRP_LOGIN_REQ with i_port_id 0x24710000000046:0x24710000000046, t_port_id 0x24717124000028:0x24717124000028 and length 260 on port 1 (guid=0xfe80000000000000:0x24717124000029) [ 625.419753] ib_srpt: disconnected session 0x00247100000000460024710000000046 because a new SRP_LOGIN_REQ has been received. [ 625.432786] scst: Using security group "Default" for initiator "0x00247100000000460024710000000046" [ 652.470722] ib_srpt: received SRP_LOGIN_REQ with i_port_id 0x24710000000046:0x24710000000046, t_port_id 0x24717124000028:0x24717124000028 and length 260 on port 1 (guid=0xfe80000000000000:0x24717124000029) [ 652.489625] ib_srpt: disconnected session 0x00247100000000460024710000000046 because a new SRP_LOGIN_REQ has been received. 
[ 652.502814] scst: Using security group "Default" for initiator "0x00247100000000460024710000000046" [ 679.540621] ib_srpt: received SRP_LOGIN_REQ with i_port_id 0x24710000000046:0x24710000000046, t_port_id 0x24717124000028:0x24717124000028 and length 260 on port 1 (guid=0xfe80000000000000:0x24717124000029) [ 679.559503] ib_srpt: disconnected session 0x00247100000000460024710000000046 because a new SRP_LOGIN_REQ has been received. [ 679.572514] scst: Using security group "Default" for initiator "0x00247100000000460024710000000046" [ 706.610483] ib_srpt: received SRP_LOGIN_REQ with i_port_id 0x24710000000046:0x24710000000046, t_port_id 0x24717124000028:0x24717124000028 and length 260 on port 1 (guid=0xfe80000000000000:0x24717124000029) [ 706.629376] ib_srpt: disconnected session 0x00247100000000460024710000000046 because a new SRP_LOGIN_REQ has been received. [ 706.642369] scst: Using security group "Default" for initiator "0x00247100000000460024710000000046" [ 733.680405] ib_srpt: received SRP_LOGIN_REQ with i_port_id 0x24710000000046:0x24710000000046, t_port_id 0x24717124000028:0x24717124000028 and length 260 on port 1 (guid=0xfe80000000000000:0x24717124000029) [ 733.699243] ib_srpt: disconnected session 0x00247100000000460024710000000046 because a new SRP_LOGIN_REQ has been received. [ 733.712237] scst: Using security group "Default" for initiator "0x00247100000000460024710000000046" [ 760.750170] ib_srpt: received SRP_LOGIN_REQ with i_port_id 0x24710000000046:0x24710000000046, t_port_id 0x24717124000028:0x24717124000028 and length 260 on port 1 (guid=0xfe80000000000000:0x24717124000029) [ 760.769019] ib_srpt: disconnected session 0x00247100000000460024710000000046 because a new SRP_LOGIN_REQ has been received. 
[ 760.782038] scst: Using security group "Default" for initiator "0x00247100000000460024710000000046" [ 800.108971] ib_srpt: received SRP_LOGIN_REQ with i_port_id 0x24710000000046:0x24710000000046, t_port_id 0x24717124000028:0x24717124000028 and length 260 on port 1 (guid=0xfe80000000000000:0x24717124000029) [ 800.127840] ib_srpt: disconnected session 0x00247100000000460024710000000046 because a new SRP_LOGIN_REQ has been received. [ 800.140847] scst: Using security group "Default" for initiator "0x00247100000000460024710000000046" [ 827.178873] ib_srpt: received SRP_LOGIN_REQ with i_port_id 0x24710000000046:0x24710000000046, t_port_id 0x24717124000028:0x24717124000028 and length 260 on port 1 (guid=0xfe80000000000000:0x24717124000029) [ 827.197731] ib_srpt: disconnected session 0x00247100000000460024710000000046 because a new SRP_LOGIN_REQ has been received. [ 827.210717] scst: Using security group "Default" for initiator "0x00247100000000460024710000000046" [ 854.248829] ib_srpt: received SRP_LOGIN_REQ with i_port_id 0x24710000000046:0x24710000000046, t_port_id 0x24717124000028:0x24717124000028 and length 260 on port 1 (guid=0xfe80000000000000:0x24717124000029) [ 854.267662] ib_srpt: disconnected session 0x00247100000000460024710000000046 because a new SRP_LOGIN_REQ has been received. [ 854.280645] scst: Using security group "Default" for initiator "0x00247100000000460024710000000046" [ 881.318906] ib_srpt: received SRP_LOGIN_REQ with i_port_id 0x24710000000046:0x24710000000046, t_port_id 0x24717124000028:0x24717124000028 and length 260 on port 1 (guid=0xfe80000000000000:0x24717124000029) [ 881.337874] ib_srpt: disconnected session 0x00247100000000460024710000000046 because a new SRP_LOGIN_REQ has been received. 
[ 881.351139] scst: Using security group "Default" for initiator "0x00247100000000460024710000000046" [ 893.147951] BUG: unable to handle kernel paging request at ffffe3e00e700cc0 [ 893.155156] IP: [] kfree+0x49/0x100 [ 893.157386] PGD 0 [ 893.157386] Oops: 0000 [3] SMP [ 893.157386] CPU 2 ... [ 893.157386] RIP: 0010:[] [] kfree+0x49/0x100 [ 893.157386] RSP: 0018:ffff88042e4d7ce0 EFLAGS: 00010216 [ 893.157386] RAX: 000001e00e700cc0 RBX: ffffe3e00e700cc0 RCX: 0100000000002081 [ 893.157386] RDX: ffffe20000000000 RSI: ffff88039accdf00 RDI: 000000039c033000 [ 893.157386] RBP: ffff88042e4d7d10 R08: 0000000000000000 R09: 0000000000000001 [ 893.157386] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8803c09d7c20 [ 893.157386] R13: 000000039c033000 R14: ffff88039a108240 R15: ffff880071dc4558 [ 893.157386] FS: 0000000000000000(0000) GS:ffff88042fc02d00(0000) knlGS:0000000000000000 [ 893.157386] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b [ 893.157386] CR2: ffffe3e00e700cc0 CR3: 0000000000201000 CR4: 00000000000006e0 [ 893.157386] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 893.157386] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 893.157386] Process swapper (pid: 0, threadinfo ffff88042e4d2000, task ffff88042e4cace0) [ 893.157386] Stack: ffff88042e4d7cf0 0000000000000001 ffff8803c09d7c20 ffff88039accdf00 [ 893.157386] ffff88039a108240 ffff880071dc4558 ffff88042e4d7d80 ffffffffa04bf301 [ 893.157386] 0000000000000000 ffffffff802e2430 ffffe20001c77100 0000000000000202 [ 893.157386] Call Trace: [ 893.157386] [] srpt_reset_ioctx+0x41/0xf0 [ib_srpt] [ 893.157386] [] ? __slab_free+0x10/0x120 [ 893.157386] [] ? scst_destroy_put_cmd+0x47/0x80 [scst] [ 893.157386] [] srpt_on_free_cmd+0x7d/0xa0 [ib_srpt] [ 893.157386] [] scst_free_cmd+0x39/0x1a0 [scst] [ 893.157386] [] scst_process_active_cmd+0x144d/0x1bd0 [scst] [ 893.157386] [] ? 
srpt_completion+0x4e/0x230 [ib_srpt] [ 893.157386] [] scst_do_job_active+0x6d/0x90 [scst] [ 893.157386] [] scst_cmd_tasklet+0x27/0x40 [scst] [ 893.157386] [] ? mlx4_eq_int+0x7d/0x2a0 [mlx4_core] [ 893.157386] [] tasklet_action+0x86/0x110 [ 893.157386] [] __do_softirq+0x8c/0x100 [ 893.157386] [] call_softirq+0x1c/0x30 [ 893.157386] [] do_softirq+0x65/0xa0 [ 893.157386] [] irq_exit+0x95/0xa0 [ 893.157386] [] do_IRQ+0x8b/0x100 [ 893.157386] [] ret_from_intr+0x0/0x29 [ 893.157386] [] ? mwait_idle+0x52/0x60 [ 893.157386] [] ? mwait_idle+0x9/0x60 [ 893.157386] [] ? cpu_idle+0x75/0x110 [ 893.157386] [] ? start_secondary+0x97/0xc2 [ 893.157386] [ 893.157386] [ 893.157386] Code: 01 f3 ff 48 83 ff 10 49 89 fd 0f 86 85 00 00 00 e8 0d 1b f5 ff 48 c1 e8 0c 48 ba 00 00 00 00 00 e2 ff ff 48 c1 e0 06 48 8d 1c 10 <48> 8b 03 f6 c4 40 74 07 48 8b 5b 10 48 8b 03 84 c0 0f 89 87 00 [ 893.157386] RIP [] kfree+0x49/0x100 [ 893.157386] RSP [ 893.157386] CR2: ffffe3e00e700cc0 [ 893.157386] ---[ end trace a7f20725e9471e16 ]--- [ 893.533612] Kernel panic - not syncing: Aiee, killing interrupt handler! [ 893.540006] ------------[ cut here ]------------ [ 893.540617] WARNING: at /build/buildd/linux-2.6.27/kernel/smp.c:332 smp_call_function_mask+0x22c/0x240() ... [ 893.630002] [ 893.630002] Call Trace: [ 893.632644] [] warn_on_slowpath+0x64/0x90 [ 893.640002] [] ? __enqueue_entity+0x93/0xa0 [ 893.643556] [] ? enqueue_entity+0xd9/0x260 [ 893.650002] [] ? enqueue_task_fair+0x59/0x60 [ 893.660002] [] ? enqueue_task+0x50/0x60 [ 893.661747] [] ? resched_task+0x2d/0x90 [ 893.670002] [] smp_call_function_mask+0x22c/0x240 [ 893.672916] [] ? stop_this_cpu+0x0/0x40 [ 893.680003] [] ? mutex_unlock+0x9/0x20 [ 893.690003] [] ? crash_kexec+0x74/0x100 [ 893.690515] [] ? autoremove_wake_function+0x16/0x40 [ 893.700002] [] ? __wake_up_common+0x5a/0x90 [ 893.703245] [] smp_call_function+0x20/0x30 [ 893.710002] [] native_smp_send_stop+0x28/0x60 [ 893.720003] [] panic+0xb4/0x177 [ 893.721449] [] ? 
__blocking_notifier_call_chain+0x21/0x90 [ 893.730002] [] ? blocking_notifier_call_chain+0x16/0x20 [ 893.740002] [] do_exit+0x3f5/0x410 [ 893.740638] [] oops_begin+0x0/0xb0 [ 893.750002] [] do_page_fault+0x270/0x790 [ 893.751374] [] ? get_page_from_freelist+0x2a6/0x380 [ 893.760003] [] ? ktime_get_ts+0x61/0x70 [ 893.764197] [] ? sched_clock_cpu+0xcc/0x160 [ 893.770002] [] error_exit+0x0/0x70 [ 893.780003] [] ? kfree+0x49/0x100 [ 893.780898] [] srpt_reset_ioctx+0x41/0xf0 [ib_srpt] [ 893.790003] [] ? __slab_free+0x10/0x120 [ 893.793126] [] ? scst_destroy_put_cmd+0x47/0x80 [scst] [ 893.800002] [] srpt_on_free_cmd+0x7d/0xa0 [ib_srpt] [ 893.810002] [] scst_free_cmd+0x39/0x1a0 [scst] [ 893.812078] [] scst_process_active_cmd+0x144d/0x1bd0 [scst] [ 893.820002] [] ? srpt_completion+0x4e/0x230 [ib_srpt] [ 893.830005] [] scst_do_job_active+0x6d/0x90 [scst] [ 893.833429] [] scst_cmd_tasklet+0x27/0x40 [scst] [ 893.840003] [] ? mlx4_eq_int+0x7d/0x2a0 [mlx4_core] [ 893.850003] [] tasklet_action+0x86/0x110 [ 893.852828] [] __do_softirq+0x8c/0x100 [ 893.860004] [] call_softirq+0x1c/0x30 [ 893.864467] [] do_softirq+0x65/0xa0 [ 893.870005] [] irq_exit+0x95/0xa0 [ 893.880004] [] do_IRQ+0x8b/0x100 [ 893.880448] [] ret_from_intr+0x0/0x29 [ 893.890003] [] ? mwait_idle+0x52/0x60 [ 893.893008] [] ? mwait_idle+0x9/0x60 [ 893.900003] [] ? cpu_idle+0x75/0x110 [ 893.902676] [] ? start_secondary+0x97/0xc2 [ 893.910004] [ 893.913956] ---[ end trace a7f20725e9471e16 ]--- This may have been due to low memory, as I was using most target memory for the ramdisk. 
I do seem to be able to push the issue harder with: fio --rw=randrw --bs=1k --numjobs=64 --iodepth=64 --sync=0 \ --direct=1 --randrepeat=0 --group_reporting --ioengine=libaio \ --filename=/dev/sdp --name=test --loops=10000 --runtime=1600 \ --rwmixread=100 Chris From valdes at anl.gov Tue Sep 15 15:46:43 2009 From: valdes at anl.gov (John Valdes) Date: Tue, 15 Sep 2009 17:46:43 -0500 Subject: [ofa-general] SRP on RHEL 5.3/OFED 1.3 vs RHEL 5.1/OFED 1.2? In-Reply-To: References: <20090522184006.GE26282@starfish.mcs.anl.gov> Message-ID: <20090915224643.GG25432@starfish.mcs.anl.gov> Hi Ifan, Sorry for the delay in replying. > Seeing your post and i was curious whether you have found the answer to > your problem. > I am currently facing the same problem on RHEL 5.3 + OFED 1.4 > connecting to DDN 9900. > > Appreciate if you could share your finding so far. After various tests, we concluded that the problem was with our PCI-X HCAs: >> * Cisco SFS-HCA-X2T7-A1 IB HCA (aka Mellanox Cougar Cub), 133 MHz >> PCI-X, 128 MB memory, Firmware v3.5.917, dual port (port 1 attached to DDN) It was either some problem with using the PCI-X on our new servers, or some problem with the firmware on the HCAs themselves. Whatever it was, it was causing the HCA to reset with a "catastrophic" error, as reported by the kernel module. We tried downgrading to OFED 1.2 (the version that had been working on our old servers under RHEL 5.1) but saw the same problems. We also tried replacing the PCI-X HCAs with PCI-e HCAs, and the latter worked fine with both OFED 1.2 and 1.3. We never tried OFED 1.4, but I would guess that we'd see the same issue. So the "fix" was to replace the HCAs. What HCA are you using? John From richard.frank at oracle.com Tue Sep 15 16:04:13 2009 From: richard.frank at oracle.com (Richard Frank) Date: Tue, 15 Sep 2009 19:04:13 -0400 Subject: [ofa-general] Distributing MSI-X interrupts over multiple cores / CPUs ? 
Message-ID: <4AB01D6D.1040705@oracle.com> Is it possible to have a single MSI-X vector serviced by multiple cores / CPUs ? If we set a single bit in the irq/smp_affinity mask we do see that interrupts occur on the corresponding core / CPU. However, if we set multiple bits - enabling multiple cores / CPUs - it appears that only the first core (bit) in the map services interrupts.. Is this expected behavior ? Are there any known issues / and or configuration steps that we must do for this to work ? We are running a 2.6.18 kernel.. with OFED 1.3.1 and OFED 1.4.2 on multiple different Intel based hardware platforms. Tests with irqbalance disabled and enabled - have the same results. # cat /proc/interrupts CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7 0: 2427539089 0 0 0 0 0 0 0 IO-APIC-edge timer 1: 2 0 0 0 0 0 0 0 IO-APIC-edge i8042 6: 5 0 0 0 0 0 0 0 IO-APIC-edge floppy 7: 2 0 0 0 0 0 0 0 IO-APIC-edge parport0 8: 0 0 0 0 0 0 0 0 IO-APIC-edge rtc 9: 0 0 0 0 0 0 0 0 IO-APIC-level acpi 14: 60 9018 0 0 21238940 180 0 0 IO-APIC-edge ide0 58: 7222 1282 0 0 588494 458 0 0 IO-APIC-level uhci_hcd:usb3, libata 66: 663325 18918 0 0 0 0 0 0 PCI-MSI-X mlx4_core (async) 74: 8887677 19753999 0 0 0 0 0 0 PCI-MSI-X mlx4_core (comp) 82: 208 0 0 0 0 0 5643776 4798 PCI-MSI eth1 90: 66 0 0 5 1232918 0 68 0 PCI-MSI eth2 169: 22 0 0 0 0 0 0 0 IO-APIC-level aic79xx 177: 0 0 0 0 0 0 0 0 IO-APIC-level uhci_hcd:usb4 185: 15 0 0 0 0 0 0 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2, aic79xx NMI: 20144 9082 11178 11356 10249 8772 11643 10659 LOC: 2427207846 2427207756 2427207702 2427207613 2427207551 2427207458 2427207407 2427207314 From jgunthorpe at obsidianresearch.com Tue Sep 15 17:02:47 2009 From: jgunthorpe at obsidianresearch.com (Jason Gunthorpe) Date: Tue, 15 Sep 2009 18:02:47 -0600 Subject: [ofa-general] Distributing MSI-X interrupts over multiple cores / CPUs ? 
In-Reply-To: <4AB01D6D.1040705@oracle.com> References: <4AB01D6D.1040705@oracle.com> Message-ID: <20090916000247.GN25981@obsidianresearch.com> On Tue, Sep 15, 2009 at 07:04:13PM -0400, Richard Frank wrote: > We are running a 2.6.18 kernel.. with OFED 1.3.1 and OFED 1.4.2 on > multiple different Intel based > hardware platforms. Well, a quick glance at the MSI-X code in 2.6.18 shows the trouble, I think: unsigned int dest_cpu = first_cpu(cpu_mask); Probably breaks it good. dest_cpu ends up in the MSI register programming and wrecks it. For most APIC configurations it should be a bitmask of cpus that are participating. Looks to my eyes like 2.6.31 is all fixed up. A lot of work has been done in this area over the years. One major improvement since 2.6.18 is that all the interrupt message programming for MSI, MSI-X and IO-APIC is now centralized and far more uniform. So if it works for IO-APIC it should work for MSI. Before it was a bit disjoint. Jason From keshetti.mahesh at gmail.com Tue Sep 15 20:42:14 2009 From: keshetti.mahesh at gmail.com (Keshetti Mahesh) Date: Wed, 16 Sep 2009 09:12:14 +0530 Subject: [ofa-general] [PATCH] 'ibcheckportwidth' : Exit if LWS is 1X Message-ID: <829ded920909152042y227a709u6e74ee05d6ff05@mail.gmail.com> Fix: Modify ibcheckportwidth to exit if LWS is 1X instead of processing next lines.
Trivial: ibcheckportwidth man page cosmetic change Signed-off-by: Keshetti Mahesh --- infiniband-diags/man/ibcheckportwidth.8      |    2 +- infiniband-diags/scripts/ibcheckportwidth.in |    2 +- diff --git a/infiniband-diags/man/ibcheckportwidth.8 b/infiniband-diags/man/ibcheckportwidth.8 index 85c06fc..c368467 100644 --- a/infiniband-diags/man/ibcheckportwidth.8 +++ b/infiniband-diags/man/ibcheckportwidth.8 @@ -4,7 +4,7 @@  ibcheckportwidth \- validate IB port for 1x link width  .SH SYNOPSIS -.B ibcheckport +.B ibcheckportwidth  [\-h] [\-v] [\-N | \-nocolor] [\-G] [\-C ca_name] [\-P ca_port]  [\-t(imeout) timeout_ms]   diff --git a/infiniband-diags/scripts/ibcheckportwidth.in b/infiniband-diags/scripts/ibcheckportwidth.in index 32c5c5e..60a0892 100644 --- a/infiniband-diags/scripts/ibcheckportwidth.in +++ b/infiniband-diags/scripts/ibcheckportwidth.in @@ -103,7 +103,7 @@ function blue(s)  }  # Only check LinkWidthActive if LinkWidthSupported is not 1X -/^LinkWidthSupported/{ if ($2 != "1X") { next } } +/^LinkWidthSupported/{ if ($2 == "1X") { exit } } -- 1.6.4.2 -- Keshetti Mahesh From bart.vanassche at gmail.com Tue Sep 15 23:38:52 2009 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Wed, 16 Sep 2009 08:38:52 +0200 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4AA4F561.504@vlnb.net> <4AAE909F.6030202@vlnb.net> Message-ID: On Tue, Sep 15, 2009 at 5:50 PM, Chris Worley wrote: > I couldn't get ramdisks working w/ SCST in RHEL5.2.  When running: > > echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk > > I get the error: > > dev_vdisk: ***ERROR***: Wrong f_op or FS doesn't have required capabilities > > ... which doesn't occur in the Ubuntu kernel, so I've been unable to > test RHEL kernels w/ ramdisks.  In general, this problem occurs w/ 8KB > and smaller blocks w/ the Ubuntu kernels, and 2KB and smaller blocks > w/ RHEL kernels. 
The following should work on RHEL systems: echo "open ramdisk /dev/ram0" > /proc/scsi_tgt/vdisk/vdisk Bart. From bart.vanassche at gmail.com Tue Sep 15 23:42:19 2009 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Wed, 16 Sep 2009 08:42:19 +0200 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4AA4F561.504@vlnb.net> <4AAE909F.6030202@vlnb.net> Message-ID: On Tue, Sep 15, 2009 at 5:50 PM, Chris Worley wrote: > On Tue, Sep 15, 2009 at 12:10 AM, Bart Van Assche > wrote: >> Which Linux distro was installed on the initiator and on the target >> ? And if applicable, which OFED version ? Which kernel messages were >> logged by SRPT around the time the issue occurred (after having >> enabled SRPT logging first) ? > > This was with Ubuntu 8.10, built-in IB on the 2.6.27-14-server kernel. A 2.6.27-14-server kernel with or without SCST patches ? Bart. From bart.vanassche at gmail.com Wed Sep 16 00:03:58 2009 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Wed, 16 Sep 2009 09:03:58 +0200 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4AAFC42D.4030708@vlnb.net> <4AAFC794.7090205@vlnb.net> <4AAFCA77.6050305@vlnb.net> Message-ID: On Tue, Sep 15, 2009 at 10:51 PM, Chris Worley wrote: > In lots of testing today, I've seen this panic twice on the Ubuntu 8.10 targets: > > [  330.155992] ib_srpt: disconnected session > 0x00247100000000460024710000000046 because a new SRP_LOGIN_REQ has > been received. The above message means that an initiator logged in, did not log out and logged in again. Has one of the initiator systems e.g. been power cycled while an SRP connection was active ? > [  357.207046] ib_srpt: srpt_xmit_response: tag= 17 channel in bad state 2 This means that an attempt was made by SRPT to transmit a response for a channel in the state 2 (disconnecting).
This must be analyzed further, just like the bug report triggered from srpt_abort_scst_cmd(). [ ... ] > [  411.636699] WARNING: at /root/scst/srpt/src/ib_srpt.c:924 > srpt_abort_scst_cmd+0xac/0x160 [ib_srpt]() > ... This message has been triggered by the statement WARN_ON("unexpected cmd state"). It must be analyzed whether this is a consequence of what went wrong before or whether this is a separate issue. [ ... ] > This may have been due to low memory, as I was using most target > memory for the ramdisk. The kernel warning message in ib_srpt.c at line 924 should never be triggered, not even under low memory circumstances. I'll have a look at this anyway. Thanks for the detailed report. Bart. From vlad at lists.openfabrics.org Wed Sep 16 03:07:45 2009 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Wed, 16 Sep 2009 03:07:45 -0700 (PDT) Subject: [ofa-general] ofa_1_5_kernel 20090916-0200 daily build status Message-ID: <20090916100746.26CDFE6208B@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git git_branch: ofed_kernel_1_5 Common build parameters: Passed: Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.26 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.27 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18-128.el5 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.27 Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on x86_64 with linux-2.6.9-78.ELsmp Passed on ia64 with linux-2.6.18 Passed on ia64 
with linux-2.6.21.1 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.25 Passed on ia64 with linux-2.6.26 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Failed: From eli at mellanox.co.il Wed Sep 16 04:03:03 2009 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 16 Sep 2009 14:03:03 +0300 Subject: [ofa-general] [PATCH] mlx4: configure cache line size Message-ID: <20090916110302.GA32767@mtls03> ConnectX can work more efficiently if the CPU cache line size is configured in it at INIT_HCA. This patch configures cache line size for systems that report it. Signed-off-by: Eli Cohen --- drivers/net/mlx4/fw.c | 7 +++++++ 1 files changed, 7 insertions(+), 0 deletions(-) diff --git a/drivers/net/mlx4/fw.c b/drivers/net/mlx4/fw.c index 20526ce..aa38c06 100644 --- a/drivers/net/mlx4/fw.c +++ b/drivers/net/mlx4/fw.c @@ -699,6 +699,7 @@ int mlx4_INIT_HCA(struct mlx4_dev *dev, struct mlx4_init_hca_param *param) #define INIT_HCA_IN_SIZE 0x200 #define INIT_HCA_VERSION_OFFSET 0x000 #define INIT_HCA_VERSION 2 +#define INIT_HCA_CACHELINE_SZ_OFFSET 0x0e #define INIT_HCA_FLAGS_OFFSET 0x014 #define INIT_HCA_QPC_OFFSET 0x020 #define INIT_HCA_QPC_BASE_OFFSET (INIT_HCA_QPC_OFFSET + 0x10) @@ -736,6 +737,12 @@ int mlx4_INIT_HCA(struct mlx4_dev *dev, struct mlx4_init_hca_param *param) *((u8 *) mailbox->buf + INIT_HCA_VERSION_OFFSET) = INIT_HCA_VERSION; +#if defined(cache_line_size) + *((u8 *) mailbox->buf + INIT_HCA_CACHELINE_SZ_OFFSET) = + order_base_2(cache_line_size() / 16) << 5; +#endif + + #if defined(__LITTLE_ENDIAN) *(inbox + INIT_HCA_FLAGS_OFFSET / 4) &= ~cpu_to_be32(1 << 1); #elif defined(__BIG_ENDIAN) -- 1.6.4.3 From bart.vanassche at gmail.com Wed Sep 16 05:44:28 2009 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Wed, 16 Sep 2009 14:44:28 +0200 Subject: [ofa-general] Merge process for OFED patches Message-ID: Hello Roland, I noticed that
there are seven SRP patches (bug fixes) present in OFED 1.4.1 that are not present in mainstream Linux kernels up to and including version 2.6.30. Do you know whether it is documented anywhere which process is followed for merging such patches in the mainstream Linux kernel ? Thanks, Bart. From bart.vanassche at gmail.com Wed Sep 16 06:08:01 2009 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Wed, 16 Sep 2009 15:08:01 +0200 Subject: [ofa-general] Re: Merge process for OFED patches In-Reply-To: References: Message-ID: On Wed, Sep 16, 2009 at 2:44 PM, Bart Van Assche wrote: > I noticed that there are six SRP patches (bug fixes) present in OFED > 1.4.1 that are not present in mainstream Linux kernels up to and > including version 2.6.30. Do you know whether it is documented > anywhere which process is followed for merging such patches in the > mainstream Linux kernel ? By the way, these are the SRP patches included in OFED 1.4.1: * ofa_kernel-1.4.1/kernel_patches/fixes/srp_1_recreate_at_reconnect.patch * ofa_kernel-1.4.1/kernel_patches/fixes/srp_2_disconnect_without_wait.patch * ofa_kernel-1.4.1/kernel_patches/fixes/srp_3_reset_req_with_status.patch * ofa_kernel-1.4.1/kernel_patches/fixes/srp_4_dev_loss_tmo.patch * ofa_kernel-1.4.1/kernel_patches/fixes/srp_5_async_event_handler.patch * ofa_kernel-1.4.1/kernel_patches/fixes/srp_6_target_in_out_of_fabric.patch Bart. From dorons at voltaire.com Wed Sep 16 07:06:29 2009 From: dorons at voltaire.com (Doron Shoham) Date: Wed, 16 Sep 2009 17:06:29 +0300 Subject: [ofa-general] [PATCH] infiniband-diags/scripts: Add ibcheckroutes to scripts In-Reply-To: <4AA8E97E.1090109@voltaire.com> References: <4AA8E97E.1090109@voltaire.com> Message-ID: <4AB0F0E5.2000305@voltaire.com> Add ibcheckroutes script. ibcheckroutes validates route between all leaf switches, switches or CAs in the fabric. 
Signed-off-by: Doron Shoham --- infiniband-diags/Makefile.am | 4 +- infiniband-diags/configure.in | 1 + infiniband-diags/man/ibcheckroutes.8 | 46 ++++++++++ infiniband-diags/scripts/ibcheckroutes.in | 138 +++++++++++++++++++++++++++++ 4 files changed, 187 insertions(+), 2 deletions(-) create mode 100644 infiniband-diags/man/ibcheckroutes.8 create mode 100644 infiniband-diags/scripts/ibcheckroutes.in diff --git a/infiniband-diags/Makefile.am b/infiniband-diags/Makefile.am index 1cdb60e..57363c4 100644 --- a/infiniband-diags/Makefile.am +++ b/infiniband-diags/Makefile.am @@ -33,7 +33,7 @@ sbin_SCRIPTS = scripts/ibcheckerrs scripts/ibchecknet scripts/ibchecknode \ scripts/iblinkinfo.pl scripts/ibprintswitch.pl \ scripts/ibprintca.pl scripts/ibprintrt.pl \ scripts/ibfindnodesusing.pl scripts/ibidsverify.pl \ - scripts/check_lft_balance.pl + scripts/check_lft_balance.pl scripts/ibcheckroutes noinst_LIBRARIES = libcommon.a @@ -76,7 +76,7 @@ man_MANS = man/ibaddr.8 man/ibcheckerrors.8 man/ibcheckerrs.8 \ man/ibprintswitch.8 man/ibprintca.8 man/ibfindnodesusing.8 \ man/ibdatacounts.8 man/ibdatacounters.8 \ man/ibrouters.8 man/ibprintrt.8 man/ibidsverify.8 \ - man/check_lft_balance.8 + man/check_lft_balance.8 man/ibcheckroutes.8 BUILT_SOURCES = ibdiag_version ibdiag_version: diff --git a/infiniband-diags/configure.in b/infiniband-diags/configure.in index 3ef35cc..aa178c5 100644 --- a/infiniband-diags/configure.in +++ b/infiniband-diags/configure.in @@ -158,6 +158,7 @@ AC_CONFIG_FILES([\ scripts/ibcheckportwidth \ scripts/ibcheckstate \ scripts/ibcheckwidth \ + scripts/ibcheckroutes \ scripts/ibclearcounters \ scripts/ibclearerrors \ scripts/ibdatacounts \ diff --git a/infiniband-diags/man/ibcheckroutes.8 b/infiniband-diags/man/ibcheckroutes.8 new file mode 100644 index 0000000..e4cb9cb --- /dev/null +++ b/infiniband-diags/man/ibcheckroutes.8 @@ -0,0 +1,46 @@ +.TH IBCHECKPORT 8 "September 10, 2009" "OpenIB" "OpenIB Diagnostics" + +.SH NAME +ibcheckroutes \- validate routes 
between all hosts in fabric + +.SH SYNOPSIS +.B ibcheckroutes +[\-l] [\-s] [\-c] [\-n topology-file ] [\-h] [\-N] [\-b] [\-e] [\-C ca_name] [\-P ca_port] [\-t(imeout) timeout_ms] + +.SH DESCRIPTION +.PP +ibcheckroutes is a script which uses a full topology file that was created by ibnetdiscover, +scans the network to validate route between all leaf switches, switches or CAs in the fabric. + +.SH OPTIONS +.PP +\-n Use topology-file. +.PP +\-l Check routes between all leaf switches. +.PP +\-s Check routes between all switches. +.PP +\-c Check routes between all CAs. +.PP +\-h Show help. +.PP +\-N Use mono rather than color mode. +.PP +\-b Suppress output. +.PP +\-e Show errors only. +.PP +\-C Use the specified ca_name. +.PP +\-P Use the specified ca_port. +.PP +\-t Override the default timeout for the solicited mads. + +.SH SEE ALSO +.BR ibnetdiscover(8), +.BR ibtracert(8) + +.SH AUTHOR +.TP +Doron Shoham +.RI < dorons at voltaire.com > diff --git a/infiniband-diags/scripts/ibcheckroutes.in b/infiniband-diags/scripts/ibcheckroutes.in new file mode 100644 index 0000000..0b1af63 --- /dev/null +++ b/infiniband-diags/scripts/ibcheckroutes.in @@ -0,0 +1,138 @@ +#!/bin/sh + +IBPATH=${IBPATH:- at IBSCRIPTPATH@} + +function usage() { + echo -e Usage: `basename $0` "[-l] [-s] [-c] [-h] [-N] [-b] [-e] [-n topology-file ] \ +[-C ca_name] [-P ca_port] [-t(imeout) timeout_ms]" + echo -e " Validate routes between all leaf switches, switches or CAs in the fabric" + echo -e " -n - Use topology-file" + echo -e " -l - Check routes between all leaf switches" + echo -e " -s - Check routes between all switches" + echo -e " -c - Check routes between all CAs" + echo -e " -h - Show help" + echo -e " -N - Use mono rather than color mode" + echo -e " -b - Suppress output" + echo -e " -e - Show errors only" + echo -e " -C - Use the specified ca_name" + echo -e " -P - Use the specified ca_port" + echo -e " -t - Override the default timeout for the solicited mads" + exit -1 +} + +function 
user_abort() { + echo "Aborted" + exit 1 +} + +function green() { + if [ "$bw" = "yes" ]; then + printf "${res_col}[OK]\n" $1 + return + fi + printf "\033[1;032m${res_col}[OK]\033[0;39m\n" $1 +} + +function red() { + if [ "$bw" = "yes" ]; then + printf "${res_col}[FAILED]\n" "$1" + return + fi + printf "\033[31m${res_col}[FAILED]\033[0m\n" "$1" +} + +trap user_abort SIGINT SIGTERM + +bw="" +brief=0 +error=0 +ca_info="" +st=0 +method="leaf" +topofile=/tmp/net +discover=1 +res_col="%-20.20s" + +function get_opts() { + while getopts P:C:t:n:beNhlsc o; do + case "$o" in + n) + topofile="$OPTARG" + discover=0 + ;; + l) + method="leaf" + ;; + s) + method="sw" + ;; + c) + method="ca" + ;; + h) + usage + ;; + N) + bw="yes" + ;; + b) + brief=1 + ;; + e) + error=1 + ;; + P | C | t | timeout) + ca_info="$ca_info -$o $OPTARG" + ;; + *) + usage + ;; + esac + done +} + +get_opts $* + +if [ $discover -eq 1 ]; then + $IBPATH/ibnetdiscover $ca_info > $topofile +fi + +# find LIDs to check +case $method in +leaf) + [ $brief -eq 0 ] && echo -e "Checking routes between all Leaf Switches" + LIDS=($(awk '/# lid /{a[$(NF-1)]=$(NF-1)} END{for(v in a) print v}' $topofile)) + ;; +sw) + [ $brief -eq 0 ] && echo -e "Checking routes between all Switches" + LIDS=($(awk '/^Switch/ {a[$(NF-2)]=$(NF-2)} END{for(v in a) print v}' $topofile)) + ;; +ca) + [ $brief -eq 0 ] && echo -e "Checking routes between all CAs" + LIDS=($(awk '/# lid /{lmc=$7; e=2^lmc+$5; for(i=$5; i Destination lid" +for((s=0; s /dev/null + if [ $? 
-eq 0 ]; then + [ $brief -eq 0 ] && [ $error -eq 0 ] && green "${LIDS[$s]}-->${LIDS[$d]}" + else + [ $brief -eq 0 ] && red "${LIDS[$s]}-->${LIDS[$d]}" + st=1 + fi + done +done + +exit $st -- 1.5.4 From rdreier at cisco.com Wed Sep 16 07:21:43 2009 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 16 Sep 2009 07:21:43 -0700 Subject: [ofa-general] Re: Merge process for OFED patches In-Reply-To: (Bart Van Assche's message of "Wed, 16 Sep 2009 14:44:28 +0200") References: Message-ID: > I noticed that there are seven SRP patches (bug fixes) present in OFED > 1.4.1 that are not present in mainstream Linux kernels up to and > including version 2.6.30. Do you know whether it is documented > anywhere which process is followed for merging such patches in the > mainstream Linux kernel ? Documentation/SubmittingPatches? Seriously, someone has to send the patches for them to get applied. From bart.vanassche at gmail.com Wed Sep 16 07:39:54 2009 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Wed, 16 Sep 2009 16:39:54 +0200 Subject: [ofa-general] Re: Merge process for OFED patches In-Reply-To: References: Message-ID: On Wed, Sep 16, 2009 at 4:21 PM, Roland Dreier wrote: >  > I noticed that there are seven SRP patches (bug fixes) present in OFED >  > 1.4.1 that are not present in mainstream Linux kernels up to and >  > including version 2.6.30. Do you know whether it is documented >  > anywhere which process is followed for merging such patches in the >  > mainstream Linux kernel ? > > Documentation/SubmittingPatches? > > Seriously, someone has to send the patches for them to get applied. This should be done by the original patch author. I would like to contact the author of the fourth patch. But unfortunately I could not find any author information in that patch. Bart. 
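A quick way to check a patch file for authorship tags can be sketched in shell. This is only an illustration: the helper name patch_authors is hypothetical (it is not part of OFED or of the kernel tree), and it assumes the patch follows the usual convention of From:, Author: or Signed-off-by: lines starting in column one.

```shell
#!/bin/sh
# Hypothetical helper, not part of OFED: print the authorship tags
# found in a patch file. Assumes the conventional "From:", "Author:"
# and "Signed-off-by:" lines start at the beginning of a line.
patch_authors() {
    # grep exits non-zero when no tag is present, so callers can
    # test the return status to detect an unsigned patch.
    grep -E '^(From|Author|Signed-off-by):' "$1"
}
```

Calling patch_authors on ofa_kernel-1.4.1/kernel_patches/fixes/srp_4_dev_loss_tmo.patch and testing the exit status would show whether that file carries any such tag. An empty result only means the tags are missing from the file itself, not that the history is unknown elsewhere.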
From ogerlitz at voltaire.com Wed Sep 16 07:46:37 2009 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Wed, 16 Sep 2009 17:46:37 +0300 Subject: [ofa-general] Re: Merge process for OFED patches In-Reply-To: References: Message-ID: <4AB0FA4D.1020009@voltaire.com> Bart Van Assche wrote: > I would like to contact the author of the fourth patch. But unfortunately I could not find any author information in that patch. yes, non signed and unreviewed patches is a common practice of ofed, does this create legal issues? maybe that would be the way to stop this? Or. From chien.tin.tung at intel.com Wed Sep 16 08:05:02 2009 From: chien.tin.tung at intel.com (Tung, Chien Tin) Date: Wed, 16 Sep 2009 08:05:02 -0700 Subject: [ofa-general] RE: Merge process for OFED patches In-Reply-To: References: Message-ID: <60BEFF3FBD4C6047B0F13F205CAFA383038D854F67@azsmsx501.amr.corp.intel.com> >I would like to contact the author of the fourth patch. But >unfortunately I could not find any author information in that patch. Here is the info: git log kernel_patches/fixes/srp_4_dev_loss_tmo.patch commit c97ac3a3c509b6fd1fd511e44e81699c21704629 Author: Vu Pham Date: Tue Dec 9 10:34:38 2008 +0200 SRP: Fixed the following issue: https://bugs.openfabrics.org/show_bug.cgi?id=1395 Signed-off-by: Vu Pham commit 304c1ab9f7256f9673394e5b579c196055804f48 Author: Vu Pham Date: Thu Nov 27 11:13:03 2008 +0200 SRP: Fixed the following issues: https://bugs.openfabrics.org/show_bug.cgi?id=1377 https://bugs.openfabrics.org/show_bug.cgi?id=1395 Signed-off-by: Vu Pham commit 245a9d91612a1fd3ada1703dea0bb49af379f457 Author: John Gregor Date: Mon Sep 15 17:49:27 2008 -0700 Refreshed all "fixes" patches Did a "quilt refresh" on all of the patches under kernel_patches/fixes. This has several effects: 1. It regularizes the style of all the patches. 2. It resets all the line numbers and contexts. 3. It gets rid of the whitespace problems that required 'patch -l'. 4. It allows stacked-git to use the patches. 
The flags I used on refresh were: --no-timestamps --diffstat --sort --strip-trailing-whitespace I've confirmed that the resulting tree is identical (with the exception of the trailing whitespace removal). Signed-off-by: John Gregor commit e79bfe1ad72c06d19a12a3bb9ca856f6509e0109 Author: Vu Pham Date: Sun Jun 15 15:08:10 2008 +0300 SRP: Added backport patches for kernels <= 2.6.25. Signed-off-by: Vu Pham Chien From worleys at gmail.com Wed Sep 16 08:11:15 2009 From: worleys at gmail.com (Chris Worley) Date: Wed, 16 Sep 2009 09:11:15 -0600 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4AAFC42D.4030708@vlnb.net> <4AAFC794.7090205@vlnb.net> <4AAFCA77.6050305@vlnb.net> Message-ID: On Wed, Sep 16, 2009 at 1:03 AM, Bart Van Assche wrote: > On Tue, Sep 15, 2009 at 10:51 PM, Chris Worley wrote: >> In lots of testing today, I've seen this panic twice on the Ubuntu 8.10 targets: >> >> [  330.155992] ib_srpt: disconnected session >> 0x00247100000000460024710000000046 because a new SRP_LOGIN_REQ has >> been received. > > The above message means that an initiator logged in, did not log out > and logged in again. Has one of the initiator systems e.g. been power > cycled while an SRP connection was active ? Yes. Once ib_srp is hung on one device, I re-login to get the device and test again. I can't log out of the previous, as it's hung... this "hanging" is the issue I'm having. When I re-login, I get a new device... i.e. I hung /dev/sdc and re-login to get /dev/sdd... then test that until it hangs. Using the ramdisks makes the problem much easier to trigger, and occasionally causes the panic, especially using: fio --rw=randrw --bs=1k --numjobs=64 --iodepth=64 --sync=0 \ --direct=1 --randrepeat=0 --group_reporting --ioengine=libaio \ --filename=/dev/sdp --name=test --loops=10000 --runtime=1600 \ --rwmixread=100 ...on the initiator to cause it. 
Chris > >> [  357.207046] ib_srpt: srpt_xmit_response: tag= 17 channel in bad state 2 > > This means that an attempt was made by SRPT to transmit a response for > a channel in the state 2 (disconnecting). This must be analyzed > further, just like the bug report triggered from > srpt_abort_scst_cmd(). > > [ ... ] > >> [  411.636699] WARNING: at /root/scst/srpt/src/ib_srpt.c:924 >> srpt_abort_scst_cmd+0xac/0x160 [ib_srpt]() >> ... > > This message has been triggered by the statement WARN_ON("unexpected > cmd state"). It must be analyzed whether this is a consequence of what > went wrong before or whether this is a separate issue. > > [ ... ] > >> This may have been due to low memory, as I was using most target >> memory for the ramdisk. > > The kernel warning message in ib_srpt.c at line 924 should never be > triggered, not even under low memory circumstances. I'll have a look > at this anyway. > > Thanks for the detailed report. > > Bart. > From hal.rosenstock at gmail.com Wed Sep 16 08:23:05 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Wed, 16 Sep 2009 11:23:05 -0400 Subject: [ofa-general] [PATCH] opensm: use mgrp pointer in port mcm_info In-Reply-To: References: <20090906154901.GF25241@me> <20090915100813.GL17481@me> Message-ID: On Tue, Sep 15, 2009 at 7:26 AM, Hal Rosenstock wrote: > > > On Tue, Sep 15, 2009 at 6:08 AM, Sasha Khapyorsky wrote: > >> On 08:45 Mon 14 Sep , Hal Rosenstock wrote: >> > >> > Does this mean consolidate_ipv6_snm_req does not work now ? >> >> No, it doesn't. As you may remember 'consolidate_ipv6_snm_req' >> workaround does nothing with MGIDs to MLID mapping, but instead >> enforces all IPv6 SNM matching requests to join a single multicast >> group (MGID). >> > > Is consolidate_ipv6_snm_req working for you ? > Never mind; My bad. It's working... -- Hal > > -- Hal > > >> >> Sasha >> > >
From rdreier at cisco.com Wed Sep 16 09:30:09 2009 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 16 Sep 2009 09:30:09 -0700 Subject: [ofa-general] Re: [GIT PULL] please pull ummunotify In-Reply-To: (Roland Dreier's message of "Thu, 10 Sep 2009 21:38:22 -0700") References: Message-ID: Hi Linus, Sorry to hassle you about this, but I would like to know where things stand. I know (from the reflink discussion if nothing else) that you're definitely not bashful about telling people when their code sucks, so this silent treatment has me really flustered. I've been showering and brushing my teeth and everything, honest! Seriously, this code solves a problem that the MPI/HPC people have been complaining about for quite a while, and if possible I'd like to get this upstream. Or if you have a better idea, I'm all ears... Thanks, Roland > Linus, please consider pulling from > > master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git ummunotify > > This tree is also available from kernel.org mirrors at: > > git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git ummunotify > > This will get "ummunotify," a new character device that allows a > userspace library to register for MMU notifications; this is > particularly useful for MPI implementations (message passing libraries > used in HPC) to be able to keep track of what wacky things consumers > do to their memory mappings. My colleague Jeff Squyres from the Open > MPI project posted a blog entry about why MPI wants this: > > http://blogs.cisco.com/ciscotalk/performance/comments/better_linux_memory_tracking/ > > His summary of ummunotify: > > "It's elegant, doesn't require strange linker tricks, and seems to > work in all cases. Yay!" > > This code went through several review iterations on lkml and was in > -mm and -next for quite a few weeks. Andrew is OK with merging it (I > think -- Andrew please correct me if I misunderstood you).
> > Roland Dreier (1): > ummunotify: Userspace support for MMU notifications > > Documentation/Makefile | 3 +- > Documentation/ummunotify/Makefile | 7 + > Documentation/ummunotify/ummunotify.txt | 150 ++++++++ > Documentation/ummunotify/umn-test.c | 200 +++++++++++ > drivers/char/Kconfig | 12 + > drivers/char/Makefile | 1 + > drivers/char/ummunotify.c | 566 +++++++++++++++++++++++++++++++ > include/linux/Kbuild | 1 + > include/linux/ummunotify.h | 121 +++++++ > 9 files changed, 1060 insertions(+), 1 deletions(-) > create mode 100644 Documentation/ummunotify/Makefile > create mode 100644 Documentation/ummunotify/ummunotify.txt > create mode 100644 Documentation/ummunotify/umn-test.c > create mode 100644 drivers/char/ummunotify.c > create mode 100644 include/linux/ummunotify.h From torvalds at linux-foundation.org Wed Sep 16 09:40:38 2009 From: torvalds at linux-foundation.org (Linus Torvalds) Date: Wed, 16 Sep 2009 09:40:38 -0700 (PDT) Subject: [ofa-general] Re: [GIT PULL] please pull ummunotify In-Reply-To: References: Message-ID: On Wed, 16 Sep 2009, Roland Dreier wrote: > > Sorry to hassle you about this, but I would like to know where things > stand. I know (from the reflink discussion if nothing else) that you're > definitely not bashful about telling people when their code sucks, so > this silent treatment has me really flustered. I've been showering and > brushing my teeth and everything, honest! 
I just haven't had time to look into the issue, so I'm merging the code that I know I need to merge, and hopefully I'll have a breather later when I can actually look at code and the thread it all spawned., Linus From sbyna at nec-labs.com Wed Sep 16 03:30:54 2009 From: sbyna at nec-labs.com (Suren Byna) Date: Wed, 16 Sep 2009 06:30:54 -0400 Subject: [ofa-general] [hpc-announce] CfP: Special Issue of JPDC on "Data Intensive Computing", Submission: Jan 15th 2010 Message-ID: Call for Papers: Special Issue of Journal of Parallel and Distributed Computing on "Data Intensive Computing" --------------------------------------------------------------------------- Data intensive computing is posing many challenges in exploiting parallelism of current and upcoming computer architectures. Data volumes of applications in the fields of sciences and engineering, finance, media, online information resources, etc. are expected to double every two years over the next decade and further. With this continuing data explosion, it is necessary to store and process data efficiently by utilizing enormous computing power that is available in the form of multi/manycore platforms. There is no doubt in the industry and research community that the importance of data intensive computing has been raising and will continue to be the foremost fields of research. This raise brings up many research issues, in forms of capturing and accessing data effectively and fast, processing it while still achieving high performance and high throughput, and storing it efficiently for future use. Programming for high performance yielding data intensive computing is an important challenging issue. Expressing data access requirements of applications and designing programming language abstractions to exploit parallelism are at immediate need. Application and domain specific optimizations are also parts of a viable solution in data intensive computing. 
While these are a few examples of issues, research in data intensive computing has become quite intense during the last few years yielding strong results. This special issue of the Journal Parallel and Distributed Computing (JPDC) is seeking original unpublished research articles that describe recent advances and efforts in the design and development of data intensive computing, functionalities and capabilities that will benefit many applications. Topics of interest include (but are not limited to): * Data-intensive applications and their challenges * Storage and file systems * High performance data access toolkits * Fault tolerance, reliability, and availability * Meta-data management * Remote data access * Programming models, abstractions for data intensive computing * Compiler and runtime support * Data capturing, management, and scheduling techniques * Future research challenges of data intensive computing * Performance optimization techniques * Replication, archiving, preservation strategies * Real-time data intensive computing * Network support for data intensive computing * Challenges and solutions in the era of multi/many-core platforms * Stream computing * Green (Power efficient) data intensive computing * Security and protection of sensitive data in collaborative environments Guide for Authors Papers need not be solely abstract or conceptual in nature: proofs and experimental results can be included as appropriate. Authors should follow the JPDC manuscript format as described in the "Information for Authors" at the end of each issue of JPDC or at http://ees.elsevier.com/jpdc/ . The journal version will be reviewed as per JPDC review process for special issues. 
Important Dates: Paper Submission : January 15, 2010 Notification of Acceptance/Rejection : May 31, 2010 Final Version of the Paper : September 15, 2010 Submission Guidelines All manuscripts and any supplementary material should be submitted through Elsevier Editorial System (EES) at http://ees.elsevier.com/ jpdc . Authors must select "Special Issue: Data Intensive Computing" when they reach the "Article Type" step in the submission process. First time users must register themselves as Author. For the latest details of the JPDC special issue see http://www.cs.iit.edu/~suren/jpdc Guest Editors: Dr. Surendra Byna NEC Labs America E-mail: sbyna at nec-labs.com Prof. Xian-He Sun Illinois Institute of Technology E-mail: sun at cs.iit.edu From vst at vlnb.net Wed Sep 16 11:15:28 2009 From: vst at vlnb.net (Vladislav Bolkhovitin) Date: Wed, 16 Sep 2009 22:15:28 +0400 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4AAE909F.6030202@vlnb.net> <4AAFC42D.4030708@vlnb.net> <4AAFC794.7090205@vlnb.net> <4AAFCA77.6050305@vlnb.net> Message-ID: <4AB12B40.9050902@vlnb.net> Chris Worley, on 09/16/2009 12:51 AM wrote: > On Tue, Sep 15, 2009 at 11:10 AM, Vladislav Bolkhovitin wrote: >> Chris Worley, on 09/15/2009 09:01 PM wrote: >>> On Tue, Sep 15, 2009 at 10:57 AM, Vladislav Bolkhovitin >>> wrote: >>>> Chris Worley, on 09/15/2009 08:53 PM wrote: >>>>> On Tue, Sep 15, 2009 at 10:43 AM, Vladislav Bolkhovitin >>>>> wrote: >>>>>> Chris Worley, on 09/15/2009 07:50 PM wrote: >>>>>>> On Tue, Sep 15, 2009 at 12:10 AM, Bart Van Assche >>>>>>> wrote: >>>>>>>> On Tue, Sep 15, 2009 at 1:03 AM, Chris Worley >>>>>>>> wrote: >>>>>>>>> On Mon, Sep 14, 2009 at 12:51 PM, Vladislav Bolkhovitin >>>>>>>>> >>>>>>>>> wrote: >>>>>>>>>> Chris Worley, on 09/11/2009 11:50 PM wrote: >>>>>>>>>>> I've definitely removed the switch/firmware from being the cause. 
>>>>>>>>>>> >>>>>>>>>>> I'm thinking the reason you can't repeat the test may be latency >>>>>>>>>>> related. We get ~50usecs average latency (on small block sizes), >>>>>>>>>>> which can't be achieved using regular SSD's (and rotating drives >>>>>>>>>>> are >>>>>>>>>>> nowhere close). Maybe a ramdisk would help repeat the issue. >>>>>>>>>> I think you should try to reproduce the problem with ramdisk or >>>>>>>>>> nullio. >>>>>>>>>> By >>>>>>>>>> so you will eliminate possible influence of the SSD backend. >>>>>>>>> W/ 12GB RAM in the target, I created a 7GB ramdisk: >>>>>>>>> >>>>>>>>> mount -t ramfs -o size=7g ramfs /mnt/ >>>>>>>>> dd if=/dev/zero of=/mnt/foo bs=1024k count=7000 >>>>>>>>> echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk >>>>>>>>> echo "add ramdisk 2" >/proc/scsi_tgt/groups/Default/devices >>>>>>>>> >>>>>>>>> Then, on the initiator, I tested it... and it hung during sequential >>>>>>>>> 8KB block reads: >>>>>>>>> >>>>>>>>> fio --rw=read --bs=8k --numjobs=64 --iodepth=64 --sync=0 --direct=1 >>>>>>>>> --randrepeat=0 \ >>>>>>>>> --group_reporting --ioengine=libaio --filename=/dev/sde --name=test >>>>>>>>> --loops=10000 --runtime=600 >>>>>>>>> >>>>>>>>> Note that I was running the SM on the target this time too. >>>>>>>> Which Linux distro was installed on the inititiator and on the target >>>>>>>> ? And if applicable, which OFED version ? Which kernel messages were >>>>>>>> logged by SRPT around the time the issue occurred (after having >>>>>>>> enabled SRPT logging first) ? >>>>>>> As logging hadn't helped this issue previously, I've not been enabling >>>>>>> it. That plus the kernel hacks needed to invoke logging, it's not >>>>>>> worth enabling. >>>>>>> >>>>>>> This was with Ubuntu 8.10, built-in IB on the 2.6.27-14-server kernel. >>>>>>> >>>>>>> I couldn't get ramdisks working w/ SCST in RHEL5.2. 
When running: >>>>>>> >>>>>>> echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk >>>>>>> >>>>>>> I get the error: >>>>>>> >>>>>>> dev_vdisk: ***ERROR***: Wrong f_op or FS doesn't have required >>>>>>> capabilities >>>>>>> >>>>>>> ... which doesn't occur in the Ubuntu kernel, so I've been unable to >>>>>>> test RHEL kernels w/ ramdisks. In general, this problem occurs w/ 8KB >>>>>>> and smaller blocks w/ the Ubuntu kernels, and 2KB and smaller blocks >>>>>>> w/ RHEL kernels. >>>>>> Use ramfs instead. >>>>> Do you mean: >>>>> >>>>> mount -t ramfs -o size=7g ramfs /mnt/ >>>> You should then create a file on it and use it. >>> That's what I'm doing, I believe. From above: >>> >>>>>>>>> mount -t ramfs -o size=7g ramfs /mnt/ >>>>>>>>> dd if=/dev/zero of=/mnt/foo bs=1024k count=7000 >>>>>>>>> echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk >>>>>>>>> echo "add ramdisk 2" >/proc/scsi_tgt/groups/Default/devices >>> ... but the "open", on RHEL5.2 kernel 2.6.18-92.el5, generates the >>> following kernel messages: >>> >>> dev_vdisk: Registering virtual FILEIO device ramdisk >>> scst: Processing thread started, PID 9629 >>> scst: Processing thread started, PID 9630 >>> scst: Processing thread started, PID 9631 >>> scst: Processing thread started, PID 9632 >>> scst: Processing thread started, PID 9633 >>> dev_vdisk: ***ERROR***: Wrong f_op or FS doesn't have required >>> capabilities >>> scst: ***ERROR***: New device handler's vdisk attach() failed: -22 >>> scst: Processing thread PID 9629 finished >>> scst: Processing thread PID 9630 finished >>> scst: Processing thread PID 9631 finished >>> scst: Processing thread PID 9632 finished >>> scst: Processing thread PID 9633 finished >>> scst: Failed to attach to virtual device ramdisk >>> >>> Chris >>>>> ? >>>>> >>>>> That's what I'm doing. >> That's strange. I'm doing it all the time, although with not so old kernels >> as 2.6.18. 
> > In lots of testing today, I've seen this panic twice on the Ubuntu 8.10 targets: > > [ 330.155992] ib_srpt: disconnected session > 0x00247100000000460024710000000046 because a new SRP_LOGIN_REQ has > been received. > [ 357.207046] ib_srpt: srpt_xmit_response: tag= 17 channel in bad state 2 > [ 357.207052] ib_srpt: disconnected session > 0x00247100000000460024710000000046 because a new SRP_LOGIN_REQ has > been received. > [ 357.207100] ib_srpt: srpt_xmit_response: tag= 47 channel in bad state 2 > [ 357.207104] scst: ***ERROR***: Target driver ib_srpt > xmit_response() returned fatal error > [ 357.241429] scst: ***ERROR***: Target driver ib_srpt > xmit_response() returned fatal error > [ 357.250234] ------------[ cut here ]------------ > [ 357.250537] ib_srpt: srpt_xmit_response: tag= 26 channel in bad state 2 > [ 357.250539] scst: ***ERROR***: Target driver ib_srpt > xmit_response() returned fatal error > [ 357.250550] ib_srpt: srpt_xmit_response: tag= 38 channel in bad state 2 > [ 357.250553] scst: ***ERROR***: Target driver ib_srpt > xmit_response() returned fatal error > [ 357.250560] ib_srpt: srpt_xmit_response: tag= 27 channel in bad state 2 > > [ 357.301253] kernel BUG at /root/scst/scst/src/scst_targ.c:3089! > [ 357.301253] invalid opcode: 0000 [1] SMP > [ 357.301253] CPU 0 > ... 
> [ 357.301253] RIP: 0010:[] [] > scst_tgt_cmd_done+0x26/0x30 [scst] > [ 357.301253] RSP: 0018:ffff88039ad27b50 EFLAGS: 00010297 > [ 357.301253] RAX: 0000000000000200 RBX: ffff8803ad9c68f8 RCX: 0000000000000000 > [ 357.301253] RDX: 00000000ffffffff RSI: 0000000000000000 RDI: ffff8803ad9c68f8 > [ 357.301253] RBP: ffff88039ad27b50 R08: 0000000000000000 R09: 0000000000000000 > [ 357.301253] R10: ffff88039ad277c0 R11: ffff88041ad278cf R12: ffff8803c2972180 > [ 357.301253] R13: ffff88039ada0000 R14: 0000000000000001 R15: ffff8803fb00c2b0 > [ 357.301253] FS: 0000000000000000(0000) GS:ffffffff807dd000(0000) > knlGS:0000000000000000 > [ 357.301253] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b > [ 357.301253] CR2: 00007f9281e64000 CR3: 0000000000201000 CR4: 00000000000006e0 > [ 357.301253] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 357.301253] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > [ 357.301253] Process ib_cm/0 (pid: 8299, threadinfo > ffff88039ad26000, task ffff88039ad40000) > [ 357.301253] Stack: ffff88039ad27b80 ffffffffa04c0c47 > ffff88039a8db900 ffff8803c2972180 > [ 357.301253] ffff8803fb00c240 ffff8803fb00c284 ffff88039ad27bc0 > ffffffffa04c0d93 > [ 357.301253] ffff88042a4959c0 ffff88042a9d7800 ffff88042544da00 > ffff88042a9d7898 > [ 357.301253] Call Trace: > [ 357.301253] [] srpt_abort_scst_cmd+0xd7/0x160 [ib_srpt] > [ 357.301253] [] srpt_release_channel+0xc3/0x190 [ib_srpt] > [ 357.301253] [] > srpt_find_and_release_channel+0x22/0x30 [ib_srpt] > [ 357.301253] [] srpt_cm_handler+0x6d/0xbb8 [ib_srpt] It's because srpt called scst_tgt_cmd_done() when the corresponding command hasn't yet been sent to xmit_response() callback, so srpt should use another function to abort commands in this state. 
Vlad From worleys at gmail.com Wed Sep 16 12:41:20 2009 From: worleys at gmail.com (Chris Worley) Date: Wed, 16 Sep 2009 13:41:20 -0600 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: <4AB12B40.9050902@vlnb.net> References: <4AAFC42D.4030708@vlnb.net> <4AAFC794.7090205@vlnb.net> <4AAFCA77.6050305@vlnb.net> <4AB12B40.9050902@vlnb.net> Message-ID: On Wed, Sep 16, 2009 at 12:15 PM, Vladislav Bolkhovitin wrote: > Chris Worley, on 09/16/2009 12:51 AM wrote: >> >> On Tue, Sep 15, 2009 at 11:10 AM, Vladislav Bolkhovitin >> wrote: >>> >>> Chris Worley, on 09/15/2009 09:01 PM wrote: >>>> >>>> On Tue, Sep 15, 2009 at 10:57 AM, Vladislav Bolkhovitin >>>> wrote: >>>>> >>>>> Chris Worley, on 09/15/2009 08:53 PM wrote: >>>>>> >>>>>> On Tue, Sep 15, 2009 at 10:43 AM, Vladislav Bolkhovitin >>>>>> wrote: >>>>>>> >>>>>>> Chris Worley, on 09/15/2009 07:50 PM wrote: >>>>>>>> >>>>>>>> On Tue, Sep 15, 2009 at 12:10 AM, Bart Van Assche >>>>>>>> wrote: >>>>>>>>> >>>>>>>>> On Tue, Sep 15, 2009 at 1:03 AM, Chris Worley >>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> On Mon, Sep 14, 2009 at 12:51 PM, Vladislav Bolkhovitin >>>>>>>>>> >>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> Chris Worley, on 09/11/2009 11:50 PM wrote: >>>>>>>>>>>> >>>>>>>>>>>> I've definitely removed the switch/firmware from being the >>>>>>>>>>>> cause. >>>>>>>>>>>> >>>>>>>>>>>> I'm thinking the reason you can't repeat the test may be latency >>>>>>>>>>>> related.  We get ~50usecs average latency (on small block >>>>>>>>>>>> sizes), >>>>>>>>>>>> which can't be achieved using regular SSD's (and rotating drives >>>>>>>>>>>> are >>>>>>>>>>>> nowhere close).  Maybe a ramdisk would help repeat the issue. >>>>>>>>>>> >>>>>>>>>>> I think you should try to reproduce the problem with ramdisk or >>>>>>>>>>> nullio. >>>>>>>>>>> By >>>>>>>>>>> so you will eliminate possible influence of the SSD backend. 
>>>>>>>>>> >>>>>>>>>> W/ 12GB RAM in the target, I created a 7GB ramdisk: >>>>>>>>>> >>>>>>>>>> mount -t ramfs -o size=7g ramfs /mnt/ >>>>>>>>>> dd if=/dev/zero of=/mnt/foo bs=1024k count=7000 >>>>>>>>>> echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk >>>>>>>>>> echo "add ramdisk 2" >/proc/scsi_tgt/groups/Default/devices >>>>>>>>>> >>>>>>>>>> Then, on the initiator, I tested it... and it hung during >>>>>>>>>> sequential >>>>>>>>>> 8KB block reads: >>>>>>>>>> >>>>>>>>>> fio --rw=read --bs=8k --numjobs=64 --iodepth=64 --sync=0 >>>>>>>>>> --direct=1 >>>>>>>>>> --randrepeat=0 \ >>>>>>>>>>  --group_reporting --ioengine=libaio --filename=/dev/sde >>>>>>>>>> --name=test >>>>>>>>>> --loops=10000 --runtime=600 >>>>>>>>>> >>>>>>>>>> Note that I was running the SM on the target this time too. >>>>>>>>> >>>>>>>>> Which Linux distro was installed on the inititiator and on the >>>>>>>>> target >>>>>>>>> ? And if applicable, which OFED version ? Which kernel messages >>>>>>>>> were >>>>>>>>> logged by SRPT around the time the issue occurred (after having >>>>>>>>> enabled SRPT logging first) ? >>>>>>>> >>>>>>>> As logging hadn't helped this issue previously, I've not been >>>>>>>> enabling >>>>>>>> it.  That plus the kernel hacks needed to invoke logging, it's not >>>>>>>> worth enabling. >>>>>>>> >>>>>>>> This was with Ubuntu 8.10, built-in IB on the 2.6.27-14-server >>>>>>>> kernel. >>>>>>>> >>>>>>>> I couldn't get ramdisks working w/ SCST in RHEL5.2.  When running: >>>>>>>> >>>>>>>> echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk >>>>>>>> >>>>>>>> I get the error: >>>>>>>> >>>>>>>> dev_vdisk: ***ERROR***: Wrong f_op or FS doesn't have required >>>>>>>> capabilities >>>>>>>> >>>>>>>> ... which doesn't occur in the Ubuntu kernel, so I've been unable to >>>>>>>> test RHEL kernels w/ ramdisks.  In general, this problem occurs w/ >>>>>>>> 8KB >>>>>>>> and smaller blocks w/ the Ubuntu kernels, and 2KB and smaller blocks >>>>>>>> w/ RHEL kernels. 
>>>>>>> >>>>>>> Use ramfs instead. >>>>>> >>>>>> Do you mean: >>>>>> >>>>>> mount -t ramfs -o size=7g ramfs /mnt/ >>>>> >>>>> You should then create a file on it and use it. >>>> >>>> That's what I'm doing, I believe.  From above: >>>> >>>>>>>>>> mount -t ramfs -o size=7g ramfs /mnt/ >>>>>>>>>> dd if=/dev/zero of=/mnt/foo bs=1024k count=7000 >>>>>>>>>> echo "open ramdisk /mnt/foo" > /proc/scsi_tgt/vdisk/vdisk >>>>>>>>>> echo "add ramdisk 2" >/proc/scsi_tgt/groups/Default/devices >>>> >>>> ... but the "open", on RHEL5.2 kernel 2.6.18-92.el5, generates the >>>> following kernel messages: >>>> >>>> dev_vdisk: Registering virtual FILEIO device ramdisk >>>> scst: Processing thread started, PID 9629 >>>> scst: Processing thread started, PID 9630 >>>> scst: Processing thread started, PID 9631 >>>> scst: Processing thread started, PID 9632 >>>> scst: Processing thread started, PID 9633 >>>> dev_vdisk: ***ERROR***: Wrong f_op or FS doesn't have required >>>> capabilities >>>> scst: ***ERROR***: New device handler's vdisk attach() failed: -22 >>>> scst: Processing thread PID 9629 finished >>>> scst: Processing thread PID 9630 finished >>>> scst: Processing thread PID 9631 finished >>>> scst: Processing thread PID 9632 finished >>>> scst: Processing thread PID 9633 finished >>>> scst: Failed to attach to virtual device ramdisk >>>> >>>> Chris >>>>>> >>>>>> ? >>>>>> >>>>>> That's what I'm doing. >>> >>> That's strange. I'm doing it all the time, although with not so old >>> kernels >>> as 2.6.18. >> >> In lots of testing today, I've seen this panic twice on the Ubuntu 8.10 >> targets: >> >> [  330.155992] ib_srpt: disconnected session >> 0x00247100000000460024710000000046 because a new SRP_LOGIN_REQ has >> been received. >> [  357.207046] ib_srpt: srpt_xmit_response: tag= 17 channel in bad state 2 >> [  357.207052] ib_srpt: disconnected session >> 0x00247100000000460024710000000046 because a new SRP_LOGIN_REQ has >> been received. 
>> [  357.207100] ib_srpt: srpt_xmit_response: tag= 47 channel in bad state 2 >> [  357.207104] scst: ***ERROR***: Target driver ib_srpt >> xmit_response() returned fatal error >> [  357.241429] scst: ***ERROR***: Target driver ib_srpt >> xmit_response() returned fatal error >> [  357.250234] ------------[ cut here ]------------ >> [  357.250537] ib_srpt: srpt_xmit_response: tag= 26 channel in bad state 2 >> [  357.250539] scst: ***ERROR***: Target driver ib_srpt >> xmit_response() returned fatal error >> [  357.250550] ib_srpt: srpt_xmit_response: tag= 38 channel in bad state 2 >> [  357.250553] scst: ***ERROR***: Target driver ib_srpt >> xmit_response() returned fatal error >> [  357.250560] ib_srpt: srpt_xmit_response: tag= 27 channel in bad state 2 >> >> [  357.301253] kernel BUG at /root/scst/scst/src/scst_targ.c:3089! >> [  357.301253] invalid opcode: 0000 [1] SMP >> [  357.301253] CPU 0 >> ... >> [  357.301253] RIP: 0010:[]  [] >> scst_tgt_cmd_done+0x26/0x30 [scst] >> [  357.301253] RSP: 0018:ffff88039ad27b50  EFLAGS: 00010297 >> [  357.301253] RAX: 0000000000000200 RBX: ffff8803ad9c68f8 RCX: >> 0000000000000000 >> [  357.301253] RDX: 00000000ffffffff RSI: 0000000000000000 RDI: >> ffff8803ad9c68f8 >> [  357.301253] RBP: ffff88039ad27b50 R08: 0000000000000000 R09: >> 0000000000000000 >> [  357.301253] R10: ffff88039ad277c0 R11: ffff88041ad278cf R12: >> ffff8803c2972180 >> [  357.301253] R13: ffff88039ada0000 R14: 0000000000000001 R15: >> ffff8803fb00c2b0 >> [  357.301253] FS:  0000000000000000(0000) GS:ffffffff807dd000(0000) >> knlGS:0000000000000000 >> [  357.301253] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b >> [  357.301253] CR2: 00007f9281e64000 CR3: 0000000000201000 CR4: >> 00000000000006e0 >> [  357.301253] DR0: 0000000000000000 DR1: 0000000000000000 DR2: >> 0000000000000000 >> [  357.301253] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: >> 0000000000000400 >> [  357.301253] Process ib_cm/0 (pid: 8299, threadinfo >> ffff88039ad26000, task 
ffff88039ad40000) >> [  357.301253] Stack:  ffff88039ad27b80 ffffffffa04c0c47 >> ffff88039a8db900 ffff8803c2972180 >> [  357.301253]  ffff8803fb00c240 ffff8803fb00c284 ffff88039ad27bc0 >> ffffffffa04c0d93 >> [  357.301253]  ffff88042a4959c0 ffff88042a9d7800 ffff88042544da00 >> ffff88042a9d7898 >> [  357.301253] Call Trace: >> [  357.301253]  [] srpt_abort_scst_cmd+0xd7/0x160 >> [ib_srpt] >> [  357.301253]  [] srpt_release_channel+0xc3/0x190 >> [ib_srpt] >> [  357.301253]  [] >> srpt_find_and_release_channel+0x22/0x30 [ib_srpt] >> [  357.301253]  [] srpt_cm_handler+0x6d/0xbb8 [ib_srpt] > > It's because srpt called scst_tgt_cmd_done() when the corresponding command > hasn't yet been sent to xmit_response() callback, so srpt should use another > function to abort commands in this state. Could this be related to the hang (i.e. the command has been aborted before xmit_response has been called... but w/o causing a panic)? Thanks, Chris > > Vlad > > From sean.hefty at intel.com Wed Sep 16 23:09:16 2009 From: sean.hefty at intel.com (Sean Hefty) Date: Wed, 16 Sep 2009 23:09:16 -0700 Subject: [ofa-general] [RFC] 0/5: assistant to the IB communication manager Message-ID: The following collection of pseudo-patches implement a new user space package (IB ACM) designed to assist with connection establishment. A description is given below, copied from the acm_notes.txt file included with the package. The complete package is available on git.openfabrics.org/~shefty/ibacm.git and also in svn under branches/winverbs/ulp/ibacm. This is a request for both general and detailed feedback. The IB ACM has had very limited testing. Testing has been restricted to using the provided test utility, and invoking it from the windows version of the librdmacm on a single, small cluster. Calling it from the linux librdmacm is more involved and still under development. 
Signed-off-by: Sean Hefty
---

Assistant for InfiniBand Communication Management (IB ACM)

Note: The IB ACM should be considered experimental.

Overview
--------
The IB ACM package implements and provides a framework for experimental
name, address, and route resolution services over InfiniBand. It is
intended to address connection setup scalability issues when running MPI
applications on large clusters. The IB ACM provides information needed to
establish a connection, but does not implement the CM protocol. Long term,
the IB ACM may support multiple resolution mechanisms.

The IB ACM is focused on being scalable and efficient. The current
implementation limits network traffic, SA interactions, and centralized
services. As a trade-off, it is not expected to support all cluster
routing configurations. However, it is anticipated that additional
functionality, such as path record caching, can be incorporated into the
IB ACM to support a wider range of configurations.

The IB ACM package consists of three components: the ib_acm service, a
libibacm library, and a test/configuration utility - ib_acme. All are
userspace components and are available for Linux and Windows. Additional
details are given below.

Quick Start Guide
-----------------
1. Prerequisites: libibverbs and libibumad must be installed.
   The IB stack should be running with IPoIB configured.
2. Install the IB ACM package.
   This installs libibacm, ib_acm, and ib_acme.
3. Run ib_acme -A -O
   This will generate the IB ACM address and options configuration
   files (acm_addr.cfg and acm_opts.cfg).
4. Run ib_acm and leave it running.
5. Optionally, run ib_acme -s -d -v
   This will verify that the ib_acm service is running.
   It also verifies that the path is usable on the given cluster.
6. Install librdmacm.
7. Define the following environment variable: RDMA_CM_USE_IB_ACM=1
   The librdmacm will automatically use the ib_acm service.
   On failures, the librdmacm will fall back to normal resolution.
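The fallback behaviour described in the last Quick Start step can be sketched roughly as follows. This is only an illustration, not the librdmacm code: the resolve_fn type, the demo_* resolvers, and the rule that only the value "1" enables the service are all assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical resolver signature: returns 0 on success, nonzero on failure. */
typedef int (*resolve_fn)(const char *src, const char *dest);

/* Assumption: the service is requested only when the variable is exactly "1". */
static int acm_requested(const char *val)
{
	return val != NULL && strcmp(val, "1") == 0;
}

/*
 * Try the ib_acm based resolver first when it was requested; on any failure
 * (or when it was not requested) use the normal resolution path instead.
 */
static int resolve_with_fallback(int use_acm, resolve_fn acm, resolve_fn normal,
				 const char *src, const char *dest)
{
	assert(normal != NULL);
	if (use_acm && acm != NULL && acm(src, dest) == 0)
		return 0;
	return normal(src, dest);
}

/* Stand-in resolvers for demonstration only. */
static int demo_normal_calls;

static int demo_acm_ok(const char *src, const char *dest)
{
	(void) src; (void) dest;
	return 0;
}

static int demo_acm_fail(const char *src, const char *dest)
{
	(void) src; (void) dest;
	return -1;
}

static int demo_normal(const char *src, const char *dest)
{
	(void) src; (void) dest;
	demo_normal_calls++;
	return 0;
}
```

A caller would pass acm_requested(getenv("RDMA_CM_USE_IB_ACM")) as the first argument, so that an unset variable leaves the existing resolution path untouched.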
Details ------- libibacm: The libibacm is an end-user library with simple interfaces for communicating with the ib_acm service. The libibacm implements the ib_acm client protocol. Although the interfaces to the libibacm are considered experimental, it's expected that existing calls will be supported going forward. For simplicity, all calls operate synchronously and are serialized. Possible future changes to the libibacm would be to process calls in parallel and add asynchronous interfaces. ib_acme: The ib_acme program serves a dual role. It acts as a utility to test ib_acm operation and help verify if the ib_acm is usable for a given cluster configuration. Additionally, it automatically generates ib_acm configuration files to assist with or eliminate manual setup. acm configuration files: The ib_acm service relies on two configuration files. The acm_addr.cfg file contains name and address mappings for each IB endpoint. Although the names in the acm_addr.cfg file can be anything, ib_acme maps the host name and IP addresses to the IB endpoints. The acm_opts.cfg file provides a set of configurable options for the ib_acm service, such as timeout, number of retries, logging level, etc. ib_acme generates the acm_opts.cfg file using static information. A future enhancement would adjust options based on the current system and cluster size. ib_acm: The ib_acm service is responsible for resolving names and addresses to InfiniBand path information and caching such data. It is currently implemented as an executable application, but is a conceptual service or daemon that should execute with administrative privileges. The ib_acm implements a client interface over TCP sockets, which is abstracted by the libibacm library. One or more back-end protocols are used by the ib_acm service to satisfy user requests. Although the ib_acm supports standard SA path record queries on the back-end, it provides an experimental resolution protocol in hope of achieving greater scalability. 
Conceptually, the ib_acm service implements an ARP-like protocol and uses
IB multicast records to construct path record data. It makes the
assumption that a unicast path between two endpoints is realizable if
those endpoints can communicate over a multicast group with similar
properties (rate, mtu, etc.).

Specifically, all IB endpoints join a number of multicast groups.
Multicast groups differ based on rates, mtu, sl, etc., and are
prioritized. All participating endpoints must be able to communicate on
the lowest priority multicast group. The ib_acm assigns one or more
names/addresses to each IB endpoint using the acm_addr.cfg file. Clients
provide source and destination names or addresses as input to the
service, and receive path record data as output.

The service maps a client's source name/address to a local IB endpoint.
If the destination name/address is not cached locally, it sends a
multicast request out on the lowest priority multicast group on the local
endpoint. The request carries a list of multicast groups that the sender
can use. The recipient of the request selects the highest priority
multicast group that it can use as well and returns that information
directly to the sender. The request data is cached by all endpoints that
receive the multicast request message. The source endpoint also caches
the response and uses the multicast group that was selected to construct
path record data, which is returned to the client.

The current implementation of the IB ACM has several additional
restrictions. The ib_acm is limited in its handling of dynamic changes;
the ib_acm must be stopped and restarted if a cluster is reconfigured.
Cached data does not time out and is only updated if a new resolution
request is received from a different QPN than a cached request. Support
for IPv6 has not been verified. The number of addresses that can be
assigned to a single endpoint is limited to 4, and the number of
multicast groups that an endpoint can support is limited to 2.
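The group selection step described above (the recipient picks the highest priority multicast group that both sides can use) might look something like the sketch below. It is not taken from the ib_acm sources: struct mc_group, its fields, and select_group() are hypothetical names, and a larger priority value is assumed to mean "preferred".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for one multicast group; fields are illustrative. */
struct mc_group {
	uint8_t priority;	/* assumption: larger value = preferred */
	uint8_t rate;
	uint8_t mtu;
};

static int same_group(const struct mc_group *a, const struct mc_group *b)
{
	return a->priority == b->priority && a->rate == b->rate &&
	       a->mtu == b->mtu;
}

/*
 * Responder side: from the groups listed in the request, return the index
 * of the highest priority group that this endpoint has also joined, or -1
 * if the two endpoints share no group.
 */
static int select_group(const struct mc_group *offered, size_t n_offered,
			const struct mc_group *local, size_t n_local)
{
	int best = -1;
	size_t i, j;

	assert(offered != NULL && local != NULL);
	for (i = 0; i < n_offered; i++) {
		for (j = 0; j < n_local; j++) {
			if (!same_group(&offered[i], &local[j]))
				continue;
			if (best < 0 ||
			    offered[i].priority > offered[best].priority)
				best = (int) i;
		}
	}
	return best;
}
```

Since every participating endpoint must be reachable on the lowest priority group, a result of -1 would point to a configuration problem rather than a normal outcome.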
From sean.hefty at intel.com Wed Sep 16 23:13:57 2009
From: sean.hefty at intel.com (Sean Hefty)
Date: Wed, 16 Sep 2009 23:13:57 -0700
Subject: [ofa-general] [RFC] 1/5: ib_acm: linux abstractions
In-Reply-To:
References:
Message-ID:

The following abstractions are defined to support the IB ACM running on
Linux.

Signed-off-by: Sean Hefty
---

/*
 * Copyright (c) 2009 Intel Corporation. All rights reserved.
 *
 * This software is available to you under the OpenFabrics.org BSD license
 * below:
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 * - Redistributions of source code must retain the above
 *   copyright notice, this list of conditions and the following
 *   disclaimer.
 *
 * - Redistributions in binary form must reproduce the above
 *   copyright notice, this list of conditions and the following
 *   disclaimer in the documentation and/or other materials
 *   provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */

#if !defined(OSD_H)
#define OSD_H

#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include

#define LIB_DESTRUCTOR __attribute__((destructor))
#define CDECL_FUNC

/* Arguments are parenthesized so that expressions can be passed safely. */
#define container_of(ptr, type, field) \
	((type *) ((char *) (ptr) - offsetof(type, field)))

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

#if __BYTE_ORDER == __LITTLE_ENDIAN
#define htonll(x) bswap_64(x)
#else
#define htonll(x) (x)
#endif
#define ntohll(x) htonll(x)

typedef struct { volatile int val; } atomic_t;
#define atomic_inc(v) (__sync_fetch_and_add(&(v)->val, 1) + 1)
#define atomic_dec(v) (__sync_fetch_and_sub(&(v)->val, 1) - 1)
#define atomic_get(v) ((v)->val)
#define atomic_set(v, s) ((v)->val = (s))

#define stricmp strcasecmp
#define strnicmp strncasecmp

typedef struct { pthread_cond_t cond; pthread_mutex_t mutex; } event_t;

static inline void event_init(event_t *e)
{
	pthread_cond_init(&e->cond, NULL);
	pthread_mutex_init(&e->mutex, NULL);
}

#define event_signal(e) pthread_cond_signal(&(e)->cond)

/* Wait up to 'timeout' milliseconds for the event to be signaled. */
static inline int event_wait(event_t *e, int timeout)
{
	struct timeval curtime;
	struct timespec wait;
	int ret;

	gettimeofday(&curtime, NULL);
	wait.tv_sec = curtime.tv_sec + ((unsigned) timeout) / 1000;
	wait.tv_nsec = (curtime.tv_usec + (((unsigned) timeout) % 1000) * 1000) * 1000;
	/* Keep tv_nsec below one second; otherwise the wait fails with EINVAL. */
	if (wait.tv_nsec >= 1000000000) {
		wait.tv_sec++;
		wait.tv_nsec -= 1000000000;
	}

	pthread_mutex_lock(&e->mutex);
	ret = pthread_cond_timedwait(&e->cond, &e->mutex, &wait);
	pthread_mutex_unlock(&e->mutex);
	return ret;
}

#define lock_t pthread_mutex_t
#define lock_init(x) pthread_mutex_init(x, NULL)
#define lock_acquire pthread_mutex_lock
#define lock_release pthread_mutex_unlock

#define osd_init() 0
#define osd_close()

#define SOCKET int
#define SOCKET_ERROR -1
#define INVALID_SOCKET -1
#define socket_errno() errno
#define closesocket close

static inline uint64_t time_stamp_us(void)
{
	struct timeval curtime;

	timerclear(&curtime);
	gettimeofday(&curtime, NULL);
	return (uint64_t) curtime.tv_sec * 1000000 + (uint64_t) curtime.tv_usec;
}

#define time_stamp_ms() (time_stamp_us() / 1000)

static inline int beginthread(void (*func)(void *), void *arg)
{
	pthread_t thread;

	return pthread_create(&thread, NULL, (void *(*)(void*)) func, arg);
}

#endif /* OSD_H */

/*
 * Copyright (c) 2009 Intel Corporation. All rights reserved.
* * This software is available to you under the OpenIB.org BSD license * below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AWV * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #ifndef _DLIST_H_ #define _DLIST_H_ #ifdef __cplusplus extern "C" { #endif typedef struct _DLIST_ENTRY { struct _DLIST_ENTRY *Next; struct _DLIST_ENTRY *Prev; } DLIST_ENTRY; static void DListInit(DLIST_ENTRY *pHead) { pHead->Next = pHead; pHead->Prev = pHead; } static int DListEmpty(DLIST_ENTRY *pHead) { return pHead->Next == pHead; } static void DListInsertAfter(DLIST_ENTRY *pNew, DLIST_ENTRY *pHead) { pNew->Next = pHead->Next; pNew->Prev = pHead; pHead->Next->Prev = pNew; pHead->Next = pNew; } static void DListInsertBefore(DLIST_ENTRY *pNew, DLIST_ENTRY *pHead) { DListInsertAfter(pNew, pHead->Prev); } #define DListInsertHead DListInsertAfter #define DListInsertTail DListInsertBefore static void DListRemove(DLIST_ENTRY *pEntry) { pEntry->Prev->Next = pEntry->Next; pEntry->Next->Prev = pEntry->Prev; } #ifdef __cplusplus } #endif #endif // _DLIST_H_ From sean.hefty at intel.com Wed Sep 16 23:27:34 2009 From: sean.hefty at intel.com (Sean Hefty) Date: Wed, 16 Sep 2009 23:27:34 -0700 Subject: [ofa-general] [RFC] 2/5: IB ACM: windows abstractions In-Reply-To: References: Message-ID: The following abstractions are defined to support the IB ACM running on Windows. An attempt was made to limit the number of dependencies on external libraries, such as complib. We add Windows support for the Linux 'search' binary tree interfaces. This is implemented on Windows using complib fleximap, but gets linked in statically. Signed-off-by: Sean Hefty --- /* * Copyright (c) 2009 Intel Corporation. All rights reserved. * * This software is available to you under the OpenFabrics.org BSD license * below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. 
 *
 * - Redistributions in binary form must reproduce the above
 *   copyright notice, this list of conditions and the following
 *   disclaimer in the documentation and/or other materials
 *   provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */

#if !defined(OSD_H)
#define OSD_H

#include
#include
#include

#define __func__ __FUNCTION__
#define LIB_DESTRUCTOR
#define CDECL_FUNC __cdecl

typedef struct { volatile LONG val; } atomic_t;
#define atomic_inc(v) InterlockedIncrement(&(v)->val)
#define atomic_dec(v) InterlockedDecrement(&(v)->val)
#define atomic_get(v) ((v)->val)
#define atomic_set(v, s) ((v)->val = (s))

#define event_t HANDLE
#define event_init(e) (*(e) = CreateEvent(NULL, FALSE, FALSE, NULL))
#define event_signal(e) SetEvent(*(e))
#define event_wait(e, t) WaitForSingleObject(*(e), t)

#define lock_t CRITICAL_SECTION
#define lock_init InitializeCriticalSection
#define lock_acquire EnterCriticalSection
#define lock_release LeaveCriticalSection

static __inline int osd_init()
{
	WSADATA wsadata;

	return WSAStartup(MAKEWORD(2, 2), &wsadata);
}

static __inline void osd_close()
{
	WSACleanup();
}

#define stricmp _stricmp
#define strnicmp _strnicmp

#define socket_errno WSAGetLastError
#define SHUT_RDWR SD_BOTH

static __inline UINT64 time_stamp_us(void)
{
	LARGE_INTEGER cnt, freq;
	UINT64 sec, rem;

	QueryPerformanceFrequency(&freq);
	QueryPerformanceCounter(&cnt);
	/*
	 * Convert whole seconds and the remainder separately. Dividing first
	 * would truncate to whole seconds; multiplying the raw count first
	 * could overflow.
	 */
	sec = (UINT64) cnt.QuadPart / (UINT64) freq.QuadPart;
	rem = (UINT64) cnt.QuadPart % (UINT64) freq.QuadPart;
	return sec * 1000000 + rem * 1000000 / (UINT64) freq.QuadPart;
}

/* Microseconds to milliseconds: divide, as in the Linux version. */
#define time_stamp_ms() (time_stamp_us() / 1000)

#define getpid() ((int) GetCurrentProcessId())
#define beginthread(func, arg) (int) _beginthread(func, 0, arg)
#define container_of CONTAINING_RECORD

#endif /* OSD_H */

/*
 * Copyright (c) 2009 Intel Corp, Inc. All rights reserved.
 *
 * This software is available to you under a choice of one of two
 * licenses. You may choose to be licensed under the terms of the GNU
 * General Public License (GPL) Version 2, available from the file
 * COPYING in the main directory of this source tree, or the
 * OpenIB.org BSD license below:
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 * - Redistributions of source code must retain the above
 *   copyright notice, this list of conditions and the following
 *   disclaimer.
 *
 * - Redistributions in binary form must reproduce the above
 *   copyright notice, this list of conditions and the following
 *   disclaimer in the documentation and/or other materials
 *   provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
* */ #ifndef _SEARCH_H_ #define _SEARCH_H_ #include //typedef enum //{ // preorder, // postorder, // endorder, // leaf // //} VISIT; void *tsearch(const void *key, void **rootp, int (*compar)(const void *, const void *)); void *tfind(const void *key, void *const *rootp, int (*compar)(const void *, const void *)); /* tdelete returns key if found (not parent), otherwise NULL */ void *tdelete(const void *key, void **rootp, int (*compar)(const void *, const void *)); //void twalk(const void *root, // void (*action)(const void *, VISIT, int)); //void tdestroy(void *root, void (*free_node)(void *nodep)); #endif /* _SEARCH_H_ */ /* * Copyright (c) 2009 Intel Corp., Inc. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
 *
 */

#include
#include

/* Comparison callback of the most recent call; not safe for concurrent use. */
static int (*compare)(const void *, const void *);

static intn_t fcompare(const void * const key1, const void * const key2)
{
	return (intn_t) compare((void *) key1, (void *) key2);
}

void *tsearch(const void *key, void **rootp,
	      int (*compar)(const void *, const void *))
{
	cl_fmap_item_t *item, *map_item;

	if (!*rootp) {
		*rootp = malloc(sizeof(cl_fmap_t));
		if (!*rootp)
			return NULL;
		cl_fmap_init((cl_fmap_t *) *rootp, fcompare);
	}

	compare = compar;
	item = malloc(sizeof(cl_fmap_item_t));
	if (!item)
		return NULL;

	/* If the key already exists, the existing item is returned instead. */
	map_item = cl_fmap_insert((cl_fmap_t *) *rootp, key, item);
	if (map_item != item)
		free(item);

	return (void *) &map_item->p_key;
}

void *tfind(const void *key, void *const *rootp,
	    int (*compar)(const void *, const void *))
{
	cl_fmap_item_t *item;

	if (!*rootp)
		return NULL;

	compare = compar;
	item = cl_fmap_get((cl_fmap_t *) *rootp, key);
	if (item == cl_fmap_end((cl_fmap_t *) *rootp))
		return NULL;

	return (void *) &item->p_key;
}

/*
 * Returns NULL if the item is not found, or the item itself. This differs
 * from the tdelete call by not returning the parent item, but works if
 * the user is only checking against NULL.
 */
void *tdelete(const void *key, void **rootp,
	      int (*compar)(const void *, const void *))
{
	cl_fmap_item_t *item;
	void *map_key;

	if (!*rootp)
		return NULL;

	compare = compar;
	item = cl_fmap_remove((cl_fmap_t *) *rootp, key);
	if (item == cl_fmap_end((cl_fmap_t *) *rootp))
		return NULL;

	map_key = (void *) item->p_key;
	free(item);
	return map_key;
}

//void twalk(const void *root,
//	void (*action)(const void *, VISIT, int))
//{
//}

//void tdestroy(void *root, void (*free_node)(void *nodep))
//{
//}

From sean.hefty at intel.com Wed Sep 16 23:45:05 2009
From: sean.hefty at intel.com (Sean Hefty)
Date: Wed, 16 Sep 2009 23:45:05 -0700
Subject: [ofa-general] [RFC] 3/5: IB ACM: libibacm
In-Reply-To:
References:
Message-ID:

Add an end-user library with simple interfaces for communicating with the
ib_acm service.
The linux and windows specific files for the library are simple and not shown for this review Signed-off-by: Sean Hefty --- ib_acm.h: defines library interfaces. These are the end-user application interfaces to the ib acm. /* * Copyright (c) 2009 Intel Corporation. All rights reserved. * * This software is available to you under the OpenFabrics.org BSD license * below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AWV * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #if !defined(IB_ACM_H) #define IB_ACM_H #include #if defined(_WIN32) #define LIB_EXPORT __declspec(dllexport) #else #define LIB_EXPORT #endif #ifdef __cplusplus extern "C" { #endif struct ib_acm_dev_addr { uint64_t guid; uint16_t pkey_index; uint8_t port_num; uint8_t reserved[5]; }; struct ib_acm_resolve_data { uint32_t reserved1; uint8_t init_depth; uint8_t resp_resources; uint8_t packet_lifetime; uint8_t mtu; uint8_t reserved2[8]; }; /** * ib_acm_resolve_name - Resolve path data between the specified names. 
* Description: * Discover path information, including identifying the local device, * between the given the source and destination names. * Notes: * The source and destination names should match entries in acm_addr.cfg * configuration files on their respective systems. Typically, the * source and destination names will refer to system host names * assigned to an Infiniband port. */ LIB_EXPORT int ib_acm_resolve_name(char *src, char *dest, struct ib_acm_dev_addr *dev_addr, struct ibv_ah_attr *ah, struct ib_acm_resolve_data *data); /** * ib_acm_resolve_ip - Resolve path data between the specified addresses. * Description: * Discover path information, including identifying the local device, * between the given the source and destination addresses. * Notes: * The source and destination addresses should match entries in acm_addr.cfg * configuration files on their respective systems. Typically, the * source and destination addresses will refer to IP addresses assigned * to an IPoIB instance. */ LIB_EXPORT int ib_acm_resolve_ip(struct sockaddr *src, struct sockaddr *dest, struct ib_acm_dev_addr *dev_addr, struct ibv_ah_attr *ah, struct ib_acm_resolve_data *data); #define IB_PATH_RECORD_REVERSIBLE 0x80 struct ib_path_record { uint64_t service_id; union ibv_gid dgid; union ibv_gid sgid; uint16_t dlid; uint16_t slid; uint32_t flowlabel_hoplimit; /* resv-31:28 flow label-27:8 hop limit-7:0*/ uint8_t tclass; uint8_t reversible_numpath; /* reversible-7:7 num path-6:0 */ uint16_t pkey; uint16_t qosclass_sl; /* qos class-15:4 sl-3:0 */ uint8_t mtu; /* mtu selector-7:6 mtu-5:0 */ uint8_t rate; /* rate selector-7:6 rate-5:0 */ uint8_t packetlifetime; /* lifetime selector-7:6 lifetime-5:0 */ uint8_t preference; uint8_t reserved[6]; }; /** * ib_acm_resolve_path - Resolve path data meeting specified restrictions * Description: * Discover path information using the provided path record to * restrict the discovery. 
 * Notes:
 *   Uses the provided path record as input into a query for path
 *   information.  If successful, fills in any missing information.  The
 *   caller must provide at least the source and destination LIDs as input.
 */
LIB_EXPORT
int ib_acm_resolve_path(struct ib_path_record *path);

/**
 * ib_acm_query_path - Resolve path data meeting specified restrictions
 * Description:
 *   Queries the IB SA for a path record using the provided path record to
 *   restrict the query.
 * Notes:
 *   Uses the provided path record as input into an SA query for path
 *   information.  If successful, fills in any missing information.  The
 *   caller must provide at least the source and destination LIDs as input.
 *   Use of this call always results in sending a query to the IB SA.
 */
LIB_EXPORT
int ib_acm_query_path(struct ib_path_record *path);

/**
 * ib_acm_convert_to_path - Convert resolved path data to a path record
 * Description:
 *   Converts path information returned from resolving a host name or address
 *   to the format of an IB path record.
 */
LIB_EXPORT
int ib_acm_convert_to_path(struct ib_acm_dev_addr *dev_addr,
	struct ibv_ah_attr *ah, struct ib_acm_resolve_data *data,
	struct ib_path_record *path);

#ifdef __cplusplus
}
#endif

#endif /* IB_ACM_H */


acm.h: defines the client/application side protocol of the ib acm service

/*
 * Copyright (c) 2009 Intel Corporation. All rights reserved.
 *
 * This software is available to you under the OpenFabrics.org BSD license
 * below:
 *
 *     Redistribution and use in source and binary forms, with or
 *     without modification, are permitted provided that the following
 *     conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AWV * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #if !defined(ACM_H) #define ACM_H #include #define ACM_VERSION 1 #define ACM_OP_MASK 0x0F #define ACM_OP_RESOLVE 0x01 #define ACM_OP_QUERY 0x02 //#define ACM_OP_CM 0x03 //#define ACM_OP_ACK_REQ 0x40 /* optional ack is required */ #define ACM_OP_ACK 0x80 #define ACM_STATUS_SUCCESS 0 #define ACM_STATUS_ENOMEM 1 #define ACM_STATUS_EINVAL 2 #define ACM_STATUS_ENODATA 3 #define ACM_STATUS_ENOTCONN 5 #define ACM_STATUS_ETIMEDOUT 6 #define ACM_STATUS_ESRCADDR 7 #define ACM_STATUS_ESRCTYPE 8 #define ACM_STATUS_EDESTADDR 9 #define ACM_STATUS_EDESTTYPE 10 struct acm_hdr { uint8_t version; uint8_t opcode; uint8_t status; uint8_t param; uint8_t dest_type; uint8_t src_type; uint8_t reserved[2]; uint64_t tid; }; #define ACM_EP_TYPE_NAME 0x01 #define ACM_EP_TYPE_ADDRESS_IP 0x02 #define ACM_EP_TYPE_ADDRESS_IP6 0x03 #define ACM_EP_TYPE_DEVICE 0x10 #define ACM_EP_TYPE_AV 0x20 #define ACM_MAX_ADDRESS 32 union acm_ep_addr { uint8_t addr[ACM_MAX_ADDRESS]; uint8_t name[ACM_MAX_ADDRESS]; struct ib_acm_dev_addr dev; struct ibv_ah_attr av; }; struct acm_resolve_msg { struct acm_hdr hdr; union acm_ep_addr src; union acm_ep_addr dest; struct ib_acm_resolve_data data; }; //struct acm_cm_param //{ // uint32_t qpn; // uint8_t init_depth; // uint8_t resp_resources; // uint8_t retry_cnt; // uint8_t rnr_retry_cnt; // uint16_t src_port; // uint16_t dest_port; // uint8_t reserved[4]; //}; //struct acm_cm_msg //{ // struct acm_hdr hdr; // union acm_ep_addr src; // union acm_ep_addr dest; // struct acm_cm_param param; //}; 
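A minimal sketch of how a receiver might decode the header defined above. It restates only struct acm_hdr and the opcode values from acm.h so that it compiles on its own. The helpers acm_hdr_op() and acm_hdr_is_ack() are hypothetical names, not part of the posted patch, and treating ACM_OP_ACK as a flag OR-ed into the opcode of a reply is an assumption suggested by ACM_OP_MASK, not something this header states.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from acm.h in this patch. */
#define ACM_VERSION    1
#define ACM_OP_MASK    0x0F
#define ACM_OP_RESOLVE 0x01
#define ACM_OP_QUERY   0x02
#define ACM_OP_ACK     0x80

/* Same field order as struct acm_hdr in acm.h. */
struct acm_hdr {
	uint8_t  version;
	uint8_t  opcode;
	uint8_t  status;
	uint8_t  param;
	uint8_t  dest_type;
	uint8_t  src_type;
	uint8_t  reserved[2];
	uint64_t tid;
};

/* Hypothetical helper: operation code with any flag bits masked off. */
static uint8_t acm_hdr_op(const struct acm_hdr *hdr)
{
	assert(hdr);
	return hdr->opcode & ACM_OP_MASK;
}

/* Hypothetical helper: nonzero if the ACK bit is set (assumed to mark a reply). */
static int acm_hdr_is_ack(const struct acm_hdr *hdr)
{
	assert(hdr);
	return (hdr->opcode & ACM_OP_ACK) != 0;
}
```

On common ABIs the eight one-byte fields followed by the 64-bit tid should give a 16-byte header with no padding, which would make struct acm_msg 16 + 80 = 96 bytes. That is worth checking on each target rather than assuming.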
#define ACM_QUERY_PATH_RECORD 0x01 #define ACM_QUERY_SA 0x80 #define ACM_EP_TYPE_LID 0x01 #define ACM_EP_TYPE_GID 0x02 union acm_query_data { struct ib_path_record path; }; struct acm_query_msg { struct acm_hdr hdr; union acm_query_data data; uint8_t reserved[16]; }; #define ACM_MSG_DATA_SIZE 80 struct acm_msg { struct acm_hdr hdr; uint8_t data[ACM_MSG_DATA_SIZE]; }; #endif /* ACM_H */ libibacm.c: brain dead implementation that handles the client side protocol of the ib acm for the user /* * Copyright (c) 2009 Intel Corporation. All rights reserved. * * This software is available to you under the OpenIB.org BSD license * below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AWV * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
 */

#include
#include
#include
#include

extern lock_t lock;
static SOCKET sock = INVALID_SOCKET;
static short server_port = 6125;
static int ready;

static int acm_init(void)
{
	struct sockaddr_in addr;
	int ret;

	ret = osd_init();
	if (ret)
		return ret;

	sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
	if (sock == INVALID_SOCKET) {
		ret = socket_errno();
		goto err1;
	}

	memset(&addr, 0, sizeof addr);
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = htons(server_port);
	ret = connect(sock, (struct sockaddr *) &addr, sizeof(addr));
	if (ret)
		goto err2;

	ready = 1;
	return 0;

err2:
	closesocket(sock);
	sock = INVALID_SOCKET;
err1:
	osd_close();
	return ret;
}

void LIB_DESTRUCTOR acm_cleanup(void)
{
	if (sock != INVALID_SOCKET) {
		shutdown(sock, SHUT_RDWR);
		closesocket(sock);
	}
}

static int acm_resolve(uint8_t *src, uint8_t *dest, uint8_t type,
	struct ib_acm_dev_addr *dev_addr, struct ibv_ah_attr *ah,
	struct ib_acm_resolve_data *data)
{
	struct acm_resolve_msg msg;
	int ret;

	lock_acquire(&lock);
	if (!ready && (ret = acm_init()))
		goto out;

	memset(&msg, 0, sizeof msg);
	msg.hdr.version = ACM_VERSION;
	msg.hdr.opcode = ACM_OP_RESOLVE;
	msg.hdr.dest_type = type;
	msg.hdr.src_type = type;

	switch (type) {
	case ACM_EP_TYPE_NAME:
		strncpy((char *) msg.src.name, (char *) src, ACM_MAX_ADDRESS);
		strncpy((char *) msg.dest.name, (char *) dest, ACM_MAX_ADDRESS);
		break;
	case ACM_EP_TYPE_ADDRESS_IP:
		memcpy(msg.src.addr, &((struct sockaddr_in *) src)->sin_addr, 4);
		memcpy(msg.dest.addr, &((struct sockaddr_in *) dest)->sin_addr, 4);
		break;
	case ACM_EP_TYPE_ADDRESS_IP6:
		/* Both endpoints are IPv6 here, so read sin6_addr on each side. */
		memcpy(msg.src.addr, &((struct sockaddr_in6 *) src)->sin6_addr, 16);
		memcpy(msg.dest.addr, &((struct sockaddr_in6 *) dest)->sin6_addr, 16);
		break;
	case ACM_EP_TYPE_AV:
		memcpy(&msg.src.av, src, sizeof(msg.src.av));
		memcpy(&msg.dest.av, dest, sizeof(msg.dest.av));
		break;
	default:
		ret = -1;
		goto out;
	}

	ret = send(sock, (char *) &msg, sizeof msg, 0);
	if (ret != sizeof msg)
		goto out;

	ret = recv(sock, (char *) &msg,
sizeof msg, 0); if (ret != sizeof msg) goto out; memcpy(dev_addr, &msg.src.dev, sizeof(*dev_addr)); *ah = msg.dest.av; memcpy(data, &msg.data, sizeof(*data)); ret = 0; out: lock_release(&lock); return ret; } LIB_EXPORT int ib_acm_resolve_name(char *src, char *dest, struct ib_acm_dev_addr *dev_addr, struct ibv_ah_attr *ah, struct ib_acm_resolve_data *data) { return acm_resolve((uint8_t *) src, (uint8_t *) dest, ACM_EP_TYPE_NAME, dev_addr, ah, data); } LIB_EXPORT int ib_acm_resolve_ip(struct sockaddr *src, struct sockaddr *dest, struct ib_acm_dev_addr *dev_addr, struct ibv_ah_attr *ah, struct ib_acm_resolve_data *data) { if (((struct sockaddr *) dest)->sa_family == AF_INET) { return acm_resolve((uint8_t *) src, (uint8_t *) dest, ACM_EP_TYPE_ADDRESS_IP, dev_addr, ah, data); } else { return acm_resolve((uint8_t *) src, (uint8_t *) dest, ACM_EP_TYPE_ADDRESS_IP6, dev_addr, ah, data); } } static int acm_query_path(struct ib_path_record *path, uint8_t query_sa) { struct acm_query_msg msg; int ret; lock_acquire(&lock); if (!ready && (ret = acm_init())) goto out; memset(&msg, 0, sizeof msg); msg.hdr.version = ACM_VERSION; msg.hdr.opcode = ACM_OP_QUERY; msg.hdr.param = ACM_QUERY_PATH_RECORD | query_sa; if (path->dgid.global.interface_id || path->dgid.global.subnet_prefix) { msg.hdr.dest_type = ACM_EP_TYPE_GID; } else if (path->dlid) { msg.hdr.dest_type = ACM_EP_TYPE_LID; } else { ret = -1; goto out; } if (path->sgid.global.interface_id || path->sgid.global.subnet_prefix) { msg.hdr.src_type = ACM_EP_TYPE_GID; } else if (path->slid) { msg.hdr.src_type = ACM_EP_TYPE_LID; } else { ret = -1; goto out; } msg.data.path = *path; ret = send(sock, (char *) &msg, sizeof msg, 0); if (ret != sizeof msg) goto out; ret = recv(sock, (char *) &msg, sizeof msg, 0); if (ret != sizeof msg) goto out; *path = msg.data.path; ret = msg.hdr.status; out: lock_release(&lock); return ret; } LIB_EXPORT int ib_acm_query_path(struct ib_path_record *path) { return acm_query_path(path, ACM_QUERY_SA); } 
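The send()/recv() pairs above treat anything other than a full-sized transfer as failure. On a TCP stream, recv() may legitimately return fewer bytes than requested, so one possible hardening is a read loop. A minimal POSIX-only sketch follows. recv_all() is a hypothetical helper, not part of the posted patch, and it takes a plain int descriptor rather than the patch's SOCKET abstraction.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not in the posted patch): read exactly len bytes
 * from a stream socket.  Returns 0 on success, or -1 on a receive error
 * or if the peer closes the connection before len bytes have arrived.
 */
static int recv_all(int fd, void *buf, size_t len)
{
	char *p = buf;
	ssize_t n;

	assert(buf);
	while (len > 0) {
		n = recv(fd, p, len, 0);
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted by a signal, retry */
			return -1;		/* real receive error */
		}
		if (n == 0)
			return -1;		/* peer closed early */
		p += n;
		len -= (size_t) n;
	}
	return 0;
}
```

With such a helper, the receive step in acm_query_path() could become ret = recv_all(sock, &msg, sizeof msg), at the cost of changing ret from a byte count to a status code at that point.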
LIB_EXPORT int ib_acm_resolve_path(struct ib_path_record *path) { return acm_query_path(path, 0); } static struct ibv_context *acm_open_device(uint64_t guid) { struct ibv_device **dev_array; struct ibv_context *verbs = NULL; int i, cnt; dev_array = ibv_get_device_list(&cnt); if (!dev_array) return NULL; for (i = 0; i < cnt; i++) { if (guid == ibv_get_device_guid(dev_array[i])) { verbs = ibv_open_device(dev_array[i]); break; } } ibv_free_device_list(dev_array); return verbs; } LIB_EXPORT int ib_acm_convert_to_path(struct ib_acm_dev_addr *dev_addr, struct ibv_ah_attr *ah, struct ib_acm_resolve_data *data, struct ib_path_record *path) { struct ibv_context *verbs; struct ibv_port_attr attr; int ret; verbs = acm_open_device(dev_addr->guid); if (!verbs) return -1; if (ah->is_global) { path->dgid = ah->grh.dgid; ret = ibv_query_gid(verbs, dev_addr->port_num, ah->grh.sgid_index, &path->sgid); if (ret) goto out; path->flowlabel_hoplimit = htonl(ah->grh.flow_label << 8 | (uint32_t) ah->grh.hop_limit); path->tclass = ah->grh.traffic_class; } path->dlid = htons(ah->dlid); ret = ibv_query_port(verbs, dev_addr->port_num, &attr); if (ret) goto out; path->slid = htons(attr.lid | ah->src_path_bits); path->reversible_numpath = IB_PATH_RECORD_REVERSIBLE | 1; ret = ibv_query_pkey(verbs, dev_addr->port_num, dev_addr->pkey_index, &path->pkey); if (ret) goto out; path->pkey = htons(path->pkey); path->qosclass_sl = htons((uint16_t) ah->sl); path->mtu = (2 << 6) | data->mtu; path->rate = (2 << 6) | ah->static_rate; path->packetlifetime = (2 << 6) | data->packet_lifetime; out: ibv_close_device(verbs); return ret; } From sean.hefty at intel.com Wed Sep 16 23:54:18 2009 From: sean.hefty at intel.com (Sean Hefty) Date: Wed, 16 Sep 2009 23:54:18 -0700 Subject: [ofa-general] [RFC] 4/5: IB ACM: ib_acme test/configuration utility In-Reply-To: References: Message-ID: <6D28DCCC059E4D9C8C1B848A51BB0F84@amr.corp.intel.com> Add a test/configuration utility to setup the ib_acm service and verify its 
operation.

Signed-off-by: Sean Hefty
---
One of the eventual goals is for the librdmacm library to use the ib acm, so
a decision was made to avoid the ib acm package needing to depend on the
librdmacm.  This led to OS specific code being needed to map IP addresses to
IB endpoints.  If anyone has an easier solution for handling this mapping,
I'm open to alternatives here.

acme.c: OS independent source file

/*
 * Copyright (c) 2009 Intel Corporation. All rights reserved.
 *
 * This software is available to you under the OpenIB.org BSD license
 * below:
 *
 *     Redistribution and use in source and binary forms, with or
 *     without modification, are permitted provided that the following
 *     conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
*/ #include #include #include #include #include #include #include #include #include static char *dest_addr; static char *src_addr; static char addr_type = 'i'; static int verify; static int make_addr; static int make_opts; struct ibv_context **verbs; int dev_cnt; extern int gen_addr_ip(FILE *f); static void show_usage(char *program) { printf("usage 1: %s\n", program); printf(" [-f addr_format] - i(p), n(ame), or l(id)\n"); printf(" default: 'i'\n"); printf(" -s src_addr - format defined by -f option\n"); printf(" -d dest_addr - format defined by -f option\n"); printf(" [-v] - verify ACM response against SA query response\n"); printf("usage 2: %s\n", program); printf(" -A - generate local acm_addr.cfg configuration file\n"); printf(" -O - generate local acm_ops.cfg options file\n"); } static void gen_opts_temp(FILE *f) { fprintf(f, "# InfiniBand Multicast Communication Manager for clusters configuration file\n"); fprintf(f, "#\n"); fprintf(f, "# Use ib_acme utility with -O option to automatically generate a sample\n"); fprintf(f, "# acm_opts.cfg file for the current system.\n"); fprintf(f, "#\n"); fprintf(f, "# Entry format is:\n"); fprintf(f, "# name value\n"); fprintf(f, "\n"); fprintf(f, "# log_file:\n"); fprintf(f, "# Specifies the location of the ACM service output. The log file is used to\n"); fprintf(f, "# assist with ACM service debugging and troubleshooting. The log_file can\n"); fprintf(f, "# be set to 'stdout', 'stderr', or the base name of a file. If a file name\n"); fprintf(f, "# is specified, the actual name formed by appending a process ID and '.log'\n"); fprintf(f, "# extension to the end of the specified file name.\n"); fprintf(f, "# Examples:\n"); fprintf(f, "# log_file stdout\n"); fprintf(f, "# log_file stderr\n"); fprintf(f, "# log_file /tmp/acm_\n"); fprintf(f, "\n"); fprintf(f, "log_file stdout\n"); fprintf(f, "\n"); fprintf(f, "# log_level:\n"); fprintf(f, "# Indicates the amount of detailed data written to the log file. 
Log levels\n"); fprintf(f, "# should be one of the following values:\n"); fprintf(f, "# 0 - basic configuration & errors\n"); fprintf(f, "# 1 - verbose configuation & errors\n"); fprintf(f, "# 2 - verbose operation\n"); fprintf(f, "\n"); fprintf(f, "log_level 0\n"); fprintf(f, "\n"); fprintf(f, "# server_port:\n"); fprintf(f, "# TCP port number that the server listens on.\n"); fprintf(f, "# If this value is changed, then a corresponding change is required for\n"); fprintf(f, "# client applications.\n"); fprintf(f, "\n"); fprintf(f, "server_port 6125\n"); fprintf(f, "\n"); fprintf(f, "# timeout:\n"); fprintf(f, "# Additional time, in milliseconds, that the ACM service will wait for a\n"); fprintf(f, "# response from a remote ACM service or the IB SA. The actual request\n"); fprintf(f, "# timeout is this value plus the subnet timeout.\n"); fprintf(f, "\n"); fprintf(f, "timeout 2000\n"); fprintf(f, "\n"); fprintf(f, "# retries:\n"); fprintf(f, "# Number of times that the ACM service will retry a request. This affects\n"); fprintf(f, "# both ACM multicast messages and and IB SA messages.\n"); fprintf(f, "\n"); fprintf(f, "retries 15\n"); fprintf(f, "\n"); fprintf(f, "# send_depth:\n"); fprintf(f, "# Specifies the maximum number of outstanding requests that can be in\n"); fprintf(f, "# progress simultaneously. A larger send depth allows for greater\n"); fprintf(f, "# parallelism, but increases system resource usage and subnet load.\n"); fprintf(f, "# If the number of pending requests is greater than the send_depth,\n"); fprintf(f, "# the additional requests will automatically be queued until some of\n"); fprintf(f, "# the previous requests complete.\n"); fprintf(f, "\n"); fprintf(f, "send_depth 8\n"); fprintf(f, "\n"); fprintf(f, "# recv_depth:\n"); fprintf(f, "# Specifies the number of buffers allocated and ready to receive remote\n"); fprintf(f, "# requests. 
A larger receive depth consumes more system resources, but\n"); fprintf(f, "# can avoid dropping requests due to insufficient receive buffers.\n"); fprintf(f, "\n"); fprintf(f, "recv_depth 1024\n"); fprintf(f, "\n"); fprintf(f, "# min_mtu:\n"); fprintf(f, "# Indicates the minimum MTU supported by the ACM service. The ACM service\n"); fprintf(f, "# negotiates to use the largest MTU available between both sides of a\n"); fprintf(f, "# connection. It is most efficient and recommended that min_mtu be set\n"); fprintf(f, "# to the largest MTU value supported by all nodes in a cluster.\n"); fprintf(f, "\n"); fprintf(f, "min_mtu 2048\n"); fprintf(f, "\n"); fprintf(f, "#min_rate:\n"); fprintf(f, "# Indicates the minimum link rate, in Gbps, supported by the ACM service.\n"); fprintf(f, "# The ACM service negotiates to use the highest rate available between both\n"); fprintf(f, "# sides of a connection. It is most efficient and recommended that the\n"); fprintf(f, "# min_rate be set to the largest rate supported by all nodes in a cluster.\n"); fprintf(f, "\n"); fprintf(f, "min_rate 10\n"); fprintf(f, "\n"); } static int gen_opts(void) { FILE *f; printf("Generating acm_opts.cfg\n"); if (!(f = fopen("acm_opts.cfg", "w"))) { printf("Failed to open option configuration file\n"); return -1; } gen_opts_temp(f); fclose(f); return 0; } static void gen_addr_temp(FILE *f) { fprintf(f, "# InfiniBand Communication Management Assistant for clusters address file\n"); fprintf(f, "#\n"); fprintf(f, "# Use ib_acme utility with -G option to automatically generate a sample\n"); fprintf(f, "# acm_addr.cfg file for the current system.\n"); fprintf(f, "#\n"); fprintf(f, "# Entry format is:\n"); fprintf(f, "# address device port pkey\n"); fprintf(f, "#\n"); fprintf(f, "# The address may be one of the following:\n"); fprintf(f, "# host_name - ascii character string, up to 31 characters\n"); fprintf(f, "# address - IPv4 or IPv6 formatted address\n"); fprintf(f, "#\n"); fprintf(f, "# device name - 
struct ibv_device name\n");
	fprintf(f, "# port number - valid port number on device (numbering starts at 1)\n");
	fprintf(f, "# pkey - partition key in hex (can specify 'default' for pkey 0xFFFF)\n");
	fprintf(f, "#\n");
	fprintf(f, "# Up to 4 addresses can be associated with a given tuple\n");
	fprintf(f, "#\n");
	fprintf(f, "# Samples:\n");
	fprintf(f, "# node31 ibv_device0 1 default\n");
	fprintf(f, "# node31-1 ibv_device0 1 0x00FF\n");
	fprintf(f, "# node31-2 ibv_device0 2 0x00FF\n");
	fprintf(f, "# 192.168.0.1 ibv_device0 1 0xFFFF\n");
	fprintf(f, "# 192.168.0.2 ibv_device0 2 default\n");
}

static int open_verbs(void)
{
	struct ibv_device **dev_array;
	int i, ret;

	dev_array = ibv_get_device_list(&dev_cnt);
	if (!dev_array) {
		printf("ibv_get_device_list - no devices present?\n");
		return -1;
	}

	verbs = malloc(sizeof(struct ibv_context *) * dev_cnt);
	if (!verbs) {
		ret = -1;
		goto err1;
	}

	for (i = 0; i < dev_cnt; i++) {
		verbs[i] = ibv_open_device(dev_array[i]);
		/* Check the context just opened, not the array pointer. */
		if (!verbs[i]) {
			printf("ibv_open_device - failed to open device\n");
			ret = -1;
			goto err2;
		}
	}

	ibv_free_device_list(dev_array);
	return 0;

err2:
	while (i--)
		ibv_close_device(verbs[i]);
	free(verbs);
err1:
	ibv_free_device_list(dev_array);
	return ret;
}

static void close_verbs(void)
{
	int i;

	for (i = 0; i < dev_cnt; i++)
		ibv_close_device(verbs[i]);
	free(verbs);
}

static int gen_addr_names(FILE *f)
{
	struct ibv_device_attr dev_attr;
	struct ibv_port_attr port_attr;
	int i, index, ret, found_active;
	char host_name[256];
	uint8_t p;

	ret = gethostname(host_name, sizeof host_name);
	if (ret) {
		printf("gethostname error: %d\n", ret);
		return ret;
	}
	strtok(host_name, ".");

	found_active = 0;
	index = 1;
	for (i = 0; i < dev_cnt; i++) {
		ret = ibv_query_device(verbs[i], &dev_attr);
		if (ret)
			break;

		for (p = 1; p <= dev_attr.phys_port_cnt; p++) {
			if (!found_active) {
				ret = ibv_query_port(verbs[i], p, &port_attr);
				if (!ret && port_attr.state == IBV_PORT_ACTIVE) {
					printf("%s %s %d default\n", host_name,
						verbs[i]->device->name, p);
					fprintf(f, "%s %s
%d default\n", host_name, verbs[i]->device->name, p); found_active = 1; } } printf("%s-%d %s %d default\n", host_name, index, verbs[i]->device->name, p); fprintf(f, "%s-%d %s %d default\n", host_name, index++, verbs[i]->device->name, p); } } return ret; } static int gen_addr(void) { FILE *f; int ret; printf("Generating acm_addr.cfg\n"); if (!(f = fopen("acm_addr.cfg", "w"))) { printf("Failed to open address configuration file\n"); return -1; } ret = open_verbs(); if (ret) { goto out1; } gen_addr_temp(f); ret = gen_addr_names(f); if (ret) { printf("Failed to auto generate host names in config file\n"); goto out2; } ret = gen_addr_ip(f); if (ret) { printf("Failed to auto generate IP addresses in config file\n"); goto out2; } out2: close_verbs(); out1: fclose(f); return ret; } static void show_path(struct ib_path_record *path) { char gid[sizeof "ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff"]; uint32_t fl_hop; printf("Path information\n"); inet_ntop(AF_INET6, path->dgid.raw, gid, sizeof gid); printf(" dgid: %s\n", gid); inet_ntop(AF_INET6, path->sgid.raw, gid, sizeof gid); printf(" sgid: %s\n", gid); printf(" dlid: 0x%x\n", ntohs(path->dlid)); printf(" slid: 0x%x\n", ntohs(path->slid)); fl_hop = ntohl(path->flowlabel_hoplimit); printf(" flow label: 0x%x\n", fl_hop >> 8); printf(" hop limit: %d\n", (uint8_t) fl_hop); printf(" tclass: %d\n", path->tclass); printf(" reverisible: %d\n", path->reversible_numpath >> 7); printf(" pkey: 0x%x\n", ntohs(path->pkey)); printf(" sl: %d\n", ntohs(path->qosclass_sl) & 0xF); printf(" mtu: %d\n", path->mtu & 0x1F); printf(" rate: %d\n", path->rate & 0x1F); printf(" packet lifetime: %d\n", path->packetlifetime & 0x1F); } static int resolve_ip(struct ib_path_record *path) { struct ib_acm_dev_addr dev_addr; struct ibv_ah_attr ah; struct ib_acm_resolve_data data; struct sockaddr_in src, dest; int ret; src.sin_family = AF_INET; ret = inet_pton(AF_INET, src_addr, &src.sin_addr); if (ret <= 0) { printf("inet_pton error on source address (%s): 
0x%x\n", src_addr, ret); return ret; } dest.sin_family = AF_INET; ret = inet_pton(AF_INET, dest_addr, &dest.sin_addr); if (ret <= 0) { printf("inet_pton error on destination address (%s): 0x%x\n", dest_addr, ret); return ret; } ret = ib_acm_resolve_ip((struct sockaddr *) &src, (struct sockaddr *) &dest, &dev_addr, &ah, &data); if (ret) { printf("ib_acm_resolve_ip failed: 0x%x\n", ret); return ret; } ret = ib_acm_convert_to_path(&dev_addr, &ah, &data, path); if (ret) printf("ib_acm_convert_to_path failed: 0x%x\n", ret); return ret; } static int resolve_name(struct ib_path_record *path) { struct ib_acm_dev_addr dev_addr; struct ibv_ah_attr ah; struct ib_acm_resolve_data data; int ret; ret = ib_acm_resolve_name(src_addr, dest_addr, &dev_addr, &ah, &data); if (ret) { printf("ib_acm_resolve_name failed: 0x%x\n", ret); return ret; } ret = ib_acm_convert_to_path(&dev_addr, &ah, &data, path); if (ret) printf("ib_acm_convert_to_path failed: 0x%x\n", ret); return ret; } static int resolve_lid(struct ib_path_record *path) { int ret; path->slid = htons((uint16_t) atoi(src_addr)); path->dlid = htons((uint16_t) atoi(dest_addr)); path->reversible_numpath = IB_PATH_RECORD_REVERSIBLE | 1; ret = ib_acm_resolve_path(path); if (ret) printf("ib_acm_resolve_path failed: 0x%x\n", ret); return ret; } static int verify_resolve(struct ib_path_record *path) { int ret; ret = ib_acm_query_path(path); if (ret) printf("SA verification: failed 0x%x\n", ret); else printf("SA verification: success\n"); return ret; } static int resolve(char *program) { struct ib_path_record path; int ret; switch (addr_type) { case 'i': ret = resolve_ip(&path); break; case 'n': ret = resolve_name(&path); break; case 'l': memset(&path, 0, sizeof path); ret = resolve_lid(&path); break; default: show_usage(program); exit(1); } if (!ret) show_path(&path); if (verify) ret = verify_resolve(&path); return ret; } int main(int argc, char **argv) { int op, ret; ret = osd_init(); if (ret) goto out; while ((op = getopt(argc, 
argv, "f:s:d:vAO")) != -1) { switch (op) { case 'f': addr_type = optarg[0]; break; case 's': src_addr = optarg; break; case 'd': dest_addr = optarg; break; case 'v': verify = 1; break; case 'A': make_addr = 1; break; case 'O': make_opts = 1; break; default: show_usage(argv[0]); exit(1); } } if ((src_addr && !dest_addr) || (dest_addr && !src_addr) || (!src_addr && !dest_addr && !make_addr && !make_opts)) { show_usage(argv[0]); exit(1); } if (src_addr) ret = resolve(argv[0]); if (!ret && make_addr) ret = gen_addr(); if (!ret && make_opts) ret = gen_opts(); out: printf("return status 0x%x\n", ret); return ret; } acme_linux.c: Linux implementation to map IP addresses to IB endpoints. I'm less than thrilled about this code. /* * Copyright (c) 2009 Intel Corporation. All rights reserved. * * This software is available to you under the OpenIB.org BSD license * below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AWV * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include #include #include #include #include #include #include #include #include #include #include #include #include extern struct ibv_context **verbs; extern int dev_cnt; static int get_pkey(struct ifreq *ifreq, uint16_t *pkey) { char buf[128], *end; FILE *f; int ret; sprintf(buf, "//sys//class//net//%s//pkey", ifreq->ifr_name); f = fopen(buf, "r"); if (!f) { printf("failed to open %s\n", buf); return -1; } if (fgets(buf, sizeof buf, f)) { *pkey = strtol(buf, &end, 16); ret = 0; } else { printf("failed to read pkey\n"); ret = -1; } fclose(f); return ret; } static int get_sgid(struct ifreq *ifr, union ibv_gid *sgid) { char buf[128], *end; FILE *f; int i, p, ret; sprintf(buf, "//sys//class//net//%s//address", ifr->ifr_name); f = fopen(buf, "r"); if (!f) { printf("failed to open %s\n", buf); return -1; } if (fgets(buf, sizeof buf, f)) { for (i = 0, p = 12; i < 16; i++, p += 3) { buf[p + 2] = '\0'; sgid->raw[i] = (uint8_t) strtol(buf + p, &end, 16); } ret = 0; } else { printf("failed to read sgid\n"); ret = -1; } fclose(f); return ret; } static int get_devaddr(int s, struct ifreq *ifr, int *dev_index, uint8_t *port, uint16_t *pkey) { struct ibv_device_attr dev_attr; struct ibv_port_attr port_attr; union ibv_gid sgid, gid; int ret, i; ret = get_sgid(ifr, &sgid); if (ret) { printf("unable to get sgid\n"); return ret; } ret = get_pkey(ifr, pkey); if (ret) { printf("unable to get pkey\n"); return ret; } for (*dev_index = 0; *dev_index < dev_cnt; (*dev_index)++) { ret = ibv_query_device(verbs[*dev_index], &dev_attr); if (ret) continue; for (*port = 1; *port <= dev_attr.phys_port_cnt; (*port)++) { ret = ibv_query_port(verbs[*dev_index], *port, &port_attr); if (ret) continue; for (i = 0; i < port_attr.gid_tbl_len; i++) { ret = ibv_query_gid(verbs[*dev_index], *port, i, &gid); if (ret || !gid.global.interface_id) break; if (!memcmp(sgid.raw, gid.raw, sizeof gid)) return 0; } } } return -1; } int gen_addr_ip(FILE *f) { struct ifconf *ifc; struct ifreq *ifr; char ip[sizeof 
"ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff"];
	int s, ret, dev_index, i, len;
	uint16_t pkey;
	uint8_t port;

	s = socket(AF_INET6, SOCK_DGRAM, 0);
	/* socket() returns -1 on failure; 0 is a valid descriptor. */
	if (s < 0)
		return -1;

	len = sizeof(*ifc) + sizeof(*ifr) * 64;
	ifc = malloc(len);
	if (!ifc) {
		ret = -1;
		goto out1;
	}

	memset(ifc, 0, len);
	ifc->ifc_len = len;
	ifc->ifc_req = (struct ifreq *) (ifc + 1);

	ret = ioctl(s, SIOCGIFCONF, ifc);
	if (ret < 0) {
		printf("ioctl ifconf error %d\n", ret);
		goto out2;
	}

	ifr = ifc->ifc_req;
	for (i = 0; i < ifc->ifc_len / sizeof(struct ifreq); i++) {
		switch (ifr[i].ifr_addr.sa_family) {
		case AF_INET:
			inet_ntop(ifr[i].ifr_addr.sa_family,
				&((struct sockaddr_in *) &ifr[i].ifr_addr)->sin_addr, ip, sizeof ip);
			break;
		case AF_INET6:
			inet_ntop(ifr[i].ifr_addr.sa_family,
				&((struct sockaddr_in6 *) &ifr[i].ifr_addr)->sin6_addr, ip, sizeof ip);
			break;
		default:
			continue;
		}

		ret = ioctl(s, SIOCGIFHWADDR, &ifr[i]);
		if (ret) {
			printf("failed to get hw address %d\n", ret);
			continue;
		}

		if (ifr[i].ifr_hwaddr.sa_family != ARPHRD_INFINIBAND)
			continue;

		ret = get_devaddr(s, &ifr[i], &dev_index, &port, &pkey);
		if (ret)
			continue;

		printf("%s %s %d 0x%x\n", ip, verbs[dev_index]->device->name, port, pkey);
		fprintf(f, "%s %s %d 0x%x\n", ip, verbs[dev_index]->device->name, port, pkey);
	}
	ret = 0;

out2:
	free(ifc);
out1:
	close(s);
	return ret;
}

acme_windows.c - windows specific implementation to map IP addresses to IB endpoints.
This code doesn't bother me so much.

/*
 * Copyright (c) 2009 Intel Corporation. All rights reserved.
 *
 * This software is available to you under the OpenIB.org BSD license
 * below:
 *
 *     Redistribution and use in source and binary forms, with or
 *     without modification, are permitted provided that the following
 *     conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */

#include "..\..\..\..\etc\user\getopt.c"
#include "..\src\acme.c"
#include "..\..\..\..\etc\user\inet.c"
#include

extern struct ibv_context **verbs;
extern int dev_cnt;

int gen_addr_ip(FILE *f)
{
	WV_DEVICE_ADDRESS devaddr;
	IWVProvider *prov;
	HRESULT hr;
	struct addrinfo *res, hint, *ai;
	char ip[sizeof "ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff"];
	int i;

	hr = WvGetObject(&IID_IWVProvider, (LPVOID *) &prov);
	if (FAILED(hr))
		return hr;

	memset(&hint, 0, sizeof hint);
	hint.ai_protocol = IPPROTO_TCP;

	hr = getaddrinfo("..localmachine", NULL, &hint, &res);
	if (hr) {
		printf("getaddrinfo error %d\n", hr);
		goto release;
	}

	for (ai = res; ai; ai = ai->ai_next) {
		switch (ai->ai_family) {
		case AF_INET:
			inet_ntop(ai->ai_family,
				&((struct sockaddr_in *) ai->ai_addr)->sin_addr,
				ip, sizeof ip);
			break;
		case AF_INET6:
			inet_ntop(ai->ai_family,
				&((struct sockaddr_in6 *) ai->ai_addr)->sin6_addr,
				ip, sizeof ip);
			break;
		default:
			continue;
		}

		hr = prov->lpVtbl->TranslateAddress(prov, ai->ai_addr, &devaddr);
		if (FAILED(hr))
			continue;

		for (i = 0; i < dev_cnt; i++) {
			if (devaddr.DeviceGuid == ibv_get_device_guid(verbs[i]->device)) {
				printf("%s %s %d 0x%x\n", ip, verbs[i]->device->name,
					devaddr.PortNumber, ntohs(devaddr.Pkey));
				fprintf(f, "%s %s %d 0x%x\n", ip, verbs[i]->device->name,
					devaddr.PortNumber, ntohs(devaddr.Pkey));
			}
		}
	}

	hr = 0;
	freeaddrinfo(res);

release:
	prov->lpVtbl->Release(prov);
	return hr;
}

From sean.hefty at intel.com Thu Sep 17 00:03:15 2009
From: sean.hefty at intel.com (Sean Hefty)
Date: Thu, 17 Sep 2009 00:03:15 -0700
Subject: [ofa-general] [RFC] 5/5: IB ACM: ib_acm service
In-Reply-To:
References:
Message-ID: <72A4BF9BB2C54D72AADF6C4F5C517DE5@amr.corp.intel.com>

Name and address resolution service for InfiniBand. Defines and implements
the ib_acm service to ib_acm service protocol.

Signed-off-by: Sean Hefty
---
Note: Some of this was implemented before the IB ACM package added a
dependency on libibumad. I have not gone back through to see if all of the
definitions are necessary given that dependency.

/*
 * Copyright (c) 2009 Intel Corporation. All rights reserved.
 *
 * This software is available to you under the OpenFabrics.org BSD license
 * below:
 *
 *     Redistribution and use in source and binary forms, with or
 *     without modification, are permitted provided that the following
 *     conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */

#if !defined(ACM_MAD_H)
#define ACM_MAD_H

#include
#include
#include

#define ACM_SEND_SIZE 256
#define ACM_RECV_SIZE (ACM_SEND_SIZE + sizeof(struct ibv_grh))

#define IB_METHOD_GET 0x01
#define IB_METHOD_SET 0x02
#define IB_METHOD_SEND 0x03
#define IB_METHOD_GET_TABLE 0x12
#define IB_METHOD_DELETE 0x15
#define IB_METHOD_RESP 0x80

#define ACM_MGMT_CLASS 0x2C

#define ACM_CTRL_ACK htons(0x8000)
#define ACM_CTRL_RESOLVE htons(0x0001)
#define ACM_CTRL_CM htons(0x0002)

struct acm_mad {
	uint8_t base_version;
	uint8_t mgmt_class;
	uint8_t class_version;
	uint8_t method;
	uint16_t status;
	uint16_t control;
	uint64_t tid;

	uint8_t data[240];
};

#define acm_class_status(status) ((uint8_t) (ntohs(status) >> 8))

#define ACM_QKEY 0x80010000

#define ACM_ADDRESS_INVALID 0x00
#define ACM_ADDRESS_NAME 0x01
#define ACM_ADDRESS_IP 0x02
#define ACM_ADDRESS_IP6 0x03
#define ACM_ADDRESS_RESERVED 0x04 /* start of reserved range */

#define ACM_MAX_GID_COUNT 10

struct acm_resolve_rec {
	uint8_t dest_type;
	uint8_t dest_length;
	uint8_t src_type;
	uint8_t src_length;
	uint8_t gid_cnt;
	uint8_t resp_resources;
	uint8_t init_depth;
	uint8_t reserved;
	uint8_t dest[ACM_MAX_ADDRESS];
	uint8_t src[ACM_MAX_ADDRESS];
	union ibv_gid gid[ACM_MAX_GID_COUNT];
};

#define IB_MGMT_CLASS_SA 0x03

struct ib_sa_mad {
	uint8_t base_version;
	uint8_t mgmt_class;
	uint8_t class_version;
	uint8_t method;
	uint16_t status;
	uint16_t reserved1;
	uint64_t tid;
	uint16_t attr_id;
	uint16_t reserved2;
	uint32_t attr_mod;

	uint8_t rmpp_version;
	uint8_t rmpp_type;
	uint8_t rmpp_flags;
	uint8_t rmpp_status;
	uint32_t seg_num;
	uint32_t paylen_newwin;

	uint32_t sm_key[2];
	uint16_t attr_offset;
	uint16_t reserved3;
	uint64_t comp_mask;

	uint8_t data[200];
};

#define IB_SA_ATTR_PATH_REC htons(0x0035)

#define IB_COMP_MASK_PR_SERVICE_ID (htonll(1 << 0) | \
                                    htonll(1 << 1))
#define IB_COMP_MASK_PR_DGID htonll(1 << 2)
#define IB_COMP_MASK_PR_SGID htonll(1 << 3)
#define IB_COMP_MASK_PR_DLID htonll(1 << 4)
#define IB_COMP_MASK_PR_SLID htonll(1 << 5)
#define IB_COMP_MASK_PR_RAW_TRAFFIC htonll(1 << 6)
/* RESERVED htonll(1 << 7) */
#define IB_COMP_MASK_PR_FLOW_LABEL htonll(1 << 8)
#define IB_COMP_MASK_PR_HOP_LIMIT htonll(1 << 9)
#define IB_COMP_MASK_PR_TCLASS htonll(1 << 10)
#define IB_COMP_MASK_PR_REVERSIBLE htonll(1 << 11)
#define IB_COMP_MASK_PR_NUM_PATH htonll(1 << 12)
#define IB_COMP_MASK_PR_PKEY htonll(1 << 13)
#define IB_COMP_MASK_PR_QOS_CLASS htonll(1 << 14)
#define IB_COMP_MASK_PR_SL htonll(1 << 15)
#define IB_COMP_MASK_PR_MTU_SELECTOR htonll(1 << 16)
#define IB_COMP_MASK_PR_MTU htonll(1 << 17)
#define IB_COMP_MASK_PR_RATE_SELECTOR htonll(1 << 18)
#define IB_COMP_MASK_PR_RATE htonll(1 << 19)
#define IB_COMP_MASK_PR_PACKET_LIFETIME_SELECTOR htonll(1 << 20)
#define IB_COMP_MASK_PR_PACKET_LIFETIME htonll(1 << 21)
#define IB_COMP_MASK_PR_PREFERENCE htonll(1 << 22)
/* RESERVED htonll(1 << 23) */

#define IB_MC_QPN 0xffffff
#define IB_SA_ATTR_MC_MEMBER_REC htons(0x0038)

#define IB_COMP_MASK_MC_MGID htonll(1 << 0)
#define IB_COMP_MASK_MC_PORT_GID htonll(1 << 1)
#define IB_COMP_MASK_MC_QKEY htonll(1 << 2)
#define IB_COMP_MASK_MC_MLID htonll(1 << 3)
#define IB_COMP_MASK_MC_MTU_SEL htonll(1 << 4)
#define IB_COMP_MASK_MC_MTU htonll(1 << 5)
#define IB_COMP_MASK_MC_TCLASS htonll(1 << 6)
#define IB_COMP_MASK_MC_PKEY htonll(1 << 7)
#define IB_COMP_MASK_MC_RATE_SEL htonll(1 << 8)
#define IB_COMP_MASK_MC_RATE htonll(1 << 9)
#define IB_COMP_MASK_MC_PACKET_LIFETIME_SEL htonll(1 << 10)
#define IB_COMP_MASK_MC_PACKET_LIFETIME htonll(1 << 11)
#define IB_COMP_MASK_MC_SL htonll(1 << 12)
#define IB_COMP_MASK_MC_FLOW htonll(1 << 13)
#define IB_COMP_MASK_MC_HOP htonll(1 << 14)
#define IB_COMP_MASK_MC_SCOPE htonll(1 << 15)
#define IB_COMP_MASK_MC_JOIN_STATE htonll(1 << 16)
#define IB_COMP_MASK_MC_PROXY_JOIN htonll(1 << 17)

struct ib_mc_member_rec {
	union ibv_gid mgid;
	union ibv_gid port_gid;
	uint32_t qkey;
	uint16_t mlid;
	uint8_t mtu;
	uint8_t tclass;
	uint16_t pkey;
	uint8_t rate;
	uint8_t packet_lifetime;
	uint32_t sl_flow_hop;
	uint8_t scope_state;
	uint8_t proxy_join;
	uint8_t reserved[2];
	uint8_t pad[4];
};

#endif /* ACM_MAD_H */

/*
 * Copyright (c) 2009 Intel Corporation. All rights reserved.
 *
 * This software is available to you under the OpenIB.org BSD license
 * below:
 *
 *     Redistribution and use in source and binary forms, with or
 *     without modification, are permitted provided that the following
 *     conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */

#include
#include
#include
#include
#include
#include
#include
#include
#include
#include "acm_mad.h"

#define MAX_EP_ADDR 4
#define MAX_EP_MC 2

struct acm_dest {
	uint8_t address[ACM_MAX_ADDRESS]; /* keep first */
	struct ibv_ah *ah;
	struct ibv_ah_attr av;
	union ibv_gid mgid;
	DLIST_ENTRY req_queue;
	uint32_t remote_qpn;
	uint8_t init_depth;
	uint8_t resp_resources;
	uint8_t mtu;
	uint8_t packet_lifetime;
};

struct acm_port {
	struct acm_device *dev;
	DLIST_ENTRY ep_list;
	int mad_portid;
	int mad_agentid;
	struct acm_dest sa_dest;
	enum ibv_port_state state;
	enum ibv_mtu mtu;
	enum ibv_rate rate;
	int subnet_timeout;
	int gid_cnt;
	uint16_t pkey_cnt;
	uint16_t lid;
	uint8_t lmc;
	uint8_t port_num;
};

struct acm_device {
	struct ibv_context *verbs;
	struct ibv_comp_channel *channel;
	struct ibv_pd *pd;
	uint64_t guid;
	DLIST_ENTRY entry;
	uint8_t active;
	uint8_t init_depth;
	uint8_t resp_resources;
	int port_cnt;
	struct acm_port port[0];
};

struct acm_ep {
	struct acm_port *port;
	struct ibv_cq *cq;
	struct ibv_qp *qp;
	struct ibv_mr *mr;
	uint8_t *recv_bufs;
	DLIST_ENTRY entry;
	union acm_ep_addr addr[MAX_EP_ADDR];
	uint8_t addr_type[MAX_EP_ADDR];
	void *dest_map[ACM_ADDRESS_RESERVED - 1];
	struct acm_dest mc_dest[MAX_EP_MC];
	int mc_cnt;
	uint16_t pkey_index;
	uint16_t pkey;
	lock_t lock;
	int available_sends;
	DLIST_ENTRY pending_queue;
	DLIST_ENTRY active_queue;
	DLIST_ENTRY wait_queue;
};

struct acm_send_msg {
	DLIST_ENTRY entry;
	struct acm_ep *ep;
	struct ibv_mr *mr;
	struct ibv_send_wr wr;
	struct ibv_sge sge;
	uint64_t expires;
	int tries;
	uint8_t data[ACM_SEND_SIZE];
};

struct acm_client {
	lock_t lock; /* acquire ep lock first */
	SOCKET sock;
	int index;
	atomic_t refcnt;
};

struct acm_request {
	struct acm_client *client;
	DLIST_ENTRY entry;
	struct acm_msg msg;
};

static DLIST_ENTRY dev_list;

static atomic_t tid;
static DLIST_ENTRY timeout_list;
static event_t timeout_event;
static atomic_t wait_cnt;

static SOCKET listen_socket;
static struct acm_client client[FD_SETSIZE - 1];

static FILE *flog;
static lock_t log_lock;

static char log_file[128] = "stdout";
static int log_level = 0;
static short server_port = 6125;
static int timeout = 2000;
static int retries = 15;
static int send_depth = 64;
static int recv_depth = 1024;
static uint8_t min_mtu = IBV_MTU_2048;
static uint8_t min_rate = IBV_RATE_10_GBPS;

#define acm_log(level, format, ...) \
	acm_write(level, "%s: "format, __func__, ## __VA_ARGS__)

static void acm_write(int level, const char *format, ...)
{
	va_list args;

	if (level > log_level)
		return;

	va_start(args, format);
	lock_acquire(&log_lock);
	vfprintf(flog, format, args);
	lock_release(&log_lock);
	va_end(args);
}

static void acm_log_ep_addr(int level, const char *msg,
	union acm_ep_addr *addr, uint8_t ep_type)
{
	char ip_addr[ACM_MAX_ADDRESS];

	if (level > log_level)
		return;

	lock_acquire(&log_lock);
	/* print msg as data so it is never interpreted as a format string */
	fprintf(flog, "%s", msg);
	switch (ep_type) {
	case ACM_EP_TYPE_NAME:
		fprintf(flog, "%s\n", addr->name);
		break;
	case ACM_EP_TYPE_ADDRESS_IP:
		inet_ntop(AF_INET, addr->addr, ip_addr, ACM_MAX_ADDRESS);
		fprintf(flog, "%s\n", ip_addr);
		break;
	case ACM_EP_TYPE_ADDRESS_IP6:
		inet_ntop(AF_INET6, addr->addr, ip_addr, ACM_MAX_ADDRESS);
		fprintf(flog, "%s\n", ip_addr);
		break;
	case ACM_EP_TYPE_DEVICE:
		fprintf(flog, "device guid 0x%llx, pkey index %d, port %d\n",
			addr->dev.guid, addr->dev.pkey_index, addr->dev.port_num);
		break;
	case ACM_EP_TYPE_AV:
		fprintf(flog, "endpoint specified using address vector\n");
		break;
	default:
		fprintf(flog, "unknown endpoint address 0x%x\n", ep_type);
	}
	lock_release(&log_lock);
}

static void *zalloc(size_t size)
{
	void *buf;

	buf = malloc(size);
	if (buf)
		memset(buf, 0, size);
	return buf;
}

static struct acm_send_msg *
acm_alloc_send(struct acm_ep *ep, struct acm_dest *dest, size_t size)
{
	struct acm_send_msg *msg;

	msg = (struct acm_send_msg *) zalloc(sizeof *msg);
	if (!msg) {
		acm_log(0, "ERROR - unable to allocate send buffer\n");
		return NULL;
	}

	msg->ep = ep;
	msg->mr = ibv_reg_mr(ep->port->dev->pd, msg->data, size, 0);
	if (!msg->mr) {
		acm_log(0, "ERROR - failed to register send buffer\n");
		goto err;
	}

	msg->wr.next = NULL;
	msg->wr.sg_list = &msg->sge;
	msg->wr.num_sge = 1;
	msg->wr.opcode = IBV_WR_SEND;
	msg->wr.send_flags = IBV_SEND_SIGNALED;
	msg->wr.wr_id = (uintptr_t) msg;

	msg->wr.wr.ud.ah = dest->ah;
	msg->wr.wr.ud.remote_qpn = dest->remote_qpn;
	msg->wr.wr.ud.remote_qkey = ACM_QKEY;

	msg->sge.length = size;
	msg->sge.lkey = msg->mr->lkey;
	msg->sge.addr = (uintptr_t) msg->data;
	return msg;

err:
	free(msg);
	return NULL;
}

static void acm_free_send(struct acm_send_msg *msg)
{
	ibv_dereg_mr(msg->mr);
	free(msg);
}

static void acm_post_send(struct acm_send_msg *msg)
{
	struct acm_ep *ep = msg->ep;
	struct ibv_send_wr *bad_wr;

	if (ep->available_sends) {
		acm_log(2, "posting send to QP\n");
		ep->available_sends--;
		DListInsertTail(&msg->entry, &ep->active_queue);
		ibv_post_send(ep->qp, &msg->wr, &bad_wr);
	} else {
		acm_log(2, "no sends available, queuing message\n");
		DListInsertTail(&msg->entry, &ep->pending_queue);
	}
}

static void acm_post_recv(struct acm_ep *ep, uint64_t address)
{
	struct ibv_recv_wr wr, *bad_wr;
	struct ibv_sge sge;

	wr.next = NULL;
	wr.sg_list = &sge;
	wr.num_sge = 1;
	wr.wr_id = address;

	sge.length = ACM_RECV_SIZE;
	sge.lkey = ep->mr->lkey;
	sge.addr = address;

	ibv_post_recv(ep->qp, &wr, &bad_wr);
}

static void acm_send_available(struct acm_ep *ep)
{
	struct acm_send_msg *msg;
	struct ibv_send_wr *bad_wr;
	DLIST_ENTRY *entry;

	if (DListEmpty(&ep->pending_queue)) {
		ep->available_sends++;
	} else {
		acm_log(2, "posting queued send message\n");
		entry = ep->pending_queue.Next;
		DListRemove(entry);
		msg = container_of(entry, struct acm_send_msg, entry);
		DListInsertTail(&msg->entry, &ep->active_queue);
		ibv_post_send(ep->qp, &msg->wr, &bad_wr);
	}
}

static void acm_complete_send(struct acm_send_msg *msg)
{
	struct acm_ep *ep = msg->ep;

	lock_acquire(&ep->lock);
	DListRemove(&msg->entry);
	if (msg->tries) {
		acm_log(2, "waiting for response\n");
		msg->expires = time_stamp_ms() + ep->port->subnet_timeout + timeout;
		DListInsertTail(&msg->entry, &ep->wait_queue);
		if (atomic_inc(&wait_cnt) == 1)
			event_signal(&timeout_event);
	} else {
		acm_log(2, "freeing\n");
		acm_send_available(ep);
		acm_free_send(msg);
	}
	lock_release(&ep->lock);
}

static struct acm_send_msg *
acm_get_request(struct acm_ep *ep, uint64_t tid, int *free)
{
	struct acm_send_msg *msg, *req = NULL;
	struct acm_mad *mad;
	DLIST_ENTRY *entry, *next;

	acm_log(2, "\n");
	lock_acquire(&ep->lock);
	for (entry = ep->wait_queue.Next; entry != &ep->wait_queue; entry = next) {
		next = entry->Next;
		msg = container_of(entry, struct acm_send_msg, entry);
		mad = (struct acm_mad *) msg->data;
		if (mad->tid == tid) {
			acm_log(2, "match found in wait queue\n");
			req = msg;
			DListRemove(entry);
			(void) atomic_dec(&wait_cnt);
			acm_send_available(ep);
			*free = 1;
			goto unlock;
		}
	}

	for (entry = ep->active_queue.Next; entry != &ep->active_queue;
	     entry = entry->Next) {
		msg = container_of(entry, struct acm_send_msg, entry);
		mad = (struct acm_mad *) msg->data;
		if (mad->tid == tid && msg->tries) {
			acm_log(2, "match found in active queue\n");
			req = msg;
			req->tries = 0;
			*free = 0;
			break;
		}
	}

unlock:
	lock_release(&ep->lock);
	return req;
}

static uint8_t acm_gid_index(struct acm_port *port, union ibv_gid *gid)
{
	union ibv_gid cmp_gid;
	uint8_t i;

	for (i = 0; i < port->gid_cnt; i++) {
		ibv_query_gid(port->dev->verbs, port->port_num, i, &cmp_gid);
		if (!memcmp(&cmp_gid, gid, sizeof cmp_gid))
			break;
	}
	return i;
}

static int acm_mc_index(struct acm_ep *ep, union ibv_gid *gid)
{
	int i;

	for (i = 0; i < ep->mc_cnt; i++) {
		if (!memcmp(&ep->mc_dest[i].address, gid, sizeof(*gid)))
			return i;
	}
	return -1;
}

static void acm_init_mc_av(struct acm_port *port,
	struct ib_mc_member_rec *mc_rec, struct ibv_ah_attr *av)
{
	uint32_t sl_flow_hop;

	sl_flow_hop = ntohl(mc_rec->sl_flow_hop);

	av->dlid = ntohs(mc_rec->mlid);
	av->sl = (uint8_t) (sl_flow_hop >> 28);
	av->src_path_bits = port->sa_dest.av.src_path_bits;
	av->static_rate = mc_rec->rate & 0x3F;
	av->port_num = port->port_num;

	av->is_global = 1;
	av->grh.dgid = mc_rec->mgid;
	av->grh.flow_label = (sl_flow_hop >> 8) & 0xFFFFF;
	av->grh.sgid_index = acm_gid_index(port, &mc_rec->port_gid);
	av->grh.hop_limit = (uint8_t) sl_flow_hop;
	av->grh.traffic_class = mc_rec->tclass;
}

static void acm_process_join_resp(struct acm_ep *ep, struct ib_user_mad *umad)
{
	struct acm_dest *dest;
	struct ib_mc_member_rec *mc_rec;
	struct ib_sa_mad *mad;
	int index, ret;

	mad = (struct ib_sa_mad *) umad->data;
	acm_log(1, "response status: 0x%x, mad status: 0x%x\n",
		umad->status, mad->status);
	if (umad->status) {
		acm_log(0, "ERROR - send join failed 0x%x\n", umad->status);
		return;
	}
	if (mad->status) {
		acm_log(0, "ERROR - join response status 0x%x\n", mad->status);
		return;
	}

	mc_rec = (struct ib_mc_member_rec *) mad->data;
	lock_acquire(&ep->lock);
	index = acm_mc_index(ep, &mc_rec->mgid);
	if (index >= 0) {
		dest = &ep->mc_dest[index];
		dest->remote_qpn = IB_MC_QPN;
		dest->mgid = mc_rec->mgid;
		acm_init_mc_av(ep->port, mc_rec, &dest->av);
		dest->mtu = mc_rec->mtu & 0x3F;
		dest->packet_lifetime = mc_rec->packet_lifetime & 0x3F;
		dest->ah = ibv_create_ah(ep->port->dev->pd, &dest->av);
		ret = ibv_attach_mcast(ep->qp, &mc_rec->mgid, mc_rec->mlid);
		if (ret) {
			acm_log(0, "ERROR - unable to attach QP to multicast group\n");
		}
		acm_log(1, "join successful\n");
	} else {
		acm_log(0, "ERROR - MGID in join response not found\n");
	}
	lock_release(&ep->lock);
}

static int acm_compare_dest(const void *dest1, const void *dest2)
{
	return memcmp(dest1, dest2, ACM_MAX_ADDRESS);
}

static int acm_addr_index(struct acm_ep *ep, uint8_t *addr, uint8_t addr_type)
{
	int i;

	for (i = 0; i < MAX_EP_ADDR; i++) {
		if (ep->addr_type[i] != addr_type)
			continue;

		if ((addr_type == ACM_ADDRESS_NAME &&
		     !strnicmp((char *) ep->addr[i].name, (char *) addr,
			       ACM_MAX_ADDRESS)) ||
		    !memcmp(ep->addr[i].addr, addr, ACM_MAX_ADDRESS))
			return i;
	}
	return -1;
}

/*
 * Multicast groups are ordered lowest to highest preference.
 */
static int acm_record_av(struct acm_dest *dest, struct acm_ep *ep,
	struct ibv_wc *wc, struct acm_resolve_rec *rec)
{
	int i, index;

	acm_log(2, "\n");
	for (i = min(rec->gid_cnt, ACM_MAX_GID_COUNT) - 1; i >= 0; i--) {
		index = acm_mc_index(ep, &rec->gid[i]);
		if (index >= 0) {
			acm_log(2, "selecting MC group at index %d\n", index);
			dest->av = ep->mc_dest[index].av;
			dest->av.dlid = wc->slid;
			dest->av.src_path_bits = wc->dlid_path_bits;
			dest->av.grh.dgid = ((struct ibv_grh *) (uintptr_t) wc->wr_id)->sgid;

			dest->mgid = ep->mc_dest[index].mgid;
			dest->mtu = ep->mc_dest[index].mtu;
			dest->packet_lifetime = ep->mc_dest[index].packet_lifetime;
			return ACM_STATUS_SUCCESS;
		}
	}

	return ACM_STATUS_ENODATA;
}

/*
 * Record the source of a resolve request. Use the source QPN to see if
 * the remote service has relocated and we need to update our cache.
 */
static struct acm_dest *
acm_record_src(struct acm_ep *ep, struct ibv_wc *wc, struct acm_resolve_rec *rec)
{
	struct acm_dest *dest, **tdest;
	int ret;

	acm_log(2, "\n");
	lock_acquire(&ep->lock);
	tdest = tfind(rec->src, &ep->dest_map[rec->src_type - 1], acm_compare_dest);
	if (!tdest) {
		acm_log(2, "creating new dest\n");
		dest = zalloc(sizeof *dest);
		if (!dest) {
			acm_log(0, "ERROR - unable to allocate dest\n");
			goto unlock;
		}

		memcpy(dest->address, rec->src, ACM_MAX_ADDRESS);
		DListInit(&dest->req_queue);
		tsearch(dest, &ep->dest_map[rec->src_type - 1], acm_compare_dest);
	} else {
		dest = *tdest;
	}

	if (dest->ah) {
		if (dest->remote_qpn == wc->src_qp)
			goto unlock;

		ibv_destroy_ah(dest->ah); // TODO: ah could be in use
		dest->ah = NULL;
	}

	acm_log(2, "creating address handle\n");
	ret = acm_record_av(dest, ep, wc, rec);
	if (ret) {
		acm_log(0, "ERROR - failed to record av\n");
		goto err;
	}

	dest->ah = ibv_create_ah(ep->port->dev->pd, &dest->av);
	if (!dest->ah) {
		acm_log(0, "ERROR - failed to create ah\n");
		goto err;
	}

	dest->remote_qpn = wc->src_qp;
	dest->init_depth = rec->init_depth;
	dest->resp_resources = rec->resp_resources;

unlock:
	lock_release(&ep->lock);
	return dest;

err:
	if (!tdest) {
		tdelete(dest->address, &ep->dest_map[rec->src_type - 1], acm_compare_dest);
		free(dest);
	}
	lock_release(&ep->lock);
	return NULL;
}

static void acm_init_resp_mad(struct acm_mad *resp, struct acm_mad *req)
{
	resp->base_version = req->base_version;
	resp->mgmt_class = req->mgmt_class;
	resp->class_version = req->class_version;
	resp->method = req->method | IB_METHOD_RESP;
	resp->status = ACM_STATUS_SUCCESS;
	resp->control = req->control;
	resp->tid = req->tid;
}

static int acm_validate_resolve_req(struct acm_mad *mad)
{
	struct acm_resolve_rec *rec;

	if (mad->method != IB_METHOD_GET) {
		acm_log(0, "ERROR - invalid method 0x%x\n", mad->method);
		return ACM_STATUS_EINVAL;
	}

	rec = (struct acm_resolve_rec *) mad->data;
	if (!rec->src_type || rec->src_type >= ACM_ADDRESS_RESERVED) {
		acm_log(0, "ERROR - unknown src type 0x%x\n", rec->src_type);
		return ACM_STATUS_EINVAL;
	}

	return 0;
}

static void acm_process_resolve_req(struct acm_ep *ep, struct ibv_wc *wc,
	struct acm_mad *mad)
{
	struct acm_resolve_rec *rec, *resp_rec;
	struct acm_dest *dest;
	struct acm_send_msg *msg;
	struct acm_mad *resp_mad;

	acm_log(2, "\n");
	if (acm_validate_resolve_req(mad)) {
		acm_log(0, "ERROR - invalid request\n");
		return;
	}

	rec = (struct acm_resolve_rec *) mad->data;
	dest = acm_record_src(ep, wc, rec);
	if (!dest) {
		acm_log(0, "ERROR - failed to record source\n");
		return;
	}

	if (acm_addr_index(ep, rec->dest, rec->dest_type) < 0) {
		acm_log(2, "no matching address - discarding\n");
		return;
	}

	msg = acm_alloc_send(ep, dest, sizeof (*resp_mad));
	if (!msg) {
		acm_log(0, "ERROR - failed to allocate message\n");
		return;
	}

	resp_mad = (struct acm_mad *) msg->data;
	resp_rec = (struct acm_resolve_rec *) resp_mad->data;
	acm_init_resp_mad(resp_mad, mad);
	resp_rec->dest_type = rec->src_type;
	resp_rec->dest_length = rec->src_length;
	resp_rec->src_type = rec->dest_type;
	resp_rec->src_length = rec->dest_length;
	resp_rec->gid_cnt = 1;
	resp_rec->resp_resources = ep->port->dev->resp_resources;
	resp_rec->init_depth = ep->port->dev->init_depth;
	memcpy(resp_rec->dest, rec->src, ACM_MAX_ADDRESS);
	memcpy(resp_rec->src, rec->dest, ACM_MAX_ADDRESS);
	memcpy(resp_rec->gid, dest->mgid.raw, sizeof(union ibv_gid));

	acm_log(2, "sending resolve response\n");
	lock_acquire(&ep->lock);
	acm_post_send(msg);
	lock_release(&ep->lock);
}

static int acm_client_resolve_resp(struct acm_ep *ep, struct acm_client *client,
	struct acm_resolve_msg *msg, struct acm_dest *dest, uint8_t status)
{
	int ret;

	acm_log(1, "status 0x%x\n", status);
	lock_acquire(&client->lock);
	if (client->sock == INVALID_SOCKET) {
		acm_log(0, "ERROR - connection lost\n");
		ret = ACM_STATUS_ENOTCONN;
		goto release;
	}

	msg->hdr.opcode |= ACM_OP_ACK;
	msg->hdr.status = status;
	msg->hdr.param = 0;

	if (!status) {
		msg->hdr.src_type = ACM_EP_TYPE_DEVICE;
		msg->src.dev.guid = ep->port->dev->guid;
		msg->src.dev.pkey_index = ep->pkey_index;
		msg->src.dev.port_num = ep->port->port_num;

		if (dest) {
			acm_log(2, "destination found\n");
			msg->hdr.dest_type = ACM_EP_TYPE_AV;
			msg->dest.av = dest->av;
			msg->data.init_depth = min(ep->port->dev->init_depth,
				dest->resp_resources);
			msg->data.resp_resources = min(ep->port->dev->resp_resources,
				dest->init_depth);
			msg->data.packet_lifetime = dest->packet_lifetime;
			msg->data.mtu = dest->mtu;
		}
	}

	ret = send(client->sock, (char *) msg, sizeof *msg, 0);
	if (ret != sizeof(*msg))
		acm_log(0, "failed to send response\n");
	else
		ret = 0;

release:
	lock_release(&client->lock);
	(void) atomic_dec(&client->refcnt);
	return ret;
}

static struct acm_dest *
acm_record_dest(struct acm_ep *ep, struct ibv_wc *wc,
	struct acm_resolve_rec *req_rec, struct acm_resolve_rec *resp_rec)
{
	struct acm_dest *dest, **tdest;
	int ret;

	acm_log(2, "\n");
	lock_acquire(&ep->lock);
	tdest = tfind(req_rec->dest, &ep->dest_map[req_rec->dest_type - 1],
		acm_compare_dest);
	if (!tdest) {
		dest = NULL;
		goto unlock;
	}

	dest = *tdest;
	if (dest->ah)
		goto unlock;

	acm_log(2, "creating address handle\n");
	ret = acm_record_av(dest, ep, wc, resp_rec);
	if (ret) {
		acm_log(0, "ERROR - failed to record av\n");
		goto unlock;
	}

	dest->ah = ibv_create_ah(ep->port->dev->pd, &dest->av);
	if (!dest->ah) {
		acm_log(0, "ERROR - failed to create ah\n");
		goto unlock;
	}

	dest->remote_qpn = wc->src_qp;
	dest->init_depth = resp_rec->init_depth;
	dest->resp_resources = resp_rec->resp_resources;

unlock:
	lock_release(&ep->lock);
	return dest;
}

static void acm_process_resolve_resp(struct acm_ep *ep, struct ibv_wc *wc,
	struct acm_send_msg *msg, struct acm_mad *mad)
{
	struct acm_resolve_rec *req_rec, *resp_rec;
	struct acm_dest *dest;
	struct acm_request *client_req;
	DLIST_ENTRY *entry;
	uint8_t status;

	status = acm_class_status(mad->status);
	acm_log(2, "resp status 0x%x\n", status);
	req_rec = (struct acm_resolve_rec *) ((struct acm_mad *) msg->data)->data;
	resp_rec = (struct acm_resolve_rec *) mad->data;

	dest = acm_record_dest(ep, wc, req_rec, resp_rec);
	if (!dest) {
		acm_log(0, "ERROR - cannot record dest\n");
		return;
	}

	if (!status && !dest->ah)
		status = ACM_STATUS_EINVAL;

	lock_acquire(&ep->lock);
	while (!DListEmpty(&dest->req_queue)) {
		entry = dest->req_queue.Next;
		DListRemove(entry);
		client_req = container_of(entry, struct acm_request, entry);
		lock_release(&ep->lock);
		acm_log(2, "completing queued client request\n");
		acm_client_resolve_resp(ep, client_req->client,
			(struct acm_resolve_msg *) &client_req->msg, dest, status);
		lock_acquire(&ep->lock);
	}
	if (status) {
		acm_log(0, "resp failed 0x%x\n", status);
		tdelete(dest->address, &ep->dest_map[req_rec->dest_type - 1],
			acm_compare_dest);
	}
	lock_release(&ep->lock);
}

static int acm_validate_recv(struct acm_mad *mad)
{
	if (mad->base_version != 1 || mad->class_version != 1) {
		acm_log(0, "ERROR - invalid version %d %d\n",
			mad->base_version, mad->class_version);
		return ACM_STATUS_EINVAL;
	}

	if (mad->mgmt_class != ACM_MGMT_CLASS) {
		acm_log(0, "ERROR - invalid mgmt class 0x%x\n", mad->mgmt_class);
		return ACM_STATUS_EINVAL;
	}

	if (mad->control != ACM_CTRL_RESOLVE) {
		acm_log(0, "ERROR - invalid control 0x%x\n", mad->control);
		return ACM_STATUS_EINVAL;
	}

	return 0;
}

static void acm_process_recv(struct acm_ep *ep, struct ibv_wc *wc)
{
	struct acm_mad *mad;
	struct acm_send_msg *req;
	int free;

	acm_log(2, "\n");
	mad = (struct acm_mad *) (uintptr_t) (wc->wr_id + sizeof(struct ibv_grh));
	if (acm_validate_recv(mad)) {
		acm_log(0, "ERROR - discarding message\n");
		goto out;
	}

	if (mad->method & IB_METHOD_RESP) {
		acm_log(2, "received response\n");
		req = acm_get_request(ep, mad->tid, &free);
		if (!req) {
			acm_log(0, "response did not match active request\n");
			goto out;
		}
		acm_log(2, "found matching request\n");
		acm_process_resolve_resp(ep, wc, req, mad);
		if (free)
			acm_free_send(req);
	} else {
		acm_log(2, "unsolicited request\n");
		acm_process_resolve_req(ep, wc, mad);
		free = 0;
	}

out:
	acm_post_recv(ep, wc->wr_id);
}

static void acm_process_comp(struct acm_ep *ep, struct ibv_wc *wc)
{
	if (wc->status) {
		acm_log(0, "ERROR - work completion error\n"
			"\topcode %d, completion status %d\n",
			wc->opcode, wc->status);
		return;
	}

	if (wc->opcode & IBV_WC_RECV)
		acm_process_recv(ep, wc);
	else
		acm_complete_send((struct acm_send_msg *) (uintptr_t) wc->wr_id);
}

static void CDECL_FUNC acm_comp_handler(void *context)
{
	struct acm_device *dev = (struct acm_device *) context;
	struct acm_ep *ep;
	struct ibv_cq *cq;
	struct ibv_wc wc;
	int cnt;

	acm_log(1, "started\n");
	while (1) {
		ibv_get_cq_event(dev->channel, &cq, (void *) &ep);

		cnt = 0;
		while (ibv_poll_cq(cq, 1, &wc) > 0) {
			cnt++;
			acm_process_comp(ep, &wc);
		}

		ibv_req_notify_cq(cq, 0);
		while (ibv_poll_cq(cq, 1, &wc) > 0) {
			cnt++;
			acm_process_comp(ep, &wc);
		}

		ibv_ack_cq_events(cq, cnt);
	}
}

static void acm_format_mgid(union ibv_gid *mgid, uint16_t pkey, uint8_t tos,
	uint8_t rate, uint8_t mtu)
{
	mgid->raw[0] = 0xFF;
	mgid->raw[1] = 0x10 | 0x05;
	mgid->raw[2] = 0x40;
	mgid->raw[3] = 0x01;
	mgid->raw[4] = (uint8_t) (pkey >> 8);
	mgid->raw[5] = (uint8_t) pkey;
	mgid->raw[6] = tos;
	mgid->raw[7] = rate;
	mgid->raw[8] = mtu;
	mgid->raw[9] = 0;
	mgid->raw[10] = 0;
	mgid->raw[11] = 0;
	mgid->raw[12] = 0;
	mgid->raw[13] = 0;
	mgid->raw[14] = 0;
	mgid->raw[15] = 0;
}

static void acm_init_path_query(struct ib_sa_mad *mad, struct ib_path_record *path)
{
	uint32_t fl_hop;
	uint16_t qos_sl;

	acm_log(2, "\n");
	mad->base_version = 1;
	mad->mgmt_class = IB_MGMT_CLASS_SA;
	mad->class_version = 2;
	mad->method = IB_METHOD_GET;
	mad->tid = (uint64_t) atomic_inc(&tid);
	mad->attr_id = IB_SA_ATTR_PATH_REC;

	memcpy(mad->data, path, sizeof(*path));
	if (path->service_id)
		mad->comp_mask |= IB_COMP_MASK_PR_SERVICE_ID;
	if (path->dgid.global.interface_id || path->dgid.global.subnet_prefix)
		mad->comp_mask |= IB_COMP_MASK_PR_DGID;
	if (path->sgid.global.interface_id || path->sgid.global.subnet_prefix)
		mad->comp_mask |= IB_COMP_MASK_PR_SGID;
	if (path->dlid)
		mad->comp_mask |= IB_COMP_MASK_PR_DLID;
	if (path->slid)
		mad->comp_mask |= IB_COMP_MASK_PR_SLID;

	fl_hop = ntohl(path->flowlabel_hoplimit);
	if (fl_hop >> 8)
		mad->comp_mask |= IB_COMP_MASK_PR_FLOW_LABEL;
	if (fl_hop & 0xFF)
		mad->comp_mask |= IB_COMP_MASK_PR_HOP_LIMIT;

	if (path->tclass)
		mad->comp_mask |= IB_COMP_MASK_PR_TCLASS;
	if (path->reversible_numpath & 0x80)
		mad->comp_mask |= IB_COMP_MASK_PR_REVERSIBLE;
	if (path->pkey)
		mad->comp_mask |= IB_COMP_MASK_PR_PKEY;

	qos_sl = ntohs(path->qosclass_sl);
	if (qos_sl >> 4)
		mad->comp_mask |= IB_COMP_MASK_PR_QOS_CLASS;
	if (qos_sl & 0xF)
		mad->comp_mask |= IB_COMP_MASK_PR_SL;

	if (path->mtu & 0xC0)
		mad->comp_mask |= IB_COMP_MASK_PR_MTU_SELECTOR;
	if (path->mtu & 0x3F)
		mad->comp_mask |= IB_COMP_MASK_PR_MTU;
	if (path->rate & 0xC0)
		mad->comp_mask |= IB_COMP_MASK_PR_RATE_SELECTOR;
	if (path->rate & 0x3F)
		mad->comp_mask |= IB_COMP_MASK_PR_RATE;
	if (path->packetlifetime & 0xC0)
		mad->comp_mask |= IB_COMP_MASK_PR_PACKET_LIFETIME_SELECTOR;
	if (path->packetlifetime & 0x3F)
		mad->comp_mask |= IB_COMP_MASK_PR_PACKET_LIFETIME;
}

static void acm_init_join(struct ib_sa_mad *mad, union ibv_gid *port_gid,
	uint16_t pkey, uint8_t tos, uint8_t tclass, uint8_t sl,
	uint8_t rate, uint8_t mtu)
{
	struct ib_mc_member_rec *mc_rec;

	acm_log(2, "\n");
	mad->base_version = 1;
	mad->mgmt_class = IB_MGMT_CLASS_SA;
	mad->class_version = 2;
	mad->method = IB_METHOD_SET;
	mad->tid = (uint64_t) atomic_inc(&tid);
	mad->attr_id = IB_SA_ATTR_MC_MEMBER_REC;
	mad->comp_mask =
		IB_COMP_MASK_MC_MGID | IB_COMP_MASK_MC_PORT_GID |
		IB_COMP_MASK_MC_QKEY | IB_COMP_MASK_MC_MTU_SEL |
		IB_COMP_MASK_MC_MTU | IB_COMP_MASK_MC_TCLASS |
		IB_COMP_MASK_MC_PKEY | IB_COMP_MASK_MC_RATE_SEL |
		IB_COMP_MASK_MC_RATE | IB_COMP_MASK_MC_SL |
		IB_COMP_MASK_MC_FLOW | IB_COMP_MASK_MC_SCOPE |
		IB_COMP_MASK_MC_JOIN_STATE;

	mc_rec = (struct ib_mc_member_rec *) mad->data;
	acm_format_mgid(&mc_rec->mgid, pkey, tos, rate, mtu);
	mc_rec->port_gid = *port_gid;
	mc_rec->qkey = ACM_QKEY;
	mc_rec->mtu = 0x80 | mtu;
	mc_rec->tclass = tclass;
	mc_rec->pkey = htons(pkey);
	mc_rec->rate = 0x80 | rate;
	mc_rec->sl_flow_hop = htonl(((uint32_t) sl) << 28);
	mc_rec->scope_state = 0x51;
}

static void acm_join_group(struct acm_ep *ep, union ibv_gid *port_gid,
	uint8_t tos, uint8_t tclass, uint8_t sl, uint8_t rate, uint8_t mtu)
{
	struct acm_port *port;
	struct ib_sa_mad *mad;
	struct ib_user_mad *umad;
	struct ib_mc_member_rec *mc_rec;
	int ret, len;

	acm_log(2, "\n");
	len = sizeof(*umad) + sizeof(*mad);
	umad = (struct ib_user_mad *) zalloc(len);
	if (!umad) {
		acm_log(0, "ERROR - unable to allocate MAD for join\n");
		return;
	}

	port = ep->port;
	umad->addr.qpn = htonl(port->sa_dest.remote_qpn);
	umad->addr.qkey = htonl(ACM_QKEY);
	umad->addr.pkey_index = ep->pkey_index;
	umad->addr.lid = htons(port->sa_dest.av.dlid);
	umad->addr.sl = port->sa_dest.av.sl;
	umad->addr.path_bits = port->sa_dest.av.src_path_bits;

	acm_log(0, "%s %d pkey 0x%x, sl 0x%x, rate 0x%x, mtu 0x%x\n",
		ep->port->dev->verbs->device->name, ep->port->port_num,
		ep->pkey, sl, rate, mtu);
	mad = (struct ib_sa_mad *) umad->data;
	acm_init_join(mad, port_gid, ep->pkey, tos, tclass, sl, rate, mtu);
	mc_rec = (struct ib_mc_member_rec *) mad->data;
memcpy(&ep->mc_dest[ep->mc_cnt++], &mc_rec->mgid, sizeof(mc_rec->mgid)); ret = umad_send(port->mad_portid, port->mad_agentid, (void *) umad, sizeof(*mad), timeout, retries); if (ret) { acm_log(0, "ERROR - failed to send multicast join request %d\n", ret); goto out; } acm_log(1, "waiting for response from SA to join request\n"); ret = umad_recv(port->mad_portid, (void *) umad, &len, -1); if (ret < 0) { acm_log(0, "ERROR - recv error for multicast join response %d\n", ret); goto out; } acm_process_join_resp(ep, umad); out: free(umad); } static void acm_port_join(void *context) { struct acm_device *dev; struct acm_port *port = (struct acm_port *) context; struct acm_ep *ep; union ibv_gid port_gid; DLIST_ENTRY *ep_entry; int ret; dev = port->dev; acm_log(1, "device %s port %d\n", dev->verbs->device->name, port->port_num); ret = ibv_query_gid(dev->verbs, port->port_num, 0, &port_gid); if (ret) { acm_log(0, "ERROR - ibv_query_gid %d device %s port %d\n", ret, dev->verbs->device->name, port->port_num); return; } for (ep_entry = port->ep_list.Next; ep_entry != &port->ep_list; ep_entry = ep_entry->Next) { ep = container_of(ep_entry, struct acm_ep, entry); acm_join_group(ep, &port_gid, 0, 0, 0, min_rate, min_mtu); if (port->rate != min_rate || port->mtu != min_mtu) acm_join_group(ep, &port_gid, 0, 0, 0, port->rate, port->mtu); } acm_log(1, "joins for device %s port %d complete\n", dev->verbs->device->name, port->port_num); } static void acm_join_groups(void) { struct acm_device *dev; struct acm_port *port; DLIST_ENTRY *dev_entry; int i; acm_log(1, "initiating multicast joins for all ports\n"); for (dev_entry = dev_list.Next; dev_entry != &dev_list; dev_entry = dev_entry->Next) { dev = container_of(dev_entry, struct acm_device, entry); for (i = 0; i < dev->port_cnt; i++) { port = &dev->port[i]; if (port->state != IBV_PORT_ACTIVE) continue; acm_log(1, "starting join for device %s, port %d\n", dev->verbs->device->name, port->port_num); // TODO: handle dynamic changes 
//beginthread(acm_port_join, port); acm_port_join(port); } } } static void acm_process_timeouts(void) { DLIST_ENTRY *entry; struct acm_send_msg *msg; struct acm_mad *mad; struct acm_resolve_rec *rec; struct acm_dest *dest, **tdest; struct acm_request *req; struct acm_ep *ep; while (!DListEmpty(&timeout_list)) { entry = timeout_list.Next; DListRemove(entry); msg = container_of(entry, struct acm_send_msg, entry); mad = (struct acm_mad *) msg->data; rec = (struct acm_resolve_rec *) mad->data; ep = msg->ep; acm_log_ep_addr(0, "acm_process_timeouts: dest ", (union acm_ep_addr *) &rec->dest, rec->dest_type); lock_acquire(&ep->lock); tdest = tfind(rec->dest, &ep->dest_map[rec->dest_type - 1], acm_compare_dest); if (!tdest) { acm_log(0, "destination already removed\n"); lock_release(&ep->lock); continue; } else { dest = *tdest; } acm_log(2, "failing pending client requests\n"); while (!DListEmpty(&dest->req_queue)) { entry = dest->req_queue.Next; DListRemove(entry); req = container_of(entry, struct acm_request, entry); lock_release(&ep->lock); acm_client_resolve_resp(ep, req->client, (struct acm_resolve_msg *) &req->msg, dest, ACM_STATUS_ETIMEDOUT); lock_acquire(&ep->lock); } acm_log(2, "resolve timed out, releasing destination\n"); tdelete(dest->address, &ep->dest_map[rec->dest_type - 1], acm_compare_dest); lock_release(&ep->lock); } } static void acm_process_wait_queue(struct acm_ep *ep, uint64_t *next_expire) { struct acm_send_msg *msg; DLIST_ENTRY *entry, *next; struct ibv_send_wr *bad_wr; for (entry = ep->wait_queue.Next; entry != &ep->wait_queue; entry = next) { next = entry->Next; msg = container_of(entry, struct acm_send_msg, entry); if (msg->expires < time_stamp_ms()) { DListRemove(entry); (void) atomic_dec(&wait_cnt); if (--msg->tries) { acm_log(2, "retrying request\n"); DListInsertTail(&msg->entry, &ep->active_queue); ibv_post_send(ep->qp, &msg->wr, &bad_wr); } else { acm_log(0, "failing request\n"); acm_send_available(ep); DListInsertTail(&msg->entry, 
&timeout_list); } } else { *next_expire = min(*next_expire, msg->expires); break; } } } static void CDECL_FUNC acm_retry_handler(void *context) { struct acm_device *dev; struct acm_port *port; struct acm_ep *ep; DLIST_ENTRY *dev_entry, *ep_entry; uint64_t next_expire; int i, wait; acm_log(0, "started\n"); while (1) { while (!atomic_get(&wait_cnt)) event_wait(&timeout_event, -1); next_expire = -1; for (dev_entry = dev_list.Next; dev_entry != &dev_list; dev_entry = dev_entry->Next) { dev = container_of(dev_entry, struct acm_device, entry); for (i = 0; i < dev->port_cnt; i++) { port = &dev->port[i]; for (ep_entry = port->ep_list.Next; ep_entry != &port->ep_list; ep_entry = ep_entry->Next) { ep = container_of(ep_entry, struct acm_ep, entry); lock_acquire(&ep->lock); if (!DListEmpty(&ep->wait_queue)) acm_process_wait_queue(ep, &next_expire); lock_release(&ep->lock); } } } acm_process_timeouts(); wait = (int) (next_expire - time_stamp_ms()); if (wait > 0 && atomic_get(&wait_cnt)) event_wait(&timeout_event, wait); } } static void acm_init_server(void) { int i; for (i = 0; i < FD_SETSIZE - 1; i++) { lock_init(&client[i].lock); client[i].index = i; client[i].sock = INVALID_SOCKET; } } static int acm_listen(void) { struct sockaddr_in addr; int ret; acm_log(2, "\n"); listen_socket = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP); if (listen_socket == INVALID_SOCKET) { acm_log(0, "ERROR - unable to allocate listen socket\n"); return socket_errno(); } memset(&addr, 0, sizeof addr); addr.sin_family = AF_INET; addr.sin_port = htons(server_port); ret = bind(listen_socket, (struct sockaddr *) &addr, sizeof addr); if (ret == SOCKET_ERROR) { acm_log(0, "ERROR - unable to bind listen socket\n"); return socket_errno(); } ret = listen(listen_socket, 0); if (ret == SOCKET_ERROR) { acm_log(0, "ERROR - unable to start listen\n"); return socket_errno(); } acm_log(2, "listen active\n"); return 0; } static void acm_release_client(struct acm_client *client) { lock_acquire(&client->lock); 
shutdown(client->sock, SHUT_RDWR); closesocket(client->sock); client->sock = INVALID_SOCKET; lock_release(&client->lock); (void) atomic_dec(&client->refcnt); } static void acm_svr_accept(void) { SOCKET s; int i; acm_log(2, "\n"); s = accept(listen_socket, NULL, NULL); if (s == INVALID_SOCKET) { acm_log(0, "ERROR - failed to accept connection\n"); return; } for (i = 0; i < FD_SETSIZE - 1; i++) { if (!atomic_get(&client[i].refcnt)) break; } if (i == FD_SETSIZE - 1) { acm_log(0, "all connections busy - rejecting\n"); closesocket(s); return; } client[i].sock = s; atomic_set(&client[i].refcnt, 1); acm_log(2, "assigned client id %d\n", i); } static uint8_t acm_get_addr_type(uint8_t ep_type) { if (ep_type >= ACM_ADDRESS_RESERVED) { acm_log(0, "ERROR - invalid ep type %d\n", ep_type); return ACM_ADDRESS_INVALID; } return ep_type; } static int acm_client_query_resp(struct acm_ep *ep, struct acm_client *client, struct acm_query_msg *msg, uint8_t status) { int ret; acm_log(1, "status 0x%x\n", status); lock_acquire(&client->lock); if (client->sock == INVALID_SOCKET) { acm_log(0, "ERROR - connection lost\n"); ret = ACM_STATUS_ENOTCONN; goto release; } msg->hdr.opcode |= ACM_OP_ACK; msg->hdr.status = status; ret = send(client->sock, (char *) msg, sizeof *msg, 0); if (ret != sizeof(*msg)) acm_log(0, "failed to send response\n"); else ret = 0; release: lock_release(&client->lock); (void) atomic_dec(&client->refcnt); return ret; } static struct acm_ep * acm_get_ep_by_path(struct ib_path_record *path) { struct acm_device *dev; struct acm_port *port; struct acm_ep *ep; DLIST_ENTRY *dev_entry, *ep_entry; int i; for (dev_entry = dev_list.Next; dev_entry != &dev_list; dev_entry = dev_entry->Next) { dev = container_of(dev_entry, struct acm_device, entry); for (i = 0; i < dev->port_cnt; i++) { port = &dev->port[i]; // requires slid if (port->lid != ntohs(path->slid)) continue; for (ep_entry = port->ep_list.Next; ep_entry != &port->ep_list; ep_entry = ep_entry->Next) { // ignores pkey ep = 
container_of(ep_entry, struct acm_ep, entry); return ep; } } } acm_log(0, "could not find endpoint\n"); return NULL; } // TODO: process send/recv asynchronously static uint8_t acm_query_sa(struct acm_ep *ep, uint8_t query, union acm_query_data *data) { struct acm_port *port; struct ib_sa_mad *mad; struct ib_user_mad *umad; int ret, len; size_t size; acm_log(2, "\n"); len = sizeof(*umad) + sizeof(*mad); umad = (struct ib_user_mad *) zalloc(len); if (!umad) { acm_log(0, "ERROR - unable to allocate MAD\n"); return ACM_STATUS_ENOMEM; } port = ep->port; umad->addr.qpn = htonl(port->sa_dest.remote_qpn); umad->addr.qkey = htonl(ACM_QKEY); umad->addr.pkey_index = ep->pkey_index; umad->addr.lid = htons(port->sa_dest.av.dlid); umad->addr.sl = port->sa_dest.av.sl; umad->addr.path_bits = port->sa_dest.av.src_path_bits; mad = (struct ib_sa_mad *) umad->data; switch (query) { case ACM_QUERY_PATH_RECORD: acm_init_path_query(mad, &data->path); size = sizeof(data->path); break; default: acm_log(0, "ERROR - unknown attribute id\n"); ret = ACM_STATUS_EINVAL; goto out; } ret = umad_send(port->mad_portid, port->mad_agentid, (void *) umad, sizeof(*mad), timeout, retries); if (ret) { acm_log(0, "ERROR - umad_send %d\n", ret); goto out; } acm_log(2, "waiting to receive SA response\n"); ret = umad_recv(port->mad_portid, (void *) umad, &len, -1); if (ret < 0) { acm_log(0, "ERROR - umad_recv %d\n", ret); goto out; } memcpy(data, mad->data, size); ret = umad->status ? umad->status : mad->status; if (ret) { acm_log(0, "SA query response error: 0x%x\n", ret); ret = ((uint8_t) ret) ? 
ret : -1; } out: free(umad); return (uint8_t) ret; } static int acm_svr_query(struct acm_client *client, struct acm_query_msg *msg) { struct acm_ep *ep; uint8_t status; acm_log(2, "processing client query\n"); ep = acm_get_ep_by_path(&msg->data.path); if (!ep) { acm_log(0, "could not find local end point\n"); status = ACM_STATUS_ESRCADDR; goto resp; } (void) atomic_inc(&client->refcnt); lock_acquire(&ep->lock); status = acm_query_sa(ep, msg->hdr.param & ~ACM_QUERY_SA, &msg->data); lock_release(&ep->lock); resp: return acm_client_query_resp(ep, client, msg, status); } static uint8_t acm_send_resolve(struct acm_ep *ep, union acm_ep_addr *src, uint8_t src_type, struct acm_dest *dest, uint8_t dest_type) { struct acm_send_msg *msg; struct acm_mad *mad; struct acm_resolve_rec *rec; int i; acm_log(2, "\n"); if (!ep->mc_dest[0].ah) { acm_log(0, "ERROR - multicast group not ready\n"); return ACM_STATUS_ENOTCONN; } msg = acm_alloc_send(ep, &ep->mc_dest[0], sizeof(struct acm_mad)); if (!msg) { acm_log(0, "ERROR - cannot allocate send msg\n"); return ACM_STATUS_ENOMEM; } msg->tries = retries + 1; mad = (struct acm_mad *) msg->data; mad->base_version = 1; mad->mgmt_class = ACM_MGMT_CLASS; mad->class_version = 1; mad->method = IB_METHOD_GET; mad->control = ACM_CTRL_RESOLVE; mad->tid = (uint64_t) atomic_inc(&tid); rec = (struct acm_resolve_rec *) mad->data; rec->src_type = src_type; rec->src_length = ACM_MAX_ADDRESS; memcpy(rec->src, src->addr, ACM_MAX_ADDRESS); rec->dest_type = dest_type; rec->dest_length = ACM_MAX_ADDRESS; memcpy(rec->dest, dest->address, ACM_MAX_ADDRESS); rec->resp_resources = ep->port->dev->resp_resources; rec->init_depth = ep->port->dev->init_depth; rec->gid_cnt = (uint8_t) ep->mc_cnt; for (i = 0; i < ep->mc_cnt; i++) memcpy(&rec->gid[i], ep->mc_dest[i].address, 16); acm_post_send(msg); return 0; } static struct acm_ep * acm_get_ep_by_addr(union acm_ep_addr *addr, uint8_t src_type) { struct acm_device *dev; struct acm_port *port; struct acm_ep *ep; 
DLIST_ENTRY *dev_entry, *ep_entry; int i; acm_log_ep_addr(2, "acm_get_ep_by_addr: ", addr, src_type); for (dev_entry = dev_list.Next; dev_entry != &dev_list; dev_entry = dev_entry->Next) { dev = container_of(dev_entry, struct acm_device, entry); for (i = 0; i < dev->port_cnt; i++) { port = &dev->port[i]; for (ep_entry = port->ep_list.Next; ep_entry != &port->ep_list; ep_entry = ep_entry->Next) { ep = container_of(ep_entry, struct acm_ep, entry); if (acm_addr_index(ep, addr->addr, src_type) >= 0) return ep; } } } acm_log_ep_addr(0, "acm_get_ep_by_addr: could not find ", addr, src_type); return NULL; } static int acm_svr_resolve(struct acm_client *client, struct acm_resolve_msg *msg) { struct acm_ep *ep; struct acm_dest *dest, **tdest; struct acm_request *req; uint8_t dest_type, src_type; uint8_t status; acm_log_ep_addr(2, "acm_svr_resolve: source ", &msg->src, msg->hdr.src_type); ep = acm_get_ep_by_addr(&msg->src, msg->hdr.src_type); if (!ep) { acm_log(0, "unknown local end point\n"); status = ACM_STATUS_ESRCADDR; goto resp; } dest_type = acm_get_addr_type(msg->hdr.dest_type); if (dest_type == ACM_ADDRESS_INVALID) { acm_log(0, "ERROR - unknown destination type\n"); status = ACM_STATUS_EDESTTYPE; goto resp; } acm_log_ep_addr(2, "acm_svr_resolve: dest ", &msg->dest, msg->hdr.dest_type); (void) atomic_inc(&client->refcnt); lock_acquire(&ep->lock); tdest = tfind(msg->dest.addr, &ep->dest_map[dest_type - 1], acm_compare_dest); dest = tdest ? 
*tdest : NULL; if (dest && dest->ah) { acm_log(2, "request satisfied from local cache\n"); status = ACM_STATUS_SUCCESS; goto release; } req = zalloc(sizeof *req); if (!req) { acm_log(0, "ERROR - unable to allocate memory to queue client request\n"); status = ACM_STATUS_ENOMEM; goto release; } if (!dest) { acm_log(2, "adding new destination\n"); dest = zalloc(sizeof *dest); if (!dest) { acm_log(0, "ERROR - unable to allocate destination in client request\n"); status = ACM_STATUS_ENOMEM; goto free_req; } memcpy(dest->address, msg->dest.addr, ACM_MAX_ADDRESS); src_type = acm_get_addr_type(msg->hdr.src_type); acm_log(2, "sending resolve msg to dest\n"); status = acm_send_resolve(ep, &msg->src, src_type, dest, dest_type); if (status) { acm_log(0, "ERROR - failure sending resolve request 0x%x\n", status); goto free_dest; } DListInit(&dest->req_queue); tsearch(dest, &ep->dest_map[dest_type - 1], acm_compare_dest); } acm_log(2, "queuing client request\n"); req->client = client; memcpy(&req->msg, msg, sizeof(req->msg)); DListInsertTail(&req->entry, &dest->req_queue); lock_release(&ep->lock); return 0; free_dest: free(dest); dest = NULL; free_req: free(req); release: lock_release(&ep->lock); resp: return acm_client_resolve_resp(ep, client, msg, dest, status); } static void acm_svr_receive(struct acm_client *client) { struct acm_msg msg; int ret; acm_log(2, "\n"); ret = recv(client->sock, (char *) &msg, sizeof msg, 0); if (ret != sizeof msg) { acm_log(2, "client disconnected\n"); ret = ACM_STATUS_ENOTCONN; goto out; } if (msg.hdr.version != ACM_VERSION) { acm_log(0, "ERROR - unsupported version %d\n", msg.hdr.version); goto out; } switch (msg.hdr.opcode & ACM_OP_MASK) { case ACM_OP_RESOLVE: ret = acm_svr_resolve(client, (struct acm_resolve_msg *) &msg); break; case ACM_OP_QUERY: ret = acm_svr_query(client, (struct acm_query_msg *) &msg); break; default: acm_log(0, "ERROR - unknown opcode 0x%x\n", msg.hdr.opcode); ret = -1; break; } out: if (ret) acm_release_client(client); } 
static void acm_server(void) { fd_set readfds; int i, n, ret; acm_log(0, "started\n"); acm_init_server(); ret = acm_listen(); if (ret) { acm_log(0, "ERROR - server listen failed\n"); return; } while (1) { n = (int) listen_socket; FD_ZERO(&readfds); FD_SET(listen_socket, &readfds); for (i = 0; i < FD_SETSIZE - 1; i++) { if (client[i].sock != INVALID_SOCKET) { FD_SET(client[i].sock, &readfds); n = max(n, (int) client[i].sock); } } ret = select(n + 1, &readfds, NULL, NULL, NULL); if (ret == SOCKET_ERROR) { acm_log(0, "ERROR - server select error\n"); continue; } if (FD_ISSET(listen_socket, &readfds)) acm_svr_accept(); for (i = 0; i < FD_SETSIZE - 1; i++) { if (client[i].sock != INVALID_SOCKET && FD_ISSET(client[i].sock, &readfds)) { acm_log(2, "receiving from client %d\n", i); acm_svr_receive(&client[i]); } } } } static enum ibv_rate acm_get_rate(uint8_t width, uint8_t speed) { switch (width) { case 1: switch (speed) { case 1: return IBV_RATE_2_5_GBPS; case 2: return IBV_RATE_5_GBPS; case 4: return IBV_RATE_10_GBPS; default: return IBV_RATE_MAX; } case 2: switch (speed) { case 1: return IBV_RATE_10_GBPS; case 2: return IBV_RATE_20_GBPS; case 4: return IBV_RATE_40_GBPS; default: return IBV_RATE_MAX; } case 4: switch (speed) { case 1: return IBV_RATE_20_GBPS; case 2: return IBV_RATE_40_GBPS; case 4: return IBV_RATE_80_GBPS; default: return IBV_RATE_MAX; } case 8: switch (speed) { case 1: return IBV_RATE_30_GBPS; case 2: return IBV_RATE_60_GBPS; case 4: return IBV_RATE_120_GBPS; default: return IBV_RATE_MAX; } default: acm_log(0, "ERROR - unknown link width 0x%x\n", width); return IBV_RATE_MAX; } } static enum ibv_mtu acm_convert_mtu(int mtu) { switch (mtu) { case 256: return IBV_MTU_256; case 512: return IBV_MTU_512; case 1024: return IBV_MTU_1024; case 2048: return IBV_MTU_2048; case 4096: return IBV_MTU_4096; default: return IBV_MTU_2048; } } static enum ibv_rate acm_convert_rate(int rate) { switch (rate) { case 2: return IBV_RATE_2_5_GBPS; case 5: return 
IBV_RATE_5_GBPS; case 10: return IBV_RATE_10_GBPS; case 20: return IBV_RATE_20_GBPS; case 30: return IBV_RATE_30_GBPS; case 40: return IBV_RATE_40_GBPS; case 60: return IBV_RATE_60_GBPS; case 80: return IBV_RATE_80_GBPS; case 120: return IBV_RATE_120_GBPS; default: return IBV_RATE_10_GBPS; } } static int acm_post_recvs(struct acm_ep *ep) { int i, size; size = recv_depth * ACM_RECV_SIZE; ep->recv_bufs = malloc(size); if (!ep->recv_bufs) { acm_log(0, "ERROR - unable to allocate receive buffer\n"); return ACM_STATUS_ENOMEM; } ep->mr = ibv_reg_mr(ep->port->dev->pd, ep->recv_bufs, size, IBV_ACCESS_LOCAL_WRITE); if (!ep->mr) { acm_log(0, "ERROR - unable to register receive buffer\n"); goto err; } for (i = 0; i < recv_depth; i++) { acm_post_recv(ep, (uintptr_t) (ep->recv_bufs + ACM_RECV_SIZE * i)); } return 0; err: free(ep->recv_bufs); return -1; } static int acm_assign_ep_names(struct acm_ep *ep) { char *dev_name; FILE *f; char s[120]; char dev[32], addr[32], pkey_str[8]; uint16_t pkey; uint8_t type; int port, index = 0; struct in6_addr ip_addr; dev_name = ep->port->dev->verbs->device->name; acm_log(1, "device %s, port %d, pkey 0x%x\n", dev_name, ep->port->port_num, ep->pkey); if (!(f = fopen("acm_addr.cfg", "r"))) { acm_log(0, "ERROR - unable to open acm_addr.cfg file\n"); return ACM_STATUS_ENODATA; } while (fgets(s, sizeof s, f)) { if (s[0] == '#') continue; /* field widths leave room for the terminating NUL */ if (sscanf(s, "%31s%31s%d%7s", addr, dev, &port, pkey_str) != 4) continue; acm_log(2, "%s", s); if (inet_pton(AF_INET, addr, &ip_addr) > 0) type = ACM_ADDRESS_IP; else if (inet_pton(AF_INET6, addr, &ip_addr) > 0) type = ACM_ADDRESS_IP6; else type = ACM_ADDRESS_NAME; if (stricmp(pkey_str, "default")) { if (sscanf(pkey_str, "%hx", &pkey) != 1) { acm_log(0, "ERROR - bad pkey format %s\n", pkey_str); continue; } } else { pkey = 0xFFFF; } if (!stricmp(dev_name, dev) && (ep->port->port_num == (uint8_t) port) && (ep->pkey == pkey)) { ep->addr_type[index] = type; acm_log(1, "assigning %s\n", addr); if (type == 
ACM_ADDRESS_IP) memcpy(ep->addr[index].addr, &ip_addr, 4); else if (type == ACM_ADDRESS_IP6) memcpy(ep->addr[index].addr, &ip_addr, sizeof ip_addr); else strncpy((char *) ep->addr[index].addr, addr, ACM_MAX_ADDRESS); if (++index == MAX_EP_ADDR) { acm_log(1, "maximum number of names assigned to EP\n"); break; } } } fclose(f); return !index; } static int acm_activate_ep(struct acm_port *port, struct acm_ep *ep, uint16_t pkey_index) { struct ibv_qp_init_attr init_attr; struct ibv_qp_attr attr; int ret; acm_log(1, "\n"); ep->port = port; ep->pkey_index = pkey_index; ep->available_sends = send_depth; DListInit(&ep->pending_queue); DListInit(&ep->active_queue); DListInit(&ep->wait_queue); lock_init(&ep->lock); ret = ibv_query_pkey(port->dev->verbs, port->port_num, pkey_index, &ep->pkey); if (ret) return ACM_STATUS_EINVAL; ret = acm_assign_ep_names(ep); if (ret) { acm_log(0, "ERROR - unable to assign EP name\n"); return ret; } ep->cq = ibv_create_cq(port->dev->verbs, send_depth + recv_depth, ep, port->dev->channel, 0); if (!ep->cq) { acm_log(0, "ERROR - failed to create CQ\n"); return -1; } ret = ibv_req_notify_cq(ep->cq, 0); if (ret) { acm_log(0, "ERROR - failed to arm CQ\n"); goto err1; } memset(&init_attr, 0, sizeof init_attr); init_attr.cap.max_send_wr = send_depth; init_attr.cap.max_recv_wr = recv_depth; init_attr.cap.max_send_sge = 1; init_attr.cap.max_recv_sge = 1; init_attr.qp_context = ep; init_attr.sq_sig_all = 1; init_attr.qp_type = IBV_QPT_UD; init_attr.send_cq = ep->cq; init_attr.recv_cq = ep->cq; ep->qp = ibv_create_qp(ep->port->dev->pd, &init_attr); if (!ep->qp) { acm_log(0, "ERROR - failed to create QP\n"); goto err1; } attr.qp_state = IBV_QPS_INIT; attr.port_num = port->port_num; attr.pkey_index = pkey_index; attr.qkey = ACM_QKEY; ret = ibv_modify_qp(ep->qp, &attr, IBV_QP_STATE | IBV_QP_PKEY_INDEX | IBV_QP_PORT | IBV_QP_QKEY); if (ret) { acm_log(0, "ERROR - failed to modify QP to init\n"); goto err2; } attr.qp_state = IBV_QPS_RTR; ret = 
ibv_modify_qp(ep->qp, &attr, IBV_QP_STATE); if (ret) { acm_log(0, "ERROR - failed to modify QP to rtr\n"); goto err2; } attr.qp_state = IBV_QPS_RTS; attr.sq_psn = 0; ret = ibv_modify_qp(ep->qp, &attr, IBV_QP_STATE | IBV_QP_SQ_PSN); if (ret) { acm_log(0, "ERROR - failed to modify QP to rts\n"); goto err2; } ret = acm_post_recvs(ep); if (ret) goto err2; return 0; err2: ibv_destroy_qp(ep->qp); err1: ibv_destroy_cq(ep->cq); return -1; } static void acm_activate_port(struct acm_port *port) { struct acm_ep *ep; int i, ret; acm_log(1, "%s %d\n", port->dev->verbs->device->name, port->port_num); for (i = 0; i < port->pkey_cnt; i++) { ep = zalloc(sizeof *ep); if (!ep) break; ret = acm_activate_ep(port, ep, (uint16_t) i); if (!ret) { DListInsertHead(&ep->entry, &port->ep_list); } else { acm_log(0, "ERROR - failed to activate EP\n"); free(ep); } } if (DListEmpty(&port->ep_list)) goto err1; port->mad_portid = umad_open_port(port->dev->verbs->device->name, port->port_num); if (port->mad_portid < 0) { acm_log(0, "ERROR - unable to open MAD port\n"); goto err2; } port->mad_agentid = umad_register(port->mad_portid, IB_MGMT_CLASS_SA, 1, 1, NULL); if (port->mad_agentid < 0) { acm_log(0, "ERROR - unable to register MAD client\n"); goto err3; } return; err3: umad_close_port(port->mad_portid); err2: /* TODO: cleanup ep list */ err1: port->state = IBV_PORT_NOP; port->dev->active--; } static int acm_activate_dev(struct acm_device *dev) { int i; acm_log(1, "%s\n", dev->verbs->device->name); dev->pd = ibv_alloc_pd(dev->verbs); if (!dev->pd) return ACM_STATUS_ENOMEM; dev->channel = ibv_create_comp_channel(dev->verbs); if (!dev->channel) { acm_log(0, "ERROR - unable to create comp channel\n"); goto err1; } for (i = 0; i < dev->port_cnt; i++) { acm_log(2, "checking port %d\n", dev->port[i].port_num); if (dev->port[i].state == IBV_PORT_ACTIVE) acm_activate_port(&dev->port[i]); } if (!dev->active) goto err2; acm_log(1, "starting completion thread\n"); beginthread(acm_comp_handler, dev); return 
0; err2: ibv_destroy_comp_channel(dev->channel); err1: ibv_dealloc_pd(dev->pd); return -1; } static void acm_init_port(struct acm_port *port) { struct ibv_port_attr attr; union ibv_gid gid; uint16_t pkey; int ret; acm_log(1, "%s %d\n", port->dev->verbs->device->name, port->port_num); DListInit(&port->ep_list); ret = ibv_query_port(port->dev->verbs, port->port_num, &attr); if (ret) return; port->state = attr.state; port->mtu = attr.active_mtu; port->rate = acm_get_rate(attr.active_width, attr.active_speed); port->subnet_timeout = 1 << (attr.subnet_timeout - 8); for (;; port->gid_cnt++) { ret = ibv_query_gid(port->dev->verbs, port->port_num, port->gid_cnt, &gid); if (ret || !gid.global.interface_id) break; } for (;; port->pkey_cnt++) { ret = ibv_query_pkey(port->dev->verbs, port->port_num, port->pkey_cnt, &pkey); if (ret || !pkey) break; } port->lid = attr.lid; port->lmc = attr.lmc; port->sa_dest.av.dlid = attr.sm_lid; port->sa_dest.av.sl = attr.sm_sl; port->sa_dest.av.port_num = port->port_num; port->sa_dest.remote_qpn = 1; if (port->state == IBV_PORT_ACTIVE) port->dev->active++; } static void acm_open_dev(struct ibv_device *ibdev) { struct acm_device *dev; struct ibv_device_attr attr; struct ibv_context *verbs; size_t size; int i, ret; acm_log(1, "%s\n", ibdev->name); verbs = ibv_open_device(ibdev); if (verbs == NULL) { acm_log(0, "ERROR - opening device %s\n", ibdev->name); return; } ret = ibv_query_device(verbs, &attr); if (ret) { /* arguments must follow the format order: %s then %d */ acm_log(0, "ERROR - ibv_query_device (%s) %d\n", ibdev->name, ret); goto err1; } size = sizeof(*dev) + sizeof(struct acm_port) * attr.phys_port_cnt; dev = (struct acm_device *) zalloc(size); if (!dev) goto err1; dev->verbs = verbs; dev->guid = ibv_get_device_guid(ibdev); dev->port_cnt = attr.phys_port_cnt; dev->init_depth = (uint8_t) attr.max_qp_init_rd_atom; dev->resp_resources = (uint8_t) attr.max_qp_rd_atom; for (i = 0; i < dev->port_cnt; i++) { dev->port[i].dev = dev; dev->port[i].port_num = i + 1; acm_init_port(&dev->port[i]); } if 
(!dev->active || acm_activate_dev(dev)) goto err2; acm_log(1, "%s now active\n", ibdev->name); DListInsertHead(&dev->entry, &dev_list); return; err2: free(dev); err1: ibv_close_device(verbs); } static void acm_set_options(void) { FILE *f; char s[120]; char opt[32], value[32]; if (!(f = fopen("acm_opts.cfg", "r"))) return; while (fgets(s, sizeof s, f)) { if (s[0] == '#') continue; /* field widths leave room for the terminating NUL */ if (sscanf(s, "%31s%31s", opt, value) != 2) continue; if (!stricmp("log_file", opt)) strcpy(log_file, value); else if (!stricmp("log_level", opt)) log_level = atoi(value); else if (!stricmp("server_port", opt)) server_port = (short) atoi(value); else if (!stricmp("timeout", opt)) timeout = atoi(value); else if (!stricmp("retries", opt)) retries = atoi(value); else if (!stricmp("send_depth", opt)) send_depth = atoi(value); else if (!stricmp("recv_depth", opt)) recv_depth = atoi(value); else if (!stricmp("min_mtu", opt)) min_mtu = acm_convert_mtu(atoi(value)); else if (!stricmp("min_rate", opt)) min_rate = acm_convert_rate(atoi(value)); } fclose(f); } static void acm_log_options(void) { acm_log(0, "log level %d\n", log_level); acm_log(0, "server_port %d\n", server_port); acm_log(0, "timeout %d ms\n", timeout); acm_log(0, "retries %d\n", retries); acm_log(0, "send depth %d\n", send_depth); acm_log(0, "receive depth %d\n", recv_depth); acm_log(0, "minimum mtu %d\n", min_mtu); acm_log(0, "minimum rate %d\n", min_rate); } static FILE *acm_open_log(void) { FILE *f; int n; if (!stricmp(log_file, "stdout")) return stdout; if (!stricmp(log_file, "stderr")) return stderr; n = strlen(log_file); sprintf(&log_file[n], "%5u.log", getpid()); if (!(f = fopen(log_file, "w"))) f = stdout; return f; } int CDECL_FUNC main(int argc, char **argv) { struct ibv_device **ibdev; int dev_cnt; int i; if (osd_init()) return -1; acm_set_options(); lock_init(&log_lock); flog = acm_open_log(); acm_log(0, "Assistant to the InfiniBand Communication Manager\n"); acm_log_options(); DListInit(&dev_list); 
DListInit(&timeout_list); event_init(&timeout_event); umad_init(); ibdev = ibv_get_device_list(&dev_cnt); if (!ibdev) { acm_log(0, "ERROR - unable to get device list\n"); return -1; } acm_log(1, "opening devices\n"); for (i = 0; i < dev_cnt; i++) acm_open_dev(ibdev[i]); ibv_free_device_list(ibdev); acm_log(1, "initiating multicast joins\n"); acm_join_groups(); acm_log(1, "multicast joins done\n"); acm_log(1, "starting timeout/retry thread\n"); beginthread(acm_retry_handler, NULL); acm_log(1, "starting server\n"); acm_server(); acm_log(0, "shutting down\n"); fclose(flog); return 0; } From sean.hefty at intel.com Thu Sep 17 00:20:01 2009 From: sean.hefty at intel.com (Sean Hefty) Date: Thu, 17 Sep 2009 00:20:01 -0700 Subject: [ofa-general] RE: [ofw] [RFC] 2/5: IB ACM: windows abstractions In-Reply-To: <38877AA1B953874AAE8FBAEE42569B101CB2AE9B@TK5EX14MBXW651.wingroup.windeploy.ntdev.microsoft.com> References: <38877AA1B953874AAE8FBAEE42569B101CB2AE9B@TK5EX14MBXW651.wingroup.windeploy.ntdev.microsoft.com> Message-ID: <18C51D2B511F47809BE3811E57C773CA@amr.corp.intel.com> >> An attempt was made to limit the number of dependencies on external >> libraries, >> such as complib. We add Windows support for the Linux 'search' binary >> tree interfaces. This is implemented on Windows using complib >> fleximap, but >> gets linked in statically. > >Should we make complib a static library? I don't know if the memory savings >are that great when most users only use a fraction of the complib >functionality. If I have time I'll take a look at file sizes and report back. I don't know if it matters much. Making it static could actually make the overall stack smaller. For windows, it's trivial to pull in the needed complib source files statically, since it's one big build tree. I initially tried to limit the acm to only being dependent on libibverbs, but it's just too much work long term to avoid using libibumad. 
But I really don't want to have it depend on opensm in order to pick up complib when a short header file can provide what's needed.

>
>> void *tsearch(const void *key, void **rootp,
>> 	int (*compar)(const void *, const void *))
>> {
>> 	cl_fmap_item_t *item, *map_item;
>>
>> 	if (!*rootp) {
>> 		*rootp = malloc(sizeof(cl_fmap_t));
>
>You need to check that malloc returned you memory.
>
>> 		cl_fmap_init((cl_fmap_t *) *rootp, fcompare);
>> 	}
>>
>> 	compare = compar;
>> 	item = malloc(sizeof(cl_fmap_item_t));
>
>Ditto.

er... yeah

From vlad at lists.openfabrics.org Thu Sep 17 03:06:34 2009
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Thu, 17 Sep 2009 03:06:34 -0700 (PDT)
Subject: [ofa-general] ofa_1_5_kernel 20090917-0200 daily build status
Message-ID: <20090917100634.6BF01E620DE@openfabrics.org>

This email was generated automatically, please do not reply

git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git
git_branch: ofed_kernel_1_5

Common build parameters:

Passed:
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.26
Passed on i686 with linux-2.6.24
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.27
Passed on x86_64 with linux-2.6.16.60-0.21-smp
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.18-128.el5
Passed on x86_64 with linux-2.6.18-93.el5
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.24
Passed on x86_64 with linux-2.6.25
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.26
Passed on x86_64 with linux-2.6.27
Passed on x86_64 with linux-2.6.9-78.ELsmp
Passed on x86_64 with linux-2.6.9-67.ELsmp
Passed on ia64 with linux-2.6.18
Passed on ia64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.24
Passed on ia64 with linux-2.6.22
Passed on ia64 with
linux-2.6.26
Passed on ia64 with linux-2.6.25
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.19

Failed:

From monis at Voltaire.COM Thu Sep 17 03:16:12 2009
From: monis at Voltaire.COM (Moni Shoua)
Date: Thu, 17 Sep 2009 13:16:12 +0300
Subject: [ofa-general] [PATCH] IB/ipoib: Do not turn on carrier to a non active port
Message-ID: <4AB20C6C.9090005@Voltaire.COM>

This patch fixes https://bugs.openfabrics.org/show_bug.cgi?id=1726

A multicast join can succeed even if the IB port is down. This happens when OpenSM runs on the same port as the requesting port. IPoIB, on the other hand, calls netif_carrier_on() when the join succeeds, without checking the state of the IB port. The result is an IPoIB interface in RUNNING state but without an active IB port to support it. If a bonding interface uses this IPoIB interface as a slave, it might not detect that this slave is almost useless, and failover functionality will be impaired.

The fix here is to check the state of the IB port in the carrier_on_task before calling netif_carrier_on().
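
The retry behaviour described above can be sketched in userspace C. This is only a simulation under stated assumptions, not driver code: every demo_* name is hypothetical, the port query is faked with a countdown, and oper_up stands in for IPOIB_FLAG_OPER_UP. The real patch relies on ib_query_port() and queue_delayed_work().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel state; not the IPoIB driver's types. */
enum demo_port_state { DEMO_PORT_DOWN, DEMO_PORT_ACTIVE };

struct demo_iface {
	bool oper_up;		/* models IPOIB_FLAG_OPER_UP */
	bool carrier_on;	/* models the netdev carrier state */
	int down_polls_left;	/* simulation: queries that still report "down" */
};

/* Simulated port query: reports DOWN until the countdown reaches zero. */
static enum demo_port_state demo_query_port(struct demo_iface *ifc)
{
	if (ifc->down_polls_left > 0) {
		ifc->down_polls_left--;
		return DEMO_PORT_DOWN;
	}
	return DEMO_PORT_ACTIVE;
}

/*
 * One run of the carrier-on task.
 * Returns true when the task has to be queued again after a delay.
 */
static bool demo_carrier_on_task(struct demo_iface *ifc)
{
	assert(ifc != NULL);

	if (demo_query_port(ifc) != DEMO_PORT_ACTIVE) {
		/* Retry only while the interface is still operationally up. */
		return ifc->oper_up;
	}
	ifc->carrier_on = true;
	return false;
}

/* Drive the task the way a delayed workqueue would; returns the run count. */
static int demo_run_task(struct demo_iface *ifc, int max_runs)
{
	int runs = 0;

	while (runs < max_runs) {
		runs++;
		if (!demo_carrier_on_task(ifc))
			break;
	}
	return runs;
}
```

Calling demo_run_task() on an interface whose port reports "down" twice results in three runs before the carrier is turned on. With oper_up cleared the task runs once and gives up, which mirrors the cancel_delayed_work() call added to ipoib_ib_dev_down().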
Signed-off-by: Moni Shoua --- drivers/infiniband/ulp/ipoib/ipoib.h | 2 +- drivers/infiniband/ulp/ipoib/ipoib_ib.c | 2 ++ drivers/infiniband/ulp/ipoib/ipoib_main.c | 2 +- drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 13 +++++++++++-- 4 files changed, 15 insertions(+), 4 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index 753a983..f29ce14 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -292,7 +292,7 @@ struct ipoib_dev_priv { struct delayed_work pkey_poll_task; struct delayed_work mcast_task; - struct work_struct carrier_on_task; + struct delayed_work carrier_on_task; struct work_struct flush_light; struct work_struct flush_normal; struct work_struct flush_heavy; diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c index e35f4a0..c452089 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c @@ -724,6 +724,8 @@ int ipoib_ib_dev_down(struct net_device *dev, int flush) ipoib_dbg(priv, "downing ib_dev\n"); clear_bit(IPOIB_FLAG_OPER_UP, &priv->flags); + cancel_delayed_work(&priv->carrier_on_task); + netif_carrier_off(dev); /* Shutdown the P_Key thread if still active */ diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index 2bf5116..5242e0d 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -1079,7 +1079,7 @@ static void ipoib_setup(struct net_device *dev) INIT_DELAYED_WORK(&priv->pkey_poll_task, ipoib_pkey_poll); INIT_DELAYED_WORK(&priv->mcast_task, ipoib_mcast_join_task); - INIT_WORK(&priv->carrier_on_task, ipoib_mcast_carrier_on_task); + INIT_DELAYED_WORK(&priv->carrier_on_task, ipoib_mcast_carrier_on_task); INIT_WORK(&priv->flush_light, ipoib_ib_dev_flush_light); INIT_WORK(&priv->flush_normal, ipoib_ib_dev_flush_normal); INIT_WORK(&priv->flush_heavy, 
ipoib_ib_dev_flush_heavy); diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c index 25874fc..b4b4016 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c @@ -361,13 +361,22 @@ static int ipoib_mcast_sendonly_join(struct ipoib_mcast *mcast) void ipoib_mcast_carrier_on_task(struct work_struct *work) { struct ipoib_dev_priv *priv = container_of(work, struct ipoib_dev_priv, - carrier_on_task); + carrier_on_task.work); + struct ib_port_attr attr; /* * Take rtnl_lock to avoid racing with ipoib_stop() and * turning the carrier back on while a device is being * removed. */ + + if (ib_query_port(priv->ca, priv->port, &attr) || + attr.state != IB_PORT_ACTIVE) { + ipoib_dbg(priv, "wait with carrier until IB port is active\n"); + if (test_bit(IPOIB_FLAG_OPER_UP, &priv->flags)) + queue_delayed_work(ipoib_workqueue, &priv->carrier_on_task, HZ); + return; + } rtnl_lock(); netif_carrier_on(priv->dev); rtnl_unlock(); @@ -403,7 +412,7 @@ static int ipoib_mcast_join_complete(int status, * deadlock on rtnl_lock here. */ if (mcast == priv->broadcast) - queue_work(ipoib_workqueue, &priv->carrier_on_task); + queue_delayed_work(ipoib_workqueue, &priv->carrier_on_task, 0); return 0; } From bart.vanassche at gmail.com Thu Sep 17 03:22:11 2009 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Thu, 17 Sep 2009 12:22:11 +0200 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4AAFC42D.4030708@vlnb.net> <4AAFC794.7090205@vlnb.net> <4AAFCA77.6050305@vlnb.net> <4AB12B40.9050902@vlnb.net> Message-ID: On Wed, Sep 16, 2009 at 9:41 PM, Chris Worley wrote: > > On Wed, Sep 16, 2009 at 12:15 PM, Vladislav Bolkhovitin wrote: > > Chris Worley, on 09/16/2009 12:51 AM wrote: > > > > > > On Tue, Sep 15, 2009 at 11:10 AM, Vladislav Bolkhovitin > > > wrote: > > > [ ... 
] > > > [  357.250550] ib_srpt: srpt_xmit_response: tag= 38 channel in bad state 2 > > > [  357.250553] scst: ***ERROR***: Target driver ib_srpt > > > xmit_response() returned fatal error > > > > It's because srpt called scst_tgt_cmd_done() when the corresponding command > > hasn't yet been sent to xmit_response() callback, so srpt should use another > > function to abort commands in this state. > > Could this be related to the hang (i.e. the command has been aborted > before xmit_response has been called... but w/o causing a panic)? When analyzing such logs it's important to distinguish between cause and consequence. What happened first is that the OFED SRP initiator noticed that something went wrong with the IB communication, as indicated by the log message "srp_qp_in_err_timer called". This means that an error occurred in the IB network or in one of the two IB stacks. This resulted in the SRP initiator trying to log in again without an intervening logout. The error messages logged by SRPT are a consequence of the initiator relogin. While the SRPT issue will be fixed, such a fix won't solve the slow reads and the hang you observed. Regarding the SRP communication problems you observed: since my attempts to reproduce this issue have been unsuccessful so far, I'm afraid these communication problems are caused by some component in your IB network that is not working as reliably as it should. By the way, the description of the patch that generated the message "srp_qp_in_err_timer called" is interesting. The patch description indicates that the condition "srp_qp_in_err_timer called" should only happen during multipath failover. See also http://www.mail-archive.com/ewg at lists.openfabrics.org/msg01959.html (which is not the latest version of this patch). Bart. 
From bart.vanassche at gmail.com Thu Sep 17 03:47:26 2009 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Thu, 17 Sep 2009 12:47:26 +0200 Subject: [ofa-general] Re: Merge process for OFED patches In-Reply-To: <4AB0FA4D.1020009@voltaire.com> References: <4AB0FA4D.1020009@voltaire.com> Message-ID: On Wed, Sep 16, 2009 at 4:46 PM, Or Gerlitz wrote: > > Bart Van Assche wrote: >> >> I would like to contact the author of the fourth patch. But unfortunately I could not find any author information in that patch. > > yes, non signed and  unreviewed patches is a common practice of ofed, does this create legal issues? maybe that would be the way to stop this? The reason I was looking for a patch description and patch author is that I would like to know why a certain message was generated by the OFED 1.4.1 SRP initiator. A description in the patch or an e-mail address of the patch author would have helped me to find this out. By the way, after I sent the e-mail at the start of this thread I have found this information via a web search. Bart. From hal.rosenstock at gmail.com Thu Sep 17 04:32:24 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Thu, 17 Sep 2009 07:32:24 -0400 Subject: [ofa-general] Re: osm_link_mgr.c:link_mgr_get_smsl question In-Reply-To: <20090830120011.GG21909@me> References: <20090829204508.GH21238@me> <20090830120011.GG21909@me> Message-ID: On Sun, Aug 30, 2009 at 8:00 AM, Sasha Khapyorsky wrote: > On 07:32 Sun 30 Aug , Hal Rosenstock wrote: > > > > > > > > osm_link_mgr.c:link_mgr_get_smsl has the following: > > > > > > > > /* Find osm_port of the source = p_physp */ > > > > slid = osm_physp_get_base_lid(p_physp); > > > > p_src_port = > > > > cl_ptr_vector_get(&sm->p_subn->port_lid_tbl, > > > cl_ntoh16(slid)); > > > > > > > > /* Call lash to find proper SL */ > > > > sl = osm_get_lash_sl(p_osm, p_src_port, p_sm_port); > > > > > > > > It may be that this code is invoked prior to the LID being assigned > > > > > > How is it possible? 
In the code I can see that link_mgr_process() is > > > always executed after lid_mgr run. > > > > When nodes use gPXE, the LID is not passed from the gPXE to the Linux > > environment. > > How is it related to gPXE? > > OpenSM's lid manager runs and assigns lids to all available endports, > only after this link manager runs and try with SMSL - at this point all > lids should be in place and p_subn->port_lid_tbl should be fine. > Is that (lids in place) always the case? What if the sets of PortInfo for the LID fail? -- Hal > > Am I missing something? > > Sasha > From peterz at infradead.org Thu Sep 17 04:30:28 2009 From: peterz at infradead.org (Peter Zijlstra) Date: Thu, 17 Sep 2009 13:30:28 +0200 Subject: [ofa-general] Re: [GIT PULL] please pull ummunotify In-Reply-To: References: Message-ID: <1253187028.8439.2.camel@twins> On Thu, 2009-09-10 at 21:38 -0700, Roland Dreier wrote: > Linus, please consider pulling from > > master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git ummunotify > > This tree is also available from kernel.org mirrors at: > > git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git ummunotify > > This will get "ummunotify," a new character device that allows a > userspace library to register for MMU notifications; this is > particularly useful for MPI implementations (message passing libraries > used in HPC) to be able to keep track of what wacky things consumers > do to their memory mappings. My colleague Jeff Squyres from the Open > MPI project posted a blog entry about why MPI wants this: > > http://blogs.cisco.com/ciscotalk/performance/comments/better_linux_memory_tracking/ > > His summary of ummunotify: > > "It’s elegant, doesn’t require strange linker tricks, and seems to > work in all cases. Yay!" > > This code went through several review iterations on lkml and was in > -mm and -next for quite a few weeks. 
Andrew is OK with merging it (I > think -- Andrew please correct me if I misunderstood you). Anton Blanchard suggested a while back that this might be integrated with perf-counters, since perf-counters already does mmap() tracking and also provides events through an mmap()'ed buffer. Has anybody looked into this? If someone did and I missed the discussion on why it isn't appropriate, kindly point me in the right direction ;-) From ftillier at microsoft.com Wed Sep 16 23:35:07 2009 From: ftillier at microsoft.com (Fab Tillier) Date: Thu, 17 Sep 2009 06:35:07 +0000 Subject: [ofa-general] RE: [ofw] [RFC] 2/5: IB ACM: windows abstractions In-Reply-To: References: Message-ID: <38877AA1B953874AAE8FBAEE42569B101CB2AE9B@TK5EX14MBXW651.wingroup.windeploy.ntdev.microsoft.com> > -----Original Message----- > From: ofw-bounces at lists.openfabrics.org [mailto:ofw- > bounces at lists.openfabrics.org] On Behalf Of Sean Hefty > Sent: Wednesday, September 16, 2009 11:28 PM > To: Hefty, Sean; ofw at lists.openfabrics.org; OpenFabrics General > Subject: [ofw] [RFC] 2/5: IB ACM: windows abstractions > > The following abstractions are defined to support the IB ACM running on > Windows. > > An attempt was made to limit the number of dependencies on external > libraries, > such as complib. We add Windows support for the Linux 'search' binary > tree interfaces. This is implemented on Windows using complib > fleximap, but > gets linked in statically. Should we make complib a static library? I don't know if the memory savings are that great when most users only use a fraction of the complib functionality. If I have time I'll take a look at file sizes and report back. > void *tsearch(const void *key, void **rootp, > int (*compar)(const void *, const void *)) > { > cl_fmap_item_t *item, *map_item; > > if (!*rootp) { > *rootp = malloc(sizeof(cl_fmap_t)); You need to check that malloc returned you memory. 
> cl_fmap_init((cl_fmap_t *) *rootp, fcompare); > } > > compare = compar; > item = malloc(sizeof(cl_fmap_item_t)); Ditto. > map_item = cl_fmap_insert((cl_fmap_t *) *rootp, key, item); > if (map_item != item) > free(item); > > return (void *) &map_item->p_key; > } -Fab From rdreier at cisco.com Thu Sep 17 07:24:45 2009 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 17 Sep 2009 07:24:45 -0700 Subject: [ofa-general] Re: [GIT PULL] please pull ummunotify In-Reply-To: <1253187028.8439.2.camel@twins> (Peter Zijlstra's message of "Thu, 17 Sep 2009 13:30:28 +0200") References: <1253187028.8439.2.camel@twins> Message-ID: > Anton Blanchard suggested a while back that this might be integrated > with perf-counters, since perf-counters already does mmap() tracking and > also provides events through an mmap()'ed buffer. > > Has anybody looked into this? I didn't see the original suggestion. Certainly hooking into existing infrastructure for user/kernel communication would be good. The fit doesn't seem great to me, although I am rather naive about perf counters. The problem that ummunotify is trying to solve is to let an app say 'for these 1000 address ranges (that possibly only cover a small part of my total address space), please let me know when the mappings are invalidated for any reason'. So getting those events in the kernel is no problem -- we have the MMU notifier hooks that tell us exactly what we need to know. The issue is purely the way userspace registers interest in address ranges, and how the kernel returns the events. For perf counters it seems that one would have to create a new counter for each address range... is that correct? And also I don't know if perf counter has an analog for the fast path optimization that ummunotify provides via an mmap'ed generation counter (a quick way for userspace to see 'nothing happened since last time you checked'). - R. 
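The generation-counter fast path mentioned at the end of the message above can be sketched roughly as follows. This is only an illustration of the idea, not the ummunotify interface: struct gen_page, struct reg_cache and cache_needs_refresh() are hypothetical names, and the counter is an ordinary variable here rather than a page mapped from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a counter the kernel side would expose via mmap(). */
struct gen_page {
	volatile uint64_t generation;	/* bumped by the producer on every invalidation */
};

/* Bookkeeping kept by the userspace library (illustrative only). */
struct reg_cache {
	const struct gen_page *page;	/* shared counter */
	uint64_t last_seen;		/* value observed at the previous check */
};

/*
 * Fast path: a single load and compare. Returns false when nothing has
 * been invalidated since the previous call, so the caller can keep using
 * its cached registrations without reading any events.
 */
bool cache_needs_refresh(struct reg_cache *c)
{
	assert(c != NULL && c->page != NULL);

	uint64_t now = c->page->generation;

	if (now == c->last_seen)
		return false;

	c->last_seen = now;
	return true;
}
```

Only when cache_needs_refresh() returns true would the library go on to read the actual invalidation events. A real implementation would also need to consider memory ordering between the counter and the event data.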
From rdreier at cisco.com Thu Sep 17 07:32:47 2009 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 17 Sep 2009 07:32:47 -0700 Subject: [ofa-general] Re: [GIT PULL] please pull ummunotify In-Reply-To: (Roland Dreier's message of "Thu, 17 Sep 2009 07:24:45 -0700") References: <1253187028.8439.2.camel@twins> Message-ID: > So getting those events in the kernel is no problem -- we have the MMU > notifier hooks that tell us exactly what we need to know. The issue is > purely the way userspace registers interest in address ranges, and how > the kernel returns the events. > > For perf counters it seems that one would have to create a new counter > for each address range... is that correct? And also I don't know if > perf counter has an analog for the fast path optimization that > ummunotify provides via an mmap'ed generation counter (a quick way for > userspace to see 'nothing happened since last time you checked'). Oh I forgot... ummunotify also preallocates everything etc. so that there is no way for events to be lost. Which saves userspace from having to trash everything cached and start over, which it would have to do if it misses an invalidate event. And AFAIK, perf counters do have the possibility of overflowing a buffer and losing an event, right? - R. From peterz at infradead.org Thu Sep 17 07:43:13 2009 From: peterz at infradead.org (Peter Zijlstra) Date: Thu, 17 Sep 2009 16:43:13 +0200 Subject: [ofa-general] Re: [GIT PULL] please pull ummunotify In-Reply-To: References: <1253187028.8439.2.camel@twins> Message-ID: <1253198593.14935.20.camel@laptop> On Thu, 2009-09-17 at 07:24 -0700, Roland Dreier wrote: > > Anton Blanchard suggested a while back that this might be integrated > > with perf-counters, since perf-counters already does mmap() tracking and > > also provides events through an mmap()'ed buffer. > > > > Has anybody looked into this? > > I didn't see the original suggestion. 
Certainly hooking into existing > infrastructure for user/kernel communication would be good. > > The fit doesn't seem great to me, although I am rather naive about perf > counters. The problem that ummunotify is trying to solve is to let an > app say 'for these 1000 address ranges (that possibly only cover a small > part of my total address space), please let me know when the mappings > are invalidated for any reason'. > > So getting those events in the kernel is no problem -- we have the MMU > notifier hooks that tell us exactly what we need to know. The issue is > purely the way userspace registers interest in address ranges, and how > the kernel returns the events. > > For perf counters it seems that one would have to create a new counter > for each address range... is that correct? And also I don't know if > perf counter has an analog for the fast path optimization that > ummunotify provides via an mmap'ed generation counter (a quick way for > userspace to see 'nothing happened since last time you checked'). You're right in that perf-counter currently doesn't provide a way to specify these ranges; we simply track all mmap() traffic. The thing is that mmap() data is basically a side channel. For your usage you'd basically have to open a NOP counter and only observe the mmap data. We could look at ways of adding ranges.. We do have a means of detecting if new data is available, we keep a data head index. If that moves, you've got new stuff. From peterz at infradead.org Thu Sep 17 07:49:36 2009 From: peterz at infradead.org (Peter Zijlstra) Date: Thu, 17 Sep 2009 16:49:36 +0200 Subject: [ofa-general] Re: [GIT PULL] please pull ummunotify In-Reply-To: References: <1253187028.8439.2.camel@twins> Message-ID: <1253198976.14935.27.camel@laptop> On Thu, 2009-09-17 at 07:32 -0700, Roland Dreier wrote: > > So getting those events in the kernel is no problem -- we have the MMU > > notifier hooks that tell us exactly what we need to know. 
The issue is > > purely the way userspace registers interest in address ranges, and how > > the kernel returns the events. > > > > For perf counters it seems that one would have to create a new counter > > for each address range... is that correct? And also I don't know if > > perf counter has an analog for the fast path optimization that > > ummunotify provides via an mmap'ed generation counter (a quick way for > > userspace to see 'nothing happened since last time you checked'). > > Oh I forgot... ummunotify also preallocates everything etc. so that > there is no way for events to be lost. Which saves userspace from > having to trash everything cached and start over, which it would have to > do if it misses an invalidate event. > > And AFAIK, perf counters do have the possibility of overflowing a > buffer and losing an event, right? Well, you cannot pre-allocate everything: either you get back-logged events in kernel space leading to a kernel DoS, or you lose events. Perf counters have two modes, a RO mmap() and a RW mmap(). The RO mode will automagically overwrite its tail data without regard for userspace having observed it. In the RW mode userspace has to advance the tail, the kernel will drop events when full and insert a PERF_EVENT_LOST event once there is room again. Hmm, or are you saying you can only get 1 event per registered range and allocate the thing on registration? That'd need some registration limit to avoid DoS scenarios. 
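The RW-mode behaviour described in the message above (drop events while the buffer is full, then report the loss once there is room) can be sketched as a small ring buffer. This is a simplified illustration under assumed names, not the perf counter record format: struct ring, ring_push(), ring_pop() and the EV_LOST marker are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SLOTS 4		/* illustrative size */
#define EV_LOST 0xffffffffu	/* hypothetical marker standing in for a "lost" record */

struct event {
	uint32_t type;
	uint64_t data;
};

struct ring {
	struct event slot[RING_SLOTS];
	size_t head;	/* advanced by the producer */
	size_t tail;	/* advanced by the consumer */
	uint64_t lost;	/* events dropped since the last EV_LOST record */
};

static size_t ring_free(const struct ring *r)
{
	return RING_SLOTS - (r->head - r->tail);
}

static void ring_store(struct ring *r, uint32_t type, uint64_t data)
{
	struct event *e = &r->slot[r->head % RING_SLOTS];

	e->type = type;
	e->data = data;
	r->head++;
}

/* Producer side: returns 1 if the event was stored, 0 if it was dropped. */
int ring_push(struct ring *r, uint32_t type, uint64_t data)
{
	/* If drops are pending, an EV_LOST record must fit in front of the event. */
	size_t need = r->lost ? 2 : 1;

	if (ring_free(r) < need) {
		r->lost++;
		return 0;
	}
	if (r->lost) {
		ring_store(r, EV_LOST, r->lost);
		r->lost = 0;
	}
	ring_store(r, type, data);
	return 1;
}

/* Consumer side: the reader advances the tail. Returns 0 when the ring is empty. */
int ring_pop(struct ring *r, struct event *out)
{
	if (r->head == r->tail)
		return 0;
	*out = r->slot[r->tail % RING_SLOTS];
	r->tail++;
	return 1;
}
```

A consumer that sees EV_LOST knows how many events were dropped but not which ones, which is why a cache built on such a stream would have to be discarded and rebuilt after an overflow.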
From rdreier at cisco.com Thu Sep 17 08:02:04 2009 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 17 Sep 2009 08:02:04 -0700 Subject: [ofa-general] Re: [PATCH] IB/ipoib: Do not turn on carrier to a non active port In-Reply-To: <4AB20C6C.9090005@Voltaire.COM> (Moni Shoua's message of "Thu, 17 Sep 2009 13:16:12 +0300") References: <4AB20C6C.9090005@Voltaire.COM> Message-ID: > + if (ib_query_port(priv->ca, priv->port, &attr) || > + attr.state != IB_PORT_ACTIVE) { > + ipoib_dbg(priv, "wait with carrier until IB port is active\n"); > + if (test_bit(IPOIB_FLAG_OPER_UP, &priv->flags)) > + queue_delayed_work(ipoib_workqueue, &priv->carrier_on_task, HZ); > + return; > + } Queueing delayed work to poll the port state seems a bit odd to me... we get an event when the port changes state anyway, right? So can't we just turn the carrier on when we get an active event? - R. From rdreier at cisco.com Thu Sep 17 08:03:25 2009 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 17 Sep 2009 08:03:25 -0700 Subject: [ofa-general] Re: [GIT PULL] please pull ummunotify In-Reply-To: <1253198976.14935.27.camel@laptop> (Peter Zijlstra's message of "Thu, 17 Sep 2009 16:49:36 +0200") References: <1253187028.8439.2.camel@twins> <1253198976.14935.27.camel@laptop> Message-ID: > Hmm, or are you saying you can only get 1 event per registered range and > allocate the thing on registration? That'd need some registration limit > to avoid DoS scenarios. Yes, that's what I do. You're right, I should add a limit... although there are lots of ways for userspace to consume arbitrary amounts of kernel resources already. - R. 
From rdreier at cisco.com Thu Sep 17 08:45:29 2009 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 17 Sep 2009 08:45:29 -0700 Subject: [ofa-general] Re: [GIT PULL] please pull ummunotify In-Reply-To: (Roland Dreier's message of "Thu, 17 Sep 2009 08:03:25 -0700") References: <1253187028.8439.2.camel@twins> <1253198976.14935.27.camel@laptop> Message-ID: > > > Hmm, or are you saying you can only get 1 event per registered range and > > > allocate the thing on registration? That'd need some registration limit > > > to avoid DoS scenarios. > > > > Yes, that's what I do. You're right, I should add a limit... although > > there are lots of ways for userspace to consume arbitrary amounts of > > kernel resources already. > > It'd be good to work at reducing that number, not adding to it ;-) Yes, definitely. I'll add a quick ummunotify module parameter that limits the number of registrations per process. > But yeah, I currently don't see a very nice match to perf counters. OK. It would be nice to tie into something more general, but I think I agree -- perf counters are missing the filtering and the "no lost events" that ummunotify does have. And I'm not sure it's worth messing up the perf counters design just to jam one more not totally related thing in. - R. 
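The "one preallocated event per registered range, plus a registration limit" scheme discussed in this exchange might look roughly like the sketch below. It is not taken from the ummunotify source: MAX_REGS, struct notifier, reg_add(), invalidate() and next_event() are hypothetical names chosen for illustration, and locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_REGS 3	/* hypothetical per-process registration limit */

struct reg {
	uint64_t start;	/* registered range is [start, end) */
	uint64_t end;
	bool in_use;
	bool pending;	/* the single event slot, allocated with the registration */
};

struct notifier {
	struct reg regs[MAX_REGS];
};

/* Register a range. Returns its index, or -1 once the limit is reached. */
int reg_add(struct notifier *n, uint64_t start, uint64_t end)
{
	assert(n != NULL && start < end);

	for (size_t i = 0; i < MAX_REGS; i++) {
		if (!n->regs[i].in_use) {
			n->regs[i] = (struct reg){ start, end, true, false };
			return (int)i;
		}
	}
	return -1;
}

/*
 * Mark every registration overlapping [start, end) as invalidated.
 * Repeated invalidations of the same range collapse into one pending
 * event, so no allocation happens here and nothing can overflow.
 * Returns the number of registrations newly marked.
 */
int invalidate(struct notifier *n, uint64_t start, uint64_t end)
{
	int marked = 0;

	for (size_t i = 0; i < MAX_REGS; i++) {
		struct reg *r = &n->regs[i];

		if (r->in_use && !r->pending && r->start < end && start < r->end) {
			r->pending = true;
			marked++;
		}
	}
	return marked;
}

/* Consume one pending event. Returns the registration index, or -1 if none. */
int next_event(struct notifier *n)
{
	for (size_t i = 0; i < MAX_REGS; i++) {
		if (n->regs[i].in_use && n->regs[i].pending) {
			n->regs[i].pending = false;
			return (int)i;
		}
	}
	return -1;
}
```

Because the worst-case number of queued events equals the number of registrations, capping registrations also bounds the memory held on behalf of a process, which is the DoS concern raised above.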
From rdreier at cisco.com Thu Sep 17 08:50:18 2009 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 17 Sep 2009 08:50:18 -0700 Subject: [ofa-general] Re: [PATCH] IB/ipoib: Do not turn on carrier to a non active port In-Reply-To: <4AB20C6C.9090005@Voltaire.COM> (Moni Shoua's message of "Thu, 17 Sep 2009 13:16:12 +0300") References: <4AB20C6C.9090005@Voltaire.COM> Message-ID: And by the way, this current patch has a deadlock I think: > @@ -724,6 +724,8 @@ int ipoib_ib_dev_down(struct net_device *dev, int flush) > ipoib_dbg(priv, "downing ib_dev\n"); > > clear_bit(IPOIB_FLAG_OPER_UP, &priv->flags); > + cancel_delayed_work(&priv->carrier_on_task); ipoib_ib_dev_down() is called with rtnl held but carrier_on_task() does rtnl_lock(). So if carrier_on_task() is running but about to take the rtnl when we try to do cancel_delayed_work() here, then it will wait forever. I think using lockdep on a new enough kernel (2.6.30 or maybe 2.6.31) will report workqueue / timer vs. lock deadlocks. - R. From peterz at infradead.org Thu Sep 17 08:22:24 2009 From: peterz at infradead.org (Peter Zijlstra) Date: Thu, 17 Sep 2009 17:22:24 +0200 Subject: [ofa-general] Re: [GIT PULL] please pull ummunotify In-Reply-To: References: <1253187028.8439.2.camel@twins> <1253198976.14935.27.camel@laptop> Message-ID: <1253200944.14935.29.camel@laptop> On Thu, 2009-09-17 at 08:03 -0700, Roland Dreier wrote: > > Hmm, or are you saying you can only get 1 event per registered range and > > allocate the thing on registration? That'd need some registration limit > > to avoid DoS scenarios. > > Yes, that's what I do. You're right, I should add a limit... although > there are lots of ways for userspace to consume arbitrary amounts of > kernel resources already. It'd be good to work at reducing that number, not adding to it ;-) But yeah, I currently don't see a very nice match to perf counters. 
From weiny2 at llnl.gov Thu Sep 17 10:18:04 2009 From: weiny2 at llnl.gov (Ira Weiny) Date: Thu, 17 Sep 2009 10:18:04 -0700 Subject: [ofa-general] [RFC] 3/5: IB ACM: libibacm In-Reply-To: References: Message-ID: <20090917101804.12e9e5ce.weiny2@llnl.gov> On Wed, 16 Sep 2009 23:45:05 -0700 "Sean Hefty" wrote: > Add an end-user library with simple interfaces for communicating > with the ib_acm service. > > The linux and windows specific files for the library are simple and not > shown for this review > > Signed-off-by: Sean Hefty > --- > > ib_acm.h: defines library interfaces. > These are the end-user application interfaces to the ib acm. > [snip] > > #define IB_PATH_RECORD_REVERSIBLE 0x80 > > struct ib_path_record > { > uint64_t service_id; > union ibv_gid dgid; > union ibv_gid sgid; > uint16_t dlid; > uint16_t slid; > uint32_t flowlabel_hoplimit; /* resv-31:28 flow label-27:8 hop limit-7:0*/ > uint8_t tclass; > uint8_t reversible_numpath; /* reversible-7:7 num path-6:0 */ > uint16_t pkey; > uint16_t qosclass_sl; /* qos class-15:4 sl-3:0 */ > uint8_t mtu; /* mtu selector-7:6 mtu-5:0 */ > uint8_t rate; /* rate selector-7:6 rate-5:0 */ > uint8_t packetlifetime; /* lifetime selector-7:6 lifetime-5:0 */ > uint8_t preference; > uint8_t reserved[6]; > }; I would prefer to use the structures already defined in ib_types.h... I understand you not wanting to make ACM dependent on the OpenSM packages, so is it time to move ib_types.h out of the OpenSM tree and somewhere more generic? Perhaps libibumad? This also applies to ib_sa_mad in your 5th patch. OTOH, ib_types.h is a 10K line file with multiple long (>10 lines) inlined functions. Perhaps it deserves its own library? 
Ira [snip] -- Ira Weiny Math Programmer/Computer Scientist Lawrence Livermore National Lab 925-423-8008 weiny2 at llnl.gov From sean.hefty at intel.com Thu Sep 17 10:35:39 2009 From: sean.hefty at intel.com (Sean Hefty) Date: Thu, 17 Sep 2009 10:35:39 -0700 Subject: [ofa-general] [RFC] 3/5: IB ACM: libibacm In-Reply-To: <20090917101804.12e9e5ce.weiny2@llnl.gov> References: <20090917101804.12e9e5ce.weiny2@llnl.gov> Message-ID: >> #define IB_PATH_RECORD_REVERSIBLE 0x80 >> >> struct ib_path_record >> { >> uint64_t service_id; >> union ibv_gid dgid; >> union ibv_gid sgid; >> uint16_t dlid; >> uint16_t slid; >> uint32_t flowlabel_hoplimit; /* resv-31:28 flow label-27:8 hop >limit-7:0*/ >> uint8_t tclass; >> uint8_t reversible_numpath; /* reversible-7:7 num path-6:0 */ >> uint16_t pkey; >> uint16_t qosclass_sl; /* qos class-15:4 sl-3:0 */ >> uint8_t mtu; /* mtu selector-7:6 mtu-5:0 */ >> uint8_t rate; /* rate selector-7:6 rate-5:0 */ >> uint8_t packetlifetime; /* lifetime selector-7:6 lifetime-5:0 >*/ >> uint8_t preference; >> uint8_t reserved[6]; >> }; > >I would prefer to use the structures already defined in ib_types.h... I >understand you not wanting to make ACM dependent on the OpenSM packages, so is >it time to move ib_types.h out of the OpenSM tree and somewhere more generic? >Perhaps libibumad? This also applies to ib_sa_mad in your 5th patch. > >OTOH, ib_types.h is a 10K line file with multiple long (>10 lines) inlined >functions. Perhaps it deserves its own library? Defining some of these types in libibumad isn't a bad idea. Although, WinOF actually has 2 copies of ib_types.h (that differ...) I find using ib_types.h painful given its size; separate header files may help. 
- Sean From sashak at voltaire.com Thu Sep 17 11:04:19 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Thu, 17 Sep 2009 21:04:19 +0300 Subject: [ofa-general] [PATCH] opensm: fix uninitialized return value in osm_sm_mcgrp_leave() In-Reply-To: <20090906173900.GK25241@me> References: <20090906154901.GF25241@me> <20090906154931.GG25241@me> <20090906173900.GK25241@me> Message-ID: <20090917180419.GA23406@me> Fix uninitialized return value in osm_sm_mcgrp_leave() function. Signed-off-by: Sasha Khapyorsky --- opensm/opensm/osm_sm.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/opensm/opensm/osm_sm.c b/opensm/opensm/osm_sm.c index e446c9d..f3fa7f4 100644 --- a/opensm/opensm/osm_sm.c +++ b/opensm/opensm/osm_sm.c @@ -516,7 +516,7 @@ ib_api_status_t osm_sm_mcgrp_leave(IN osm_sm_t * p_sm, IN osm_mgrp_t *mgrp, IN const ib_net64_t port_guid) { osm_port_t *p_port; - ib_api_status_t status; + ib_api_status_t status = IB_SUCCESS; OSM_LOG_ENTER(p_sm->p_log); -- 1.6.5.rc1 From arlin.r.davis at intel.com Thu Sep 17 11:32:16 2009 From: arlin.r.davis at intel.com (Davis, Arlin R) Date: Thu, 17 Sep 2009 11:32:16 -0700 Subject: [ofa-general] [PATCH] dapl v2: common: no cleanup/release code for timer thread Message-ID: dapl_set_timer() creates a thread to process timers for dat_ep_connect but provides no mechanism to destroy/exit during dapl library unload. Timers are initialized in library init code and should be released in the fini code. Add a dapl_timer_release call to the dapl_fini function to check state of timer thread and destroy before exiting. 
Signed-off-by: Arlin Davis --- dapl/common/dapl_timer_util.c | 48 ++++++++++++++++++++++++++++++++++++++++- dapl/common/dapl_timer_util.h | 1 + dapl/udapl/dapl_init.c | 1 + 3 files changed, 49 insertions(+), 1 deletions(-) diff --git a/dapl/common/dapl_timer_util.c b/dapl/common/dapl_timer_util.c index f0d7964..cccaff1 100644 --- a/dapl/common/dapl_timer_util.c +++ b/dapl/common/dapl_timer_util.c @@ -52,11 +52,17 @@ #include "dapl.h" #include "dapl_timer_util.h" +#define DAPL_TIMER_INIT 0 +#define DAPL_TIMER_RUN 1 +#define DAPL_TIMER_DESTROY 2 +#define DAPL_TIMER_EXIT 3 + struct timer_head { DAPL_LLIST_HEAD timer_list_head; DAPL_OS_LOCK lock; DAPL_OS_WAIT_OBJECT wait_object; DAPL_OS_THREAD timeout_thread_handle; + int state; } g_daplTimerHead; typedef struct timer_head DAPL_TIMER_HEAD; @@ -73,6 +79,23 @@ void dapls_timer_init() dapl_os_lock_init(&g_daplTimerHead.lock); dapl_os_wait_object_init(&g_daplTimerHead.wait_object); g_daplTimerHead.timeout_thread_handle = 0; + g_daplTimerHead.state = DAPL_TIMER_INIT; +} + +void dapls_timer_release() +{ + dapl_os_lock(&g_daplTimerHead.lock); + if (g_daplTimerHead.state != DAPL_TIMER_RUN) { + dapl_os_unlock(&g_daplTimerHead.lock); + return; + } + + g_daplTimerHead.state = DAPL_TIMER_DESTROY; + dapl_os_unlock(&g_daplTimerHead.lock); + while (g_daplTimerHead.state != DAPL_TIMER_EXIT) { + dapl_os_wait_object_wakeup(&g_daplTimerHead.wait_object); + dapl_os_sleep_usec(2000); + } } /* @@ -107,6 +130,9 @@ dapls_timer_set(IN DAPL_OS_TIMER * timer, dapl_os_thread_create(dapls_timer_thread, &g_daplTimerHead, &g_daplTimerHead.timeout_thread_handle); + + while (g_daplTimerHead.state != DAPL_TIMER_RUN) + dapl_os_sleep_usec(2000); } dapl_llist_init_entry(&timer->list_entry); @@ -121,6 +147,12 @@ dapls_timer_set(IN DAPL_OS_TIMER * timer, * first. 
*/ dapl_os_lock(&g_daplTimerHead.lock); + + if (g_daplTimerHead.state != DAPL_TIMER_RUN) { + dapl_os_unlock(&g_daplTimerHead.lock); + return DAT_INVALID_STATE; + } + /* * Deal with 3 cases due to our list structure: * 1) list is empty: become the list head @@ -246,6 +278,10 @@ void dapls_timer_thread(void *arg) timer_head = arg; + dapl_os_lock(&timer_head->lock); + timer_head->state = DAPL_TIMER_RUN; + dapl_os_unlock(&timer_head->lock); + for (;;) { if (dapl_llist_is_empty(&timer_head->timer_list_head)) { dat_status = @@ -265,7 +301,9 @@ void dapls_timer_thread(void *arg) timer_list_head); dapl_os_get_time(&cur_time); - if (list_ptr->expires <= cur_time) { + if (list_ptr->expires <= cur_time || + timer_head->state == DAPL_TIMER_DESTROY) { + /* * Remove the entry from the list. Sort out how much * time we need to sleep for the next one @@ -297,6 +335,14 @@ void dapls_timer_thread(void *arg) dapl_os_lock(&timer_head->lock); } } + + /* Destroy - all timers triggered and list is empty */ + if (timer_head->state == DAPL_TIMER_DESTROY) { + timer_head->state = DAPL_TIMER_EXIT; + dapl_os_unlock(&timer_head->lock); + break; + } + /* * release the lock before going back to the top to sleep */ diff --git a/dapl/common/dapl_timer_util.h b/dapl/common/dapl_timer_util.h index c24d26a..02f7069 100644 --- a/dapl/common/dapl_timer_util.h +++ b/dapl/common/dapl_timer_util.h @@ -36,6 +36,7 @@ **********************************************************************/ void dapls_timer_init ( void ); +void dapls_timer_release( void ); DAT_RETURN dapls_timer_set ( IN DAPL_OS_TIMER *timer, diff --git a/dapl/udapl/dapl_init.c b/dapl/udapl/dapl_init.c index e0af8f7..a889ffb 100644 --- a/dapl/udapl/dapl_init.c +++ b/dapl/udapl/dapl_init.c @@ -151,6 +151,7 @@ void dapl_fini(void) } dapls_ib_release(); + dapls_timer_release(); dapl_dbg_log(DAPL_DBG_TYPE_UTIL, "DAPL: Exit (dapl_fini)\n"); -- 1.5.2.5 From arlin.r.davis at intel.com Thu Sep 17 11:32:25 2009 From: arlin.r.davis at intel.com (Arlin 
Davis) Date: Thu, 17 Sep 2009 11:32:25 -0700 Subject: [ofa-general] [PATCH] dapl v2: scm, cma: thread doesn't always get teminated on library close. Message-ID: DAPL doesn't actually wait for the async processing thread to exit before allowing the library to close. It will wait up to 10 seconds, which under heavy load isn't enough time. Since the thread is created by an application level thread, it will continue to run as long as the application runs. But if the application closes the library, then all library data and code is invalid, which can result in the thread running something that's not library code and accessing freed memory. Signed-off-by: Sean Hefty --- dapl/openib_cma/device.c | 8 +------- dapl/openib_scm/device.c | 8 +------- 2 files changed, 2 insertions(+), 14 deletions(-) diff --git a/dapl/openib_cma/device.c b/dapl/openib_cma/device.c index 743e8fa..c1c1ee2 100644 --- a/dapl/openib_cma/device.c +++ b/dapl/openib_cma/device.c @@ -562,8 +562,6 @@ DAT_RETURN dapli_ib_thread_init(void) void dapli_ib_thread_destroy(void) { - int retries = 10; - dapl_dbg_log(DAPL_DBG_TYPE_UTIL, " ib_thread_destroy(%d)\n", dapl_os_getpid()); /* @@ -578,11 +576,7 @@ void dapli_ib_thread_destroy(void) goto bail; g_ib_thread_state = IB_THREAD_CANCEL; - if (dapls_thread_signal() == -1) - dapl_log(DAPL_DBG_TYPE_UTIL, - " destroy: thread wakeup error = %s\n", - strerror(errno)); - while ((g_ib_thread_state != IB_THREAD_EXIT) && (retries--)) { + while ((g_ib_thread_state != IB_THREAD_EXIT)) { dapl_dbg_log(DAPL_DBG_TYPE_UTIL, " ib_thread_destroy: waiting for ib_thread\n"); if (dapls_thread_signal() == -1) diff --git a/dapl/openib_scm/device.c b/dapl/openib_scm/device.c index 9c91b78..bb72279 100644 --- a/dapl/openib_scm/device.c +++ b/dapl/openib_scm/device.c @@ -545,8 +545,6 @@ DAT_RETURN dapli_ib_thread_init(void) void dapli_ib_thread_destroy(void) { - int retries = 10; - dapl_dbg_log(DAPL_DBG_TYPE_UTIL, " ib_thread_destroy(%d)\n", dapl_os_getpid()); /* @@ -561,11 +559,7 @@ 
void dapli_ib_thread_destroy(void)
 		goto bail;
 
 	g_ib_thread_state = IB_THREAD_CANCEL;
-	if (dapls_thread_signal() == -1)
-		dapl_log(DAPL_DBG_TYPE_UTIL,
-			 " destroy: thread wakeup error = %s\n",
-			 strerror(errno));
-	while ((g_ib_thread_state != IB_THREAD_EXIT) && (retries--)) {
+	while (g_ib_thread_state != IB_THREAD_EXIT) {
 		dapl_dbg_log(DAPL_DBG_TYPE_UTIL,
 			     " ib_thread_destroy: waiting for ib_thread\n");
 		if (dapls_thread_signal() == -1)
-- 
1.5.2.5

From weiny2 at llnl.gov Thu Sep 17 13:20:50 2009
From: weiny2 at llnl.gov (Ira Weiny)
Date: Thu, 17 Sep 2009 13:20:50 -0700
Subject: [ofa-general] [RFC] 3/5: IB ACM: libibacm
In-Reply-To:
References: <20090917101804.12e9e5ce.weiny2@llnl.gov>
Message-ID: <20090917132050.041b077d.weiny2@llnl.gov>

On Thu, 17 Sep 2009 10:35:39 -0700
"Sean Hefty" wrote:

> >> #define IB_PATH_RECORD_REVERSIBLE 0x80
> >>
> >> struct ib_path_record
> >> {
> >> 	uint64_t service_id;
> >> 	union ibv_gid dgid;
> >> 	union ibv_gid sgid;
> >> 	uint16_t dlid;
> >> 	uint16_t slid;
> >> 	uint32_t flowlabel_hoplimit; /* resv-31:28 flow label-27:8 hop limit-7:0*/
> >> 	uint8_t tclass;
> >> 	uint8_t reversible_numpath; /* reversible-7:7 num path-6:0 */
> >> 	uint16_t pkey;
> >> 	uint16_t qosclass_sl; /* qos class-15:4 sl-3:0 */
> >> 	uint8_t mtu; /* mtu selector-7:6 mtu-5:0 */
> >> 	uint8_t rate; /* rate selector-7:6 rate-5:0 */
> >> 	uint8_t packetlifetime; /* lifetime selector-7:6 lifetime-5:0 */
> >> 	uint8_t preference;
> >> 	uint8_t reserved[6];
> >> };
> >
> >I would prefer to use the structures already defined in ib_types.h... I
> >understand your not wanting to make ACM dependant on the OpenSM packages so is
> >it time to move ib_types.h out of the OpenSM tree and somewhere more generic?
> >Perhaps libibumad? This also applies to ib_sa_mad in your 5th patch.
> >
> >OTOH, ib_types.h is a 10K line file with multiple long (>10 lines) inlined
> >functions. Perhaps it deserves it's own library?
>
> Defining some of these types in libibumad isn't a bad idea.
> Although, WinOF actually has 2 copies of ib_types.h (that differ...) I find
> using ib_types.h painful given its size; separate header files may help.

Yes I was thinking multiple headers. There seems like there is already some
precedent in ib_cm_types.h (although that entire file seems to be enclosed in
a #ifndef WIN32 clause? So am I wrong on this?)

In the end I would like to make ib_types.h just list the specific headers.

Sasha, would you be willing to accept such a patch? First move ib_types.h to
umad and then move the long inline functions into the lib and separate out the
remaining header.

Or would you prefer a new library? I think there is enough code there but I
leave it up to you.

Ira

>
> - Sean
>

-- 
Ira Weiny
Math Programmer/Computer Scientist
Lawrence Livermore National Lab
925-423-8008
weiny2 at llnl.gov

From hal.rosenstock at gmail.com Thu Sep 17 14:41:30 2009
From: hal.rosenstock at gmail.com (Hal Rosenstock)
Date: Thu, 17 Sep 2009 17:41:30 -0400
Subject: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm
In-Reply-To: <20090917132050.041b077d.weiny2@llnl.gov>
References: <20090917101804.12e9e5ce.weiny2@llnl.gov>
	<20090917132050.041b077d.weiny2@llnl.gov>
Message-ID:

On Thu, Sep 17, 2009 at 4:20 PM, Ira Weiny wrote:

> On Thu, 17 Sep 2009 10:35:39 -0700
> "Sean Hefty" wrote:
>
> > >> #define IB_PATH_RECORD_REVERSIBLE 0x80
> > >>
> > >> struct ib_path_record
> > >> {
> > >> 	uint64_t service_id;
> > >> 	union ibv_gid dgid;
> > >> 	union ibv_gid sgid;
> > >> 	uint16_t dlid;
> > >> 	uint16_t slid;
> > >> 	uint32_t flowlabel_hoplimit; /* resv-31:28 flow label-27:8 hop limit-7:0*/
> > >> 	uint8_t tclass;
> > >> 	uint8_t reversible_numpath; /* reversible-7:7 num path-6:0 */
> > >> 	uint16_t pkey;
> > >> 	uint16_t qosclass_sl; /* qos class-15:4 sl-3:0 */
> > >> 	uint8_t mtu; /* mtu selector-7:6 mtu-5:0 */
> > >> 	uint8_t rate; /* rate selector-7:6 rate-5:0 */
> > >> 	uint8_t packetlifetime; /* lifetime selector-7:6 lifetime-5:0 */
> > >> uint8_t
preference; > > >> uint8_t reserved[6]; > > >> }; > > > > > >I would prefer to use the structures already defined in ib_types.h... I > > >understand your not wanting to make ACM dependant on the OpenSM packages > so is > > >it time to move ib_types.h out of the OpenSM tree and somewhere more > generic? > > >Perhaps libibumad? This also applies to ib_sa_mad in your 5th patch. > > > > > >OTOH, ib_types.h is a 10K line file with multiple long (>10 lines) > inlined > > >functions. Perhaps it deserves it's own library? > > > > Defining some of these types in libibumad isn't a bad idea. Although, > WinOF > > actually has 2 copies of ib_types.h (that differ...) I find using > ib_types.h > > painful given its size; separate header files may help. > > Yes I was thinking multiple headers. There seems like there is already > some precedent in ib_cm_types.h (although that entire file seems to be > enclosed in a #ifndef WIN32 clause? So am I wrong on this?) > > In the end I would like to make ib_types.h just list the specific headers. > > Sasha, would you be willing to accept such a patch? First move ib_types.h > to umad I'm not sure this is a good idea. ibutils (ibis and ibmgtsim) wants ib_types.h but does not want libibumad. -- Hal > and then move the long inline functions into the lib and separate out the > remaining header. > > Or would you prefer a new library? I think there is enough code there but > I leave it up to you. > > Ira > > > > > - Sean > > > > > -- > Ira Weiny > Math Programmer/Computer Scientist > Lawrence Livermore National Lab > 925-423-8008 > weiny2 at llnl.gov > _______________________________________________ > ofw mailing list > ofw at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From weiny2 at llnl.gov Thu Sep 17 14:40:50 2009 From: weiny2 at llnl.gov (Ira Weiny) Date: Thu, 17 Sep 2009 14:40:50 -0700 Subject: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm In-Reply-To: References: <20090917101804.12e9e5ce.weiny2@llnl.gov> <20090917132050.041b077d.weiny2@llnl.gov> Message-ID: <20090917144050.4ba15f84.weiny2@llnl.gov> On Thu, 17 Sep 2009 17:41:30 -0400 Hal Rosenstock wrote: > On Thu, Sep 17, 2009 at 4:20 PM, Ira Weiny wrote: > > > On Thu, 17 Sep 2009 10:35:39 -0700 > > "Sean Hefty" wrote: > > > > > >> #define IB_PATH_RECORD_REVERSIBLE 0x80 > > > >> > > > >> struct ib_path_record > > > >> { > > > >> uint64_t service_id; > > > >> union ibv_gid dgid; > > > >> union ibv_gid sgid; > > > >> uint16_t dlid; > > > >> uint16_t slid; > > > >> uint32_t flowlabel_hoplimit; /* resv-31:28 flow label-27:8 > > hop > > > >limit-7:0*/ > > > >> uint8_t tclass; > > > >> uint8_t reversible_numpath; /* reversible-7:7 num path-6:0 > > */ > > > >> uint16_t pkey; > > > >> uint16_t qosclass_sl; /* qos class-15:4 sl-3:0 */ > > > >> uint8_t mtu; /* mtu selector-7:6 mtu-5:0 */ > > > >> uint8_t rate; /* rate selector-7:6 rate-5:0 > > */ > > > >> uint8_t packetlifetime; /* lifetime selector-7:6 > > > lifetime-5:0 > > > >*/ > > > >> uint8_t preference; > > > >> uint8_t reserved[6]; > > > >> }; > > > > > > > >I would prefer to use the structures already defined in ib_types.h... I > > > >understand your not wanting to make ACM dependant on the OpenSM packages > > so is > > > >it time to move ib_types.h out of the OpenSM tree and somewhere more > > generic? > > > >Perhaps libibumad? This also applies to ib_sa_mad in your 5th patch. > > > > > > > >OTOH, ib_types.h is a 10K line file with multiple long (>10 lines) > > inlined > > > >functions. Perhaps it deserves it's own library? > > > > > > Defining some of these types in libibumad isn't a bad idea. Although, > > WinOF > > > actually has 2 copies of ib_types.h (that differ...) 
I find using > > ib_types.h > > > painful given its size; separate header files may help. > > > > Yes I was thinking multiple headers. There seems like there is already > > some precedent in ib_cm_types.h (although that entire file seems to be > > enclosed in a #ifndef WIN32 clause? So am I wrong on this?) > > > > In the end I would like to make ib_types.h just list the specific headers. > > > > Sasha, would you be willing to accept such a patch? First move ib_types.h > > to umad > > > I'm not sure this is a good idea. ibutils (ibis and ibmgtsim) wants > ib_types.h but does not want libibumad. Would a separate library be a better solution then? I would prefer that as well. Ira > > -- Hal > > > > > and then move the long inline functions into the lib and separate out the > > remaining header. > > > > Or would you prefer a new library? I think there is enough code there but > > I leave it up to you. > > > > Ira > > > > > > > > - Sean > > > > > > > > > -- > > Ira Weiny > > Math Programmer/Computer Scientist > > Lawrence Livermore National Lab > > 925-423-8008 > > weiny2 at llnl.gov > > _______________________________________________ > > ofw mailing list > > ofw at lists.openfabrics.org > > http://*lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw > > > -- Ira Weiny Math Programmer/Computer Scientist Lawrence Livermore National Lab 925-423-8008 weiny2 at llnl.gov From sean.hefty at intel.com Thu Sep 17 14:49:50 2009 From: sean.hefty at intel.com (Sean Hefty) Date: Thu, 17 Sep 2009 14:49:50 -0700 Subject: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm In-Reply-To: References: <20090917101804.12e9e5ce.weiny2@llnl.gov> <20090917132050.041b077d.weiny2@llnl.gov> Message-ID: <95DE322F4F2A423198FC248FDF534C8D@amr.corp.intel.com> >I'm not sure this is a good idea. ibutils (ibis and ibmgtsim) wants ib_types.h >but does not want libibumad. Well, libibumad is pretty useless without some network structure definitions. 
Currently, the alternatives are to install opensm, which also requires
installing libibmad, libibcommon, and complib, or for the app to define what
they need, which is what was done here. I'm not sure how you pick up ib_types.h
without libibumad getting installed, but you can make a reasonable argument that
libibumad should define the MAD and SA attribute structures.

- Sean

From weiny2 at llnl.gov Thu Sep 17 15:11:50 2009
From: weiny2 at llnl.gov (Ira Weiny)
Date: Thu, 17 Sep 2009 15:11:50 -0700
Subject: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm
In-Reply-To: <95DE322F4F2A423198FC248FDF534C8D@amr.corp.intel.com>
References: <20090917101804.12e9e5ce.weiny2@llnl.gov>
	<20090917132050.041b077d.weiny2@llnl.gov>
	<95DE322F4F2A423198FC248FDF534C8D@amr.corp.intel.com>
Message-ID: <20090917151150.049c4c3c.weiny2@llnl.gov>

On Thu, 17 Sep 2009 14:49:50 -0700
"Sean Hefty" wrote:

> >I'm not sure this is a good idea. ibutils (ibis and ibmgtsim) wants ib_types.h
> >but does not want libibumad.
>
> Well, libibumad is pretty useless without some network structure definitions.
> Currently, the alternatives are to install opensm, which also requires
> installing libibmad, libibcommon, and complib, or for the app to define what
> they need, which is what was done here. I'm not sure how you pick up ib_types.h
> without libibumad getting installed, but you can make a reasonable argument that
> libibumad should define the MAD and SA attribute structures.

Actually, now that I think about it... does ibutils depend on OpenSM then? I
would think that it would be better to have it depend on ibumad rather than
OpenSM...

:-/ Ok I think I am starting to see why you mention this... Does ibutils
actually link with anything? It looks like ibutils is using the inline
functions to effectively make a "static" link to this functionality? I don't
see any dependencies on any libs in the Makefile.am's. Is that correct? :-/

In this case I don't know that it matters if we move the header.
However, it would matter if we moved the inline functions...

Does ibutils form its own packets and open the mad devices on its own,
outside of ibumad? From my quick look it seems it would have to.

Ira

>
> - Sean
>

-- 
Ira Weiny
Math Programmer/Computer Scientist
Lawrence Livermore National Lab
925-423-8008
weiny2 at llnl.gov

From mashirle at us.ibm.com Thu Sep 17 15:46:50 2009
From: mashirle at us.ibm.com (Shirley Ma)
Date: Thu, 17 Sep 2009 15:46:50 -0700
Subject: [ofa-general] mlx4 second port lro issue
In-Reply-To: <4AACF75D.8020706@mellanox.co.il>
References: <1252638865.4712.22.camel@localhost.localdomain>
	<4AACF75D.8020706@mellanox.co.il>
Message-ID: <1253227610.23761.3.camel@localhost.localdomain>

On Sun, 2009-09-13 at 16:45 +0300, Tziporet Koren wrote:
> Yevgeny - our maintainer of mlx4_en driver is on vacation
> He will look into it when he is back

Thanks. I tried to reproduce it in our lab. I found mlx_en works with
LRO for both ports. The distro is Ubuntu; ethtool -i shows the driver
version is 1.4.1.1 (June 2009), and the fw version is 2.6.9.

However, the performance is very bad.
Thanks Shirley From hal.rosenstock at gmail.com Thu Sep 17 16:09:50 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Thu, 17 Sep 2009 19:09:50 -0400 Subject: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm In-Reply-To: <20090917144050.4ba15f84.weiny2@llnl.gov> References: <20090917101804.12e9e5ce.weiny2@llnl.gov> <20090917132050.041b077d.weiny2@llnl.gov> <20090917144050.4ba15f84.weiny2@llnl.gov> Message-ID: On Thu, Sep 17, 2009 at 5:40 PM, Ira Weiny wrote: > On Thu, 17 Sep 2009 17:41:30 -0400 > Hal Rosenstock wrote: > > > On Thu, Sep 17, 2009 at 4:20 PM, Ira Weiny wrote: > > > > > On Thu, 17 Sep 2009 10:35:39 -0700 > > > "Sean Hefty" wrote: > > > > > > > >> #define IB_PATH_RECORD_REVERSIBLE 0x80 > > > > >> > > > > >> struct ib_path_record > > > > >> { > > > > >> uint64_t service_id; > > > > >> union ibv_gid dgid; > > > > >> union ibv_gid sgid; > > > > >> uint16_t dlid; > > > > >> uint16_t slid; > > > > >> uint32_t flowlabel_hoplimit; /* resv-31:28 flow > label-27:8 > > > hop > > > > >limit-7:0*/ > > > > >> uint8_t tclass; > > > > >> uint8_t reversible_numpath; /* reversible-7:7 num > path-6:0 > > > */ > > > > >> uint16_t pkey; > > > > >> uint16_t qosclass_sl; /* qos class-15:4 sl-3:0 */ > > > > >> uint8_t mtu; /* mtu selector-7:6 mtu-5:0 > */ > > > > >> uint8_t rate; /* rate selector-7:6 > rate-5:0 > > > */ > > > > >> uint8_t packetlifetime; /* lifetime selector-7:6 > > > > lifetime-5:0 > > > > >*/ > > > > >> uint8_t preference; > > > > >> uint8_t reserved[6]; > > > > >> }; > > > > > > > > > >I would prefer to use the structures already defined in > ib_types.h... I > > > > >understand your not wanting to make ACM dependant on the OpenSM > packages > > > so is > > > > >it time to move ib_types.h out of the OpenSM tree and somewhere more > > > generic? > > > > >Perhaps libibumad? This also applies to ib_sa_mad in your 5th > patch. > > > > > > > > > >OTOH, ib_types.h is a 10K line file with multiple long (>10 lines) > > > inlined > > > > >functions. 
Perhaps it deserves it's own library? > > > > > > > > Defining some of these types in libibumad isn't a bad idea. > Although, > > > WinOF > > > > actually has 2 copies of ib_types.h (that differ...) I find using > > > ib_types.h > > > > painful given its size; separate header files may help. > > > > > > Yes I was thinking multiple headers. There seems like there is already > > > some precedent in ib_cm_types.h (although that entire file seems to be > > > enclosed in a #ifndef WIN32 clause? So am I wrong on this?) > > > > > > In the end I would like to make ib_types.h just list the specific > headers. > > > > > > Sasha, would you be willing to accept such a patch? First move > ib_types.h > > > to umad > > > > > > I'm not sure this is a good idea. ibutils (ibis and ibmgtsim) wants > > ib_types.h but does not want libibumad. > I miswrote about ibis as it uses osm_vendor layer so it can use libibumad but there are other vendor layers other than osm_vendor_ibumad in use. There are other combinations where umad isn't used (even Windows is not fully moved over still). > > Would a separate library be a better solution then? Maybe but what aside from the header would be in the library ? -- Hal > I would prefer that as well. > > Ira > > > > > -- Hal > > > > > > > > > and then move the long inline functions into the lib and separate out > the > > > remaining header. > > > > > > Or would you prefer a new library? I think there is enough code there > but > > > I leave it up to you. 
> > > > > > Ira > > > > > > > > > > > - Sean > > > > > > > > > > > > > -- > > > Ira Weiny > > > Math Programmer/Computer Scientist > > > Lawrence Livermore National Lab > > > 925-423-8008 > > > weiny2 at llnl.gov > > > _______________________________________________ > > > ofw mailing list > > > ofw at lists.openfabrics.org > > > http://*lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw > > > > > > > > -- > Ira Weiny > Math Programmer/Computer Scientist > Lawrence Livermore National Lab > 925-423-8008 > weiny2 at llnl.gov > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hal.rosenstock at gmail.com Thu Sep 17 16:12:49 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Thu, 17 Sep 2009 19:12:49 -0400 Subject: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm In-Reply-To: <95DE322F4F2A423198FC248FDF534C8D@amr.corp.intel.com> References: <20090917101804.12e9e5ce.weiny2@llnl.gov> <20090917132050.041b077d.weiny2@llnl.gov> <95DE322F4F2A423198FC248FDF534C8D@amr.corp.intel.com> Message-ID: On Thu, Sep 17, 2009 at 5:49 PM, Sean Hefty wrote: > >I'm not sure this is a good idea. ibutils (ibis and ibmgtsim) wants > ib_types.h > >but does not want libibumad. > > Well, libibumad is pretty useless without some network structure > definitions. > ib_types.h is more akin to what is in libibmad rather than libibumad. > Currently, the alternatives are to install opensm, which also requires > installing libibmad, libibcommon, and complib, or for the app to define > what > they need, which is what was done here. I'm not sure how you pick up > ib_types.h > without libibumad getting installed, but you can make a reasonable argument > that > libibumad should define the MAD and SA attribute structures. > libibumad is currently transparent to the MAD details. It's the MAD library which knows this and that's more a diagnostic library. In a different world, this might all just be one library... 
-- Hal > > - Sean > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hal.rosenstock at gmail.com Thu Sep 17 16:16:01 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Thu, 17 Sep 2009 19:16:01 -0400 Subject: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm In-Reply-To: <20090917151150.049c4c3c.weiny2@llnl.gov> References: <20090917101804.12e9e5ce.weiny2@llnl.gov> <20090917132050.041b077d.weiny2@llnl.gov> <95DE322F4F2A423198FC248FDF534C8D@amr.corp.intel.com> <20090917151150.049c4c3c.weiny2@llnl.gov> Message-ID: On Thu, Sep 17, 2009 at 6:11 PM, Ira Weiny wrote: > On Thu, 17 Sep 2009 14:49:50 -0700 > "Sean Hefty" wrote: > > > >I'm not sure this is a good idea. ibutils (ibis and ibmgtsim) wants > ib_types.h > > >but does not want libibumad. > > > > Well, libibumad is pretty useless without some network structure > definitions. > > Currently, the alternatives are to install opensm, which also requires > > installing libibmad, libibcommon, and complib, or for the app to define > what > > they need, which is what was done here. I'm not sure how you pick up > ib_types.h > > without libibumad getting installed, but you can make a reasonable > argument that > > libibumad should define the MAD and SA attribute structures. > > Actually, now that I think about it... does ibutils depend on OpenSM > then? I think it has to as it uses the OpenSM vendor layer (at least ibis). ibmgtsim is another story. > I would think that it would be better to have it depend on ibumad rather > than OpenSM... > This may be historical but it was built on the OpenSM vendor layer before there was umad. Mellanox is best to comment on these aspects. -- Hal > > :-/ Ok I think I am starting to see why you mention this... Does ibutils > actually link with anything? It looks like ibutils is using the inline > functions to effectively make a "static" link to this functionality? I > don't see any dependencies on any libs in the Makefile.am's. 
> Is that correct? :-/
>
> In this case I don't know that it matters if we move the header. However,
> it would matter if we moved the inline functions...
>
> Does ibutils form it's own packets and open the mad devices on it's own,
> outside of ibumad? From my quick look it seems it would have to.
>
> Ira
>
> >
> > - Sean
> >
>
>
> --
> Ira Weiny
> Math Programmer/Computer Scientist
> Lawrence Livermore National Lab
> 925-423-8008
> weiny2 at llnl.gov
>

From jgunthorpe at obsidianresearch.com Thu Sep 17 16:20:47 2009
From: jgunthorpe at obsidianresearch.com (Jason Gunthorpe)
Date: Thu, 17 Sep 2009 17:20:47 -0600
Subject: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm
In-Reply-To: <20090917144050.4ba15f84.weiny2@llnl.gov>
References: <20090917101804.12e9e5ce.weiny2@llnl.gov>
	<20090917132050.041b077d.weiny2@llnl.gov>
	<20090917144050.4ba15f84.weiny2@llnl.gov>
Message-ID: <20090917232047.GS25981@obsidianresearch.com>

On Thu, Sep 17, 2009 at 02:40:50PM -0700, Ira Weiny wrote:

> > I'm not sure this is a good idea. ibutils (ibis and ibmgtsim)
> > wants ib_types.h but does not want libibumad.
>
> Would a separate library be a better solution then? I would prefer
> that as well.

Please no more libraries, there are too many already. We have way too
many libraries with just one or two c files in them.

libibcm needs to learn how to do PR queries, it should have a good PR
query API since libibcm is pretty useless without being able to do PR
queries..
Jason From sean.hefty at intel.com Thu Sep 17 16:26:44 2009 From: sean.hefty at intel.com (Sean Hefty) Date: Thu, 17 Sep 2009 16:26:44 -0700 Subject: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm In-Reply-To: <20090917232047.GS25981@obsidianresearch.com> References: <20090917101804.12e9e5ce.weiny2@llnl.gov> <20090917132050.041b077d.weiny2@llnl.gov> <20090917144050.4ba15f84.weiny2@llnl.gov> <20090917232047.GS25981@obsidianresearch.com> Message-ID: <672EF1DC627C4E5A92E8EE85D8C2C7D4@amr.corp.intel.com> >libibcm needs to learn how to do PR queries, it should have a good PR >query API since libibcm is pretty useless without being able to do PR >queries.. PR queries don't work - regardless of what the API looks like or where it resides. Plus adding PR queries to libibcm doesn't solve the problem of where the structure definitions reside. - Sean From hal.rosenstock at gmail.com Thu Sep 17 16:30:25 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Thu, 17 Sep 2009 19:30:25 -0400 Subject: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm In-Reply-To: References: <20090917101804.12e9e5ce.weiny2@llnl.gov> <20090917132050.041b077d.weiny2@llnl.gov> <95DE322F4F2A423198FC248FDF534C8D@amr.corp.intel.com> <20090917151150.049c4c3c.weiny2@llnl.gov> Message-ID: On Thu, Sep 17, 2009 at 7:16 PM, Hal Rosenstock wrote: > > > On Thu, Sep 17, 2009 at 6:11 PM, Ira Weiny wrote: > >> On Thu, 17 Sep 2009 14:49:50 -0700 >> "Sean Hefty" wrote: >> >> > >I'm not sure this is a good idea. ibutils (ibis and ibmgtsim) wants >> ib_types.h >> > >but does not want libibumad. >> > >> > Well, libibumad is pretty useless without some network structure >> definitions. >> > Currently, the alternatives are to install opensm, which also requires >> > installing libibmad, libibcommon, and complib, or for the app to define >> what >> > they need, which is what was done here. 
I'm not sure how you pick up >> ib_types.h >> > without libibumad getting installed, but you can make a reasonable >> argument that >> > libibumad should define the MAD and SA attribute structures. >> >> Actually, now that I think about it... does ibutils depend on OpenSM >> then? > > > I think it has to as it uses the OpenSM vendor layer (at least ibis). > ibmgtsim is another story. > Also, configure takes --with-osm for OpenSM location. > > >> I would think that it would be better to have it depend on ibumad rather >> than OpenSM... >> > > This may be historical but it was built on the OpenSM vendor layer before > there was umad. > > Mellanox is best to comment on these aspects. > > -- Hal > > >> >> :-/ Ok I think I am starting to see why you mention this... Does ibutils >> actually link with anything? It looks like ibutils is using the inline >> functions to effectively make a "static" link to this functionality? I >> don't see any dependencies on any libs in the Makefile.am's. Is that >> correct? :-/ >> >> In this case I don't know that it matters if we move the header. However, >> it would matter if we moved the inline functions... >> >> Does ibutils form it's own packets and open the mad devices on it's own, >> outside of ibumad? From my quick look it seems it would have to. >> >> Ira >> >> > >> > - Sean >> > >> >> >> -- >> Ira Weiny >> Math Programmer/Computer Scientist >> Lawrence Livermore National Lab >> 925-423-8008 >> weiny2 at llnl.gov >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From weiny2 at llnl.gov Thu Sep 17 16:33:34 2009
From: weiny2 at llnl.gov (Ira Weiny)
Date: Thu, 17 Sep 2009 16:33:34 -0700
Subject: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm
In-Reply-To: <672EF1DC627C4E5A92E8EE85D8C2C7D4@amr.corp.intel.com>
References: <20090917101804.12e9e5ce.weiny2@llnl.gov>
	<20090917132050.041b077d.weiny2@llnl.gov>
	<20090917144050.4ba15f84.weiny2@llnl.gov>
	<20090917232047.GS25981@obsidianresearch.com>
	<672EF1DC627C4E5A92E8EE85D8C2C7D4@amr.corp.intel.com>
Message-ID: <20090917163334.129394ac.weiny2@llnl.gov>

On Thu, 17 Sep 2009 16:26:44 -0700
"Sean Hefty" wrote:

> >libibcm needs to learn how to do PR queries, it should have a good PR
> >query API since libibcm is pretty useless without being able to do PR
> >queries..
>
> PR queries don't work - regardless of what the API looks like or where it
> resides. Plus adding PR queries to libibcm doesn't solve the problem of where
> the structure definitions reside.

Yes, I am just tired of finding the same structure definitions all over the
place. I am not faulting anyone for redefining them but the fact that they
have to is a problem.
Ira

>
> - Sean
>

-- 
Ira Weiny
Math Programmer/Computer Scientist
Lawrence Livermore National Lab
925-423-8008
weiny2 at llnl.gov

From jgunthorpe at obsidianresearch.com Thu Sep 17 16:39:43 2009
From: jgunthorpe at obsidianresearch.com (Jason Gunthorpe)
Date: Thu, 17 Sep 2009 17:39:43 -0600
Subject: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm
In-Reply-To: <672EF1DC627C4E5A92E8EE85D8C2C7D4@amr.corp.intel.com>
References: <20090917101804.12e9e5ce.weiny2@llnl.gov>
	<20090917132050.041b077d.weiny2@llnl.gov>
	<20090917144050.4ba15f84.weiny2@llnl.gov>
	<20090917232047.GS25981@obsidianresearch.com>
	<672EF1DC627C4E5A92E8EE85D8C2C7D4@amr.corp.intel.com>
Message-ID: <20090917233943.GT25981@obsidianresearch.com>

On Thu, Sep 17, 2009 at 04:26:44PM -0700, Sean Hefty wrote:
> >libibcm needs to learn how to do PR queries, it should have a good PR
> >query API since libibcm is pretty useless without being able to do PR
> >queries..
>
> PR queries don't work - regardless of what the API looks like or
> where it resides. Plus adding PR queries to libibcm doesn't solve
> the problem of where the structure definitions reside.

Huh?

PR queries work fine, I don't understand your comment.

If libibcm has the PR query API then it naturally has the structure
definitions.

Jason

From sean.hefty at intel.com Thu Sep 17 16:47:17 2009
From: sean.hefty at intel.com (Sean Hefty)
Date: Thu, 17 Sep 2009 16:47:17 -0700
Subject: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm
In-Reply-To: <20090917233943.GT25981@obsidianresearch.com>
References: <20090917101804.12e9e5ce.weiny2@llnl.gov>
	<20090917132050.041b077d.weiny2@llnl.gov>
	<20090917144050.4ba15f84.weiny2@llnl.gov>
	<20090917232047.GS25981@obsidianresearch.com>
	<672EF1DC627C4E5A92E8EE85D8C2C7D4@amr.corp.intel.com>
	<20090917233943.GT25981@obsidianresearch.com>
Message-ID:

>PR queries work fine, I don't understand your comment.

MPI does not use PR queries because it does not scale.
From jgunthorpe at obsidianresearch.com Thu Sep 17 16:54:11 2009 From: jgunthorpe at obsidianresearch.com (Jason Gunthorpe) Date: Thu, 17 Sep 2009 17:54:11 -0600 Subject: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm In-Reply-To: References: <20090917101804.12e9e5ce.weiny2@llnl.gov> <20090917132050.041b077d.weiny2@llnl.gov> <20090917144050.4ba15f84.weiny2@llnl.gov> <20090917232047.GS25981@obsidianresearch.com> <672EF1DC627C4E5A92E8EE85D8C2C7D4@amr.corp.intel.com> <20090917233943.GT25981@obsidianresearch.com> Message-ID: <20090917235411.GU25981@obsidianresearch.com> On Thu, Sep 17, 2009 at 04:47:17PM -0700, Sean Hefty wrote: > >PR queries work fine, I don't understand your comment. > > MPI does not use PR queries because it does not scale. Not all the world is MPI. Your new acm stuff still does PR queries. Anyone using libibverbs multicast needs to do PR queries from userspace. Anyone using libibcm needs to do PR queries from userspace. Therefore we should just jam the PR query stuff in libibcm, everyone can use that, and your acm can ride on the PR query code from libibcm for its own needs too. Jason From hal.rosenstock at gmail.com Thu Sep 17 16:57:32 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Thu, 17 Sep 2009 19:57:32 -0400 Subject: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm In-Reply-To: References: <20090917101804.12e9e5ce.weiny2@llnl.gov> <20090917132050.041b077d.weiny2@llnl.gov> <95DE322F4F2A423198FC248FDF534C8D@amr.corp.intel.com> Message-ID: On Thu, Sep 17, 2009 at 7:12 PM, Hal Rosenstock wrote: > > > On Thu, Sep 17, 2009 at 5:49 PM, Sean Hefty wrote: > >> >I'm not sure this is a good idea. ibutils (ibis and ibmgtsim) wants >> ib_types.h >> >but does not want libibumad. >> >> Well, libibumad is pretty useless without some network structure >> definitions. >> > > ib_types.h is more akin to what is in libibmad rather than libibumad. 
>
>
>> Currently, the alternatives are to install opensm, which also requires
>> installing libibmad, libibcommon, and complib, or for the app to define what
>> they need, which is what was done here. I'm not sure how you pick up ib_types.h
>> without libibumad getting installed, but you can make a reasonable argument that
>> libibumad should define the MAD and SA attribute structures.
>>
>
> libibumad is currently transparent to the MAD details. It's the MAD library
> which knows this and that's more a diagnostic library.
>
> In a different world, this might all just be one library...
>

Although not a fit IMO, the pragmatic solution is to move ib_types.h into
libibumad. I think it is better there than OpenSM which was never quite right
either. That can at least start to eliminate the duplications in this area.

-- Hal

>
> -- Hal
>
>
>>
>> - Sean
>>
>>
>

From sean.hefty at intel.com Thu Sep 17 22:57:01 2009
From: sean.hefty at intel.com (Sean Hefty)
Date: Thu, 17 Sep 2009 22:57:01 -0700
Subject: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm
In-Reply-To: <20090917235411.GU25981@obsidianresearch.com>
References: <20090917101804.12e9e5ce.weiny2@llnl.gov>
	<20090917132050.041b077d.weiny2@llnl.gov>
	<20090917144050.4ba15f84.weiny2@llnl.gov>
	<20090917232047.GS25981@obsidianresearch.com>
	<672EF1DC627C4E5A92E8EE85D8C2C7D4@amr.corp.intel.com>
	<20090917233943.GT25981@obsidianresearch.com>
	<20090917235411.GU25981@obsidianresearch.com>
Message-ID:

>Not all the world is MPI.

The focus of this package is for MPI though. The librdmacm interface does
perform standard PR queries for applications that use that interface. I'm not
fond of the mad interfaces, but I'm not trying to fix them with this.
We can debate whether an application should use an interface that exposes path records and the IB CM protocol directly, but the feedback from MPI and other developers is that connection establishment over IB requires too much code and is too difficult. Short term, while the ib_acm is considered experimental, I want to call the ib_acm from under the librdmacm interface. This allows it to be used without applications needing to change. Long term, if the ib_acm can prove itself, then accessing it directly from the kernel is a possibility. >Your new acm stuff still does PR queries. The primary reason for adding PR query was to verify that the path information returned by the ib_acm was usable. A user needs some way to know if the ib_acm can be used on their cluster. This was one of the last things that I added, and I think it has value, even if only for verification purposes. The central mechanism the ib_acm employs to acquire path data uses multicast. >Anyone using libibverbs multicast needs to do PR queries from >userspace. The ib_acm uses libibverbs multicast and does not do PR queries. >Anyone using libibcm needs to do PR queries from userspace. Open MPI has coded to the libibcm and does not perform PR queries. What's needed in either of the above cases is path information; however, there are alternate ways of obtaining this information without involving a direct query to the SA. MPI and DAPL can connect over IB today without doing PR queries. While there are limitations to determining path information without doing a PR query, there are also limitations to obtaining path information doing one. Looking at current implementations, I would deduce that the latter is more limiting than the former in practice. >Therefore we should just jam the PR query stuff in libibcm, everyone >can use that, and your acm can ride on the PR query code from >libibcm for its own needs too.
These are the calls exposed through libibacm: int ib_acm_resolve_name(char *src, char *dest, struct ib_acm_dev_addr *dev_addr, struct ibv_ah_attr *ah, struct ib_acm_resolve_data *data); int ib_acm_resolve_ip(struct sockaddr *src, struct sockaddr *dest, struct ib_acm_dev_addr *dev_addr, struct ibv_ah_attr *ah, struct ib_acm_resolve_data *data); int ib_acm_resolve_path(struct ib_path_record *path); int ib_acm_query_path(struct ib_path_record *path); int ib_acm_convert_to_path(struct ib_acm_dev_addr *dev_addr, struct ibv_ah_attr *ah, struct ib_acm_resolve_data *data, struct ib_path_record *path); Of these, the one of most importance to the problem I'm trying to solve is ib_acm_resolve_ip(). I do not believe that we want to add what should be considered an experimental interface to libibcm, libibumad, or librdmacm based on socket addresses that would then need to be maintained. If your objection is that ib_acm_query_path() should be moved to libibcm, that's a possibility. libibacm already interfaces to libibumad, and it was trivial to add support for PR queries. libibcm does not currently depend on libibumad. And if you take a step back in the connection process, I don't know that support for just PR queries is sufficient for establishing a connection over IB. You first need to identify the endpoint, which opens up the possibility of other SA queries. - Sean From sean.hefty at intel.com Thu Sep 17 23:09:22 2009 From: sean.hefty at intel.com (Sean Hefty) Date: Thu, 17 Sep 2009 23:09:22 -0700 Subject: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm In-Reply-To: References: <20090917101804.12e9e5ce.weiny2@llnl.gov> <20090917132050.041b077d.weiny2@llnl.gov> <95DE322F4F2A423198FC248FDF534C8D@amr.corp.intel.com> Message-ID: >Although not a fit IMO, the pragmatic solution is to move ib_types.h into >libibumad. I think it is better there than OpenSM which was never quite right >either. That can at least start to eliminate the duplications in this area.
ib_types.h includes complib header files... From vlad at lists.openfabrics.org Fri Sep 18 03:05:39 2009 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Fri, 18 Sep 2009 03:05:39 -0700 (PDT) Subject: [ofa-general] ofa_1_5_kernel 20090918-0200 daily build status Message-ID: <20090918100539.A8FBBE621DA@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git git_branch: ofed_kernel_1_5 Common build parameters: Passed: Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.26 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.27 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18-128.el5 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.27 Passed on x86_64 with linux-2.6.9-78.ELsmp Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.26 Passed on ia64 with linux-2.6.25 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Failed: From mingo at elte.hu Fri Sep 18 04:50:53 2009 From: mingo at elte.hu (Ingo Molnar) Date: Fri, 18 Sep 2009 13:50:53 +0200 Subject: [ofa-general] Re: [GIT PULL] please pull ummunotify In-Reply-To: References: <1253187028.8439.2.camel@twins> <1253198976.14935.27.camel@laptop> Message-ID: <20090918115053.GK9930@elte.hu> * Roland Dreier wrote: > > But yeah, I currently 
don't see a very nice match to perf counters. > > OK. It would be nice to tie into something more general, but I think > I agree -- perf counters are missing the filtering and the "no lost > events" that ummunotify does have. And I'm not sure it's worth > messing up the perf counters design just to jam one more not totally > related thing in. The filtering can be done and has been done - see Li Zefan's patchset that uses filter expressions to do per event in-kernel filtering. The OOM DoS is a bug in your patches I think, which perfcounters solves ;-) Ingo From sashak at voltaire.com Fri Sep 18 06:44:33 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Fri, 18 Sep 2009 16:44:33 +0300 Subject: [ofa-general] [PATCH] opensm/osm_sa_mcmember_record.c: fix mcgrp cleanup crash In-Reply-To: <20090907121003.GN25241@me> References: <20090906154901.GF25241@me> <20090906154931.GG25241@me> <20090906173900.GK25241@me> <20090906174548.GL25241@me> <20090906213852.GM25241@me> <20090907121003.GN25241@me> Message-ID: <20090918134433.GB23406@me> When multiple MCMember leave requests are issued, the request processors may run concurrently. A race is then possible between one processor removing a port from a multicast group and another deleting the group (once it becomes empty), and as a result we get a double free crash (reproducible under stress testing). This patch fixes this by moving the multicast group cleanup call into the same lock-protected block as the port removal code.
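A minimal sketch of the locking pattern described above, not the OpenSM code itself: the member removal and the cleanup of the now-empty group happen inside one critical section, so a second leave handler cannot look the group up between the two steps and free it again. All names here (struct group, table_lock, table, group_create, leave_request) are hypothetical, and a plain pthread mutex stands in for the CL_PLOCK calls used in the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a multicast group; not OpenSM's osm_mgrp_t. */
struct group {
	int members;	/* ports still joined */
};

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static struct group *table[1];	/* hypothetical one-slot group table */
static int frees;		/* how many times a group was freed */

/* Publish a group with n members in the table; returns 0 on success. */
static int group_create(int n)
{
	struct group *g;

	assert(n > 0);
	g = malloc(sizeof(*g));
	if (!g)
		return -1;
	g->members = n;
	pthread_mutex_lock(&table_lock);
	table[0] = g;
	pthread_mutex_unlock(&table_lock);
	return 0;
}

/*
 * Handle one leave request. Lookup, member removal and cleanup of the
 * empty group share one critical section, so no other handler can find
 * (and free) the group between the removal and the cleanup.
 */
static void leave_request(void)
{
	struct group *g;

	pthread_mutex_lock(&table_lock);
	g = table[0];
	if (g && --g->members == 0) {
		table[0] = NULL;	/* unpublish before freeing */
		free(g);
		frees++;
	}
	pthread_mutex_unlock(&table_lock);
}
```

In this sketch each request-processing thread would call leave_request(); a request that arrives after the group is gone finds an empty slot and does nothing.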
Signed-off-by: Sasha Khapyorsky --- opensm/opensm/osm_sa_mcmember_record.c | 4 +--- 1 files changed, 1 insertions(+), 3 deletions(-) diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c index b580eb7..f291bf0 100644 --- a/opensm/opensm/osm_sa_mcmember_record.c +++ b/opensm/opensm/osm_sa_mcmember_record.c @@ -964,13 +964,11 @@ static void mcmr_rcv_leave_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) /* remove port and/or update join state */ osm_mgrp_remove_port(sa->p_subn, sa->p_log, p_mgrp, p_mcm_port, &mcmember_rec); + osm_mgrp_cleanup(sa->p_subn, p_mgrp); CL_PLOCK_RELEASE(sa->p_lock); mcmr_rcv_respond(sa, p_madw, &mcmember_rec); - CL_PLOCK_EXCL_ACQUIRE(sa->p_lock); - osm_mgrp_cleanup(sa->p_subn, p_mgrp); - CL_PLOCK_RELEASE(sa->p_lock); Exit: OSM_LOG_EXIT(sa->p_log); } -- 1.6.5.rc1 From sashak at voltaire.com Fri Sep 18 06:57:52 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Fri, 18 Sep 2009 16:57:52 +0300 Subject: [ofa-general] Re: [PATCH] opensm/osm_sa_mcmember_record.c: fix mcgrp cleanup crash In-Reply-To: <20090918134433.GB23406@me> References: <20090906154901.GF25241@me> <20090906154931.GG25241@me> <20090906173900.GK25241@me> <20090906174548.GL25241@me> <20090906213852.GM25241@me> <20090907121003.GN25241@me> <20090918134433.GB23406@me> Message-ID: <20090918135752.GC23406@me> On 16:44 Fri 18 Sep , Sasha Khapyorsky wrote: > > This patch fixes this by moving multicast group cleanup call under some > lock protected block as port removing code. Please disregard this patch - it is not against mainstream where problem exists, but not so trivially fixable. I will fix it in separate patch series soon. 
Sasha From sashak at voltaire.com Fri Sep 18 07:10:39 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Fri, 18 Sep 2009 17:10:39 +0300 Subject: [ofa-general] [PATCH] opensm/multicast: consolidate port addition/removing code In-Reply-To: <20090915103129.GM17481@me> References: <20090907154747.GO25241@me> <20090915103129.GM17481@me> Message-ID: <20090918141039.GA13667@me> Consolidate the code that adds ports to and removes ports from multicast groups with the MLID rerouting requests, so that the SA MCMember join and leave request processors use a single call each (osm_mgrp_add_port() and osm_mgrp_remove_port() respectively) to update multicast group related data and to request (schedule) multicast routing (MFT) recalculation. This also fixes a bug already reported (by me) on the list: due to a race between different SA request processors during an MCMember leave storm, the multicast group cleanup code can double free an object and crash. Signed-off-by: Sasha Khapyorsky --- opensm/include/opensm/osm_multicast.h | 4 +- opensm/include/opensm/osm_sm.h | 55 ++-------------- opensm/opensm/osm_multicast.c | 51 +++++++++++---- opensm/opensm/osm_sa_mcmember_record.c | 108 ++++--------------------------- opensm/opensm/osm_sm.c | 102 +----------------------------- 5 files changed, 64 insertions(+), 256 deletions(-) diff --git a/opensm/include/opensm/osm_multicast.h b/opensm/include/opensm/osm_multicast.h index 825be8e..32bcb78 100644 --- a/opensm/include/opensm/osm_multicast.h +++ b/opensm/include/opensm/osm_multicast.h @@ -378,8 +378,8 @@ void osm_mgrp_delete_port(IN osm_subn_t * subn, IN osm_log_t * log, * SEE ALSO *********/ -int osm_mgrp_remove_port(osm_subn_t * subn, osm_log_t * log, osm_mgrp_t * mgrp, - osm_mcm_port_t * mcm_port, ib_member_rec_t * mcmr); +void osm_mgrp_remove_port(osm_subn_t * subn, osm_log_t * log, osm_mgrp_t * mgrp, + osm_mcm_port_t * mcm_port, ib_member_rec_t * mcmr); void osm_mgrp_cleanup(osm_subn_t * subn, osm_mgrp_t * mpgr); END_C_DECLS diff --git
a/opensm/include/opensm/osm_sm.h b/opensm/include/opensm/osm_sm.h index 986143a..de3f098 100644 --- a/opensm/include/opensm/osm_sm.h +++ b/opensm/include/opensm/osm_sm.h @@ -525,65 +525,24 @@ osm_resp_send(IN osm_sm_t * sm, * *********/ -/****f* OpenSM: SM/osm_sm_mcgrp_join +/****f* OpenSM: SM/osm_sm_reroute_mlid * NAME -* osm_sm_mcgrp_join +* osm_sm_reroute_mlid * * DESCRIPTION -* Adds a port to the multicast group. Creates the multicast group -* if necessary. -* -* This function is called by the SA. +* Requests (schedules) MLID rerouting * * SYNOPSIS */ -ib_api_status_t -osm_sm_mcgrp_join(IN osm_sm_t * const p_sm, - IN osm_mgrp_t *mgrp, - IN const ib_net64_t port_guid); -/* -* PARAMETERS -* p_sm -* [in] Pointer to an osm_sm_t object. -* -* mgrp -* [in] Pointer to multicast group to join -* -* port_guid -* [in] Port GUID to add to the group. -* -* RETURN VALUES -* None -* -* NOTES -* -* SEE ALSO -*********/ +void osm_sm_reroute_mlid(osm_sm_t * sm, ib_net16_t mlid); -/****f* OpenSM: SM/osm_sm_mcgrp_leave -* NAME -* osm_sm_mcgrp_leave -* -* DESCRIPTION -* Removes a port from the multicast group. -* -* This function is called by the SA. -* -* SYNOPSIS -*/ -ib_api_status_t -osm_sm_mcgrp_leave(IN osm_sm_t * const p_sm, - IN osm_mgrp_t *mgrp, IN const ib_net64_t port_guid); /* * PARAMETERS -* p_sm +* sm * [in] Pointer to an osm_sm_t object. * -* mgrp -* [in] Poniter to multicast group to leave -* -* port_guid -* [in] Port GUID to remove from the group. 
+* mlid +* [in] MLID value * * RETURN VALUES * None diff --git a/opensm/opensm/osm_multicast.c b/opensm/opensm/osm_multicast.c index f548990..b03af48 100644 --- a/opensm/opensm/osm_multicast.c +++ b/opensm/opensm/osm_multicast.c @@ -44,10 +44,13 @@ #include #include +#include #include #include #include #include +#include +#include /********************************************************************** **********************************************************************/ @@ -139,6 +142,15 @@ osm_mcm_port_t *osm_mgrp_add_port(IN osm_subn_t * subn, osm_log_t * log, uint8_t prev_join_state = 0, join_state = mcmr->scope_state; uint8_t prev_scope; + if (osm_log_is_active(log, OSM_LOG_VERBOSE)) { + char gid_str[INET6_ADDRSTRLEN]; + OSM_LOG(log, OSM_LOG_VERBOSE, "Port 0x%016" PRIx64 " joining " + "MC group %s (mlid 0x%x)\n", cl_ntoh64(port->guid), + inet_ntop(AF_INET6, mgrp->mcmember_rec.mgid.raw, + gid_str, sizeof(gid_str)), + cl_ntoh16(mgrp->mlid)); + } + mcm_port = osm_mcm_port_new(port, mcmr, proxy); if (!mcm_port) return NULL; @@ -158,18 +170,22 @@ osm_mcm_port_t *osm_mgrp_add_port(IN osm_subn_t * subn, osm_log_t * log, osm_mcm_port_delete(mcm_port); mcm_port = (osm_mcm_port_t *) prev_item; - /* - o15.0.1.11 - Join state of the end port should be the or of the - previous setting with the current one - */ ib_member_get_scope_state(mcm_port->scope_state, &prev_scope, &prev_join_state); mcm_port->scope_state = ib_member_set_scope_state(prev_scope, prev_join_state | join_state); + } else { + if (osm_port_add_mgrp(port, mgrp)) { + osm_mcm_port_delete(mcm_port); + return NULL; + } + osm_sm_reroute_mlid(&subn->p_osm->sm, mgrp->mlid); } + /* o15.0.1.11: copy the join state */ + mcmr->scope_state = mcm_port->scope_state; + if ((join_state & IB_JOIN_STATE_FULL) && !(prev_join_state & IB_JOIN_STATE_FULL) && ++mgrp->full_members == 1) @@ -180,10 +196,9 @@ osm_mcm_port_t *osm_mgrp_add_port(IN osm_subn_t * subn, osm_log_t * log, 
/********************************************************************** **********************************************************************/ -int osm_mgrp_remove_port(osm_subn_t * subn, osm_log_t * log, osm_mgrp_t * mgrp, - osm_mcm_port_t * mcm_port, ib_member_rec_t *mcmr) +void osm_mgrp_remove_port(osm_subn_t * subn, osm_log_t * log, osm_mgrp_t * mgrp, + osm_mcm_port_t * mcm_port, ib_member_rec_t *mcmr) { - int ret; uint8_t join_state = mcmr->scope_state & 0xf; uint8_t port_join_state, new_join_state; @@ -195,22 +210,32 @@ int osm_mgrp_remove_port(osm_subn_t * subn, osm_log_t * log, osm_mgrp_t * mgrp, port_join_state = mcm_port->scope_state & 0x0F; new_join_state = port_join_state & ~join_state; + if (osm_log_is_active(log, OSM_LOG_VERBOSE)) { + char gid_str[INET6_ADDRSTRLEN]; + OSM_LOG(log, OSM_LOG_VERBOSE, + "Port 0x%" PRIx64 " leaving MC group %s (mlid 0x%x)\n", + cl_ntoh64(mcm_port->port->guid), + inet_ntop(AF_INET6, mgrp->mcmember_rec.mgid.raw, + gid_str, sizeof(gid_str)), + cl_ntoh16(mgrp->mlid)); + } + if (new_join_state) { mcm_port->scope_state = new_join_state | (mcm_port->scope_state & 0xf0); OSM_LOG(log, OSM_LOG_DEBUG, "updating port 0x%" PRIx64 " JoinState 0x%x -> 0x%x\n", - cl_ntoh64(mcm_port->port_gid.unicast.interface_id), + cl_ntoh64(mcm_port->port->guid), port_join_state, new_join_state); mcmr->scope_state = mcm_port->scope_state; - ret = 0; } else { mcmr->scope_state = mcm_port->scope_state; cl_qmap_remove_item(&mgrp->mcm_port_tbl, &mcm_port->map_item); OSM_LOG(log, OSM_LOG_DEBUG, "removing port 0x%" PRIx64 "\n", - cl_ntoh64(mcm_port->port_gid.unicast.interface_id)); + cl_ntoh64(mcm_port->port->guid)); + osm_port_remove_mgrp(mcm_port->port, mgrp); osm_mcm_port_delete(mcm_port); - ret = 1; + osm_sm_reroute_mlid(&subn->p_osm->sm, mgrp->mlid); } /* no more full members so the group will be deleted after re-route @@ -218,8 +243,6 @@ int osm_mgrp_remove_port(osm_subn_t * subn, osm_log_t * log, osm_mgrp_t * mgrp, if ((port_join_state & 
IB_JOIN_STATE_FULL) && !(new_join_state & IB_JOIN_STATE_FULL) && --mgrp->full_members == 0) mgrp_send_notice(subn, log, mgrp, 67); - - return ret; } void osm_mgrp_delete_port(osm_subn_t * subn, osm_log_t * log, osm_mgrp_t * mgrp, diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c index a51c839..f291bf0 100644 --- a/opensm/opensm/osm_sa_mcmember_record.c +++ b/opensm/opensm/osm_sa_mcmember_record.c @@ -132,59 +132,6 @@ static ib_net16_t get_new_mlid(osm_sa_t * sa, ib_net16_t requested_mlid) return 0; } -/********************************************************************* - Add a port to the group. Calculating its PROXY_JOIN by the Port and - requester gids. -**********************************************************************/ -static ib_api_status_t add_new_mgrp_port(osm_sa_t * sa, IN osm_mgrp_t * p_mgrp, - IN osm_port_t *port, - IN ib_member_rec_t * - p_recvd_mcmember_rec, - IN osm_mad_addr_t * p_mad_addr, - OUT osm_mcm_port_t ** pp_mcmr_port) -{ - boolean_t proxy_join; - ib_gid_t requester_gid; - ib_api_status_t res; - - /* set the proxy_join if the requester gid is not identical to the - joined gid */ - res = osm_get_gid_by_mad_addr(sa->p_log, sa->p_subn, p_mad_addr, - &requester_gid); - if (res != IB_SUCCESS) { - OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 1B29: " - "Could not find GID for requester\n"); - - return IB_INVALID_PARAMETER; - } - - if (!memcmp(&p_recvd_mcmember_rec->port_gid, &requester_gid, - sizeof(ib_gid_t))) { - proxy_join = FALSE; - OSM_LOG(sa->p_log, OSM_LOG_DEBUG, - "Create new port with proxy_join FALSE\n"); - } else { - /* The port is not the one specified in PortGID. - The check that the requester is in the same partition as - the PortGID is done before - just need to update - the proxy_join. 
*/ - proxy_join = TRUE; - OSM_LOG(sa->p_log, OSM_LOG_DEBUG, - "Create new port with proxy_join TRUE\n"); - } - - *pp_mcmr_port = osm_mgrp_add_port(sa->p_subn, sa->p_log, p_mgrp, port, - p_recvd_mcmember_rec, proxy_join); - if (*pp_mcmr_port == NULL) { - OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 1B06: " - "osm_mgrp_add_port failed\n"); - - return IB_INSUFFICIENT_MEMORY; - } - - return IB_SUCCESS; -} - /********************************************************************** **********************************************************************/ static inline boolean_t check_join_comp_mask(ib_net64_t comp_mask) @@ -968,7 +915,6 @@ static void mcmr_rcv_leave_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) ib_member_rec_t mcmember_rec; ib_net64_t portguid; osm_mcm_port_t *p_mcm_port; - int removed; OSM_LOG_ENTER(sa->p_log); @@ -1016,20 +962,13 @@ static void mcmr_rcv_leave_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) } /* remove port and/or update join state */ - removed = osm_mgrp_remove_port(sa->p_subn, sa->p_log, p_mgrp, - p_mcm_port, &mcmember_rec); + osm_mgrp_remove_port(sa->p_subn, sa->p_log, p_mgrp, p_mcm_port, + &mcmember_rec); + osm_mgrp_cleanup(sa->p_subn, p_mgrp); CL_PLOCK_RELEASE(sa->p_lock); - /* we can leave if port was deleted from MCG */ - if (removed && osm_sm_mcgrp_leave(sa->sm, p_mgrp, portguid)) - OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 1B09: " - "osm_sm_mcgrp_leave failed\n"); - mcmr_rcv_respond(sa, p_madw, &mcmember_rec); - CL_PLOCK_EXCL_ACQUIRE(sa->p_lock); - osm_mgrp_cleanup(sa->p_subn, p_mgrp); - CL_PLOCK_RELEASE(sa->p_lock); Exit: OSM_LOG_EXIT(sa->p_log); } @@ -1051,6 +990,7 @@ static void mcmr_rcv_join_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) osm_physp_t *p_request_physp; uint8_t is_new_group; /* TRUE = there is a need to create a group */ uint8_t join_state; + boolean_t proxy; OSM_LOG_ENTER(sa->p_log); @@ -1098,6 +1038,8 @@ static void mcmr_rcv_join_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) goto Exit; } + proxy = (p_physp != 
p_request_physp); + ib_member_get_scope_state(p_recvd_mcmember_rec->scope_state, NULL, &join_state); @@ -1214,11 +1156,15 @@ static void mcmr_rcv_join_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) goto Exit; } + /* copy qkey mlid tclass pkey sl_flow_hop mtu rate pkt_life sl_flow_hop */ + copy_from_create_mc_rec(&mcmember_rec, &p_mgrp->mcmember_rec); + /* create or update existing port (join-state will be updated) */ - status = add_new_mgrp_port(sa, p_mgrp, p_port, p_recvd_mcmember_rec, - osm_madw_get_mad_addr_ptr(p_madw), - &p_mcmr_port); - if (status != IB_SUCCESS) { + p_mcmr_port = osm_mgrp_add_port(sa->p_subn, sa->p_log, p_mgrp, p_port, + &mcmember_rec, proxy); + if (!p_mcmr_port) { + OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 1B06: " + "osm_mgrp_add_port failed\n"); /* we fail to add the port so we might need to delete the group */ osm_mgrp_cleanup(sa->p_subn, p_mgrp); CL_PLOCK_RELEASE(sa->p_lock); @@ -1228,35 +1174,9 @@ static void mcmr_rcv_join_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) goto Exit; } - /* o15.0.1.11: copy the join state */ - mcmember_rec.scope_state = p_mcmr_port->scope_state; - - /* copy qkey mlid tclass pkey sl_flow_hop mtu rate pkt_life sl_flow_hop */ - copy_from_create_mc_rec(&mcmember_rec, &p_mgrp->mcmember_rec); - /* Release the lock as we don't need it. */ CL_PLOCK_RELEASE(sa->p_lock); - /* do the actual routing (actually schedule the update) */ - status = osm_sm_mcgrp_join(sa->sm, p_mgrp, - p_recvd_mcmember_rec->port_gid.unicast. - interface_id); - if (status != IB_SUCCESS) { - OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 1B14: " - "osm_sm_mcgrp_join failed from port 0x%016" PRIx64 - " (%s), " "sending IB_SA_MAD_STATUS_NO_RESOURCES\n", - cl_ntoh64(portguid), p_port->p_node->print_desc); - CL_PLOCK_EXCL_ACQUIRE(sa->p_lock); - /* the request for routing failed so we need to remove the port */ - osm_mgrp_delete_port(sa->p_subn, sa->p_log, p_mgrp, - p_recvd_mcmember_rec->port_gid. 
- unicast.interface_id); - CL_PLOCK_RELEASE(sa->p_lock); - osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RESOURCES); - goto Exit; - - } - /* failed to route */ if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) osm_dump_mc_record(sa->p_log, &mcmember_rec, OSM_LOG_DEBUG); diff --git a/opensm/opensm/osm_sm.c b/opensm/opensm/osm_sm.c index f3fa7f4..88f5ebe 100644 --- a/opensm/opensm/osm_sm.c +++ b/opensm/opensm/osm_sm.c @@ -443,109 +443,15 @@ Exit: /********************************************************************** **********************************************************************/ -static void request_mlid(osm_sm_t * sm, uint16_t mlid) +void osm_sm_reroute_mlid(osm_sm_t * sm, ib_net16_t mlid) { - mlid -= IB_LID_MCAST_START_HO; + mlid = cl_ntoh16(mlid) - IB_LID_MCAST_START_HO; sm->mlids_req[mlid] = 1; if (sm->mlids_req_max < mlid) sm->mlids_req_max = mlid; osm_sm_signal(sm, OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST); -} - -ib_api_status_t osm_sm_mcgrp_join(IN osm_sm_t * p_sm, IN osm_mgrp_t *mgrp, - IN const ib_net64_t port_guid) -{ - osm_port_t *p_port; - ib_api_status_t status = IB_SUCCESS; - osm_mcm_info_t *p_mcm; - - OSM_LOG_ENTER(p_sm->p_log); - - OSM_LOG(p_sm->p_log, OSM_LOG_VERBOSE, - "Port 0x%016" PRIx64 " joining MLID 0x%X\n", - cl_ntoh64(port_guid), cl_ntoh16(mgrp->mlid)); - - /* - * Acquire the port object for the port joining this group. - */ - CL_PLOCK_EXCL_ACQUIRE(p_sm->p_lock); - p_port = osm_get_port_by_guid(p_sm->p_subn, port_guid); - if (!p_port) { - OSM_LOG(p_sm->p_log, OSM_LOG_ERROR, "ERR 2E05: " - "No port object for port 0x%016" PRIx64 "\n", - cl_ntoh64(port_guid)); - status = IB_INVALID_PARAMETER; - goto Exit; - } - - /* - * Check if the object (according to mlid) already exists on this port. - * If it does - then no need to update it again, and no need to - * create the mc tree again. Just goto Exit. 
- */ - p_mcm = (osm_mcm_info_t *) cl_qlist_head(&p_port->mcm_list); - while (p_mcm != (osm_mcm_info_t *) cl_qlist_end(&p_port->mcm_list)) { - if (p_mcm->mgrp == mgrp) { - OSM_LOG(p_sm->p_log, OSM_LOG_DEBUG, - "Found mlid object for Port:" - "0x%016" PRIx64 " lid:0x%X\n", - cl_ntoh64(port_guid), cl_ntoh16(mgrp->mlid)); - goto Exit; - } - p_mcm = (osm_mcm_info_t *) cl_qlist_next(&p_mcm->list_item); - } - - status = osm_port_add_mgrp(p_port, mgrp); - if (status != IB_SUCCESS) { - OSM_LOG(p_sm->p_log, OSM_LOG_ERROR, "ERR 2E03: " - "Unable to associate port 0x%" PRIx64 " to mlid 0x%X\n", - cl_ntoh64(osm_port_get_guid(p_port)), - cl_ntoh16(osm_mgrp_get_mlid(mgrp))); - goto Exit; - } - - request_mlid(p_sm, cl_ntoh16(mgrp->mlid)); -Exit: - CL_PLOCK_RELEASE(p_sm->p_lock); - OSM_LOG_EXIT(p_sm->p_log); - - return status; -} - -ib_api_status_t osm_sm_mcgrp_leave(IN osm_sm_t * p_sm, IN osm_mgrp_t *mgrp, - IN const ib_net64_t port_guid) -{ - osm_port_t *p_port; - ib_api_status_t status = IB_SUCCESS; - - OSM_LOG_ENTER(p_sm->p_log); - - OSM_LOG(p_sm->p_log, OSM_LOG_VERBOSE, - "Port 0x%" PRIx64 " leaving MLID 0x%X\n", - cl_ntoh64(port_guid), cl_ntoh16(mgrp->mlid)); - - /* - * Acquire the port object for the port leaving this group. 
- */ - CL_PLOCK_EXCL_ACQUIRE(p_sm->p_lock); - - p_port = osm_get_port_by_guid(p_sm->p_subn, port_guid); - if (!p_port) { - OSM_LOG(p_sm->p_log, OSM_LOG_ERROR, "ERR 2E04: " - "No port object for port 0x%" PRIx64 "\n", - cl_ntoh64(port_guid)); - status = IB_INVALID_PARAMETER; - goto Exit; - } - - osm_port_remove_mgrp(p_port, mgrp); - - request_mlid(p_sm, cl_hton16(mgrp->mlid)); -Exit: - CL_PLOCK_RELEASE(p_sm->p_lock); - OSM_LOG_EXIT(p_sm->p_log); - - return status; + OSM_LOG(sm->p_log, OSM_LOG_DEBUG, "rerouting requested for MLID 0x%x\n", + mlid + IB_LID_MCAST_START_HO); } void osm_set_sm_priority(osm_sm_t * sm, uint8_t priority) -- 1.6.5.rc1 From sashak at voltaire.com Fri Sep 18 08:15:22 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Fri, 18 Sep 2009 18:15:22 +0300 Subject: [ofa-general] [PATCH] opensm/multicast: merge mcm_port and mcm_info In-Reply-To: <20090918141039.GA13667@me> References: <20090907154747.GO25241@me> <20090915103129.GM17481@me> <20090918141039.GA13667@me> Message-ID: <20090918151522.GB13667@me> Merge the osm_mcm_port (mgrp's list of joined ports) and osm_mcm_info (port's list of multicast groups it has joined) structures into a single one - we need both equally, and merging simplifies the allocation and cleanup mechanisms dramatically.
As "side effect" it also fixes non-member re-join bug (reported by Eli couple of months ago) "automatically": http://lists.openfabrics.org/pipermail/general/2009-May/059644.html Signed-off-by: Sasha Khapyorsky --- opensm/include/opensm/osm_mcm_port.h | 18 ++++++-- opensm/include/opensm/osm_port.h | 81 ---------------------------------- opensm/opensm/Makefile.am | 3 +- opensm/opensm/osm_drop_mgr.c | 13 ++--- opensm/opensm/osm_mcm_port.c | 8 ++- opensm/opensm/osm_multicast.c | 38 +++++++++++----- opensm/opensm/osm_port.c | 59 ------------------------ opensm/opensm/osm_sm.c | 1 - 8 files changed, 51 insertions(+), 170 deletions(-) diff --git a/opensm/include/opensm/osm_mcm_port.h b/opensm/include/opensm/osm_mcm_port.h index 74b6615..4d82df7 100644 --- a/opensm/include/opensm/osm_mcm_port.h +++ b/opensm/include/opensm/osm_mcm_port.h @@ -57,6 +57,9 @@ #endif /* __cplusplus */ BEGIN_C_DECLS + +struct osm_mgrp; + /****s* OpenSM: MCM Port Object/osm_mcm_port_t * NAME * osm_mcm_port_t @@ -72,7 +75,9 @@ BEGIN_C_DECLS */ typedef struct osm_mcm_port { cl_map_item_t map_item; + cl_list_item_t list_item; osm_port_t *port; + struct osm_mgrp *mgrp; ib_gid_t port_gid; uint8_t scope_state; boolean_t proxy_join; @@ -85,6 +90,9 @@ typedef struct osm_mcm_port { * port * Reference to the parent port. * +* mgrp +* The pointer to multicast group where this port is member of +* * port_gid * GID of the member port. * @@ -111,18 +119,20 @@ typedef struct osm_mcm_port { * * SYNOPSIS */ -osm_mcm_port_t *osm_mcm_port_new(IN osm_port_t * port, IN ib_member_rec_t *mcmr, - IN boolean_t proxy_join); +osm_mcm_port_t *osm_mcm_port_new(IN osm_port_t * port, IN struct osm_mgrp *mgrp, + IN ib_member_rec_t *mcmr, IN boolean_t proxy); /* * PARAMETERS * port * [in] Pointer to the port object. -* GID of the port to add to the multicast group. +* +* mgrp +* [in] Pointer to multicast group where this port is joined. 
* * mcmr * [in] Pointer to MCMember record of the join request * -* proxy_join +* proxy * [in] proxy_join state analyzed from the request * * RETURN VALUES diff --git a/opensm/include/opensm/osm_port.h b/opensm/include/opensm/osm_port.h index 0e0d3d2..000e2fe 100644 --- a/opensm/include/opensm/osm_port.h +++ b/opensm/include/opensm/osm_port.h @@ -1411,87 +1411,6 @@ osm_get_port_by_base_lid(IN const osm_subn_t * const p_subn, * Port *********/ -/****f* OpenSM: Port/osm_port_add_mgrp -* NAME -* osm_port_add_mgrp -* -* DESCRIPTION -* Logically connects a port to a multicast group. -* -* SYNOPSIS -*/ -ib_api_status_t -osm_port_add_mgrp(IN osm_port_t * const p_port, IN struct osm_mgrp *mgrp); -/* -* PARAMETERS -* p_port -* [in] Pointer to an osm_port_t object. -* -* mgrp -* [in] Pointer to the multicast group. -* -* RETURN VALUES -* IB_SUCCESS -* IB_INSUFFICIENT_MEMORY -* -* NOTES -* -* SEE ALSO -* Port object -*********/ - -/****f* OpenSM: Port/osm_port_remove_mgrp -* NAME -* osm_port_remove_mgrp -* -* DESCRIPTION -* Logically disconnects a port from a multicast group. -* -* SYNOPSIS -*/ -void -osm_port_remove_mgrp(IN osm_port_t * const p_port, IN struct osm_mgrp *mgrp); -/* -* PARAMETERS -* p_port -* [in] Pointer to an osm_port_t object. -* -* mgrp -* [in] Pointer to the multicast group. -* -* RETURN VALUES -* None. -* -* NOTES -* -* SEE ALSO -* Port object -*********/ - -/****f* OpenSM: Port/osm_port_remove_all_mgrp -* NAME -* osm_port_remove_all_mgrp -* -* DESCRIPTION -* Logically disconnects a port from all its multicast groups. -* -* SYNOPSIS -*/ -void osm_port_remove_all_mgrp(IN osm_port_t * const p_port); -/* -* PARAMETERS -* p_port -* [in] Pointer to an osm_port_t object. -* -* RETURN VALUES -* None. 
-* -* NOTES -* -* SEE ALSO -* Port object -*********/ - /****f* OpenSM: Physical Port/osm_physp_calc_link_mtu * NAME * osm_physp_calc_link_mtu diff --git a/opensm/opensm/Makefile.am b/opensm/opensm/Makefile.am index 2d67a95..db7d790 100644 --- a/opensm/opensm/Makefile.am +++ b/opensm/opensm/Makefile.am @@ -31,7 +31,7 @@ opensm_SOURCES = main.c osm_console_io.c osm_console.c osm_db_files.c \ osm_db_pack.c osm_drop_mgr.c \ osm_inform.c osm_lid_mgr.c osm_lin_fwd_rcv.c \ osm_link_mgr.c osm_mcast_fwd_rcv.c \ - osm_mcast_mgr.c osm_mcast_tbl.c osm_mcm_info.c \ + osm_mcast_mgr.c osm_mcast_tbl.c \ osm_mcm_port.c osm_mesh.c osm_mtree.c osm_multicast.c osm_node.c \ osm_node_desc_rcv.c osm_node_info_rcv.c \ osm_opensm.c osm_pkey.c osm_pkey_mgr.c osm_pkey_rcv.c \ @@ -83,7 +83,6 @@ opensminclude_HEADERS = \ $(srcdir)/../include/opensm/osm_mad_pool.h \ $(srcdir)/../include/opensm/osm_madw.h \ $(srcdir)/../include/opensm/osm_mcast_tbl.h \ - $(srcdir)/../include/opensm/osm_mcm_info.h \ $(srcdir)/../include/opensm/osm_mcm_port.h \ $(srcdir)/../include/opensm/osm_mesh.h \ $(srcdir)/../include/opensm/osm_mtree.h \ diff --git a/opensm/opensm/osm_drop_mgr.c b/opensm/opensm/osm_drop_mgr.c index 4891bb8..4f98cc9 100644 --- a/opensm/opensm/osm_drop_mgr.c +++ b/opensm/opensm/osm_drop_mgr.c @@ -57,7 +57,6 @@ #include #include #include -#include #include #include #include @@ -157,7 +156,7 @@ static void drop_mgr_remove_port(osm_sm_t * sm, IN osm_port_t * p_port) ib_net64_t port_guid; osm_port_t *p_port_check; cl_qmap_t *p_sm_guid_tbl; - osm_mcm_info_t *p_mcm; + osm_mcm_port_t *mcm_port; cl_ptr_vector_t *p_port_lid_tbl; uint16_t min_lid_ho; uint16_t max_lid_ho; @@ -209,13 +208,11 @@ static void drop_mgr_remove_port(osm_sm_t * sm, IN osm_port_t * p_port) drop_mgr_clean_physp(sm, p_port->p_physp); - p_mcm = (osm_mcm_info_t *) cl_qlist_remove_head(&p_port->mcm_list); - while (p_mcm != (osm_mcm_info_t *) cl_qlist_end(&p_port->mcm_list)) { - osm_mgrp_delete_port(sm->p_subn, sm->p_log, p_mcm->mgrp, 
+ while (!cl_is_qlist_empty(&p_port->mcm_list)) { + mcm_port = cl_item_obj(cl_qlist_head(&p_port->mcm_list), + mcm_port, list_item); + osm_mgrp_delete_port(sm->p_subn, sm->p_log, mcm_port->mgrp, p_port->guid); - osm_mcm_info_delete(p_mcm); - p_mcm = - (osm_mcm_info_t *) cl_qlist_remove_head(&p_port->mcm_list); } /* initialize the p_node - may need to get node_desc later */ diff --git a/opensm/opensm/osm_mcm_port.c b/opensm/opensm/osm_mcm_port.c index 9381bff..56065e6 100644 --- a/opensm/opensm/osm_mcm_port.c +++ b/opensm/opensm/osm_mcm_port.c @@ -47,11 +47,12 @@ #include #include #include +#include /********************************************************************** **********************************************************************/ -osm_mcm_port_t *osm_mcm_port_new(IN osm_port_t *port, IN ib_member_rec_t *mcmr, - IN boolean_t proxy_join) +osm_mcm_port_t *osm_mcm_port_new(IN osm_port_t *port, IN osm_mgrp_t *mgrp, + IN ib_member_rec_t *mcmr, IN boolean_t proxy) { osm_mcm_port_t *p_mcm; @@ -59,9 +60,10 @@ osm_mcm_port_t *osm_mcm_port_new(IN osm_port_t *port, IN ib_member_rec_t *mcmr, if (p_mcm) { memset(p_mcm, 0, sizeof(*p_mcm)); p_mcm->port = port; + p_mcm->mgrp = mgrp; p_mcm->port_gid = mcmr->port_gid; p_mcm->scope_state = mcmr->scope_state; - p_mcm->proxy_join = proxy_join; + p_mcm->proxy_join = proxy; } return (p_mcm); diff --git a/opensm/opensm/osm_multicast.c b/opensm/opensm/osm_multicast.c index b03af48..e775f02 100644 --- a/opensm/opensm/osm_multicast.c +++ b/opensm/opensm/osm_multicast.c @@ -50,7 +50,6 @@ #include #include #include -#include /********************************************************************** **********************************************************************/ @@ -95,11 +94,28 @@ osm_mgrp_t *osm_mgrp_new(IN const ib_net16_t mlid) void osm_mgrp_cleanup(osm_subn_t * subn, osm_mgrp_t * mgrp) { - if (mgrp->full_members || mgrp->well_known) + osm_mcm_port_t *mcm_port; + + if (mgrp->full_members) return; - 
subn->mgroups[cl_ntoh16(mgrp->mlid) - IB_LID_MCAST_START_HO] = NULL; + + osm_mtree_destroy(mgrp->p_root); + mgrp->p_root = NULL; + + while (cl_qmap_count(&mgrp->mcm_port_tbl)) { + mcm_port = (osm_mcm_port_t *)cl_qmap_head(&mgrp->mcm_port_tbl); + cl_qmap_remove_item(&mgrp->mcm_port_tbl, &mcm_port->map_item); + cl_qlist_remove_item(&mcm_port->port->mcm_list, + &mcm_port->list_item); + osm_mcm_port_delete(mcm_port); + } + + if (mgrp->well_known) + return; + cl_fmap_remove_item(&subn->mgrp_mgid_tbl, &mgrp->map_item); - osm_mgrp_delete(mgrp); + subn->mgroups[cl_ntoh16(mgrp->mlid) - IB_LID_MCAST_START_HO] = NULL; + free(mgrp); } /********************************************************************** @@ -151,7 +167,7 @@ osm_mcm_port_t *osm_mgrp_add_port(IN osm_subn_t * subn, osm_log_t * log, cl_ntoh16(mgrp->mlid)); } - mcm_port = osm_mcm_port_new(port, mcmr, proxy); + mcm_port = osm_mcm_port_new(port, mgrp, mcmr, proxy); if (!mcm_port) return NULL; @@ -176,10 +192,7 @@ osm_mcm_port_t *osm_mgrp_add_port(IN osm_subn_t * subn, osm_log_t * log, ib_member_set_scope_state(prev_scope, prev_join_state | join_state); } else { - if (osm_port_add_mgrp(port, mgrp)) { - osm_mcm_port_delete(mcm_port); - return NULL; - } + cl_qlist_insert_tail(&port->mcm_list, &mcm_port->list_item); osm_sm_reroute_mlid(&subn->p_osm->sm, mgrp->mlid); } @@ -230,10 +243,11 @@ void osm_mgrp_remove_port(osm_subn_t * subn, osm_log_t * log, osm_mgrp_t * mgrp, mcmr->scope_state = mcm_port->scope_state; } else { mcmr->scope_state = mcm_port->scope_state; - cl_qmap_remove_item(&mgrp->mcm_port_tbl, &mcm_port->map_item); OSM_LOG(log, OSM_LOG_DEBUG, "removing port 0x%" PRIx64 "\n", cl_ntoh64(mcm_port->port->guid)); - osm_port_remove_mgrp(mcm_port->port, mgrp); + cl_qmap_remove_item(&mgrp->mcm_port_tbl, &mcm_port->map_item); + cl_qlist_remove_item(&mcm_port->port->mcm_list, + &mcm_port->list_item); osm_mcm_port_delete(mcm_port); osm_sm_reroute_mlid(&subn->p_osm->sm, mgrp->mlid); } @@ -255,8 +269,8 @@ void 
osm_mgrp_delete_port(osm_subn_t * subn, osm_log_t * log, osm_mgrp_t * mgrp, mcmrec.scope_state = 0xf; osm_mgrp_remove_port(subn, log, mgrp, (osm_mcm_port_t *) item, &mcmrec); + osm_mgrp_cleanup(subn, mgrp); } - osm_mgrp_cleanup(subn, mgrp); } /********************************************************************** diff --git a/opensm/opensm/osm_port.c b/opensm/opensm/osm_port.c index 3470381..dc8a768 100644 --- a/opensm/opensm/osm_port.c +++ b/opensm/opensm/osm_port.c @@ -51,7 +51,6 @@ #include #include #include -#include #include /********************************************************************** @@ -132,8 +131,6 @@ void osm_physp_init(IN osm_physp_t * p_physp, IN const ib_net64_t port_guid, **********************************************************************/ void osm_port_delete(IN OUT osm_port_t ** pp_port) { - /* cleanup all mcm recs attached */ - osm_port_remove_all_mgrp(*pp_port); free(*pp_port); *pp_port = NULL; } @@ -223,62 +220,6 @@ Found: /********************************************************************** **********************************************************************/ -ib_api_status_t osm_port_add_mgrp(IN osm_port_t * p_port, IN osm_mgrp_t *mgrp) -{ - ib_api_status_t status = IB_SUCCESS; - osm_mcm_info_t *p_mcm; - - p_mcm = osm_mcm_info_new(mgrp); - if (p_mcm) - cl_qlist_insert_tail(&p_port->mcm_list, - (cl_list_item_t *) p_mcm); - else - status = IB_INSUFFICIENT_MEMORY; - - return status; -} - -/********************************************************************** - **********************************************************************/ -static cl_status_t port_mgrp_find_func(IN const cl_list_item_t * p_list_item, - IN void *context) -{ - if (context == ((osm_mcm_info_t *) p_list_item)->mgrp) - return CL_SUCCESS; - else - return CL_NOT_FOUND; -} - -/********************************************************************** - **********************************************************************/ -void osm_port_remove_mgrp(IN osm_port_t * 
p_port, IN osm_mgrp_t *mgrp) -{ - cl_list_item_t *p_mcm; - - p_mcm = cl_qlist_find_from_head(&p_port->mcm_list, port_mgrp_find_func, - mgrp); - - if (p_mcm != cl_qlist_end(&p_port->mcm_list)) { - cl_qlist_remove_item(&p_port->mcm_list, p_mcm); - osm_mcm_info_delete((osm_mcm_info_t *) p_mcm); - } -} - -/********************************************************************** - **********************************************************************/ -void osm_port_remove_all_mgrp(IN osm_port_t * p_port) -{ - cl_list_item_t *p_mcm; - - p_mcm = cl_qlist_remove_head(&p_port->mcm_list); - while (p_mcm != cl_qlist_end(&p_port->mcm_list)) { - osm_mcm_info_delete((osm_mcm_info_t *) p_mcm); - p_mcm = cl_qlist_remove_head(&p_port->mcm_list); - } -} - -/********************************************************************** - **********************************************************************/ uint8_t osm_physp_calc_link_mtu(IN osm_log_t * p_log, IN const osm_physp_t * p_physp) { diff --git a/opensm/opensm/osm_sm.c b/opensm/opensm/osm_sm.c index 88f5ebe..6eaee0d 100644 --- a/opensm/opensm/osm_sm.c +++ b/opensm/opensm/osm_sm.c @@ -57,7 +57,6 @@ #include #include #include -#include #include #include -- 1.6.5.rc1 From hnrose at comcast.net Fri Sep 18 07:31:28 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Fri, 18 Sep 2009 10:31:28 -0400 Subject: [ofa-general] [PATCH] opensm/osm_perfmgr_db.c: Fix memory leak of db nodes Message-ID: <20090918143128.GA23618@comcast.net> Signed-off-by: Hal Rosenstock --- diff --git a/opensm/opensm/osm_perfmgr_db.c b/opensm/opensm/osm_perfmgr_db.c index e5dfc19..329743a 100644 --- a/opensm/opensm/osm_perfmgr_db.c +++ b/opensm/opensm/osm_perfmgr_db.c @@ -49,6 +49,8 @@ #include #include +static void free_node(db_node_t * node); + /** ========================================================================= */ perfmgr_db_t *perfmgr_db_construct(osm_perfmgr_t *perfmgr) @@ -68,7 +70,16 @@ perfmgr_db_t *perfmgr_db_construct(osm_perfmgr_t *perfmgr) 
*/ void perfmgr_db_destroy(perfmgr_db_t * db) { + cl_map_item_t *item; + db_node_t *node; + if (db) { + item = cl_qmap_head(&db->pc_data); + while (item != cl_qmap_end(&db->pc_data)) { + node = (db_node_t *)item; + free_node(node); + item = cl_qmap_next(item); + } cl_plock_destroy(&db->lock); free(db); } From hnrose at comcast.net Fri Sep 18 06:32:16 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Fri, 18 Sep 2009 09:32:16 -0400 Subject: [ofa-general] [PATCH] ibsim/sim_cmd.c: Cosmetic change to error message Message-ID: <20090918133216.GA17787@comcast.net> Signed-off-by: Hal Rosenstock --- diff --git a/ibsim/sim_cmd.c b/ibsim/sim_cmd.c index cb6e639..6d3a893 100644 --- a/ibsim/sim_cmd.c +++ b/ibsim/sim_cmd.c @@ -295,7 +295,7 @@ static int do_seterror(FILE * f, char *line) orig = strsep(&s, "\""); if (!s) { - fprintf(f, "# unlink: bad parameter in \"%s\"\n", line); + fprintf(f, "# set error: bad parameter in \"%s\"\n", line); return -1; } From robyf at tekno-soft.it Fri Sep 18 13:41:35 2009 From: robyf at tekno-soft.it (Roberto Fichera) Date: Fri, 18 Sep 2009 22:41:35 +0200 (CEST) Subject: [ofa-general] Building IB SAN with Linux without switch Message-ID: <432fbe81a097b0082ea54201f1cc3ec5.squirrel@webmail.tekno-soft.it> Hi All in the list, I would like to know if it's possible to configure a linux server with 2 or 3 HCAs, with 2 ports each, so that I can connect 4 or 6 nodes without using any switch in the middle. If possible, please show an example of the network configuration. Please CC me since I'm not subscribed in the list. Thanks in advance, Roberto Fichera. 
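Editorial note on the osm_perfmgr_db.c patch above: the new loop in perfmgr_db_destroy() calls free_node(node) and only afterwards calls cl_qmap_next(item), where item and node point at the same object. The hunk only forward-declares free_node(), so this is an assumption, but if free_node() releases the node itself (which a leak fix implies), the next pointer is read from freed memory. The usual remedy is to fetch the successor before freeing. The sketch below shows that ordering on a plain singly linked list; struct node, push_node() and destroy_all() are hypothetical stand-ins, not complib or OpenSM APIs.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for db_node_t: the link is the first member,
 * much as a map item would be in the real structure. */
struct node {
	struct node *next;
	int *ports;
};

/* Releases the payload and the node itself (assumed behaviour of the
 * real free_node(), which is not shown in the patch). */
static void free_node(struct node *n)
{
	if (!n)
		return;
	free(n->ports);
	free(n);
}

/* Pushes a freshly allocated node on the front of the list and returns
 * the new head; on allocation failure the old head is returned. */
static struct node *push_node(struct node *head)
{
	struct node *n = calloc(1, sizeof(*n));

	if (!n)
		return head;
	n->ports = calloc(4, sizeof(*n->ports));
	n->next = head;
	return n;
}

/* Frees every node and returns how many were released. */
static size_t destroy_all(struct node **head)
{
	struct node *item = *head;
	size_t freed = 0;

	while (item) {
		/* Read the link first: item is invalid after free_node(). */
		struct node *next = item->next;

		free_node(item);
		item = next;
		freed++;
	}
	*head = NULL;
	return freed;
}
```

Applied to the patch, the same ordering would mean moving the cl_qmap_next(item) call ahead of free_node(node) inside the while loop, which leaves the number of added lines in the hunk unchanged.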
From worleys at gmail.com Fri Sep 18 14:31:49 2009 From: worleys at gmail.com (Chris Worley) Date: Fri, 18 Sep 2009 15:31:49 -0600 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: <4AA4F561.504@vlnb.net> References: <4A9FA945.4070408@vlnb.net> <4AA4F561.504@vlnb.net> Message-ID: On Mon, Sep 7, 2009 at 5:58 AM, Vladislav Bolkhovitin wrote: > Chris Worley, on 09/06/2009 05:41 PM wrote: >> >> On Sun, Sep 6, 2009 at 3:36 PM, Chris Worley wrote: >>> >>> On Sun, Sep 6, 2009 at 3:17 PM, Bart Van Assche >>> wrote: >>>> >>>> On Fri, Sep 4, 2009 at 1:20 AM, Chris Worley wrote: >>>>> >>>>> On Thu, Sep 3, 2009 at 11:38 AM, Chris Worley wrote: >>>>>> >>>>>> I've used a couple of initiators (different systems) w/ different >>>>>> OSes, w/ different IB cards (all QDR) and different IB stacks >>>>>> (built-in vs. OFED) and can repeat the problem in all but the >>>>>> RHEL5.2/OFED 1.4.1 target and initiator (but, if the initiator is >>>>>> WinOF and the target is RHEL5.2/OFED1.4.1, then the problem does >>>>>> repeat). >>>>> >>>>> Here's a twist: I used the Ubuntu initiator w/ one of the RHEL >>>>> targets, and the RHEL initiator (same machine as was running WinOF >>>>> from the beginning of this thread) w/ one of the Ubuntu targets: in >>>>> both cases, the problem does not repeat. >>>>> >>>>> That makes it sound like OFED is the cure on either side of the >>>>> connection, but does not explain the issue w/ WinOF (which does fail >>>>> w/ either Ununtu or RHEL targets). >>>> >>>> These results are strange. Regarding the Linux-only tests, I was >>>> assuming failure of a single component (Ubuntu SRP initiator, OFED SRP >>>> initiator, Ubuntu IB driver, OFED IB driver or SRP target), but for >>>> each of these components there is at least one test that passes and at >>>> least one test that fails. So either my assumption is wrong or one of >>>> the above test results is not repeatable. 
Do you have the time to >>>> repeat the Linux-only tests ? >>> >>> Last night I was rerunning the RHEL5.2 initiator w/ Ubuntu client, and >>> the problem repeated; now, I can't repeat the case where it didn't >>> fail.  Still, no errors, other than the eventual timeouts previously >>> shown; the target thinks all is fine, the initiator is stuck. >> >> ... and I haven't had any success w/ Ubuntu target and initiator, 8.10 or >> 9.04. > > 1. Try with kernel parameter maxcpus=1. It will somehow relax possible races > you have, although not completely. I finally got around to this test... 1 CPU works very well, w/o hangs (will test all night to see if this holds true), 2 or more don't. This is dual-socket NHM, so I can't specify more than one processor w/o getting more than one socket. Chris > > 2. Try with another hardware, including motherboard. You can have something > like http://lkml.org/lkml/2007/7/31/558 (not exactly it, of course) > >> Chris >>> >>> Chris >>>> >>>> Bart. >>>> > > From worleys at gmail.com Fri Sep 18 14:33:15 2009 From: worleys at gmail.com (Chris Worley) Date: Fri, 18 Sep 2009 15:33:15 -0600 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4A9FA945.4070408@vlnb.net> <4AA4F561.504@vlnb.net> Message-ID: On Fri, Sep 18, 2009 at 3:31 PM, Chris Worley wrote: > On Mon, Sep 7, 2009 at 5:58 AM, Vladislav Bolkhovitin wrote: >> Chris Worley, on 09/06/2009 05:41 PM wrote: >>> >>> On Sun, Sep 6, 2009 at 3:36 PM, Chris Worley wrote: >>>> >>>> On Sun, Sep 6, 2009 at 3:17 PM, Bart Van Assche >>>> wrote: >>>>> >>>>> On Fri, Sep 4, 2009 at 1:20 AM, Chris Worley wrote: >>>>>> >>>>>> On Thu, Sep 3, 2009 at 11:38 AM, Chris Worley wrote: >>>>>>> >>>>>>> I've used a couple of initiators (different systems) w/ different >>>>>>> OSes, w/ different IB cards (all QDR) and different IB stacks >>>>>>> (built-in vs. 
OFED) and can repeat the problem in all but the >>>>>>> RHEL5.2/OFED 1.4.1 target and initiator (but, if the initiator is >>>>>>> WinOF and the target is RHEL5.2/OFED1.4.1, then the problem does >>>>>>> repeat). >>>>>> >>>>>> Here's a twist: I used the Ubuntu initiator w/ one of the RHEL >>>>>> targets, and the RHEL initiator (same machine as was running WinOF >>>>>> from the beginning of this thread) w/ one of the Ubuntu targets: in >>>>>> both cases, the problem does not repeat. >>>>>> >>>>>> That makes it sound like OFED is the cure on either side of the >>>>>> connection, but does not explain the issue w/ WinOF (which does fail >>>>>> w/ either Ununtu or RHEL targets). >>>>> >>>>> These results are strange. Regarding the Linux-only tests, I was >>>>> assuming failure of a single component (Ubuntu SRP initiator, OFED SRP >>>>> initiator, Ubuntu IB driver, OFED IB driver or SRP target), but for >>>>> each of these components there is at least one test that passes and at >>>>> least one test that fails. So either my assumption is wrong or one of >>>>> the above test results is not repeatable. Do you have the time to >>>>> repeat the Linux-only tests ? >>>> >>>> Last night I was rerunning the RHEL5.2 initiator w/ Ubuntu client, and >>>> the problem repeated; now, I can't repeat the case where it didn't >>>> fail.  Still, no errors, other than the eventual timeouts previously >>>> shown; the target thinks all is fine, the initiator is stuck. >>> >>> ... and I haven't had any success w/ Ubuntu target and initiator, 8.10 or >>> 9.04. >> >> 1. Try with kernel parameter maxcpus=1. It will somehow relax possible races >> you have, although not completely. > > I finally got around to this test... 1 CPU works very well, w/o hangs > (will test all night to see if this holds true), 2 or more don't. > This is dual-socket NHM, so I can't specify more than one processor > w/o getting more than one socket. 
I don't know if this is important, but 1KB block tests didn't have a problem w/ 2 or 4 maxcpus... they didn't hang until 2KB blocks: fio --rw=randrw --bs=2k --rwmixread=100 --numjobs=64 --iodepth=64 --sync=0 --direct=1 --randrepeat=0 --ioengine=libaio --filename=/dev/sdb --filename=/dev/sdc --name=test --loops=10000 --size=32183006002 --runtime=600 --group_reporting Chris > > Chris >> >> 2. Try with another hardware, including motherboard. You can have something >> like http://lkml.org/lkml/2007/7/31/558 (not exactly it, of course) >> >>> Chris >>>> >>>> Chris >>>>> >>>>> Bart. >>>>> >> >> > From weiny2 at llnl.gov Fri Sep 18 14:32:16 2009 From: weiny2 at llnl.gov (Ira Weiny) Date: Fri, 18 Sep 2009 14:32:16 -0700 Subject: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm In-Reply-To: References: <20090917101804.12e9e5ce.weiny2@llnl.gov> <20090917132050.041b077d.weiny2@llnl.gov> <95DE322F4F2A423198FC248FDF534C8D@amr.corp.intel.com> Message-ID: <20090918143216.47fa8d6f.weiny2@llnl.gov> On Thu, 17 Sep 2009 23:09:22 -0700 "Sean Hefty" wrote: > >Although not a fit IMO, the pragmatic solution is to move ib_types,h into > >libibumad. I think it is better there than OpenSM which was never quite right > >either. That can at least start to eliminate the duplications in this area. > > ib_types.h includes complib header files... > Rough hack. Does windows have stdint.h, byteswap.h, and endian.h? To do this right I would remove the "IN" and "OUT" stuff. Ira From: Ira Weiny Date: Fri, 18 Sep 2009 14:29:12 -0700 Subject: [PATCH] Quick hack... 
Remove CL compatiblity from ib_types.h use stdint.h convert boolean_t typedef int boolean_t FALSE == 0 TRUE == 1 use byteswap.h and endian.h and Copy byte swap macros define "IN", "OUT", and "OPTIONAL" Signed-off-by: Ira Weiny --- opensm/include/iba/ib_types.h | 277 ++++++++++++++++++++++++++-------------- 1 files changed, 180 insertions(+), 97 deletions(-) diff --git a/opensm/include/iba/ib_types.h b/opensm/include/iba/ib_types.h index c9d81cb..c0ce5cd 100644 --- a/opensm/include/iba/ib_types.h +++ b/opensm/include/iba/ib_types.h @@ -38,8 +38,9 @@ #define __IB_TYPES_H__ #include -#include -#include +#include +#include +#include #ifdef __cplusplus # define BEGIN_C_DECLS extern "C" { @@ -49,6 +50,88 @@ # define END_C_DECLS #endif /* __cplusplus */ +/** ========================================================================= + * complib stuff + */ +#ifndef IN +#define IN /* Function input parameter */ +#endif +#ifndef OUT +#define OUT /* Function output parameter */ +#endif +#ifndef OPTIONAL +#define OPTIONAL /* Optional function parameter - NULL if not used */ +#endif + +#ifndef __BYTE_ORDER +#error "__BYTE_ORDER macro undefined. Missing in endian.h?" 
+#endif + +/* 16bit */ +#if __BYTE_ORDER == __LITTLE_ENDIAN +#define CL_NTOH16( x ) (uint16_t)( \ + (((uint16_t)(x) & 0x00FF) << 8) | \ + (((uint16_t)(x) & 0xFF00) >> 8) ) +#else +#define CL_NTOH16( x ) (x) +#endif +#define CL_HTON16 CL_NTOH16 + +/* 32bit */ +#if __BYTE_ORDER == __LITTLE_ENDIAN +#define CL_NTOH32( x ) (uint32_t)( \ + (((uint32_t)(x) & 0x000000FF) << 24) | \ + (((uint32_t)(x) & 0x0000FF00) << 8) | \ + (((uint32_t)(x) & 0x00FF0000) >> 8) | \ + (((uint32_t)(x) & 0xFF000000) >> 24) ) +#else +#define CL_NTOH32( x ) (x) +#endif +#define CL_HTON32 CL_NTOH32 + +/* 64bit */ +#if __BYTE_ORDER == __LITTLE_ENDIAN +#define CL_NTOH64( x ) (uint64_t)( \ + (((uint64_t)(x) & 0x00000000000000FFULL) << 56) | \ + (((uint64_t)(x) & 0x000000000000FF00ULL) << 40) | \ + (((uint64_t)(x) & 0x0000000000FF0000ULL) << 24) | \ + (((uint64_t)(x) & 0x00000000FF000000ULL) << 8 ) | \ + (((uint64_t)(x) & 0x000000FF00000000ULL) >> 8 ) | \ + (((uint64_t)(x) & 0x0000FF0000000000ULL) >> 24) | \ + (((uint64_t)(x) & 0x00FF000000000000ULL) >> 40) | \ + (((uint64_t)(x) & 0xFF00000000000000ULL) >> 56) ) +#else +#define CL_NTOH64( x ) (x) +#endif +#define CL_HTON64 CL_NTOH64 + +#if __BYTE_ORDER == __LITTLE_ENDIAN +#define cl_ntoh16(x) bswap_16(x) +#define cl_hton16(x) bswap_16(x) +#define cl_ntoh32(x) bswap_32(x) +#define cl_hton32(x) bswap_32(x) +#define cl_ntoh64(x) (uint64_t)bswap_64(x) +#define cl_hton64(x) (uint64_t)bswap_64(x) +#else /* Big Endian */ +#define cl_ntoh16(x) (x) +#define cl_hton16(x) (x) +#define cl_ntoh32(x) (x) +#define cl_hton32(x) (x) +#define cl_ntoh64(x) (x) +#define cl_hton64(x) (x) +#endif + +#if defined (_DEBUG_) +#include +#define CL_ASSERT assert +#else /* _DEBUG_ */ +#define CL_ASSERT( __exp__ ) +#endif /* _DEBUG_ */ + +/** ========================================================================= + * end complib stuff + */ + BEGIN_C_DECLS #if defined( WIN32 ) || defined( _WIN64 ) #if defined( EXPORT_AL_SYMBOLS ) @@ -564,7 +647,7 @@ BEGIN_C_DECLS * * SYNOPSIS */ 
-static inline boolean_t OSM_API +static inline int OSM_API ib_class_is_vendor_specific_low(IN const uint8_t class_code) { return ((class_code >= IB_MCLASS_VENDOR_LOW_RANGE_MIN) && @@ -577,8 +660,8 @@ ib_class_is_vendor_specific_low(IN const uint8_t class_code) * [in] The Management Datagram Class Code * * RETURN VALUE -* TRUE if the class is in the Low range of Vendor Specific MADs -* FALSE otherwise. +* 1 if the class is in the Low range of Vendor Specific MADs +* 0 otherwise. * * NOTES * @@ -596,7 +679,7 @@ ib_class_is_vendor_specific_low(IN const uint8_t class_code) * * SYNOPSIS */ -static inline boolean_t OSM_API +static inline int OSM_API ib_class_is_vendor_specific_high(IN const uint8_t class_code) { return ((class_code >= IB_MCLASS_VENDOR_HIGH_RANGE_MIN) && @@ -609,8 +692,8 @@ ib_class_is_vendor_specific_high(IN const uint8_t class_code) * [in] The Management Datagram Class Code * * RETURN VALUE -* TRUE if the class is in the High range of Vendor Specific MADs -* FALSE otherwise. +* 1 if the class is in the High range of Vendor Specific MADs +* 0 otherwise. * * NOTES * @@ -627,7 +710,7 @@ ib_class_is_vendor_specific_high(IN const uint8_t class_code) * * SYNOPSIS */ -static inline boolean_t OSM_API +static inline int OSM_API ib_class_is_vendor_specific(IN const uint8_t class_code) { return (ib_class_is_vendor_specific_low(class_code) || @@ -640,8 +723,8 @@ ib_class_is_vendor_specific(IN const uint8_t class_code) * [in] The Management Datagram Class Code * * RETURN VALUE -* TRUE if the class is a Vendor Specific MAD -* FALSE otherwise. +* 1 if the class is a Vendor Specific MAD +* 0 otherwise. 
* * NOTES * @@ -658,7 +741,7 @@ ib_class_is_vendor_specific(IN const uint8_t class_code) * * SYNOPSIS */ -static inline boolean_t OSM_API ib_class_is_rmpp(IN const uint8_t class_code) +static inline int OSM_API ib_class_is_rmpp(IN const uint8_t class_code) { return ((class_code == IB_MCLASS_SUBN_ADM) || (class_code == IB_MCLASS_DEV_MGMT) || @@ -673,8 +756,8 @@ static inline boolean_t OSM_API ib_class_is_rmpp(IN const uint8_t class_code) * [in] The Management Datagram Class Code * * RETURN VALUE -* TRUE if the class supports RMPP -* FALSE otherwise. +* 1 if the class supports RMPP +* 0 otherwise. * * NOTES * @@ -2077,7 +2160,7 @@ static inline ib_net16_t OSM_API ib_pkey_get_base(IN const ib_net16_t pkey) * * SYNOPSIS */ -static inline boolean_t OSM_API ib_pkey_is_full_member(IN const ib_net16_t pkey) +static inline int OSM_API ib_pkey_is_full_member(IN const ib_net16_t pkey) { return ((pkey & IB_PKEY_TYPE_MASK) == IB_PKEY_TYPE_MASK); } @@ -2088,8 +2171,8 @@ static inline boolean_t OSM_API ib_pkey_is_full_member(IN const ib_net16_t pkey) * [in] P_Key value * * RETURN VALUE -* TRUE if the port is a full member of the partition. -* FALSE otherwise. +* 1 if the port is a full member of the partition. +* 0 otherwise. * * NOTES * @@ -2102,15 +2185,15 @@ static inline boolean_t OSM_API ib_pkey_is_full_member(IN const ib_net16_t pkey) * ib_pkey_is_invalid * * DESCRIPTION -* Returns TRUE if the given P_Key is an invalid P_Key +* Returns 1 if the given P_Key is an invalid P_Key * C10-116: the CI shall regard a P_Key as invalid if its low-order * 15 bits are all zero... * * SYNOPSIS */ -static inline boolean_t OSM_API ib_pkey_is_invalid(IN const ib_net16_t pkey) +static inline int OSM_API ib_pkey_is_invalid(IN const ib_net16_t pkey) { - return ib_pkey_get_base(pkey) == 0x0000 ? TRUE : FALSE; + return ib_pkey_get_base(pkey) == 0x0000 ? 
1 : 0; } /* @@ -2167,11 +2250,11 @@ typedef union _ib_gid { * ib_gid_is_multicast * * DESCRIPTION -* Returns a boolean indicating whether a GID is a multicast GID. +* Returns a int indicating whether a GID is a multicast GID. * * SYNOPSIS */ -static inline boolean_t OSM_API ib_gid_is_multicast(IN const ib_gid_t * p_gid) +static inline int OSM_API ib_gid_is_multicast(IN const ib_gid_t * p_gid) { return (p_gid->raw[0] == 0xFF); } @@ -2273,12 +2356,12 @@ ib_gid_get_subnet_prefix(IN const ib_gid_t * const p_gid) * ib_gid_is_link_local * * DESCRIPTION -* Returns TRUE if the unicast GID scoping indicates link local, -* FALSE otherwise. +* Returns 1 if the unicast GID scoping indicates link local, +* 0 otherwise. * * SYNOPSIS */ -static inline boolean_t OSM_API +static inline int OSM_API ib_gid_is_link_local(IN const ib_gid_t * const p_gid) { return ((ib_gid_get_subnet_prefix(p_gid) & @@ -2291,8 +2374,8 @@ ib_gid_is_link_local(IN const ib_gid_t * const p_gid) * [in] Pointer to the GID object. * * RETURN VALUES -* Returns TRUE if the unicast GID scoping indicates link local, -* FALSE otherwise. +* Returns 1 if the unicast GID scoping indicates link local, +* 0 otherwise. * * NOTES * @@ -2305,12 +2388,12 @@ ib_gid_is_link_local(IN const ib_gid_t * const p_gid) * ib_gid_is_site_local * * DESCRIPTION -* Returns TRUE if the unicast GID scoping indicates site local, -* FALSE otherwise. +* Returns 1 if the unicast GID scoping indicates site local, +* 0 otherwise. * * SYNOPSIS */ -static inline boolean_t OSM_API +static inline int OSM_API ib_gid_is_site_local(IN const ib_gid_t * const p_gid) { return ((ib_gid_get_subnet_prefix(p_gid) & @@ -2324,8 +2407,8 @@ ib_gid_is_site_local(IN const ib_gid_t * const p_gid) * [in] Pointer to the GID object. * * RETURN VALUES -* Returns TRUE if the unicast GID scoping indicates site local, -* FALSE otherwise. +* Returns 1 if the unicast GID scoping indicates site local, +* 0 otherwise. 
* * NOTES * @@ -3779,13 +3862,13 @@ ib_mad_init_response(IN const ib_mad_t * const p_req_mad, * ib_mad_is_response * * DESCRIPTION -* Returns TRUE if the MAD is a response ('R' bit set) +* Returns 1 if the MAD is a response ('R' bit set) * or if the MAD is a TRAP REPRESS, -* FALSE otherwise. +* 0 otherwise. * * SYNOPSIS */ -static inline boolean_t OSM_API +static inline int OSM_API ib_mad_is_response(IN const ib_mad_t * const p_mad) { CL_ASSERT(p_mad); @@ -3799,8 +3882,8 @@ ib_mad_is_response(IN const ib_mad_t * const p_mad) * [in] Pointer to the MAD. * * RETURN VALUES -* Returns TRUE if the MAD is a response ('R' bit set), -* FALSE otherwise. +* Returns 1 if the MAD is a response ('R' bit set), +* 0 otherwise. * * NOTES * @@ -3836,11 +3919,11 @@ ib_mad_is_response(IN const ib_mad_t * const p_mad) * ib_rmpp_is_flag_set * * DESCRIPTION -* Returns TRUE if the MAD has the given RMPP flag set. +* Returns 1 if the MAD has the given RMPP flag set. * * SYNOPSIS */ -static inline boolean_t OSM_API +static inline int OSM_API ib_rmpp_is_flag_set(IN const ib_rmpp_mad_t * const p_rmpp_mad, IN const uint8_t flag) { @@ -3857,7 +3940,7 @@ ib_rmpp_is_flag_set(IN const ib_rmpp_mad_t * const p_rmpp_mad, * [in] The RMPP flag being examined. * * RETURN VALUES -* Returns TRUE if the MAD has the given RMPP flag set. +* Returns 1 if the MAD has the given RMPP flag set. * * NOTES * @@ -4031,11 +4114,11 @@ ib_smp_get_status(IN const ib_smp_t * const p_smp) * ib_smp_is_response * * DESCRIPTION -* Returns TRUE if the SMP is a response MAD, FALSE otherwise. +* Returns 1 if the SMP is a response MAD, 0 otherwise. * * SYNOPSIS */ -static inline boolean_t OSM_API +static inline int OSM_API ib_smp_is_response(IN const ib_smp_t * const p_smp) { return (ib_mad_is_response((const ib_mad_t *)p_smp)); @@ -4047,7 +4130,7 @@ ib_smp_is_response(IN const ib_smp_t * const p_smp) * [in] Pointer to the SMP packet. * * RETURN VALUES -* Returns TRUE if the SMP is a response MAD, FALSE otherwise. 
+* Returns 1 if the SMP is a response MAD, 0 otherwise. * * NOTES * @@ -4060,11 +4143,11 @@ ib_smp_is_response(IN const ib_smp_t * const p_smp) * ib_smp_is_d * * DESCRIPTION -* Returns TRUE if the SMP 'D' (direction) bit is set. +* Returns 1 if the SMP 'D' (direction) bit is set. * * SYNOPSIS */ -static inline boolean_t OSM_API ib_smp_is_d(IN const ib_smp_t * const p_smp) +static inline int OSM_API ib_smp_is_d(IN const ib_smp_t * const p_smp) { return ((p_smp->status & IB_SMP_DIRECTION) == IB_SMP_DIRECTION); } @@ -4075,7 +4158,7 @@ static inline boolean_t OSM_API ib_smp_is_d(IN const ib_smp_t * const p_smp) * [in] Pointer to the SMP packet. * * RETURN VALUES -* Returns TRUE if the SMP 'D' (direction) bit is set. +* Returns 1 if the SMP 'D' (direction) bit is set. * * NOTES * @@ -4306,7 +4389,7 @@ ib_sa_mad_get_payload_ptr(IN const ib_sa_mad_t * const p_sa_mad) #define IB_NODE_INFO_PORT_NUM_MASK (CL_HTON32(0xFF000000)) #define IB_NODE_INFO_VEND_ID_MASK (CL_HTON32(0x00FFFFFF)) -#if CPU_LE +#if __BYTE_ORDER == __LITTLE_ENDIAN #define IB_NODE_INFO_PORT_NUM_SHIFT 0 #else #define IB_NODE_INFO_PORT_NUM_SHIFT 24 @@ -5924,7 +6007,7 @@ typedef struct _ib_switch_info_record { * * SYNOPSIS */ -static inline boolean_t OSM_API +static inline int OSM_API ib_switch_info_get_state_change(IN const ib_switch_info_t * const p_si) { return ((p_si->life_state & IB_SWITCH_PSC) == IB_SWITCH_PSC); @@ -5980,7 +6063,7 @@ ib_switch_info_clear_state_change(IN ib_switch_info_t * const p_si) * * SYNOPSIS */ -static inline boolean_t OSM_API +static inline int OSM_API ib_switch_info_get_opt_sl2vlmapping(IN const ib_switch_info_t * const p_si) { return ((p_si->life_state & 0x01) == 0x01); @@ -6004,13 +6087,13 @@ ib_switch_info_get_opt_sl2vlmapping(IN const ib_switch_info_t * const p_si) * ib_switch_info_is_enhanced_port0 * * DESCRIPTION -* Returns TRUE if the enhancedPort0 bit is on (meaning the switch +* Returns 1 if the enhancedPort0 bit is on (meaning the switch * port zero supports enhanced 
functions). -* Returns FALSE otherwise. +* Returns 0 otherwise. * * SYNOPSIS */ -static inline boolean_t OSM_API +static inline int OSM_API ib_switch_info_is_enhanced_port0(IN const ib_switch_info_t * const p_si) { return ((p_si->flags & 0x08) == 0x08); @@ -6022,7 +6105,7 @@ ib_switch_info_is_enhanced_port0(IN const ib_switch_info_t * const p_si) * [in] Pointer to a SwitchInfo attribute. * * RETURN VALUES -* Returns TRUE if the switch supports enhanced port 0. FALSE otherwise. +* Returns 1 if the switch supports enhanced port 0. 0 otherwise. * * NOTES * @@ -7257,7 +7340,7 @@ typedef struct _ib_mad_notice_attr // Total Size calc Accumulated * * SYNOPSIS */ -static inline boolean_t OSM_API +static inline int OSM_API ib_notice_is_generic(IN const ib_mad_notice_attr_t * p_ntc) { return (p_ntc->generic_type & 0x80); @@ -7269,7 +7352,7 @@ ib_notice_is_generic(IN const ib_mad_notice_attr_t * p_ntc) * [in] Pointer to the notice MAD attribute * * RETURN VALUES -* TRUE if mad is generic +* 1 if mad is generic * * SEE ALSO * ib_mad_notice_attr_t @@ -7296,7 +7379,7 @@ ib_notice_get_type(IN const ib_mad_notice_attr_t * p_ntc) * [in] Pointer to the notice MAD attribute * * RETURN VALUES -* TRUE if mad is generic +* 1 if mad is generic * * SEE ALSO * ib_mad_notice_attr_t @@ -8687,27 +8770,27 @@ typedef enum _ib_atomic_t { * SYNOPSIS */ typedef struct _ib_port_cap { - boolean_t cm; - boolean_t snmp; - boolean_t dev_mgmt; - boolean_t vend; - boolean_t sm; - boolean_t sm_disable; - boolean_t qkey_ctr; - boolean_t pkey_ctr; - boolean_t notice; - boolean_t trap; - boolean_t apm; - boolean_t slmap; - boolean_t pkey_nvram; - boolean_t mkey_nvram; - boolean_t sysguid; - boolean_t dr_notice; - boolean_t boot_mgmt; - boolean_t capm_notice; - boolean_t reinit; - boolean_t ledinfo; - boolean_t port_active; + int cm; + int snmp; + int dev_mgmt; + int vend; + int sm; + int sm_disable; + int qkey_ctr; + int pkey_ctr; + int notice; + int trap; + int apm; + int slmap; + int pkey_nvram; + int 
mkey_nvram; + int sysguid; + int dr_notice; + int boot_mgmt; + int capm_notice; + int reinit; + int ledinfo; + int port_active; } ib_port_cap_t; /*****/ @@ -8858,19 +8941,19 @@ typedef struct _ib_ca_attr { * timeout = 4.096 microseconds * 2^local_ack_delay */ uint8_t local_ack_delay; - boolean_t bad_pkey_ctr_support; - boolean_t bad_qkey_ctr_support; - boolean_t raw_mcast_support; - boolean_t apm_support; - boolean_t av_port_check; - boolean_t change_primary_port; - boolean_t modify_wr_depth; - boolean_t current_qp_state_support; - boolean_t shutdown_port_capability; - boolean_t init_type_support; - boolean_t port_active_event_support; - boolean_t system_image_guid_support; - boolean_t hw_agents; + int bad_pkey_ctr_support; + int bad_qkey_ctr_support; + int raw_mcast_support; + int apm_support; + int av_port_check; + int change_primary_port; + int modify_wr_depth; + int current_qp_state_support; + int shutdown_port_capability; + int init_type_support; + int port_active_event_support; + int system_image_guid_support; + int hw_agents; ib_net64_t system_image_guid; uint32_t num_page_sizes; uint8_t num_ports; @@ -9091,7 +9174,7 @@ typedef struct _ib_av_attr { uint8_t port_num; uint8_t sl; ib_net16_t dlid; - boolean_t grh_valid; + int grh_valid; ib_grh_t grh; uint8_t static_rate; uint8_t path_bits; @@ -9252,7 +9335,7 @@ typedef struct _ib_qp_create { uint32_t rq_sge; ib_cq_handle_t h_sq_cq; ib_cq_handle_t h_rq_cq; - boolean_t sq_signaled; + int sq_signaled; } ib_qp_create_t; /* * FIELDS @@ -9301,8 +9384,8 @@ typedef struct _ib_qp_create { * sq_signaled * A flag that is used to indicate whether the queue pair will signal * an event upon completion of a send work request. If set to -* TRUE, send work requests will always generate a completion -* event. If set to FALSE, a completion event will only be +* 1, send work requests will always generate a completion +* event. 
If set to 0, a completion event will only be * generated if the send_opt field of the send work request has the * IB_SEND_OPT_SIGNALED flag set. * @@ -9333,7 +9416,7 @@ typedef struct _ib_qp_attr { ib_cq_handle_t h_sq_cq; ib_cq_handle_t h_rq_cq; ib_rdd_handle_t h_rdd; - boolean_t sq_signaled; + int sq_signaled; ib_qp_state_t state; ib_net32_t num; ib_net32_t dest_num; @@ -9452,7 +9535,7 @@ typedef struct _ib_qp_mod { uint16_t pkey_index; } rts; struct _qp_sqd { - boolean_t sqd_event; + int sqd_event; } sqd; } state; } ib_qp_mod_t; @@ -9555,7 +9638,7 @@ typedef struct _ib_eec_mod { uint8_t primary_port; } rts; struct _eec_sqd { - boolean_t sqd_event; + int sqd_event; } sqd; } state; } ib_eec_mod_t; -- 1.5.4.5 From sean.hefty at intel.com Fri Sep 18 15:23:40 2009 From: sean.hefty at intel.com (Sean Hefty) Date: Fri, 18 Sep 2009 15:23:40 -0700 Subject: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm In-Reply-To: <20090918143216.47fa8d6f.weiny2@llnl.gov> References: <20090917101804.12e9e5ce.weiny2@llnl.gov> <20090917132050.041b077d.weiny2@llnl.gov> <95DE322F4F2A423198FC248FDF534C8D@amr.corp.intel.com> <20090918143216.47fa8d6f.weiny2@llnl.gov> Message-ID: <7D7BE366EA2C478CB0CA9FCDD81ED49E@amr.corp.intel.com> >Rough hack. Does windows have stdint.h, byteswap.h, and endian.h? If not, adding the headers with the needed definitions is trivial. 
>+/* 16bit */ >+#if __BYTE_ORDER == __LITTLE_ENDIAN >+#define CL_NTOH16( x ) (uint16_t)( \ >+ (((uint16_t)(x) & 0x00FF) << 8) | \ >+ (((uint16_t)(x) & 0xFF00) >> 8) ) >+#else >+#define CL_NTOH16( x ) (x) >+#endif >+#define CL_HTON16 CL_NTOH16 >+ >+/* 32bit */ >+#if __BYTE_ORDER == __LITTLE_ENDIAN >+#define CL_NTOH32( x ) (uint32_t)( \ >+ (((uint32_t)(x) & 0x000000FF) << 24) | \ >+ (((uint32_t)(x) & 0x0000FF00) << 8) | \ >+ (((uint32_t)(x) & 0x00FF0000) >> 8) | \ >+ (((uint32_t)(x) & 0xFF000000) >> 24) ) >+#else >+#define CL_NTOH32( x ) (x) >+#endif >+#define CL_HTON32 CL_NTOH32 >+ >+/* 64bit */ >+#if __BYTE_ORDER == __LITTLE_ENDIAN >+#define CL_NTOH64( x ) (uint64_t)( \ >+ (((uint64_t)(x) & 0x00000000000000FFULL) << 56) | \ >+ (((uint64_t)(x) & 0x000000000000FF00ULL) << 40) | \ >+ (((uint64_t)(x) & 0x0000000000FF0000ULL) << 24) | \ >+ (((uint64_t)(x) & 0x00000000FF000000ULL) << 8 ) | \ >+ (((uint64_t)(x) & 0x000000FF00000000ULL) >> 8 ) | \ >+ (((uint64_t)(x) & 0x0000FF0000000000ULL) >> 24) | \ >+ (((uint64_t)(x) & 0x00FF000000000000ULL) >> 40) | \ >+ (((uint64_t)(x) & 0xFF00000000000000ULL) >> 56) ) >+#else >+#define CL_NTOH64( x ) (x) >+#endif >+#define CL_HTON64 CL_NTOH64 >+ >+#if __BYTE_ORDER == __LITTLE_ENDIAN >+#define cl_ntoh16(x) bswap_16(x) >+#define cl_hton16(x) bswap_16(x) >+#define cl_ntoh32(x) bswap_32(x) >+#define cl_hton32(x) bswap_32(x) >+#define cl_ntoh64(x) (uint64_t)bswap_64(x) >+#define cl_hton64(x) (uint64_t)bswap_64(x) >+#else /* Big Endian */ >+#define cl_ntoh16(x) (x) >+#define cl_hton16(x) (x) >+#define cl_ntoh32(x) (x) >+#define cl_hton32(x) (x) >+#define cl_ntoh64(x) (x) >+#define cl_hton64(x) (x) >+#endif Why the different defines for cl_noth and CL_NTOH? From weiny2 at llnl.gov Fri Sep 18 15:22:22 2009 From: weiny2 at llnl.gov (Ira Weiny) Date: Fri, 18 Sep 2009 15:22:22 -0700 Subject: [ofa-general] Multi-threaded diags (Was: Re: [PATCH 4/5] infiniband-diags/libibnetdisc: Introduce a context object.) 
In-Reply-To: <20090827182056.GV406@obsidianresearch.com> References: <20090813204306.dffc3237.weiny2@llnl.gov> <20090816110200.GS25501@me> <20090817083023.da17378b.weiny2@llnl.gov> <20090823120609.GG9547@me> <20090826164026.8dcce4b2.weiny2@llnl.gov> <20090827002420.GT406@obsidianresearch.com> <20090827094810.6cfe02f5.weiny2@llnl.gov> <20090827182056.GV406@obsidianresearch.com> Message-ID: <20090918152222.367a975b.weiny2@llnl.gov> On Thu, 27 Aug 2009 12:20:56 -0600 Jason Gunthorpe wrote: > On Thu, Aug 27, 2009 at 09:48:10AM -0700, Ira Weiny wrote: > > > > FSM multiplexing the recv path usually gives much better performance, > > > something like net discovery is quite easy.. > > > > Using the original algorithm and data structures lended itself to > > threading. Now that I am neck deep in all this I have thought that > > rewriting it all might be easier. > > Yah. mayhaps.. > > > > main loop: > > > fill tx queue from next list > > > recieve replies and correlate with next list > > > This would still need additional code (or additional synchronization in the > > API to libibnetdisc) if you wanted a user app to be multi-threaded. Someone > > has to be in charge of receiving all replies on that ibmad_port object and > > handing them to the proper owner. Of course one could open multiple > > ibmad_port objects but how is the app writer to know to do that? Digging > > through the code to find out that libibnetdisc is consuming all the replies? > > What is the use case here? I thought the app would be something like: > > main() > { > foo = libibnetdisc_setup(); > libibnetdisc_discover_all(foo,res); > // Do interesting things with res. > } That is the current use case. However I can see use cases were discover is called periodically to get a new snapshot of the fabric. Also since the discover can scan parts of the fabric ("libibnetdisc_discover_part") and return a fabric which represents pieces of the whole I could see "fabric" operations such as merge, update, and replace. 
> > Where the goal is to have libibnetdisc_discover_all complete > expediently. > > As long as the context 'foo' is re-entrant in all ways with all other > libraries and contexts I think useful threaded apps can be created. Yes absolutely. However, my current issue is with making ibmad_port thread safe so that libibnetdisc_discover can be multithreaded. I have been able to do so but the amount of code it took seems unreasonable to force upon any users of libibmad. > > > This is what got me on this in the first place. smp_query_via > > (_do_madrpc) is not thread safe. > > Sure, the entire library is not thread safe around the ibmad_port > context. But who cares? If the caller to libibnetdisc wants to thread > that way they need to open another context. Yes, they can but how do they know they need to do this? Furthermore how many contexts are required? The bottom line is I wanted multiple outstanding queries. I am not going to open a context for each query. The amount of code required to process and sort Transaction IDs should be provided by libibmad or a layer at that level. It should not be required for every user process or user lib. Furthermore my prototype code does not support redirect. Therefore it makes the code even more difficult. Why make every user suffer this problem? > > > Also, I feel that someone down the road might fall into the same > > trap that I did thinking that smp_query_via is thread safe and I > > would like to fix that. > > Well.. How can it be threaded? umad_send/umad_recv are inherently > single threaded APIs. You have to layer a TID based threading dispatch > mechanism on top of it. Much better to let the kernel do that and open > multiple umad fds. I am a bit confused. Do you mean to open multiple umad fds such that the kernel will do the TID based dispatch for you? Or are you suggesting a different kernel umad implementation? > > > > each entry: > > > add to next list additional ports > > > > > > Repeat until dead.
> > > > > > Where a 'next list' would be a set of actions along the lines of > > > 'query node' or 'query port' the action on a 'query node' completion > > > is to generate 'query port' next list items for all the ports, and on > > > 'query port' completion is to generate 'query node' items for all > > > enabled ports.. > > > > > > libumad is nonblocking, parallel, etc... > > > > Yes, and libibmad layers on top of it an easier interface to issue common > > queries. Why should we ask the user to re-implement that code? > > Well, the very best way to do this is to have a FSM engine API at the > core of the MAD libary: > mad_ctx->callback = done_this; > mad_post(mad,mad_ctx) > > done_this(reply): > ... Which way do you propose to do this, have a thread calling "done_this" or having the user call an event loop? > > > For example, mad_rpc now handles redirection. My implementation > > does not yet. So now I have to handle that on my own as well... > > :-( > > To be honest, I don't like the libibmad/libibumad APIs one bit - I'm > not surprised they don't work for you.. > > Frankly, we really need a usable MAD libary with sane APIs, and very > high level APIs on top of that. You cannot make an IB application > without doing SA queries at a minimum and the current process is > HORRID. > > I see nothing of value in libimad and libibumad to support that :| I see some things of value in libibmad. However, I have been reluctant to use it in the past and I agree it needs fixing. I don't want to reinvent the wheel but perhaps that is what needs to be done... 
Ira -- Ira Weiny Math Programmer/Computer Scientist Lawrence Livermore National Lab 925-423-8008 weiny2 at llnl.gov From weiny2 at llnl.gov Fri Sep 18 15:28:48 2009 From: weiny2 at llnl.gov (Ira Weiny) Date: Fri, 18 Sep 2009 15:28:48 -0700 Subject: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm In-Reply-To: <7D7BE366EA2C478CB0CA9FCDD81ED49E@amr.corp.intel.com> References: <20090917101804.12e9e5ce.weiny2@llnl.gov> <20090917132050.041b077d.weiny2@llnl.gov> <95DE322F4F2A423198FC248FDF534C8D@amr.corp.intel.com> <20090918143216.47fa8d6f.weiny2@llnl.gov> <7D7BE366EA2C478CB0CA9FCDD81ED49E@amr.corp.intel.com> Message-ID: <20090918152848.e3a96862.weiny2@llnl.gov> On Fri, 18 Sep 2009 15:23:40 -0700 "Sean Hefty" wrote: > >Rough hack. Does windows have stdint.h, byteswap.h, and endian.h? > > If not, adding the headers with the needed definitions is trivial. > > >+/* 16bit */ > >+#if __BYTE_ORDER == __LITTLE_ENDIAN > >+#define CL_NTOH16( x ) (uint16_t)( \ > >+ (((uint16_t)(x) & 0x00FF) << 8) | \ > >+ (((uint16_t)(x) & 0xFF00) >> 8) ) > >+#else > >+#define CL_NTOH16( x ) (x) > >+#endif > >+#define CL_HTON16 CL_NTOH16 > >+ > >+/* 32bit */ > >+#if __BYTE_ORDER == __LITTLE_ENDIAN > >+#define CL_NTOH32( x ) (uint32_t)( \ > >+ (((uint32_t)(x) & 0x000000FF) << 24) | \ > >+ (((uint32_t)(x) & 0x0000FF00) << 8) | \ > >+ (((uint32_t)(x) & 0x00FF0000) >> 8) | \ > >+ (((uint32_t)(x) & 0xFF000000) >> 24) ) > >+#else > >+#define CL_NTOH32( x ) (x) > >+#endif > >+#define CL_HTON32 CL_NTOH32 > >+ > >+/* 64bit */ > >+#if __BYTE_ORDER == __LITTLE_ENDIAN > >+#define CL_NTOH64( x ) (uint64_t)( > \ > >+ (((uint64_t)(x) & 0x00000000000000FFULL) << 56) | > \ > >+ (((uint64_t)(x) & 0x000000000000FF00ULL) << 40) | > \ > >+ (((uint64_t)(x) & 0x0000000000FF0000ULL) << 24) | > \ > >+ (((uint64_t)(x) & 0x00000000FF000000ULL) << 8 ) | > \ > >+ (((uint64_t)(x) & 0x000000FF00000000ULL) >> 8 ) | > \ > >+ (((uint64_t)(x) & 0x0000FF0000000000ULL) >> 24) | > \ > >+ (((uint64_t)(x) & 0x00FF000000000000ULL) >> 
40) | > \ > >+ (((uint64_t)(x) & 0xFF00000000000000ULL) >> 56) ) > >+#else > >+#define CL_NTOH64( x ) (x) > >+#endif > >+#define CL_HTON64 CL_NTOH64 > >+ > >+#if __BYTE_ORDER == __LITTLE_ENDIAN > >+#define cl_ntoh16(x) bswap_16(x) > >+#define cl_hton16(x) bswap_16(x) > >+#define cl_ntoh32(x) bswap_32(x) > >+#define cl_hton32(x) bswap_32(x) > >+#define cl_ntoh64(x) (uint64_t)bswap_64(x) > >+#define cl_hton64(x) (uint64_t)bswap_64(x) > >+#else /* Big Endian */ > >+#define cl_ntoh16(x) (x) > >+#define cl_hton16(x) (x) > >+#define cl_ntoh32(x) (x) > >+#define cl_hton32(x) (x) > >+#define cl_ntoh64(x) (x) > >+#define cl_hton64(x) (x) > >+#endif > > Why the different defines for cl_noth and CL_NTOH? One is for static defines CL_NTOH and the other is for variables at run time. I found this code in Linux. __bswap_constant_16 __bswap_constant_32 __bswap_constant_64 Which does the same thing, but I am not sure if they are universal or not as they are in bits/byteswap.h rather than byteswap.h Ira -- Ira Weiny Math Programmer/Computer Scientist Lawrence Livermore National Lab 925-423-8008 weiny2 at llnl.gov From jgunthorpe at obsidianresearch.com Fri Sep 18 16:09:26 2009 From: jgunthorpe at obsidianresearch.com (Jason Gunthorpe) Date: Fri, 18 Sep 2009 17:09:26 -0600 Subject: [ofa-general] Multi-threaded diags (Was: Re: [PATCH 4/5] infiniband-diags/libibnetdisc: Introduce a context object.)
In-Reply-To: <20090918152222.367a975b.weiny2@llnl.gov> References: <20090813204306.dffc3237.weiny2@llnl.gov> <20090816110200.GS25501@me> <20090817083023.da17378b.weiny2@llnl.gov> <20090823120609.GG9547@me> <20090826164026.8dcce4b2.weiny2@llnl.gov> <20090827002420.GT406@obsidianresearch.com> <20090827094810.6cfe02f5.weiny2@llnl.gov> <20090827182056.GV406@obsidianresearch.com> <20090918152222.367a975b.weiny2@llnl.gov> Message-ID: <20090918230926.GD19540@obsidianresearch.com> On Fri, Sep 18, 2009 at 03:22:22PM -0700, Ira Weiny wrote: > > main() > > { > > foo = libibnetdisc_setup(); > > libibnetdisc_discover_all(foo,res); > > // Do interesting things with res. > > } > > That is the current use case. However I can see use cases were discover is > called periodically to get a new snapshot of the fabric. Also since the > discover can scan parts of the fabric ("libibnetdisc_discover_part") and > return a fabric which represents pieces of the whole I could see "fabric" > operations such as merge, update, and replace. Sure, the way I've approached this in the past is that the fabric description is stored as a directed graph, and the usual set of graph manipulation primitives (BFS, difference, join, splice, etc) are available to work on it. This makes a lot of the stuff people want to do expressible via quite natural graph concepts. > > Sure, the entire library is not thread safe around the ibmad_port > > context. But who cares? If the caller to libibnetdisc wants to thread > > that way they need to open another context. > > Yes, they can but how do they know they need to do this? > Furthermore how many context's are required? Well, that is a doc question right? In C - no mention of threading in docs == not thread safe. > The bottom line is I wanted multiple outstanding queries. I am not > going to open a context for each query. The amount of code required > to process and sort Transaction ID's should be provided by libibmad > or a layer at that level.
It should not be required for every user > process or user lib. Furthermore my prototype code does not support > redirect. Therefore it makes the code even more difficult. Why > make every user suffer this problem? The transaction ID to FD sorting code is provided in the kernel. If someone wants threads they really want TID to thread mapping so that a synchronous control flow is possible: madSet(foo,value); // Sends a MAD, then blocks on a receive for a TID matched reply This is why it is unsuitable for libumad to do any kind of threading: how does it handle multiplexing access to the FD from multiple threads without a huge, huge mess internally? > I am a bit confused. Do you mean to open multiple umad fds such that the > kernel will do the TID based dispatch for you? Or are you suggesting a > different kernel umad implementation? Yes, that is what I am suggesting. Every thread you create gets a private FD and a private mad context. The mad layer is not threaded (beyond being re-entrant). Each thread sends and blocks on the thread specific FD and the kernel multiplexes transmits and sorts replies to direct them to the proper waiting thread. Anything else is a big can of worms. But a single-threaded, event-driven FSM design is almost always dramatically simpler and faster. Even so, it would work the same way with threads: each thread gets a thread-specific FSM context. > > Well, the very best way to do this is to have a FSM engine API at the > > core of the MAD libary: > > mad_ctx->callback = done_this; > > mad_post(mad,mad_ctx) > > > > done_this(reply): > > ... > > Which way do you propose to do this, have a thread calling "done_this" or > having the user call an event loop? Look at something like glib to see how this generally works.. but yes, this is done with a top level poll event loop waiting on the umad fd. A reasonable goal would be to have an FSM interface API that could be plugged into glib easily. This gives you MAD-level parallelism without threads.
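To make the callback idea above a little more concrete, here is a minimal sketch of the TID-to-callback bookkeeping such an engine would need. Nothing here is libibmad or libibumad API: every ex_* name, the fixed-size table, and the TID numbering are assumptions made up for illustration, and the part that actually sends a MAD or reads the umad fd is left out. In a real program ex_dispatch() would presumably be called from the top level poll loop each time a reply arrives.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_MAX_PENDING 16	/* hypothetical limit on outstanding requests */

struct ex_mad_ctx;
typedef void (*ex_done_fn)(struct ex_mad_ctx *ctx, const void *reply);

/* One outstanding request: its transaction ID and what to run on reply. */
struct ex_mad_ctx {
	uint64_t tid;
	ex_done_fn callback;
	void *user;
	int in_use;
};

struct ex_engine {
	struct ex_mad_ctx pending[EX_MAX_PENDING];
	uint64_t next_tid;
};

static void ex_engine_init(struct ex_engine *e)
{
	size_t i;

	for (i = 0; i < EX_MAX_PENDING; i++)
		e->pending[i].in_use = 0;
	e->next_tid = 1;	/* 0 is reserved to mean "post failed" */
}

/* Register a callback for a new request; returns its TID, or 0 if full. */
static uint64_t ex_post(struct ex_engine *e, ex_done_fn cb, void *user)
{
	size_t i;

	assert(cb != NULL);
	for (i = 0; i < EX_MAX_PENDING; i++) {
		struct ex_mad_ctx *c = &e->pending[i];

		if (c->in_use)
			continue;
		c->tid = e->next_tid++;
		c->callback = cb;
		c->user = user;
		c->in_use = 1;
		return c->tid;
	}
	return 0;
}

/* Hand a received reply to the callback whose TID matches.
 * Returns 1 if a callback ran, 0 if the reply matched nothing. */
static int ex_dispatch(struct ex_engine *e, uint64_t tid, const void *reply)
{
	size_t i;

	for (i = 0; i < EX_MAX_PENDING; i++) {
		struct ex_mad_ctx done;

		if (!e->pending[i].in_use || e->pending[i].tid != tid)
			continue;
		/* Free the slot first so the callback may post new requests. */
		done = e->pending[i];
		e->pending[i].in_use = 0;
		done.callback(&done, reply);
		return 1;
	}
	return 0;
}

/* Sample callback: counts replies in the int that ctx->user points to. */
static void ex_count_reply(struct ex_mad_ctx *ctx, const void *reply)
{
	(void)reply;
	++*(int *)ctx->user;
}
```

Replies can arrive in any order, and a late or duplicate reply simply matches nothing. Timeouts, retries and redirect handling are deliberately not covered by this sketch.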
> I see some things of value in libibmad. However, I have been reluctant to use > it in the past and I agree it needs fixing. I don't want to reinvent the > wheel but perhaps that is what needs to be done... I'm just about to the point where I need something a lot better for the little app I'm working on - I want to set up multipath IB connections, which means sophisticated PR queries.. So, I'd like to see this fixed up too, and I can probably work on a few things. We can donate our structure parsing codegen framework which is dramatically better than what libibmad uses today. I'm specifically interested in GMPs for PR queries, but much the same infrastructure covers both SMPs and GMPs. What I'd like is a nice uniform language across libibcm and the MAD library so I can just do a PR, get the result, pass it to the CM and up into the kernel without huge app specific code to do all that. Nothing like that exists right now and it sucks. Very hard to write IB apps. Jason From jgunthorpe at obsidianresearch.com Fri Sep 18 16:23:36 2009 From: jgunthorpe at obsidianresearch.com (Jason Gunthorpe) Date: Fri, 18 Sep 2009 17:23:36 -0600 Subject: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm In-Reply-To: <20090918152848.e3a96862.weiny2@llnl.gov> References: <20090917132050.041b077d.weiny2@llnl.gov> <95DE322F4F2A423198FC248FDF534C8D@amr.corp.intel.com> <20090918143216.47fa8d6f.weiny2@llnl.gov> <7D7BE366EA2C478CB0CA9FCDD81ED49E@amr.corp.intel.com> <20090918152848.e3a96862.weiny2@llnl.gov> Message-ID: <20090918232336.GE19540@obsidianresearch.com> On Fri, Sep 18, 2009 at 03:28:48PM -0700, Ira Weiny wrote: > One is for static defines CL_NTOH and the other is for variables at > run time. I found this code in Linux. That's gross, and is exactly why you don't do this yourself... bswap64/32/16 do this all automatically. ntohl also does it and is more portable.
from glibc: # define ntohl(x) __bswap_32 (x) #define bswap_32(x) __bswap_32 (x) # define __bswap_32(x) \ (__extension__ \ ({ register unsigned int __v, __x = (x); \ if (__builtin_constant_p (__x)) \ __v = __bswap_constant_32 (__x); \ else \ __asm__ ("bswap %0" : "=r" (__v) : "0" (__x)); \ __v; })) I'd say just use the ntohl, ntohs, and bswap64 macros directly and Window can provide headers with whatever it needs instead. They are already doing this.. Ditto for stdint.h and stdbool.h. Those are C99 headers, just use them. Jason From weiny2 at llnl.gov Fri Sep 18 17:05:39 2009 From: weiny2 at llnl.gov (Ira Weiny) Date: Fri, 18 Sep 2009 17:05:39 -0700 Subject: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm In-Reply-To: <20090918232336.GE19540@obsidianresearch.com> References: <20090917132050.041b077d.weiny2@llnl.gov> <95DE322F4F2A423198FC248FDF534C8D@amr.corp.intel.com> <20090918143216.47fa8d6f.weiny2@llnl.gov> <7D7BE366EA2C478CB0CA9FCDD81ED49E@amr.corp.intel.com> <20090918152848.e3a96862.weiny2@llnl.gov> <20090918232336.GE19540@obsidianresearch.com> Message-ID: <20090918170539.ca4b73a7.weiny2@llnl.gov> On Fri, 18 Sep 2009 17:23:36 -0600 Jason Gunthorpe wrote: > On Fri, Sep 18, 2009 at 03:28:48PM -0700, Ira Weiny wrote: > > > One is for static defines CL_NTOH and the other is for variables at > > run time. I found this code in Linux. > > Thats gross, and is exactly why you don't do this yourself... > > bswap64/32/16 do this all automatically. ntohl also do it and are more > portable. > > from glibc: > > # define ntohl(x) __bswap_32 (x) > #define bswap_32(x) __bswap_32 (x) > # define __bswap_32(x) \ > (__extension__ \ > ({ register unsigned int __v, __x = (x); \ > if (__builtin_constant_p (__x)) \ > __v = __bswap_constant_32 (__x); \ > else \ > __asm__ ("bswap %0" : "=r" (__v) : "0" (__x)); \ > __v; })) > > > I'd say just use the ntohl, ntohs, and bswap64 macros directly and > Window can provide headers with whatever it needs instead. 
They are > already doing this.. I agree but ntohl etc do not seem to work for the macros which are defined. osm_sa_mad_ctrl.c: In function 'sa_mad_ctrl_process': osm_sa_mad_ctrl.c:140: error: case label does not reduce to an integer constant ... switch (p_sa_mad->attr_id) { case IB_MAD_ATTR_CLASS_PORT_INFO: ... From ib_types.h #define IB_MAD_ATTR_CLASS_PORT_INFO (CL_HTON16(0x0001)) Thus the #define __bswap_constant_16(x) \ ((((x) >> 8) & 0xff) | (((x) & 0xff) << 8)) in bits/byteswap.h Ira > > Ditto for stdint.h and stdbool.h. Those are C99 headers, just use them. > > Jason -- Ira Weiny Math Programmer/Computer Scientist Lawrence Livermore National Lab 925-423-8008 weiny2 at llnl.gov From jgunthorpe at obsidianresearch.com Fri Sep 18 17:20:37 2009 From: jgunthorpe at obsidianresearch.com (Jason Gunthorpe) Date: Fri, 18 Sep 2009 18:20:37 -0600 Subject: [ofw] Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm In-Reply-To: <20090918170539.ca4b73a7.weiny2@llnl.gov> References: <95DE322F4F2A423198FC248FDF534C8D@amr.corp.intel.com> <20090918143216.47fa8d6f.weiny2@llnl.gov> <7D7BE366EA2C478CB0CA9FCDD81ED49E@amr.corp.intel.com> <20090918152848.e3a96862.weiny2@llnl.gov> <20090918232336.GE19540@obsidianresearch.com> <20090918170539.ca4b73a7.weiny2@llnl.gov> Message-ID: <20090919002037.GG19540@obsidianresearch.com> On Fri, Sep 18, 2009 at 05:05:39PM -0700, Ira Weiny wrote: > > I'd say just use the ntohl, ntohs, and bswap64 macros directly and > > Window can provide headers with whatever it needs instead. They are > > already doing this.. > > I agree but ntohl etc do not seem to work for the macros which are defined. Yah, that's right they don't work for case labels, that is about the only case they don't work for.. Might even be a gcc bug?
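For what it is worth, this looks like a language rule rather than a compiler bug: C requires a case label to be an integer constant expression, and an htons() that expands to a statement expression or inline asm is not one, while a macro built only from casts, masks and shifts is. The sketch below shows the pattern under stated assumptions: the EX_* names are hypothetical stand-ins for CL_HTON16 and the IB_MAD_ATTR_* defines, endianness is detected with the gcc/clang predefined __BYTE_ORDER__, and arpa/inet.h is assumed to be available for the run-time comparison.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>	/* htons(), used at run time only */

/* Pure constant expression: only casts, masks and shifts. */
#define EX_BSWAP16(x) ((uint16_t)((((uint16_t)(x) & 0x00FF) << 8) | \
				  (((uint16_t)(x) & 0xFF00) >> 8)))

#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#define EX_HTON16(x) EX_BSWAP16(x)
#else
#define EX_HTON16(x) ((uint16_t)(x))
#endif

/* Attribute IDs stored in network byte order, as in the thread. */
#define EX_ATTR_CLASS_PORT_INFO (EX_HTON16(0x0001))
#define EX_ATTR_NOTICE (EX_HTON16(0x0002))

/* Switch directly on a network-order field; the labels are constants. */
static const char *ex_attr_name(uint16_t attr_id_net)
{
	switch (attr_id_net) {
	case EX_ATTR_CLASS_PORT_INFO:
		return "ClassPortInfo";
	case EX_ATTR_NOTICE:
		return "Notice";
	default:
		return "unknown";
	}
}

/* Run-time cross-check: the constant macro must agree with htons(). */
static void ex_check_against_htons(void)
{
	assert(EX_HTON16(0x0001) == htons(0x0001));
}
```

Writing `case htons(0x0001):` instead may or may not compile depending on how the C library implements htons(), which is why a separate constant-only macro ends up being kept around for values used in case labels and static initializers.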
> #define IB_MAD_ATTR_CLASS_PORT_INFO (CL_HTON16(0x0001)) OMG that is gross > Thus the > > #define __bswap_constant_16(x) \ > ((((x) >> 8) & 0xff) | (((x) & 0xff) << 8)) Well, no, that is to support the __builtin_constant_p method I clipped, that macro isn't for user use. Guess you are stuck with the upper and lower case versions of the macros. Sucky Jason From vlad at lists.openfabrics.org Sat Sep 19 03:08:08 2009 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Sat, 19 Sep 2009 03:08:08 -0700 (PDT) Subject: [ofa-general] ofa_1_5_kernel 20090919-0200 daily build status Message-ID: <20090919100808.A50EEE61F18@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git git_branch: ofed_kernel_1_5 Common build parameters: Passed: Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.26 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.27 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18-128.el5 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.27 Passed on x86_64 with linux-2.6.9-78.ELsmp Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.26 Passed on ia64 with linux-2.6.25 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Failed: From sashak at voltaire.com Sat Sep 19 09:58:17 2009 
From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 19 Sep 2009 19:58:17 +0300 Subject: [ofa-general] [PATCH] opensm/multicast: improve function prototypes Message-ID: <20090919165817.GC13667@me> Improve some function prototypes: - osm_mgrp_new() will get MCMember record now. - osm_mgrp_is_port_present() is replaced by cleaner osm_mgrp_get_mcm_port() helper. Signed-off-by: Sasha Khapyorsky --- opensm/include/opensm/osm_multicast.h | 25 ++++++++++--------------- opensm/opensm/osm_multicast.c | 32 ++++++++++---------------------- opensm/opensm/osm_sa_mcmember_record.c | 26 +++++++++++--------------- 3 files changed, 31 insertions(+), 52 deletions(-) diff --git a/opensm/include/opensm/osm_multicast.h b/opensm/include/opensm/osm_multicast.h index 32bcb78..15d7e78 100644 --- a/opensm/include/opensm/osm_multicast.h +++ b/opensm/include/opensm/osm_multicast.h @@ -142,12 +142,15 @@ typedef struct osm_mgrp { * * SYNOPSIS */ -osm_mgrp_t *osm_mgrp_new(IN const ib_net16_t mlid); +osm_mgrp_t *osm_mgrp_new(IN const ib_net16_t mlid, IN ib_member_rec_t * mcmr); /* * PARAMETERS * mlid * [in] Multicast LID for this multicast group. * +* mcmr +* [in] MCMember Record for this multicast group. +* * RETURN VALUES * IB_SUCCESS if initialization was successful. * @@ -309,20 +312,17 @@ osm_mcm_port_t *osm_mgrp_add_port(osm_subn_t *subn, osm_log_t *log, * SEE ALSO *********/ -/****f* OpenSM: Multicast Group/osm_mgrp_is_port_present +/****f* OpenSM: Multicast Group/osm_mgrp_get_mcm_port * NAME -* osm_mgrp_is_port_present +* osm_mgrp_get_mcm_port * * DESCRIPTION -* checks a port from the multicast group. +* finds a port in the multicast group. 
* * SYNOPSIS */ - -boolean_t -osm_mgrp_is_port_present(IN const osm_mgrp_t * const p_mgrp, - IN const ib_net64_t port_guid, - OUT osm_mcm_port_t ** const pp_mcm_port); +osm_mcm_port_t *osm_mgrp_get_mcm_port(IN const osm_mgrp_t * const p_mgrp, + IN const ib_net64_t port_guid); /* * PARAMETERS * p_mgrp @@ -331,13 +331,8 @@ osm_mgrp_is_port_present(IN const osm_mgrp_t * const p_mgrp, * port_guid * [in] Port guid of the departing port. * -* pp_mcm_port -* [out] Pointer to a pointer to osm_mcm_port_t -* Updated to the member on success or NULLed -* * RETURN VALUES -* TRUE if port present -* FALSE if port is not present. +* Pointer to the mcm port object when present or NULL otherwise. * * NOTES * diff --git a/opensm/opensm/osm_multicast.c b/opensm/opensm/osm_multicast.c index 5a10003..a674514 100644 --- a/opensm/opensm/osm_multicast.c +++ b/opensm/opensm/osm_multicast.c @@ -75,9 +75,7 @@ void osm_mgrp_delete(IN osm_mgrp_t * p_mgrp) free(p_mgrp); } -/********************************************************************** - **********************************************************************/ -osm_mgrp_t *osm_mgrp_new(IN const ib_net16_t mlid) +osm_mgrp_t *osm_mgrp_new(IN const ib_net16_t mlid, IN ib_member_rec_t * mcmr) { osm_mgrp_t *p_mgrp; @@ -88,10 +86,13 @@ osm_mgrp_t *osm_mgrp_new(IN const ib_net16_t mlid) memset(p_mgrp, 0, sizeof(*p_mgrp)); cl_qmap_init(&p_mgrp->mcm_port_tbl); p_mgrp->mlid = mlid; + p_mgrp->mcmember_rec = *mcmr; return p_mgrp; } +/********************************************************************** + **********************************************************************/ void osm_mgrp_cleanup(osm_subn_t * subn, osm_mgrp_t * mgrp) { osm_mcm_port_t *mcm_port; @@ -207,8 +208,6 @@ osm_mcm_port_t *osm_mgrp_add_port(IN osm_subn_t * subn, osm_log_t * log, return mcm_port; } -/********************************************************************** - **********************************************************************/ void 
osm_mgrp_remove_port(osm_subn_t * subn, osm_log_t * log, osm_mgrp_t * mgrp, osm_mcm_port_t * mcm_port, ib_member_rec_t *mcmr) { @@ -277,22 +276,11 @@ void osm_mgrp_delete_port(osm_subn_t * subn, osm_log_t * log, osm_mgrp_t * mgrp, /********************************************************************** **********************************************************************/ -boolean_t osm_mgrp_is_port_present(IN const osm_mgrp_t * p_mgrp, - IN const ib_net64_t port_guid, - OUT osm_mcm_port_t ** const pp_mcm_port) +osm_mcm_port_t *osm_mgrp_get_mcm_port(IN const osm_mgrp_t * p_mgrp, + IN const ib_net64_t port_guid) { - cl_map_item_t *p_map_item; - - CL_ASSERT(p_mgrp); - - p_map_item = cl_qmap_get(&p_mgrp->mcm_port_tbl, port_guid); - - if (p_map_item != cl_qmap_end(&p_mgrp->mcm_port_tbl)) { - if (pp_mcm_port) - *pp_mcm_port = (osm_mcm_port_t *) p_map_item; - return TRUE; - } - if (pp_mcm_port) - *pp_mcm_port = NULL; - return FALSE; + cl_map_item_t *item = cl_qmap_get(&p_mgrp->mcm_port_tbl, port_guid); + if (item != cl_qmap_end(&p_mgrp->mcm_port_tbl)) + return (osm_mcm_port_t *)item; + return NULL; } diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c index 8f7816b..f581663 100644 --- a/opensm/opensm/osm_sa_mcmember_record.c +++ b/opensm/opensm/osm_sa_mcmember_record.c @@ -340,10 +340,10 @@ static boolean_t validate_modify(IN osm_sa_t * sa, IN osm_mgrp_t * p_mgrp, portguid = p_recvd_mcmember_rec->port_gid.unicast.interface_id; - *pp_mcm_port = NULL; + *pp_mcm_port = osm_mgrp_get_mcm_port(p_mgrp, portguid); /* o15-0.2.1: If this is a new port being added - nothing to check */ - if (!osm_mgrp_is_port_present(p_mgrp, portguid, pp_mcm_port)) { + if (!*pp_mcm_port) { OSM_LOG(sa->p_log, OSM_LOG_DEBUG, "This is a new port in the MC group\n"); return TRUE; @@ -428,10 +428,10 @@ static boolean_t validate_delete(IN osm_sa_t * sa, IN osm_mgrp_t * p_mgrp, portguid = p_recvd_mcmember_rec->port_gid.unicast.interface_id; - *pp_mcm_port = NULL; + 
*pp_mcm_port = osm_mgrp_get_mcm_port(p_mgrp, portguid); /* 1 */ - if (!osm_mgrp_is_port_present(p_mgrp, portguid, pp_mcm_port)) { + if (!*pp_mcm_port) { OSM_LOG(sa->p_log, OSM_LOG_DEBUG, "Failed to find the port in the MC group\n"); return FALSE; @@ -812,7 +812,8 @@ ib_api_status_t osm_mcmr_rcv_create_new_mgrp(IN osm_sa_t * sa, } /* create a new MC Group */ - *pp_mgrp = osm_mgrp_new(mlid); + mcm_rec.mlid = mlid; + *pp_mgrp = osm_mgrp_new(mlid, &mcm_rec); if (*pp_mgrp == NULL) { OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 1B08: " "osm_mgrp_new failed\n"); @@ -821,10 +822,6 @@ ib_api_status_t osm_mcmr_rcv_create_new_mgrp(IN osm_sa_t * sa, goto Exit; } - /* Initialize the mgrp */ - (*pp_mgrp)->mcmember_rec = mcm_rec; - (*pp_mgrp)->mcmember_rec.mlid = mlid; - /* the mcmember_record should have mtu_sel, rate_sel, and pkt_lifetime_sel = 2 */ (*pp_mgrp)->mcmember_rec.mtu &= 0x3f; (*pp_mgrp)->mcmember_rec.mtu |= 2 << 6; /* exactly */ @@ -1307,13 +1304,12 @@ static void mcmr_by_comp_mask(osm_sa_t * sa, const ib_member_rec_t * p_rcvd_rec, /* so did we get the PortGUID mask */ if (IB_MCR_COMPMASK_PORT_GID & comp_mask) { /* try to find this port */ - if (osm_mgrp_is_port_present(p_mgrp, portguid, &p_mcm_port)) { - scope_state = p_mcm_port->scope_state; - memcpy(&port_gid, &(p_mcm_port->port_gid), - sizeof(ib_gid_t)); - proxy_join = p_mcm_port->proxy_join; - } else /* port not in group */ + p_mcm_port = osm_mgrp_get_mcm_port(p_mgrp, portguid); + if (!p_mcm_port) /* port not in group */ goto Exit; + scope_state = p_mcm_port->scope_state; + memcpy(&port_gid, &(p_mcm_port->port_gid), sizeof(ib_gid_t)); + proxy_join = p_mcm_port->proxy_join; } else /* point to the group information */ scope_state = p_mgrp->mcmember_rec.scope_state; -- 1.6.5.rc1 From sashak at voltaire.com Sat Sep 19 09:59:24 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 19 Sep 2009 19:59:24 +0300 Subject: [ofa-general] [PATCH] opensm/osm_sa_path_record.c: validate multicast membership In-Reply-To: 
<20090919165817.GC13667@me> References: <20090919165817.GC13667@me> Message-ID: <20090919165924.GD13667@me> When PathRecord query has multicast destination and SLID and/or SGID is specified we need to ensure that this path source port is member of the destination multicast group. Signed-off-by: Sasha Khapyorsky --- opensm/opensm/osm_sa_path_record.c | 14 ++++++++++++++ 1 files changed, 14 insertions(+), 0 deletions(-) diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c index 75d9516..f68be20 100644 --- a/opensm/opensm/osm_sa_path_record.c +++ b/opensm/opensm/osm_sa_path_record.c @@ -1488,6 +1488,7 @@ static ib_api_status_t pr_match_mgrp_attributes(IN osm_sa_t * sa, const ib_path_rec_t *p_pr; const ib_sa_mad_t *p_sa_mad; ib_net64_t comp_mask; + const osm_port_t *port; ib_api_status_t status = IB_ERROR; uint32_t flow_label; uint8_t sl; @@ -1501,6 +1502,19 @@ static ib_api_status_t pr_match_mgrp_attributes(IN osm_sa_t * sa, comp_mask = p_sa_mad->comp_mask; /* If SGID and/or SLID specified, should validate as member of MC group */ + if (comp_mask & IB_PR_COMPMASK_SGID) { + port = osm_get_port_by_guid(sa->p_subn, + p_pr->sgid.unicast.interface_id); + if (!port || !osm_mgrp_get_mcm_port(p_mgrp, port->guid)) + goto Exit; + } + + if (comp_mask & IB_PR_COMPMASK_SLID) { + if (osm_get_port_by_base_lid(sa->p_subn, p_pr->slid, &port) || + !port || !osm_mgrp_get_mcm_port(p_mgrp, port->guid)) + goto Exit; + } + /* Also, MTU, rate, packet lifetime, and raw traffic requested are not currently checked */ if (comp_mask & IB_PR_COMPMASK_PKEY) { if (p_pr->pkey != p_mgrp->mcmember_rec.pkey) -- 1.6.5.rc1 From sashak at voltaire.com Sat Sep 19 10:00:00 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 19 Sep 2009 20:00:00 +0300 Subject: [ofa-general] [PATCH] opensm: discard multicast SA PR with wildcard DGID In-Reply-To: <20090919165924.GD13667@me> References: <20090919165817.GC13667@me> <20090919165924.GD13667@me> Message-ID: 
<20090919170000.GE13667@me>

IBA 1.2.1 states (Vol.1, 15.2.5.16, p.918) that the DGID shall be explicitly specified when the path destination of an SA PathRecord query is a multicast group.

Signed-off-by: Sasha Khapyorsky
---
 opensm/opensm/osm_sa_path_record.c | 45 ++++++++++++------------------------
 1 files changed, 15 insertions(+), 30 deletions(-)

diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c index f68be20..37f32ac 100644 --- a/opensm/opensm/osm_sa_path_record.c +++ b/opensm/opensm/osm_sa_path_record.c @@ -1437,45 +1437,27 @@ static osm_mgrp_t *pr_get_mgrp(IN osm_sa_t * sa, IN const osm_madw_t * p_madw) { ib_path_rec_t *p_pr; const ib_sa_mad_t *p_sa_mad; - ib_net64_t comp_mask; - osm_mgrp_t *mgrp = NULL; + osm_mgrp_t *mgrp; p_sa_mad = osm_madw_get_sa_mad_ptr(p_madw); - p_pr = (ib_path_rec_t *) ib_sa_mad_get_payload_ptr(p_sa_mad); + p_pr = ib_sa_mad_get_payload_ptr(p_sa_mad); - comp_mask = p_sa_mad->comp_mask; + if (!(p_sa_mad->comp_mask & IB_PR_COMPMASK_DGID)) { + OSM_LOG(sa->p_log, OSM_LOG_VERBOSE, + "discard multicast target SA PR with wildcarded MGID"); + return NULL; + } - if ((comp_mask & IB_PR_COMPMASK_DGID) && - !(mgrp = osm_get_mgrp_by_mgid(sa, &p_pr->dgid))) { + mgrp = osm_get_mgrp_by_mgid(sa, &p_pr->dgid); + if (!mgrp) { char gid_str[INET6_ADDRSTRLEN]; OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 1F09: " "No MC group found for PathRecord destination GID %s\n", inet_ntop(AF_INET6, p_pr->dgid.raw, gid_str, sizeof gid_str)); - goto Exit; - } - - if (comp_mask & IB_PR_COMPMASK_DLID) { - if (mgrp) { - /* check that the MLID in the MC group is */ - /* the same as the DLID in the PathRecord */ - if (mgrp->mlid != p_pr->dlid) { - /* Note: perhaps this might be better indicated as an invalid request */ - OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 1F10: " - "MC group MLID 0x%x does not match " - "PathRecord destination LID 0x%x\n", - mgrp->mlid, p_pr->dlid); - mgrp = NULL; - goto Exit; - } - } else - if (!(mgrp = osm_get_mgrp_by_mlid(sa->p_subn,
p_pr->dlid))) - OSM_LOG(sa->p_log, OSM_LOG_ERROR, - "ERR 1F11: " "No MC group found for PathRecord " - "destination LID 0x%x\n", p_pr->dlid); + return NULL; } -Exit: return mgrp; } @@ -1497,10 +1479,14 @@ static ib_api_status_t pr_match_mgrp_attributes(IN osm_sa_t * sa, OSM_LOG_ENTER(sa->p_log); p_sa_mad = osm_madw_get_sa_mad_ptr(p_madw); - p_pr = (ib_path_rec_t *) ib_sa_mad_get_payload_ptr(p_sa_mad); + p_pr = ib_sa_mad_get_payload_ptr(p_sa_mad); comp_mask = p_sa_mad->comp_mask; + /* check that MLID of the MC group matchs the PathRecord DLID */ + if ((comp_mask & IB_PR_COMPMASK_DLID) && p_mgrp->mlid != p_pr->dlid) + goto Exit; + /* If SGID and/or SLID specified, should validate as member of MC group */ if (comp_mask & IB_PR_COMPMASK_SGID) { port = osm_get_port_by_guid(sa->p_subn, @@ -1713,7 +1699,6 @@ McastDest: /* First, get the MC info */ p_mgrp = pr_get_mgrp(sa, p_madw); - if (!p_mgrp) goto Unlock; -- 1.6.5.rc1

From sashak at voltaire.com Sat Sep 19 10:00:47 2009
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sat, 19 Sep 2009 20:00:47 +0300
Subject: [ofa-general] [PATCH] opensm: osm_get_port_by_lid() helper
In-Reply-To: <20090919165924.GD13667@me>
References: <20090919165817.GC13667@me> <20090919165924.GD13667@me>
Message-ID: <20090919170047.GF13667@me>

This new helper is similar to osm_get_port_by_base_lid(), but it has an improved interface (it returns the port pointer directly) and lives in osm_subnet.[ch], where the other osm_get_*_by_*() helpers are located.
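
For readers following the thread, here is a minimal standalone sketch of the LMC-based lookup idea that the helper is built around: try every possible LMC, mask the LID down to the base LID that LMC would imply, and accept the port only if its own LMC matches the one being tried. Everything named sketch_* or SKETCH_* is hypothetical and is not OpenSM code. The sketch also assumes a table in which only the base LID slot of each port is populated and a LID that is already in host byte order, neither of which is claimed about the real port_lid_tbl.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_LMC_MAX  7   /* LMC is a 3-bit field, so 0..7 */
#define SKETCH_TBL_SIZE 64  /* arbitrary size for the illustration */

/* Hypothetical stand-in for a port: it owns LIDs
 * base_lid .. base_lid + (1 << lmc) - 1. */
struct sketch_port {
	uint16_t base_lid;
	uint8_t lmc;
};

/* Hypothetical stand-in for the subnet: a table indexed by host-order
 * LID. In this sketch only the base LID slot of each port is filled. */
struct sketch_subnet {
	const struct sketch_port *lid_tbl[SKETCH_TBL_SIZE];
};

/* Return the port that owns 'lid' (host byte order), or NULL. */
static const struct sketch_port *
sketch_get_port_by_lid(const struct sketch_subnet *subn, uint16_t lid)
{
	unsigned lmc;

	assert(subn != NULL);

	/* Loop on lmc from 0 up through the maximum possible LMC */
	for (lmc = 0; lmc <= SKETCH_LMC_MAX; lmc++) {
		/* Base LID this LID would have if 'lmc' were the real LMC */
		uint16_t base_lid = lid & (uint16_t)~((1u << lmc) - 1u);
		const struct sketch_port *port;

		if (base_lid >= SKETCH_TBL_SIZE)
			continue;

		port = subn->lid_tbl[base_lid];
		/* The guess is right only if the port really uses this LMC */
		if (port && port->lmc == lmc)
			return port;
	}

	return NULL;
}
```

With a port at base LID 8 and LMC 2 in such a table, LIDs 8 through 11 resolve to that port and LID 12 resolves to NULL. Returning the pointer directly lets a caller write a single NULL test instead of checking a status code and an output parameter.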
Signed-off-by: Sasha Khapyorsky --- opensm/include/opensm/osm_port.h | 35 ------------------------------- opensm/include/opensm/osm_subnet.h | 25 ++++++++++++++++++++++ opensm/opensm/osm_port.c | 36 -------------------------------- opensm/opensm/osm_sa_link_record.c | 13 +++-------- opensm/opensm/osm_sa_path_record.c | 4 +- opensm/opensm/osm_sa_pkey_record.c | 5 +-- opensm/opensm/osm_sa_portinfo_record.c | 6 +--- opensm/opensm/osm_sa_slvl_record.c | 6 +--- opensm/opensm/osm_sa_sminfo_record.c | 6 +--- opensm/opensm/osm_sa_vlarb_record.c | 6 +--- opensm/opensm/osm_subnet.c | 27 ++++++++++++++++++++++++ 11 files changed, 68 insertions(+), 101 deletions(-) diff --git a/opensm/include/opensm/osm_port.h b/opensm/include/opensm/osm_port.h index 000e2fe..eb77ed9 100644 --- a/opensm/include/opensm/osm_port.h +++ b/opensm/include/opensm/osm_port.h @@ -1376,41 +1376,6 @@ osm_port_get_lid_range_ho(IN const osm_port_t * const p_port, * Port *********/ -/****f* OpenSM: Port/osm_get_port_by_base_lid -* NAME -* osm_get_port_by_base_lid -* -* DESCRIPTION -* Returns a status on whether a Port was able to be -* determined based on the LID supplied and if so, return the Port. -* -* SYNOPSIS -*/ -ib_api_status_t -osm_get_port_by_base_lid(IN const osm_subn_t * const p_subn, - IN const ib_net16_t lid, - IN OUT const osm_port_t ** const pp_port); -/* -* PARAMETERS -* p_subn -* [in] Pointer to the subnet data structure. -* -* lid -* [in] LID requested. -* -* pp_port -* [in][out] Pointer to pointer to Port object. 
-* -* RETURN VALUES -* IB_SUCCESS -* IB_NOT_FOUND -* -* NOTES -* -* SEE ALSO -* Port -*********/ - /****f* OpenSM: Physical Port/osm_physp_calc_link_mtu * NAME * osm_physp_calc_link_mtu diff --git a/opensm/include/opensm/osm_subnet.h b/opensm/include/opensm/osm_subnet.h index 6c20de8..a8c6b62 100644 --- a/opensm/include/opensm/osm_subnet.h +++ b/opensm/include/opensm/osm_subnet.h @@ -935,6 +935,31 @@ struct osm_port *osm_get_port_by_guid(IN osm_subn_t const *p_subn, * osm_port_t *********/ +/****f* OpenSM: Port/osm_get_port_by_lid +* NAME +* osm_get_port_by_lid +* +* DESCRIPTION +* Returns a pointer of the port object for given lid value. +* +* SYNOPSIS +*/ +struct osm_port *osm_get_port_by_lid(const osm_subn_t * subn, ib_net16_t lid); +/* +* PARAMETERS +* subn +* [in] Pointer to the subnet data structure. +* +* lid +* [in] LID requested. +* +* RETURN VALUES +* The port structure pointer if found. NULL otherwise. +* +* SEE ALSO +* Subnet object, osm_port_t +*********/ + /****f* OpenSM: Subnet/osm_get_mgrp_by_mlid * NAME * osm_get_mgrp_by_mlid diff --git a/opensm/opensm/osm_port.c b/opensm/opensm/osm_port.c index dc8a768..79a8913 100644 --- a/opensm/opensm/osm_port.c +++ b/opensm/opensm/osm_port.c @@ -184,42 +184,6 @@ void osm_port_get_lid_range_ho(IN const osm_port_t * p_port, /********************************************************************** **********************************************************************/ -ib_api_status_t osm_get_port_by_base_lid(IN const osm_subn_t * p_subn, - IN ib_net16_t lid, - IN OUT const osm_port_t ** pp_port) -{ - ib_api_status_t status; - uint16_t base_lid; - uint8_t lmc; - - *pp_port = NULL; - - /* Loop on lmc from 0 up through max LMC possible */ - for (lmc = 0; lmc <= IB_PORT_LMC_MAX; lmc++) { - /* Calculate a base LID assuming this is the real LMC */ - base_lid = cl_ntoh16(lid) & ~((1 << lmc) - 1); - - /* Look for a match */ - status = cl_ptr_vector_at(&p_subn->port_lid_tbl, - base_lid, (void **)pp_port); - if ((status == 
CL_SUCCESS) && (*pp_port != NULL)) { - /* Determine if base LID "tested" is the real base LID */ - /* This is true if the LMC "tested" is the port's actual LMC */ - if (lmc == osm_port_get_lmc(*pp_port)) { - status = IB_SUCCESS; - goto Found; - } - } - } - *pp_port = NULL; - status = IB_NOT_FOUND; - -Found: - return status; -} - -/********************************************************************** - **********************************************************************/ uint8_t osm_physp_calc_link_mtu(IN osm_log_t * p_log, IN const osm_physp_t * p_physp) { diff --git a/opensm/opensm/osm_sa_link_record.c b/opensm/opensm/osm_sa_link_record.c index bf0b5ee..9e55e71 100644 --- a/opensm/opensm/osm_sa_link_record.c +++ b/opensm/opensm/osm_sa_link_record.c @@ -363,7 +363,6 @@ static ib_net16_t lr_rcv_get_end_points(IN osm_sa_t * sa, const ib_link_record_t *p_lr; const ib_sa_mad_t *p_sa_mad; ib_net64_t comp_mask; - ib_api_status_t status; ib_net16_t sa_status = IB_SA_MAD_STATUS_SUCCESS; OSM_LOG_ENTER(sa->p_log); @@ -380,10 +379,8 @@ static ib_net16_t lr_rcv_get_end_points(IN osm_sa_t * sa, *pp_dest_port = NULL; if (p_sa_mad->comp_mask & IB_LR_COMPMASK_FROM_LID) { - status = osm_get_port_by_base_lid(sa->p_subn, p_lr->from_lid, - pp_src_port); - - if (status != IB_SUCCESS || *pp_src_port == NULL) { + *pp_src_port = osm_get_port_by_lid(sa->p_subn, p_lr->from_lid); + if (!*pp_src_port) { /* This 'error' is the client's fault (bad lid) so don't enter it as an error in our own log. @@ -399,10 +396,8 @@ static ib_net16_t lr_rcv_get_end_points(IN osm_sa_t * sa, } if (p_sa_mad->comp_mask & IB_LR_COMPMASK_TO_LID) { - status = osm_get_port_by_base_lid(sa->p_subn, p_lr->to_lid, - pp_dest_port); - - if (status != IB_SUCCESS || *pp_dest_port == NULL) { + *pp_dest_port = osm_get_port_by_lid(sa->p_subn, p_lr->to_lid); + if (!*pp_dest_port) { /* This 'error' is the client's fault (bad lid) so don't enter it as an error in our own log. 
diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c index 37f32ac..2247ebe 100644 --- a/opensm/opensm/osm_sa_path_record.c +++ b/opensm/opensm/osm_sa_path_record.c @@ -1496,8 +1496,8 @@ static ib_api_status_t pr_match_mgrp_attributes(IN osm_sa_t * sa, } if (comp_mask & IB_PR_COMPMASK_SLID) { - if (osm_get_port_by_base_lid(sa->p_subn, p_pr->slid, &port) || - !port || !osm_mgrp_get_mcm_port(p_mgrp, port->guid)) + port = osm_get_port_by_lid(sa->p_subn, p_pr->slid); + if (!port || !osm_mgrp_get_mcm_port(p_mgrp, port->guid)) goto Exit; } diff --git a/opensm/opensm/osm_sa_pkey_record.c b/opensm/opensm/osm_sa_pkey_record.c index 8e47745..a5a3e28 100644 --- a/opensm/opensm/osm_sa_pkey_record.c +++ b/opensm/opensm/osm_sa_pkey_record.c @@ -300,9 +300,8 @@ void osm_pkey_rec_rcv_process(IN void *ctx, IN void *data) work load, since we don't have to search every port */ if (comp_mask & IB_PKEY_COMPMASK_LID) { - status = osm_get_port_by_base_lid(sa->p_subn, p_rcvd_rec->lid, - &p_port); - if (status != IB_SUCCESS || p_port == NULL) { + p_port = osm_get_port_by_lid(sa->p_subn, p_rcvd_rec->lid); + if (!p_port) { status = IB_NOT_FOUND; OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 460B: " "No port found with LID %u\n", diff --git a/opensm/opensm/osm_sa_portinfo_record.c b/opensm/opensm/osm_sa_portinfo_record.c index b5ef101..448981c 100644 --- a/opensm/opensm/osm_sa_portinfo_record.c +++ b/opensm/opensm/osm_sa_portinfo_record.c @@ -521,10 +521,8 @@ void osm_pir_rcv_process(IN void *ctx, IN void *data) work load, since we don't have to search every port */ if (comp_mask & IB_PIR_COMPMASK_LID) { - status = - osm_get_port_by_base_lid(sa->p_subn, p_rcvd_rec->lid, - &p_port); - if ((status != IB_SUCCESS) || (p_port == NULL)) { + p_port = osm_get_port_by_lid(sa->p_subn, p_rcvd_rec->lid); + if (!p_port) { status = IB_NOT_FOUND; OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 2109: " "No port found with LID %u\n", diff --git a/opensm/opensm/osm_sa_slvl_record.c 
b/opensm/opensm/osm_sa_slvl_record.c index 061d970..d310df0 100644 --- a/opensm/opensm/osm_sa_slvl_record.c +++ b/opensm/opensm/osm_sa_slvl_record.c @@ -274,10 +274,8 @@ void osm_slvl_rec_rcv_process(IN void *ctx, IN void *data) work load, since we don't have to search every port */ if (comp_mask & IB_SLVL_COMPMASK_LID) { - status = - osm_get_port_by_base_lid(sa->p_subn, p_rcvd_rec->lid, - &p_port); - if ((status != IB_SUCCESS) || (p_port == NULL)) { + p_port = osm_get_port_by_lid(sa->p_subn, p_rcvd_rec->lid); + if (!p_port) { status = IB_NOT_FOUND; OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 2608: " "No port found with LID %u\n", diff --git a/opensm/opensm/osm_sa_sminfo_record.c b/opensm/opensm/osm_sa_sminfo_record.c index 4d454af..dcc8615 100644 --- a/opensm/opensm/osm_sa_sminfo_record.c +++ b/opensm/opensm/osm_sa_sminfo_record.c @@ -236,10 +236,8 @@ void osm_smir_rcv_process(IN void *ctx, IN void *data) work load, since we don't have to search every port */ if (comp_mask & IB_SMIR_COMPMASK_LID) { - status = - osm_get_port_by_base_lid(sa->p_subn, p_rcvd_rec->lid, - &p_port); - if ((status != IB_SUCCESS) || (p_port == NULL)) { + p_port = osm_get_port_by_lid(sa->p_subn, p_rcvd_rec->lid); + if (!p_port) { status = IB_NOT_FOUND; OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 2806: " "No port found with LID %u\n", diff --git a/opensm/opensm/osm_sa_vlarb_record.c b/opensm/opensm/osm_sa_vlarb_record.c index f9f11b7..5c66049 100644 --- a/opensm/opensm/osm_sa_vlarb_record.c +++ b/opensm/opensm/osm_sa_vlarb_record.c @@ -288,10 +288,8 @@ void osm_vlarb_rec_rcv_process(IN void *ctx, IN void *data) work load, since we don't have to search every port */ if (comp_mask & IB_VLA_COMPMASK_LID) { - status = - osm_get_port_by_base_lid(sa->p_subn, p_rcvd_rec->lid, - &p_port); - if ((status != IB_SUCCESS) || (p_port == NULL)) { + p_port = osm_get_port_by_lid(sa->p_subn, p_rcvd_rec->lid); + if (!p_port) { status = IB_NOT_FOUND; OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 2A09: " "No port found with LID 
%u\n", diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c index 8d63a75..b475031 100644 --- a/opensm/opensm/osm_subnet.c +++ b/opensm/opensm/osm_subnet.c @@ -643,6 +643,33 @@ osm_port_t *osm_get_port_by_guid(IN osm_subn_t const *p_subn, IN ib_net64_t guid /********************************************************************** **********************************************************************/ +osm_port_t *osm_get_port_by_lid(IN osm_subn_t const * subn, IN ib_net16_t lid) +{ + osm_port_t *port = NULL; + ib_api_status_t stat; + uint16_t base_lid; + uint8_t lmc; + + lid = cl_ntoh16(lid); + + /* Loop on lmc from 0 up through max LMC possible */ + for (lmc = 0; lmc <= IB_PORT_LMC_MAX; lmc++) { + /* Calculate a base LID assuming this is the real LMC */ + base_lid = lid & ~((1 << lmc) - 1); + + stat = cl_ptr_vector_at(&subn->port_lid_tbl, base_lid, + (void *)&port); + /* Determine if base LID "tested" is the real base LID */ + /* This is true if the LMC "tested" is the port's actual LMC */ + if (stat == CL_SUCCESS && port && lmc == osm_port_get_lmc(port)) + return port; + } + + return NULL; +} + +/********************************************************************** + **********************************************************************/ static void subn_set_default_qos_options(IN osm_qos_options_t * opt) { opt->max_vls = OSM_DEFAULT_QOS_MAX_VLS; -- 1.6.5.rc1 From worleys at gmail.com Sat Sep 19 10:29:56 2009 From: worleys at gmail.com (Chris Worley) Date: Sat, 19 Sep 2009 11:29:56 -0600 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4A9FA945.4070408@vlnb.net> <4AA4F561.504@vlnb.net> Message-ID: On Fri, Sep 18, 2009 at 3:33 PM, Chris Worley wrote: > On Fri, Sep 18, 2009 at 3:31 PM, Chris Worley wrote: >> On Mon, Sep 7, 2009 at 5:58 AM, Vladislav Bolkhovitin wrote: >>> Chris Worley, on 09/06/2009 05:41 PM wrote: >>>> >>>> On Sun, Sep 6, 2009 at 3:36 PM, Chris Worley wrote: 
>>>>> >>>>> On Sun, Sep 6, 2009 at 3:17 PM, Bart Van Assche >>>>> wrote: >>>>>> >>>>>> On Fri, Sep 4, 2009 at 1:20 AM, Chris Worley wrote: >>>>>>> >>>>>>> On Thu, Sep 3, 2009 at 11:38 AM, Chris Worley wrote: >>>>>>>> >>>>>>>> I've used a couple of initiators (different systems) w/ different >>>>>>>> OSes, w/ different IB cards (all QDR) and different IB stacks >>>>>>>> (built-in vs. OFED) and can repeat the problem in all but the >>>>>>>> RHEL5.2/OFED 1.4.1 target and initiator (but, if the initiator is >>>>>>>> WinOF and the target is RHEL5.2/OFED1.4.1, then the problem does >>>>>>>> repeat). >>>>>>> >>>>>>> Here's a twist: I used the Ubuntu initiator w/ one of the RHEL >>>>>>> targets, and the RHEL initiator (same machine as was running WinOF >>>>>>> from the beginning of this thread) w/ one of the Ubuntu targets: in >>>>>>> both cases, the problem does not repeat. >>>>>>> >>>>>>> That makes it sound like OFED is the cure on either side of the >>>>>>> connection, but does not explain the issue w/ WinOF (which does fail >>>>>>> w/ either Ununtu or RHEL targets). >>>>>> >>>>>> These results are strange. Regarding the Linux-only tests, I was >>>>>> assuming failure of a single component (Ubuntu SRP initiator, OFED SRP >>>>>> initiator, Ubuntu IB driver, OFED IB driver or SRP target), but for >>>>>> each of these components there is at least one test that passes and at >>>>>> least one test that fails. So either my assumption is wrong or one of >>>>>> the above test results is not repeatable. Do you have the time to >>>>>> repeat the Linux-only tests ? >>>>> >>>>> Last night I was rerunning the RHEL5.2 initiator w/ Ubuntu client, and >>>>> the problem repeated; now, I can't repeat the case where it didn't >>>>> fail.  Still, no errors, other than the eventual timeouts previously >>>>> shown; the target thinks all is fine, the initiator is stuck. >>>> >>>> ... and I haven't had any success w/ Ubuntu target and initiator, 8.10 or >>>> 9.04. >>> >>> 1. 
Try with kernel parameter maxcpus=1. It will somehow relax possible races >>> you have, although not completely. >> >> I finally got around to this test... 1 CPU works very well, w/o hangs >> (will test all night to see if this holds true), This has run through 1KB-8KB blocks for nearly 24 hours w/o error. The single core case seems to work. Chris > 2 or more don't. >> This is dual-socket NHM, so I can't specify more than one processor >> w/o getting more than one socket. > > I don't know if this is important, but 1KB block tests didn't have a > problem w/ 2 or 4 maxcpus... they didn't hang until 2KB blocks: > > fio --rw=randrw --bs=2k --rwmixread=100 --numjobs=64 --iodepth=64 > --sync=0 --direct=1 --randrepeat=0 --ioengine=libaio > --filename=/dev/sdb --filename=/dev/sdc --name=test --loops=10000 > --size=32183006002 --runtime=600 --group_reporting > > Chris >> >> Chris >>> >>> 2. Try with another hardware, including motherboard. You can have something >>> like http://lkml.org/lkml/2007/7/31/558 (not exactly it, of course) >>> >>>> Chris >>>>> >>>>> Chris >>>>>> >>>>>> Bart. >>>>>> >>> >>> >> > From sashak at voltaire.com Sat Sep 19 10:37:56 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 19 Sep 2009 20:37:56 +0300 Subject: [ofa-general] Re: [PATCH] ibsim/sim_cmd.c: Cosmetic change to error message In-Reply-To: <20090918133216.GA17787@comcast.net> References: <20090918133216.GA17787@comcast.net> Message-ID: <20090919173756.GG13667@me> On 09:32 Fri 18 Sep , Hal Rosenstock wrote: > > Signed-off-by: Hal Rosenstock Applied. Thanks. 
Sasha From sashak at voltaire.com Sat Sep 19 10:47:01 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 19 Sep 2009 20:47:01 +0300 Subject: [ofa-general] Re: [PATCH] opensm/osm_perfmgr_db.c: Fix memory leak of db nodes In-Reply-To: <20090918143128.GA23618@comcast.net> References: <20090918143128.GA23618@comcast.net> Message-ID: <20090919174701.GH13667@me> Hi Hal, On 10:31 Fri 18 Sep , Hal Rosenstock wrote: > > Signed-off-by: Hal Rosenstock > --- > diff --git a/opensm/opensm/osm_perfmgr_db.c b/opensm/opensm/osm_perfmgr_db.c > index e5dfc19..329743a 100644 > --- a/opensm/opensm/osm_perfmgr_db.c > +++ b/opensm/opensm/osm_perfmgr_db.c > @@ -49,6 +49,8 @@ > #include > #include > > +static void free_node(db_node_t * node); > + > /** ========================================================================= > */ > perfmgr_db_t *perfmgr_db_construct(osm_perfmgr_t *perfmgr) > @@ -68,7 +70,16 @@ perfmgr_db_t *perfmgr_db_construct(osm_perfmgr_t *perfmgr) > */ > void perfmgr_db_destroy(perfmgr_db_t * db) > { > + cl_map_item_t *item; > + db_node_t *node; > + > if (db) { > + item = cl_qmap_head(&db->pc_data); > + while (item != cl_qmap_end(&db->pc_data)) { > + node = (db_node_t *)item; > + free_node(node); > + item = cl_qmap_next(item); Use after free (memory pointed by item is freed already)? Sasha > + } > cl_plock_destroy(&db->lock); > free(db); > } > From sashak at voltaire.com Sat Sep 19 10:58:32 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 19 Sep 2009 20:58:32 +0300 Subject: [ofa-general] Re: [PATCH][TRIVIAL] infiniband-diags/: Cosmetic changes, mostly typos In-Reply-To: <829ded920909150439h585f05e3hf0174713ecd2faf4@mail.gmail.com> References: <829ded920909150439h585f05e3hf0174713ecd2faf4@mail.gmail.com> Message-ID: <20090919175832.GI13667@me> On 17:09 Tue 15 Sep , Keshetti Mahesh wrote: > Cosmetic changes > > Signed-off-by: Keshetti Mahesh < keshetti.mahesh at gmail.com> The patch is malformed (broken long lines and mangled whitespaces). 
Applied by hands. Thanks. Sasha > --- > > diff --git a/infiniband-diags/scripts/ibcheckwidth.in > b/infiniband-diags/scripts/ibcheckwidth.in > index 6b723c5..cbb154c 100644 > --- a/infiniband-diags/scripts/ibcheckwidth.in > +++ b/infiniband-diags/scripts/ibcheckwidth.in > @@ -77,7 +77,7 @@ BEGIN { > function check_node(lid) > { > nodechecked=1 > - if (system("'$IBPATH'/ibchecknode'"$ca_info"' '$gflags' > '$verbose' " lid)) { > + if (system("'$IBPATH'/ibchecknode '"$ca_info"' '$gflags' > '$verbose' " lid)) { > ne++ > badnode=1 > return > @@ -113,7 +113,7 @@ function check_node(lid) > } > sub("\\(.*\\)", "", port) > gsub("[\\[\\]]", "", port) > - if (system("'$IBPATH'/ibcheckportwidth'"$ca_info"' > '$gflags' '$verbose' " lid " " port)) { > + if (system("'$IBPATH'/ibcheckportwidth '"$ca_info"' > '$gflags' '$verbose' " lid " " port)) { > if (!'$v' && oldlid != lid) { > print "# Checked " ntype ": nodeguid > 0x" nodeguid " with failure" > oldlid = lid > diff --git a/infiniband-diags/src/ibportstate.c > b/infiniband-diags/src/ibportstate.c > index 76e74f7..018bc9a 100644 > --- a/infiniband-diags/src/ibportstate.c > +++ b/infiniband-diags/src/ibportstate.c > @@ -349,7 +349,7 @@ int main(int argc, char **argv) > get_port_info(&peerportid, data, peerlocalportnum, > port_op); > if (err < 0) > - IBERROR("smp query peer portinfofailed"); > + IBERROR("smp query peer portinfo failed"); > > mad_decode_field(data, IB_PORT_LINK_WIDTH_ENABLED_F, > &peerlwe); > diff --git a/infiniband-diags/man/ibcheckportwidth.8 > b/infiniband-diags/man/ibcheckportwidth.8 > index 85c06fc..c368467 100644 > --- a/infiniband-diags/man/ibcheckportwidth.8 > +++ b/infiniband-diags/man/ibcheckportwidth.8 > @@ -4,7 +4,7 @@ > ibcheckportwidth \- validate IB port for 1x link width > > .SH SYNOPSIS > -.B ibcheckport > +.B ibcheckportwidth > [\-h] [\-v] [\-N | \-nocolor] [\-G] [\-C ca_name] [\-P ca_port] > [\-t(imeout) timeout_ms] > -- > 1.6.4.2 > > -- > Keshetti Mahesh > -- > To unsubscribe from this list: 
send the line "unsubscribe linux-rdma" in > the body of a message to majordomo at vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > From sashak at voltaire.com Sat Sep 19 11:08:26 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 19 Sep 2009 21:08:26 +0300 Subject: [ofa-general] Re: [PATCH] 'ibcheckportwidth' : Exit if LWS is 1X In-Reply-To: <829ded920909152042y227a709u6e74ee05d6ff05@mail.gmail.com> References: <829ded920909152042y227a709u6e74ee05d6ff05@mail.gmail.com> Message-ID: <20090919180826.GJ13667@me> On 09:12 Wed 16 Sep , Keshetti Mahesh wrote: > Fix: Modify ibcheckportwidth to exit if LWS is 1X instead of processing > next lines. > Trivial: ibcheckportwidth man page cosmetic change Again, the patch is malformed. Please verify the patches before submission. > > Signed-off-by: Keshetti Mahesh > --- > infiniband-diags/man/ibcheckportwidth.8      |    2 +- > infiniband-diags/scripts/ibcheckportwidth.in |    2 +- > > diff --git a/infiniband-diags/man/ibcheckportwidth.8 > b/infiniband-diags/man/ibcheckportwidth.8 > index 85c06fc..c368467 100644 > --- a/infiniband-diags/man/ibcheckportwidth.8 > +++ b/infiniband-diags/man/ibcheckportwidth.8 > @@ -4,7 +4,7 @@ >  ibcheckportwidth \- validate IB port for 1x link width > >  .SH SYNOPSIS > -.B ibcheckport > +.B ibcheckportwidth >  [\-h] [\-v] [\-N | \-nocolor] [\-G] [\-C ca_name] [\-P ca_port] >  [\-t(imeout) timeout_ms]   > This chunk appears already in the previous patch. > diff --git a/infiniband-diags/scripts/ibcheckportwidth.in > b/infiniband-diags/scripts/ibcheckportwidth.in > index 32c5c5e..60a0892 100644 > --- a/infiniband-diags/scripts/ibcheckportwidth.in > +++ b/infiniband-diags/scripts/ibcheckportwidth.in > @@ -103,7 +103,7 @@ function blue(s) >  } > >  # Only check LinkWidthActive if LinkWidthSupported is not 1X > -/^LinkWidthSupported/{ if ($2 != "1X") { next } } > +/^LinkWidthSupported/{ if ($2 == "1X") { exit } } Applied (by hands). Thanks. 
Sasha > -- > 1.6.4.2 > > -- > Keshetti Mahesh > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo at vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > From sashak at voltaire.com Sat Sep 19 11:43:41 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 19 Sep 2009 21:43:41 +0300 Subject: [ofa-general] Re: [PATCHv2] opensm/opensm.8.in: Indicate default rule for Default partition In-Reply-To: <20090914154236.GA11092@comcast.net> References: <20090914154236.GA11092@comcast.net> Message-ID: <20090919184341.GK13667@me> On 11:42 Mon 14 Sep , Hal Rosenstock wrote: > > Also, similar change to doc/partition-config.txt > > Signed-off-by: Hal Rosenstock Applied. Thanks. Sasha From sashak at voltaire.com Sat Sep 19 11:58:04 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 19 Sep 2009 21:58:04 +0300 Subject: [ofa-general] Re: [PATCH] infiniband-diags/ibportstate: Support changing of link width In-Reply-To: <20090908221137.GA24265@comcast.net> References: <20090908221137.GA24265@comcast.net> Message-ID: <20090919185804.GA17656@me> On 18:11 Tue 08 Sep , Hal Rosenstock wrote: > > Also, update man page > > Signed-off-by: Hal Rosenstock Applied. Thanks. Sasha From sashak at voltaire.com Sat Sep 19 12:11:16 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 19 Sep 2009 22:11:16 +0300 Subject: [ofa-general] Re: [PATCHv2] opensm/osm_inform.c: For traps 64-67, use GID from DataDetails in log message In-Reply-To: <20090909130429.GA1946@comcast.net> References: <20090909130429.GA1946@comcast.net> Message-ID: <20090919191116.GB17656@me> On 09:04 Wed 09 Sep , Hal Rosenstock wrote: > > Issuer GID is uninteresting for SM generated notices > > Signed-off-by: Hal Rosenstock Applied. Thanks. 
Sasha From sashak at voltaire.com Sat Sep 19 12:39:05 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 19 Sep 2009 22:39:05 +0300 Subject: [ofa-general] [PATCH] opensm/osm_notice.c: move logging code to separate function In-Reply-To: <20090909130429.GA1946@comcast.net> References: <20090909130429.GA1946@comcast.net> Message-ID: <20090919193905.GC17656@me> Move report notice logging to separate function log_notice(). Signed-off-by: Sasha Khapyorsky --- opensm/opensm/osm_inform.c | 70 +++++++++++++++++++++++-------------------- 1 files changed, 37 insertions(+), 33 deletions(-) diff --git a/opensm/opensm/osm_inform.c b/opensm/opensm/osm_inform.c index 6e1a2b5..9b451bd 100644 --- a/opensm/opensm/osm_inform.c +++ b/opensm/opensm/osm_inform.c @@ -537,15 +537,48 @@ Exit: * element and if it does - call the Report(Notice) for the * target QP registered by the address stored in the InformInfo element **********************************************************************/ +static void log_notice(osm_log_t * log, osm_log_level_t level, + ib_mad_notice_attr_t * ntc) +{ + char gid_str[INET6_ADDRSTRLEN]; + ib_gid_t *gid; + + /* an official Event information log */ + if (ib_notice_is_generic(ntc)) { + if ((ntc->g_or_v.generic.trap_num == CL_HTON16(64)) || + (ntc->g_or_v.generic.trap_num == CL_HTON16(65)) || + (ntc->g_or_v.generic.trap_num == CL_HTON16(66)) || + (ntc->g_or_v.generic.trap_num == CL_HTON16(67))) + gid = &ntc->data_details.ntc_64_67.gid; + else + gid = &ntc->issuer_gid; + OSM_LOG(log, level, + "Reporting Generic Notice type:%u num:%u (%s)" + " from LID:%u GID:%s\n", + ib_notice_get_type(ntc), + cl_ntoh16(ntc->g_or_v.generic.trap_num), + ib_get_trap_str(ntc->g_or_v.generic.trap_num), + cl_ntoh16(ntc->issuer_lid), + inet_ntop(AF_INET6, gid->raw, gid_str, sizeof gid_str)); + } else + OSM_LOG(log, level, + "Reporting Vendor Notice type:%u vend:%u dev:%u" + " from LID:%u GID:%s\n", + ib_notice_get_type(ntc), + cl_ntoh32(ib_notice_get_vend_id(ntc)), + 
cl_ntoh16(ntc->g_or_v.vend.dev_id), + cl_ntoh16(ntc->issuer_lid), + inet_ntop(AF_INET6, ntc->issuer_gid.raw, gid_str, + sizeof gid_str)); +} + ib_api_status_t osm_report_notice(IN osm_log_t * p_log, IN osm_subn_t * p_subn, IN ib_mad_notice_attr_t * p_ntc) { - char gid_str[INET6_ADDRSTRLEN]; osm_infr_match_ctxt_t context; cl_list_t infr_to_remove_list; osm_infr_t *p_infr_rec; osm_infr_t *p_next_infr_rec; - ib_gid_t *p_gid; OSM_LOG_ENTER(p_log); @@ -560,38 +593,9 @@ ib_api_status_t osm_report_notice(IN osm_log_t * p_log, IN osm_subn_t * p_subn, return (IB_ERROR); } - if (!osm_log_is_active(p_log, OSM_LOG_INFO)) - goto skip_log; - - /* an official Event information log */ - if (ib_notice_is_generic(p_ntc)) { - if ((p_ntc->g_or_v.generic.trap_num == CL_HTON16(64)) || - (p_ntc->g_or_v.generic.trap_num == CL_HTON16(65)) || - (p_ntc->g_or_v.generic.trap_num == CL_HTON16(66)) || - (p_ntc->g_or_v.generic.trap_num == CL_HTON16(67))) - p_gid = (ib_gid_t *)&p_ntc->data_details.ntc_64_67.gid.raw; - else - p_gid = (ib_gid_t *)&p_ntc->issuer_gid.raw; - OSM_LOG(p_log, OSM_LOG_INFO, - "Reporting Generic Notice type:%u num:%u (%s)" - " from LID:%u GID:%s\n", - ib_notice_get_type(p_ntc), - cl_ntoh16(p_ntc->g_or_v.generic.trap_num), - ib_get_trap_str(p_ntc->g_or_v.generic.trap_num), - cl_ntoh16(p_ntc->issuer_lid), - inet_ntop(AF_INET6, p_gid->raw, gid_str, sizeof gid_str)); - } else - OSM_LOG(p_log, OSM_LOG_INFO, - "Reporting Vendor Notice type:%u vend:%u dev:%u" - " from LID:%u GID:%s\n", - ib_notice_get_type(p_ntc), - cl_ntoh32(ib_notice_get_vend_id(p_ntc)), - cl_ntoh16(p_ntc->g_or_v.vend.dev_id), - cl_ntoh16(p_ntc->issuer_lid), - inet_ntop(AF_INET6, p_ntc->issuer_gid.raw, gid_str, - sizeof gid_str)); + if (osm_log_is_active(p_log, OSM_LOG_INFO)) + log_notice(p_log, OSM_LOG_INFO, p_ntc); -skip_log: /* Create a list that will hold all the infr records that should be removed due to violation. 
o13-17.1.2 */ cl_list_construct(&infr_to_remove_list); -- 1.6.5.rc1 From sashak at voltaire.com Sat Sep 19 13:05:40 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 19 Sep 2009 23:05:40 +0300 Subject: [ofa-general] Re: [PATCH] ibportstate: add width option In-Reply-To: <4AAD1C71.4010702@voltaire.com> References: <4AAD1C71.4010702@voltaire.com> Message-ID: <20090919200540.GD17656@me> Hi Doron, On 19:23 Sun 13 Sep , Doron Shoham wrote: > ibportstate: add width option. > Similar to the speed option, this option can > explicitly set the port's LinkWidthEnable value. > It supports values from 0-15 and 255. > > Signed-off-by: Doron Shoham Almost equivalent patch was already submitted (and applied) by Hal. So I'm just rebasing this to apply typos fixes: commit cdc6452b0f8e2cd85dd61d5b43c9c65bfd5cc1a8 Author: Doron Shoham Date: Sun Sep 13 19:23:13 2009 +0300 ibportstate: fixes for width option Originally was: ibportstate: add width option. Similar to the speed option, this option can explicitly set the port's LinkWidthEnable value. It supports values from 0-15 and 255. But since similar patch was submitted before, there is only rebase results with couple of typos fixes. Signed-off-by: Doron Shoham Signed-off-by: Sasha Khapyorsky diff --git a/infiniband-diags/man/ibportstate.8 b/infiniband-diags/man/ibportstate.8 index b64c18d..fea860e 100644 --- a/infiniband-diags/man/ibportstate.8 +++ b/infiniband-diags/man/ibportstate.8 @@ -15,7 +15,7 @@ ibportstate allows the port state and port physical state of an IB port to be queried (in addition to link width and speed being validated relative to the peer port when the port queried is a switch port), or a switch port to be disabled, enabled, or reset. It -also allows the link speed enabled on any IB port to be adjusted. +also allows the link speed/width enabled on any IB port to be adjusted. 
.SH OPTIONS @@ -32,7 +32,7 @@ Port operations allowed speed values are legal values for PortInfo:LinkSpeedEnabled (An error is indicated if PortInfo:LinkSpeedSupported does not support this setting) - width valyes are legal values for PortInfo:LinkWidthEnabled + width values are legal values for PortInfo:LinkWidthEnabled (An error is indicated if PortInfo:LinkWidthSupported does not support this setting) (NOTE: Speed and width changes are not effected until the port goes through diff --git a/infiniband-diags/src/ibportstate.c b/infiniband-diags/src/ibportstate.c index df13298..6fb97a8 100644 --- a/infiniband-diags/src/ibportstate.c +++ b/infiniband-diags/src/ibportstate.c @@ -204,7 +204,6 @@ int main(int argc, char **argv) int err; int port_op = 0; /* default to query */ int speed = 15; - int new_width = 255; int is_switch = 1; int state, physstate, lwe, lws, lwa, lse, lss, lsa; int peerlocalportnum, peerlwe, peerlws, peerlwa, peerlse, peerlss, @@ -271,9 +270,9 @@ int main(int argc, char **argv) ("width requires an additional parameter"); port_op = 5; /* Parse width value */ - new_width = strtoul(argv[3], 0, 0); - if (new_width > 255) - IBERROR("invalid width value %d", new_width); + width = strtoul(argv[3], 0, 0); + if (width > 15 && width != 255) + IBERROR("invalid width value %d", width); } } @@ -311,7 +310,7 @@ int main(int argc, char **argv) mad_set_field(data, 0, IB_PORT_PHYS_STATE_F, 0); } else if (port_op == 5) { /* Set width */ mad_set_field(data, 0, IB_PORT_LINK_WIDTH_ENABLED_F, - new_width); + width); mad_set_field(data, 0, IB_PORT_STATE_F, 0); mad_set_field(data, 0, IB_PORT_PHYS_STATE_F, 0); } From sashak at voltaire.com Sat Sep 19 13:15:23 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 19 Sep 2009 23:15:23 +0300 Subject: [ofa-general] Re: [PATCH] infiniband-diags/ibnetdiscover: Add separator when printing chassis type In-Reply-To: <20090914125933.0755a2dc@frecb007965> References: <20090914125933.0755a2dc@frecb007965> Message-ID: 
<20090919201523.GE17656@me> On 12:59 Mon 14 Sep , sebastien dugue wrote: > > When grouping is enabled add a '#' separator between the switch guid > and the chassis type and slot. I'm fine with this change, but please next time explain the change reason too - this modifies ibnetdiscover output and may break (hypothetically) some third party scripts. > Signed-off-by: Sebastien Dugue Applied. Thanks. Sasha From kalcher at kip.uni-heidelberg.de Sat Sep 19 14:57:01 2009 From: kalcher at kip.uni-heidelberg.de (Sebastian Kalcher) Date: Sat, 19 Sep 2009 23:57:01 +0200 Subject: [ofa-general] OFED interfering with Ethernet Message-ID: <20090919235701.14724oeyxigf9rqc@mail.kip.uni-heidelberg.de> Hi all, I need a little direction to cope with a strange problem: I have node A and B connected via an HP Procurve Gigabit Ethernet switch. Node B has also an Mellanox MT26428 QDR HCA. I run iperf to test the GbE connection between A and B. Everything works fine. If I load the openib drivers (i.e. "/etc/init.d/openib start") on B. The iperf connection gets very flaky. Sometimes it stalls for several seconds (or tens of seconds) and then starts again, after a while it completely freezes. I don't see any TCP timeouts, the iperf processes are simply sleeping, and no packets are transferred. It doesn't matter if the ib0 device is configured or not, or whether opensm is running. If I do an "openib stop" and restart iperf everything returns to normal. The behavior is reproducible on different nodes with the same constellation. I tried OFED 1.4 and 1.5beta (kernel 2.6.24). What completely confuses me here is that it is the ethernet connection that gets screwed up, IB isn't used at all other than that the drivers are loaded. Any hint where to continue would be greatly appreciated. 
Sebastian From jgunthorpe at obsidianresearch.com Sat Sep 19 16:02:41 2009 From: jgunthorpe at obsidianresearch.com (Jason Gunthorpe) Date: Sat, 19 Sep 2009 17:02:41 -0600 Subject: [ofa-general] OFED interfering with Ethernet In-Reply-To: <20090919235701.14724oeyxigf9rqc@mail.kip.uni-heidelberg.de> References: <20090919235701.14724oeyxigf9rqc@mail.kip.uni-heidelberg.de> Message-ID: <20090919230241.GB22310@obsidianresearch.com> On Sat, Sep 19, 2009 at 11:57:01PM +0200, Sebastian Kalcher wrote: > If I do an "openib stop" and restart iperf everything returns to normal. > > The behavior is reproducible on different nodes with the same > constellation. I tried OFED 1.4 and 1.5beta (kernel 2.6.24). What > completely confuses me here is that it is the ethernet connection that > gets screwed up, IB isn't used at all other than that the drivers are > loaded. > > Any hint where to continue would be greatly appreciated. The openib init script touches all sorts of weird things, betcha one of those is your problem... See if you can narrow it down? 
You can try loading the modules yourself to confirm it is not a module problem: modprobe mlx4_ib modprobe ib_ipoib modprobe ib_ucm modprobe ib_uverbs modprobe ib_umad Jason From vlad at lists.openfabrics.org Sun Sep 20 03:05:53 2009 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Sun, 20 Sep 2009 03:05:53 -0700 (PDT) Subject: [ofa-general] ofa_1_5_kernel 20090920-0200 daily build status Message-ID: <20090920100553.BF41EE6200C@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git git_branch: ofed_kernel_1_5 Common build parameters: Passed: Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.26 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.27 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18-128.el5 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.27 Passed on x86_64 with linux-2.6.9-78.ELsmp Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.26 Passed on ia64 with linux-2.6.25 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Failed: From sashak at voltaire.com Sun Sep 20 03:20:45 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 20 Sep 2009 13:20:45 +0300 Subject: [ofa-general] Re: [PATCH] osmtest: Add SA get PathRecord stress test In-Reply-To:
<20090831192134.GA12094@comcast.net> References: <20090831192134.GA12094@comcast.net> Message-ID: <20090920102045.GH17656@me> Hi Hal, On 15:21 Mon 31 Aug , Hal Rosenstock wrote: > > Signed-off-by: Hal Rosenstock > --- > diff --git a/opensm/man/osmtest.8 b/opensm/man/osmtest.8 > index fa0cd52..f0d6323 100644 > --- a/opensm/man/osmtest.8 > +++ b/opensm/man/osmtest.8 > @@ -1,4 +1,4 @@ > -.TH OSMTEST 8 "August 11, 2008" "OpenIB" "OpenIB Management" > +.TH OSMTEST 8 "August 31, 2009" "OpenIB" "OpenIB Management" > > .SH NAME > osmtest \- InfiniBand subnet manager and administration (SM/SA) test program > @@ -108,9 +108,10 @@ Stress test options are as follows: > > OPT Description > --- ----------------- > - -s1 - Single-MAD response SA queries > + -s1 - Single-MAD (RMPP) response SA queries > -s2 - Multi-MAD (RMPP) response SA queries > -s3 - Multi-MAD (RMPP) Path Record SA queries > + -s4 - Single-MAD (non RMPP) get Path Record SA queries > > Without -s, stress testing is not performed > .TP > diff --git a/opensm/osmtest/include/osmtest_base.h b/opensm/osmtest/include/osmtest_base.h > index 7c33da3..cda3a31 100644 > --- a/opensm/osmtest/include/osmtest_base.h > +++ b/opensm/osmtest/include/osmtest_base.h > @@ -56,11 +56,12 @@ > > #define STRESS_SMALL_RMPP_THR 100000 > /* > - Take long times when quering big clusters (over 40 nodes) , an average of : 0.25 sec for query > + Take long times when querying big clusters (over 40 nodes), an average of : 0.25 sec for query > each query receives 1000 records > */ > #define STRESS_LARGE_RMPP_THR 4000 > #define STRESS_LARGE_PR_RMPP_THR 20000 > +#define STRESS_GET_PR 100000 > > extern const char *const p_file; > > diff --git a/opensm/osmtest/main.c b/opensm/osmtest/main.c > index bb2d6bc..4bb9f82 100644 > --- a/opensm/osmtest/main.c > +++ b/opensm/osmtest/main.c > @@ -143,9 +143,10 @@ void show_usage() > " Stress test options are as follows:\n" > " OPT Description\n" > " --- -----------------\n" > - " -s1 - Single-MAD response SA 
queries\n" > + " -s1 - Single-MAD (RMPP) response SA queries\n" > " -s2 - Multi-MAD (RMPP) response SA queries\n" > " -s3 - Multi-MAD (RMPP) Path Record SA queries\n" > + " -s4 - Single-MAD (non RMPP) get Path Record SA queries\n" > " Without -s, stress testing is not performed\n\n"); > printf("-M\n" > "--Multicast_Mode\n" > @@ -499,6 +500,9 @@ int main(int argc, char *argv[]) > case 3: > printf("Large Path Record SA queries\n"); > break; > + case 4: > + printf("SA Get Path Record queries\n"); > + break; > default: > printf("Unknown value %u (ignored)\n", > opt.stress); > diff --git a/opensm/osmtest/osmtest.c b/opensm/osmtest/osmtest.c > index 986a8d2..8357d90 100644 > --- a/opensm/osmtest/osmtest.c > +++ b/opensm/osmtest/osmtest.c > @@ -2882,6 +2882,151 @@ Exit: > > /********************************************************************** > **********************************************************************/ > +ib_api_status_t > +osmtest_stress_path_recs_by_lid(IN osmtest_t * const p_osmt, > + IN int mode, > + OUT uint32_t * const p_num_recs, > + OUT uint32_t * const p_num_queries) > +{ > + osmtest_req_context_t context; > + ib_path_rec_t *p_rec; > + cl_status_t status; > + ib_net16_t dlid, slid; > + int num_recs, i; > + > + OSM_LOG_ENTER(&p_osmt->log); > + > + memset(&context, 0, sizeof(context)); > + > + slid = cl_ntoh16(p_osmt->local_port.lid); > + if (!mode) > + dlid = cl_ntoh16(p_osmt->local_port.sm_lid); > + else > + dlid = cl_ntoh16(p_osmt->local_port.lid); What is purpose of this "mode" variable? I see (below) that it is not used. > + > + /* > + * Do a blocking query for the PathRecord. > + */ > + status = osmtest_get_path_rec_by_lid_pair(p_osmt, slid, dlid, &context); > + if (status != IB_SUCCESS) { > + OSM_LOG(&p_osmt->log, OSM_LOG_ERROR, "ERR 000A: " > + "osmtest_get_path_rec_by_lid_pair failed (%s)\n", > + ib_get_err_str(status)); > + goto Exit; > + } It is not really "stress" testing, just pinging. Shouldn't it be clarified in test description? 
> + > + /* > + * Populate the database with the received records. > + */ > + num_recs = context.result.result_cnt; > + *p_num_recs += num_recs; > + ++*p_num_queries; > + > + if (osm_log_is_active(&p_osmt->log, OSM_LOG_VERBOSE)) { > + OSM_LOG(&p_osmt->log, OSM_LOG_VERBOSE, > + "Received %u records\n", num_recs); > + > + for (i = 0; i < num_recs; i++) { > + p_rec = osmv_get_query_path_rec(context.result.p_result_madw, 0); > + osm_dump_path_record(&p_osmt->log, p_rec, OSM_LOG_VERBOSE); > + } > + } > + > +Exit: > + /* > + * Return the IB query MAD to the pool as necessary. > + */ > + if (context.result.p_result_madw != NULL) { > + osm_mad_pool_put(&p_osmt->mad_pool, > + context.result.p_result_madw); > + context.result.p_result_madw = NULL; > + } > + > + OSM_LOG_EXIT(&p_osmt->log); > + return (status); > +} > + > +/********************************************************************** > + **********************************************************************/ > +static ib_api_status_t osmtest_stress_get_pr(IN osmtest_t * const p_osmt, > + IN int mode) > +{ > + ib_api_status_t status = IB_SUCCESS; > + uint64_t num_recs = 0; > + uint64_t num_queries = 0; > + uint32_t delta_recs; > + uint32_t delta_queries; > + uint32_t print_freq = 0; > + int num_timeouts = 0; > + struct timeval start_tv, end_tv; > + long sec_diff, usec_diff; > + > + OSM_LOG_ENTER(&p_osmt->log); > + gettimeofday(&start_tv, NULL); > + printf("-I- Start time is : %09ld:%06ld [sec:usec]\n", > + start_tv.tv_sec, (long)start_tv.tv_usec); > + > + while ((num_queries < STRESS_GET_PR) && (num_timeouts < 100)) { > + delta_recs = 0; > + delta_queries = 0; > + > + status = osmtest_stress_path_recs_by_lid(p_osmt, mode, > + &delta_recs, > + &delta_queries); > + if (status != IB_SUCCESS) > + goto Exit; > + > + num_recs += delta_recs; > + num_queries += delta_queries; > + > + print_freq += delta_recs; > + if (print_freq > 5000) { > + gettimeofday(&end_tv, NULL); > + printf("%" PRIu64 " records, %" PRIu64 " queries\n", 
> + num_recs, num_queries); > + if (end_tv.tv_usec > start_tv.tv_usec) { > + sec_diff = end_tv.tv_sec - start_tv.tv_sec; > + usec_diff = end_tv.tv_usec - start_tv.tv_usec; > + } else { > + sec_diff = end_tv.tv_sec - start_tv.tv_sec - 1; > + usec_diff = > + 1000000 - (start_tv.tv_usec - > + end_tv.tv_usec); > + } > + printf("-I- End time is : %09ld:%06ld [sec:usec]\n", > + end_tv.tv_sec, (long)end_tv.tv_usec); > + printf("-I- Querying %" PRId64 > + " path_rec queries took %04ld:%06ld [sec:usec]\n", > + num_queries, sec_diff, usec_diff); > + print_freq = 0; > + } > + } > + > +Exit: > + gettimeofday(&end_tv, NULL); > + printf("-I- End time is : %09ld:%06ld [sec:usec]\n", > + end_tv.tv_sec, (long)end_tv.tv_usec); > + if (end_tv.tv_usec > start_tv.tv_usec) { > + sec_diff = end_tv.tv_sec - start_tv.tv_sec; > + usec_diff = end_tv.tv_usec - start_tv.tv_usec; > + } else { > + sec_diff = end_tv.tv_sec - start_tv.tv_sec - 1; > + usec_diff = 1000000 - (start_tv.tv_usec - end_tv.tv_usec); > + } Not for specific patch, but in general for osmtest - it would be really nice to consolidate all those duplications over osmtest code. 
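As a rough illustration of the consolidation suggested above, the elapsed-time arithmetic that recurs in the osmtest stress loops could live in one small helper. This is only a sketch: the name elapsed_timeval is hypothetical and does not exist in osmtest. It also normalizes the case where the two tv_usec values are equal, which the quoted code sends to the borrow branch (giving usec_diff == 1000000).

```c
#include <sys/time.h>

/*
 * Hypothetical helper (not part of osmtest): store end - start in *diff,
 * normalized so that 0 <= diff->tv_usec < 1000000.
 * Assumes end is not earlier than start.
 */
static void elapsed_timeval(const struct timeval *start,
			    const struct timeval *end,
			    struct timeval *diff)
{
	diff->tv_sec = end->tv_sec - start->tv_sec;
	diff->tv_usec = end->tv_usec - start->tv_usec;
	if (diff->tv_usec < 0) {
		/* Borrow one second when the microsecond part underflows. */
		diff->tv_sec -= 1;
		diff->tv_usec += 1000000;
	}
}
```

Each print site would then call elapsed_timeval(&start_tv, &end_tv, &diff) and print (long)diff.tv_sec and (long)diff.tv_usec with the existing "%04ld:%06ld" format. Where it is available, the timersub() macro from <sys/time.h> performs the same subtraction.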
Sasha > + > + printf("-I- Querying %" PRId64 > + " path_rec queries took %04ld:%06ld [sec:usec]\n", > + num_queries, sec_diff, usec_diff); > + if (num_timeouts > 50) { > + status = IB_TIMEOUT; > + } > + /* Exit: */ > + OSM_LOG_EXIT(&p_osmt->log); > + return (status); > +} > + > +/********************************************************************** > + **********************************************************************/ > static void > osmtest_prepare_db_generic(IN osmtest_t * const p_osmt, > IN cl_qmap_t * const p_tbl) > @@ -7247,6 +7392,16 @@ ib_api_status_t osmtest_run(IN osmtest_t * const p_osmt) > goto Exit; > } > break; > + case 4: /* SA Get PR to SA LID */ > + status = osmtest_stress_get_pr(p_osmt, 0); > + if (status != IB_SUCCESS) { > + OSM_LOG(&p_osmt->log, OSM_LOG_ERROR, > + "ERR 014B: " > + "SA Get PR stress test failed (%s)\n", > + ib_get_err_str(status)); > + goto Exit; > + } > + break; > default: > OSM_LOG(&p_osmt->log, OSM_LOG_ERROR, > "ERR 0144: " From sashak at voltaire.com Sun Sep 20 05:49:14 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 20 Sep 2009 15:49:14 +0300 Subject: [ofa-general] Re: [PATCH] opensm: Add infrastructure support for more newly allocated PortInfo CapabilityMask bits In-Reply-To: <20090901125505.GA9455@comcast.net> References: <20090901125505.GA9455@comcast.net> Message-ID: <20090920124914.GI17656@me> On 08:55 Tue 01 Sep , Hal Rosenstock wrote: > > Per published MgtWG errata: > RefID 4484 - vendor specific MADs > RefID 4575 - multicast PKey trap suppression > RefID 4641 - hierarchy info > > Signed-off-by: Hal Rosenstock Applied. Thanks. 
Sasha From kalcher at kip.uni-heidelberg.de Sun Sep 20 16:17:55 2009 From: kalcher at kip.uni-heidelberg.de (Sebastian Kalcher) Date: Mon, 21 Sep 2009 01:17:55 +0200 Subject: [ofa-general] OFED interfering with Ethernet In-Reply-To: <20090919230241.GB22310@obsidianresearch.com> References: <20090919235701.14724oeyxigf9rqc@mail.kip.uni-heidelberg.de> <20090919230241.GB22310@obsidianresearch.com> Message-ID: <20090921011755.57846mh1c5eqopc8@mail.kip.uni-heidelberg.de> > The openib init script touches all sorts of weird things, betcha one > of those is your problem... See if you can narrow it down? > > You can try loading the modules yourself to confirm it is not a module > problem: > modprobe mlx4_ib > modprobe ib_ipoib > modprobe ib_ucm > modprobe ib_uverbs > modprobe ib_umad Thanks a lot Jason! Loading the modules directly w/o using the init script is the solution. No more side effects in the ethernet. The ipoib bandwidth drops significantly though. I will look into the init script and try to narrow it down. Interestingly the problem does not show up w/o the HP GbE switch. I isolated the nodes and connected them back to back, as well as with an older 8-port unmanaged switch. In both cases I don't see the negative effect. Sebastian From jgunthorpe at obsidianresearch.com Sun Sep 20 16:38:13 2009 From: jgunthorpe at obsidianresearch.com (Jason Gunthorpe) Date: Sun, 20 Sep 2009 17:38:13 -0600 Subject: [ofa-general] OFED interfering with Ethernet In-Reply-To: <20090921011755.57846mh1c5eqopc8@mail.kip.uni-heidelberg.de> References: <20090919235701.14724oeyxigf9rqc@mail.kip.uni-heidelberg.de> <20090919230241.GB22310@obsidianresearch.com> <20090921011755.57846mh1c5eqopc8@mail.kip.uni-heidelberg.de> Message-ID: <20090920233813.GC22310@obsidianresearch.com> On Mon, Sep 21, 2009 at 01:17:55AM +0200, Sebastian Kalcher wrote: > Interestingly the problem does not show up w/o the HP GbE switch.
I > isolated the nodes and connected them back to back, as well as with an > older 8-port unmanaged switch. In both cases I don't see the negative > effect. Hmm, double weird.. bonding related perhaps? -- Jason Gunthorpe (780)4406067x832 Chief Technology Officer, Obsidian Research Corp Edmonton, Canada From dotanba at gmail.com Sun Sep 20 23:04:31 2009 From: dotanba at gmail.com (Dotan Barak) Date: Mon, 21 Sep 2009 09:04:31 +0300 Subject: [ofa-general] Building IB SAN with Linux without switch In-Reply-To: <432fbe81a097b0082ea54201f1cc3ec5.squirrel@webmail.tekno-soft.it> References: <432fbe81a097b0082ea54201f1cc3ec5.squirrel@webmail.tekno-soft.it> Message-ID: <2f3bf9a60909202304r33c97d6dx37c48b1eb8d319eb@mail.gmail.com> Hi. On Fri, Sep 18, 2009 at 11:41 PM, Roberto Fichera wrote: > Hi All in the list, > > I would like to know if it's possible to configure a linux server with 2 > or 3 HCAs, with 2 ports each, so that I can connect 4 or 6 nodes without > using any switch in the middle. If possible, please show an example of the > network configuration. Yes, this is possible. But please pay attention: Every port will be in a different Infiniband subnet (from the local host point of view). What do you plan to do with this setup? (which SW/program to use?) Dotan From monis at Voltaire.COM Mon Sep 21 00:18:10 2009 From: monis at Voltaire.COM (Moni Shoua) Date: Mon, 21 Sep 2009 10:18:10 +0300 Subject: [ofa-general] OFED interfering with Ethernet In-Reply-To: <20090920233813.GC22310@obsidianresearch.com> References: <20090919235701.14724oeyxigf9rqc@mail.kip.uni-heidelberg.de> <20090919230241.GB22310@obsidianresearch.com> <20090921011755.57846mh1c5eqopc8@mail.kip.uni-heidelberg.de> <20090920233813.GC22310@obsidianresearch.com> Message-ID: <4AB728B2.7020201@Voltaire.COM> Jason Gunthorpe wrote: > On Mon, Sep 21, 2009 at 01:17:55AM +0200, Sebastian Kalcher wrote: > >> Interestingly the problem does not show up w/o the HP GbE switch. 
I >> isolated the nodes and connected them back to back, as well as with an >> older 8-port unmanaged switch. In both cases I don't see the negative >> effect. > > Hmm, double weird.. > > bonding related perhaps? > Hi Jason, How do you think bonding could have caused that? From kliteyn at dev.mellanox.co.il Mon Sep 21 00:27:23 2009 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Mon, 21 Sep 2009 10:27:23 +0300 Subject: [ofa-general] Re: [PATCH] ibdm/ibnl/SUNDSC*.ibnl Corrected ibnl definition files for Sun IB QDR products In-Reply-To: <4AA60EAB.1020101@Sun.COM> References: <4AA60EAB.1020101@Sun.COM> Message-ID: <4AB72ADB.2040503@dev.mellanox.co.il> Lars Paul Huse wrote: > Updated ibnl definition files for Sun IB QDR products: > SUNDCS648QDR: Corrected card numbering & plugg row > SUNDCS72QDR: Corrected plugg row Thanks, applied. -- Yevgeny > Signed-off-by: Lars Paul Huse > > --- > diff --git a/ibdm/ibnl/SUNDCS648QDR.ibnl b/ibdm/ibnl/SUNDCS648QDR.ibnl > index a8b6558..af4ab6b 100644 > --- a/ibdm/ibnl/SUNDCS648QDR.ibnl > +++ b/ibdm/ibnl/SUNDCS648QDR.ibnl > @@ -80,2054 +80,2054 @@ NODE SW 36 MT48436 U1 > > TOPSYSTEM SUNDCS648QDR,SUN-M9-648 > > +SUBSYSTEM SPINE fc0A > + P1 -10G-> lc0A P13 > + P2 -10G-> lc0B P14 > + P3 -10G-> lc0C P13 > + P4 -10G-> lc0D P14 > + P5 -10G-> lc8A P13 > + P6 -10G-> lc8C P13 > + P7 -10G-> lc8B P14 > + P8 -10G-> lc7A P13 > + P9 -10G-> lc8D P14 > + P10 -10G-> lc7C P13 > + P11 -10G-> lc7B P140 > + P12 -10G-> lc6A P13 > + P13 -10G-> lc5B P14 > + P14 -10G-> lc5A P13 > + P15 -10G-> lc6D P14 > + P16 -10G-> lc6C P13 > + P17 -10G-> lc6B P14 > + P18 -10G-> lc7D P14 > + P19 -10G-> lc1D P14 > + P20 -10G-> lc1C P13 > + P21 -10G-> lc1B P14 > + P22 -10G-> lc1A P13 > + P23 -10G-> lc2D P14 > + P24 -10G-> lc2B P14 > + P25 -10G-> lc2C P13 > + P26 -10G-> lc3D P14 > + P27 -10G-> lc2A P13 > + P28 -10G-> lc3B P14 > + P29 -10G-> lc3C P13 > + P30 -10G-> lc4D P14 > + P31 -10G-> lc5C P13 > + P32 -10G-> lc5D P14 > + P33 -10G-> lc4A P13 > + P34 -10G-> lc4B 
P14 > + P35 -10G-> lc4C P13 > + P36 -10G-> lc3A P13 > + > +SUBSYSTEM SPINE fc0B > + P1 -10G-> lc7D P13 > + P2 -10G-> lc7A P14 > + P3 -10G-> lc7B P13 > + P4 -10G-> lc7C P14 > + P5 -10G-> lc6D P13 > + P6 -10G-> lc6B P13 > + P7 -10G-> lc6A P14 > + P8 -10G-> lc5D P13 > + P9 -10G-> lc6C P14 > + P10 -10G-> lc5B P13 > + P11 -10G-> lc5A P14 > + P12 -10G-> lc4D P13 > + P13 -10G-> lc3A P14 > + P14 -10G-> lc3D P13 > + P15 -10G-> lc4C P14 > + P16 -10G-> lc4B P13 > + P17 -10G-> lc4A P14 > + P18 -10G-> lc5C P14 > + P19 -10G-> lc8C P14 > + P20 -10G-> lc8B P13 > + P21 -10G-> lc8A P14 > + P22 -10G-> lc8D P13 > + P23 -10G-> lc0C P14 > + P24 -10G-> lc0A P14 > + P25 -10G-> lc0B P13 > + P26 -10G-> lc1C P14 > + P27 -10G-> lc0D P13 > + P28 -10G-> lc1A P14 > + P29 -10G-> lc1B P13 > + P30 -10G-> lc2C P14 > + P31 -10G-> lc3B P13 > + P32 -10G-> lc3C P14 > + P33 -10G-> lc2D P13 > + P34 -10G-> lc2A P14 > + P35 -10G-> lc2B P13 > + P36 -10G-> lc1D P13 > + > SUBSYSTEM SPINE fc1A > - P1 -10G-> lc1A P13 > - P2 -10G-> lc1B P14 > - P3 -10G-> lc1C P13 > - P4 -10G-> lc1D P14 > - P5 -10G-> lc9A P13 > - P6 -10G-> lc9C P13 > - P7 -10G-> lc9B P14 > - P8 -10G-> lc8A P13 > - P9 -10G-> lc9D P14 > - P10 -10G-> lc8C P13 > - P11 -10G-> lc8B P140 > - P12 -10G-> lc7A P13 > - P13 -10G-> lc6B P14 > - P14 -10G-> lc6A P13 > - P15 -10G-> lc7D P14 > - P16 -10G-> lc7C P13 > - P17 -10G-> lc7B P14 > - P18 -10G-> lc8D P14 > - P19 -10G-> lc2D P14 > - P20 -10G-> lc2C P13 > - P21 -10G-> lc2B P14 > - P22 -10G-> lc2A P13 > - P23 -10G-> lc3D P14 > - P24 -10G-> lc3B P14 > - P25 -10G-> lc3C P13 > - P26 -10G-> lc4D P14 > - P27 -10G-> lc3A P13 > - P28 -10G-> lc4B P14 > - P29 -10G-> lc4C P13 > - P30 -10G-> lc5D P14 > - P31 -10G-> lc6C P13 > - P32 -10G-> lc6D P14 > - P33 -10G-> lc5A P13 > - P34 -10G-> lc5B P14 > - P35 -10G-> lc5C P13 > - P36 -10G-> lc4A P13 > + P1 -10G-> lc0A P15 > + P2 -10G-> lc0B P16 > + P3 -10G-> lc0C P15 > + P4 -10G-> lc0D P16 > + P5 -10G-> lc8A P15 > + P6 -10G-> lc8C P15 > + P7 -10G-> lc8B P16 > + P8 -10G-> lc7A 
P15 > + P9 -10G-> lc8D P16 > + P10 -10G-> lc7C P15 > + P11 -10G-> lc7B P16 > + P12 -10G-> lc6A P15 > + P13 -10G-> lc5B P16 > + P14 -10G-> lc5A P15 > + P15 -10G-> lc6D P16 > + P16 -10G-> lc6C P15 > + P17 -10G-> lc6B P16 > + P18 -10G-> lc7D P16 > + P19 -10G-> lc1D P16 > + P20 -10G-> lc1C P15 > + P21 -10G-> lc1B P16 > + P22 -10G-> lc1A P15 > + P23 -10G-> lc2D P16 > + P24 -10G-> lc2B P16 > + P25 -10G-> lc2C P15 > + P26 -10G-> lc3D P16 > + P27 -10G-> lc2A P15 > + P28 -10G-> lc3B P16 > + P29 -10G-> lc3C P15 > + P30 -10G-> lc4D P16 > + P31 -10G-> lc5C P15 > + P32 -10G-> lc5D P16 > + P33 -10G-> lc4A P15 > + P34 -10G-> lc4B P16 > + P35 -10G-> lc4C P15 > + P36 -10G-> lc3A P15 > > SUBSYSTEM SPINE fc1B > - P1 -10G-> lc8D P13 > - P2 -10G-> lc8A P14 > - P3 -10G-> lc8B P13 > - P4 -10G-> lc8C P14 > - P5 -10G-> lc7D P13 > - P6 -10G-> lc7B P13 > - P7 -10G-> lc7A P14 > - P8 -10G-> lc6D P13 > - P9 -10G-> lc7C P14 > - P10 -10G-> lc6B P13 > - P11 -10G-> lc6A P14 > - P12 -10G-> lc5D P13 > - P13 -10G-> lc4A P14 > - P14 -10G-> lc4D P13 > - P15 -10G-> lc5C P14 > - P16 -10G-> lc5B P13 > - P17 -10G-> lc5A P14 > - P18 -10G-> lc6C P14 > - P19 -10G-> lc9C P14 > - P20 -10G-> lc9B P13 > - P21 -10G-> lc9A P14 > - P22 -10G-> lc9D P13 > - P23 -10G-> lc1C P14 > - P24 -10G-> lc1A P14 > - P25 -10G-> lc1B P13 > - P26 -10G-> lc2C P14 > - P27 -10G-> lc1D P13 > - P28 -10G-> lc2A P14 > - P29 -10G-> lc2B P13 > - P30 -10G-> lc3C P14 > - P31 -10G-> lc4B P13 > - P32 -10G-> lc4C P14 > - P33 -10G-> lc3D P13 > - P34 -10G-> lc3A P14 > - P35 -10G-> lc3B P13 > - P36 -10G-> lc2D P13 > + P1 -10G-> lc7D P15 > + P2 -10G-> lc7A P16 > + P3 -10G-> lc7B P15 > + P4 -10G-> lc7C P16 > + P5 -10G-> lc6D P15 > + P6 -10G-> lc6B P15 > + P7 -10G-> lc6A P16 > + P8 -10G-> lc5D P15 > + P9 -10G-> lc6C P16 > + P10 -10G-> lc5B P15 > + P11 -10G-> lc5A P16 > + P12 -10G-> lc4D P15 > + P13 -10G-> lc3A P16 > + P14 -10G-> lc3D P15 > + P15 -10G-> lc4C P16 > + P16 -10G-> lc4B P15 > + P17 -10G-> lc4A P16 > + P18 -10G-> lc5C P16 > + P19 -10G-> lc8C 
P16 > + P20 -10G-> lc8B P15 > + P21 -10G-> lc8A P16 > + P22 -10G-> lc8D P15 > + P23 -10G-> lc0C P16 > + P24 -10G-> lc0A P16 > + P25 -10G-> lc0B P15 > + P26 -10G-> lc1C P16 > + P27 -10G-> lc0D P15 > + P28 -10G-> lc1A P16 > + P29 -10G-> lc1B P15 > + P30 -10G-> lc2C P16 > + P31 -10G-> lc3B P15 > + P32 -10G-> lc3C P16 > + P33 -10G-> lc2D P15 > + P34 -10G-> lc2A P16 > + P35 -10G-> lc2B P15 > + P36 -10G-> lc1D P15 > > SUBSYSTEM SPINE fc2A > - P1 -10G-> lc1A P15 > - P2 -10G-> lc1B P16 > - P3 -10G-> lc1C P15 > - P4 -10G-> lc1D P16 > - P5 -10G-> lc9A P15 > - P6 -10G-> lc9C P15 > - P7 -10G-> lc9B P16 > - P8 -10G-> lc8A P15 > - P9 -10G-> lc9D P16 > - P10 -10G-> lc8C P15 > - P11 -10G-> lc8B P16 > - P12 -10G-> lc7A P15 > - P13 -10G-> lc6B P16 > - P14 -10G-> lc6A P15 > - P15 -10G-> lc7D P16 > - P16 -10G-> lc7C P15 > - P17 -10G-> lc7B P16 > - P18 -10G-> lc8D P16 > - P19 -10G-> lc2D P16 > - P20 -10G-> lc2C P15 > - P21 -10G-> lc2B P16 > - P22 -10G-> lc2A P15 > - P23 -10G-> lc3D P16 > - P24 -10G-> lc3B P16 > - P25 -10G-> lc3C P15 > - P26 -10G-> lc4D P16 > - P27 -10G-> lc3A P15 > - P28 -10G-> lc4B P16 > - P29 -10G-> lc4C P15 > - P30 -10G-> lc5D P16 > - P31 -10G-> lc6C P15 > - P32 -10G-> lc6D P16 > - P33 -10G-> lc5A P15 > - P34 -10G-> lc5B P16 > - P35 -10G-> lc5C P15 > - P36 -10G-> lc4A P15 > + P1 -10G-> lc0A P17 > + P2 -10G-> lc0B P18 > + P3 -10G-> lc0C P17 > + P4 -10G-> lc0D P18 > + P5 -10G-> lc8A P17 > + P6 -10G-> lc8C P17 > + P7 -10G-> lc8B P18 > + P8 -10G-> lc7A P17 > + P9 -10G-> lc8D P18 > + P10 -10G-> lc7C P17 > + P11 -10G-> lc7B P18 > + P12 -10G-> lc6A P17 > + P13 -10G-> lc5B P18 > + P14 -10G-> lc5A P17 > + P15 -10G-> lc6D P18 > + P16 -10G-> lc6C P17 > + P17 -10G-> lc6B P18 > + P18 -10G-> lc7D P18 > + P19 -10G-> lc1D P18 > + P20 -10G-> lc1C P17 > + P21 -10G-> lc1B P18 > + P22 -10G-> lc1A P17 > + P23 -10G-> lc2D P18 > + P24 -10G-> lc2B P18 > + P25 -10G-> lc2C P17 > + P26 -10G-> lc3D P18 > + P27 -10G-> lc2A P17 > + P28 -10G-> lc3B P18 > + P29 -10G-> lc3C P17 > + P30 -10G-> lc4D 
P18 > + P31 -10G-> lc5C P17 > + P32 -10G-> lc5D P18 > + P33 -10G-> lc4A P17 > + P34 -10G-> lc4B P18 > + P35 -10G-> lc4C P17 > + P36 -10G-> lc3A P17 > > SUBSYSTEM SPINE fc2B > - P1 -10G-> lc8D P15 > - P2 -10G-> lc8A P16 > - P3 -10G-> lc8B P15 > - P4 -10G-> lc8C P16 > - P5 -10G-> lc7D P15 > - P6 -10G-> lc7B P15 > - P7 -10G-> lc7A P16 > - P8 -10G-> lc6D P15 > - P9 -10G-> lc7C P16 > - P10 -10G-> lc6B P15 > - P11 -10G-> lc6A P16 > - P12 -10G-> lc5D P15 > - P13 -10G-> lc4A P16 > - P14 -10G-> lc4D P15 > - P15 -10G-> lc5C P16 > - P16 -10G-> lc5B P15 > - P17 -10G-> lc5A P16 > - P18 -10G-> lc6C P16 > - P19 -10G-> lc9C P16 > - P20 -10G-> lc9B P15 > - P21 -10G-> lc9A P16 > - P22 -10G-> lc9D P15 > - P23 -10G-> lc1C P16 > - P24 -10G-> lc1A P16 > - P25 -10G-> lc1B P15 > - P26 -10G-> lc2C P16 > - P27 -10G-> lc1D P15 > - P28 -10G-> lc2A P16 > - P29 -10G-> lc2B P15 > - P30 -10G-> lc3C P16 > - P31 -10G-> lc4B P15 > - P32 -10G-> lc4C P16 > - P33 -10G-> lc3D P15 > - P34 -10G-> lc3A P16 > - P35 -10G-> lc3B P15 > - P36 -10G-> lc2D P15 > + P1 -10G-> lc7D P17 > + P2 -10G-> lc7A P18 > + P3 -10G-> lc7B P17 > + P4 -10G-> lc7C P18 > + P5 -10G-> lc6D P17 > + P6 -10G-> lc6B P17 > + P7 -10G-> lc6A P18 > + P8 -10G-> lc5D P17 > + P9 -10G-> lc6C P18 > + P10 -10G-> lc5B P17 > + P11 -10G-> lc5A P18 > + P12 -10G-> lc4D P17 > + P13 -10G-> lc3A P18 > + P14 -10G-> lc3D P17 > + P15 -10G-> lc4C P18 > + P16 -10G-> lc4B P17 > + P17 -10G-> lc4A P18 > + P18 -10G-> lc5C P18 > + P19 -10G-> lc8C P18 > + P20 -10G-> lc8B P17 > + P21 -10G-> lc8A P18 > + P22 -10G-> lc8D P17 > + P23 -10G-> lc0C P18 > + P24 -10G-> lc0A P18 > + P25 -10G-> lc0B P17 > + P26 -10G-> lc1C P18 > + P27 -10G-> lc0D P17 > + P28 -10G-> lc1A P18 > + P29 -10G-> lc1B P17 > + P30 -10G-> lc2C P18 > + P31 -10G-> lc3B P17 > + P32 -10G-> lc3C P18 > + P33 -10G-> lc2D P17 > + P34 -10G-> lc2A P18 > + P35 -10G-> lc2B P17 > + P36 -10G-> lc1D P17 > > SUBSYSTEM SPINE fc3A > - P1 -10G-> lc1A P17 > - P2 -10G-> lc1B P18 > - P3 -10G-> lc1C P17 > - P4 -10G-> lc1D P18 
> - P5 -10G-> lc9A P17 > - P6 -10G-> lc9C P17 > - P7 -10G-> lc9B P18 > - P8 -10G-> lc8A P17 > - P9 -10G-> lc9D P18 > - P10 -10G-> lc8C P17 > - P11 -10G-> lc8B P18 > - P12 -10G-> lc7A P17 > - P13 -10G-> lc6B P18 > - P14 -10G-> lc6A P17 > - P15 -10G-> lc7D P18 > - P16 -10G-> lc7C P17 > - P17 -10G-> lc7B P18 > - P18 -10G-> lc8D P18 > - P19 -10G-> lc2D P18 > - P20 -10G-> lc2C P17 > - P21 -10G-> lc2B P18 > - P22 -10G-> lc2A P17 > - P23 -10G-> lc3D P18 > - P24 -10G-> lc3B P18 > - P25 -10G-> lc3C P17 > - P26 -10G-> lc4D P18 > - P27 -10G-> lc3A P17 > - P28 -10G-> lc4B P18 > - P29 -10G-> lc4C P17 > - P30 -10G-> lc5D P18 > - P31 -10G-> lc6C P17 > - P32 -10G-> lc6D P18 > - P33 -10G-> lc5A P17 > - P34 -10G-> lc5B P18 > - P35 -10G-> lc5C P17 > - P36 -10G-> lc4A P17 > + P1 -10G-> lc0A P12 > + P2 -10G-> lc0B P11 > + P3 -10G-> lc0C P12 > + P4 -10G-> lc0D P11 > + P5 -10G-> lc8A P12 > + P6 -10G-> lc8C P12 > + P7 -10G-> lc8B P11 > + P8 -10G-> lc7A P12 > + P9 -10G-> lc8D P11 > + P10 -10G-> lc7C P12 > + P11 -10G-> lc7B P11 > + P12 -10G-> lc6A P12 > + P13 -10G-> lc5B P11 > + P14 -10G-> lc5A P12 > + P15 -10G-> lc6D P11 > + P16 -10G-> lc6C P12 > + P17 -10G-> lc6B P11 > + P18 -10G-> lc7D P11 > + P19 -10G-> lc1D P11 > + P20 -10G-> lc1C P12 > + P21 -10G-> lc1B P11 > + P22 -10G-> lc1A P12 > + P23 -10G-> lc2D P11 > + P24 -10G-> lc2B P11 > + P25 -10G-> lc2C P12 > + P26 -10G-> lc3D P11 > + P27 -10G-> lc2A P12 > + P28 -10G-> lc3B P11 > + P29 -10G-> lc3C P12 > + P30 -10G-> lc4D P11 > + P31 -10G-> lc5C P12 > + P32 -10G-> lc5D P11 > + P33 -10G-> lc4A P12 > + P34 -10G-> lc4B P11 > + P35 -10G-> lc4C P12 > + P36 -10G-> lc3A P12 > > SUBSYSTEM SPINE fc3B > - P1 -10G-> lc8D P17 > - P2 -10G-> lc8A P18 > - P3 -10G-> lc8B P17 > - P4 -10G-> lc8C P18 > - P5 -10G-> lc7D P17 > - P6 -10G-> lc7B P17 > - P7 -10G-> lc7A P18 > - P8 -10G-> lc6D P17 > - P9 -10G-> lc7C P18 > - P10 -10G-> lc6B P17 > - P11 -10G-> lc6A P18 > - P12 -10G-> lc5D P17 > - P13 -10G-> lc4A P18 > - P14 -10G-> lc4D P17 > - P15 -10G-> lc5C P18 > - 
P16 -10G-> lc5B P17 > - P17 -10G-> lc5A P18 > - P18 -10G-> lc6C P18 > - P19 -10G-> lc9C P18 > - P20 -10G-> lc9B P17 > - P21 -10G-> lc9A P18 > - P22 -10G-> lc9D P17 > - P23 -10G-> lc1C P18 > - P24 -10G-> lc1A P18 > - P25 -10G-> lc1B P17 > - P26 -10G-> lc2C P18 > - P27 -10G-> lc1D P17 > - P28 -10G-> lc2A P18 > - P29 -10G-> lc2B P17 > - P30 -10G-> lc3C P18 > - P31 -10G-> lc4B P17 > - P32 -10G-> lc4C P18 > - P33 -10G-> lc3D P17 > - P34 -10G-> lc3A P18 > - P35 -10G-> lc3B P17 > - P36 -10G-> lc2D P17 > + P1 -10G-> lc7D P12 > + P2 -10G-> lc7A P11 > + P3 -10G-> lc7B P12 > + P4 -10G-> lc7C P11 > + P5 -10G-> lc6D P12 > + P6 -10G-> lc6B P12 > + P7 -10G-> lc6A P11 > + P8 -10G-> lc5D P12 > + P9 -10G-> lc6C P11 > + P10 -10G-> lc5B P12 > + P11 -10G-> lc5A P11 > + P12 -10G-> lc4D P12 > + P13 -10G-> lc3A P11 > + P14 -10G-> lc3D P12 > + P15 -10G-> lc4C P11 > + P16 -10G-> lc4B P12 > + P17 -10G-> lc4A P11 > + P18 -10G-> lc5C P11 > + P19 -10G-> lc8C P11 > + P20 -10G-> lc8B P12 > + P21 -10G-> lc8A P11 > + P22 -10G-> lc8D P12 > + P23 -10G-> lc0C P11 > + P24 -10G-> lc0A P11 > + P25 -10G-> lc0B P12 > + P26 -10G-> lc1C P11 > + P27 -10G-> lc0D P12 > + P28 -10G-> lc1A P11 > + P29 -10G-> lc1B P12 > + P30 -10G-> lc2C P11 > + P31 -10G-> lc3B P12 > + P32 -10G-> lc3C P11 > + P33 -10G-> lc2D P12 > + P34 -10G-> lc2A P11 > + P35 -10G-> lc2B P12 > + P36 -10G-> lc1D P12 > > SUBSYSTEM SPINE fc4A > - P1 -10G-> lc1A P12 > - P2 -10G-> lc1B P11 > - P3 -10G-> lc1C P12 > - P4 -10G-> lc1D P11 > - P5 -10G-> lc9A P12 > - P6 -10G-> lc9C P12 > - P7 -10G-> lc9B P11 > - P8 -10G-> lc8A P12 > - P9 -10G-> lc9D P11 > - P10 -10G-> lc8C P12 > - P11 -10G-> lc8B P11 > - P12 -10G-> lc7A P12 > - P13 -10G-> lc6B P11 > - P14 -10G-> lc6A P12 > - P15 -10G-> lc7D P11 > - P16 -10G-> lc7C P12 > - P17 -10G-> lc7B P11 > - P18 -10G-> lc8D P11 > - P19 -10G-> lc2D P11 > - P20 -10G-> lc2C P12 > - P21 -10G-> lc2B P11 > - P22 -10G-> lc2A P12 > - P23 -10G-> lc3D P11 > - P24 -10G-> lc3B P11 > - P25 -10G-> lc3C P12 > - P26 -10G-> lc4D P11 > - 
P27 -10G-> lc3A P12 > - P28 -10G-> lc4B P11 > - P29 -10G-> lc4C P12 > - P30 -10G-> lc5D P11 > - P31 -10G-> lc6C P12 > - P32 -10G-> lc6D P11 > - P33 -10G-> lc5A P12 > - P34 -10G-> lc5B P11 > - P35 -10G-> lc5C P12 > - P36 -10G-> lc4A P12 > + P1 -10G-> lc0A P10 > + P2 -10G-> lc0B P9 > + P3 -10G-> lc0C P10 > + P4 -10G-> lc0D P9 > + P5 -10G-> lc8A P10 > + P6 -10G-> lc8C P10 > + P7 -10G-> lc8B P9 > + P8 -10G-> lc7A P10 > + P9 -10G-> lc8D P9 > + P10 -10G-> lc7C P10 > + P11 -10G-> lc7B P9 > + P12 -10G-> lc6A P10 > + P13 -10G-> lc5B P9 > + P14 -10G-> lc5A P10 > + P15 -10G-> lc6D P9 > + P16 -10G-> lc6C P10 > + P17 -10G-> lc6B P9 > + P18 -10G-> lc7D P9 > + P19 -10G-> lc1D P9 > + P20 -10G-> lc1C P10 > + P21 -10G-> lc1B P9 > + P22 -10G-> lc1A P10 > + P23 -10G-> lc2D P9 > + P24 -10G-> lc2B P9 > + P25 -10G-> lc2C P10 > + P26 -10G-> lc3D P9 > + P27 -10G-> lc2A P10 > + P28 -10G-> lc3B P9 > + P29 -10G-> lc3C P10 > + P30 -10G-> lc4D P9 > + P31 -10G-> lc5C P10 > + P32 -10G-> lc5D P9 > + P33 -10G-> lc4A P10 > + P34 -10G-> lc4B P9 > + P35 -10G-> lc4C P10 > + P36 -10G-> lc3A P10 > > SUBSYSTEM SPINE fc4B > - P1 -10G-> lc8D P12 > - P2 -10G-> lc8A P11 > - P3 -10G-> lc8B P12 > - P4 -10G-> lc8C P11 > - P5 -10G-> lc7D P12 > - P6 -10G-> lc7B P12 > - P7 -10G-> lc7A P11 > - P8 -10G-> lc6D P12 > - P9 -10G-> lc7C P11 > - P10 -10G-> lc6B P12 > - P11 -10G-> lc6A P11 > - P12 -10G-> lc5D P12 > - P13 -10G-> lc4A P11 > - P14 -10G-> lc4D P12 > - P15 -10G-> lc5C P11 > - P16 -10G-> lc5B P12 > - P17 -10G-> lc5A P11 > - P18 -10G-> lc6C P11 > - P19 -10G-> lc9C P11 > - P20 -10G-> lc9B P12 > - P21 -10G-> lc9A P11 > - P22 -10G-> lc9D P12 > - P23 -10G-> lc1C P11 > - P24 -10G-> lc1A P11 > - P25 -10G-> lc1B P12 > - P26 -10G-> lc2C P11 > - P27 -10G-> lc1D P12 > - P28 -10G-> lc2A P11 > - P29 -10G-> lc2B P12 > - P30 -10G-> lc3C P11 > - P31 -10G-> lc4B P12 > - P32 -10G-> lc4C P11 > - P33 -10G-> lc3D P12 > - P34 -10G-> lc3A P11 > - P35 -10G-> lc3B P12 > - P36 -10G-> lc2D P12 > + P1 -10G-> lc7D P10 > + P2 -10G-> lc7A P9 > 
+ P3 -10G-> lc7B P10 > + P4 -10G-> lc7C P9 > + P5 -10G-> lc6D P10 > + P6 -10G-> lc6B P10 > + P7 -10G-> lc6A P9 > + P8 -10G-> lc5D P10 > + P9 -10G-> lc6C P9 > + P10 -10G-> lc5B P10 > + P11 -10G-> lc5A P9 > + P12 -10G-> lc4D P10 > + P13 -10G-> lc3A P9 > + P14 -10G-> lc3D P10 > + P15 -10G-> lc4C P9 > + P16 -10G-> lc4B P10 > + P17 -10G-> lc4A P9 > + P18 -10G-> lc5C P9 > + P19 -10G-> lc8C P9 > + P20 -10G-> lc8B P10 > + P21 -10G-> lc8A P9 > + P22 -10G-> lc8D P10 > + P23 -10G-> lc0C P9 > + P24 -10G-> lc0A P9 > + P25 -10G-> lc0B P10 > + P26 -10G-> lc1C P9 > + P27 -10G-> lc0D P10 > + P28 -10G-> lc1A P9 > + P29 -10G-> lc1B P10 > + P30 -10G-> lc2C P9 > + P31 -10G-> lc3B P10 > + P32 -10G-> lc3C P9 > + P33 -10G-> lc2D P10 > + P34 -10G-> lc2A P9 > + P35 -10G-> lc2B P10 > + P36 -10G-> lc1D P10 > > SUBSYSTEM SPINE fc5A > - P1 -10G-> lc1A P10 > - P2 -10G-> lc1B P9 > - P3 -10G-> lc1C P10 > - P4 -10G-> lc1D P9 > - P5 -10G-> lc9A P10 > - P6 -10G-> lc9C P10 > - P7 -10G-> lc9B P9 > - P8 -10G-> lc8A P10 > - P9 -10G-> lc9D P9 > - P10 -10G-> lc8C P10 > - P11 -10G-> lc8B P9 > - P12 -10G-> lc7A P10 > - P13 -10G-> lc6B P9 > - P14 -10G-> lc6A P10 > - P15 -10G-> lc7D P9 > - P16 -10G-> lc7C P10 > - P17 -10G-> lc7B P9 > - P18 -10G-> lc8D P9 > - P19 -10G-> lc2D P9 > - P20 -10G-> lc2C P10 > - P21 -10G-> lc2B P9 > - P22 -10G-> lc2A P10 > - P23 -10G-> lc3D P9 > - P24 -10G-> lc3B P9 > - P25 -10G-> lc3C P10 > - P26 -10G-> lc4D P9 > - P27 -10G-> lc3A P10 > - P28 -10G-> lc4B P9 > - P29 -10G-> lc4C P10 > - P30 -10G-> lc5D P9 > - P31 -10G-> lc6C P10 > - P32 -10G-> lc6D P9 > - P33 -10G-> lc5A P10 > - P34 -10G-> lc5B P9 > - P35 -10G-> lc5C P10 > - P36 -10G-> lc4A P10 > + P1 -10G-> lc0A P8 > + P2 -10G-> lc0B P7 > + P3 -10G-> lc0C P8 > + P4 -10G-> lc0D P7 > + P5 -10G-> lc8A P8 > + P6 -10G-> lc8C P8 > + P7 -10G-> lc8B P7 > + P8 -10G-> lc7A P8 > + P9 -10G-> lc8D P7 > + P10 -10G-> lc7C P8 > + P11 -10G-> lc7B P7 > + P12 -10G-> lc6A P8 > + P13 -10G-> lc5B P7 > + P14 -10G-> lc5A P8 > + P15 -10G-> lc6D P7 > + P16 
-10G-> lc6C P8 > + P17 -10G-> lc6B P7 > + P18 -10G-> lc7D P7 > + P19 -10G-> lc1D P7 > + P20 -10G-> lc1C P8 > + P21 -10G-> lc1B P7 > + P22 -10G-> lc1A P8 > + P23 -10G-> lc2D P7 > + P24 -10G-> lc2B P7 > + P25 -10G-> lc2C P8 > + P26 -10G-> lc3D P7 > + P27 -10G-> lc2A P8 > + P28 -10G-> lc3B P7 > + P29 -10G-> lc3C P8 > + P30 -10G-> lc4D P7 > + P31 -10G-> lc5C P8 > + P32 -10G-> lc5D P7 > + P33 -10G-> lc4A P8 > + P34 -10G-> lc4B P7 > + P35 -10G-> lc4C P8 > + P36 -10G-> lc3A P8 > > SUBSYSTEM SPINE fc5B > - P1 -10G-> lc8D P10 > - P2 -10G-> lc8A P9 > - P3 -10G-> lc8B P10 > - P4 -10G-> lc8C P9 > - P5 -10G-> lc7D P10 > - P6 -10G-> lc7B P10 > - P7 -10G-> lc7A P9 > - P8 -10G-> lc6D P10 > - P9 -10G-> lc7C P9 > - P10 -10G-> lc6B P10 > - P11 -10G-> lc6A P9 > - P12 -10G-> lc5D P10 > - P13 -10G-> lc4A P9 > - P14 -10G-> lc4D P10 > - P15 -10G-> lc5C P9 > - P16 -10G-> lc5B P10 > - P17 -10G-> lc5A P9 > - P18 -10G-> lc6C P9 > - P19 -10G-> lc9C P9 > - P20 -10G-> lc9B P10 > - P21 -10G-> lc9A P9 > - P22 -10G-> lc9D P10 > - P23 -10G-> lc1C P9 > - P24 -10G-> lc1A P9 > - P25 -10G-> lc1B P10 > - P26 -10G-> lc2C P9 > - P27 -10G-> lc1D P10 > - P28 -10G-> lc2A P9 > - P29 -10G-> lc2B P10 > - P30 -10G-> lc3C P9 > - P31 -10G-> lc4B P10 > - P32 -10G-> lc4C P9 > - P33 -10G-> lc3D P10 > - P34 -10G-> lc3A P9 > - P35 -10G-> lc3B P10 > - P36 -10G-> lc2D P10 > + P1 -10G-> lc7D P8 > + P2 -10G-> lc7A P7 > + P3 -10G-> lc7B P8 > + P4 -10G-> lc7C P7 > + P5 -10G-> lc6D P8 > + P6 -10G-> lc6B P8 > + P7 -10G-> lc6A P7 > + P8 -10G-> lc5D P8 > + P9 -10G-> lc6C P7 > + P10 -10G-> lc5B P8 > + P11 -10G-> lc5A P7 > + P12 -10G-> lc4D P8 > + P13 -10G-> lc3A P7 > + P14 -10G-> lc3D P8 > + P15 -10G-> lc4C P7 > + P16 -10G-> lc4B P8 > + P17 -10G-> lc4A P7 > + P18 -10G-> lc5C P7 > + P19 -10G-> lc8C P7 > + P20 -10G-> lc8B P8 > + P21 -10G-> lc8A P7 > + P22 -10G-> lc8D P8 > + P23 -10G-> lc0C P7 > + P24 -10G-> lc0A P7 > + P25 -10G-> lc0B P8 > + P26 -10G-> lc1C P7 > + P27 -10G-> lc0D P8 > + P28 -10G-> lc1A P7 > + P29 -10G-> lc1B P8 > + 
P30 -10G-> lc2C P7 > + P31 -10G-> lc3B P8 > + P32 -10G-> lc3C P7 > + P33 -10G-> lc2D P8 > + P34 -10G-> lc2A P7 > + P35 -10G-> lc2B P8 > + P36 -10G-> lc1D P8 > > SUBSYSTEM SPINE fc6A > - P1 -10G-> lc1A P8 > - P2 -10G-> lc1B P7 > - P3 -10G-> lc1C P8 > - P4 -10G-> lc1D P7 > - P5 -10G-> lc9A P8 > - P6 -10G-> lc9C P8 > - P7 -10G-> lc9B P7 > - P8 -10G-> lc8A P8 > - P9 -10G-> lc9D P7 > - P10 -10G-> lc8C P8 > - P11 -10G-> lc8B P7 > - P12 -10G-> lc7A P8 > - P13 -10G-> lc6B P7 > - P14 -10G-> lc6A P8 > - P15 -10G-> lc7D P7 > - P16 -10G-> lc7C P8 > - P17 -10G-> lc7B P7 > - P18 -10G-> lc8D P7 > - P19 -10G-> lc2D P7 > - P20 -10G-> lc2C P8 > - P21 -10G-> lc2B P7 > - P22 -10G-> lc2A P8 > - P23 -10G-> lc3D P7 > - P24 -10G-> lc3B P7 > - P25 -10G-> lc3C P8 > - P26 -10G-> lc4D P7 > - P27 -10G-> lc3A P8 > - P28 -10G-> lc4B P7 > - P29 -10G-> lc4C P8 > - P30 -10G-> lc5D P7 > - P31 -10G-> lc6C P8 > - P32 -10G-> lc6D P7 > - P33 -10G-> lc5A P8 > - P34 -10G-> lc5B P7 > - P35 -10G-> lc5C P8 > - P36 -10G-> lc4A P8 > + P1 -10G-> lc0A P6 > + P2 -10G-> lc0B P5 > + P3 -10G-> lc0C P6 > + P4 -10G-> lc0D P5 > + P5 -10G-> lc8A P6 > + P6 -10G-> lc8C P6 > + P7 -10G-> lc8B P5 > + P8 -10G-> lc7A P6 > + P9 -10G-> lc8D P5 > + P10 -10G-> lc7C P6 > + P11 -10G-> lc7B P5 > + P12 -10G-> lc6A P6 > + P13 -10G-> lc5B P5 > + P14 -10G-> lc5A P6 > + P15 -10G-> lc6D P5 > + P16 -10G-> lc6C P6 > + P17 -10G-> lc6B P5 > + P18 -10G-> lc7D P5 > + P19 -10G-> lc1D P5 > + P20 -10G-> lc1C P6 > + P21 -10G-> lc1B P5 > + P22 -10G-> lc1A P6 > + P23 -10G-> lc2D P5 > + P24 -10G-> lc2B P5 > + P25 -10G-> lc2C P6 > + P26 -10G-> lc3D P5 > + P27 -10G-> lc2A P6 > + P28 -10G-> lc3B P5 > + P29 -10G-> lc3C P6 > + P30 -10G-> lc4D P5 > + P31 -10G-> lc5C P6 > + P32 -10G-> lc5D P5 > + P33 -10G-> lc4A P6 > + P34 -10G-> lc4B P5 > + P35 -10G-> lc4C P6 > + P36 -10G-> lc3A P6 > > SUBSYSTEM SPINE fc6B > - P1 -10G-> lc8D P8 > - P2 -10G-> lc8A P7 > - P3 -10G-> lc8B P8 > - P4 -10G-> lc8C P7 > - P5 -10G-> lc7D P8 > - P6 -10G-> lc7B P8 > - P7 -10G-> lc7A P7 
> - P8 -10G-> lc6D P8 > - P9 -10G-> lc7C P7 > - P10 -10G-> lc6B P8 > - P11 -10G-> lc6A P7 > - P12 -10G-> lc5D P8 > - P13 -10G-> lc4A P7 > - P14 -10G-> lc4D P8 > - P15 -10G-> lc5C P7 > - P16 -10G-> lc5B P8 > - P17 -10G-> lc5A P7 > - P18 -10G-> lc6C P7 > - P19 -10G-> lc9C P7 > - P20 -10G-> lc9B P8 > - P21 -10G-> lc9A P7 > - P22 -10G-> lc9D P8 > - P23 -10G-> lc1C P7 > - P24 -10G-> lc1A P7 > - P25 -10G-> lc1B P8 > - P26 -10G-> lc2C P7 > - P27 -10G-> lc1D P8 > - P28 -10G-> lc2A P7 > - P29 -10G-> lc2B P8 > - P30 -10G-> lc3C P7 > - P31 -10G-> lc4B P8 > - P32 -10G-> lc4C P7 > - P33 -10G-> lc3D P8 > - P34 -10G-> lc3A P7 > - P35 -10G-> lc3B P8 > - P36 -10G-> lc2D P8 > + P1 -10G-> lc7D P6 > + P2 -10G-> lc7A P5 > + P3 -10G-> lc7B P6 > + P4 -10G-> lc7C P5 > + P5 -10G-> lc6D P6 > + P6 -10G-> lc6B P6 > + P7 -10G-> lc6A P5 > + P8 -10G-> lc5D P6 > + P9 -10G-> lc6C P5 > + P10 -10G-> lc5B P6 > + P11 -10G-> lc5A P5 > + P12 -10G-> lc4D P6 > + P13 -10G-> lc3A P5 > + P14 -10G-> lc3D P6 > + P15 -10G-> lc4C P5 > + P16 -10G-> lc4B P6 > + P17 -10G-> lc4A P5 > + P18 -10G-> lc5C P5 > + P19 -10G-> lc8C P5 > + P20 -10G-> lc8B P6 > + P21 -10G-> lc8A P5 > + P22 -10G-> lc8D P6 > + P23 -10G-> lc0C P5 > + P24 -10G-> lc0A P5 > + P25 -10G-> lc0B P6 > + P26 -10G-> lc1C P5 > + P27 -10G-> lc0D P6 > + P28 -10G-> lc1A P5 > + P29 -10G-> lc1B P6 > + P30 -10G-> lc2C P5 > + P31 -10G-> lc3B P6 > + P32 -10G-> lc3C P5 > + P33 -10G-> lc2D P6 > + P34 -10G-> lc2A P5 > + P35 -10G-> lc2B P6 > + P36 -10G-> lc1D P6 > > SUBSYSTEM SPINE fc7A > - P1 -10G-> lc1A P6 > - P2 -10G-> lc1B P5 > - P3 -10G-> lc1C P6 > - P4 -10G-> lc1D P5 > - P5 -10G-> lc9A P6 > - P6 -10G-> lc9C P6 > - P7 -10G-> lc9B P5 > - P8 -10G-> lc8A P6 > - P9 -10G-> lc9D P5 > - P10 -10G-> lc8C P6 > - P11 -10G-> lc8B P5 > - P12 -10G-> lc7A P6 > - P13 -10G-> lc6B P5 > - P14 -10G-> lc6A P6 > - P15 -10G-> lc7D P5 > - P16 -10G-> lc7C P6 > - P17 -10G-> lc7B P5 > - P18 -10G-> lc8D P5 > - P19 -10G-> lc2D P5 > - P20 -10G-> lc2C P6 > - P21 -10G-> lc2B P5 > - P22 -10G-> 
lc2A P6 > - P23 -10G-> lc3D P5 > - P24 -10G-> lc3B P5 > - P25 -10G-> lc3C P6 > - P26 -10G-> lc4D P5 > - P27 -10G-> lc3A P6 > - P28 -10G-> lc4B P5 > - P29 -10G-> lc4C P6 > - P30 -10G-> lc5D P5 > - P31 -10G-> lc6C P6 > - P32 -10G-> lc6D P5 > - P33 -10G-> lc5A P6 > - P34 -10G-> lc5B P5 > - P35 -10G-> lc5C P6 > - P36 -10G-> lc4A P6 > + P1 -10G-> lc0A P4 > + P2 -10G-> lc0B P3 > + P3 -10G-> lc0C P4 > + P4 -10G-> lc0D P3 > + P5 -10G-> lc8A P4 > + P6 -10G-> lc8C P4 > + P7 -10G-> lc8B P3 > + P8 -10G-> lc7A P4 > + P9 -10G-> lc8D P3 > + P10 -10G-> lc7C P4 > + P11 -10G-> lc7B P3 > + P12 -10G-> lc6A P4 > + P13 -10G-> lc5B P3 > + P14 -10G-> lc5A P4 > + P15 -10G-> lc6D P3 > + P16 -10G-> lc6C P4 > + P17 -10G-> lc6B P3 > + P18 -10G-> lc7D P3 > + P19 -10G-> lc1D P3 > + P20 -10G-> lc1C P4 > + P21 -10G-> lc1B P3 > + P22 -10G-> lc1A P4 > + P23 -10G-> lc2D P3 > + P24 -10G-> lc2B P3 > + P25 -10G-> lc2C P4 > + P26 -10G-> lc3D P3 > + P27 -10G-> lc2A P4 > + P28 -10G-> lc3B P3 > + P29 -10G-> lc3C P4 > + P30 -10G-> lc4D P3 > + P31 -10G-> lc5C P4 > + P32 -10G-> lc5D P3 > + P33 -10G-> lc4A P4 > + P34 -10G-> lc4B P3 > + P35 -10G-> lc4C P4 > + P36 -10G-> lc3A P4 > > SUBSYSTEM SPINE fc7B > - P1 -10G-> lc8D P6 > - P2 -10G-> lc8A P5 > - P3 -10G-> lc8B P6 > - P4 -10G-> lc8C P5 > - P5 -10G-> lc7D P6 > - P6 -10G-> lc7B P6 > - P7 -10G-> lc7A P5 > - P8 -10G-> lc6D P6 > - P9 -10G-> lc7C P5 > - P10 -10G-> lc6B P6 > - P11 -10G-> lc6A P5 > - P12 -10G-> lc5D P6 > - P13 -10G-> lc4A P5 > - P14 -10G-> lc4D P6 > - P15 -10G-> lc5C P5 > - P16 -10G-> lc5B P6 > - P17 -10G-> lc5A P5 > - P18 -10G-> lc6C P5 > - P19 -10G-> lc9C P5 > - P20 -10G-> lc9B P6 > - P21 -10G-> lc9A P5 > - P22 -10G-> lc9D P6 > - P23 -10G-> lc1C P5 > - P24 -10G-> lc1A P5 > - P25 -10G-> lc1B P6 > - P26 -10G-> lc2C P5 > - P27 -10G-> lc1D P6 > - P28 -10G-> lc2A P5 > - P29 -10G-> lc2B P6 > - P30 -10G-> lc3C P5 > - P31 -10G-> lc4B P6 > - P32 -10G-> lc4C P5 > - P33 -10G-> lc3D P6 > - P34 -10G-> lc3A P5 > - P35 -10G-> lc3B P6 > - P36 -10G-> lc2D P6 > + P1 
-10G-> lc7D P4 > + P2 -10G-> lc7A P3 > + P3 -10G-> lc7B P4 > + P4 -10G-> lc7C P3 > + P5 -10G-> lc6D P4 > + P6 -10G-> lc6B P4 > + P7 -10G-> lc6A P3 > + P8 -10G-> lc5D P4 > + P9 -10G-> lc6C P3 > + P10 -10G-> lc5B P4 > + P11 -10G-> lc5A P3 > + P12 -10G-> lc4D P4 > + P13 -10G-> lc3A P3 > + P14 -10G-> lc3D P4 > + P15 -10G-> lc4C P3 > + P16 -10G-> lc4B P4 > + P17 -10G-> lc4A P3 > + P18 -10G-> lc5C P3 > + P19 -10G-> lc8C P3 > + P20 -10G-> lc8B P4 > + P21 -10G-> lc8A P3 > + P22 -10G-> lc8D P4 > + P23 -10G-> lc0C P3 > + P24 -10G-> lc0A P3 > + P25 -10G-> lc0B P4 > + P26 -10G-> lc1C P3 > + P27 -10G-> lc0D P4 > + P28 -10G-> lc1A P3 > + P29 -10G-> lc1B P4 > + P30 -10G-> lc2C P3 > + P31 -10G-> lc3B P4 > + P32 -10G-> lc3C P3 > + P33 -10G-> lc2D P4 > + P34 -10G-> lc2A P3 > + P35 -10G-> lc2B P4 > + P36 -10G-> lc1D P4 > > SUBSYSTEM SPINE fc8A > - P1 -10G-> lc1A P4 > - P2 -10G-> lc1B P3 > - P3 -10G-> lc1C P4 > - P4 -10G-> lc1D P3 > - P5 -10G-> lc9A P4 > - P6 -10G-> lc9C P4 > - P7 -10G-> lc9B P3 > - P8 -10G-> lc8A P4 > - P9 -10G-> lc9D P3 > - P10 -10G-> lc8C P4 > - P11 -10G-> lc8B P3 > - P12 -10G-> lc7A P4 > - P13 -10G-> lc6B P3 > - P14 -10G-> lc6A P4 > - P15 -10G-> lc7D P3 > - P16 -10G-> lc7C P4 > - P17 -10G-> lc7B P3 > - P18 -10G-> lc8D P3 > - P19 -10G-> lc2D P3 > - P20 -10G-> lc2C P4 > - P21 -10G-> lc2B P3 > - P22 -10G-> lc2A P4 > - P23 -10G-> lc3D P3 > - P24 -10G-> lc3B P3 > - P25 -10G-> lc3C P4 > - P26 -10G-> lc4D P3 > - P27 -10G-> lc3A P4 > - P28 -10G-> lc4B P3 > - P29 -10G-> lc4C P4 > - P30 -10G-> lc5D P3 > - P31 -10G-> lc6C P4 > - P32 -10G-> lc6D P3 > - P33 -10G-> lc5A P4 > - P34 -10G-> lc5B P3 > - P35 -10G-> lc5C P4 > - P36 -10G-> lc4A P4 > + P1 -10G-> lc0A P2 > + P2 -10G-> lc0B P1 > + P3 -10G-> lc0C P2 > + P4 -10G-> lc0D P1 > + P5 -10G-> lc8A P2 > + P6 -10G-> lc8C P2 > + P7 -10G-> lc8B P1 > + P8 -10G-> lc7A P2 > + P9 -10G-> lc8D P1 > + P10 -10G-> lc7C P2 > + P11 -10G-> lc7B P1 > + P12 -10G-> lc6A P2 > + P13 -10G-> lc5B P1 > + P14 -10G-> lc5A P2 > + P15 -10G-> lc6D P1 > + P16 
-10G-> lc6C P2 > + P17 -10G-> lc6B P1 > + P18 -10G-> lc7D P1 > + P19 -10G-> lc1D P1 > + P20 -10G-> lc1C P2 > + P21 -10G-> lc1B P1 > + P22 -10G-> lc1A P2 > + P23 -10G-> lc2D P1 > + P24 -10G-> lc2B P1 > + P25 -10G-> lc2C P2 > + P26 -10G-> lc3D P1 > + P27 -10G-> lc2A P2 > + P28 -10G-> lc3B P1 > + P29 -10G-> lc3C P2 > + P30 -10G-> lc4D P1 > + P31 -10G-> lc5C P2 > + P32 -10G-> lc5D P1 > + P33 -10G-> lc4A P2 > + P34 -10G-> lc4B P1 > + P35 -10G-> lc4C P2 > + P36 -10G-> lc3A P2 > > SUBSYSTEM SPINE fc8B > - P1 -10G-> lc8D P4 > - P2 -10G-> lc8A P3 > - P3 -10G-> lc8B P4 > - P4 -10G-> lc8C P3 > - P5 -10G-> lc7D P4 > - P6 -10G-> lc7B P4 > - P7 -10G-> lc7A P3 > - P8 -10G-> lc6D P4 > - P9 -10G-> lc7C P3 > - P10 -10G-> lc6B P4 > - P11 -10G-> lc6A P3 > - P12 -10G-> lc5D P4 > - P13 -10G-> lc4A P3 > - P14 -10G-> lc4D P4 > - P15 -10G-> lc5C P3 > - P16 -10G-> lc5B P4 > - P17 -10G-> lc5A P3 > - P18 -10G-> lc6C P3 > - P19 -10G-> lc9C P3 > - P20 -10G-> lc9B P4 > - P21 -10G-> lc9A P3 > - P22 -10G-> lc9D P4 > - P23 -10G-> lc1C P3 > - P24 -10G-> lc1A P3 > - P25 -10G-> lc1B P4 > - P26 -10G-> lc2C P3 > - P27 -10G-> lc1D P4 > - P28 -10G-> lc2A P3 > - P29 -10G-> lc2B P4 > - P30 -10G-> lc3C P3 > - P31 -10G-> lc4B P4 > - P32 -10G-> lc4C P3 > - P33 -10G-> lc3D P4 > - P34 -10G-> lc3A P3 > - P35 -10G-> lc3B P4 > - P36 -10G-> lc2D P4 > + P1 -10G-> lc7D P2 > + P2 -10G-> lc7A P1 > + P3 -10G-> lc7B P2 > + P4 -10G-> lc7C P1 > + P5 -10G-> lc6D P2 > + P6 -10G-> lc6B P2 > + P7 -10G-> lc6A P1 > + P8 -10G-> lc5D P2 > + P9 -10G-> lc6C P1 > + P10 -10G-> lc5B P2 > + P11 -10G-> lc5A P1 > + P12 -10G-> lc4D P2 > + P13 -10G-> lc3A P1 > + P14 -10G-> lc3D P2 > + P15 -10G-> lc4C P1 > + P16 -10G-> lc4B P2 > + P17 -10G-> lc4A P1 > + P18 -10G-> lc5C P1 > + P19 -10G-> lc8C P1 > + P20 -10G-> lc8B P2 > + P21 -10G-> lc8A P1 > + P22 -10G-> lc8D P2 > + P23 -10G-> lc0C P1 > + P24 -10G-> lc0A P1 > + P25 -10G-> lc0B P2 > + P26 -10G-> lc1C P1 > + P27 -10G-> lc0D P2 > + P28 -10G-> lc1A P1 > + P29 -10G-> lc1B P2 > + P30 -10G-> lc2C P1 
> + P31 -10G-> lc3B P2 > + P32 -10G-> lc3C P1 > + P33 -10G-> lc2D P2 > + P34 -10G-> lc2A P1 > + P35 -10G-> lc2B P2 > + P36 -10G-> lc1D P2 > + > +SUBSYSTEM LEAF lc0A > + P1 -10G-> fc8B P24 > + P2 -10G-> fc8A P1 > + P3 -10G-> fc7B P24 > + P4 -10G-> fc7A P1 > + P5 -10G-> fc6B P24 > + P6 -10G-> fc6A P1 > + P7 -10G-> fc5B P24 > + P8 -10G-> fc5A P1 > + P9 -10G-> fc4B P24 > + P10 -10G-> fc4A P1 > + P11 -10G-> fc3B P24 > + P12 -10G-> fc3A P1 > + P13 -10G-> fc0A P1 > + P14 -10G-> fc0B P24 > + P15 -10G-> fc1A P1 > + P16 -10G-> fc1B P24 > + P17 -10G-> fc2A P1 > + P18 -10G-> fc2B P24 > + P19 -10G-> lc0-0B/P3 > + P20 -10G-> lc0-0A/P3 > + P21 -10G-> lc0-0A/P2 > + P22 -10G-> lc0-0A/P1 > + P23 -10G-> lc0-0B/P2 > + P24 -10G-> lc0-0B/P1 > + P25 -10G-> lc0-1B/P3 > + P26 -10G-> lc0-1A/P3 > + P27 -10G-> lc0-1A/P2 > + P28 -10G-> lc0-1A/P1 > + P29 -10G-> lc0-1B/P2 > + P30 -10G-> lc0-1B/P1 > + P31 -10G-> lc0-2B/P1 > + P32 -10G-> lc0-2B/P2 > + P33 -10G-> lc0-2A/P1 > + P34 -10G-> lc0-2A/P2 > + P35 -10G-> lc0-2A/P3 > + P36 -10G-> lc0-2B/P3 > + > +SUBSYSTEM LEAF lc0B > + P1 -10G-> fc8A P2 > + P2 -10G-> fc8B P25 > + P3 -10G-> fc7A P2 > + P4 -10G-> fc7B P25 > + P5 -10G-> fc6A P2 > + P6 -10G-> fc6B P25 > + P7 -10G-> fc5A P2 > + P8 -10G-> fc5B P25 > + P9 -10G-> fc4A P2 > + P10 -10G-> fc4B P25 > + P11 -10G-> fc3A P2 > + P12 -10G-> fc3B P25 > + P13 -10G-> fc0B P25 > + P14 -10G-> fc0A P2 > + P15 -10G-> fc1B P25 > + P16 -10G-> fc1A P2 > + P17 -10G-> fc2B P25 > + P18 -10G-> fc2A P2 > + P19 -10G-> lc0-3B/P3 > + P20 -10G-> lc0-3A/P3 > + P21 -10G-> lc0-3A/P2 > + P22 -10G-> lc0-3A/P1 > + P23 -10G-> lc0-3B/P2 > + P24 -10G-> lc0-3B/P1 > + P25 -10G-> lc0-4B/P3 > + P26 -10G-> lc0-4A/P3 > + P27 -10G-> lc0-4A/P2 > + P28 -10G-> lc0-4A/P1 > + P29 -10G-> lc0-4B/P2 > + P30 -10G-> lc0-4B/P1 > + P31 -10G-> lc0-5B/P1 > + P32 -10G-> lc0-5B/P2 > + P33 -10G-> lc0-5A/P1 > + P34 -10G-> lc0-5A/P2 > + P35 -10G-> lc0-5A/P3 > + P36 -10G-> lc0-5B/P3 > > -SUBSYSTEM SPINE fc9A > - P1 -10G-> lc1A P2 > - P2 -10G-> lc1B P1 > - P3 
-10G-> lc1C P2 > - P4 -10G-> lc1D P1 > - P5 -10G-> lc9A P2 > - P6 -10G-> lc9C P2 > - P7 -10G-> lc9B P1 > - P8 -10G-> lc8A P2 > - P9 -10G-> lc9D P1 > - P10 -10G-> lc8C P2 > - P11 -10G-> lc8B P1 > - P12 -10G-> lc7A P2 > - P13 -10G-> lc6B P1 > - P14 -10G-> lc6A P2 > - P15 -10G-> lc7D P1 > - P16 -10G-> lc7C P2 > - P17 -10G-> lc7B P1 > - P18 -10G-> lc8D P1 > - P19 -10G-> lc2D P1 > - P20 -10G-> lc2C P2 > - P21 -10G-> lc2B P1 > - P22 -10G-> lc2A P2 > - P23 -10G-> lc3D P1 > - P24 -10G-> lc3B P1 > - P25 -10G-> lc3C P2 > - P26 -10G-> lc4D P1 > - P27 -10G-> lc3A P2 > - P28 -10G-> lc4B P1 > - P29 -10G-> lc4C P2 > - P30 -10G-> lc5D P1 > - P31 -10G-> lc6C P2 > - P32 -10G-> lc6D P1 > - P33 -10G-> lc5A P2 > - P34 -10G-> lc5B P1 > - P35 -10G-> lc5C P2 > - P36 -10G-> lc4A P2 > +SUBSYSTEM LEAF lc0C > + P1 -10G-> fc8B P23 > + P2 -10G-> fc8A P3 > + P3 -10G-> fc7B P23 > + P4 -10G-> fc7A P3 > + P5 -10G-> fc6B P23 > + P6 -10G-> fc6A P3 > + P7 -10G-> fc5B P23 > + P8 -10G-> fc5A P3 > + P9 -10G-> fc4B P23 > + P10 -10G-> fc4A P3 > + P11 -10G-> fc3B P23 > + P12 -10G-> fc3A P3 > + P13 -10G-> fc0A P3 > + P14 -10G-> fc0B P23 > + P15 -10G-> fc1A P3 > + P16 -10G-> fc1B P23 > + P17 -10G-> fc2A P3 > + P18 -10G-> fc2B P23 > + P19 -10G-> lc0-6B/P3 > + P20 -10G-> lc0-6A/P3 > + P21 -10G-> lc0-6A/P2 > + P22 -10G-> lc0-6A/P1 > + P23 -10G-> lc0-6B/P2 > + P24 -10G-> lc0-6B/P1 > + P25 -10G-> lc0-7B/P3 > + P26 -10G-> lc0-7A/P3 > + P27 -10G-> lc0-7A/P2 > + P28 -10G-> lc0-7A/P1 > + P29 -10G-> lc0-7B/P2 > + P30 -10G-> lc0-7B/P1 > + P31 -10G-> lc0-8B/P1 > + P32 -10G-> lc0-8B/P2 > + P33 -10G-> lc0-8A/P1 > + P34 -10G-> lc0-8A/P2 > + P35 -10G-> lc0-8A/P3 > + P36 -10G-> lc0-8B/P3 > > -SUBSYSTEM SPINE fc9B > - P1 -10G-> lc8D P2 > - P2 -10G-> lc8A P1 > - P3 -10G-> lc8B P2 > - P4 -10G-> lc8C P1 > - P5 -10G-> lc7D P2 > - P6 -10G-> lc7B P2 > - P7 -10G-> lc7A P1 > - P8 -10G-> lc6D P2 > - P9 -10G-> lc7C P1 > - P10 -10G-> lc6B P2 > - P11 -10G-> lc6A P1 > - P12 -10G-> lc5D P2 > - P13 -10G-> lc4A P1 > - P14 -10G-> lc4D P2 > - 
P15 -10G-> lc5C P1 > - P16 -10G-> lc5B P2 > - P17 -10G-> lc5A P1 > - P18 -10G-> lc6C P1 > - P19 -10G-> lc9C P1 > - P20 -10G-> lc9B P2 > - P21 -10G-> lc9A P1 > - P22 -10G-> lc9D P2 > - P23 -10G-> lc1C P1 > - P24 -10G-> lc1A P1 > - P25 -10G-> lc1B P2 > - P26 -10G-> lc2C P1 > - P27 -10G-> lc1D P2 > - P28 -10G-> lc2A P1 > - P29 -10G-> lc2B P2 > - P30 -10G-> lc3C P1 > - P31 -10G-> lc4B P2 > - P32 -10G-> lc4C P1 > - P33 -10G-> lc3D P2 > - P34 -10G-> lc3A P1 > - P35 -10G-> lc3B P2 > - P36 -10G-> lc2D P2 > +SUBSYSTEM LEAF lc0D > + P1 -10G-> fc8A P4 > + P2 -10G-> fc8B P27 > + P3 -10G-> fc7A P4 > + P4 -10G-> fc7B P27 > + P5 -10G-> fc6A P4 > + P6 -10G-> fc6B P27 > + P7 -10G-> fc5A P4 > + P8 -10G-> fc5B P27 > + P9 -10G-> fc4A P4 > + P10 -10G-> fc4B P27 > + P11 -10G-> fc3A P4 > + P12 -10G-> fc3B P27 > + P13 -10G-> fc0B P27 > + P14 -10G-> fc0A P4 > + P15 -10G-> fc1B P27 > + P16 -10G-> fc1A P4 > + P17 -10G-> fc2B P27 > + P18 -10G-> fc2A P4 > + P19 -10G-> lc0-9B/P3 > + P20 -10G-> lc0-9A/P3 > + P21 -10G-> lc0-9A/P2 > + P22 -10G-> lc0-9A/P1 > + P23 -10G-> lc0-9B/P2 > + P24 -10G-> lc0-9B/P1 > + P25 -10G-> lc0-10B/P3 > + P26 -10G-> lc0-10A/P3 > + P27 -10G-> lc0-10A/P2 > + P28 -10G-> lc0-10A/P1 > + P29 -10G-> lc0-10B/P2 > + P30 -10G-> lc0-10B/P1 > + P31 -10G-> lc0-11B/P1 > + P32 -10G-> lc0-11B/P2 > + P33 -10G-> lc0-11A/P1 > + P34 -10G-> lc0-11A/P2 > + P35 -10G-> lc0-11A/P3 > + P36 -10G-> lc0-11B/P3 > > SUBSYSTEM LEAF lc1A > - P1 -10G-> fc9B P24 > - P2 -10G-> fc9A P1 > - P3 -10G-> fc8B P24 > - P4 -10G-> fc8A P1 > - P5 -10G-> fc7B P24 > - P6 -10G-> fc7A P1 > - P7 -10G-> fc6B P24 > - P8 -10G-> fc6A P1 > - P9 -10G-> fc5B P24 > - P10 -10G-> fc5A P1 > - P11 -10G-> fc4B P24 > - P12 -10G-> fc4A P1 > - P13 -10G-> fc1A P1 > - P14 -10G-> fc1B P24 > - P15 -10G-> fc2A P1 > - P16 -10G-> fc2B P24 > - P17 -10G-> fc3A P1 > - P18 -10G-> fc3B P24 > - P19 -10G-> lc1-0A/P3 > - P20 -10G-> lc1-0B/P3 > - P21 -10G-> lc1-0B/P2 > - P22 -10G-> lc1-0B/P1 > - P23 -10G-> lc1-0A/P2 > - P24 -10G-> lc1-0A/P1 > - P25 
-10G-> lc1-1A/P3 > - P26 -10G-> lc1-1B/P3 > - P27 -10G-> lc1-1B/P2 > - P28 -10G-> lc1-1B/P1 > - P29 -10G-> lc1-1A/P2 > - P30 -10G-> lc1-1A/P1 > - P31 -10G-> lc1-2A/P1 > - P32 -10G-> lc1-2A/P2 > - P33 -10G-> lc1-2B/P1 > - P34 -10G-> lc1-2B/P2 > - P35 -10G-> lc1-2B/P3 > - P36 -10G-> lc1-2A/P3 > + P1 -10G-> fc8B P28 > + P2 -10G-> fc8A P22 > + P3 -10G-> fc7B P28 > + P4 -10G-> fc7A P22 > + P5 -10G-> fc6B P28 > + P6 -10G-> fc6A P22 > + P7 -10G-> fc5B P28 > + P8 -10G-> fc5A P22 > + P9 -10G-> fc4B P28 > + P10 -10G-> fc4A P22 > + P11 -10G-> fc3B P28 > + P12 -10G-> fc3A P22 > + P13 -10G-> fc0A P22 > + P14 -10G-> fc0B P28 > + P15 -10G-> fc1A P22 > + P16 -10G-> fc1B P28 > + P17 -10G-> fc2A P22 > + P18 -10G-> fc2B P28 > + P19 -10G-> lc1-0B/P3 > + P20 -10G-> lc1-0A/P3 > + P21 -10G-> lc1-0A/P2 > + P22 -10G-> lc1-0A/P1 > + P23 -10G-> lc1-0B/P2 > + P24 -10G-> lc1-0B/P1 > + P25 -10G-> lc1-1B/P3 > + P26 -10G-> lc1-1A/P3 > + P27 -10G-> lc1-1A/P2 > + P28 -10G-> lc1-1A/P1 > + P29 -10G-> lc1-1B/P2 > + P30 -10G-> lc1-1B/P1 > + P31 -10G-> lc1-2B/P1 > + P32 -10G-> lc1-2B/P2 > + P33 -10G-> lc1-2A/P1 > + P34 -10G-> lc1-2A/P2 > + P35 -10G-> lc1-2A/P3 > + P36 -10G-> lc1-2B/P3 > > SUBSYSTEM LEAF lc1B > - P1 -10G-> fc9A P2 > - P2 -10G-> fc9B P25 > - P3 -10G-> fc8A P2 > - P4 -10G-> fc8B P25 > - P5 -10G-> fc7A P2 > - P6 -10G-> fc7B P25 > - P7 -10G-> fc6A P2 > - P8 -10G-> fc6B P25 > - P9 -10G-> fc5A P2 > - P10 -10G-> fc5B P25 > - P11 -10G-> fc4A P2 > - P12 -10G-> fc4B P25 > - P13 -10G-> fc1B P25 > - P14 -10G-> fc1A P2 > - P15 -10G-> fc2B P25 > - P16 -10G-> fc2A P2 > - P17 -10G-> fc3B P25 > - P18 -10G-> fc3A P2 > - P19 -10G-> lc1-3A/P3 > - P20 -10G-> lc1-3B/P3 > - P21 -10G-> lc1-3B/P2 > - P22 -10G-> lc1-3B/P1 > - P23 -10G-> lc1-3A/P2 > - P24 -10G-> lc1-3A/P1 > - P25 -10G-> lc1-4A/P3 > - P26 -10G-> lc1-4B/P3 > - P27 -10G-> lc1-4B/P2 > - P28 -10G-> lc1-4B/P1 > - P29 -10G-> lc1-4A/P2 > - P30 -10G-> lc1-4A/P1 > - P31 -10G-> lc1-5A/P1 > - P32 -10G-> lc1-5A/P2 > - P33 -10G-> lc1-5B/P1 > - P34 -10G-> 
lc1-5B/P2 > - P35 -10G-> lc1-5B/P3 > - P36 -10G-> lc1-5A/P3 > + P1 -10G-> fc8A P21 > + P2 -10G-> fc8B P29 > + P3 -10G-> fc7A P21 > + P4 -10G-> fc7B P29 > + P5 -10G-> fc6A P21 > + P6 -10G-> fc6B P29 > + P7 -10G-> fc5A P21 > + P8 -10G-> fc5B P29 > + P9 -10G-> fc4A P21 > + P10 -10G-> fc4B P29 > + P11 -10G-> fc3A P21 > + P12 -10G-> fc3B P29 > + P13 -10G-> fc0B P29 > + P14 -10G-> fc0A P21 > + P15 -10G-> fc1B P29 > + P16 -10G-> fc1A P21 > + P17 -10G-> fc2B P29 > + P18 -10G-> fc2A P21 > + P19 -10G-> lc1-3B/P3 > + P20 -10G-> lc1-3A/P3 > + P21 -10G-> lc1-3A/P2 > + P22 -10G-> lc1-3A/P1 > + P23 -10G-> lc1-3B/P2 > + P24 -10G-> lc1-3B/P1 > + P25 -10G-> lc1-4B/P3 > + P26 -10G-> lc1-4A/P3 > + P27 -10G-> lc1-4A/P2 > + P28 -10G-> lc1-4A/P1 > + P29 -10G-> lc1-4B/P2 > + P30 -10G-> lc1-4B/P1 > + P31 -10G-> lc1-5B/P1 > + P32 -10G-> lc1-5B/P2 > + P33 -10G-> lc1-5A/P1 > + P34 -10G-> lc1-5A/P2 > + P35 -10G-> lc1-5A/P3 > + P36 -10G-> lc1-5B/P3 > > SUBSYSTEM LEAF lc1C > - P1 -10G-> fc9B P23 > - P2 -10G-> fc9A P3 > - P3 -10G-> fc8B P23 > - P4 -10G-> fc8A P3 > - P5 -10G-> fc7B P23 > - P6 -10G-> fc7A P3 > - P7 -10G-> fc6B P23 > - P8 -10G-> fc6A P3 > - P9 -10G-> fc5B P23 > - P10 -10G-> fc5A P3 > - P11 -10G-> fc4B P23 > - P12 -10G-> fc4A P3 > - P13 -10G-> fc1A P3 > - P14 -10G-> fc1B P23 > - P15 -10G-> fc2A P3 > - P16 -10G-> fc2B P23 > - P17 -10G-> fc3A P3 > - P18 -10G-> fc3B P23 > - P19 -10G-> lc1-6A/P3 > - P20 -10G-> lc1-6B/P3 > - P21 -10G-> lc1-6B/P2 > - P22 -10G-> lc1-6B/P1 > - P23 -10G-> lc1-6A/P2 > - P24 -10G-> lc1-6A/P1 > - P25 -10G-> lc1-7A/P3 > - P26 -10G-> lc1-7B/P3 > - P27 -10G-> lc1-7B/P2 > - P28 -10G-> lc1-7B/P1 > - P29 -10G-> lc1-7A/P2 > - P30 -10G-> lc1-7A/P1 > - P31 -10G-> lc1-8A/P1 > - P32 -10G-> lc1-8A/P2 > - P33 -10G-> lc1-8B/P1 > - P34 -10G-> lc1-8B/P2 > - P35 -10G-> lc1-8B/P3 > - P36 -10G-> lc1-8A/P3 > + P1 -10G-> fc8B P26 > + P2 -10G-> fc8A P20 > + P3 -10G-> fc7B P26 > + P4 -10G-> fc7A P20 > + P5 -10G-> fc6B P26 > + P6 -10G-> fc6A P20 > + P7 -10G-> fc5B P26 > + P8 -10G-> 
fc5A P20 > + P9 -10G-> fc4B P26 > + P10 -10G-> fc4A P20 > + P11 -10G-> fc3B P26 > + P12 -10G-> fc3A P20 > + P13 -10G-> fc0A P20 > + P14 -10G-> fc0B P26 > + P15 -10G-> fc1A P20 > + P16 -10G-> fc1B P26 > + P17 -10G-> fc2A P20 > + P18 -10G-> fc2B P26 > + P19 -10G-> lc1-6B/P3 > + P20 -10G-> lc1-6A/P3 > + P21 -10G-> lc1-6A/P2 > + P22 -10G-> lc1-6A/P1 > + P23 -10G-> lc1-6B/P2 > + P24 -10G-> lc1-6B/P1 > + P25 -10G-> lc1-7B/P3 > + P26 -10G-> lc1-7A/P3 > + P27 -10G-> lc1-7A/P2 > + P28 -10G-> lc1-7A/P1 > + P29 -10G-> lc1-7B/P2 > + P30 -10G-> lc1-7B/P1 > + P31 -10G-> lc1-8B/P1 > + P32 -10G-> lc1-8B/P2 > + P33 -10G-> lc1-8A/P1 > + P34 -10G-> lc1-8A/P2 > + P35 -10G-> lc1-8A/P3 > + P36 -10G-> lc1-8B/P3 > > SUBSYSTEM LEAF lc1D > - P1 -10G-> fc9A P4 > - P2 -10G-> fc9B P27 > - P3 -10G-> fc8A P4 > - P4 -10G-> fc8B P27 > - P5 -10G-> fc7A P4 > - P6 -10G-> fc7B P27 > - P7 -10G-> fc6A P4 > - P8 -10G-> fc6B P27 > - P9 -10G-> fc5A P4 > - P10 -10G-> fc5B P27 > - P11 -10G-> fc4A P4 > - P12 -10G-> fc4B P27 > - P13 -10G-> fc1B P27 > - P14 -10G-> fc1A P4 > - P15 -10G-> fc2B P27 > - P16 -10G-> fc2A P4 > - P17 -10G-> fc3B P27 > - P18 -10G-> fc3A P4 > - P19 -10G-> lc1-9A/P3 > - P20 -10G-> lc1-9B/P3 > - P21 -10G-> lc1-9B/P2 > - P22 -10G-> lc1-9B/P1 > - P23 -10G-> lc1-9A/P2 > - P24 -10G-> lc1-9A/P1 > - P25 -10G-> lc1-10A/P3 > - P26 -10G-> lc1-10B/P3 > - P27 -10G-> lc1-10B/P2 > - P28 -10G-> lc1-10B/P1 > - P29 -10G-> lc1-10A/P2 > - P30 -10G-> lc1-10A/P1 > - P31 -10G-> lc1-11A/P1 > - P32 -10G-> lc1-11A/P2 > - P33 -10G-> lc1-11B/P1 > - P34 -10G-> lc1-11B/P2 > - P35 -10G-> lc1-11B/P3 > - P36 -10G-> lc1-11A/P3 > + P1 -10G-> fc8A P19 > + P2 -10G-> fc8B P36 > + P3 -10G-> fc7A P19 > + P4 -10G-> fc7B P36 > + P5 -10G-> fc6A P19 > + P6 -10G-> fc6B P36 > + P7 -10G-> fc5A P19 > + P8 -10G-> fc5B P36 > + P9 -10G-> fc4A P19 > + P10 -10G-> fc4B P36 > + P11 -10G-> fc3A P19 > + P12 -10G-> fc3B P36 > + P13 -10G-> fc0B P36 > + P14 -10G-> fc0A P19 > + P15 -10G-> fc1B P36 > + P16 -10G-> fc1A P19 > + P17 -10G-> fc2B P36 > 
+ P18 -10G-> fc2A P19 > + P19 -10G-> lc1-9B/P3 > + P20 -10G-> lc1-9A/P3 > + P21 -10G-> lc1-9A/P2 > + P22 -10G-> lc1-9A/P1 > + P23 -10G-> lc1-9B/P2 > + P24 -10G-> lc1-9B/P1 > + P25 -10G-> lc1-10B/P3 > + P26 -10G-> lc1-10A/P3 > + P27 -10G-> lc1-10A/P2 > + P28 -10G-> lc1-10A/P1 > + P29 -10G-> lc1-10B/P2 > + P30 -10G-> lc1-10B/P1 > + P31 -10G-> lc1-11B/P1 > + P32 -10G-> lc1-11B/P2 > + P33 -10G-> lc1-11A/P1 > + P34 -10G-> lc1-11A/P2 > + P35 -10G-> lc1-11A/P3 > + P36 -10G-> lc1-11B/P3 > > SUBSYSTEM LEAF lc2A > - P1 -10G-> fc9B P28 > - P2 -10G-> fc9A P22 > - P3 -10G-> fc8B P28 > - P4 -10G-> fc8A P22 > - P5 -10G-> fc7B P28 > - P6 -10G-> fc7A P22 > - P7 -10G-> fc6B P28 > - P8 -10G-> fc6A P22 > - P9 -10G-> fc5B P28 > - P10 -10G-> fc5A P22 > - P11 -10G-> fc4B P28 > - P12 -10G-> fc4A P22 > - P13 -10G-> fc1A P22 > - P14 -10G-> fc1B P28 > - P15 -10G-> fc2A P22 > - P16 -10G-> fc2B P28 > - P17 -10G-> fc3A P22 > - P18 -10G-> fc3B P28 > - P19 -10G-> lc2-0A/P3 > - P20 -10G-> lc2-0B/P3 > - P21 -10G-> lc2-0B/P2 > - P22 -10G-> lc2-0B/P1 > - P23 -10G-> lc2-0A/P2 > - P24 -10G-> lc2-0A/P1 > - P25 -10G-> lc2-1A/P3 > - P26 -10G-> lc2-1B/P3 > - P27 -10G-> lc2-1B/P2 > - P28 -10G-> lc2-1B/P1 > - P29 -10G-> lc2-1A/P2 > - P30 -10G-> lc2-1A/P1 > - P31 -10G-> lc2-2A/P1 > - P32 -10G-> lc2-2A/P2 > - P33 -10G-> lc2-2B/P1 > - P34 -10G-> lc2-2B/P2 > - P35 -10G-> lc2-2B/P3 > - P36 -10G-> lc2-2A/P3 > + P1 -10G-> fc8B P34 > + P2 -10G-> fc8A P27 > + P3 -10G-> fc7B P34 > + P4 -10G-> fc7A P27 > + P5 -10G-> fc6B P34 > + P6 -10G-> fc6A P27 > + P7 -10G-> fc5B P34 > + P8 -10G-> fc5A P27 > + P9 -10G-> fc4B P34 > + P10 -10G-> fc4A P27 > + P11 -10G-> fc3B P34 > + P12 -10G-> fc3A P27 > + P13 -10G-> fc0A P27 > + P14 -10G-> fc0B P34 > + P15 -10G-> fc1A P27 > + P16 -10G-> fc1B P34 > + P17 -10G-> fc2A P27 > + P18 -10G-> fc2B P34 > + P19 -10G-> lc2-0B/P3 > + P20 -10G-> lc2-0A/P3 > + P21 -10G-> lc2-0A/P2 > + P22 -10G-> lc2-0A/P1 > + P23 -10G-> lc2-0B/P2 > + P24 -10G-> lc2-0B/P1 > + P25 -10G-> lc2-1B/P3 > + P26 -10G-> 
lc2-1A/P3 > + P27 -10G-> lc2-1A/P2 > + P28 -10G-> lc2-1A/P1 > + P29 -10G-> lc2-1B/P2 > + P30 -10G-> lc2-1B/P1 > + P31 -10G-> lc2-2B/P1 > + P32 -10G-> lc2-2B/P2 > + P33 -10G-> lc2-2A/P1 > + P34 -10G-> lc2-2A/P2 > + P35 -10G-> lc2-2A/P3 > + P36 -10G-> lc2-2B/P3 > > SUBSYSTEM LEAF lc2B > - P1 -10G-> fc9A P21 > - P2 -10G-> fc9B P29 > - P3 -10G-> fc8A P21 > - P4 -10G-> fc8B P29 > - P5 -10G-> fc7A P21 > - P6 -10G-> fc7B P29 > - P7 -10G-> fc6A P21 > - P8 -10G-> fc6B P29 > - P9 -10G-> fc5A P21 > - P10 -10G-> fc5B P29 > - P11 -10G-> fc4A P21 > - P12 -10G-> fc4B P29 > - P13 -10G-> fc1B P29 > - P14 -10G-> fc1A P21 > - P15 -10G-> fc2B P29 > - P16 -10G-> fc2A P21 > - P17 -10G-> fc3B P29 > - P18 -10G-> fc3A P21 > - P19 -10G-> lc2-3A/P3 > - P20 -10G-> lc2-3B/P3 > - P21 -10G-> lc2-3B/P2 > - P22 -10G-> lc2-3B/P1 > - P23 -10G-> lc2-3A/P2 > - P24 -10G-> lc2-3A/P1 > - P25 -10G-> lc2-4A/P3 > - P26 -10G-> lc2-4B/P3 > - P27 -10G-> lc2-4B/P2 > - P28 -10G-> lc2-4B/P1 > - P29 -10G-> lc2-4A/P2 > - P30 -10G-> lc2-4A/P1 > - P31 -10G-> lc2-5A/P1 > - P32 -10G-> lc2-5A/P2 > - P33 -10G-> lc2-5B/P1 > - P34 -10G-> lc2-5B/P2 > - P35 -10G-> lc2-5B/P3 > - P36 -10G-> lc2-5A/P3 > + P1 -10G-> fc8A P24 > + P2 -10G-> fc8B P35 > + P3 -10G-> fc7A P24 > + P4 -10G-> fc7B P35 > + P5 -10G-> fc6A P24 > + P6 -10G-> fc6B P35 > + P7 -10G-> fc5A P24 > + P8 -10G-> fc5B P35 > + P9 -10G-> fc4A P24 > + P10 -10G-> fc4B P35 > + P11 -10G-> fc3A P24 > + P12 -10G-> fc3B P35 > + P13 -10G-> fc0B P35 > + P14 -10G-> fc0A P24 > + P15 -10G-> fc1B P35 > + P16 -10G-> fc1A P24 > + P17 -10G-> fc2B P35 > + P18 -10G-> fc2A P24 > + P19 -10G-> lc2-3B/P3 > + P20 -10G-> lc2-3A/P3 > + P21 -10G-> lc2-3A/P2 > + P22 -10G-> lc2-3A/P1 > + P23 -10G-> lc2-3B/P2 > + P24 -10G-> lc2-3B/P1 > + P25 -10G-> lc2-4B/P3 > + P26 -10G-> lc2-4A/P3 > + P27 -10G-> lc2-4A/P2 > + P28 -10G-> lc2-4A/P1 > + P29 -10G-> lc2-4B/P2 > + P30 -10G-> lc2-4B/P1 > + P31 -10G-> lc2-5B/P1 > + P32 -10G-> lc2-5B/P2 > + P33 -10G-> lc2-5A/P1 > + P34 -10G-> lc2-5A/P2 > + P35 -10G-> 
lc2-5A/P3 > + P36 -10G-> lc2-5B/P3 > > SUBSYSTEM LEAF lc2C > - P1 -10G-> fc9B P26 > - P2 -10G-> fc9A P20 > - P3 -10G-> fc8B P26 > - P4 -10G-> fc8A P20 > - P5 -10G-> fc7B P26 > - P6 -10G-> fc7A P20 > - P7 -10G-> fc6B P26 > - P8 -10G-> fc6A P20 > - P9 -10G-> fc5B P26 > - P10 -10G-> fc5A P20 > - P11 -10G-> fc4B P26 > - P12 -10G-> fc4A P20 > - P13 -10G-> fc1A P20 > - P14 -10G-> fc1B P26 > - P15 -10G-> fc2A P20 > - P16 -10G-> fc2B P26 > - P17 -10G-> fc3A P20 > - P18 -10G-> fc3B P26 > - P19 -10G-> lc2-6A/P3 > - P20 -10G-> lc2-6B/P3 > - P21 -10G-> lc2-6B/P2 > - P22 -10G-> lc2-6B/P1 > - P23 -10G-> lc2-6A/P2 > - P24 -10G-> lc2-6A/P1 > - P25 -10G-> lc2-7A/P3 > - P26 -10G-> lc2-7B/P3 > - P27 -10G-> lc2-7B/P2 > - P28 -10G-> lc2-7B/P1 > - P29 -10G-> lc2-7A/P2 > - P30 -10G-> lc2-7A/P1 > - P31 -10G-> lc2-8A/P1 > - P32 -10G-> lc2-8A/P2 > - P33 -10G-> lc2-8B/P1 > - P34 -10G-> lc2-8B/P2 > - P35 -10G-> lc2-8B/P3 > - P36 -10G-> lc2-8A/P3 > + P1 -10G-> fc8B P30 > + P2 -10G-> fc8A P25 > + P3 -10G-> fc7B P30 > + P4 -10G-> fc7A P25 > + P5 -10G-> fc6B P30 > + P6 -10G-> fc6A P25 > + P7 -10G-> fc5B P30 > + P8 -10G-> fc5A P25 > + P9 -10G-> fc4B P30 > + P10 -10G-> fc4A P25 > + P11 -10G-> fc3B P30 > + P12 -10G-> fc3A P25 > + P13 -10G-> fc0A P25 > + P14 -10G-> fc0B P30 > + P15 -10G-> fc1A P25 > + P16 -10G-> fc1B P30 > + P17 -10G-> fc2A P25 > + P18 -10G-> fc2B P30 > + P19 -10G-> lc2-6B/P3 > + P20 -10G-> lc2-6A/P3 > + P21 -10G-> lc2-6A/P2 > + P22 -10G-> lc2-6A/P1 > + P23 -10G-> lc2-6B/P2 > + P24 -10G-> lc2-6B/P1 > + P25 -10G-> lc2-7B/P3 > + P26 -10G-> lc2-7A/P3 > + P27 -10G-> lc2-7A/P2 > + P28 -10G-> lc2-7A/P1 > + P29 -10G-> lc2-7B/P2 > + P30 -10G-> lc2-7B/P1 > + P31 -10G-> lc2-8B/P1 > + P32 -10G-> lc2-8B/P2 > + P33 -10G-> lc2-8A/P1 > + P34 -10G-> lc2-8A/P2 > + P35 -10G-> lc2-8A/P3 > + P36 -10G-> lc2-8B/P3 > > SUBSYSTEM LEAF lc2D > - P1 -10G-> fc9A P19 > - P2 -10G-> fc9B P36 > - P3 -10G-> fc8A P19 > - P4 -10G-> fc8B P36 > - P5 -10G-> fc7A P19 > - P6 -10G-> fc7B P36 > - P7 -10G-> fc6A P19 > - P8 
-10G-> fc6B P36 > - P9 -10G-> fc5A P19 > - P10 -10G-> fc5B P36 > - P11 -10G-> fc4A P19 > - P12 -10G-> fc4B P36 > - P13 -10G-> fc1B P36 > - P14 -10G-> fc1A P19 > - P15 -10G-> fc2B P36 > - P16 -10G-> fc2A P19 > - P17 -10G-> fc3B P36 > - P18 -10G-> fc3A P19 > - P19 -10G-> lc2-9A/P3 > - P20 -10G-> lc2-9B/P3 > - P21 -10G-> lc2-9B/P2 > - P22 -10G-> lc2-9B/P1 > - P23 -10G-> lc2-9A/P2 > - P24 -10G-> lc2-9A/P1 > - P25 -10G-> lc2-10A/P3 > - P26 -10G-> lc2-10B/P3 > - P27 -10G-> lc2-10B/P2 > - P28 -10G-> lc2-10B/P1 > - P29 -10G-> lc2-10A/P2 > - P30 -10G-> lc2-10A/P1 > - P31 -10G-> lc2-11A/P1 > - P32 -10G-> lc2-11A/P2 > - P33 -10G-> lc2-11B/P1 > - P34 -10G-> lc2-11B/P2 > - P35 -10G-> lc2-11B/P3 > - P36 -10G-> lc2-11A/P3 > + P1 -10G-> fc8A P23 > + P2 -10G-> fc8B P33 > + P3 -10G-> fc7A P23 > + P4 -10G-> fc7B P33 > + P5 -10G-> fc6A P23 > + P6 -10G-> fc6B P33 > + P7 -10G-> fc5A P23 > + P8 -10G-> fc5B P33 > + P9 -10G-> fc4A P23 > + P10 -10G-> fc4B P33 > + P11 -10G-> fc3A P23 > + P12 -10G-> fc3B P33 > + P13 -10G-> fc0B P33 > + P14 -10G-> fc0A P23 > + P15 -10G-> fc1B P33 > + P16 -10G-> fc1A P23 > + P17 -10G-> fc2B P33 > + P18 -10G-> fc2A P23 > + P19 -10G-> lc2-9B/P3 > + P20 -10G-> lc2-9A/P3 > + P21 -10G-> lc2-9A/P2 > + P22 -10G-> lc2-9A/P1 > + P23 -10G-> lc2-9B/P2 > + P24 -10G-> lc2-9B/P1 > + P25 -10G-> lc2-10B/P3 > + P26 -10G-> lc2-10A/P3 > + P27 -10G-> lc2-10A/P2 > + P28 -10G-> lc2-10A/P1 > + P29 -10G-> lc2-10B/P2 > + P30 -10G-> lc2-10B/P1 > + P31 -10G-> lc2-11B/P1 > + P32 -10G-> lc2-11B/P2 > + P33 -10G-> lc2-11A/P1 > + P34 -10G-> lc2-11A/P2 > + P35 -10G-> lc2-11A/P3 > + P36 -10G-> lc2-11B/P3 > > SUBSYSTEM LEAF lc3A > - P1 -10G-> fc9B P34 > - P2 -10G-> fc9A P27 > - P3 -10G-> fc8B P34 > - P4 -10G-> fc8A P27 > - P5 -10G-> fc7B P34 > - P6 -10G-> fc7A P27 > - P7 -10G-> fc6B P34 > - P8 -10G-> fc6A P27 > - P9 -10G-> fc5B P34 > - P10 -10G-> fc5A P27 > - P11 -10G-> fc4B P34 > - P12 -10G-> fc4A P27 > - P13 -10G-> fc1A P27 > - P14 -10G-> fc1B P34 > - P15 -10G-> fc2A P27 > - P16 -10G-> fc2B 
P34 > - P17 -10G-> fc3A P27 > - P18 -10G-> fc3B P34 > - P19 -10G-> lc3-0A/P3 > - P20 -10G-> lc3-0B/P3 > - P21 -10G-> lc3-0B/P2 > - P22 -10G-> lc3-0B/P1 > - P23 -10G-> lc3-0A/P2 > - P24 -10G-> lc3-0A/P1 > - P25 -10G-> lc3-1A/P3 > - P26 -10G-> lc3-1B/P3 > - P27 -10G-> lc3-1B/P2 > - P28 -10G-> lc3-1B/P1 > - P29 -10G-> lc3-1A/P2 > - P30 -10G-> lc3-1A/P1 > - P31 -10G-> lc3-2A/P1 > - P32 -10G-> lc3-2A/P2 > - P33 -10G-> lc3-2B/P1 > - P34 -10G-> lc3-2B/P2 > - P35 -10G-> lc3-2B/P3 > - P36 -10G-> lc3-2A/P3 > + P1 -10G-> fc8B P13 > + P2 -10G-> fc8A P36 > + P3 -10G-> fc7B P13 > + P4 -10G-> fc7A P36 > + P5 -10G-> fc6B P13 > + P6 -10G-> fc6A P36 > + P7 -10G-> fc5B P13 > + P8 -10G-> fc5A P36 > + P9 -10G-> fc4B P13 > + P10 -10G-> fc4A P36 > + P11 -10G-> fc3B P13 > + P12 -10G-> fc3A P36 > + P13 -10G-> fc0A P36 > + P14 -10G-> fc0B P13 > + P15 -10G-> fc1A P36 > + P16 -10G-> fc1B P13 > + P17 -10G-> fc2A P36 > + P18 -10G-> fc2B P13 > + P19 -10G-> lc3-0B/P3 > + P20 -10G-> lc3-0A/P3 > + P21 -10G-> lc3-0A/P2 > + P22 -10G-> lc3-0A/P1 > + P23 -10G-> lc3-0B/P2 > + P24 -10G-> lc3-0B/P1 > + P25 -10G-> lc3-1B/P3 > + P26 -10G-> lc3-1A/P3 > + P27 -10G-> lc3-1A/P2 > + P28 -10G-> lc3-1A/P1 > + P29 -10G-> lc3-1B/P2 > + P30 -10G-> lc3-1B/P1 > + P31 -10G-> lc3-2B/P1 > + P32 -10G-> lc3-2B/P2 > + P33 -10G-> lc3-2A/P1 > + P34 -10G-> lc3-2A/P2 > + P35 -10G-> lc3-2A/P3 > + P36 -10G-> lc3-2B/P3 > > SUBSYSTEM LEAF lc3B > - P1 -10G-> fc9A P24 > - P2 -10G-> fc9B P35 > - P3 -10G-> fc8A P24 > - P4 -10G-> fc8B P35 > - P5 -10G-> fc7A P24 > - P6 -10G-> fc7B P35 > - P7 -10G-> fc6A P24 > - P8 -10G-> fc6B P35 > - P9 -10G-> fc5A P24 > - P10 -10G-> fc5B P35 > - P11 -10G-> fc4A P24 > - P12 -10G-> fc4B P35 > - P13 -10G-> fc1B P35 > - P14 -10G-> fc1A P24 > - P15 -10G-> fc2B P35 > - P16 -10G-> fc2A P24 > - P17 -10G-> fc3B P35 > - P18 -10G-> fc3A P24 > - P19 -10G-> lc3-3A/P3 > - P20 -10G-> lc3-3B/P3 > - P21 -10G-> lc3-3B/P2 > - P22 -10G-> lc3-3B/P1 > - P23 -10G-> lc3-3A/P2 > - P24 -10G-> lc3-3A/P1 > - P25 -10G-> lc3-4A/P3 > 
- P26 -10G-> lc3-4B/P3 > - P27 -10G-> lc3-4B/P2 > - P28 -10G-> lc3-4B/P1 > - P29 -10G-> lc3-4A/P2 > - P30 -10G-> lc3-4A/P1 > - P31 -10G-> lc3-5A/P1 > - P32 -10G-> lc3-5A/P2 > - P33 -10G-> lc3-5B/P1 > - P34 -10G-> lc3-5B/P2 > - P35 -10G-> lc3-5B/P3 > - P36 -10G-> lc3-5A/P3 > + P1 -10G-> fc8A P28 > + P2 -10G-> fc8B P31 > + P3 -10G-> fc7A P28 > + P4 -10G-> fc7B P31 > + P5 -10G-> fc6A P28 > + P6 -10G-> fc6B P31 > + P7 -10G-> fc5A P28 > + P8 -10G-> fc5B P31 > + P9 -10G-> fc4A P28 > + P10 -10G-> fc4B P31 > + P11 -10G-> fc3A P28 > + P12 -10G-> fc3B P31 > + P13 -10G-> fc0B P31 > + P14 -10G-> fc0A P28 > + P15 -10G-> fc1B P31 > + P16 -10G-> fc1A P28 > + P17 -10G-> fc2B P31 > + P18 -10G-> fc2A P28 > + P19 -10G-> lc3-3B/P3 > + P20 -10G-> lc3-3A/P3 > + P21 -10G-> lc3-3A/P2 > + P22 -10G-> lc3-3A/P1 > + P23 -10G-> lc3-3B/P2 > + P24 -10G-> lc3-3B/P1 > + P25 -10G-> lc3-4B/P3 > + P26 -10G-> lc3-4A/P3 > + P27 -10G-> lc3-4A/P2 > + P28 -10G-> lc3-4A/P1 > + P29 -10G-> lc3-4B/P2 > + P30 -10G-> lc3-4B/P1 > + P31 -10G-> lc3-5B/P1 > + P32 -10G-> lc3-5B/P2 > + P33 -10G-> lc3-5A/P1 > + P34 -10G-> lc3-5A/P2 > + P35 -10G-> lc3-5A/P3 > + P36 -10G-> lc3-5B/P3 > > SUBSYSTEM LEAF lc3C > - P1 -10G-> fc9B P30 > - P2 -10G-> fc9A P25 > - P3 -10G-> fc8B P30 > - P4 -10G-> fc8A P25 > - P5 -10G-> fc7B P30 > - P6 -10G-> fc7A P25 > - P7 -10G-> fc6B P30 > - P8 -10G-> fc6A P25 > - P9 -10G-> fc5B P30 > - P10 -10G-> fc5A P25 > - P11 -10G-> fc4B P30 > - P12 -10G-> fc4A P25 > - P13 -10G-> fc1A P25 > - P14 -10G-> fc1B P30 > - P15 -10G-> fc2A P25 > - P16 -10G-> fc2B P30 > - P17 -10G-> fc3A P25 > - P18 -10G-> fc3B P30 > - P19 -10G-> lc3-6A/P3 > - P20 -10G-> lc3-6B/P3 > - P21 -10G-> lc3-6B/P2 > - P22 -10G-> lc3-6B/P1 > - P23 -10G-> lc3-6A/P2 > - P24 -10G-> lc3-6A/P1 > - P25 -10G-> lc3-7A/P3 > - P26 -10G-> lc3-7B/P3 > - P27 -10G-> lc3-7B/P2 > - P28 -10G-> lc3-7B/P1 > - P29 -10G-> lc3-7A/P2 > - P30 -10G-> lc3-7A/P1 > - P31 -10G-> lc3-8A/P1 > - P32 -10G-> lc3-8A/P2 > - P33 -10G-> lc3-8B/P1 > - P34 -10G-> lc3-8B/P2 > - 
P35 -10G-> lc3-8B/P3 > - P36 -10G-> lc3-8A/P3 > + P1 -10G-> fc8B P32 > + P2 -10G-> fc8A P29 > + P3 -10G-> fc7B P32 > + P4 -10G-> fc7A P29 > + P5 -10G-> fc6B P32 > + P6 -10G-> fc6A P29 > + P7 -10G-> fc5B P32 > + P8 -10G-> fc5A P29 > + P9 -10G-> fc4B P32 > + P10 -10G-> fc4A P29 > + P11 -10G-> fc3B P32 > + P12 -10G-> fc3A P29 > + P13 -10G-> fc0A P29 > + P14 -10G-> fc0B P32 > + P15 -10G-> fc1A P29 > + P16 -10G-> fc1B P32 > + P17 -10G-> fc2A P29 > + P18 -10G-> fc2B P32 > + P19 -10G-> lc3-6B/P3 > + P20 -10G-> lc3-6A/P3 > + P21 -10G-> lc3-6A/P2 > + P22 -10G-> lc3-6A/P1 > + P23 -10G-> lc3-6B/P2 > + P24 -10G-> lc3-6B/P1 > + P25 -10G-> lc3-7B/P3 > + P26 -10G-> lc3-7A/P3 > + P27 -10G-> lc3-7A/P2 > + P28 -10G-> lc3-7A/P1 > + P29 -10G-> lc3-7B/P2 > + P30 -10G-> lc3-7B/P1 > + P31 -10G-> lc3-8B/P1 > + P32 -10G-> lc3-8B/P2 > + P33 -10G-> lc3-8A/P1 > + P34 -10G-> lc3-8A/P2 > + P35 -10G-> lc3-8A/P3 > + P36 -10G-> lc3-8B/P3 > > SUBSYSTEM LEAF lc3D > - P1 -10G-> fc9A P23 > - P2 -10G-> fc9B P33 > - P3 -10G-> fc8A P23 > - P4 -10G-> fc8B P33 > - P5 -10G-> fc7A P23 > - P6 -10G-> fc7B P33 > - P7 -10G-> fc6A P23 > - P8 -10G-> fc6B P33 > - P9 -10G-> fc5A P23 > - P10 -10G-> fc5B P33 > - P11 -10G-> fc4A P23 > - P12 -10G-> fc4B P33 > - P13 -10G-> fc1B P33 > - P14 -10G-> fc1A P23 > - P15 -10G-> fc2B P33 > - P16 -10G-> fc2A P23 > - P17 -10G-> fc3B P33 > - P18 -10G-> fc3A P23 > - P19 -10G-> lc3-9A/P3 > - P20 -10G-> lc3-9B/P3 > - P21 -10G-> lc3-9B/P2 > - P22 -10G-> lc3-9B/P1 > - P23 -10G-> lc3-9A/P2 > - P24 -10G-> lc3-9A/P1 > - P25 -10G-> lc3-10A/P3 > - P26 -10G-> lc3-10B/P3 > - P27 -10G-> lc3-10B/P2 > - P28 -10G-> lc3-10B/P1 > - P29 -10G-> lc3-10A/P2 > - P30 -10G-> lc3-10A/P1 > - P31 -10G-> lc3-11A/P1 > - P32 -10G-> lc3-11A/P2 > - P33 -10G-> lc3-11B/P1 > - P34 -10G-> lc3-11B/P2 > - P35 -10G-> lc3-11B/P3 > - P36 -10G-> lc3-11A/P3 > + P1 -10G-> fc8A P26 > + P2 -10G-> fc8B P14 > + P3 -10G-> fc7A P26 > + P4 -10G-> fc7B P14 > + P5 -10G-> fc6A P26 > + P6 -10G-> fc6B P14 > + P7 -10G-> fc5A P26 > + P8 
-10G-> fc5B P14 > + P9 -10G-> fc4A P26 > + P10 -10G-> fc4B P14 > + P11 -10G-> fc3A P26 > + P12 -10G-> fc3B P14 > + P13 -10G-> fc0B P14 > + P14 -10G-> fc0A P26 > + P15 -10G-> fc1B P14 > + P16 -10G-> fc1A P26 > + P17 -10G-> fc2B P14 > + P18 -10G-> fc2A P26 > + P19 -10G-> lc3-9B/P3 > + P20 -10G-> lc3-9A/P3 > + P21 -10G-> lc3-9A/P2 > + P22 -10G-> lc3-9A/P1 > + P23 -10G-> lc3-9B/P2 > + P24 -10G-> lc3-9B/P1 > + P25 -10G-> lc3-10B/P3 > + P26 -10G-> lc3-10A/P3 > + P27 -10G-> lc3-10A/P2 > + P28 -10G-> lc3-10A/P1 > + P29 -10G-> lc3-10B/P2 > + P30 -10G-> lc3-10B/P1 > + P31 -10G-> lc3-11B/P1 > + P32 -10G-> lc3-11B/P2 > + P33 -10G-> lc3-11A/P1 > + P34 -10G-> lc3-11A/P2 > + P35 -10G-> lc3-11A/P3 > + P36 -10G-> lc3-11B/P3 > > SUBSYSTEM LEAF lc4A > - P1 -10G-> fc9B P13 > - P2 -10G-> fc9A P36 > - P3 -10G-> fc8B P13 > - P4 -10G-> fc8A P36 > - P5 -10G-> fc7B P13 > - P6 -10G-> fc7A P36 > - P7 -10G-> fc6B P13 > - P8 -10G-> fc6A P36 > - P9 -10G-> fc5B P13 > - P10 -10G-> fc5A P36 > - P11 -10G-> fc4B P13 > - P12 -10G-> fc4A P36 > - P13 -10G-> fc1A P36 > - P14 -10G-> fc1B P13 > - P15 -10G-> fc2A P36 > - P16 -10G-> fc2B P13 > - P17 -10G-> fc3A P36 > - P18 -10G-> fc3B P13 > - P19 -10G-> lc4-0A/P3 > - P20 -10G-> lc4-0B/P3 > - P21 -10G-> lc4-0B/P2 > - P22 -10G-> lc4-0B/P1 > - P23 -10G-> lc4-0A/P2 > - P24 -10G-> lc4-0A/P1 > - P25 -10G-> lc4-1A/P3 > - P26 -10G-> lc4-1B/P3 > - P27 -10G-> lc4-1B/P2 > - P28 -10G-> lc4-1B/P1 > - P29 -10G-> lc4-1A/P2 > - P30 -10G-> lc4-1A/P1 > - P31 -10G-> lc4-2A/P1 > - P32 -10G-> lc4-2A/P2 > - P33 -10G-> lc4-2B/P1 > - P34 -10G-> lc4-2B/P2 > - P35 -10G-> lc4-2B/P3 > - P36 -10G-> lc4-2A/P3 > + P1 -10G-> fc8B P17 > + P2 -10G-> fc8A P33 > + P3 -10G-> fc7B P17 > + P4 -10G-> fc7A P33 > + P5 -10G-> fc6B P17 > + P6 -10G-> fc6A P33 > + P7 -10G-> fc5B P17 > + P8 -10G-> fc5A P33 > + P9 -10G-> fc4B P17 > + P10 -10G-> fc4A P33 > + P11 -10G-> fc3B P17 > + P12 -10G-> fc3A P33 > + P13 -10G-> fc0A P33 > + P14 -10G-> fc0B P17 > + P15 -10G-> fc1A P33 > + P16 -10G-> fc1B P17 > + P17 
-10G-> fc2A P33 > + P18 -10G-> fc2B P17 > + P19 -10G-> lc4-0B/P3 > + P20 -10G-> lc4-0A/P3 > + P21 -10G-> lc4-0A/P2 > + P22 -10G-> lc4-0A/P1 > + P23 -10G-> lc4-0B/P2 > + P24 -10G-> lc4-0B/P1 > + P25 -10G-> lc4-1B/P3 > + P26 -10G-> lc4-1A/P3 > + P27 -10G-> lc4-1A/P2 > + P28 -10G-> lc4-1A/P1 > + P29 -10G-> lc4-1B/P2 > + P30 -10G-> lc4-1B/P1 > + P31 -10G-> lc4-2B/P1 > + P32 -10G-> lc4-2B/P2 > + P33 -10G-> lc4-2A/P1 > + P34 -10G-> lc4-2A/P2 > + P35 -10G-> lc4-2A/P3 > + P36 -10G-> lc4-2B/P3 > > SUBSYSTEM LEAF lc4B > - P1 -10G-> fc9A P28 > - P2 -10G-> fc9B P31 > - P3 -10G-> fc8A P28 > - P4 -10G-> fc8B P31 > - P5 -10G-> fc7A P28 > - P6 -10G-> fc7B P31 > - P7 -10G-> fc6A P28 > - P8 -10G-> fc6B P31 > - P9 -10G-> fc5A P28 > - P10 -10G-> fc5B P31 > - P11 -10G-> fc4A P28 > - P12 -10G-> fc4B P31 > - P13 -10G-> fc1B P31 > - P14 -10G-> fc1A P28 > - P15 -10G-> fc2B P31 > - P16 -10G-> fc2A P28 > - P17 -10G-> fc3B P31 > - P18 -10G-> fc3A P28 > - P19 -10G-> lc4-3A/P3 > - P20 -10G-> lc4-3B/P3 > - P21 -10G-> lc4-3B/P2 > - P22 -10G-> lc4-3B/P1 > - P23 -10G-> lc4-3A/P2 > - P24 -10G-> lc4-3A/P1 > - P25 -10G-> lc4-4A/P3 > - P26 -10G-> lc4-4B/P3 > - P27 -10G-> lc4-4B/P2 > - P28 -10G-> lc4-4B/P1 > - P29 -10G-> lc4-4A/P2 > - P30 -10G-> lc4-4A/P1 > - P31 -10G-> lc4-5A/P1 > - P32 -10G-> lc4-5A/P2 > - P33 -10G-> lc4-5B/P1 > - P34 -10G-> lc4-5B/P2 > - P35 -10G-> lc4-5B/P3 > - P36 -10G-> lc4-5A/P3 > + P1 -10G-> fc8A P34 > + P2 -10G-> fc8B P16 > + P3 -10G-> fc7A P34 > + P4 -10G-> fc7B P16 > + P5 -10G-> fc6A P34 > + P6 -10G-> fc6B P16 > + P7 -10G-> fc5A P34 > + P8 -10G-> fc5B P16 > + P9 -10G-> fc4A P34 > + P10 -10G-> fc4B P16 > + P11 -10G-> fc3A P34 > + P12 -10G-> fc3B P16 > + P13 -10G-> fc0B P16 > + P14 -10G-> fc0A P34 > + P15 -10G-> fc1B P16 > + P16 -10G-> fc1A P34 > + P17 -10G-> fc2B P16 > + P18 -10G-> fc2A P34 > + P19 -10G-> lc4-3B/P3 > + P20 -10G-> lc4-3A/P3 > + P21 -10G-> lc4-3A/P2 > + P22 -10G-> lc4-3A/P1 > + P23 -10G-> lc4-3B/P2 > + P24 -10G-> lc4-3B/P1 > + P25 -10G-> lc4-4B/P3 > + P26 -10G-> 
lc4-4A/P3 > + P27 -10G-> lc4-4A/P2 > + P28 -10G-> lc4-4A/P1 > + P29 -10G-> lc4-4B/P2 > + P30 -10G-> lc4-4B/P1 > + P31 -10G-> lc4-5B/P1 > + P32 -10G-> lc4-5B/P2 > + P33 -10G-> lc4-5A/P1 > + P34 -10G-> lc4-5A/P2 > + P35 -10G-> lc4-5A/P3 > + P36 -10G-> lc4-5B/P3 > > SUBSYSTEM LEAF lc4C > - P1 -10G-> fc9B P32 > - P2 -10G-> fc9A P29 > - P3 -10G-> fc8B P32 > - P4 -10G-> fc8A P29 > - P5 -10G-> fc7B P32 > - P6 -10G-> fc7A P29 > - P7 -10G-> fc6B P32 > - P8 -10G-> fc6A P29 > - P9 -10G-> fc5B P32 > - P10 -10G-> fc5A P29 > - P11 -10G-> fc4B P32 > - P12 -10G-> fc4A P29 > - P13 -10G-> fc1A P29 > - P14 -10G-> fc1B P32 > - P15 -10G-> fc2A P29 > - P16 -10G-> fc2B P32 > - P17 -10G-> fc3A P29 > - P18 -10G-> fc3B P32 > - P19 -10G-> lc4-6A/P3 > - P20 -10G-> lc4-6B/P3 > - P21 -10G-> lc4-6B/P2 > - P22 -10G-> lc4-6B/P1 > - P23 -10G-> lc4-6A/P2 > - P24 -10G-> lc4-6A/P1 > - P25 -10G-> lc4-7A/P3 > - P26 -10G-> lc4-7B/P3 > - P27 -10G-> lc4-7B/P2 > - P28 -10G-> lc4-7B/P1 > - P29 -10G-> lc4-7A/P2 > - P30 -10G-> lc4-7A/P1 > - P31 -10G-> lc4-8A/P1 > - P32 -10G-> lc4-8A/P2 > - P33 -10G-> lc4-8B/P1 > - P34 -10G-> lc4-8B/P2 > - P35 -10G-> lc4-8B/P3 > - P36 -10G-> lc4-8A/P3 > + P1 -10G-> fc8B P15 > + P2 -10G-> fc8A P35 > + P3 -10G-> fc7B P15 > + P4 -10G-> fc7A P35 > + P5 -10G-> fc6B P15 > + P6 -10G-> fc6A P35 > + P7 -10G-> fc5B P15 > + P8 -10G-> fc5A P35 > + P9 -10G-> fc4B P15 > + P10 -10G-> fc4A P35 > + P11 -10G-> fc3B P15 > + P12 -10G-> fc3A P35 > + P13 -10G-> fc0A P35 > + P14 -10G-> fc0B P15 > + P15 -10G-> fc1A P35 > + P16 -10G-> fc1B P15 > + P17 -10G-> fc2A P35 > + P18 -10G-> fc2B P15 > + P19 -10G-> lc4-6B/P3 > + P20 -10G-> lc4-6A/P3 > + P21 -10G-> lc4-6A/P2 > + P22 -10G-> lc4-6A/P1 > + P23 -10G-> lc4-6B/P2 > + P24 -10G-> lc4-6B/P1 > + P25 -10G-> lc4-7B/P3 > + P26 -10G-> lc4-7A/P3 > + P27 -10G-> lc4-7A/P2 > + P28 -10G-> lc4-7A/P1 > + P29 -10G-> lc4-7B/P2 > + P30 -10G-> lc4-7B/P1 > + P31 -10G-> lc4-8B/P1 > + P32 -10G-> lc4-8B/P2 > + P33 -10G-> lc4-8A/P1 > + P34 -10G-> lc4-8A/P2 > + P35 -10G-> 
lc4-8A/P3 > + P36 -10G-> lc4-8B/P3 > > SUBSYSTEM LEAF lc4D > - P1 -10G-> fc9A P26 > - P2 -10G-> fc9B P14 > - P3 -10G-> fc8A P26 > - P4 -10G-> fc8B P14 > - P5 -10G-> fc7A P26 > - P6 -10G-> fc7B P14 > - P7 -10G-> fc6A P26 > - P8 -10G-> fc6B P14 > - P9 -10G-> fc5A P26 > - P10 -10G-> fc5B P14 > - P11 -10G-> fc4A P26 > - P12 -10G-> fc4B P14 > - P13 -10G-> fc1B P14 > - P14 -10G-> fc1A P26 > - P15 -10G-> fc2B P14 > - P16 -10G-> fc2A P26 > - P17 -10G-> fc3B P14 > - P18 -10G-> fc3A P26 > - P19 -10G-> lc4-9A/P3 > - P20 -10G-> lc4-9B/P3 > - P21 -10G-> lc4-9B/P2 > - P22 -10G-> lc4-9B/P1 > - P23 -10G-> lc4-9A/P2 > - P24 -10G-> lc4-9A/P1 > - P25 -10G-> lc4-10A/P3 > - P26 -10G-> lc4-10B/P3 > - P27 -10G-> lc4-10B/P2 > - P28 -10G-> lc4-10B/P1 > - P29 -10G-> lc4-10A/P2 > - P30 -10G-> lc4-10A/P1 > - P31 -10G-> lc4-11A/P1 > - P32 -10G-> lc4-11A/P2 > - P33 -10G-> lc4-11B/P1 > - P34 -10G-> lc4-11B/P2 > - P35 -10G-> lc4-11B/P3 > - P36 -10G-> lc4-11A/P3 > + P1 -10G-> fc8A P30 > + P2 -10G-> fc8B P12 > + P3 -10G-> fc7A P30 > + P4 -10G-> fc7B P12 > + P5 -10G-> fc6A P30 > + P6 -10G-> fc6B P12 > + P7 -10G-> fc5A P30 > + P8 -10G-> fc5B P12 > + P9 -10G-> fc4A P30 > + P10 -10G-> fc4B P12 > + P11 -10G-> fc3A P30 > + P12 -10G-> fc3B P12 > + P13 -10G-> fc0B P12 > + P14 -10G-> fc0A P30 > + P15 -10G-> fc1B P12 > + P16 -10G-> fc1A P30 > + P17 -10G-> fc2B P12 > + P18 -10G-> fc2A P30 > + P19 -10G-> lc4-9B/P3 > + P20 -10G-> lc4-9A/P3 > + P21 -10G-> lc4-9A/P2 > + P22 -10G-> lc4-9A/P1 > + P23 -10G-> lc4-9B/P2 > + P24 -10G-> lc4-9B/P1 > + P25 -10G-> lc4-10B/P3 > + P26 -10G-> lc4-10A/P3 > + P27 -10G-> lc4-10A/P2 > + P28 -10G-> lc4-10A/P1 > + P29 -10G-> lc4-10B/P2 > + P30 -10G-> lc4-10B/P1 > + P31 -10G-> lc4-11B/P1 > + P32 -10G-> lc4-11B/P2 > + P33 -10G-> lc4-11A/P1 > + P34 -10G-> lc4-11A/P2 > + P35 -10G-> lc4-11A/P3 > + P36 -10G-> lc4-11B/P3 > > SUBSYSTEM LEAF lc5A > - P1 -10G-> fc9B P17 > - P2 -10G-> fc9A P33 > - P3 -10G-> fc8B P17 > - P4 -10G-> fc8A P33 > - P5 -10G-> fc7B P17 > - P6 -10G-> fc7A P33 > - P7 
-10G-> fc6B P17 > - P8 -10G-> fc6A P33 > - P9 -10G-> fc5B P17 > - P10 -10G-> fc5A P33 > - P11 -10G-> fc4B P17 > - P12 -10G-> fc4A P33 > - P13 -10G-> fc1A P33 > - P14 -10G-> fc1B P17 > - P15 -10G-> fc2A P33 > - P16 -10G-> fc2B P17 > - P17 -10G-> fc3A P33 > - P18 -10G-> fc3B P17 > - P19 -10G-> lc5-0A/P3 > - P20 -10G-> lc5-0B/P3 > - P21 -10G-> lc5-0B/P2 > - P22 -10G-> lc5-0B/P1 > - P23 -10G-> lc5-0A/P2 > - P24 -10G-> lc5-0A/P1 > - P25 -10G-> lc5-1A/P3 > - P26 -10G-> lc5-1B/P3 > - P27 -10G-> lc5-1B/P2 > - P28 -10G-> lc5-1B/P1 > - P29 -10G-> lc5-1A/P2 > - P30 -10G-> lc5-1A/P1 > - P31 -10G-> lc5-2A/P1 > - P32 -10G-> lc5-2A/P2 > - P33 -10G-> lc5-2B/P1 > - P34 -10G-> lc5-2B/P2 > - P35 -10G-> lc5-2B/P3 > - P36 -10G-> lc5-2A/P3 > + P1 -10G-> fc8B P11 > + P2 -10G-> fc8A P14 > + P3 -10G-> fc7B P11 > + P4 -10G-> fc7A P14 > + P5 -10G-> fc6B P11 > + P6 -10G-> fc6A P14 > + P7 -10G-> fc5B P11 > + P8 -10G-> fc5A P14 > + P9 -10G-> fc4B P11 > + P10 -10G-> fc4A P14 > + P11 -10G-> fc3B P11 > + P12 -10G-> fc3A P14 > + P13 -10G-> fc0A P14 > + P14 -10G-> fc0B P11 > + P15 -10G-> fc1A P14 > + P16 -10G-> fc1B P11 > + P17 -10G-> fc2A P14 > + P18 -10G-> fc2B P11 > + P19 -10G-> lc5-0B/P3 > + P20 -10G-> lc5-0A/P3 > + P21 -10G-> lc5-0A/P2 > + P22 -10G-> lc5-0A/P1 > + P23 -10G-> lc5-0B/P2 > + P24 -10G-> lc5-0B/P1 > + P25 -10G-> lc5-1B/P3 > + P26 -10G-> lc5-1A/P3 > + P27 -10G-> lc5-1A/P2 > + P28 -10G-> lc5-1A/P1 > + P29 -10G-> lc5-1B/P2 > + P30 -10G-> lc5-1B/P1 > + P31 -10G-> lc5-2B/P1 > + P32 -10G-> lc5-2B/P2 > + P33 -10G-> lc5-2A/P1 > + P34 -10G-> lc5-2A/P2 > + P35 -10G-> lc5-2A/P3 > + P36 -10G-> lc5-2B/P3 > > SUBSYSTEM LEAF lc5B > - P1 -10G-> fc9A P34 > - P2 -10G-> fc9B P16 > - P3 -10G-> fc8A P34 > - P4 -10G-> fc8B P16 > - P5 -10G-> fc7A P34 > - P6 -10G-> fc7B P16 > - P7 -10G-> fc6A P34 > - P8 -10G-> fc6B P16 > - P9 -10G-> fc5A P34 > - P10 -10G-> fc5B P16 > - P11 -10G-> fc4A P34 > - P12 -10G-> fc4B P16 > - P13 -10G-> fc1B P16 > - P14 -10G-> fc1A P34 > - P15 -10G-> fc2B P16 > - P16 -10G-> fc2A P34 
> - P17 -10G-> fc3B P16 > - P18 -10G-> fc3A P34 > - P19 -10G-> lc5-3A/P3 > - P20 -10G-> lc5-3B/P3 > - P21 -10G-> lc5-3B/P2 > - P22 -10G-> lc5-3B/P1 > - P23 -10G-> lc5-3A/P2 > - P24 -10G-> lc5-3A/P1 > - P25 -10G-> lc5-4A/P3 > - P26 -10G-> lc5-4B/P3 > - P27 -10G-> lc5-4B/P2 > - P28 -10G-> lc5-4B/P1 > - P29 -10G-> lc5-4A/P2 > - P30 -10G-> lc5-4A/P1 > - P31 -10G-> lc5-5A/P1 > - P32 -10G-> lc5-5A/P2 > - P33 -10G-> lc5-5B/P1 > - P34 -10G-> lc5-5B/P2 > - P35 -10G-> lc5-5B/P3 > - P36 -10G-> lc5-5A/P3 > + P1 -10G-> fc8A P13 > + P2 -10G-> fc8B P10 > + P3 -10G-> fc7A P13 > + P4 -10G-> fc7B P10 > + P5 -10G-> fc6A P13 > + P6 -10G-> fc6B P10 > + P7 -10G-> fc5A P13 > + P8 -10G-> fc5B P10 > + P9 -10G-> fc4A P13 > + P10 -10G-> fc4B P10 > + P11 -10G-> fc3A P13 > + P12 -10G-> fc3B P10 > + P13 -10G-> fc0B P10 > + P14 -10G-> fc0A P13 > + P15 -10G-> fc1B P10 > + P16 -10G-> fc1A P13 > + P17 -10G-> fc2B P10 > + P18 -10G-> fc2A P13 > + P19 -10G-> lc5-3B/P3 > + P20 -10G-> lc5-3A/P3 > + P21 -10G-> lc5-3A/P2 > + P22 -10G-> lc5-3A/P1 > + P23 -10G-> lc5-3B/P2 > + P24 -10G-> lc5-3B/P1 > + P25 -10G-> lc5-4B/P3 > + P26 -10G-> lc5-4A/P3 > + P27 -10G-> lc5-4A/P2 > + P28 -10G-> lc5-4A/P1 > + P29 -10G-> lc5-4B/P2 > + P30 -10G-> lc5-4B/P1 > + P31 -10G-> lc5-5B/P1 > + P32 -10G-> lc5-5B/P2 > + P33 -10G-> lc5-5A/P1 > + P34 -10G-> lc5-5A/P2 > + P35 -10G-> lc5-5A/P3 > + P36 -10G-> lc5-5B/P3 > > SUBSYSTEM LEAF lc5C > - P1 -10G-> fc9B P15 > - P2 -10G-> fc9A P35 > - P3 -10G-> fc8B P15 > - P4 -10G-> fc8A P35 > - P5 -10G-> fc7B P15 > - P6 -10G-> fc7A P35 > - P7 -10G-> fc6B P15 > - P8 -10G-> fc6A P35 > - P9 -10G-> fc5B P15 > - P10 -10G-> fc5A P35 > - P11 -10G-> fc4B P15 > - P12 -10G-> fc4A P35 > - P13 -10G-> fc1A P35 > - P14 -10G-> fc1B P15 > - P15 -10G-> fc2A P35 > - P16 -10G-> fc2B P15 > - P17 -10G-> fc3A P35 > - P18 -10G-> fc3B P15 > - P19 -10G-> lc5-6A/P3 > - P20 -10G-> lc5-6B/P3 > - P21 -10G-> lc5-6B/P2 > - P22 -10G-> lc5-6B/P1 > - P23 -10G-> lc5-6A/P2 > - P24 -10G-> lc5-6A/P1 > - P25 -10G-> lc5-7A/P3 > - 
P26 -10G-> lc5-7B/P3 > - P27 -10G-> lc5-7B/P2 > - P28 -10G-> lc5-7B/P1 > - P29 -10G-> lc5-7A/P2 > - P30 -10G-> lc5-7A/P1 > - P31 -10G-> lc5-8A/P1 > - P32 -10G-> lc5-8A/P2 > - P33 -10G-> lc5-8B/P1 > - P34 -10G-> lc5-8B/P2 > - P35 -10G-> lc5-8B/P3 > - P36 -10G-> lc5-8A/P3 > + P1 -10G-> fc8B P18 > + P2 -10G-> fc8A P31 > + P3 -10G-> fc7B P18 > + P4 -10G-> fc7A P31 > + P5 -10G-> fc6B P18 > + P6 -10G-> fc6A P31 > + P7 -10G-> fc5B P18 > + P8 -10G-> fc5A P31 > + P9 -10G-> fc4B P18 > + P10 -10G-> fc4A P31 > + P11 -10G-> fc3B P18 > + P12 -10G-> fc3A P31 > + P13 -10G-> fc0A P31 > + P14 -10G-> fc0B P18 > + P15 -10G-> fc1A P31 > + P16 -10G-> fc1B P18 > + P17 -10G-> fc2A P31 > + P18 -10G-> fc2B P18 > + P19 -10G-> lc5-6B/P3 > + P20 -10G-> lc5-6A/P3 > + P21 -10G-> lc5-6A/P2 > + P22 -10G-> lc5-6A/P1 > + P23 -10G-> lc5-6B/P2 > + P24 -10G-> lc5-6B/P1 > + P25 -10G-> lc5-7B/P3 > + P26 -10G-> lc5-7A/P3 > + P27 -10G-> lc5-7A/P2 > + P28 -10G-> lc5-7A/P1 > + P29 -10G-> lc5-7B/P2 > + P30 -10G-> lc5-7B/P1 > + P31 -10G-> lc5-8B/P1 > + P32 -10G-> lc5-8B/P2 > + P33 -10G-> lc5-8A/P1 > + P34 -10G-> lc5-8A/P2 > + P35 -10G-> lc5-8A/P3 > + P36 -10G-> lc5-8B/P3 > > SUBSYSTEM LEAF lc5D > - P1 -10G-> fc9A P30 > - P2 -10G-> fc9B P12 > - P3 -10G-> fc8A P30 > - P4 -10G-> fc8B P12 > - P5 -10G-> fc7A P30 > - P6 -10G-> fc7B P12 > - P7 -10G-> fc6A P30 > - P8 -10G-> fc6B P12 > - P9 -10G-> fc5A P30 > - P10 -10G-> fc5B P12 > - P11 -10G-> fc4A P30 > - P12 -10G-> fc4B P12 > - P13 -10G-> fc1B P12 > - P14 -10G-> fc1A P30 > - P15 -10G-> fc2B P12 > - P16 -10G-> fc2A P30 > - P17 -10G-> fc3B P12 > - P18 -10G-> fc3A P30 > - P19 -10G-> lc5-9A/P3 > - P20 -10G-> lc5-9B/P3 > - P21 -10G-> lc5-9B/P2 > - P22 -10G-> lc5-9B/P1 > - P23 -10G-> lc5-9A/P2 > - P24 -10G-> lc5-9A/P1 > - P25 -10G-> lc5-10A/P3 > - P26 -10G-> lc5-10B/P3 > - P27 -10G-> lc5-10B/P2 > - P28 -10G-> lc5-10B/P1 > - P29 -10G-> lc5-10A/P2 > - P30 -10G-> lc5-10A/P1 > - P31 -10G-> lc5-11A/P1 > - P32 -10G-> lc5-11A/P2 > - P33 -10G-> lc5-11B/P1 > - P34 -10G-> 
lc5-11B/P2 > - P35 -10G-> lc5-11B/P3 > - P36 -10G-> lc5-11A/P3 > + P1 -10G-> fc8A P32 > + P2 -10G-> fc8B P8 > + P3 -10G-> fc7A P32 > + P4 -10G-> fc7B P8 > + P5 -10G-> fc6A P32 > + P6 -10G-> fc6B P8 > + P7 -10G-> fc5A P32 > + P8 -10G-> fc5B P8 > + P9 -10G-> fc4A P32 > + P10 -10G-> fc4B P8 > + P11 -10G-> fc3A P32 > + P12 -10G-> fc3B P8 > + P13 -10G-> fc0B P8 > + P14 -10G-> fc0A P32 > + P15 -10G-> fc1B P8 > + P16 -10G-> fc1A P32 > + P17 -10G-> fc2B P8 > + P18 -10G-> fc2A P32 > + P19 -10G-> lc5-9B/P3 > + P20 -10G-> lc5-9A/P3 > + P21 -10G-> lc5-9A/P2 > + P22 -10G-> lc5-9A/P1 > + P23 -10G-> lc5-9B/P2 > + P24 -10G-> lc5-9B/P1 > + P25 -10G-> lc5-10B/P3 > + P26 -10G-> lc5-10A/P3 > + P27 -10G-> lc5-10A/P2 > + P28 -10G-> lc5-10A/P1 > + P29 -10G-> lc5-10B/P2 > + P30 -10G-> lc5-10B/P1 > + P31 -10G-> lc5-11B/P1 > + P32 -10G-> lc5-11B/P2 > + P33 -10G-> lc5-11A/P1 > + P34 -10G-> lc5-11A/P2 > + P35 -10G-> lc5-11A/P3 > + P36 -10G-> lc5-11B/P3 > > SUBSYSTEM LEAF lc6A > - P1 -10G-> fc9B P11 > - P2 -10G-> fc9A P14 > - P3 -10G-> fc8B P11 > - P4 -10G-> fc8A P14 > - P5 -10G-> fc7B P11 > - P6 -10G-> fc7A P14 > - P7 -10G-> fc6B P11 > - P8 -10G-> fc6A P14 > - P9 -10G-> fc5B P11 > - P10 -10G-> fc5A P14 > - P11 -10G-> fc4B P11 > - P12 -10G-> fc4A P14 > - P13 -10G-> fc1A P14 > - P14 -10G-> fc1B P11 > - P15 -10G-> fc2A P14 > - P16 -10G-> fc2B P11 > - P17 -10G-> fc3A P14 > - P18 -10G-> fc3B P11 > - P19 -10G-> lc6-0A/P3 > - P20 -10G-> lc6-0B/P3 > - P21 -10G-> lc6-0B/P2 > - P22 -10G-> lc6-0B/P1 > - P23 -10G-> lc6-0A/P2 > - P24 -10G-> lc6-0A/P1 > - P25 -10G-> lc6-1A/P3 > - P26 -10G-> lc6-1B/P3 > - P27 -10G-> lc6-1B/P2 > - P28 -10G-> lc6-1B/P1 > - P29 -10G-> lc6-1A/P2 > - P30 -10G-> lc6-1A/P1 > - P31 -10G-> lc6-2A/P1 > - P32 -10G-> lc6-2A/P2 > - P33 -10G-> lc6-2B/P1 > - P34 -10G-> lc6-2B/P2 > - P35 -10G-> lc6-2B/P3 > - P36 -10G-> lc6-2A/P3 > + P1 -10G-> fc8B P7 > + P2 -10G-> fc8A P12 > + P3 -10G-> fc7B P7 > + P4 -10G-> fc7A P12 > + P5 -10G-> fc6B P7 > + P6 -10G-> fc6A P12 > + P7 -10G-> fc5B P7 > + P8 
-10G-> fc5A P12 > + P9 -10G-> fc4B P7 > + P10 -10G-> fc4A P12 > + P11 -10G-> fc3B P7 > + P12 -10G-> fc3A P12 > + P13 -10G-> fc0A P12 > + P14 -10G-> fc0B P7 > + P15 -10G-> fc1A P12 > + P16 -10G-> fc1B P7 > + P17 -10G-> fc2A P12 > + P18 -10G-> fc2B P7 > + P19 -10G-> lc6-0B/P3 > + P20 -10G-> lc6-0A/P3 > + P21 -10G-> lc6-0A/P2 > + P22 -10G-> lc6-0A/P1 > + P23 -10G-> lc6-0B/P2 > + P24 -10G-> lc6-0B/P1 > + P25 -10G-> lc6-1B/P3 > + P26 -10G-> lc6-1A/P3 > + P27 -10G-> lc6-1A/P2 > + P28 -10G-> lc6-1A/P1 > + P29 -10G-> lc6-1B/P2 > + P30 -10G-> lc6-1B/P1 > + P31 -10G-> lc6-2B/P1 > + P32 -10G-> lc6-2B/P2 > + P33 -10G-> lc6-2A/P1 > + P34 -10G-> lc6-2A/P2 > + P35 -10G-> lc6-2A/P3 > + P36 -10G-> lc6-2B/P3 > > SUBSYSTEM LEAF lc6B > - P1 -10G-> fc9A P13 > - P2 -10G-> fc9B P10 > - P3 -10G-> fc8A P13 > - P4 -10G-> fc8B P10 > - P5 -10G-> fc7A P13 > - P6 -10G-> fc7B P10 > - P7 -10G-> fc6A P13 > - P8 -10G-> fc6B P10 > - P9 -10G-> fc5A P13 > - P10 -10G-> fc5B P10 > - P11 -10G-> fc4A P13 > - P12 -10G-> fc4B P10 > - P13 -10G-> fc1B P10 > - P14 -10G-> fc1A P13 > - P15 -10G-> fc2B P10 > - P16 -10G-> fc2A P13 > - P17 -10G-> fc3B P10 > - P18 -10G-> fc3A P13 > - P19 -10G-> lc6-3A/P3 > - P20 -10G-> lc6-3B/P3 > - P21 -10G-> lc6-3B/P2 > - P22 -10G-> lc6-3B/P1 > - P23 -10G-> lc6-3A/P2 > - P24 -10G-> lc6-3A/P1 > - P25 -10G-> lc6-4A/P3 > - P26 -10G-> lc6-4B/P3 > - P27 -10G-> lc6-4B/P2 > - P28 -10G-> lc6-4B/P1 > - P29 -10G-> lc6-4A/P2 > - P30 -10G-> lc6-4A/P1 > - P31 -10G-> lc6-5A/P1 > - P32 -10G-> lc6-5A/P2 > - P33 -10G-> lc6-5B/P1 > - P34 -10G-> lc6-5B/P2 > - P35 -10G-> lc6-5B/P3 > - P36 -10G-> lc6-5A/P3 > + P1 -10G-> fc8A P17 > + P2 -10G-> fc8B P6 > + P3 -10G-> fc7A P17 > + P4 -10G-> fc7B P6 > + P5 -10G-> fc6A P17 > + P6 -10G-> fc6B P6 > + P7 -10G-> fc5A P17 > + P8 -10G-> fc5B P6 > + P9 -10G-> fc4A P17 > + P10 -10G-> fc4B P6 > + P11 -10G-> fc3A P17 > + P12 -10G-> fc3B P6 > + P13 -10G-> fc0B P6 > + P14 -10G-> fc0A P17 > + P15 -10G-> fc1B P6 > + P16 -10G-> fc1A P17 > + P17 -10G-> fc2B P6 > + P18 
-10G-> fc2A P17 > + P19 -10G-> lc6-3B/P3 > + P20 -10G-> lc6-3A/P3 > + P21 -10G-> lc6-3A/P2 > + P22 -10G-> lc6-3A/P1 > + P23 -10G-> lc6-3B/P2 > + P24 -10G-> lc6-3B/P1 > + P25 -10G-> lc6-4B/P3 > + P26 -10G-> lc6-4A/P3 > + P27 -10G-> lc6-4A/P2 > + P28 -10G-> lc6-4A/P1 > + P29 -10G-> lc6-4B/P2 > + P30 -10G-> lc6-4B/P1 > + P31 -10G-> lc6-5B/P1 > + P32 -10G-> lc6-5B/P2 > + P33 -10G-> lc6-5A/P1 > + P34 -10G-> lc6-5A/P2 > + P35 -10G-> lc6-5A/P3 > + P36 -10G-> lc6-5B/P3 > > SUBSYSTEM LEAF lc6C > - P1 -10G-> fc9B P18 > - P2 -10G-> fc9A P31 > - P3 -10G-> fc8B P18 > - P4 -10G-> fc8A P31 > - P5 -10G-> fc7B P18 > - P6 -10G-> fc7A P31 > - P7 -10G-> fc6B P18 > - P8 -10G-> fc6A P31 > - P9 -10G-> fc5B P18 > - P10 -10G-> fc5A P31 > - P11 -10G-> fc4B P18 > - P12 -10G-> fc4A P31 > - P13 -10G-> fc1A P31 > - P14 -10G-> fc1B P18 > - P15 -10G-> fc2A P31 > - P16 -10G-> fc2B P18 > - P17 -10G-> fc3A P31 > - P18 -10G-> fc3B P18 > - P19 -10G-> lc6-6A/P3 > - P20 -10G-> lc6-6B/P3 > - P21 -10G-> lc6-6B/P2 > - P22 -10G-> lc6-6B/P1 > - P23 -10G-> lc6-6A/P2 > - P24 -10G-> lc6-6A/P1 > - P25 -10G-> lc6-7A/P3 > - P26 -10G-> lc6-7B/P3 > - P27 -10G-> lc6-7B/P2 > - P28 -10G-> lc6-7B/P1 > - P29 -10G-> lc6-7A/P2 > - P30 -10G-> lc6-7A/P1 > - P31 -10G-> lc6-8A/P1 > - P32 -10G-> lc6-8A/P2 > - P33 -10G-> lc6-8B/P1 > - P34 -10G-> lc6-8B/P2 > - P35 -10G-> lc6-8B/P3 > - P36 -10G-> lc6-8A/P3 > + P1 -10G-> fc8B P9 > + P2 -10G-> fc8A P16 > + P3 -10G-> fc7B P9 > + P4 -10G-> fc7A P16 > + P5 -10G-> fc6B P9 > + P6 -10G-> fc6A P16 > + P7 -10G-> fc5B P9 > + P8 -10G-> fc5A P16 > + P9 -10G-> fc4B P9 > + P10 -10G-> fc4A P16 > + P11 -10G-> fc3B P9 > + P12 -10G-> fc3A P16 > + P13 -10G-> fc0A P16 > + P14 -10G-> fc0B P9 > + P15 -10G-> fc1A P16 > + P16 -10G-> fc1B P9 > + P17 -10G-> fc2A P16 > + P18 -10G-> fc2B P9 > + P19 -10G-> lc6-6B/P3 > + P20 -10G-> lc6-6A/P3 > + P21 -10G-> lc6-6A/P2 > + P22 -10G-> lc6-6A/P1 > + P23 -10G-> lc6-6B/P2 > + P24 -10G-> lc6-6B/P1 > + P25 -10G-> lc6-7B/P3 > + P26 -10G-> lc6-7A/P3 > + P27 -10G-> 
lc6-7A/P2 > + P28 -10G-> lc6-7A/P1 > + P29 -10G-> lc6-7B/P2 > + P30 -10G-> lc6-7B/P1 > + P31 -10G-> lc6-8B/P1 > + P32 -10G-> lc6-8B/P2 > + P33 -10G-> lc6-8A/P1 > + P34 -10G-> lc6-8A/P2 > + P35 -10G-> lc6-8A/P3 > + P36 -10G-> lc6-8B/P3 > > SUBSYSTEM LEAF lc6D > - P1 -10G-> fc9A P32 > - P2 -10G-> fc9B P8 > - P3 -10G-> fc8A P32 > - P4 -10G-> fc8B P8 > - P5 -10G-> fc7A P32 > - P6 -10G-> fc7B P8 > - P7 -10G-> fc6A P32 > - P8 -10G-> fc6B P8 > - P9 -10G-> fc5A P32 > - P10 -10G-> fc5B P8 > - P11 -10G-> fc4A P32 > - P12 -10G-> fc4B P8 > - P13 -10G-> fc1B P8 > - P14 -10G-> fc1A P32 > - P15 -10G-> fc2B P8 > - P16 -10G-> fc2A P32 > - P17 -10G-> fc3B P8 > - P18 -10G-> fc3A P32 > - P19 -10G-> lc6-9A/P3 > - P20 -10G-> lc6-9B/P3 > - P21 -10G-> lc6-9B/P2 > - P22 -10G-> lc6-9B/P1 > - P23 -10G-> lc6-9A/P2 > - P24 -10G-> lc6-9A/P1 > - P25 -10G-> lc6-10A/P3 > - P26 -10G-> lc6-10B/P3 > - P27 -10G-> lc6-10B/P2 > - P28 -10G-> lc6-10B/P1 > - P29 -10G-> lc6-10A/P2 > - P30 -10G-> lc6-10A/P1 > - P31 -10G-> lc6-11A/P1 > - P32 -10G-> lc6-11A/P2 > - P33 -10G-> lc6-11B/P1 > - P34 -10G-> lc6-11B/P2 > - P35 -10G-> lc6-11B/P3 > - P36 -10G-> lc6-11A/P3 > + P1 -10G-> fc8A P15 > + P2 -10G-> fc8B P5 > + P3 -10G-> fc7A P15 > + P4 -10G-> fc7B P5 > + P5 -10G-> fc6A P15 > + P6 -10G-> fc6B P5 > + P7 -10G-> fc5A P15 > + P8 -10G-> fc5B P5 > + P9 -10G-> fc4A P15 > + P10 -10G-> fc4B P5 > + P11 -10G-> fc3A P15 > + P12 -10G-> fc3B P5 > + P13 -10G-> fc0B P5 > + P14 -10G-> fc0A P15 > + P15 -10G-> fc1B P5 > + P16 -10G-> fc1A P15 > + P17 -10G-> fc2B P5 > + P18 -10G-> fc2A P15 > + P19 -10G-> lc6-9B/P3 > + P20 -10G-> lc6-9A/P3 > + P21 -10G-> lc6-9A/P2 > + P22 -10G-> lc6-9A/P1 > + P23 -10G-> lc6-9B/P2 > + P24 -10G-> lc6-9B/P1 > + P25 -10G-> lc6-10B/P3 > + P26 -10G-> lc6-10A/P3 > + P27 -10G-> lc6-10A/P2 > + P28 -10G-> lc6-10A/P1 > + P29 -10G-> lc6-10B/P2 > + P30 -10G-> lc6-10B/P1 > + P31 -10G-> lc6-11B/P1 > + P32 -10G-> lc6-11B/P2 > + P33 -10G-> lc6-11A/P1 > + P34 -10G-> lc6-11A/P2 > + P35 -10G-> lc6-11A/P3 > + P36 -10G-> 
lc6-11B/P3 > > SUBSYSTEM LEAF lc7A > - P1 -10G-> fc9B P7 > - P2 -10G-> fc9A P12 > - P3 -10G-> fc8B P7 > - P4 -10G-> fc8A P12 > - P5 -10G-> fc7B P7 > - P6 -10G-> fc7A P12 > - P7 -10G-> fc6B P7 > - P8 -10G-> fc6A P12 > - P9 -10G-> fc5B P7 > - P10 -10G-> fc5A P12 > - P11 -10G-> fc4B P7 > - P12 -10G-> fc4A P12 > - P13 -10G-> fc1A P12 > - P14 -10G-> fc1B P7 > - P15 -10G-> fc2A P12 > - P16 -10G-> fc2B P7 > - P17 -10G-> fc3A P12 > - P18 -10G-> fc3B P7 > - P19 -10G-> lc7-0A/P3 > - P20 -10G-> lc7-0B/P3 > - P21 -10G-> lc7-0B/P2 > - P22 -10G-> lc7-0B/P1 > - P23 -10G-> lc7-0A/P2 > - P24 -10G-> lc7-0A/P1 > - P25 -10G-> lc7-1A/P3 > - P26 -10G-> lc7-1B/P3 > - P27 -10G-> lc7-1B/P2 > - P28 -10G-> lc7-1B/P1 > - P29 -10G-> lc7-1A/P2 > - P30 -10G-> lc7-1A/P1 > - P31 -10G-> lc7-2A/P1 > - P32 -10G-> lc7-2A/P2 > - P33 -10G-> lc7-2B/P1 > - P34 -10G-> lc7-2B/P2 > - P35 -10G-> lc7-2B/P3 > - P36 -10G-> lc7-2A/P3 > + P1 -10G-> fc8B P2 > + P2 -10G-> fc8A P8 > + P3 -10G-> fc7B P2 > + P4 -10G-> fc7A P8 > + P5 -10G-> fc6B P2 > + P6 -10G-> fc6A P8 > + P7 -10G-> fc5B P2 > + P8 -10G-> fc5A P8 > + P9 -10G-> fc4B P2 > + P10 -10G-> fc4A P8 > + P11 -10G-> fc3B P2 > + P12 -10G-> fc3A P8 > + P13 -10G-> fc0A P8 > + P14 -10G-> fc0B P2 > + P15 -10G-> fc1A P8 > + P16 -10G-> fc1B P2 > + P17 -10G-> fc2A P8 > + P18 -10G-> fc2B P2 > + P19 -10G-> lc7-0B/P3 > + P20 -10G-> lc7-0A/P3 > + P21 -10G-> lc7-0A/P2 > + P22 -10G-> lc7-0A/P1 > + P23 -10G-> lc7-0B/P2 > + P24 -10G-> lc7-0B/P1 > + P25 -10G-> lc7-1B/P3 > + P26 -10G-> lc7-1A/P3 > + P27 -10G-> lc7-1A/P2 > + P28 -10G-> lc7-1A/P1 > + P29 -10G-> lc7-1B/P2 > + P30 -10G-> lc7-1B/P1 > + P31 -10G-> lc7-2B/P1 > + P32 -10G-> lc7-2B/P2 > + P33 -10G-> lc7-2A/P1 > + P34 -10G-> lc7-2A/P2 > + P35 -10G-> lc7-2A/P3 > + P36 -10G-> lc7-2B/P3 > > SUBSYSTEM LEAF lc7B > - P1 -10G-> fc9A P17 > - P2 -10G-> fc9B P6 > - P3 -10G-> fc8A P17 > - P4 -10G-> fc8B P6 > - P5 -10G-> fc7A P17 > - P6 -10G-> fc7B P6 > - P7 -10G-> fc6A P17 > - P8 -10G-> fc6B P6 > - P9 -10G-> fc5A P17 > - P10 -10G-> 
fc5B P6 > - P11 -10G-> fc4A P17 > - P12 -10G-> fc4B P6 > - P13 -10G-> fc1B P6 > - P14 -10G-> fc1A P17 > - P15 -10G-> fc2B P6 > - P16 -10G-> fc2A P17 > - P17 -10G-> fc3B P6 > - P18 -10G-> fc3A P17 > - P19 -10G-> lc7-3A/P3 > - P20 -10G-> lc7-3B/P3 > - P21 -10G-> lc7-3B/P2 > - P22 -10G-> lc7-3B/P1 > - P23 -10G-> lc7-3A/P2 > - P24 -10G-> lc7-3A/P1 > - P25 -10G-> lc7-4A/P3 > - P26 -10G-> lc7-4B/P3 > - P27 -10G-> lc7-4B/P2 > - P28 -10G-> lc7-4B/P1 > - P29 -10G-> lc7-4A/P2 > - P30 -10G-> lc7-4A/P1 > - P31 -10G-> lc7-5A/P1 > - P32 -10G-> lc7-5A/P2 > - P33 -10G-> lc7-5B/P1 > - P34 -10G-> lc7-5B/P2 > - P35 -10G-> lc7-5B/P3 > - P36 -10G-> lc7-5A/P3 > + P1 -10G-> fc8A P11 > + P2 -10G-> fc8B P3 > + P3 -10G-> fc7A P11 > + P4 -10G-> fc7B P3 > + P5 -10G-> fc6A P11 > + P6 -10G-> fc6B P3 > + P7 -10G-> fc5A P11 > + P8 -10G-> fc5B P3 > + P9 -10G-> fc4A P11 > + P10 -10G-> fc4B P3 > + P11 -10G-> fc3A P11 > + P12 -10G-> fc3B P3 > + P13 -10G-> fc0B P3 > + P14 -10G-> fc0A P11 > + P15 -10G-> fc1B P3 > + P16 -10G-> fc1A P11 > + P17 -10G-> fc2B P3 > + P18 -10G-> fc2A P11 > + P19 -10G-> lc7-3B/P3 > + P20 -10G-> lc7-3A/P3 > + P21 -10G-> lc7-3A/P2 > + P22 -10G-> lc7-3A/P1 > + P23 -10G-> lc7-3B/P2 > + P24 -10G-> lc7-3B/P1 > + P25 -10G-> lc7-4B/P3 > + P26 -10G-> lc7-4A/P3 > + P27 -10G-> lc7-4A/P2 > + P28 -10G-> lc7-4A/P1 > + P29 -10G-> lc7-4B/P2 > + P30 -10G-> lc7-4B/P1 > + P31 -10G-> lc7-5B/P1 > + P32 -10G-> lc7-5B/P2 > + P33 -10G-> lc7-5A/P1 > + P34 -10G-> lc7-5A/P2 > + P35 -10G-> lc7-5A/P3 > + P36 -10G-> lc7-5B/P3 > > SUBSYSTEM LEAF lc7C > - P1 -10G-> fc9B P9 > - P2 -10G-> fc9A P16 > - P3 -10G-> fc8B P9 > - P4 -10G-> fc8A P16 > - P5 -10G-> fc7B P9 > - P6 -10G-> fc7A P16 > - P7 -10G-> fc6B P9 > - P8 -10G-> fc6A P16 > - P9 -10G-> fc5B P9 > - P10 -10G-> fc5A P16 > - P11 -10G-> fc4B P9 > - P12 -10G-> fc4A P16 > - P13 -10G-> fc1A P16 > - P14 -10G-> fc1B P9 > - P15 -10G-> fc2A P16 > - P16 -10G-> fc2B P9 > - P17 -10G-> fc3A P16 > - P18 -10G-> fc3B P9 > - P19 -10G-> lc7-6A/P3 > - P20 -10G-> lc7-6B/P3 > 
- P21 -10G-> lc7-6B/P2 > - P22 -10G-> lc7-6B/P1 > - P23 -10G-> lc7-6A/P2 > - P24 -10G-> lc7-6A/P1 > - P25 -10G-> lc7-7A/P3 > - P26 -10G-> lc7-7B/P3 > - P27 -10G-> lc7-7B/P2 > - P28 -10G-> lc7-7B/P1 > - P29 -10G-> lc7-7A/P2 > - P30 -10G-> lc7-7A/P1 > - P31 -10G-> lc7-8A/P1 > - P32 -10G-> lc7-8A/P2 > - P33 -10G-> lc7-8B/P1 > - P34 -10G-> lc7-8B/P2 > - P35 -10G-> lc7-8B/P3 > - P36 -10G-> lc7-8A/P3 > + P1 -10G-> fc8B P4 > + P2 -10G-> fc8A P10 > + P3 -10G-> fc7B P4 > + P4 -10G-> fc7A P10 > + P5 -10G-> fc6B P4 > + P6 -10G-> fc6A P10 > + P7 -10G-> fc5B P4 > + P8 -10G-> fc5A P10 > + P9 -10G-> fc4B P4 > + P10 -10G-> fc4A P10 > + P11 -10G-> fc3B P4 > + P12 -10G-> fc3A P10 > + P13 -10G-> fc0A P10 > + P14 -10G-> fc0B P4 > + P15 -10G-> fc1A P10 > + P16 -10G-> fc1B P4 > + P17 -10G-> fc2A P10 > + P18 -10G-> fc2B P4 > + P19 -10G-> lc7-6B/P3 > + P20 -10G-> lc7-6A/P3 > + P21 -10G-> lc7-6A/P2 > + P22 -10G-> lc7-6A/P1 > + P23 -10G-> lc7-6B/P2 > + P24 -10G-> lc7-6B/P1 > + P25 -10G-> lc7-7B/P3 > + P26 -10G-> lc7-7A/P3 > + P27 -10G-> lc7-7A/P2 > + P28 -10G-> lc7-7A/P1 > + P29 -10G-> lc7-7B/P2 > + P30 -10G-> lc7-7B/P1 > + P31 -10G-> lc7-8B/P1 > + P32 -10G-> lc7-8B/P2 > + P33 -10G-> lc7-8A/P1 > + P34 -10G-> lc7-8A/P2 > + P35 -10G-> lc7-8A/P3 > + P36 -10G-> lc7-8B/P3 > > SUBSYSTEM LEAF lc7D > - P1 -10G-> fc9A P15 > - P2 -10G-> fc9B P5 > - P3 -10G-> fc8A P15 > - P4 -10G-> fc8B P5 > - P5 -10G-> fc7A P15 > - P6 -10G-> fc7B P5 > - P7 -10G-> fc6A P15 > - P8 -10G-> fc6B P5 > - P9 -10G-> fc5A P15 > - P10 -10G-> fc5B P5 > - P11 -10G-> fc4A P15 > - P12 -10G-> fc4B P5 > - P13 -10G-> fc1B P5 > - P14 -10G-> fc1A P15 > - P15 -10G-> fc2B P5 > - P16 -10G-> fc2A P15 > - P17 -10G-> fc3B P5 > - P18 -10G-> fc3A P15 > - P19 -10G-> lc7-9A/P3 > - P20 -10G-> lc7-9B/P3 > - P21 -10G-> lc7-9B/P2 > - P22 -10G-> lc7-9B/P1 > - P23 -10G-> lc7-9A/P2 > - P24 -10G-> lc7-9A/P1 > - P25 -10G-> lc7-10A/P3 > - P26 -10G-> lc7-10B/P3 > - P27 -10G-> lc7-10B/P2 > - P28 -10G-> lc7-10B/P1 > - P29 -10G-> lc7-10A/P2 > - P30 -10G-> 
lc7-10A/P1 > - P31 -10G-> lc7-11A/P1 > - P32 -10G-> lc7-11A/P2 > - P33 -10G-> lc7-11B/P1 > - P34 -10G-> lc7-11B/P2 > - P35 -10G-> lc7-11B/P3 > - P36 -10G-> lc7-11A/P3 > + P1 -10G-> fc8A P18 > + P2 -10G-> fc8B P1 > + P3 -10G-> fc7A P18 > + P4 -10G-> fc7B P1 > + P5 -10G-> fc6A P18 > + P6 -10G-> fc6B P1 > + P7 -10G-> fc5A P18 > + P8 -10G-> fc5B P1 > + P9 -10G-> fc4A P18 > + P10 -10G-> fc4B P1 > + P11 -10G-> fc3A P18 > + P12 -10G-> fc3B P1 > + P13 -10G-> fc0B P1 > + P14 -10G-> fc0A P18 > + P15 -10G-> fc1B P1 > + P16 -10G-> fc1A P18 > + P17 -10G-> fc2B P1 > + P18 -10G-> fc2A P18 > + P19 -10G-> lc7-9B/P3 > + P20 -10G-> lc7-9A/P3 > + P21 -10G-> lc7-9A/P2 > + P22 -10G-> lc7-9A/P1 > + P23 -10G-> lc7-9B/P2 > + P24 -10G-> lc7-9B/P1 > + P25 -10G-> lc7-10B/P3 > + P26 -10G-> lc7-10A/P3 > + P27 -10G-> lc7-10A/P2 > + P28 -10G-> lc7-10A/P1 > + P29 -10G-> lc7-10B/P2 > + P30 -10G-> lc7-10B/P1 > + P31 -10G-> lc7-11B/P1 > + P32 -10G-> lc7-11B/P2 > + P33 -10G-> lc7-11A/P1 > + P34 -10G-> lc7-11A/P2 > + P35 -10G-> lc7-11A/P3 > + P36 -10G-> lc7-11B/P3 > > SUBSYSTEM LEAF lc8A > - P1 -10G-> fc9B P2 > - P2 -10G-> fc9A P8 > - P3 -10G-> fc8B P2 > - P4 -10G-> fc8A P8 > - P5 -10G-> fc7B P2 > - P6 -10G-> fc7A P8 > - P7 -10G-> fc6B P2 > - P8 -10G-> fc6A P8 > - P9 -10G-> fc5B P2 > - P10 -10G-> fc5A P8 > - P11 -10G-> fc4B P2 > - P12 -10G-> fc4A P8 > - P13 -10G-> fc1A P8 > - P14 -10G-> fc1B P2 > - P15 -10G-> fc2A P8 > - P16 -10G-> fc2B P2 > - P17 -10G-> fc3A P8 > - P18 -10G-> fc3B P2 > - P19 -10G-> lc8-0A/P3 > - P20 -10G-> lc8-0B/P3 > - P21 -10G-> lc8-0B/P2 > - P22 -10G-> lc8-0B/P1 > - P23 -10G-> lc8-0A/P2 > - P24 -10G-> lc8-0A/P1 > - P25 -10G-> lc8-1A/P3 > - P26 -10G-> lc8-1B/P3 > - P27 -10G-> lc8-1B/P2 > - P28 -10G-> lc8-1B/P1 > - P29 -10G-> lc8-1A/P2 > - P30 -10G-> lc8-1A/P1 > - P31 -10G-> lc8-2A/P1 > - P32 -10G-> lc8-2A/P2 > - P33 -10G-> lc8-2B/P1 > - P34 -10G-> lc8-2B/P2 > - P35 -10G-> lc8-2B/P3 > - P36 -10G-> lc8-2A/P3 > + P1 -10G-> fc8B P21 > + P2 -10G-> fc8A P5 > + P3 -10G-> fc7B P21 > + P4 
-10G-> fc7A P5 > + P5 -10G-> fc6B P21 > + P6 -10G-> fc6A P5 > + P7 -10G-> fc5B P21 > + P8 -10G-> fc5A P5 > + P9 -10G-> fc4B P21 > + P10 -10G-> fc4A P5 > + P11 -10G-> fc3B P21 > + P12 -10G-> fc3A P5 > + P13 -10G-> fc0A P5 > + P14 -10G-> fc0B P21 > + P15 -10G-> fc1A P5 > + P16 -10G-> fc1B P21 > + P17 -10G-> fc2A P5 > + P18 -10G-> fc2B P21 > + P19 -10G-> lc8-0B/P3 > + P20 -10G-> lc8-0A/P3 > + P21 -10G-> lc8-0A/P2 > + P22 -10G-> lc8-0A/P1 > + P23 -10G-> lc8-0B/P2 > + P24 -10G-> lc8-0B/P1 > + P25 -10G-> lc8-1B/P3 > + P26 -10G-> lc8-1A/P3 > + P27 -10G-> lc8-1A/P2 > + P28 -10G-> lc8-1A/P1 > + P29 -10G-> lc8-1B/P2 > + P30 -10G-> lc8-1B/P1 > + P31 -10G-> lc8-2B/P1 > + P32 -10G-> lc8-2B/P2 > + P33 -10G-> lc8-2A/P1 > + P34 -10G-> lc8-2A/P2 > + P35 -10G-> lc8-2A/P3 > + P36 -10G-> lc8-2B/P3 > > SUBSYSTEM LEAF lc8B > - P1 -10G-> fc9A P11 > - P2 -10G-> fc9B P3 > - P3 -10G-> fc8A P11 > - P4 -10G-> fc8B P3 > - P5 -10G-> fc7A P11 > - P6 -10G-> fc7B P3 > - P7 -10G-> fc6A P11 > - P8 -10G-> fc6B P3 > - P9 -10G-> fc5A P11 > - P10 -10G-> fc5B P3 > - P11 -10G-> fc4A P11 > - P12 -10G-> fc4B P3 > - P13 -10G-> fc1B P3 > - P14 -10G-> fc1A P11 > - P15 -10G-> fc2B P3 > - P16 -10G-> fc2A P11 > - P17 -10G-> fc3B P3 > - P18 -10G-> fc3A P11 > - P19 -10G-> lc8-3A/P3 > - P20 -10G-> lc8-3B/P3 > - P21 -10G-> lc8-3B/P2 > - P22 -10G-> lc8-3B/P1 > - P23 -10G-> lc8-3A/P2 > - P24 -10G-> lc8-3A/P1 > - P25 -10G-> lc8-4A/P3 > - P26 -10G-> lc8-4B/P3 > - P27 -10G-> lc8-4B/P2 > - P28 -10G-> lc8-4B/P1 > - P29 -10G-> lc8-4A/P2 > - P30 -10G-> lc8-4A/P1 > - P31 -10G-> lc8-5A/P1 > - P32 -10G-> lc8-5A/P2 > - P33 -10G-> lc8-5B/P1 > - P34 -10G-> lc8-5B/P2 > - P35 -10G-> lc8-5B/P3 > - P36 -10G-> lc8-5A/P3 > + P1 -10G-> fc8A P7 > + P2 -10G-> fc8B P20 > + P3 -10G-> fc7A P7 > + P4 -10G-> fc7B P20 > + P5 -10G-> fc6A P7 > + P6 -10G-> fc6B P20 > + P7 -10G-> fc5A P7 > + P8 -10G-> fc5B P20 > + P9 -10G-> fc4A P7 > + P10 -10G-> fc4B P20 > + P11 -10G-> fc3A P7 > + P12 -10G-> fc3B P20 > + P13 -10G-> fc0B P20 > + P14 -10G-> fc0A P7 > 
+ P15 -10G-> fc1B P20 > + P16 -10G-> fc1A P7 > + P17 -10G-> fc2B P20 > + P18 -10G-> fc2A P7 > + P19 -10G-> lc8-3B/P3 > + P20 -10G-> lc8-3A/P3 > + P21 -10G-> lc8-3A/P2 > + P22 -10G-> lc8-3A/P1 > + P23 -10G-> lc8-3B/P2 > + P24 -10G-> lc8-3B/P1 > + P25 -10G-> lc8-4B/P3 > + P26 -10G-> lc8-4A/P3 > + P27 -10G-> lc8-4A/P2 > + P28 -10G-> lc8-4A/P1 > + P29 -10G-> lc8-4B/P2 > + P30 -10G-> lc8-4B/P1 > + P31 -10G-> lc8-5B/P1 > + P32 -10G-> lc8-5B/P2 > + P33 -10G-> lc8-5A/P1 > + P34 -10G-> lc8-5A/P2 > + P35 -10G-> lc8-5A/P3 > + P36 -10G-> lc8-5B/P3 > > SUBSYSTEM LEAF lc8C > - P1 -10G-> fc9B P4 > - P2 -10G-> fc9A P10 > - P3 -10G-> fc8B P4 > - P4 -10G-> fc8A P10 > - P5 -10G-> fc7B P4 > - P6 -10G-> fc7A P10 > - P7 -10G-> fc6B P4 > - P8 -10G-> fc6A P10 > - P9 -10G-> fc5B P4 > - P10 -10G-> fc5A P10 > - P11 -10G-> fc4B P4 > - P12 -10G-> fc4A P10 > - P13 -10G-> fc1A P10 > - P14 -10G-> fc1B P4 > - P15 -10G-> fc2A P10 > - P16 -10G-> fc2B P4 > - P17 -10G-> fc3A P10 > - P18 -10G-> fc3B P4 > - P19 -10G-> lc8-6A/P3 > - P20 -10G-> lc8-6B/P3 > - P21 -10G-> lc8-6B/P2 > - P22 -10G-> lc8-6B/P1 > - P23 -10G-> lc8-6A/P2 > - P24 -10G-> lc8-6A/P1 > - P25 -10G-> lc8-7A/P3 > - P26 -10G-> lc8-7B/P3 > - P27 -10G-> lc8-7B/P2 > - P28 -10G-> lc8-7B/P1 > - P29 -10G-> lc8-7A/P2 > - P30 -10G-> lc8-7A/P1 > - P31 -10G-> lc8-8A/P1 > - P32 -10G-> lc8-8A/P2 > - P33 -10G-> lc8-8B/P1 > - P34 -10G-> lc8-8B/P2 > - P35 -10G-> lc8-8B/P3 > - P36 -10G-> lc8-8A/P3 > + P1 -10G-> fc8B P19 > + P2 -10G-> fc8A P6 > + P3 -10G-> fc7B P19 > + P4 -10G-> fc7A P6 > + P5 -10G-> fc6B P19 > + P6 -10G-> fc6A P6 > + P7 -10G-> fc5B P19 > + P8 -10G-> fc5A P6 > + P9 -10G-> fc4B P19 > + P10 -10G-> fc4A P6 > + P11 -10G-> fc3B P19 > + P12 -10G-> fc3A P6 > + P13 -10G-> fc0A P6 > + P14 -10G-> fc0B P19 > + P15 -10G-> fc1A P6 > + P16 -10G-> fc1B P19 > + P17 -10G-> fc2A P6 > + P18 -10G-> fc2B P19 > + P19 -10G-> lc8-6B/P3 > + P20 -10G-> lc8-6A/P3 > + P21 -10G-> lc8-6A/P2 > + P22 -10G-> lc8-6A/P1 > + P23 -10G-> lc8-6B/P2 > + P24 -10G-> lc8-6B/P1 > + 
P25 -10G-> lc8-7B/P3 > + P26 -10G-> lc8-7A/P3 > + P27 -10G-> lc8-7A/P2 > + P28 -10G-> lc8-7A/P1 > + P29 -10G-> lc8-7B/P2 > + P30 -10G-> lc8-7B/P1 > + P31 -10G-> lc8-8B/P1 > + P32 -10G-> lc8-8B/P2 > + P33 -10G-> lc8-8A/P1 > + P34 -10G-> lc8-8A/P2 > + P35 -10G-> lc8-8A/P3 > + P36 -10G-> lc8-8B/P3 > > SUBSYSTEM LEAF lc8D > - P1 -10G-> fc9A P18 > - P2 -10G-> fc9B P1 > - P3 -10G-> fc8A P18 > - P4 -10G-> fc8B P1 > - P5 -10G-> fc7A P18 > - P6 -10G-> fc7B P1 > - P7 -10G-> fc6A P18 > - P8 -10G-> fc6B P1 > - P9 -10G-> fc5A P18 > - P10 -10G-> fc5B P1 > - P11 -10G-> fc4A P18 > - P12 -10G-> fc4B P1 > - P13 -10G-> fc1B P1 > - P14 -10G-> fc1A P18 > - P15 -10G-> fc2B P1 > - P16 -10G-> fc2A P18 > - P17 -10G-> fc3B P1 > - P18 -10G-> fc3A P18 > - P19 -10G-> lc8-9A/P3 > - P20 -10G-> lc8-9B/P3 > - P21 -10G-> lc8-9B/P2 > - P22 -10G-> lc8-9B/P1 > - P23 -10G-> lc8-9A/P2 > - P24 -10G-> lc8-9A/P1 > - P25 -10G-> lc8-10A/P3 > - P26 -10G-> lc8-10B/P3 > - P27 -10G-> lc8-10B/P2 > - P28 -10G-> lc8-10B/P1 > - P29 -10G-> lc8-10A/P2 > - P30 -10G-> lc8-10A/P1 > - P31 -10G-> lc8-11A/P1 > - P32 -10G-> lc8-11A/P2 > - P33 -10G-> lc8-11B/P1 > - P34 -10G-> lc8-11B/P2 > - P35 -10G-> lc8-11B/P3 > - P36 -10G-> lc8-11A/P3 > - > -SUBSYSTEM LEAF lc9A > - P1 -10G-> fc9B P21 > - P2 -10G-> fc9A P5 > - P3 -10G-> fc8B P21 > - P4 -10G-> fc8A P5 > - P5 -10G-> fc7B P21 > - P6 -10G-> fc7A P5 > - P7 -10G-> fc6B P21 > - P8 -10G-> fc6A P5 > - P9 -10G-> fc5B P21 > - P10 -10G-> fc5A P5 > - P11 -10G-> fc4B P21 > - P12 -10G-> fc4A P5 > - P13 -10G-> fc1A P5 > - P14 -10G-> fc1B P21 > - P15 -10G-> fc2A P5 > - P16 -10G-> fc2B P21 > - P17 -10G-> fc3A P5 > - P18 -10G-> fc3B P21 > - P19 -10G-> lc9-0A/P3 > - P20 -10G-> lc9-0B/P3 > - P21 -10G-> lc9-0B/P2 > - P22 -10G-> lc9-0B/P1 > - P23 -10G-> lc9-0A/P2 > - P24 -10G-> lc9-0A/P1 > - P25 -10G-> lc9-1A/P3 > - P26 -10G-> lc9-1B/P3 > - P27 -10G-> lc9-1B/P2 > - P28 -10G-> lc9-1B/P1 > - P29 -10G-> lc9-1A/P2 > - P30 -10G-> lc9-1A/P1 > - P31 -10G-> lc9-2A/P1 > - P32 -10G-> lc9-2A/P2 > - P33 
-10G-> lc9-2B/P1 > - P34 -10G-> lc9-2B/P2 > - P35 -10G-> lc9-2B/P3 > - P36 -10G-> lc9-2A/P3 > - > -SUBSYSTEM LEAF lc9B > - P1 -10G-> fc9A P7 > - P2 -10G-> fc9B P20 > - P3 -10G-> fc8A P7 > - P4 -10G-> fc8B P20 > - P5 -10G-> fc7A P7 > - P6 -10G-> fc7B P20 > - P7 -10G-> fc6A P7 > - P8 -10G-> fc6B P20 > - P9 -10G-> fc5A P7 > - P10 -10G-> fc5B P20 > - P11 -10G-> fc4A P7 > - P12 -10G-> fc4B P20 > - P13 -10G-> fc1B P20 > - P14 -10G-> fc1A P7 > - P15 -10G-> fc2B P20 > - P16 -10G-> fc2A P7 > - P17 -10G-> fc3B P20 > - P18 -10G-> fc3A P7 > - P19 -10G-> lc9-3A/P3 > - P20 -10G-> lc9-3B/P3 > - P21 -10G-> lc9-3B/P2 > - P22 -10G-> lc9-3B/P1 > - P23 -10G-> lc9-3A/P2 > - P24 -10G-> lc9-3A/P1 > - P25 -10G-> lc9-4A/P3 > - P26 -10G-> lc9-4B/P3 > - P27 -10G-> lc9-4B/P2 > - P28 -10G-> lc9-4B/P1 > - P29 -10G-> lc9-4A/P2 > - P30 -10G-> lc9-4A/P1 > - P31 -10G-> lc9-5A/P1 > - P32 -10G-> lc9-5A/P2 > - P33 -10G-> lc9-5B/P1 > - P34 -10G-> lc9-5B/P2 > - P35 -10G-> lc9-5B/P3 > - P36 -10G-> lc9-5A/P3 > - > -SUBSYSTEM LEAF lc9C > - P1 -10G-> fc9B P19 > - P2 -10G-> fc9A P6 > - P3 -10G-> fc8B P19 > - P4 -10G-> fc8A P6 > - P5 -10G-> fc7B P19 > - P6 -10G-> fc7A P6 > - P7 -10G-> fc6B P19 > - P8 -10G-> fc6A P6 > - P9 -10G-> fc5B P19 > - P10 -10G-> fc5A P6 > - P11 -10G-> fc4B P19 > - P12 -10G-> fc4A P6 > - P13 -10G-> fc1A P6 > - P14 -10G-> fc1B P19 > - P15 -10G-> fc2A P6 > - P16 -10G-> fc2B P19 > - P17 -10G-> fc3A P6 > - P18 -10G-> fc3B P19 > - P19 -10G-> lc9-6A/P3 > - P20 -10G-> lc9-6B/P3 > - P21 -10G-> lc9-6B/P2 > - P22 -10G-> lc9-6B/P1 > - P23 -10G-> lc9-6A/P2 > - P24 -10G-> lc9-6A/P1 > - P25 -10G-> lc9-7A/P3 > - P26 -10G-> lc9-7B/P3 > - P27 -10G-> lc9-7B/P2 > - P28 -10G-> lc9-7B/P1 > - P29 -10G-> lc9-7A/P2 > - P30 -10G-> lc9-7A/P1 > - P31 -10G-> lc9-8A/P1 > - P32 -10G-> lc9-8A/P2 > - P33 -10G-> lc9-8B/P1 > - P34 -10G-> lc9-8B/P2 > - P35 -10G-> lc9-8B/P3 > - P36 -10G-> lc9-8A/P3 > - > -SUBSYSTEM LEAF lc9D > - P1 -10G-> fc9A P9 > - P2 -10G-> fc9B P22 > - P3 -10G-> fc8A P9 > - P4 -10G-> fc8B P22 > - P5 
-10G-> fc7A P9 > - P6 -10G-> fc7B P22 > - P7 -10G-> fc6A P9 > - P8 -10G-> fc6B P22 > - P9 -10G-> fc5A P9 > - P10 -10G-> fc5B P22 > - P11 -10G-> fc4A P9 > - P12 -10G-> fc4B P22 > - P13 -10G-> fc1B P22 > - P14 -10G-> fc1A P9 > - P15 -10G-> fc2B P22 > - P16 -10G-> fc2A P9 > - P17 -10G-> fc3B P22 > - P18 -10G-> fc3A P9 > - P19 -10G-> lc9-9A/P3 > - P20 -10G-> lc9-9B/P3 > - P21 -10G-> lc9-9B/P2 > - P22 -10G-> lc9-9B/P1 > - P23 -10G-> lc9-9A/P2 > - P24 -10G-> lc9-9A/P1 > - P25 -10G-> lc9-10A/P3 > - P26 -10G-> lc9-10B/P3 > - P27 -10G-> lc9-10B/P2 > - P28 -10G-> lc9-10B/P1 > - P29 -10G-> lc9-10A/P2 > - P30 -10G-> lc9-10A/P1 > - P31 -10G-> lc9-11A/P1 > - P32 -10G-> lc9-11A/P2 > - P33 -10G-> lc9-11B/P1 > - P34 -10G-> lc9-11B/P2 > - P35 -10G-> lc9-11B/P3 > - P36 -10G-> lc9-11A/P3 > + P1 -10G-> fc8A P9 > + P2 -10G-> fc8B P22 > + P3 -10G-> fc7A P9 > + P4 -10G-> fc7B P22 > + P5 -10G-> fc6A P9 > + P6 -10G-> fc6B P22 > + P7 -10G-> fc5A P9 > + P8 -10G-> fc5B P22 > + P9 -10G-> fc4A P9 > + P10 -10G-> fc4B P22 > + P11 -10G-> fc3A P9 > + P12 -10G-> fc3B P22 > + P13 -10G-> fc0B P22 > + P14 -10G-> fc0A P9 > + P15 -10G-> fc1B P22 > + P16 -10G-> fc1A P9 > + P17 -10G-> fc2B P22 > + P18 -10G-> fc2A P9 > + P19 -10G-> lc8-9B/P3 > + P20 -10G-> lc8-9A/P3 > + P21 -10G-> lc8-9A/P2 > + P22 -10G-> lc8-9A/P1 > + P23 -10G-> lc8-9B/P2 > + P24 -10G-> lc8-9B/P1 > + P25 -10G-> lc8-10B/P3 > + P26 -10G-> lc8-10A/P3 > + P27 -10G-> lc8-10A/P2 > + P28 -10G-> lc8-10A/P1 > + P29 -10G-> lc8-10B/P2 > + P30 -10G-> lc8-10B/P1 > + P31 -10G-> lc8-11B/P1 > + P32 -10G-> lc8-11B/P2 > + P33 -10G-> lc8-11A/P1 > + P34 -10G-> lc8-11A/P2 > + P35 -10G-> lc8-11A/P3 > + P36 -10G-> lc8-11B/P3 > diff --git a/ibdm/ibnl/SUNDCS72QDR.ibnl b/ibdm/ibnl/SUNDCS72QDR.ibnl > index 1907ec3..fee233a 100644 > --- a/ibdm/ibnl/SUNDCS72QDR.ibnl > +++ b/ibdm/ibnl/SUNDCS72QDR.ibnl > @@ -176,24 +176,24 @@ SUBSYSTEM LEAF SW-D > P16 -10G-> SW-E P18 > P17 -10G-> SW-F P17 > P18 -10G-> SW-E P16 > - P19 -10G-> C-9A/P3 > - P20 -10G-> C-9B/P3 > - P21 -10G-> 
C-9B/P2 > - P22 -10G-> C-9B/P1 > - P23 -10G-> C-9A/P2 > - P24 -10G-> C-9A/P1 > - P25 -10G-> C-10A/P3 > - P26 -10G-> C-10B/P3 > - P27 -10G-> C-10B/P2 > - P28 -10G-> C-10B/P1 > - P29 -10G-> C-10A/P2 > - P30 -10G-> C-10A/P1 > - P31 -10G-> C-11A/P1 > - P32 -10G-> C-11A/P2 > - P33 -10G-> C-11B/P1 > - P34 -10G-> C-11B/P2 > - P35 -10G-> C-11B/P3 > - P36 -10G-> C-11A/P3 > + P19 -10G-> C-9B/P3 > + P20 -10G-> C-9A/P3 > + P21 -10G-> C-9A/P2 > + P22 -10G-> C-9A/P1 > + P23 -10G-> C-9B/P2 > + P24 -10G-> C-9B/P1 > + P25 -10G-> C-10B/P3 > + P26 -10G-> C-10A/P3 > + P27 -10G-> C-10A/P2 > + P28 -10G-> C-10A/P1 > + P29 -10G-> C-10B/P2 > + P30 -10G-> C-10B/P1 > + P31 -10G-> C-11B/P1 > + P32 -10G-> C-11B/P2 > + P33 -10G-> C-11A/P1 > + P34 -10G-> C-11A/P2 > + P35 -10G-> C-11A/P3 > + P36 -10G-> C-11B/P3 > > SUBSYSTEM LEAF SW-C > P1 -10G-> SW-F P9 > @@ -214,24 +214,24 @@ SUBSYSTEM LEAF SW-C > P16 -10G-> SW-E P24 > P17 -10G-> SW-F P23 > P18 -10G-> SW-E P22 > - P19 -10G-> C-6A/P3 > - P20 -10G-> C-6B/P3 > - P21 -10G-> C-6B/P2 > - P22 -10G-> C-6B/P1 > - P23 -10G-> C-6A/P2 > - P24 -10G-> C-6A/P1 > - P25 -10G-> C-7A/P3 > - P26 -10G-> C-7B/P3 > - P27 -10G-> C-7B/P2 > - P28 -10G-> C-7B/P1 > - P29 -10G-> C-7A/P2 > - P30 -10G-> C-7A/P1 > - P31 -10G-> C-8A/P1 > - P32 -10G-> C-8A/P2 > - P33 -10G-> C-8B/P1 > - P34 -10G-> C-8B/P2 > - P35 -10G-> C-8B/P3 > - P36 -10G-> C-8A/P3 > + P19 -10G-> C-6B/P3 > + P20 -10G-> C-6A/P3 > + P21 -10G-> C-6A/P2 > + P22 -10G-> C-6A/P1 > + P23 -10G-> C-6B/P2 > + P24 -10G-> C-6B/P1 > + P25 -10G-> C-7B/P3 > + P26 -10G-> C-7A/P3 > + P27 -10G-> C-7A/P2 > + P28 -10G-> C-7A/P1 > + P29 -10G-> C-7B/P2 > + P30 -10G-> C-7B/P1 > + P31 -10G-> C-8B/P1 > + P32 -10G-> C-8B/P2 > + P33 -10G-> C-8A/P1 > + P34 -10G-> C-8A/P2 > + P35 -10G-> C-8A/P3 > + P36 -10G-> C-8B/P3 > > SUBSYSTEM LEAF SW-B > P1 -10G-> SW-E P28 > @@ -252,24 +252,24 @@ SUBSYSTEM LEAF SW-B > P16 -10G-> SW-F P18 > P17 -10G-> SW-E P17 > P18 -10G-> SW-F P16 > - P19 -10G-> C-3A/P3 > - P20 -10G-> C-3B/P3 > - P21 -10G-> C-3B/P2 > 
- P22 -10G-> C-3B/P1 > - P23 -10G-> C-3A/P2 > - P24 -10G-> C-3A/P1 > - P25 -10G-> C-4A/P3 > - P26 -10G-> C-4B/P3 > - P27 -10G-> C-4B/P2 > - P28 -10G-> C-4B/P1 > - P29 -10G-> C-4A/P2 > - P30 -10G-> C-4A/P1 > - P31 -10G-> C-5A/P1 > - P32 -10G-> C-5A/P2 > - P33 -10G-> C-5B/P1 > - P34 -10G-> C-5B/P2 > - P35 -10G-> C-5B/P3 > - P36 -10G-> C-5A/P3 > + P19 -10G-> C-3B/P3 > + P20 -10G-> C-3A/P3 > + P21 -10G-> C-3A/P2 > + P22 -10G-> C-3A/P1 > + P23 -10G-> C-3B/P2 > + P24 -10G-> C-3B/P1 > + P25 -10G-> C-4B/P3 > + P26 -10G-> C-4A/P3 > + P27 -10G-> C-4A/P2 > + P28 -10G-> C-4A/P1 > + P29 -10G-> C-4B/P2 > + P30 -10G-> C-4B/P1 > + P31 -10G-> C-5B/P1 > + P32 -10G-> C-5B/P2 > + P33 -10G-> C-5A/P1 > + P34 -10G-> C-5A/P2 > + P35 -10G-> C-5A/P3 > + P36 -10G-> C-5B/P3 > > SUBSYSTEM LEAF SW-A > P1 -10G-> SW-E P9 > @@ -290,22 +290,22 @@ SUBSYSTEM LEAF SW-A > P16 -10G-> SW-F P24 > P17 -10G-> SW-E P23 > P18 -10G-> SW-F P22 > - P19 -10G-> C-0A/P3 > - P20 -10G-> C-0B/P3 > - P21 -10G-> C-0B/P2 > - P22 -10G-> C-0B/P1 > - P23 -10G-> C-0A/P2 > - P24 -10G-> C-0A/P1 > - P25 -10G-> C-1A/P3 > - P26 -10G-> C-1B/P3 > - P27 -10G-> C-1B/P2 > - P28 -10G-> C-1B/P1 > - P29 -10G-> C-1A/P2 > - P30 -10G-> C-1A/P1 > - P31 -10G-> C-2A/P1 > - P32 -10G-> C-2A/P2 > - P33 -10G-> C-2B/P1 > - P34 -10G-> C-2B/P2 > - P35 -10G-> C-2B/P3 > - P36 -10G-> C-2A/P3 > + P19 -10G-> C-0B/P3 > + P20 -10G-> C-0A/P3 > + P21 -10G-> C-0A/P2 > + P22 -10G-> C-0A/P1 > + P23 -10G-> C-0B/P2 > + P24 -10G-> C-0B/P1 > + P25 -10G-> C-1B/P3 > + P26 -10G-> C-1A/P3 > + P27 -10G-> C-1A/P2 > + P28 -10G-> C-1A/P1 > + P29 -10G-> C-1B/P2 > + P30 -10G-> C-1B/P1 > + P31 -10G-> C-2B/P1 > + P32 -10G-> C-2B/P2 > + P33 -10G-> C-2A/P1 > + P34 -10G-> C-2A/P2 > + P35 -10G-> C-2A/P3 > + P36 -10G-> C-2B/P3 > > > From monis at Voltaire.COM Mon Sep 21 02:28:16 2009 From: monis at Voltaire.COM (Moni Shoua) Date: Mon, 21 Sep 2009 12:28:16 +0300 Subject: [ofa-general] Re: [PATCH] IB/ipoib: Do not turn on carrier to a non active port In-Reply-To: References: 
<4AB20C6C.9090005@Voltaire.COM> Message-ID: <4AB74730.5030203@Voltaire.COM> Roland Dreier wrote: > > + if (ib_query_port(priv->ca, priv->port, &attr) || > > + attr.state != IB_PORT_ACTIVE) { > > + ipoib_dbg(priv, "wait with carrier until IB port is active\n"); > > + if (test_bit(IPOIB_FLAG_OPER_UP, &priv->flags)) > > + queue_delayed_work(ipoib_workqueue, &priv->carrier_on_task, HZ); > > + return; > > + } > > This queueing delayed work to poll the port state seems a bit odd to > me... we get an event when the port changes state anyway, right? So > can't we just turn the carrier on when we get an active event? > > - R. You're right. I've complicated things more than I needed to. The call to __ipoib_ib_dev_flush() from ipoib_event() will requeue the carrier_on_task on join completion of the broadcast group. I'll resend. Thanks. From monis at Voltaire.COM Mon Sep 21 02:34:06 2009 From: monis at Voltaire.COM (Moni Shoua) Date: Mon, 21 Sep 2009 12:34:06 +0300 Subject: [ofa-general] Re: [PATCH] IB/ipoib: Do not turn on carrier to a non active port In-Reply-To: References: <4AB20C6C.9090005@Voltaire.COM> Message-ID: <4AB7488E.6090700@Voltaire.COM> Roland Dreier wrote: > And by the way, this current patch has a deadlock I think: > > > @@ -724,6 +724,8 @@ int ipoib_ib_dev_down(struct net_device *dev, int flush) > > ipoib_dbg(priv, "downing ib_dev\n"); > > > > clear_bit(IPOIB_FLAG_OPER_UP, &priv->flags); > > + cancel_delayed_work(&priv->carrier_on_task); > > ipoib_ib_dev_down() is called with rtnl held but carrier_on_task() does > rtnl_lock(). So if carrier_on_task() is running but about to take the > rtnl when we try to do cancel_delayed_work() here, then it will wait > forever. > > I think using lockdep on a new enough kernel (2.6.30 or maybe 2.6.31) > will report workqueue / timer vs. lock deadlocks. > > - R. I may be missing something, but I don't see how ipoib_ib_dev_down() is called with rtnl held. Anyway, the new patch doesn't use delayed work.
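The event-driven approach agreed on in the exchange above (no polling; the carrier task simply declines to raise the carrier while the port is not active, and a later port event leads to a rejoin that runs the task again) can be sketched as a small userspace model. This is only an illustration under stated assumptions: every name below (ipoib_model, model_carrier_on_task, model_join_complete, model_port_event) is hypothetical and not a kernel API, the model ignores the ib_query_port() failure case and all locking, and it assumes the flush path drops the carrier before rejoining.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum port_state { PORT_DOWN, PORT_ACTIVE };

struct ipoib_model {
    enum port_state state; /* stands in for attr.state from ib_query_port() */
    bool joined;           /* broadcast group join has completed */
    bool carrier;          /* stands in for the netif carrier flag */
};

/* Models the carrier task: raise the carrier only if the port is active. */
static void model_carrier_on_task(struct ipoib_model *m)
{
    if (m->state != PORT_ACTIVE) {
        printf("wait with carrier until IB port is active\n");
        return; /* no requeue: a later port event triggers a rejoin */
    }
    m->carrier = true;
}

/* Models join completion of the broadcast group, which runs the task. */
static void model_join_complete(struct ipoib_model *m)
{
    m->joined = true;
    model_carrier_on_task(m);
}

/*
 * Models a port state event. Assumption: the flush drops the carrier and
 * rejoins. The join may succeed even when the port is down (for example
 * when the SM runs on the same port), so the task must check the state.
 */
static void model_port_event(struct ipoib_model *m, enum port_state new_state)
{
    m->state = new_state;
    m->carrier = false;
    m->joined = false;
    model_join_complete(m);
}
```

In this model a join that completes while the port is down leaves the carrier off, and the carrier comes up only once an event reports the port as active, which is the behaviour the discussion is aiming for without any delayed work.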
From monis at Voltaire.COM Mon Sep 21 02:42:16 2009 From: monis at Voltaire.COM (Moni Shoua) Date: Mon, 21 Sep 2009 12:42:16 +0300 Subject: [ofa-general] [PATCH V2] IB/ipoib: Do not turn on carrier to a non active port Message-ID: <4AB74A78.2080108@Voltaire.COM> This patch fixes https://bugs.openfabrics.org/show_bug.cgi?id=1726 Multicast join can succeed even if the IB port is down. This happens when OpenSM runs on the same port as the requesting port. IPoIB, on the other hand, calls netif_carrier_on() when the join succeeds, regardless of the state of the IB port. The result is an IPoIB interface in RUNNING state but without an active IB port to support it. If a bonding interface uses this IPoIB interface as a slave, it might not detect that this slave is almost useless, and failover functionality will be damaged. The fix checks the state of the IB port in the carrier_task before calling netif_carrier_on(). Signed-off-by: Moni Shoua --- drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c index 25874fc..9ace51d 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c @@ -362,12 +362,19 @@ void ipoib_mcast_carrier_on_task(struct work_struct *work) { struct ipoib_dev_priv *priv = container_of(work, struct ipoib_dev_priv, carrier_on_task); + struct ib_port_attr attr; /* * Take rtnl_lock to avoid racing with ipoib_stop() and * turning the carrier back on while a device is being * removed.
*/ + + if (ib_query_port(priv->ca, priv->port, &attr) || + attr.state != IB_PORT_ACTIVE) { + ipoib_dbg(priv, "wait with carrier until IB port is active\n"); + return; + } rtnl_lock(); netif_carrier_on(priv->dev); rtnl_unlock(); From vlad at lists.openfabrics.org Mon Sep 21 03:14:02 2009 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Mon, 21 Sep 2009 03:14:02 -0700 (PDT) Subject: [ofa-general] ofa_1_5_kernel 20090921-0200 daily build status Message-ID: <20090921101402.CF8DCE62048@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git git_branch: ofed_kernel_1_5 Common build parameters: Passed: Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.26 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.27 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18-128.el5 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.27 Passed on x86_64 with linux-2.6.9-78.ELsmp Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.26 Passed on ia64 with linux-2.6.25 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Failed: From jackm at dev.mellanox.co.il Mon Sep 21 03:12:42 2009 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Mon, 21 Sep 2009 13:12:42 +0300 Subject: [ofa-general] [PATCH] 
mthca: Fix access to freed memory in catas processing Message-ID: <200909211312.43359.jackm@dev.mellanox.co.il> catas_reset() uses a pointer to mthca_dev, but mthca_dev may not be valid after the call to __mthca_restart_one(). Based on a similar patch for mlx4 by Vitaliy Gusev Signed-off-by: Jack Morgenstein --- Roland, Here is the equivalent patch for mthca catas error processing. Here, also, we need to avoid accessing freed memory. I allocated the "d" struct pointer, because I think that using the "dev" loop variable could be problematic (I do not want to depend on the internals of "list_for_each_entry_safe"). diff --git a/drivers/infiniband/hw/mthca/mthca_catas.c b/drivers/infiniband/hw/mthca/mthca_catas.c index 056b2a4..0aa0110 100644 --- a/drivers/infiniband/hw/mthca/mthca_catas.c +++ b/drivers/infiniband/hw/mthca/mthca_catas.c @@ -68,11 +68,16 @@ static void catas_reset(struct work_struct *work) spin_unlock_irq(&catas_lock); list_for_each_entry_safe(dev, tmpdev, &tlist, catas_err.list) { + struct pci_dev *pdev = dev->pdev; ret = __mthca_restart_one(dev->pdev); + /* 'dev' now is not valid */ if (ret) - mthca_err(dev, "Reset failed (%d)\n", ret); - else - mthca_dbg(dev, "Reset succeeded\n"); + printk(KERN_ERR "mthca %s: Reset failed (%d)\n", + pci_name(pdev), ret); + else { + struct mthca_dev *d = pci_get_drvdata(pdev); + mthca_dbg(d, "Reset succeeded\n"); + } } mutex_unlock(&mthca_device_mutex); From dorons at voltaire.com Mon Sep 21 05:03:43 2009 From: dorons at voltaire.com (Doron Shoham) Date: Mon, 21 Sep 2009 15:03:43 +0300 Subject: [ofa-general] [PATCH] infiniband-diags/scripts: Add ibcheckroutes to scripts In-Reply-To: References: <4AA8E97E.1090109@voltaire.com> <4AB0F0E5.2000305@voltaire.com> Message-ID: <4AB76B9F.1080706@voltaire.com> Add ibcheckroutes script. ibcheckroutes validates route between all leaf switches, switches or CAs in the fabric. 
Signed-off-by: Doron Shoham --- infiniband-diags/Makefile.am | 4 +- infiniband-diags/configure.in | 1 + infiniband-diags/man/ibcheckroutes.8 | 46 ++++++++++ infiniband-diags/scripts/ibcheckroutes.in | 138 +++++++++++++++++++++++++++++ 4 files changed, 187 insertions(+), 2 deletions(-) create mode 100644 infiniband-diags/man/ibcheckroutes.8 create mode 100644 infiniband-diags/scripts/ibcheckroutes.in diff --git a/infiniband-diags/Makefile.am b/infiniband-diags/Makefile.am index 1cdb60e..57363c4 100644 --- a/infiniband-diags/Makefile.am +++ b/infiniband-diags/Makefile.am @@ -33,7 +33,7 @@ sbin_SCRIPTS = scripts/ibcheckerrs scripts/ibchecknet scripts/ibchecknode \ scripts/iblinkinfo.pl scripts/ibprintswitch.pl \ scripts/ibprintca.pl scripts/ibprintrt.pl \ scripts/ibfindnodesusing.pl scripts/ibidsverify.pl \ - scripts/check_lft_balance.pl + scripts/check_lft_balance.pl scripts/ibcheckroutes noinst_LIBRARIES = libcommon.a @@ -76,7 +76,7 @@ man_MANS = man/ibaddr.8 man/ibcheckerrors.8 man/ibcheckerrs.8 \ man/ibprintswitch.8 man/ibprintca.8 man/ibfindnodesusing.8 \ man/ibdatacounts.8 man/ibdatacounters.8 \ man/ibrouters.8 man/ibprintrt.8 man/ibidsverify.8 \ - man/check_lft_balance.8 + man/check_lft_balance.8 man/ibcheckroutes.8 BUILT_SOURCES = ibdiag_version ibdiag_version: diff --git a/infiniband-diags/configure.in b/infiniband-diags/configure.in index 3ef35cc..aa178c5 100644 --- a/infiniband-diags/configure.in +++ b/infiniband-diags/configure.in @@ -158,6 +158,7 @@ AC_CONFIG_FILES([\ scripts/ibcheckportwidth \ scripts/ibcheckstate \ scripts/ibcheckwidth \ + scripts/ibcheckroutes \ scripts/ibclearcounters \ scripts/ibclearerrors \ scripts/ibdatacounts \ diff --git a/infiniband-diags/man/ibcheckroutes.8 b/infiniband-diags/man/ibcheckroutes.8 new file mode 100644 index 0000000..fe6f0d6 --- /dev/null +++ b/infiniband-diags/man/ibcheckroutes.8 @@ -0,0 +1,46 @@ +.TH IBCHECKROUTES 8 "September 10, 2009" "OpenIB" "OpenIB Diagnostics" + +.SH NAME +ibcheckroutes \- validate routes 
between all hosts in fabric + +.SH SYNOPSIS +.B ibcheckroutes +[\-l] [\-s] [\-c] [\-n topology-file ] [\-h] [\-N] [\-b] [\-e] [\-C ca_name] [\-P ca_port] [\-t(imeout) timeout_ms] + +.SH DESCRIPTION +.PP +ibcheckroutes is a script which can use a full topology file that was created by ibnetdiscover or +scans the subnet. Then it validates routes between all leaf switches, switches or CAs in the fabric. + +.SH OPTIONS +.PP +\-n Use topology-file. +.PP +\-l Check routes between all leaf switches. +.PP +\-s Check routes between all switches. +.PP +\-c Check routes between all CAs. +.PP +\-h Show help. +.PP +\-N Use mono rather than color mode. +.PP +\-b Suppress output. +.PP +\-e Show errors only. +.PP +\-C Use the specified ca_name. +.PP +\-P Use the specified ca_port. +.PP +\-t Override the default timeout for the solicited mads. + +.SH SEE ALSO +.BR ibnetdiscover(8), +.BR ibtracert(8) + +.SH AUTHOR +.TP +Doron Shoham +.RI < dorons at voltaire.com > diff --git a/infiniband-diags/scripts/ibcheckroutes.in b/infiniband-diags/scripts/ibcheckroutes.in new file mode 100644 index 0000000..c7dd191 --- /dev/null +++ b/infiniband-diags/scripts/ibcheckroutes.in @@ -0,0 +1,138 @@ +#!/bin/sh + +IBPATH=${IBPATH:- at IBSCRIPTPATH@} + +function usage() { + echo -e Usage: `basename $0` "[-l] [-s] [-c] [-h] [-N] [-b] [-e] [-n topology-file ] \ +[-C ca_name] [-P ca_port] [-t(imeout) timeout_ms]" + echo -e " Validate routes between all leaf switches, switches or CAs in the fabric" + echo -e " -n - Use topology-file" + echo -e " -l - Check routes between all leaf switches" + echo -e " -s - Check routes between all switches" + echo -e " -c - Check routes between all CAs" + echo -e " -h - Show help" + echo -e " -N - Use mono rather than color mode" + echo -e " -b - Suppress output" + echo -e " -e - Show errors only" + echo -e " -C - Use the specified ca_name" + echo -e " -P - Use the specified ca_port" + echo -e " -t - Override the default timeout for the solicited mads" + exit -1 +} + 
+function user_abort() { + echo "Aborted" + exit 1 +} + +function green() { + if [ "$bw" = "yes" ]; then + printf "${res_col}[OK]\n" $1 + return + fi + printf "\033[1;032m${res_col}[OK]\033[0;39m\n" $1 +} + +function red() { + if [ "$bw" = "yes" ]; then + printf "${res_col}[FAILED]\n" "$1" + return + fi + printf "\033[31m${res_col}[FAILED]\033[0m\n" "$1" +} + +trap user_abort SIGINT SIGTERM + +bw="" +brief=0 +error=0 +ca_info="" +st=0 +method="leaf" +topofile=/tmp/net +discover=1 +res_col="%-20.20s" + +function get_opts() { + while getopts P:C:t:n:beNhlsc o; do + case "$o" in + n) + topofile="$OPTARG" + discover=0 + ;; + l) + method="leaf" + ;; + s) + method="sw" + ;; + c) + method="ca" + ;; + h) + usage + ;; + N) + bw="yes" + ;; + b) + brief=1 + ;; + e) + error=1 + ;; + P | C | t | timeout) + ca_info="$ca_info -$o $OPTARG" + ;; + *) + usage + ;; + esac + done +} + +get_opts $* + +if [ $discover -eq 1 ]; then + $IBPATH/ibnetdiscover $ca_info > $topofile +fi + +# find LIDs to check +case $method in +leaf) + [ $brief -eq 0 ] && echo -e "Checking routes between all Leaf Switches" + LIDS=($(awk '/# lid /{a[$(NF-1)]=$(NF-1)} END{for(v in a) if (v!=0) print v}' $topofile)) + ;; +sw) + [ $brief -eq 0 ] && echo -e "Checking routes between all Switches" + LIDS=($(awk '/^Switch/ {a[$(NF-2)]=$(NF-2)} END{for(v in a) if (v!=0) print v}' $topofile)) + ;; +ca) + [ $brief -eq 0 ] && echo -e "Checking routes between all CAs" + LIDS=($(awk '/# lid /{lmc=$7; e=2^lmc+$5; for(i=$5; i Destination lid" +for((s=0; s /dev/null + if [ $? 
-eq 0 ]; then + [ $brief -eq 0 ] && [ $error -eq 0 ] && green "${LIDS[$s]}-->${LIDS[$d]}" + else + [ $brief -eq 0 ] && red "${LIDS[$s]}-->${LIDS[$d]}" + st=1 + fi + done +done + +exit $st -- 1.5.4 From sashak at voltaire.com Mon Sep 21 05:26:40 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 21 Sep 2009 15:26:40 +0300 Subject: [ofa-general] Re: [PATCH] libibmad/dump.c: In mad_dump_portcapmask, decode new capabilities In-Reply-To: <20090902143627.GB10980@comcast.net> References: <20090902143627.GB10980@comcast.net> Message-ID: <20090921122640.GB24398@me> On 10:36 Wed 02 Sep , Hal Rosenstock wrote: > > Per published MgtWG errata > RefID 4484 - vendor specific MADs table support > RefID 4626 - reverse path PKey support in PathRecord responses > RefID 4635 - multicast FDB top support > RefID 4644 - hierarchy support > > Signed-off-by: Hal Rosenstock Applied. Thanks. Sasha From sashak at voltaire.com Mon Sep 21 05:26:58 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 21 Sep 2009 15:26:58 +0300 Subject: [ofa-general] Re: [PATCH] libibmad/mad.h: Add a couple of SM class attribute IDs In-Reply-To: <20090902144250.GC10980@comcast.net> References: <20090902144250.GC10980@comcast.net> Message-ID: <20090921122658.GC24398@me> On 10:42 Wed 02 Sep , Hal Rosenstock wrote: > > VendorSpecificMadsTable added by MgtWG errata RefID 4482 > > Signed-off-by: Hal Rosenstock Applied. Thanks. 
Sasha From sashak at voltaire.com Mon Sep 21 05:59:36 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 21 Sep 2009 15:59:36 +0300 Subject: [ofa-general] Re: [PATCH] opensm: Add support for MulticastFDBTop In-Reply-To: <20090902143133.GA10980@comcast.net> References: <20090902143133.GA10980@comcast.net> Message-ID: <20090921125936.GD24398@me> On 10:31 Wed 02 Sep , Hal Rosenstock wrote: > > Add support for SwitchInfo:MulticastFDBTop > Added by MgtWG errata #4505-4508 > Also, per MgtWG RefID #4640, MulticastFDBTop value of 0xbfff means no entries > > In osm_mcast_mgr.c:mcast_mgr_set_mftables call new routine > mcast_mgr_set_mfttop to set MulticastFDBTop in SwitchInfo > based on max_block_in_use when switch port 0 indicates > IsMulticastFDBTop is supported. > > Signed-off-by: Hal Rosenstock > --- > diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c > index d7c5ce1..3671e08 100644 > --- a/opensm/opensm/osm_mcast_mgr.c > +++ b/opensm/opensm/osm_mcast_mgr.c > @@ -1066,6 +1066,83 @@ Exit: > > /********************************************************************** > **********************************************************************/ > +static void mcast_mgr_set_mfttop(IN osm_sm_t * sm, IN osm_switch_t * p_sw) > +{ > + osm_node_t *p_node; > + osm_dr_path_t *p_path; > + osm_physp_t *p_physp; > + osm_mcast_tbl_t *p_tbl; > + osm_madw_context_t context; > + ib_api_status_t status; > + ib_switch_info_t si; > + boolean_t set_swinfo_require = FALSE; > + uint16_t mcast_top; > + uint8_t life_state; > + > + OSM_LOG_ENTER(sm->p_log); > + > + CL_ASSERT(p_sw); > + > + p_node = p_sw->p_node; > + > + CL_ASSERT(p_node); > + > + p_physp = osm_node_get_physp_ptr(p_node, 0); > + p_path = osm_physp_get_dr_path_ptr(p_physp); > + p_tbl = osm_switch_get_mcast_tbl_ptr(p_sw); > + > + if (p_physp->port_info.capability_mask & IB_PORT_CAP_HAS_MCAST_FDB_TOP) { BTW any reason why this capability bit if placed in PortInfo and not in SwitchInfo (it is not port but 
switch related feature)? > + /* > + Set the top of the multicast forwarding table. > + */ > + si = p_sw->switch_info; > + if (p_tbl->max_block_in_use == -1) > + mcast_top = cl_hton16(IB_LID_MCAST_START_HO - 1); > + else > + mcast_top = cl_hton16(IB_LID_MCAST_START_HO + > + (p_tbl->max_block_in_use + 1) * IB_MCAST_BLOCK_SIZE - 1); > + if (mcast_top != si.mcast_top) { > + set_swinfo_require = TRUE; > + si.mcast_top = mcast_top; > + } > + > + /* check to see if the change state bit is on. If it is - then > + we need to clear it. */ > + if (ib_switch_info_get_state_change(&si)) > + life_state = ((sm->p_subn->opt.packet_life_time << 3) > + | (si.life_state & IB_SWITCH_PSC)) & 0xfc; > + else > + life_state = (sm->p_subn->opt.packet_life_time << 3) & 0xf8; > + > + if (life_state != si.life_state || > + ib_switch_info_get_state_change(&si)) { > + set_swinfo_require = TRUE; > + si.life_state = life_state; > + } Switch's StateChange and LifeState are handled when unicast routing is configured. Why do we need duplicate it here? 
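A side note on the MulticastFDBTop arithmetic in the quoted hunk: it can be checked in isolation. The sketch below is only an illustration, not code from the patch. `mft_top_ho`, `MCAST_START_HO` and `MCAST_BLOCK_SIZE` are hypothetical names; the values 0xC000 and 32 are assumed to mirror `IB_LID_MCAST_START_HO` and `IB_MCAST_BLOCK_SIZE` from the OpenSM headers. The result is in host byte order, whereas the patch additionally applies `cl_hton16()`.

```c
#include <stdint.h>

/* Assumed to mirror IB_LID_MCAST_START_HO and IB_MCAST_BLOCK_SIZE. */
#define MCAST_START_HO   0xC000
#define MCAST_BLOCK_SIZE 32

/*
 * Hypothetical helper: host-order MulticastFDBTop for a given
 * max_block_in_use, following the two branches of the quoted hunk.
 * -1 means no MFT block is in use, which yields 0xBFFF ("no entries").
 */
static uint16_t mft_top_ho(int max_block_in_use)
{
	if (max_block_in_use == -1)
		return (uint16_t)(MCAST_START_HO - 1);

	/* Last MLID covered by the highest block in use. */
	return (uint16_t)(MCAST_START_HO +
			  (max_block_in_use + 1) * MCAST_BLOCK_SIZE - 1);
}
```

Under these assumptions, `mft_top_ho(-1)` gives 0xbfff, matching the "no entries" value cited from RefID #4640 in the commit message, `mft_top_ho(0)` gives 0xc01f, and `mft_top_ho(3)` gives 0xc07f.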
Sasha > + > + if (set_swinfo_require) { > + OSM_LOG(sm->p_log, OSM_LOG_DEBUG, > + "Setting switch MFT top to MLID 0x%x\n", > + cl_ntoh16(si.mcast_top)); > + > + context.si_context.light_sweep = FALSE; > + context.si_context.node_guid = osm_node_get_node_guid(p_node); > + context.si_context.set_method = TRUE; > + > + status = osm_req_set(sm, p_path, (uint8_t *) & si, > + sizeof(si), IB_MAD_ATTR_SWITCH_INFO, > + 0, CL_DISP_MSGID_NONE, &context); > + > + if (status != IB_SUCCESS) > + OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A1B: " > + "Sending SwitchInfo attribute failed (%s)\n", > + ib_get_err_str(status)); > + } > + } > +} > + > +/********************************************************************** > + **********************************************************************/ > static int mcast_mgr_set_mftables(osm_sm_t * sm) > { > cl_qmap_t *p_sw_tbl = &sm->p_subn->sw_guid_tbl; > @@ -1081,6 +1158,7 @@ static int mcast_mgr_set_mftables(osm_sm_t * sm) > p_tbl = osm_switch_get_mcast_tbl_ptr(p_sw); > if (osm_mcast_tbl_get_max_block_in_use(p_tbl) > max_block) > max_block = osm_mcast_tbl_get_max_block_in_use(p_tbl); > + mcast_mgr_set_mfttop(sm, p_sw); > p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item); > } > > diff --git a/opensm/opensm/osm_sa_class_port_info.c b/opensm/opensm/osm_sa_class_port_info.c > index d2ab96a..fb58fe5 100644 > --- a/opensm/opensm/osm_sa_class_port_info.c > +++ b/opensm/opensm/osm_sa_class_port_info.c > @@ -1,6 +1,6 @@ > /* > * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. > - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. > + * Copyright (c) 2002-2009 Mellanox Technologies LTD. All rights reserved. > * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. 
> * > * This software is available to you under a choice of one of two > @@ -159,8 +159,10 @@ static void cpi_rcv_respond(IN osm_sa_t * sa, IN const osm_madw_t * p_madw) > OSM_CAP_IS_PORT_INFO_CAPMASK_MATCH_SUPPORTED; > #endif > if (sa->p_subn->opt.qos) > - ib_class_set_cap_mask2(p_resp_cpi, OSM_CAP2_IS_QOS_SUPPORTED); > - > + ib_class_set_cap_mask2(p_resp_cpi, OSM_CAP2_IS_QOS_SUPPORTED | > + OSM_CAP2_IS_MCAST_TOP_SUPPORTED); > + else > + ib_class_set_cap_mask2(p_resp_cpi, OSM_CAP2_IS_MCAST_TOP_SUPPORTED); > if (!sa->p_subn->opt.disable_multicast) > p_resp_cpi->cap_mask |= OSM_CAP_IS_UD_MCAST_SUP; > p_resp_cpi->cap_mask = cl_hton16(p_resp_cpi->cap_mask); > From hal.rosenstock at gmail.com Mon Sep 21 06:20:04 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Mon, 21 Sep 2009 09:20:04 -0400 Subject: [ofa-general] Re: [PATCH] osmtest: Add SA get PathRecord stress test In-Reply-To: <20090920102045.GH17656@me> References: <20090831192134.GA12094@comcast.net> <20090920102045.GH17656@me> Message-ID: Hi Sasha, On Sun, Sep 20, 2009 at 6:20 AM, Sasha Khapyorsky wrote: > diff --git a/opensm/osmtest/osmtest.c b/opensm/osmtest/osmtest.c > index 986a8d2..8357d90 100644 > --- a/opensm/osmtest/osmtest.c > +++ b/opensm/osmtest/osmtest.c > + > + /* > + * Do a blocking query for the PathRecord. > + */ > + status = osmtest_get_path_rec_by_lid_pair(p_osmt, slid, dlid, &context); > + if (status != IB_SUCCESS) { > + OSM_LOG(&p_osmt->log, OSM_LOG_ERROR, "ERR 000A: " > + "osmtest_get_path_rec_by_lid_pair failed (%s)\n", > + ib_get_err_str(status)); > + goto Exit; > + } It is not really "stress" testing, just pinging. So are the other tests (additionally those use RMPP). Isn't repetitive pinging a stress of a kind ? > Shouldn't it be clarified in test description? Same level of description as other tests. They all could be made more descriptive. -- Hal -------------- next part -------------- An HTML attachment was scrubbed... 
From robyf at tekno-soft.it Mon Sep 21 01:42:05 2009
From: robyf at tekno-soft.it (Roberto Fichera)
Date: Mon, 21 Sep 2009 10:42:05 +0200 (CEST)
Subject: [ofa-general] Building IB SAN with Linux without switch
In-Reply-To: <2f3bf9a60909202304r33c97d6dx37c48b1eb8d319eb@mail.gmail.com>
References: <432fbe81a097b0082ea54201f1cc3ec5.squirrel@webmail.tekno-soft.it> <2f3bf9a60909202304r33c97d6dx37c48b1eb8d319eb@mail.gmail.com>
Message-ID: <259de15b2d30989cc26f53cf02dc4b8d.squirrel@webmail.tekno-soft.it>

On Mon, 21 September 2009 8:04 am, Dotan Barak wrote:
> Hi.
>
> On Fri, Sep 18, 2009 at 11:41 PM, Roberto Fichera wrote:
>> Hi All in the list,
>>
>> I would like to know if it's possible to configure a linux server with 2
>> or 3 HCAs, with 2 ports each, so that I can connect 4 or 6 nodes without
>> using any switch in the middle. If possible, please show an example of
>> the network configuration.
>
> Yes, this is possible.
>
> But please pay attention: Every port will be in a different Infiniband
> subnet (from the local host point of view).

Yes, I know. Actually, I don't need any node-to-node communication over IB; I'll use a normal Gb network for that.

>
> What do you plan to do with this setup?
> (which SW/program to use?)

A dedicated 10Gb link for a limited number of nodes. The server will work like a SAN storage; the nodes are mainly used for Xen virtualization.
> > Dotan > From hnrose at comcast.net Mon Sep 21 06:18:25 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Mon, 21 Sep 2009 09:18:25 -0400 Subject: [ofa-general] [PATCHv2] osmtest: Add SA get PathRecord stress test Message-ID: <20090921131825.GB14213@comcast.net> Signed-off-by: Hal Rosenstock --- Changes since v1: Removed unneeded mode parameter diff --git a/opensm/man/osmtest.8 b/opensm/man/osmtest.8 index fa0cd52..f0d6323 100644 --- a/opensm/man/osmtest.8 +++ b/opensm/man/osmtest.8 @@ -1,4 +1,4 @@ -.TH OSMTEST 8 "August 11, 2008" "OpenIB" "OpenIB Management" +.TH OSMTEST 8 "August 31, 2009" "OpenIB" "OpenIB Management" .SH NAME osmtest \- InfiniBand subnet manager and administration (SM/SA) test program @@ -108,9 +108,10 @@ Stress test options are as follows: OPT Description --- ----------------- - -s1 - Single-MAD response SA queries + -s1 - Single-MAD (RMPP) response SA queries -s2 - Multi-MAD (RMPP) response SA queries -s3 - Multi-MAD (RMPP) Path Record SA queries + -s4 - Single-MAD (non RMPP) get Path Record SA queries Without -s, stress testing is not performed .TP diff --git a/opensm/osmtest/include/osmtest_base.h b/opensm/osmtest/include/osmtest_base.h index 7c33da3..cda3a31 100644 --- a/opensm/osmtest/include/osmtest_base.h +++ b/opensm/osmtest/include/osmtest_base.h @@ -56,11 +56,12 @@ #define STRESS_SMALL_RMPP_THR 100000 /* - Take long times when quering big clusters (over 40 nodes) , an average of : 0.25 sec for query + Take long times when querying big clusters (over 40 nodes), an average of : 0.25 sec for query each query receives 1000 records */ #define STRESS_LARGE_RMPP_THR 4000 #define STRESS_LARGE_PR_RMPP_THR 20000 +#define STRESS_GET_PR 100000 extern const char *const p_file; diff --git a/opensm/osmtest/main.c b/opensm/osmtest/main.c index bb2d6bc..4bb9f82 100644 --- a/opensm/osmtest/main.c +++ b/opensm/osmtest/main.c @@ -143,9 +143,10 @@ void show_usage() " Stress test options are as follows:\n" " OPT Description\n" " --- 
-----------------\n" - " -s1 - Single-MAD response SA queries\n" + " -s1 - Single-MAD (RMPP) response SA queries\n" " -s2 - Multi-MAD (RMPP) response SA queries\n" " -s3 - Multi-MAD (RMPP) Path Record SA queries\n" + " -s4 - Single-MAD (non RMPP) get Path Record SA queries\n" " Without -s, stress testing is not performed\n\n"); printf("-M\n" "--Multicast_Mode\n" @@ -499,6 +500,9 @@ int main(int argc, char *argv[]) case 3: printf("Large Path Record SA queries\n"); break; + case 4: + printf("SA Get Path Record queries\n"); + break; default: printf("Unknown value %u (ignored)\n", opt.stress); diff --git a/opensm/osmtest/osmtest.c b/opensm/osmtest/osmtest.c index 986a8d2..c6ec955 100644 --- a/opensm/osmtest/osmtest.c +++ b/opensm/osmtest/osmtest.c @@ -2882,6 +2882,146 @@ Exit: /********************************************************************** **********************************************************************/ +ib_api_status_t +osmtest_stress_path_recs_by_lid(IN osmtest_t * const p_osmt, + OUT uint32_t * const p_num_recs, + OUT uint32_t * const p_num_queries) +{ + osmtest_req_context_t context; + ib_path_rec_t *p_rec; + cl_status_t status; + ib_net16_t dlid, slid; + int num_recs, i; + + OSM_LOG_ENTER(&p_osmt->log); + + memset(&context, 0, sizeof(context)); + + slid = cl_ntoh16(p_osmt->local_port.lid); + dlid = cl_ntoh16(p_osmt->local_port.sm_lid); + + /* + * Do a blocking query for the PathRecord. + */ + status = osmtest_get_path_rec_by_lid_pair(p_osmt, slid, dlid, &context); + if (status != IB_SUCCESS) { + OSM_LOG(&p_osmt->log, OSM_LOG_ERROR, "ERR 000A: " + "osmtest_get_path_rec_by_lid_pair failed (%s)\n", + ib_get_err_str(status)); + goto Exit; + } + + /* + * Populate the database with the received records. 
+ */ + num_recs = context.result.result_cnt; + *p_num_recs += num_recs; + ++*p_num_queries; + + if (osm_log_is_active(&p_osmt->log, OSM_LOG_VERBOSE)) { + OSM_LOG(&p_osmt->log, OSM_LOG_VERBOSE, + "Received %u records\n", num_recs); + + for (i = 0; i < num_recs; i++) { + p_rec = osmv_get_query_path_rec(context.result.p_result_madw, 0); + osm_dump_path_record(&p_osmt->log, p_rec, OSM_LOG_VERBOSE); + } + } + +Exit: + /* + * Return the IB query MAD to the pool as necessary. + */ + if (context.result.p_result_madw != NULL) { + osm_mad_pool_put(&p_osmt->mad_pool, + context.result.p_result_madw); + context.result.p_result_madw = NULL; + } + + OSM_LOG_EXIT(&p_osmt->log); + return (status); +} + +/********************************************************************** + **********************************************************************/ +static ib_api_status_t osmtest_stress_get_pr(IN osmtest_t * const p_osmt) +{ + ib_api_status_t status = IB_SUCCESS; + uint64_t num_recs = 0; + uint64_t num_queries = 0; + uint32_t delta_recs; + uint32_t delta_queries; + uint32_t print_freq = 0; + int num_timeouts = 0; + struct timeval start_tv, end_tv; + long sec_diff, usec_diff; + + OSM_LOG_ENTER(&p_osmt->log); + gettimeofday(&start_tv, NULL); + printf("-I- Start time is : %09ld:%06ld [sec:usec]\n", + start_tv.tv_sec, (long)start_tv.tv_usec); + + while ((num_queries < STRESS_GET_PR) && (num_timeouts < 100)) { + delta_recs = 0; + delta_queries = 0; + + status = osmtest_stress_path_recs_by_lid(p_osmt, + &delta_recs, + &delta_queries); + if (status != IB_SUCCESS) + goto Exit; + + num_recs += delta_recs; + num_queries += delta_queries; + + print_freq += delta_recs; + if (print_freq > 5000) { + gettimeofday(&end_tv, NULL); + printf("%" PRIu64 " records, %" PRIu64 " queries\n", + num_recs, num_queries); + if (end_tv.tv_usec > start_tv.tv_usec) { + sec_diff = end_tv.tv_sec - start_tv.tv_sec; + usec_diff = end_tv.tv_usec - start_tv.tv_usec; + } else { + sec_diff = end_tv.tv_sec - start_tv.tv_sec 
- 1; + usec_diff = + 1000000 - (start_tv.tv_usec - + end_tv.tv_usec); + } + printf("-I- End time is : %09ld:%06ld [sec:usec]\n", + end_tv.tv_sec, (long)end_tv.tv_usec); + printf("-I- Querying %" PRId64 + " path_rec queries took %04ld:%06ld [sec:usec]\n", + num_queries, sec_diff, usec_diff); + print_freq = 0; + } + } + +Exit: + gettimeofday(&end_tv, NULL); + printf("-I- End time is : %09ld:%06ld [sec:usec]\n", + end_tv.tv_sec, (long)end_tv.tv_usec); + if (end_tv.tv_usec > start_tv.tv_usec) { + sec_diff = end_tv.tv_sec - start_tv.tv_sec; + usec_diff = end_tv.tv_usec - start_tv.tv_usec; + } else { + sec_diff = end_tv.tv_sec - start_tv.tv_sec - 1; + usec_diff = 1000000 - (start_tv.tv_usec - end_tv.tv_usec); + } + + printf("-I- Querying %" PRId64 + " path_rec queries took %04ld:%06ld [sec:usec]\n", + num_queries, sec_diff, usec_diff); + if (num_timeouts > 50) { + status = IB_TIMEOUT; + } + /* Exit: */ + OSM_LOG_EXIT(&p_osmt->log); + return (status); +} + +/********************************************************************** + **********************************************************************/ static void osmtest_prepare_db_generic(IN osmtest_t * const p_osmt, IN cl_qmap_t * const p_tbl) @@ -7247,6 +7387,16 @@ ib_api_status_t osmtest_run(IN osmtest_t * const p_osmt) goto Exit; } break; + case 4: /* SA Get PR to SA LID */ + status = osmtest_stress_get_pr(p_osmt); + if (status != IB_SUCCESS) { + OSM_LOG(&p_osmt->log, OSM_LOG_ERROR, + "ERR 014B: " + "SA Get PR stress test failed (%s)\n", + ib_get_err_str(status)); + goto Exit; + } + break; default: OSM_LOG(&p_osmt->log, OSM_LOG_ERROR, "ERR 0144: " From hnrose at comcast.net Mon Sep 21 05:51:34 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Mon, 21 Sep 2009 08:51:34 -0400 Subject: [ofa-general] [PATCHv2] opensm/osm_perfmgr_db.c: Fix memory leak of db nodes Message-ID: <20090921125134.GA14213@comcast.net> Signed-off-by: Hal Rosenstock --- Changes since v1: Fix use after free issue diff --git 
a/opensm/opensm/osm_perfmgr_db.c b/opensm/opensm/osm_perfmgr_db.c index e5dfc19..03f988d 100644 --- a/opensm/opensm/osm_perfmgr_db.c +++ b/opensm/opensm/osm_perfmgr_db.c @@ -49,6 +49,8 @@ #include #include +static void free_node(db_node_t * node); + /** ========================================================================= */ perfmgr_db_t *perfmgr_db_construct(osm_perfmgr_t *perfmgr) @@ -68,7 +70,17 @@ perfmgr_db_t *perfmgr_db_construct(osm_perfmgr_t *perfmgr) */ void perfmgr_db_destroy(perfmgr_db_t * db) { + cl_map_item_t *item, *next_item; + db_node_t *node; + if (db) { + item = cl_qmap_head(&db->pc_data); + while (item != cl_qmap_end(&db->pc_data)) { + node = (db_node_t *)item; + next_item = cl_qmap_next(item); + free_node(node); + item = next_item; + } cl_plock_destroy(&db->lock); free(db); } From hal.rosenstock at gmail.com Mon Sep 21 06:43:10 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Mon, 21 Sep 2009 09:43:10 -0400 Subject: [ofa-general] Re: [PATCHv2] opensm/osm_mesh.c: Remove edges in lash matrix In-Reply-To: <20090830103615.GC21909@me> References: <20090806223417.GA2997@comcast.net> <20090830103615.GC21909@me> Message-ID: Hi Sasha, On Sun, Aug 30, 2009 at 6:36 AM, Sasha Khapyorsky wrote: > > @@ -878,6 +950,12 @@ static void make_geometry(lash_t *p_lash, int sw) > > n = s1->node->num_links; > > > > /* > > + * ignore chain fragments > > + */ > > + if (n < seed->node->num_links && n <= 2) > > + continue; > > + > > + /* > > * only process 'mesh' switches > > */ > > if (!s1->node->matrix) > > @@ -908,7 +986,8 @@ static void make_geometry(lash_t *p_lash, int sw) > > if (j == i) > > continue; > > > > - if (s1->node->matrix[i][j] != 2) { > > + if (s1->node->matrix[i][j] != 2 && > > + s1->node->matrix[i][j] <= > 4) { > > What does this ' <= 4' check? > It's to rule out opposite nodes when distance is greater than 4. I've added a comment to the next version of the patch for this. 
-- Hal -------------- next part -------------- An HTML attachment was scrubbed... URL: From hnrose at comcast.net Mon Sep 21 06:41:51 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Mon, 21 Sep 2009 09:41:51 -0400 Subject: [ofa-general] [PATCHv3] opensm/osm_mesh.c: Remove edges in lash matrix Message-ID: <20090921134151.GC14213@comcast.net> The intent of this change is to remove edge nodes (by "not counting them). The point of this heuristic is to deal with the case of small lattices which can easily have more surface than interior, which leads to choosing a non representative seed. This causes impossible counts to get reported. Signed-off-by: Robert Pearson Signed-off-by: Hal Rosenstock --- Changes since v2: In make_geometry, added comment on meaning of magic number 4 In seed_axes, made log level DEBUG and placed in osm_log_is_active clause Changes since v1: Replaced printfs with OSM_LOG calls diff --git a/opensm/opensm/osm_mesh.c b/opensm/opensm/osm_mesh.c index 72a9aa9..260e2f8 100644 --- a/opensm/opensm/osm_mesh.c +++ b/opensm/opensm/osm_mesh.c @@ -170,6 +170,11 @@ static const struct mesh_info { {8, {2, 2, 2, 2, 2, 2, 2, 2}, 8, {-1792, -6144, -8960, -7168, -3360, -896, -112, 0, 1}, }, + /* + * mesh errors + */ + {2, {6, 6}, 4, {-192, -256, -80, 0, 1}, }, + {-1, {0,}, 0, {0, }, }, }; @@ -727,6 +732,42 @@ done: } /* + * remove_edges + * + * remove type from nodes that have fewer links + * than adjacent nodes + */ +static void remove_edges(lash_t *p_lash) +{ + osm_log_t *p_log = &p_lash->p_osm->log; + int sw; + mesh_node_t *n, *nn; + int i; + + OSM_LOG_ENTER(p_log); + + for (sw = 0; sw < p_lash->num_switches; sw++) { + n = p_lash->switches[sw]->node; + if (!n->type) + continue; + + for (i = 0; i < n->num_links; i++) { + nn = p_lash->switches[n->links[i]->switch_id]->node; + + if (nn->num_links > n->num_links) { + OSM_LOG(p_log, OSM_LOG_DEBUG, + "removed edge switch %s\n", + p_lash->switches[sw]->p_sw->p_node->print_desc); + n->type = -1; + break; + } + } 
+ } + + OSM_LOG_EXIT(p_log); +} + +/* * get_local_geometry * * analyze the local geometry around each switch @@ -735,6 +776,7 @@ static int get_local_geometry(lash_t *p_lash, mesh_t *mesh) { osm_log_t *p_log = &p_lash->p_osm->log; int sw; + int status = 0; OSM_LOG_ENTER(p_log); @@ -747,15 +789,38 @@ static int get_local_geometry(lash_t *p_lash, mesh_t *mesh) continue; if (get_switch_metric(p_lash, sw)) { - OSM_LOG_EXIT(p_log); - return -1; + status = -1; + goto Exit; } - classify_switch(p_lash, mesh, sw); classify_mesh_type(p_lash, sw); } + remove_edges(p_lash); + + for (sw = 0; sw < p_lash->num_switches; sw++) { + if (p_lash->switches[sw]->node->type < 0) + continue; + classify_switch(p_lash, mesh, sw); + } + +Exit: OSM_LOG_EXIT(p_log); - return 0; + return status; +} + +static void print_axis(lash_t *p_lash, char *p, int sw, int port) +{ + mesh_node_t *node = p_lash->switches[sw]->node; + char *name = p_lash->switches[sw]->p_sw->p_node->print_desc; + int c = node->axes[port]; + + p += sprintf(p, "%s[%d] = ", name, port); + if (c) + p += sprintf(p, "%s%c -> ", ((c - 1) & 1) ? 
"-" : "+", 'X' + (c - 1)/2); + else + p += sprintf(p, "N/A -> "); + p += sprintf(p, "%s\n", + p_lash->switches[node->links[port]->switch_id]->p_sw->p_node->print_desc); } /* @@ -775,6 +840,7 @@ static void seed_axes(lash_t *p_lash, int sw) int i, j, c; OSM_LOG_ENTER(p_log); + if (!node->matrix || !node->dimension) goto done; @@ -805,6 +871,16 @@ static void seed_axes(lash_t *p_lash, int sw) } } + if (osm_log_is_active(p_log, OSM_LOG_DEBUG)) { + char buf[256], *p; + + for (i = 0; i < n; i++) { + p = buf; + print_axis(p_lash, p, sw, i); + OSM_LOG(p_log, OSM_LOG_DEBUG, "%s", buf); + } + } + done: OSM_LOG_EXIT(p_log); } @@ -878,6 +954,12 @@ static void make_geometry(lash_t *p_lash, int sw) n = s1->node->num_links; /* + * ignore chain fragments + */ + if (n < seed->node->num_links && n <= 2) + continue; + + /* * only process 'mesh' switches */ if (!s1->node->matrix) @@ -908,7 +990,9 @@ static void make_geometry(lash_t *p_lash, int sw) if (j == i) continue; - if (s1->node->matrix[i][j] != 2) { + /* Rule out opposite nodes when distance greater than 4 */ + if (s1->node->matrix[i][j] != 2 && + s1->node->matrix[i][j] <= 4) { if (s1->node->axes[j]) { if (s1->node->axes[j] != opposite(seed, s1->node->axes[i])) { OSM_LOG(p_log, OSM_LOG_DEBUG, "phase 1 mismatch\n"); From hal.rosenstock at gmail.com Mon Sep 21 07:20:50 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Mon, 21 Sep 2009 10:20:50 -0400 Subject: [ofa-general] Re: [PATCH] opensm: Add support for MulticastFDBTop In-Reply-To: <20090921125936.GD24398@me> References: <20090902143133.GA10980@comcast.net> <20090921125936.GD24398@me> Message-ID: On Mon, Sep 21, 2009 at 8:59 AM, Sasha Khapyorsky wrote: > On 10:31 Wed 02 Sep , Hal Rosenstock wrote: > > > > Add support for SwitchInfo:MulticastFDBTop > > Added by MgtWG errata #4505-4508 > > Also, per MgtWG RefID #4640, MulticastFDBTop value of 0xbfff means no > entries > > > > In osm_mcast_mgr.c:mcast_mgr_set_mftables call new routine > > mcast_mgr_set_mfttop to set 
MulticastFDBTop in SwitchInfo > > based on max_block_in_use when switch port 0 indicates > > IsMulticastFDBTop is supported. > > > > Signed-off-by: Hal Rosenstock > > --- > > diff --git a/opensm/opensm/osm_mcast_mgr.c > b/opensm/opensm/osm_mcast_mgr.c > > index d7c5ce1..3671e08 100644 > > --- a/opensm/opensm/osm_mcast_mgr.c > > +++ b/opensm/opensm/osm_mcast_mgr.c > > @@ -1066,6 +1066,83 @@ Exit: > > > > /********************************************************************** > > **********************************************************************/ > > +static void mcast_mgr_set_mfttop(IN osm_sm_t * sm, IN osm_switch_t * > p_sw) > > +{ > > + osm_node_t *p_node; > > + osm_dr_path_t *p_path; > > + osm_physp_t *p_physp; > > + osm_mcast_tbl_t *p_tbl; > > + osm_madw_context_t context; > > + ib_api_status_t status; > > + ib_switch_info_t si; > > + boolean_t set_swinfo_require = FALSE; > > + uint16_t mcast_top; > > + uint8_t life_state; > > + > > + OSM_LOG_ENTER(sm->p_log); > > + > > + CL_ASSERT(p_sw); > > + > > + p_node = p_sw->p_node; > > + > > + CL_ASSERT(p_node); > > + > > + p_physp = osm_node_get_physp_ptr(p_node, 0); > > + p_path = osm_physp_get_dr_path_ptr(p_physp); > > + p_tbl = osm_switch_get_mcast_tbl_ptr(p_sw); > > + > > + if (p_physp->port_info.capability_mask & > IB_PORT_CAP_HAS_MCAST_FDB_TOP) { > > BTW any reason why this capability bit if placed in PortInfo and not in > SwitchInfo (it is not port but switch related feature)? I don't recall. > > > + /* > > + Set the top of the multicast forwarding table. > > + */ > > + si = p_sw->switch_info; > > + if (p_tbl->max_block_in_use == -1) > > + mcast_top = cl_hton16(IB_LID_MCAST_START_HO - 1); > > + else > > + mcast_top = cl_hton16(IB_LID_MCAST_START_HO + > > + (p_tbl->max_block_in_use + 1) > * IB_MCAST_BLOCK_SIZE - 1); > > + if (mcast_top != si.mcast_top) { > > + set_swinfo_require = TRUE; > > + si.mcast_top = mcast_top; > > + } > > + > > + /* check to see if the change state bit is on. 
If it is - > then > > + we need to clear it. */ > > + if (ib_switch_info_get_state_change(&si)) > > + life_state = ((sm->p_subn->opt.packet_life_time << > 3) > > + | (si.life_state & IB_SWITCH_PSC)) & > 0xfc; > > + else > > + life_state = (sm->p_subn->opt.packet_life_time << > 3) & 0xf8; > > + > > + if (life_state != si.life_state || > > + ib_switch_info_get_state_change(&si)) { > > + set_swinfo_require = TRUE; > > + si.life_state = life_state; > > + } > > Switch's StateChange and LifeState are handled when unicast routing is > configured. Why do we need duplicate it here? > I thought we could lose a PortStateChange but it looks like just making sure that this bit is 0 on set should be fine. I'll send a revised patch shortly. -- Hal -------------- next part -------------- An HTML attachment was scrubbed... URL: From sashak at voltaire.com Mon Sep 21 07:28:13 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 21 Sep 2009 17:28:13 +0300 Subject: [ofa-general] Re: [PATCH] osmtest: Add SA get PathRecord stress test In-Reply-To: References: <20090831192134.GA12094@comcast.net> <20090920102045.GH17656@me> Message-ID: <20090921142813.GG24398@me> On 09:20 Mon 21 Sep , Hal Rosenstock wrote: > > > > + > > + /* > > + * Do a blocking query for the PathRecord. > > + */ > > + status = osmtest_get_path_rec_by_lid_pair(p_osmt, slid, dlid, > &context); > > + if (status != IB_SUCCESS) { > > + OSM_LOG(&p_osmt->log, OSM_LOG_ERROR, "ERR 000A: " > > + "osmtest_get_path_rec_by_lid_pair failed (%s)\n", > > + ib_get_err_str(status)); > > + goto Exit; > > + } > > It is not really "stress" testing, just pinging. > > > So are the other tests (additionally those use RMPP). Isn't repetitive > pinging a stress of a kind ? No. "Stress" test assumes full load. In "ping" case the only one thread is loaded and in only request processing time. > > > > Shouldn't it be clarified in test description? > > > Same level of description as other tests. They all could be made more > descriptive. 
Agree. And we need to start somewhere.

Sasha

From hal.rosenstock at gmail.com Mon Sep 21 07:32:02 2009
From: hal.rosenstock at gmail.com (Hal Rosenstock)
Date: Mon, 21 Sep 2009 10:32:02 -0400
Subject: [ofa-general] Re: [PATCH] osmtest: Add SA get PathRecord stress test
In-Reply-To: <20090921142813.GG24398@me>
References: <20090831192134.GA12094@comcast.net> <20090920102045.GH17656@me> <20090921142813.GG24398@me>
Message-ID:

On Mon, Sep 21, 2009 at 10:28 AM, Sasha Khapyorsky wrote:

> On 09:20 Mon 21 Sep , Hal Rosenstock wrote:
> > >
> > > +
> > > +	/*
> > > +	 * Do a blocking query for the PathRecord.
> > > +	 */
> > > +	status = osmtest_get_path_rec_by_lid_pair(p_osmt, slid, dlid, &context);
> > > +	if (status != IB_SUCCESS) {
> > > +		OSM_LOG(&p_osmt->log, OSM_LOG_ERROR, "ERR 000A: "
> > > +			"osmtest_get_path_rec_by_lid_pair failed (%s)\n",
> > > +			ib_get_err_str(status));
> > > +		goto Exit;
> > > +	}
> >
> > It is not really "stress" testing, just pinging.
> >
> >
> > So are the other tests (additionally those use RMPP). Isn't repetitive
> > pinging a stress of a kind ?
>
> No. "Stress" test assumes full load.

What do you mean full load ?

> In "ping" case the only one thread
> is loaded and in only request processing time.
>

I'm not following what you mean by this.

>
> > >
> > > Shouldn't it be clarified in test description?
> >
> >
> > Same level of description as other tests. They all could be made more
> > descriptive.
>
> Agree. And we need to start somewhere.
>

Separate patch ?

-- Hal

>
> Sasha
>
From sashak at voltaire.com Mon Sep 21 07:35:53 2009
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Mon, 21 Sep 2009 17:35:53 +0300
Subject: [ofa-general] Re: [PATCHv2] opensm/osm_perfmgr_db.c: Fix memory leak of db nodes
In-Reply-To: <20090921125134.GA14213@comcast.net>
References: <20090921125134.GA14213@comcast.net>
Message-ID: <20090921143553.GH24398@me>

On 08:51 Mon 21 Sep , Hal Rosenstock wrote:
>
> Signed-off-by: Hal Rosenstock
> ---
> Changes since v1:
> Fix use after free issue
>
> diff --git a/opensm/opensm/osm_perfmgr_db.c b/opensm/opensm/osm_perfmgr_db.c
> index e5dfc19..03f988d 100644
> --- a/opensm/opensm/osm_perfmgr_db.c
> +++ b/opensm/opensm/osm_perfmgr_db.c
> @@ -49,6 +49,8 @@
>  #include
>  #include
>
> +static void free_node(db_node_t * node);
> +
>  /** =========================================================================
>   */
>  perfmgr_db_t *perfmgr_db_construct(osm_perfmgr_t *perfmgr)
> @@ -68,7 +70,17 @@ perfmgr_db_t *perfmgr_db_construct(osm_perfmgr_t *perfmgr)
>   */
>  void perfmgr_db_destroy(perfmgr_db_t * db)
>  {
> +	cl_map_item_t *item, *next_item;
> +	db_node_t *node;
> +
>  	if (db) {
> +		item = cl_qmap_head(&db->pc_data);
> +		while (item != cl_qmap_end(&db->pc_data)) {
> +			node = (db_node_t *)item;
> +			next_item = cl_qmap_next(item);
> +			free_node(node);
> +			item = next_item;
> +		}

And why do you need both 'item' and 'node' variables?

Sasha

>  		cl_plock_destroy(&db->lock);
>  		free(db);
>  	}
>

From sashak at voltaire.com Mon Sep 21 07:45:04 2009
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Mon, 21 Sep 2009 17:45:04 +0300
Subject: [ofa-general] Re: [PATCHv2] osmtest: Add SA get PathRecord stress test
In-Reply-To: <20090921131825.GB14213@comcast.net>
References: <20090921131825.GB14213@comcast.net>
Message-ID: <20090921144504.GI24398@me>

On 09:18 Mon 21 Sep , Hal Rosenstock wrote:
>
> Signed-off-by: Hal Rosenstock

Applied. Thanks.
Sasha From hnrose at comcast.net Mon Sep 21 07:44:31 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Mon, 21 Sep 2009 10:44:31 -0400 Subject: [ofa-general] [PATCHv3] opensm/osm_perfmgr_db.c: Fix memory leak of db nodes Message-ID: <20090921144431.GA22785@comcast.net> Signed-off-by: Hal Rosenstock --- Changes since v2: Eliminated node variable Changes since v1: Fix use after free issue diff --git a/opensm/opensm/osm_perfmgr_db.c b/opensm/opensm/osm_perfmgr_db.c index e5dfc19..5321c59 100644 --- a/opensm/opensm/osm_perfmgr_db.c +++ b/opensm/opensm/osm_perfmgr_db.c @@ -49,6 +49,8 @@ #include #include +static void free_node(db_node_t * node); + /** ========================================================================= */ perfmgr_db_t *perfmgr_db_construct(osm_perfmgr_t *perfmgr) @@ -68,7 +70,15 @@ perfmgr_db_t *perfmgr_db_construct(osm_perfmgr_t *perfmgr) */ void perfmgr_db_destroy(perfmgr_db_t * db) { + cl_map_item_t *item, *next_item; + if (db) { + item = cl_qmap_head(&db->pc_data); + while (item != cl_qmap_end(&db->pc_data)) { + next_item = cl_qmap_next(item); + free_node((db_node_t *)item); + item = next_item; + } cl_plock_destroy(&db->lock); free(db); } From hnrose at comcast.net Mon Sep 21 07:38:42 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Mon, 21 Sep 2009 10:38:42 -0400 Subject: [ofa-general] [PATCHv2] opensm: Add support for MulticastFDBTop Message-ID: <20090921143842.GA20906@comcast.net> Add support for SwitchInfo:MulticastFDBTop Added by MgtWG errata #4505-4508 Also, per MgtWG RefID #4640, MulticastFDBTop value of 0xbfff means no entries In osm_sm.c:osm_sm_set_mcast_tbl, when switch port 0 indicates IsMulticastFDBTop supported, set MulticastFDBTop in SwitchInfo based on max_block_in_use Signed-off-by: Hal Rosenstock --- Changes since v1: In mcast_mgr_set_mfttop, eliminated PortStateChange checking diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c index c1d1916..0da0ef1 100644 --- 
a/opensm/opensm/osm_mcast_mgr.c +++ b/opensm/opensm/osm_mcast_mgr.c @@ -1044,6 +1044,64 @@ static ib_api_status_t mcast_mgr_process_mlid(osm_sm_t * sm, uint16_t mlid) /********************************************************************** **********************************************************************/ +static void mcast_mgr_set_mfttop(IN osm_sm_t * sm, IN osm_switch_t * p_sw) +{ + osm_node_t *p_node; + osm_dr_path_t *p_path; + osm_physp_t *p_physp; + osm_mcast_tbl_t *p_tbl; + osm_madw_context_t context; + ib_api_status_t status; + ib_switch_info_t si; + uint16_t mcast_top; + + OSM_LOG_ENTER(sm->p_log); + + CL_ASSERT(p_sw); + + p_node = p_sw->p_node; + + CL_ASSERT(p_node); + + p_physp = osm_node_get_physp_ptr(p_node, 0); + p_path = osm_physp_get_dr_path_ptr(p_physp); + p_tbl = osm_switch_get_mcast_tbl_ptr(p_sw); + + if (p_physp->port_info.capability_mask & IB_PORT_CAP_HAS_MCAST_FDB_TOP) { + /* + Set the top of the multicast forwarding table. + */ + si = p_sw->switch_info; + if (p_tbl->max_block_in_use == -1) + mcast_top = cl_hton16(IB_LID_MCAST_START_HO - 1); + else + mcast_top = cl_hton16(IB_LID_MCAST_START_HO + + (p_tbl->max_block_in_use + 1) * IB_MCAST_BLOCK_SIZE - 1); + if (mcast_top != si.mcast_top) { + si.mcast_top = mcast_top; + + OSM_LOG(sm->p_log, OSM_LOG_DEBUG, + "Setting switch MFT top to MLID 0x%x\n", + cl_ntoh16(si.mcast_top)); + + context.si_context.light_sweep = FALSE; + context.si_context.node_guid = osm_node_get_node_guid(p_node); + context.si_context.set_method = TRUE; + + status = osm_req_set(sm, p_path, (uint8_t *) & si, + sizeof(si), IB_MAD_ATTR_SWITCH_INFO, + 0, CL_DISP_MSGID_NONE, &context); + + if (status != IB_SUCCESS) + OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A1B: " + "Sending SwitchInfo attribute failed (%s)\n", + ib_get_err_str(status)); + } + } +} + +/********************************************************************** + **********************************************************************/ static int 
mcast_mgr_set_mftables(osm_sm_t * sm) { cl_qmap_t *p_sw_tbl = &sm->p_subn->sw_guid_tbl; @@ -1059,6 +1117,7 @@ static int mcast_mgr_set_mftables(osm_sm_t * sm) p_tbl = osm_switch_get_mcast_tbl_ptr(p_sw); if (osm_mcast_tbl_get_max_block_in_use(p_tbl) > max_block) max_block = osm_mcast_tbl_get_max_block_in_use(p_tbl); + mcast_mgr_set_mfttop(sm, p_sw); p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item); } diff --git a/opensm/opensm/osm_sa_class_port_info.c b/opensm/opensm/osm_sa_class_port_info.c index d2ab96a..fb58fe5 100644 --- a/opensm/opensm/osm_sa_class_port_info.c +++ b/opensm/opensm/osm_sa_class_port_info.c @@ -1,6 +1,6 @@ /* * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. + * Copyright (c) 2002-2009 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two @@ -159,8 +159,10 @@ static void cpi_rcv_respond(IN osm_sa_t * sa, IN const osm_madw_t * p_madw) OSM_CAP_IS_PORT_INFO_CAPMASK_MATCH_SUPPORTED; #endif if (sa->p_subn->opt.qos) - ib_class_set_cap_mask2(p_resp_cpi, OSM_CAP2_IS_QOS_SUPPORTED); - + ib_class_set_cap_mask2(p_resp_cpi, OSM_CAP2_IS_QOS_SUPPORTED | + OSM_CAP2_IS_MCAST_TOP_SUPPORTED); + else + ib_class_set_cap_mask2(p_resp_cpi, OSM_CAP2_IS_MCAST_TOP_SUPPORTED); if (!sa->p_subn->opt.disable_multicast) p_resp_cpi->cap_mask |= OSM_CAP_IS_UD_MCAST_SUP; p_resp_cpi->cap_mask = cl_hton16(p_resp_cpi->cap_mask); From sashak at voltaire.com Mon Sep 21 07:57:36 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 21 Sep 2009 17:57:36 +0300 Subject: [ofa-general] Re: [PATCH] osmtest: Add SA get PathRecord stress test In-Reply-To: References: <20090831192134.GA12094@comcast.net> <20090920102045.GH17656@me> <20090921142813.GG24398@me> Message-ID: <20090921145736.GJ24398@me> On 10:32 Mon 21 Sep , Hal Rosenstock wrote: > > > > + goto Exit; > > > 
> + } > > > > > > It is not really "stress" testing, just pinging. > > > > > > > > > So are the other tests (additionally those use RMPP). Isn't repetitive > > > pinging a stress of a kind ? > > > > No. "Stress" test assumes full load. > > > What do you mean full load ? What is not clear "full" or "load"? > > In "ping" case the only one thread > > is loaded and in only request processing time. > > > > I'm not following what you mean by this. Really? How this ping test's timeline looks? (1) client sends one request, (2) it travels to a server, (3) server processes it and replies, (4) the response travels to client, (5) client gets it and continue from beginning. Ok? The server works only in (3) and does nothing in other test stages. > > > Same level of description as other tests. They all could be made more > > > descriptive. > > > > Agree. And we need to start somewhere. > > > > Separate patch ? No problem. Sasha From sashak at voltaire.com Mon Sep 21 08:00:43 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 21 Sep 2009 18:00:43 +0300 Subject: [ofa-general] Re: [PATCHv3] opensm/osm_perfmgr_db.c: Fix memory leak of db nodes In-Reply-To: <20090921144431.GA22785@comcast.net> References: <20090921144431.GA22785@comcast.net> Message-ID: <20090921150043.GK24398@me> On 10:44 Mon 21 Sep , Hal Rosenstock wrote: > > Signed-off-by: Hal Rosenstock Applied. Thanks. Sasha From hal.rosenstock at gmail.com Mon Sep 21 08:05:04 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Mon, 21 Sep 2009 11:05:04 -0400 Subject: [ofa-general] Re: [PATCH] osmtest: Add SA get PathRecord stress test In-Reply-To: <20090921145736.GJ24398@me> References: <20090831192134.GA12094@comcast.net> <20090920102045.GH17656@me> <20090921142813.GG24398@me> <20090921145736.GJ24398@me> Message-ID: On Mon, Sep 21, 2009 at 10:57 AM, Sasha Khapyorsky wrote: > How this ping test's timeline looks? 
>
> (1) client sends one request, (2) it travels to a server, (3) server
> processes it and replies, (4) the response travels to client, (5) client
> gets it and continue from beginning. Ok? The server works only in (3)
> and does nothing in other test stages.
>

How is this different from the other stress tests? Aren't they all blocking
too?

-- Hal

From sashak at voltaire.com Mon Sep 21 08:09:29 2009
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Mon, 21 Sep 2009 18:09:29 +0300
Subject: [ofa-general] Re: [PATCH] opensm: Add support for MulticastFDBTop
In-Reply-To:
References: <20090902143133.GA10980@comcast.net> <20090921125936.GD24398@me>
Message-ID: <20090921150929.GL24398@me>

On 10:20 Mon 21 Sep , Hal Rosenstock wrote:
> > > +
> > > + if (p_physp->port_info.capability_mask &
> > IB_PORT_CAP_HAS_MCAST_FDB_TOP) {
> >
> > BTW any reason why this capability bit if placed in PortInfo and not in
> > SwitchInfo (it is not port but switch related feature)?
>
> I don't recall.

Could this be verified?

For me it does not look very reasonable to leak PortInfo:CapabilityMask
bits for this purpose, it is meaningless for CA and switch external ports.

> > Switch's StateChange and LifeState are handled when unicast routing is
> > configured. Why do we need duplicate it here?
> >
>
> I thought we could lose a PortStateChange

Basically we could lose this bit when doing reset twice - link state can
change in window between two resets.
Sasha From sashak at voltaire.com Mon Sep 21 08:20:02 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 21 Sep 2009 18:20:02 +0300 Subject: [ofa-general] Re: [PATCH] osmtest: Add SA get PathRecord stress test In-Reply-To: References: <20090831192134.GA12094@comcast.net> <20090920102045.GH17656@me> <20090921142813.GG24398@me> <20090921145736.GJ24398@me> Message-ID: <20090921152002.GM24398@me> On 11:05 Mon 21 Sep , Hal Rosenstock wrote: > On Mon, Sep 21, 2009 at 10:57 AM, Sasha Khapyorsky wrote: > > > > > > How this ping test's timeline looks? > > > > (1) client sends one request, (2) it travels to a server, (3) server > > processes it and replies, (4) the response travels to client, (5) client > > gets it and continue from beginning. Ok? The server works only in (3) > > and does nothing in other test stages. > > > > How is this different from the other stress tests ? Aren't they all blocking > too ? I'm not sure, but why should it matter - we don't need to repeat existing bugs. :) Sasha From bart.vanassche at gmail.com Mon Sep 21 08:22:07 2009 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Mon, 21 Sep 2009 17:22:07 +0200 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4A9FA945.4070408@vlnb.net> <4AA4F561.504@vlnb.net> Message-ID: On Wed, Sep 9, 2009 at 12:29 AM, Chris Worley wrote: > [ ... ] > But, the same issue occurs... the apps on the initiator hang, and the > target thinks all is well.   An app will hang in one of the file > systems... the others seem to be working well (even though they are > comprised of the same drives as the hung fs/app), for example: you can > do a "find ." from their root w/o hanging "find", but if you try that > in the fs where the app is hung, "find" will hang.  Lvscan/pvscan will > hang too. 
> > Strangely, restarting the target (removing ib_srpd and scst_vdisk > modules, then re-registering the disks with scst_vdisk and > re-modprobing ib_srpt from scratch) causes the apps on the initiator > to un-hang and make progress again... (but eventually hang again... > seemingly more readily than before). > > While nothing other than the messages you'd expect (from > re-registering the drives to the initiator logging in) occur on the > target, the initiator has much to say during this re-registration > period, starting w/ the time-out (that has been shown previously): > > Sep  8 22:04:07 nameme kernel: sd 30:0:0:3: timing out command, waited 360s > Sep  8 22:04:07 nameme kernel: sd 30:0:0:3: SCSI error: return code = 0x06000000 > Sep  8 22:04:07 nameme kernel: end_request: I/O error, dev sdo, sector 45304704 > [ ... ] Hello Chris, Unless you will report the opposite I assume that the above issue (SRP timeouts) has been solved by the solution I sent you via private e-mail, namely to load the SRPT kernel module with the parameter 'thread' set to one (modprobe ib_srpt thread=1). Bart. From hal.rosenstock at gmail.com Mon Sep 21 08:22:28 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Mon, 21 Sep 2009 11:22:28 -0400 Subject: [ofa-general] Re: [PATCH] opensm: Add support for MulticastFDBTop In-Reply-To: <20090921150929.GL24398@me> References: <20090902143133.GA10980@comcast.net> <20090921125936.GD24398@me> <20090921150929.GL24398@me> Message-ID: On Mon, Sep 21, 2009 at 11:09 AM, Sasha Khapyorsky wrote: > On 10:20 Mon 21 Sep , Hal Rosenstock wrote: > > > > + > > > > + if (p_physp->port_info.capability_mask & > > > IB_PORT_CAP_HAS_MCAST_FDB_TOP) { > > > > > > BTW any reason why this capability bit if placed in PortInfo and not in > > > SwitchInfo (it is not port but switch related feature)? > > > > > > I don't recall. > > Could this be verified? > I'll try. 
>
> For me it does not look very reasonable to leak PortInfo:CapabilityMask
> bits for this purpose, it is meaningless for CA and switch external ports.
>

Right; it would never be set for such ports.

>
> > > Switch's StateChange and LifeState are handled when unicast routing is
> > > configured. Why do we need duplicate it here?
> > >
> >
> > I thought we could lose a PortStateChange
>
> Basically we could lose this bit when doing reset twice - link state can
> change in window between two resets.
>

I removed this in the latest patch version.

-- Hal

>
> Sasha
>

From rdreier at cisco.com Mon Sep 21 08:45:35 2009
From: rdreier at cisco.com (Roland Dreier)
Date: Mon, 21 Sep 2009 08:45:35 -0700
Subject: [ofa-general] Re: [PATCH] IB/ipoib: Do not turn on carrier to a non active port
In-Reply-To: <4AB7488E.6090700@Voltaire.COM> (Moni Shoua's message of "Mon, 21 Sep 2009 12:34:06 +0300")
References: <4AB20C6C.9090005@Voltaire.COM> <4AB7488E.6090700@Voltaire.COM>
Message-ID:

 > I may miss this but I don't see how ipoib_ib_dev_down() is called with rtnl held.

It's called from ipoib_stop(), and .ndo_stop is called with rtnl held.

 > Anyway, the new patch doesn't use delayed work.

great

From sashak at voltaire.com Mon Sep 21 08:48:32 2009
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Mon, 21 Sep 2009 18:48:32 +0300
Subject: [ofa-general] Re: [PATCH] opensm: Add support for MulticastFDBTop
In-Reply-To:
References: <20090902143133.GA10980@comcast.net> <20090921125936.GD24398@me> <20090921150929.GL24398@me>
Message-ID: <20090921154832.GQ24398@me>

On 11:22 Mon 21 Sep , Hal Rosenstock wrote:
>
> I'll try.

Thanks.

> I removed this in the latest patch version.

Ok. Let's wait with this patch up to capability mask
clarification/resolution.
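For reference, the MulticastFDBTop value that the patch in this thread writes to SwitchInfo is derived from max_block_in_use. The following is only a minimal sketch of that arithmetic in host byte order. It assumes the multicast LID range starts at 0xC000, which is consistent with the 0xbfff "no entries" value in the patch description, and it assumes 32 MLIDs per forwarding table block. The macro names and mft_top_ho() are hypothetical stand-ins, not OpenSM identifiers, and the byte swap to network order done in the patch is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; stand-ins for the constants used in the patch. */
#define MCAST_START_HO   0xC000	/* first multicast LID, host order */
#define MCAST_BLOCK_SIZE 32	/* assumed MLIDs per MFT block */

/*
 * Compute the top MLID covered by the blocks in use.
 * max_block_in_use == -1 means the table has no entries.
 */
static uint16_t mft_top_ho(int max_block_in_use)
{
	/* Upper bound keeps the result within 16 bits. */
	assert(max_block_in_use >= -1 && max_block_in_use < 512);

	if (max_block_in_use == -1)
		return (uint16_t)(MCAST_START_HO - 1);

	/* Last MLID of the highest block in use. */
	return (uint16_t)(MCAST_START_HO +
			  (max_block_in_use + 1) * MCAST_BLOCK_SIZE - 1);
}
```

Under these assumptions an empty table yields 0xbfff, block 0 yields 0xc01f, and each further block in use raises the top by 32.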
Sasha From worleys at gmail.com Mon Sep 21 08:56:36 2009 From: worleys at gmail.com (Chris Worley) Date: Mon, 21 Sep 2009 09:56:36 -0600 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4A9FA945.4070408@vlnb.net> <4AA4F561.504@vlnb.net> Message-ID: On Mon, Sep 21, 2009 at 9:22 AM, Bart Van Assche wrote: > On Wed, Sep 9, 2009 at 12:29 AM, Chris Worley wrote: >> [ ... ] >> But, the same issue occurs... the apps on the initiator hang, and the >> target thinks all is well.   An app will hang in one of the file >> systems... the others seem to be working well (even though they are >> comprised of the same drives as the hung fs/app), for example: you can >> do a "find ." from their root w/o hanging "find", but if you try that >> in the fs where the app is hung, "find" will hang.  Lvscan/pvscan will >> hang too. >> >> Strangely, restarting the target (removing ib_srpd and scst_vdisk >> modules, then re-registering the disks with scst_vdisk and >> re-modprobing ib_srpt from scratch) causes the apps on the initiator >> to un-hang and make progress again... (but eventually hang again... >> seemingly more readily than before). >> >> While nothing other than the messages you'd expect (from >> re-registering the drives to the initiator logging in) occur on the >> target, the initiator has much to say during this re-registration >> period, starting w/ the time-out (that has been shown previously): >> >> Sep  8 22:04:07 nameme kernel: sd 30:0:0:3: timing out command, waited 360s >> Sep  8 22:04:07 nameme kernel: sd 30:0:0:3: SCSI error: return code = 0x06000000 >> Sep  8 22:04:07 nameme kernel: end_request: I/O error, dev sdo, sector 45304704 >> [ ... 
] > > Hello Chris, > > Unless you will report the opposite I assume that the above issue (SRP > timeouts) has been solved by the solution I sent you via private > e-mail, namely to load the SRPT kernel module with the parameter > 'thread' set to one (modprobe ib_srpt thread=1). I do view that as a work-around, as it implies there is an issue in the threads... and multiple threads do provide more performance (which is what IB is all about). I very much appreciate the work-around, though... this has been such a show-stopper for me. Thanks, Chris > > Bart. > From brian at sun.com Mon Sep 21 09:45:53 2009 From: brian at sun.com (Brian J. Murrell) Date: Mon, 21 Sep 2009 12:45:53 -0400 Subject: [ofa-general] OFED 1.5 beta1: iscsi_iser.c:601: error: unknown field =?utf-8?b?4oCYZWhfdGFyZ2V0X3Jlc2V0X2hhbmRsZXLigJk=?= specified in initializer Message-ID: <1253551553.11940.132.camel@pc.interlinx.bc.ca> I am getting the following error trying to build OFED 1.5 beta1 with RHEL5U3's 2.6.18-128.1.1.el5 kernel: gcc -m32 -Wp,-MD,/home/brian/rpm/BUILD/ofa_kernel-1.5/drivers/infiniband/ulp/iser/.iscsi_iser.o.d -nostdinc -isystem /usr/lib/gcc/i486-linux-gnu/4.3.3/include -D__KERNEL__ \ -D__OFED_BUILD__ \ -include include/linux/autoconf.h \ -include /home/brian/rpm/BUILD/ofa_kernel-1.5/include/linux/autoconf.h \ -I/home/brian/rpm/BUILD/ofa_kernel-1.5/kernel_addons/backport/2.6.18-EL5.3/include/ \ \ \ -I/home/brian/rpm/BUILD/ofa_kernel-1.5/include \ -I/home/brian/rpm/BUILD/ofa_kernel-1.5/drivers/infiniband/debug \ -I/usr/local/include/scst \ -I/home/brian/rpm/BUILD/ofa_kernel-1.5/drivers/infiniband/ulp/srpt \ -I/home/brian/rpm/BUILD/ofa_kernel-1.5/drivers/net/cxgb3 \ -Iinclude \ \ -I/mnt/lustre/brian/lustre/moved/OFED-1.5-beta1/build/linux-2.6.18-128.1.1.el5_lustre.1.8.0.50.20090302160821smp/arch//include \ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration -Os -pipe -msoft-float 
-fno-builtin-sprintf -fno-builtin-log2 -fno-builtin-puts -mpreferred-stack-boundary=2 -march=i686 -mtune=generic -mtune=generic -mregparm=3 -ffreestanding -Iinclude/asm-i386/mach-generic -Iinclude/asm-i386/mach-default -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(iscsi_iser)" -D"KBUILD_MODNAME=KBUILD_STR(ib_iser)" -c -o /home/brian/rpm/BUILD/ofa_kernel-1.5/drivers/infiniband/ulp/iser/.tmp_iscsi_iser.o /home/brian/rpm/BUILD/ofa_kernel-1.5/drivers/infiniband/ulp/iser/iscsi_iser.c /home/brian/rpm/BUILD/ofa_kernel-1.5/drivers/infiniband/ulp/iser/iscsi_iser.c:601: error: unknown field ‘eh_target_reset_handler’ specified in initializer make[4]: *** [/home/brian/rpm/BUILD/ofa_kernel-1.5/drivers/infiniband/ulp/iser/iscsi_iser.o] Error 1 make[3]: *** [/home/brian/rpm/BUILD/ofa_kernel-1.5/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/brian/rpm/BUILD/ofa_kernel-1.5/drivers/infiniband] Error 2 make[1]: *** [_module_/home/brian/rpm/BUILD/ofa_kernel-1.5] Error 2 make[1]: Leaving directory `/mnt/lustre/brian/lustre/moved/OFED-1.5-beta1/build/linux-2.6.18-128.1.1.el5_lustre.1.8.0.50.20090302160821smp' make: *** [kernel] Error 2 error: Bad exit status from /home/brian/tmp/rpm-tmp.80552 (%build) b. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 197 bytes Desc: This is a digitally signed message part URL: From hal.rosenstock at gmail.com Mon Sep 21 09:50:02 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Mon, 21 Sep 2009 12:50:02 -0400 Subject: [ofa-general] Re: [PATCH] opensm: Add support for MulticastFDBTop In-Reply-To: <20090921154832.GQ24398@me> References: <20090902143133.GA10980@comcast.net> <20090921125936.GD24398@me> <20090921150929.GL24398@me> <20090921154832.GQ24398@me> Message-ID: On Mon, Sep 21, 2009 at 11:48 AM, Sasha Khapyorsky wrote: > On 11:22 Mon 21 Sep , Hal Rosenstock wrote: > > > > I'll try. > > Thanks. > > > I removed this in the latest patch version. > > Ok. Let's wait with this patch up to capability mask > clarification/resolution. > Other than this, is the patch acceptable ? I want to get this as ready as possible. -- Hal > > Sasha > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vst at vlnb.net Mon Sep 21 09:59:24 2009 From: vst at vlnb.net (Vladislav Bolkhovitin) Date: Mon, 21 Sep 2009 20:59:24 +0400 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4A9FA945.4070408@vlnb.net> <4AA4F561.504@vlnb.net> Message-ID: <4AB7B0EC.20109@vlnb.net> Chris Worley, on 09/19/2009 01:31 AM wrote: > On Mon, Sep 7, 2009 at 5:58 AM, Vladislav Bolkhovitin wrote: >> Chris Worley, on 09/06/2009 05:41 PM wrote: >>> On Sun, Sep 6, 2009 at 3:36 PM, Chris Worley wrote: >>>> On Sun, Sep 6, 2009 at 3:17 PM, Bart Van Assche >>>> wrote: >>>>> On Fri, Sep 4, 2009 at 1:20 AM, Chris Worley wrote: >>>>>> On Thu, Sep 3, 2009 at 11:38 AM, Chris Worley wrote: >>>>>>> I've used a couple of initiators (different systems) w/ different >>>>>>> OSes, w/ different IB cards (all QDR) and different IB stacks >>>>>>> (built-in vs. 
OFED) and can repeat the problem in all but the >>>>>>> RHEL5.2/OFED 1.4.1 target and initiator (but, if the initiator is >>>>>>> WinOF and the target is RHEL5.2/OFED1.4.1, then the problem does >>>>>>> repeat). >>>>>> Here's a twist: I used the Ubuntu initiator w/ one of the RHEL >>>>>> targets, and the RHEL initiator (same machine as was running WinOF >>>>>> from the beginning of this thread) w/ one of the Ubuntu targets: in >>>>>> both cases, the problem does not repeat. >>>>>> >>>>>> That makes it sound like OFED is the cure on either side of the >>>>>> connection, but does not explain the issue w/ WinOF (which does fail >>>>>> w/ either Ununtu or RHEL targets). >>>>> These results are strange. Regarding the Linux-only tests, I was >>>>> assuming failure of a single component (Ubuntu SRP initiator, OFED SRP >>>>> initiator, Ubuntu IB driver, OFED IB driver or SRP target), but for >>>>> each of these components there is at least one test that passes and at >>>>> least one test that fails. So either my assumption is wrong or one of >>>>> the above test results is not repeatable. Do you have the time to >>>>> repeat the Linux-only tests ? >>>> Last night I was rerunning the RHEL5.2 initiator w/ Ubuntu client, and >>>> the problem repeated; now, I can't repeat the case where it didn't >>>> fail. Still, no errors, other than the eventual timeouts previously >>>> shown; the target thinks all is fine, the initiator is stuck. >>> ... and I haven't had any success w/ Ubuntu target and initiator, 8.10 or >>> 9.04. >> 1. Try with kernel parameter maxcpus=1. It will somehow relax possible races >> you have, although not completely. > > I finally got around to this test... 1 CPU works very well, w/o hangs > (will test all night to see if this holds true), 2 or more don't. > This is dual-socket NHM, so I can't specify more than one processor > w/o getting more than one socket. Where 1 CPU works well, on the target or initiator? The race is on the corresponding host. 
I'd suggest you to reproduce the problem with the latest SCST trunk, lockdep enabled on the suspected host (better on both) and mgmt_minor trace level enabled on the target. Then, after the hang, let the system stay for about a half an hour, then send us with Bart (privately, compressed) kernel logs from both systems starting from the early boot messages. If you have dmesg only output, please enable printk timestamps (CONFIG_PRINTK_TIME). > Chris >> 2. Try with another hardware, including motherboard. You can have something >> like http://lkml.org/lkml/2007/7/31/558 (not exactly it, of course) >> >>> Chris >>>> Chris >>>>> Bart. >>>>> From dotanba at gmail.com Mon Sep 21 11:47:11 2009 From: dotanba at gmail.com (Dotan Barak) Date: Mon, 21 Sep 2009 20:47:11 +0200 Subject: [ofa-general] Building IB SAN with Linux without switch In-Reply-To: <259de15b2d30989cc26f53cf02dc4b8d.squirrel@webmail.tekno-soft.it> References: <432fbe81a097b0082ea54201f1cc3ec5.squirrel@webmail.tekno-soft.it> <2f3bf9a60909202304r33c97d6dx37c48b1eb8d319eb@mail.gmail.com> <259de15b2d30989cc26f53cf02dc4b8d.squirrel@webmail.tekno-soft.it> Message-ID: <4AB7CA2F.8010504@gmail.com> Hi. Do you have any specific answer you wish to ask or everything is fine now? Dotan From bart.vanassche at gmail.com Mon Sep 21 11:06:53 2009 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Mon, 21 Sep 2009 20:06:53 +0200 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4AA4F561.504@vlnb.net> Message-ID: On Mon, Sep 21, 2009 at 5:56 PM, Chris Worley wrote: > On Mon, Sep 21, 2009 at 9:22 AM, Bart Van Assche > wrote: >> Unless you will report the opposite I assume that the above issue (SRP >> timeouts) has been solved by the solution I sent you via private >> e-mail, namely to load the SRPT kernel module with the parameter >> 'thread' set to one (modprobe ib_srpt thread=1). 
>
> I do view that as a work-around, as it implies there is an issue in
> the threads... and multiple threads do provide more performance (which
> is what IB is all about).

This does not imply a threading issue in ib_srpt. And more threads do
not always provide higher performance. And you are misunderstanding
the effect of the ib_srpt kernel parameter called 'thread'. The effect
of this parameter is as follows:
* thread=0: as much as possible of the SRP protocol is processed in IB
  interrupt context.
* thread=1: a kernel thread handles all SRP protocol processing. Work
  is delegated from IB interrupt context to the SRPT kernel thread via
  a queue (see also the srpt_completion() function in file ib_srpt.c).

So the kernel parameter 'thread' of ib_srpt selects between two
significantly different behaviors of ib_srpt. My hypothesis is that in
your setup, running ib_srpt with thread=0 caused SRPT's completion
queue handler (srpt_completion()), which keeps running as long as more
completion queue elements can be processed, to take up too much time,
and that this finally resulted in remote SRP initiator disconnects.

Bart.

From sashak at voltaire.com Mon Sep 21 11:17:03 2009
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Mon, 21 Sep 2009 21:17:03 +0300
Subject: [ofa-general] Re: [PATCH] opensm: Add support for MulticastFDBTop
In-Reply-To:
References: <20090902143133.GA10980@comcast.net> <20090921125936.GD24398@me> <20090921150929.GL24398@me> <20090921154832.GQ24398@me>
Message-ID: <20090921181703.GT24398@me>

On 12:50 Mon 21 Sep , Hal Rosenstock wrote:
>
> Other than this, is the patch acceptable ? I want to get this as ready as
> possible.

I will look at v2.
Sasha From sashak at voltaire.com Mon Sep 21 11:21:55 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 21 Sep 2009 21:21:55 +0300 Subject: [ofa-general] Re: [PATCHv2] opensm: Add support for MulticastFDBTop In-Reply-To: <20090921143842.GA20906@comcast.net> References: <20090921143842.GA20906@comcast.net> Message-ID: <20090921182155.GU24398@me> On 10:38 Mon 21 Sep , Hal Rosenstock wrote: > > Add support for SwitchInfo:MulticastFDBTop > Added by MgtWG errata #4505-4508 > Also, per MgtWG RefID #4640, MulticastFDBTop value of 0xbfff means no entries > > In osm_sm.c:osm_sm_set_mcast_tbl, when switch port 0 indicates > IsMulticastFDBTop supported, set MulticastFDBTop in SwitchInfo > based on max_block_in_use > > Signed-off-by: Hal Rosenstock > --- > Changes since v1: > In mcast_mgr_set_mfttop, eliminated PortStateChange checking > > diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c > index c1d1916..0da0ef1 100644 > --- a/opensm/opensm/osm_mcast_mgr.c > +++ b/opensm/opensm/osm_mcast_mgr.c > @@ -1044,6 +1044,64 @@ static ib_api_status_t mcast_mgr_process_mlid(osm_sm_t * sm, uint16_t mlid) > > /********************************************************************** > **********************************************************************/ > +static void mcast_mgr_set_mfttop(IN osm_sm_t * sm, IN osm_switch_t * p_sw) > +{ > + osm_node_t *p_node; > + osm_dr_path_t *p_path; > + osm_physp_t *p_physp; > + osm_mcast_tbl_t *p_tbl; > + osm_madw_context_t context; > + ib_api_status_t status; > + ib_switch_info_t si; > + uint16_t mcast_top; > + > + OSM_LOG_ENTER(sm->p_log); > + > + CL_ASSERT(p_sw); > + > + p_node = p_sw->p_node; > + > + CL_ASSERT(p_node); > + > + p_physp = osm_node_get_physp_ptr(p_node, 0); > + p_path = osm_physp_get_dr_path_ptr(p_physp); > + p_tbl = osm_switch_get_mcast_tbl_ptr(p_sw); > + > + if (p_physp->port_info.capability_mask & IB_PORT_CAP_HAS_MCAST_FDB_TOP) { > + /* > + Set the top of the multicast forwarding table. 
> + */ > + si = p_sw->switch_info; > + if (p_tbl->max_block_in_use == -1) > + mcast_top = cl_hton16(IB_LID_MCAST_START_HO - 1); > + else > + mcast_top = cl_hton16(IB_LID_MCAST_START_HO + > + (p_tbl->max_block_in_use + 1) * IB_MCAST_BLOCK_SIZE - 1); > + if (mcast_top != si.mcast_top) { > + si.mcast_top = mcast_top; > + > + OSM_LOG(sm->p_log, OSM_LOG_DEBUG, > + "Setting switch MFT top to MLID 0x%x\n", > + cl_ntoh16(si.mcast_top)); > + > + context.si_context.light_sweep = FALSE; > + context.si_context.node_guid = osm_node_get_node_guid(p_node); > + context.si_context.set_method = TRUE; > + > + status = osm_req_set(sm, p_path, (uint8_t *) & si, > + sizeof(si), IB_MAD_ATTR_SWITCH_INFO, > + 0, CL_DISP_MSGID_NONE, &context); > + > + if (status != IB_SUCCESS) > + OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A1B: " > + "Sending SwitchInfo attribute failed (%s)\n", > + ib_get_err_str(status)); > + } > + } > +} Basically the patch looks fine, I would suggest to simplify flows here by using 'if ("no update needed") return;', but it is minor. So we are just wating for capability bits clarification. Sasha > + > +/********************************************************************** > + **********************************************************************/ > static int mcast_mgr_set_mftables(osm_sm_t * sm) > { > cl_qmap_t *p_sw_tbl = &sm->p_subn->sw_guid_tbl; > @@ -1059,6 +1117,7 @@ static int mcast_mgr_set_mftables(osm_sm_t * sm) > p_tbl = osm_switch_get_mcast_tbl_ptr(p_sw); > if (osm_mcast_tbl_get_max_block_in_use(p_tbl) > max_block) > max_block = osm_mcast_tbl_get_max_block_in_use(p_tbl); > + mcast_mgr_set_mfttop(sm, p_sw); > p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item); > } > > diff --git a/opensm/opensm/osm_sa_class_port_info.c b/opensm/opensm/osm_sa_class_port_info.c > index d2ab96a..fb58fe5 100644 > --- a/opensm/opensm/osm_sa_class_port_info.c > +++ b/opensm/opensm/osm_sa_class_port_info.c > @@ -1,6 +1,6 @@ > /* > * Copyright (c) 2004-2008 Voltaire, Inc. 
All rights reserved. > - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. > + * Copyright (c) 2002-2009 Mellanox Technologies LTD. All rights reserved. > * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. > * > * This software is available to you under a choice of one of two > @@ -159,8 +159,10 @@ static void cpi_rcv_respond(IN osm_sa_t * sa, IN const osm_madw_t * p_madw) > OSM_CAP_IS_PORT_INFO_CAPMASK_MATCH_SUPPORTED; > #endif > if (sa->p_subn->opt.qos) > - ib_class_set_cap_mask2(p_resp_cpi, OSM_CAP2_IS_QOS_SUPPORTED); > - > + ib_class_set_cap_mask2(p_resp_cpi, OSM_CAP2_IS_QOS_SUPPORTED | > + OSM_CAP2_IS_MCAST_TOP_SUPPORTED); > + else > + ib_class_set_cap_mask2(p_resp_cpi, OSM_CAP2_IS_MCAST_TOP_SUPPORTED); > if (!sa->p_subn->opt.disable_multicast) > p_resp_cpi->cap_mask |= OSM_CAP_IS_UD_MCAST_SUP; > p_resp_cpi->cap_mask = cl_hton16(p_resp_cpi->cap_mask); > From hnrose at comcast.net Mon Sep 21 11:40:17 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Mon, 21 Sep 2009 14:40:17 -0400 Subject: [ofa-general] [PATCHv3] opensm: Add support for MulticastFDBTop Message-ID: <20090921184017.GA24542@comcast.net> Add support for SwitchInfo:MulticastFDBTop Added by MgtWG errata #4505-4508 Also, per MgtWG RefID #4640, MulticastFDBTop value of 0xbfff means no entries In osm_sm.c:osm_sm_set_mcast_tbl, when switch port 0 indicates IsMulticastFDBTop supported, set MulticastFDBTop in SwitchInfo based on max_block_in_use Signed-off-by: Hal Rosenstock --- Changes since v2: In mcast_mgr_set_mfttop, reverse sense of mft top test so can remove indentation of code doing update Changes since v1: In mcast_mgr_set_mfttop, eliminated PortStateChange checking diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c index c1d1916..c6c6d6d 100644 --- a/opensm/opensm/osm_mcast_mgr.c +++ b/opensm/opensm/osm_mcast_mgr.c @@ -1044,6 +1044,65 @@ static ib_api_status_t mcast_mgr_process_mlid(osm_sm_t * sm, uint16_t mlid) 
/********************************************************************** **********************************************************************/ +static void mcast_mgr_set_mfttop(IN osm_sm_t * sm, IN osm_switch_t * p_sw) +{ + osm_node_t *p_node; + osm_dr_path_t *p_path; + osm_physp_t *p_physp; + osm_mcast_tbl_t *p_tbl; + osm_madw_context_t context; + ib_api_status_t status; + ib_switch_info_t si; + uint16_t mcast_top; + + OSM_LOG_ENTER(sm->p_log); + + CL_ASSERT(p_sw); + + p_node = p_sw->p_node; + + CL_ASSERT(p_node); + + p_physp = osm_node_get_physp_ptr(p_node, 0); + p_path = osm_physp_get_dr_path_ptr(p_physp); + p_tbl = osm_switch_get_mcast_tbl_ptr(p_sw); + + if (p_physp->port_info.capability_mask & IB_PORT_CAP_HAS_MCAST_FDB_TOP) { + /* + Set the top of the multicast forwarding table. + */ + si = p_sw->switch_info; + if (p_tbl->max_block_in_use == -1) + mcast_top = cl_hton16(IB_LID_MCAST_START_HO - 1); + else + mcast_top = cl_hton16(IB_LID_MCAST_START_HO + + (p_tbl->max_block_in_use + 1) * IB_MCAST_BLOCK_SIZE - 1); + if (mcast_top == si.mcast_top) + return; + + si.mcast_top = mcast_top; + + OSM_LOG(sm->p_log, OSM_LOG_DEBUG, + "Setting switch MFT top to MLID 0x%x\n", + cl_ntoh16(si.mcast_top)); + + context.si_context.light_sweep = FALSE; + context.si_context.node_guid = osm_node_get_node_guid(p_node); + context.si_context.set_method = TRUE; + + status = osm_req_set(sm, p_path, (uint8_t *) & si, + sizeof(si), IB_MAD_ATTR_SWITCH_INFO, + 0, CL_DISP_MSGID_NONE, &context); + + if (status != IB_SUCCESS) + OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A1B: " + "Sending SwitchInfo attribute failed (%s)\n", + ib_get_err_str(status)); + } +} + +/********************************************************************** + **********************************************************************/ static int mcast_mgr_set_mftables(osm_sm_t * sm) { cl_qmap_t *p_sw_tbl = &sm->p_subn->sw_guid_tbl; @@ -1059,6 +1118,7 @@ static int mcast_mgr_set_mftables(osm_sm_t * sm) p_tbl = 
osm_switch_get_mcast_tbl_ptr(p_sw); if (osm_mcast_tbl_get_max_block_in_use(p_tbl) > max_block) max_block = osm_mcast_tbl_get_max_block_in_use(p_tbl); + mcast_mgr_set_mfttop(sm, p_sw); p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item); } diff --git a/opensm/opensm/osm_sa_class_port_info.c b/opensm/opensm/osm_sa_class_port_info.c index d2ab96a..fb58fe5 100644 --- a/opensm/opensm/osm_sa_class_port_info.c +++ b/opensm/opensm/osm_sa_class_port_info.c @@ -1,6 +1,6 @@ /* * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. + * Copyright (c) 2002-2009 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two @@ -159,8 +159,10 @@ static void cpi_rcv_respond(IN osm_sa_t * sa, IN const osm_madw_t * p_madw) OSM_CAP_IS_PORT_INFO_CAPMASK_MATCH_SUPPORTED; #endif if (sa->p_subn->opt.qos) - ib_class_set_cap_mask2(p_resp_cpi, OSM_CAP2_IS_QOS_SUPPORTED); - + ib_class_set_cap_mask2(p_resp_cpi, OSM_CAP2_IS_QOS_SUPPORTED | + OSM_CAP2_IS_MCAST_TOP_SUPPORTED); + else + ib_class_set_cap_mask2(p_resp_cpi, OSM_CAP2_IS_MCAST_TOP_SUPPORTED); if (!sa->p_subn->opt.disable_multicast) p_resp_cpi->cap_mask |= OSM_CAP_IS_UD_MCAST_SUP; p_resp_cpi->cap_mask = cl_hton16(p_resp_cpi->cap_mask); From worleys at gmail.com Mon Sep 21 15:00:17 2009 From: worleys at gmail.com (Chris Worley) Date: Mon, 21 Sep 2009 16:00:17 -0600 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: <4AB7B0EC.20109@vlnb.net> References: <4A9FA945.4070408@vlnb.net> <4AA4F561.504@vlnb.net> <4AB7B0EC.20109@vlnb.net> Message-ID: On Mon, Sep 21, 2009 at 10:59 AM, Vladislav Bolkhovitin wrote: > Chris Worley, on 09/19/2009 01:31 AM wrote: >> >> On Mon, Sep 7, 2009 at 5:58 AM, Vladislav Bolkhovitin >> wrote: >>> >>> Chris Worley, on 09/06/2009 05:41 PM wrote: >>>> >>>> On 
Sun, Sep 6, 2009 at 3:36 PM, Chris Worley wrote: >>>>> >>>>> On Sun, Sep 6, 2009 at 3:17 PM, Bart Van >>>>> Assche >>>>> wrote: >>>>>> >>>>>> On Fri, Sep 4, 2009 at 1:20 AM, Chris Worley >>>>>> wrote: >>>>>>> >>>>>>> On Thu, Sep 3, 2009 at 11:38 AM, Chris Worley >>>>>>> wrote: >>>>>>>> >>>>>>>> I've used a couple of initiators (different systems) w/ different >>>>>>>> OSes, w/ different IB cards (all QDR) and different IB stacks >>>>>>>> (built-in vs. OFED) and can repeat the problem in all but the >>>>>>>> RHEL5.2/OFED 1.4.1 target and initiator (but, if the initiator is >>>>>>>> WinOF and the target is RHEL5.2/OFED1.4.1, then the problem does >>>>>>>> repeat). >>>>>>> >>>>>>> Here's a twist: I used the Ubuntu initiator w/ one of the RHEL >>>>>>> targets, and the RHEL initiator (same machine as was running WinOF >>>>>>> from the beginning of this thread) w/ one of the Ubuntu targets: in >>>>>>> both cases, the problem does not repeat. >>>>>>> >>>>>>> That makes it sound like OFED is the cure on either side of the >>>>>>> connection, but does not explain the issue w/ WinOF (which does fail >>>>>>> w/ either Ununtu or RHEL targets). >>>>>> >>>>>> These results are strange. Regarding the Linux-only tests, I was >>>>>> assuming failure of a single component (Ubuntu SRP initiator, OFED SRP >>>>>> initiator, Ubuntu IB driver, OFED IB driver or SRP target), but for >>>>>> each of these components there is at least one test that passes and at >>>>>> least one test that fails. So either my assumption is wrong or one of >>>>>> the above test results is not repeatable. Do you have the time to >>>>>> repeat the Linux-only tests ? >>>>> >>>>> Last night I was rerunning the RHEL5.2 initiator w/ Ubuntu client, and >>>>> the problem repeated; now, I can't repeat the case where it didn't >>>>> fail.  Still, no errors, other than the eventual timeouts previously >>>>> shown; the target thinks all is fine, the initiator is stuck. >>>> >>>> ... 
and I haven't had any success w/ Ubuntu target and initiator, 8.10 >>>> or >>>> 9.04. >>> >>> 1. Try with kernel parameter maxcpus=1. It will somehow relax possible >>> races >>> you have, although not completely. >> >> I finally got around to this test... 1 CPU works very well, w/o hangs >> (will test all night to see if this holds true), 2 or more don't. >> This is dual-socket NHM, so I can't specify more than one processor >> w/o getting more than one socket. > > Where 1 CPU works well, on the target or initiator? That was on the target. > The race is on the > corresponding host. > > I'd suggest you to reproduce the problem with the latest SCST trunk, lockdep > enabled on the suspected host (better on both) and mgmt_minor trace level > enabled on the target. Then, after the hang, let the system stay for about a > half an hour, then send us with Bart (privately, compressed) kernel logs > from both systems starting from the early boot messages. I believe I comprehensively tested w/ Lockdep and complete scst messages dumps on the target (and lockdep on the initiator) and came up with no messages or lock issues salient to the issue. If you think I should repeat this, I will. > > If you have dmesg only output, please enable printk timestamps > (CONFIG_PRINTK_TIME). Ubuntu has been pretty good about that. Thanks, Chris > >> Chris >>> >>> 2. Try with another hardware, including motherboard. You can have >>> something >>> like http://lkml.org/lkml/2007/7/31/558 (not exactly it, of course) >>> >>>> Chris >>>>> >>>>> Chris >>>>>> >>>>>> Bart. 
>>>>>> > > From robyf at tekno-soft.it Tue Sep 22 00:05:08 2009 From: robyf at tekno-soft.it (Roberto Fichera) Date: Tue, 22 Sep 2009 09:05:08 +0200 Subject: [ofa-general] Building IB SAN with Linux without switch In-Reply-To: <4AB7CA2F.8010504@gmail.com> References: <432fbe81a097b0082ea54201f1cc3ec5.squirrel@webmail.tekno-soft.it> <2f3bf9a60909202304r33c97d6dx37c48b1eb8d319eb@mail.gmail.com> <259de15b2d30989cc26f53cf02dc4b8d.squirrel@webmail.tekno-soft.it> <4AB7CA2F.8010504@gmail.com> Message-ID: <4AB87724.50109@tekno-soft.it> Dotan Barak ha scritto: > Hi. > > Do you have any specific answer you wish to ask or everything is fine > now? Now is everything clear thanks to Hal Rosenstock > > Dotan > From sneha0930 at gmail.com Tue Sep 22 02:04:43 2009 From: sneha0930 at gmail.com (Sneha Mistry) Date: Tue, 22 Sep 2009 14:34:43 +0530 Subject: [ofa-general] Problem while running ib tests Message-ID: Hi, I have two Dual port HCAa card.Which I have installed in same PC. I am using OpenSuse 10.3 and installed OFED 1.4. If I try to run any IB bandwidth test or latency test it end us with warning "Conflicting CPU frequency values detected: 2394.000000 != 1596.000000". Is it an OFED related problem or Linux kernel problem. HCA are connected back to back and output of ibstat is as given below. 
---------------------------------------------------------------------------------------------------------
CA 'mthca0'
    CA type: MT25208 (MT23108 compat mode)
    Number of ports: 2
    Firmware version: 4.8.200
    Hardware version: 20
    Node GUID: 0x0002c90200283734
    System image GUID: 0x0002c90200283737
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 20
        Base lid: 1
        LMC: 0
        SM lid: 1
        Capability mask: 0x02510a6a
        Port GUID: 0x0002c90200283735
    Port 2:
        State: Initializing
        Physical state: LinkUp
        Rate: 20
        Base lid: 0
        LMC: 0
        SM lid: 0
        Capability mask: 0x02510a68
        Port GUID: 0x0002c90200283736
CA 'mthca1'
    CA type: MT25208 (MT23108 compat mode)
    Number of ports: 2
    Firmware version: 4.8.200
    Hardware version: 20
    Node GUID: 0x0002c90200283730
    System image GUID: 0x0002c90200283733
    Port 1:
        State: Initializing
        Physical state: LinkUp
        Rate: 20
        Base lid: 0
        LMC: 0
        SM lid: 0
        Capability mask: 0x02510a68
        Port GUID: 0x0002c90200283731
    Port 2:
        State: Active
        Physical state: LinkUp
        Rate: 20
        Base lid: 3
        LMC: 0
        SM lid: 1
        Capability mask: 0x02510a68
        Port GUID: 0x0002c90200283732
--------------------------------------------------------------------------------------------------------------------------------------

Even ibnetdiscover detects one hop.
So there should be nothing wrong with testing two HCAs connected back to back.

Please tell me what can be done to run the IB tests.

Thanks,
SGM

From karun.sharma at qlogic.com Tue Sep 22 02:54:22 2009
From: karun.sharma at qlogic.com (Karun Sharma)
Date: Tue, 22 Sep 2009 04:54:22 -0500
Subject: [ofa-general] Problem while running ib tests
In-Reply-To:
References:
Message-ID: <4C2744E8AD2982428C5BFE523DF8CDCB45E91CD86B@MNEXMB1.qlogic.org>

Please use the "-F" option when running the tests. It will ignore the "Conflicting CPU frequency" errors. You will still see these messages on your screen, but with "-F", you will also see the results.
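
As background for the warning discussed above, here is a minimal sketch of how such a consistency check could work. It assumes the tests compare the per-core "cpu MHz" values reported in /proc/cpuinfo; the function name `cpu_mhz_consistent` and the tolerance parameter are hypothetical and are not taken from the test sources.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: scan cpuinfo-style text for "cpu MHz" lines and
 * compare every value with the first one seen.
 * Returns 1 if all values agree within `tolerance` (a fraction of the
 * first value), 0 on the first conflict, -1 if no value was found.
 * On conflict, *first and *other receive the two disagreeing values.
 */
static int cpu_mhz_consistent(FILE *f, double tolerance,
                              double *first, double *other)
{
    char line[256];
    double base = 0.0, mhz, delta;
    int seen = 0;

    assert(f && first && other);

    while (fgets(line, sizeof(line), f)) {
        /* Whitespace in the format matches any run of blanks or tabs. */
        if (sscanf(line, "cpu MHz : %lf", &mhz) != 1)
            continue;
        if (!seen) {
            base = mhz;
            seen = 1;
            continue;
        }
        delta = mhz > base ? mhz - base : base - mhz;
        if (delta > base * tolerance) {
            *first = base;
            *other = mhz;
            return 0;
        }
    }
    return seen ? 1 : -1;
}
```

Called on a stream opened from /proc/cpuinfo, a return value of 0 would correspond to the situation in the warning (2394 MHz on one core, 1596 MHz on another), which is what one would expect to see while CPU frequency scaling is active.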
Regards Karun -----Original Message----- From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Sneha Mistry Sent: Tuesday, September 22, 2009 2:35 PM To: general at lists.openfabrics.org Subject: [ofa-general] Problem while running ib tests Hi, I have two Dual port HCAa card.Which I have installed in same PC. I am using OpenSuse 10.3 and installed OFED 1.4. If I try to run any IB bandwidth test or latency test it end us with warning "Conflicting CPU frequency values detected: 2394.000000 != 1596.000000". Is it an OFED related problem or Linux kernel problem. HCA are connected back to back and output of ibstat is as given below. --------------------------------------------------------------------------------------------------------- CA 'mthca0' CA type: MT25208 (MT23108 compat mode) Number of ports: 2 Firmware version: 4.8.200 Hardware version: 20 Node GUID: 0x0002c90200283734 System image GUID: 0x0002c90200283737 Port 1: State: Active Physical state: LinkUp Rate: 20 Base lid: 1 LMC: 0 SM lid: 1 Capability mask: 0x02510a6a Port GUID: 0x0002c90200283735 Port 2: State: Initializing Physical state: LinkUp Rate: 20 Base lid: 0 LMC: 0 SM lid: 0 Capability mask: 0x02510a68 Port GUID: 0x0002c90200283736 CA 'mthca1' CA type: MT25208 (MT23108 compat mode) Number of ports: 2 Firmware version: 4.8.200 Hardware version: 20 Node GUID: 0x0002c90200283730 System image GUID: 0x0002c90200283733 Port 1: State: Initializing Physical state: LinkUp Rate: 20 Base lid: 0 LMC: 0 SM lid: 0 Capability mask: 0x02510a68 Port GUID: 0x0002c90200283731 Port 2: State: Active Physical state: LinkUp Rate: 20 Base lid: 3 LMC: 0 SM lid: 1 Capability mask: 0x02510a68 Port GUID: 0x0002c90200283732 -------------------------------------------------------------------------------------------------------------------------------------- Even ibibnetdiscover is detecting one hope. So there is nothing wrong if I try to test 2 HCA connected back to back. 
Please tell me what can be done for running ib tests.

Thanks,
SGM

From vlad at lists.openfabrics.org Tue Sep 22 03:23:18 2009
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Tue, 22 Sep 2009 03:23:18 -0700 (PDT)
Subject: [ofa-general] ofa_1_5_kernel 20090922-0200 daily build status
Message-ID: <20090922102319.30DF2E620B9@openfabrics.org>

This email was generated automatically, please do not reply

git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git
git_branch: ofed_kernel_1_5

Common build parameters:

Passed:
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.24
Passed on i686 with linux-2.6.26
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.27
Passed on x86_64 with linux-2.6.16.60-0.21-smp
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.18-128.el5
Passed on x86_64 with linux-2.6.18-93.el5
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.24
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.25
Passed on x86_64 with linux-2.6.26
Passed on x86_64 with linux-2.6.27
Passed on x86_64 with linux-2.6.9-78.ELsmp
Passed on x86_64 with linux-2.6.9-67.ELsmp
Passed on ia64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.18
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.24
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.22
Passed on ia64 with linux-2.6.26
Passed on ia64 with linux-2.6.25
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.19

Failed:
Build failed on x86_64 with linux-2.6.18-164.el5
Log:
/home/vlad/tmp/ofa_1_5_kernel-20090922-0200_linux-2.6.18-164.el5_x86_64_check/drivers/infiniband/hw/cxgb3/iwch_provider.c:1431: error: 'struct ib_device' has no member named 'dev' /home/vlad/tmp/ofa_1_5_kernel-20090922-0200_linux-2.6.18-164.el5_x86_64_check/drivers/infiniband/hw/cxgb3/iwch_provider.c: In function 'iwch_unregister_device': /home/vlad/tmp/ofa_1_5_kernel-20090922-0200_linux-2.6.18-164.el5_x86_64_check/drivers/infiniband/hw/cxgb3/iwch_provider.c:1451: error: 'struct ib_device' has no member named 'dev' make[4]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090922-0200_linux-2.6.18-164.el5_x86_64_check/drivers/infiniband/hw/cxgb3/iwch_provider.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090922-0200_linux-2.6.18-164.el5_x86_64_check/drivers/infiniband/hw/cxgb3] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090922-0200_linux-2.6.18-164.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_5_kernel-20090922-0200_linux-2.6.18-164.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-164.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- From hnrose at comcast.net Tue Sep 22 04:42:52 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Tue, 22 Sep 2009 07:42:52 -0400 Subject: [ofa-general] [PATCH] opensm/osm_sa_mcmember_record.c: Remove uninitialized variable compile warning Message-ID: <20090922114252.GA14206@comcast.net> Signed-off-by: Hal Rosenstock --- diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c index 8f7816b..7e95622 100644 --- a/opensm/opensm/osm_sa_mcmember_record.c +++ b/opensm/opensm/osm_sa_mcmember_record.c @@ -978,7 +978,7 @@ Exit: static void mcmr_rcv_join_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) { osm_mgrp_t *p_mgrp = NULL; - ib_api_status_t status; + ib_api_status_t status = IB_SUCCESS; ib_sa_mad_t *p_sa_mad; ib_member_rec_t 
*p_recvd_mcmember_rec; ib_member_rec_t mcmember_rec; From hnrose at comcast.net Tue Sep 22 06:46:11 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Tue, 22 Sep 2009 09:46:11 -0400 Subject: [ofa-general] [PATCH] libibmad/dump.c: Fix typo Message-ID: <20090922134611.GA15227@comcast.net> Signed-off-by: Hal Rosenstock --- diff --git a/libibmad/src/dump.c b/libibmad/src/dump.c index 1b287c0..5151882 100644 --- a/libibmad/src/dump.c +++ b/libibmad/src/dump.c @@ -523,7 +523,7 @@ void mad_dump_portcapmask(char *buf, int bufsz, void *val, int valsz) if (mask & (1 << 28)) s += sprintf(s, "\t\t\t\tIsVendorSpecificMadsTableSupported\n"); if (mask & (1 << 29)) - s += sprintf(s, "\t\t\t\tIsiMcastPkeyTrapSuppressionSupported\n"); + s += sprintf(s, "\t\t\t\tIsMcastPkeyTrapSuppressionSupported\n"); if (mask & (1 << 30)) s += sprintf(s, "\t\t\t\tIsMulticastFDBTopSupported\n"); if (mask & (1 << 31)) From sashak at voltaire.com Tue Sep 22 08:38:09 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 22 Sep 2009 18:38:09 +0300 Subject: [ofa-general] Re: [PATCH] opensm/osm_sa_mcmember_record.c: Remove uninitialized variable compile warning In-Reply-To: <20090922114252.GA14206@comcast.net> References: <20090922114252.GA14206@comcast.net> Message-ID: <20090922153809.GY24398@me> On 07:42 Tue 22 Sep , Hal Rosenstock wrote: > > Signed-off-by: Hal Rosenstock > --- > diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c > index 8f7816b..7e95622 100644 > --- a/opensm/opensm/osm_sa_mcmember_record.c > +++ b/opensm/opensm/osm_sa_mcmember_record.c > @@ -978,7 +978,7 @@ Exit: > static void mcmr_rcv_join_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) > { > osm_mgrp_t *p_mgrp = NULL; > - ib_api_status_t status; > + ib_api_status_t status = IB_SUCCESS; This makes sense. However I think about another fix - we don't need to refer unintialized and not used status value when error response is generated after osm_mgrp_add_port() failure. 
IOW: commit 4dd928b705024c4fefd6435c733ddd885fded5ab Author: Sasha Khapyorsky Date: Tue Sep 22 18:31:13 2009 +0300 opensm/osm_sa_mcmember_record.c: clean uninitialized variable use Clean uninitialized variable 'status' use. Signed-off-by: Sasha Khapyorsky diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c index 8f7816b..dd64d94 100644 --- a/opensm/opensm/osm_sa_mcmember_record.c +++ b/opensm/opensm/osm_sa_mcmember_record.c @@ -1167,9 +1167,7 @@ static void mcmr_rcv_join_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) CL_PLOCK_RELEASE(sa->p_lock); OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 1B06: " "osm_mgrp_add_port failed\n"); - osm_sa_send_error(sa, p_madw, status == IB_INVALID_PARAMETER ? - IB_SA_MAD_STATUS_REQ_INVALID : - IB_SA_MAD_STATUS_NO_RESOURCES); + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RESOURCES); goto Exit; } Sasha From sashak at voltaire.com Tue Sep 22 08:39:08 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 22 Sep 2009 18:39:08 +0300 Subject: [ofa-general] Re: [PATCH] libibmad/dump.c: Fix typo In-Reply-To: <20090922134611.GA15227@comcast.net> References: <20090922134611.GA15227@comcast.net> Message-ID: <20090922153908.GZ24398@me> On 09:46 Tue 22 Sep , Hal Rosenstock wrote: > > Signed-off-by: Hal Rosenstock Applied. Thanks. Sasha From sashak at voltaire.com Tue Sep 22 08:48:39 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 22 Sep 2009 18:48:39 +0300 Subject: [ofa-general] Re: [PATCHv3] opensm/osm_mesh.c: Remove edges in lash matrix In-Reply-To: <20090921134151.GC14213@comcast.net> References: <20090921134151.GC14213@comcast.net> Message-ID: <20090922154839.GB24398@me> On 09:41 Mon 21 Sep , Hal Rosenstock wrote: > > The intent of this change is to remove edge nodes (by "not counting > them). > > The point of this heuristic is to deal with the case of small > lattices which can easily have more surface than interior, > which leads to choosing a non representative seed. 
This causes > impossible counts to get reported. > > Signed-off-by: Robert Pearson > Signed-off-by: Hal Rosenstock Applied. Thanks. Sasha From weiny2 at llnl.gov Tue Sep 22 08:56:20 2009 From: weiny2 at llnl.gov (Ira Weiny) Date: Tue, 22 Sep 2009 08:56:20 -0700 Subject: [ofa-general] Problem while running ib tests In-Reply-To: <4C2744E8AD2982428C5BFE523DF8CDCB45E91CD86B@MNEXMB1.qlogic.org> References: <4C2744E8AD2982428C5BFE523DF8CDCB45E91CD86B@MNEXMB1.qlogic.org> Message-ID: <20090922085620.52b2ac17.weiny2@llnl.gov> On Tue, 22 Sep 2009 04:54:22 -0500 Karun Sharma wrote: > Please use "-F" option while running the tests. It will ignore the "Conflicting CPU frequency" errors. You will still see these messages on your screen, but with "-F", you will also see the results. Which commands are you referring to with the "-F" option? I just did a pull from git://git.openfabrics.org/~mst/perftest.git and I don't see a -F for the commands there. Have these tools moved? Or are you speaking of other tools? I also run into this problem and I have resorted to turning off cpuspeed. Thanks, Ira > > Regards > Karun > > -----Original Message----- > From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Sneha Mistry > Sent: Tuesday, September 22, 2009 2:35 PM > To: general at lists.openfabrics.org > Subject: [ofa-general] Problem while running ib tests > > Hi, > > I have two Dual port HCAa card.Which I have installed in > same PC. > > I am using OpenSuse 10.3 and installed OFED 1.4. > > If I try to run any IB bandwidth test or latency test it end us with warning > "Conflicting CPU frequency values detected: 2394.000000 != 1596.000000". > > Is it an OFED related problem or Linux kernel problem. > > HCA are connected back to back and output of ibstat is as given below. 
> --------------------------------------------------------------------------------------------------------- > CA 'mthca0' > CA type: MT25208 (MT23108 compat mode) > Number of ports: 2 > Firmware version: 4.8.200 > Hardware version: 20 > Node GUID: 0x0002c90200283734 > System image GUID: 0x0002c90200283737 > Port 1: > State: Active > Physical state: LinkUp > Rate: 20 > Base lid: 1 > LMC: 0 > SM lid: 1 > Capability mask: 0x02510a6a > Port GUID: 0x0002c90200283735 > Port 2: > State: Initializing > Physical state: LinkUp > Rate: 20 > Base lid: 0 > LMC: 0 > SM lid: 0 > Capability mask: 0x02510a68 > Port GUID: 0x0002c90200283736 > CA 'mthca1' > CA type: MT25208 (MT23108 compat mode) > Number of ports: 2 > Firmware version: 4.8.200 > Hardware version: 20 > Node GUID: 0x0002c90200283730 > System image GUID: 0x0002c90200283733 > Port 1: > State: Initializing > Physical state: LinkUp > Rate: 20 > Base lid: 0 > LMC: 0 > SM lid: 0 > Capability mask: 0x02510a68 > Port GUID: 0x0002c90200283731 > Port 2: > State: Active > Physical state: LinkUp > Rate: 20 > Base lid: 3 > LMC: 0 > SM lid: 1 > Capability mask: 0x02510a68 > Port GUID: 0x0002c90200283732 > -------------------------------------------------------------------------------------------------------------------------------------- > > Even ibibnetdiscover is detecting one hope. > So there is nothing wrong if I try to test 2 HCA connected back to back. > > Please tell me what can be done for running ib tests. 
> > Thanks, > SGM > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://*lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://*openib.org/mailman/listinfo/openib-general > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://*lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://*openib.org/mailman/listinfo/openib-general > -- Ira Weiny Math Programmer/Computer Scientist Lawrence Livermore National Lab 925-423-8008 weiny2 at llnl.gov From rdreier at cisco.com Tue Sep 22 10:51:10 2009 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 22 Sep 2009 10:51:10 -0700 Subject: [ofa-general] Re: [PATCH] mlx4: confiugre cache line size In-Reply-To: <20090916110302.GA32767@mtls03> (Eli Cohen's message of "Wed, 16 Sep 2009 14:03:03 +0300") References: <20090916110302.GA32767@mtls03> Message-ID: > +#if defined(cache_line_size) Why the #if here? Do we just need to include explicitly to make sure we get the define? > + *((u8 *) mailbox->buf + INIT_HCA_CACHELINE_SZ_OFFSET) = > + order_base_2(cache_line_size() / 16) << 5; Trivial but I think it's safe to assume a cacheline is always a power of 2. And I think it's clearer (and avoids generating a divide) to use subtraction rather than division... so this could all become: (ilog2(cache_line_size()) - 4) << 5; - R. From rdreier at cisco.com Tue Sep 22 11:27:25 2009 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 22 Sep 2009 11:27:25 -0700 Subject: [ofa-general] Re: [PATCH/RFC] IB/mad: Fix lock-lock-timer deadlock in RMPP code In-Reply-To: (Sean Hefty's message of "Wed, 9 Sep 2009 14:22:28 -0700") References: Message-ID: > The locking is needed to protect against items being removed from rmpp_list in > recv_timeout_handler() and recv_cleanup_handler(). 
No new items should be added > to the rmpp_list when ib_cancel_rmpp_recvs() is running (or there's a separate > bug). OK so how about something like this? Just hold the lock to mark the items on the list as being canceled, and then actually cancel the delayed work without the lock. I think this doesn't leave any races or holes where the delayed work can mess up the cancel. diff --git a/drivers/infiniband/core/mad_rmpp.c b/drivers/infiniband/core/mad_rmpp.c index 57a3c6f..4e0f282 100644 --- a/drivers/infiniband/core/mad_rmpp.c +++ b/drivers/infiniband/core/mad_rmpp.c @@ -37,7 +37,8 @@ enum rmpp_state { RMPP_STATE_ACTIVE, RMPP_STATE_TIMEOUT, - RMPP_STATE_COMPLETE + RMPP_STATE_COMPLETE, + RMPP_STATE_CANCELING }; struct mad_rmpp_recv { @@ -87,18 +88,22 @@ void ib_cancel_rmpp_recvs(struct ib_mad_agent_private *agent) spin_lock_irqsave(&agent->lock, flags); list_for_each_entry(rmpp_recv, &agent->rmpp_list, list) { + if (rmpp_recv->state != RMPP_STATE_COMPLETE) + ib_free_recv_mad(rmpp_recv->rmpp_wc); + rmpp_recv->state = RMPP_STATE_CANCELING; + } + spin_unlock_irqrestore(&agent->lock, flags); + + list_for_each_entry(rmpp_recv, &agent->rmpp_list, list) { cancel_delayed_work(&rmpp_recv->timeout_work); cancel_delayed_work(&rmpp_recv->cleanup_work); } - spin_unlock_irqrestore(&agent->lock, flags); flush_workqueue(agent->qp_info->port_priv->wq); list_for_each_entry_safe(rmpp_recv, temp_rmpp_recv, &agent->rmpp_list, list) { list_del(&rmpp_recv->list); - if (rmpp_recv->state != RMPP_STATE_COMPLETE) - ib_free_recv_mad(rmpp_recv->rmpp_wc); destroy_rmpp_recv(rmpp_recv); } } @@ -260,6 +265,10 @@ static void recv_cleanup_handler(struct work_struct *work) unsigned long flags; spin_lock_irqsave(&rmpp_recv->agent->lock, flags); + if (rmpp_recv->state == RMPP_STATE_CANCELING) { + spin_unlock_irqrestore(&rmpp_recv->agent->lock, flags); + return; + } list_del(&rmpp_recv->list); spin_unlock_irqrestore(&rmpp_recv->agent->lock, flags); destroy_rmpp_recv(rmpp_recv); From hnrose at comcast.net 
Tue Sep 22 11:38:58 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Tue, 22 Sep 2009 14:38:58 -0400 Subject: [ofa-general] [PATCH] opensm/osm_mesh.c: Add dump_mesh routine at OSM_LOG_DEBUG log level Message-ID: <20090922183858.GA1984@comcast.net> Signed-off-by: Hal Rosenstock --- diff --git a/opensm/opensm/osm_mesh.c b/opensm/opensm/osm_mesh.c index 260e2f8..beb6bd7 100644 --- a/opensm/opensm/osm_mesh.c +++ b/opensm/opensm/osm_mesh.c @@ -1565,6 +1565,39 @@ err: return -1; } +static void dump_mesh(lash_t *p_lash) +{ + osm_log_t *p_log = &p_lash->p_osm->log; + int sw; + int num_switches = p_lash->num_switches; + int dimension; + int i, j, k; + switch_t *s, *s2; + char buf[256], *p; + + OSM_LOG_ENTER(p_log); + + for (sw = 0; sw < num_switches; sw++) { + p = buf; + s = p_lash->switches[sw]; + dimension = s->node->dimension; + p += sprintf(p, "["); + for (i = 0; i < dimension; i++) + p += sprintf(p, "%2d%s", s->node->coord[i], + (i == dimension - 1) ? "]" : ","); + for (j = 0; j < s->node->num_links; j++) { + s2 = p_lash->switches[s->node->links[j]->switch_id]; + p += sprintf(p, " [%d]->[", j); + for (k = 0; k < dimension; k++) + p += sprintf(p, "%2d%s", s2->node->coord[k], + (k == dimension - 1) ? "] " : ","); + } + OSM_LOG(p_log, OSM_LOG_DEBUG, "%s\n", buf); + } + + OSM_LOG_EXIT(p_log); +} + /* * osm_do_mesh_analysis */ @@ -1653,6 +1686,9 @@ int osm_do_mesh_analysis(lash_t *p_lash) OSM_LOG(p_log, OSM_LOG_INFO, "%s", buf); } + if (osm_log_is_active(p_log, OSM_LOG_DEBUG)) + dump_mesh(p_lash); + done: mesh_delete(mesh); OSM_LOG_EXIT(p_log); From sashak at voltaire.com Tue Sep 22 11:50:14 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 22 Sep 2009 21:50:14 +0300 Subject: [ofa-general] Re: [PATCH 1/2] opensm: avoid LASH use-after-free when switch is deleted from fabric. 
In-Reply-To: <1251486496-24812-2-git-send-email-jaschut@sandia.gov> References: <1251486496-24812-1-git-send-email-jaschut@sandia.gov> <1251486496-24812-2-git-send-email-jaschut@sandia.gov> Message-ID: <20090922185014.GF24398@me> Hi Jim, On 13:08 Fri 28 Aug , Jim Schutt wrote: > When LASH is run against ibsim, valgrind reports the following > (on x86_64) after a switch is removed from the fabric: > > ==15699== Invalid write of size 8 > ==15699== at 0x45FD8A: switch_delete (osm_ucast_lash.c:648) > ==15699== by 0x461483: lash_cleanup (osm_ucast_lash.c:1123) > ==15699== by 0x461848: lash_process (osm_ucast_lash.c:1230) > ==15699== by 0x45C043: ucast_mgr_route (osm_ucast_mgr.c:1016) > ==15699== by 0x45C1A0: osm_ucast_mgr_process (osm_ucast_mgr.c:1057) > ==15699== by 0x44F11B: do_sweep (osm_state_mgr.c:1283) > ==15699== by 0x44F539: osm_state_mgr_process (osm_state_mgr.c:1398) > ==15699== by 0x447296: sm_process (osm_sm.c:90) > ==15699== by 0x4473FE: sm_sweeper (osm_sm.c:130) > ==15699== by 0x5023505: __cl_thread_wrapper (cl_thread.c:57) > ==15699== by 0x37AC006366: start_thread (in /lib64/libpthread-2.5.so) > ==15699== by 0x37AB4D30AC: clone (in /lib64/libc-2.5.so) > ==15699== Address 0x9B28198 is 152 bytes inside a block of size 160 free'd > ==15699== at 0x4A0541E: free (vg_replace_malloc.c:233) > ==15699== by 0x453866: osm_switch_delete (osm_switch.c:97) > ==15699== by 0x4116AA: drop_mgr_remove_switch (osm_drop_mgr.c:290) > ==15699== by 0x411820: drop_mgr_process_node (osm_drop_mgr.c:339) > ==15699== by 0x411D0C: osm_drop_mgr_process (osm_drop_mgr.c:465) > ==15699== by 0x44EF97: do_sweep (osm_state_mgr.c:1231) > ==15699== by 0x44F539: osm_state_mgr_process (osm_state_mgr.c:1398) > ==15699== by 0x447296: sm_process (osm_sm.c:90) > ==15699== by 0x4473FE: sm_sweeper (osm_sm.c:130) > ==15699== by 0x5023505: __cl_thread_wrapper (cl_thread.c:57) > ==15699== by 0x37AC006366: start_thread (in /lib64/libpthread-2.5.so) > ==15699== by 0x37AB4D30AC: clone (in 
/lib64/libc-2.5.so) > > The root cause is that in order to perform SL lookup for path record > queries, LASH needs to keep persistent data between calls to the > routing engine. > > LASH uses the osm_switch_t:priv member to speed lookup of the LASH > switch_t objects it needs to perform SL lookup, and has a corresponding > switch_t:p_sw member to point to the corresponding osm_switch_t object. > > When a switch is deleted from the fabric, the switch_t:p_sw value becomes > invalid, but LASH's switch_delete() uses it to clear the corresponding > osm_switch_t:priv value. Ok. I see the issue. This 'p_sw->priv = NULL' line was not in the original "priv" introduction code, but was added by mistake (AFAIR for "for sure" reason :)) by some subsequent patch. Why to not fix this by just removing this not actually needed statement? Sasha > > Solve this problem by adding a priv_release function pointer that > is set when osm_switch_t:priv is set. This allows the opensm core to > clean up after any routing engine that is using priv to access > persistent data (LASH seems to be the only one so far), without > knowing the details of how to do so. > > When multiple routing engines are configured, it also allows a routing > engine using osm_switch_t:priv to clean up if some other routing engine > using priv fails in an unexpected way. > > With this addition, the rules for using osm_switch_t:priv become: > 1) Never assign to priv without also assigning to priv_release. > 2) Always use priv_release() before assigning to priv; this > prevents memory issues due to unexpected errors in a > routing engine using priv. > 3) Always use priv_release() to clean up after a use of priv. > > Since updn uses osm_switch_t:priv, fix it up to follow the above > rules as well, for consistency. 
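
The three rules quoted above amount to an ownership convention for osm_switch_t:priv. A minimal, self-contained sketch of that convention follows. `struct demo_switch` and the `demo_*` functions are hypothetical stand-ins, not OpenSM code; they only mirror the `priv` / `priv_release` pair described in the patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for osm_switch_t: only the two fields discussed. */
struct demo_switch {
    void *priv;
    void (*priv_release)(struct demo_switch *sw);
};

/* Counts release calls; present only so the behaviour can be observed. */
static int demo_release_calls;

/* Release callback owned by one (hypothetical) routing engine. */
static void demo_engine_priv_release(struct demo_switch *sw)
{
    assert(sw != NULL);
    free(sw->priv);
    sw->priv = NULL;
    sw->priv_release = NULL;
    demo_release_calls++;
}

/* Rule 3, and the core's cleanup path: always go through priv_release. */
static void demo_switch_release_priv(struct demo_switch *sw)
{
    if (sw->priv_release)
        sw->priv_release(sw);
}

/*
 * Rules 1 and 2: release whatever a previous owner left behind, then
 * assign priv and priv_release together.
 * Returns 0 on success, -1 if allocation fails.
 */
static int demo_engine_attach(struct demo_switch *sw, size_t size)
{
    void *data = calloc(1, size);

    if (!data)
        return -1;
    demo_switch_release_priv(sw);
    sw->priv = data;
    sw->priv_release = demo_engine_priv_release;
    return 0;
}
```

Because the generic release helper only calls through the function pointer, code that deletes a switch does not need to know which engine, if any, currently owns `priv`; that is the property the patch description relies on.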
> > Signed-off-by: Jim Schutt > --- > opensm/include/opensm/osm_switch.h | 1 + > opensm/opensm/osm_switch.c | 2 ++ > opensm/opensm/osm_ucast_lash.c | 24 ++++++++++++++++++++---- > opensm/opensm/osm_ucast_updn.c | 15 +++++++++++---- > 4 files changed, 34 insertions(+), 8 deletions(-) > > diff --git a/opensm/include/opensm/osm_switch.h b/opensm/include/opensm/osm_switch.h > index 7ce28c5..d48f8c6 100644 > --- a/opensm/include/opensm/osm_switch.h > +++ b/opensm/include/opensm/osm_switch.h > @@ -106,6 +106,7 @@ typedef struct osm_switch { > unsigned endport_links; > unsigned need_update; > void *priv; > + void (*priv_release)(struct osm_switch *p_sw); > } osm_switch_t; > /* > * FIELDS > diff --git a/opensm/opensm/osm_switch.c b/opensm/opensm/osm_switch.c > index ce1ca63..fbf3973 100644 > --- a/opensm/opensm/osm_switch.c > +++ b/opensm/opensm/osm_switch.c > @@ -94,6 +94,8 @@ void osm_switch_delete(IN OUT osm_switch_t ** const pp_sw) > free(p_sw->hops[i]); > free(p_sw->hops); > } > + if (p_sw->priv_release) > + p_sw->priv_release(p_sw); > free(*pp_sw); > *pp_sw = NULL; > } > diff --git a/opensm/opensm/osm_ucast_lash.c b/opensm/opensm/osm_ucast_lash.c > index 0a567b3..ceae7d8 100644 > --- a/opensm/opensm/osm_ucast_lash.c > +++ b/opensm/opensm/osm_ucast_lash.c > @@ -603,6 +603,17 @@ static int balance_virtual_lanes(lash_t * p_lash, unsigned lanes_needed) > return 0; > } > > +static void lash_switch_priv_release(osm_switch_t *osm_sw) > +{ > + switch_t *sw = osm_sw->priv; > + > + osm_sw->priv_release = NULL; > + osm_sw->priv = NULL; > + > + if (sw && sw->p_sw == osm_sw) > + sw->p_sw = NULL; > +} > + > static switch_t *switch_create(lash_t * p_lash, unsigned id, osm_switch_t * p_sw) > { > unsigned num_switches = p_lash->num_switches; > @@ -628,8 +639,12 @@ static switch_t *switch_create(lash_t * p_lash, unsigned id, osm_switch_t * p_sw > } > > sw->p_sw = p_sw; > - if (p_sw) > + if (p_sw) { > + if (p_sw->priv_release) > + p_sw->priv_release(p_sw); > p_sw->priv = sw; > + 
p_sw->priv_release = lash_switch_priv_release; > + } > > if (osm_mesh_node_create(p_lash, sw)) { > free(sw->dij_channels); > @@ -644,8 +659,8 @@ static void switch_delete(lash_t *p_lash, switch_t * sw) > { > if (sw->dij_channels) > free(sw->dij_channels); > - if (sw->p_sw) > - sw->p_sw->priv = NULL; > + if (sw->p_sw && sw->p_sw->priv_release) > + sw->p_sw->priv_release(sw->p_sw); > free(sw); > } > > @@ -1113,7 +1128,8 @@ static void lash_cleanup(lash_t * p_lash) > while (p_next_sw != (osm_switch_t *) cl_qmap_end(&p_subn->sw_guid_tbl)) { > p_sw = p_next_sw; > p_next_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item); > - p_sw->priv = NULL; > + if (p_sw->priv_release) > + p_sw->priv_release(p_sw); > } > > if (p_lash->switches) { > diff --git a/opensm/opensm/osm_ucast_updn.c b/opensm/opensm/osm_ucast_updn.c > index bb9ccda..dc5f459 100644 > --- a/opensm/opensm/osm_ucast_updn.c > +++ b/opensm/opensm/osm_ucast_updn.c > @@ -404,10 +404,13 @@ static struct updn_node *create_updn_node(osm_switch_t * sw) > return u; > } > > -static void delete_updn_node(struct updn_node *u) > +static void updn_sw_priv_release(osm_switch_t *sw) > { > - u->sw->priv = NULL; > - free(u); > + if (sw->priv) > + free(sw->priv); > + > + sw->priv_release = NULL; > + sw->priv = NULL; > } > > /********************************************************************** > @@ -589,6 +592,8 @@ static int updn_lid_matrices(void *ctx) > item != cl_qmap_end(&p_updn->p_osm->subn.sw_guid_tbl); > item = cl_qmap_next(item)) { > p_sw = (osm_switch_t *)item; > + if (p_sw->priv_release) > + p_sw->priv_release(p_sw); > p_sw->priv = create_updn_node(p_sw); > if (!p_sw->priv) { > OSM_LOG(&(p_updn->p_osm->log), OSM_LOG_ERROR, "ERR AA0C: " > @@ -596,6 +601,7 @@ static int updn_lid_matrices(void *ctx) > OSM_LOG_EXIT(&p_updn->p_osm->log); > return -1; > } > + p_sw->priv_release = updn_sw_priv_release; > } > > /* First setup root nodes */ > @@ -653,7 +659,8 @@ static int updn_lid_matrices(void *ctx) > item != 
cl_qmap_end(&p_updn->p_osm->subn.sw_guid_tbl); > item = cl_qmap_next(item)) { > p_sw = (osm_switch_t *) item; > - delete_updn_node(p_sw->priv); > + if (p_sw->priv_release) > + p_sw->priv_release(p_sw); > } > > OSM_LOG_EXIT(&p_updn->p_osm->log); > -- > 1.5.6.GIT > > From eli at dev.mellanox.co.il Tue Sep 22 11:54:15 2009 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Tue, 22 Sep 2009 21:54:15 +0300 Subject: [ofa-general] Re: [PATCH] mlx4: confiugre cache line size In-Reply-To: References: <20090916110302.GA32767@mtls03> Message-ID: <20090922185415.GA13020@mtls03> On Tue, Sep 22, 2009 at 10:51:10AM -0700, Roland Dreier wrote: I agree with you on both comments. Would you like me to resend or will you make the necessary changes? > > > +#if defined(cache_line_size) > > Why the #if here? Do we just need to include explicitly > to make sure we get the define? > > > + *((u8 *) mailbox->buf + INIT_HCA_CACHELINE_SZ_OFFSET) = > > + order_base_2(cache_line_size() / 16) << 5; > > Trivial but I think it's safe to assume a cacheline is always a power of > 2. And I think it's clearer (and avoids generating a divide) to use > subtraction rather than division... so this could all become: > > (ilog2(cache_line_size()) - 4) << 5; > > - R. > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo at vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html From nathan at robotics.net Tue Sep 22 11:58:39 2009 From: nathan at robotics.net (Nathan Stratton) Date: Tue, 22 Sep 2009 13:58:39 -0500 (CDT) Subject: [ofa-general] Fedora 11, kernel 2.6.31 Message-ID: Having an issue with getting verbs working on 2.6.31. I am running Fedora 11 with 2.6.31 and 1.1.2-0.1.gb00dc7d libibverbs. Everything looks great until I run ibv_srq_pingpong to the server. It shows local/remote address a bunch of times and then freezes. I wanted to try OFED, but it does not work with 2.6.31. 
:( [root at xen1 src]# lsmod Module Size Used by ib_ucm 13752 0 rdma_ucm 12112 0 ib_uverbs 32256 2 ib_ucm,rdma_ucm 8021q 21200 0 bonding 87140 0 fuse 60016 0 bridge 40488 0 stp 2588 1 bridge llc 6240 2 bridge,stp ib_ipoib 68880 0 igb 80620 0 ib_mthca 123700 0 3w_9xxx 33092 0 rng_core 4688 0 [root at xen1 src]# ibv_devices device node GUID ------ ---------------- mthca0 0005ad00000327e8 [root at xen1 src]# ibv_devinfo hca_id: mthca0 fw_ver: 3.5.0 node_guid: 0005:ad00:0003:27e8 sys_image_guid: 0005:ad00:0100:d050 vendor_id: 0x02c9 vendor_part_id: 23108 hw_ver: 0xA1 board_id: MT_0270110001 phys_port_cnt: 2 port: 1 state: PORT_ACTIVE (4) max_mtu: 2048 (4) active_mtu: 2048 (4) sm_lid: 2 port_lid: 11 port_lmc: 0x00 port: 2 state: PORT_DOWN (1) max_mtu: 2048 (4) active_mtu: 512 (2) sm_lid: 0 port_lid: 0 port_lmc: 0x00 [root at xen1 src]# ibv_srq_pingpong 10.13.0.220 local address: LID 0x000b, QPN 0x470406, PSN 0x697578 local address: LID 0x000b, QPN 0x470407, PSN 0x4eb139 local address: LID 0x000b, QPN 0x470408, PSN 0x41b959 local address: LID 0x000b, QPN 0x470409, PSN 0x3e0cfd local address: LID 0x000b, QPN 0x47040a, PSN 0x3aff03 local address: LID 0x000b, QPN 0x47040b, PSN 0x212873 local address: LID 0x000b, QPN 0x47040c, PSN 0xecd920 local address: LID 0x000b, QPN 0x47040d, PSN 0xb9fb63 local address: LID 0x000b, QPN 0x47040e, PSN 0xa76195 local address: LID 0x000b, QPN 0x47040f, PSN 0x9c74bc local address: LID 0x000b, QPN 0x470410, PSN 0x5644e6 local address: LID 0x000b, QPN 0x470411, PSN 0x5b2db8 local address: LID 0x000b, QPN 0x470412, PSN 0x18420e local address: LID 0x000b, QPN 0x470413, PSN 0x6a2a6c local address: LID 0x000b, QPN 0x470414, PSN 0xd70ff0 local address: LID 0x000b, QPN 0x470415, PSN 0xe3ac2a remote address: LID 0x0004, QPN 0x5d0406, PSN 0xd6f70e remote address: LID 0x0004, QPN 0x5d0407, PSN 0x2d2ee0 remote address: LID 0x0004, QPN 0x5d0408, PSN 0x4bb6aa remote address: LID 0x0004, QPN 0x5d0409, PSN 0x3821b1 remote address: LID 0x0004, QPN 0x5d040a, 
PSN 0xc91470 remote address: LID 0x0004, QPN 0x5d040b, PSN 0xd2a912 remote address: LID 0x0004, QPN 0x5d0418, PSN 0x19ea09 remote address: LID 0x0004, QPN 0x5d0419, PSN 0x5df7ce remote address: LID 0x0004, QPN 0x5d041a, PSN 0x5c705a remote address: LID 0x0004, QPN 0x5d041b, PSN 0x0b2fd4 remote address: LID 0x0004, QPN 0x5d041c, PSN 0x3d0ae8 remote address: LID 0x0004, QPN 0x5d041d, PSN 0xe4d55b remote address: LID 0x0004, QPN 0x5d041e, PSN 0x0b87ab remote address: LID 0x0004, QPN 0x5d041f, PSN 0xbc4f7c remote address: LID 0x0004, QPN 0x5d0420, PSN 0x66c48a remote address: LID 0x0004, QPN 0x5d0421, PSN 0xd77a86 ><> Nathan Stratton CTO, BlinkMind, Inc. nathan at robotics.net nathan at blinkmind.com http://www.robotics.net http://www.blinkmind.com From sashak at voltaire.com Tue Sep 22 12:31:10 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 22 Sep 2009 22:31:10 +0300 Subject: [ofa-general] Re: [PATCH] infiniband-diags/scripts: Add ibcheckroutes to scripts In-Reply-To: <4AB76B9F.1080706@voltaire.com> References: <4AA8E97E.1090109@voltaire.com> <4AB0F0E5.2000305@voltaire.com> <4AB76B9F.1080706@voltaire.com> Message-ID: <20090922193110.GG24398@me> Hi Doron, On 15:03 Mon 21 Sep , Doron Shoham wrote: > Add ibcheckroutes script. Wouldn't it be better to implement this using C program with help of newly introduced libibnetdisc library? Saving all subsequent ibtracert calls should improve performance dramatically. Other comments are below. > ibcheckroutes validates route between all leaf switches, switches or > CAs in the fabric. 
> > Signed-off-by: Doron Shoham > --- > infiniband-diags/Makefile.am | 4 +- > infiniband-diags/configure.in | 1 + > infiniband-diags/man/ibcheckroutes.8 | 46 ++++++++++ > infiniband-diags/scripts/ibcheckroutes.in | 138 +++++++++++++++++++++++++++++ > 4 files changed, 187 insertions(+), 2 deletions(-) > create mode 100644 infiniband-diags/man/ibcheckroutes.8 > create mode 100644 infiniband-diags/scripts/ibcheckroutes.in > > diff --git a/infiniband-diags/Makefile.am b/infiniband-diags/Makefile.am > index 1cdb60e..57363c4 100644 > --- a/infiniband-diags/Makefile.am > +++ b/infiniband-diags/Makefile.am > @@ -33,7 +33,7 @@ sbin_SCRIPTS = scripts/ibcheckerrs scripts/ibchecknet scripts/ibchecknode \ > scripts/iblinkinfo.pl scripts/ibprintswitch.pl \ > scripts/ibprintca.pl scripts/ibprintrt.pl \ > scripts/ibfindnodesusing.pl scripts/ibidsverify.pl \ > - scripts/check_lft_balance.pl > + scripts/check_lft_balance.pl scripts/ibcheckroutes > > noinst_LIBRARIES = libcommon.a > > @@ -76,7 +76,7 @@ man_MANS = man/ibaddr.8 man/ibcheckerrors.8 man/ibcheckerrs.8 \ > man/ibprintswitch.8 man/ibprintca.8 man/ibfindnodesusing.8 \ > man/ibdatacounts.8 man/ibdatacounters.8 \ > man/ibrouters.8 man/ibprintrt.8 man/ibidsverify.8 \ > - man/check_lft_balance.8 > + man/check_lft_balance.8 man/ibcheckroutes.8 > > BUILT_SOURCES = ibdiag_version > ibdiag_version: > diff --git a/infiniband-diags/configure.in b/infiniband-diags/configure.in > index 3ef35cc..aa178c5 100644 > --- a/infiniband-diags/configure.in > +++ b/infiniband-diags/configure.in > @@ -158,6 +158,7 @@ AC_CONFIG_FILES([\ > scripts/ibcheckportwidth \ > scripts/ibcheckstate \ > scripts/ibcheckwidth \ > + scripts/ibcheckroutes \ > scripts/ibclearcounters \ > scripts/ibclearerrors \ > scripts/ibdatacounts \ > diff --git a/infiniband-diags/man/ibcheckroutes.8 b/infiniband-diags/man/ibcheckroutes.8 > new file mode 100644 > index 0000000..fe6f0d6 > --- /dev/null > +++ b/infiniband-diags/man/ibcheckroutes.8 > @@ -0,0 +1,46 @@ > +.TH 
IBCHECKROUTES 8 "September 10, 2009" "OpenIB" "OpenIB Diagnostics" > + > +.SH NAME > +ibcheckroutes \- validate routes between all hosts in fabric > + > +.SH SYNOPSIS > +.B ibcheckroutes > +[\-l] [\-s] [\-c] [\-n topology-file ] [\-h] [\-N] [\-b] [\-e] [\-C ca_name] [\-P ca_port] [\-t(imeout) timeout_ms] > + > +.SH DESCRIPTION > +.PP > +ibcheckroutes is a script which can use a full topology file that was created by ibnetdiscover or > +scans the subnet. Then it validates routes between all leaf switches, switches or CAs in the fabric. > + > +.SH OPTIONS > +.PP > +\-n Use topology-file. > +.PP > +\-l Check routes between all leaf switches. > +.PP > +\-s Check routes between all switches. > +.PP > +\-c Check routes between all CAs. > +.PP > +\-h Show help. > +.PP > +\-N Use mono rather than color mode. > +.PP > +\-b Suppress output. > +.PP > +\-e Show errors only. > +.PP > +\-C Use the specified ca_name. > +.PP > +\-P Use the specified ca_port. > +.PP > +\-t Override the default timeout for the solicited mads. > + > +.SH SEE ALSO > +.BR ibnetdiscover(8), > +.BR ibtracert(8) > + > +.SH AUTHOR > +.TP > +Doron Shoham > +.RI < dorons at voltaire.com > > diff --git a/infiniband-diags/scripts/ibcheckroutes.in b/infiniband-diags/scripts/ibcheckroutes.in > new file mode 100644 > index 0000000..c7dd191 > --- /dev/null > +++ b/infiniband-diags/scripts/ibcheckroutes.in > @@ -0,0 +1,138 @@ > +#!/bin/sh By using '/bin/sh' the script is declared as 'sh' compatible, but below we can find that 'bash' extensions are used intensively. 
> + > +IBPATH=${IBPATH:- at IBSCRIPTPATH@} > + > +function usage() { > + echo -e Usage: `basename $0` "[-l] [-s] [-c] [-h] [-N] [-b] [-e] [-n topology-file ] \ > +[-C ca_name] [-P ca_port] [-t(imeout) timeout_ms]" > + echo -e " Validate routes between all leaf switches, switches or CAs in the fabric" > + echo -e " -n - Use topology-file" > + echo -e " -l - Check routes between all leaf switches" > + echo -e " -s - Check routes between all switches" > + echo -e " -c - Check routes between all CAs" > + echo -e " -h - Show help" > + echo -e " -N - Use mono rather than color mode" > + echo -e " -b - Suppress output" > + echo -e " -e - Show errors only" > + echo -e " -C - Use the specified ca_name" > + echo -e " -P - Use the specified ca_port" > + echo -e " -t - Override the default timeout for the solicited mads" > + exit -1 > +} > + > +function user_abort() { > + echo "Aborted" > + exit 1 > +} > + > +function green() { > + if [ "$bw" = "yes" ]; then > + printf "${res_col}[OK]\n" $1 > + return > + fi > + printf "\033[1;032m${res_col}[OK]\033[0;39m\n" $1 > +} > + > +function red() { > + if [ "$bw" = "yes" ]; then > + printf "${res_col}[FAILED]\n" "$1" > + return > + fi > + printf "\033[31m${res_col}[FAILED]\033[0m\n" "$1" > +} > + > +trap user_abort SIGINT SIGTERM > + > +bw="" > +brief=0 > +error=0 > +ca_info="" > +st=0 > +method="leaf" > +topofile=/tmp/net > +discover=1 > +res_col="%-20.20s" > + > +function get_opts() { > + while getopts P:C:t:n:beNhlsc o; do > + case "$o" in > + n) > + topofile="$OPTARG" > + discover=0 > + ;; > + l) > + method="leaf" > + ;; > + s) > + method="sw" > + ;; > + c) > + method="ca" > + ;; > + h) > + usage > + ;; > + N) > + bw="yes" > + ;; > + b) > + brief=1 > + ;; > + e) > + error=1 > + ;; > + P | C | t | timeout) > + ca_info="$ca_info -$o $OPTARG" > + ;; > + *) > + usage > + ;; > + esac > + done > +} > + > +get_opts $* > + > +if [ $discover -eq 1 ]; then > + $IBPATH/ibnetdiscover $ca_info > $topofile > +fi > + > +# find LIDs to check > 
+case $method in > +leaf) > + [ $brief -eq 0 ] && echo -e "Checking routes between all Leaf Switches" > + LIDS=($(awk '/# lid /{a[$(NF-1)]=$(NF-1)} END{for(v in a) if (v!=0) print v}' $topofile)) This '/# lid /' match expression as well as using (NF - 1) makes your script *hardly* dependent from ibnetdiscover output format, for example if more information will be added in this comment line it will likely break your things. > + ;; > +sw) > + [ $brief -eq 0 ] && echo -e "Checking routes between all Switches" > + LIDS=($(awk '/^Switch/ {a[$(NF-2)]=$(NF-2)} END{for(v in a) if (v!=0) print v}' $topofile)) > + ;; > +ca) > + [ $brief -eq 0 ] && echo -e "Checking routes between all CAs" > + LIDS=($(awk '/# lid /{lmc=$7; e=2^lmc+$5; for(i=$5; i + ;; Ditto. Also could you format the code in more friendly/readable C-like form? Sasha > +esac > + > +# number of LIDs > +N=${#LIDS[@]} > + > +if [ $N -lt 2 ]; then > + [ $brief -eq 0 ] && echo "Error: found single node" > + exit 0 > +fi > + > +# check routes > +[ $brief -eq 0 ] && echo -e "Checking route between:\nSource lid --> Destination lid" > +for((s=0; s + for ((d=s+1; d + $IBPATH/ibtracert $ca_info ${LIDS[$s]} ${LIDS[$d]} > /dev/null > + if [ $? -eq 0 ]; then > + [ $brief -eq 0 ] && [ $error -eq 0 ] && green "${LIDS[$s]}-->${LIDS[$d]}" > + else > + [ $brief -eq 0 ] && red "${LIDS[$s]}-->${LIDS[$d]}" > + st=1 > + fi > + done > +done > + > +exit $st > -- > 1.5.4 > From sashak at voltaire.com Tue Sep 22 12:43:31 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 22 Sep 2009 22:43:31 +0300 Subject: [ofa-general] [PATCH] infiniband-diags/scripts: Add ibcheckroutes to scripts In-Reply-To: References: <4AA8E97E.1090109@voltaire.com> <4AACA572.2000603@voltaire.com> <4AAE53EA.1030009@gmail.com> Message-ID: <20090922194331.GH24398@me> On 10:41 Mon 14 Sep , Hal Rosenstock wrote: > > How are the leaf switches determined (from core switches) in the > ibnetdiscover output ? 
Using this: +# find all leaf switches LIDs +LIDS=($(awk '/# lid /{a[$(NF-1)]=$(NF-1)} END{for(v in a) print v}' $topofile)) It parses out remote switch lids in CA's port connection lines. Sasha From jaschut at sandia.gov Tue Sep 22 13:29:03 2009 From: jaschut at sandia.gov (Jim Schutt) Date: Tue, 22 Sep 2009 14:29:03 -0600 Subject: [ofa-general] Re: [PATCH 1/2] opensm: avoid LASH use-after-free when switch is deleted from fabric. In-Reply-To: <20090922185014.GF24398@me> References: <1251486496-24812-1-git-send-email-jaschut@sandia.gov> <1251486496-24812-2-git-send-email-jaschut@sandia.gov> <20090922185014.GF24398@me> Message-ID: <1253651343.4776.1125.camel@sale659.sandia.gov> Hi Sasha, On Tue, 2009-09-22 at 12:50 -0600, Sasha Khapyorsky wrote: > Hi Jim, > > On 13:08 Fri 28 Aug , Jim Schutt wrote: > > When LASH is run against ibsim, valgrind reports the following > > (on x86_64) after a switch is removed from the fabric: > > > > ==15699== Invalid write of size 8 > > ==15699== at 0x45FD8A: switch_delete (osm_ucast_lash.c:648) > > ==15699== by 0x461483: lash_cleanup (osm_ucast_lash.c:1123) > > ==15699== by 0x461848: lash_process (osm_ucast_lash.c:1230) > > ==15699== by 0x45C043: ucast_mgr_route (osm_ucast_mgr.c:1016) > > ==15699== by 0x45C1A0: osm_ucast_mgr_process (osm_ucast_mgr.c:1057) > > ==15699== by 0x44F11B: do_sweep (osm_state_mgr.c:1283) > > ==15699== by 0x44F539: osm_state_mgr_process (osm_state_mgr.c:1398) > > ==15699== by 0x447296: sm_process (osm_sm.c:90) > > ==15699== by 0x4473FE: sm_sweeper (osm_sm.c:130) > > ==15699== by 0x5023505: __cl_thread_wrapper (cl_thread.c:57) > > ==15699== by 0x37AC006366: start_thread (in /lib64/libpthread-2.5.so) > > ==15699== by 0x37AB4D30AC: clone (in /lib64/libc-2.5.so) > > ==15699== Address 0x9B28198 is 152 bytes inside a block of size 160 free'd > > ==15699== at 0x4A0541E: free (vg_replace_malloc.c:233) > > ==15699== by 0x453866: osm_switch_delete (osm_switch.c:97) > > ==15699== by 0x4116AA: drop_mgr_remove_switch 
(osm_drop_mgr.c:290) > > ==15699== by 0x411820: drop_mgr_process_node (osm_drop_mgr.c:339) > > ==15699== by 0x411D0C: osm_drop_mgr_process (osm_drop_mgr.c:465) > > ==15699== by 0x44EF97: do_sweep (osm_state_mgr.c:1231) > > ==15699== by 0x44F539: osm_state_mgr_process (osm_state_mgr.c:1398) > > ==15699== by 0x447296: sm_process (osm_sm.c:90) > > ==15699== by 0x4473FE: sm_sweeper (osm_sm.c:130) > > ==15699== by 0x5023505: __cl_thread_wrapper (cl_thread.c:57) > > ==15699== by 0x37AC006366: start_thread (in /lib64/libpthread-2.5.so) > > ==15699== by 0x37AB4D30AC: clone (in /lib64/libc-2.5.so) > > > > The root cause is that in order to perform SL lookup for path record > > queries, LASH needs to keep persistent data between calls to the > > routing engine. > > > > LASH uses the osm_switch_t:priv member to speed lookup of the LASH > > switch_t objects it needs to perform SL lookup, and has a corresponding > > switch_t:p_sw member to point to the corresponding osm_switch_t object. > > > > When a switch is deleted from the fabric, the switch_t:p_sw value becomes > > invalid, but LASH's switch_delete() uses it to clear the corresponding > > osm_switch_t:priv value. > > Ok. I see the issue. This 'p_sw->priv = NULL' line was not in the > original "priv" introduction code, but was added by mistake (AFAIR for > "for sure" reason :)) by some subsequent patch. > > Why to not fix this by just removing this not actually needed statement? Hmmm, I suppose that would fix this particular issue just fine, and is a lot less code ;) But consider this background info that I didn't include in the patch description: I'm working on another routing engine that also uses osm_switch_t:priv to point to data that persists between calls to the routing engine, as LASH does. And like LASH my objects have a pointer to the corresponding osm_switch_t. 
Since trying to implement this engine was my first experience with opensm, it checks these links before using them by making sure that the osm_switch_t my object references points back to my object, because I'm paranoid. And my engine doesn't overwrite pointers it expects to be NULL, because I'm really paranoid. So under circumstances where I had two routing engines configured, if my engine failed over to LASH because of some problem caused by downing a switch in the fabric, then routing reverted back to my engine when the problem cleared up, non-NULL osm_switch_t:priv values would keep my engine from working. So I came up with this priv_release() business to provide a general way for the opensm core to clean up after unexpected behavior of a routing engine. In the event you remove the 'p_sw->priv = NULL' line as the fix to the use-after-free issue, and I get my routing engine into good enough shape to submit, should I resubmit this patch too, or should I be less paranoid and remove the extra checks in my engine? Thanks for taking a look. -- Jim > > Sasha > > > > > Solve this problem by adding a priv_release function pointer that > > is set when osm_switch_t:priv is set. This allows the opensm core to > > clean up after any routing engine that is using priv to access > > persistent data (LASH seems to be the only one so far), without > > knowing the details of how to do so. > > > > When multiple routing engines are configured, it also allows a routing > > engine using osm_switch_t:priv to clean up if some other routing engine > > using priv fails in an unexpected way. > > > > With this addition, the rules for using osm_switch_t:priv become: > > 1) Never assign to priv without also assigning to priv_release. > > 2) Always use priv_release() before assigning to priv; this > > prevents memory issues due to unexpected errors in a > > routing engine using priv. > > 3) Always use priv_release() to clean up after a use of priv. 
> > > > Since updn uses osm_switch_t:priv, fix it up to follow the above > > rules as well, for consistency. > > > > Signed-off-by: Jim Schutt > > --- > > opensm/include/opensm/osm_switch.h | 1 + > > opensm/opensm/osm_switch.c | 2 ++ > > opensm/opensm/osm_ucast_lash.c | 24 ++++++++++++++++++++---- > > opensm/opensm/osm_ucast_updn.c | 15 +++++++++++---- > > 4 files changed, 34 insertions(+), 8 deletions(-) > > > > diff --git a/opensm/include/opensm/osm_switch.h b/opensm/include/opensm/osm_switch.h > > index 7ce28c5..d48f8c6 100644 > > --- a/opensm/include/opensm/osm_switch.h > > +++ b/opensm/include/opensm/osm_switch.h > > @@ -106,6 +106,7 @@ typedef struct osm_switch { > > unsigned endport_links; > > unsigned need_update; > > void *priv; > > + void (*priv_release)(struct osm_switch *p_sw); > > } osm_switch_t; > > /* > > * FIELDS > > diff --git a/opensm/opensm/osm_switch.c b/opensm/opensm/osm_switch.c > > index ce1ca63..fbf3973 100644 > > --- a/opensm/opensm/osm_switch.c > > +++ b/opensm/opensm/osm_switch.c > > @@ -94,6 +94,8 @@ void osm_switch_delete(IN OUT osm_switch_t ** const pp_sw) > > free(p_sw->hops[i]); > > free(p_sw->hops); > > } > > + if (p_sw->priv_release) > > + p_sw->priv_release(p_sw); > > free(*pp_sw); > > *pp_sw = NULL; > > } > > diff --git a/opensm/opensm/osm_ucast_lash.c b/opensm/opensm/osm_ucast_lash.c > > index 0a567b3..ceae7d8 100644 > > --- a/opensm/opensm/osm_ucast_lash.c > > +++ b/opensm/opensm/osm_ucast_lash.c > > @@ -603,6 +603,17 @@ static int balance_virtual_lanes(lash_t * p_lash, unsigned lanes_needed) > > return 0; > > } > > > > +static void lash_switch_priv_release(osm_switch_t *osm_sw) > > +{ > > + switch_t *sw = osm_sw->priv; > > + > > + osm_sw->priv_release = NULL; > > + osm_sw->priv = NULL; > > + > > + if (sw && sw->p_sw == osm_sw) > > + sw->p_sw = NULL; > > +} > > + > > static switch_t *switch_create(lash_t * p_lash, unsigned id, osm_switch_t * p_sw) > > { > > unsigned num_switches = p_lash->num_switches; > > @@ -628,8 +639,12 @@ 
static switch_t *switch_create(lash_t * p_lash, unsigned id, osm_switch_t * p_sw > > } > > > > sw->p_sw = p_sw; > > - if (p_sw) > > + if (p_sw) { > > + if (p_sw->priv_release) > > + p_sw->priv_release(p_sw); > > p_sw->priv = sw; > > + p_sw->priv_release = lash_switch_priv_release; > > + } > > > > if (osm_mesh_node_create(p_lash, sw)) { > > free(sw->dij_channels); > > @@ -644,8 +659,8 @@ static void switch_delete(lash_t *p_lash, switch_t * sw) > > { > > if (sw->dij_channels) > > free(sw->dij_channels); > > - if (sw->p_sw) > > - sw->p_sw->priv = NULL; > > + if (sw->p_sw && sw->p_sw->priv_release) > > + sw->p_sw->priv_release(sw->p_sw); > > free(sw); > > } > > > > @@ -1113,7 +1128,8 @@ static void lash_cleanup(lash_t * p_lash) > > while (p_next_sw != (osm_switch_t *) cl_qmap_end(&p_subn->sw_guid_tbl)) { > > p_sw = p_next_sw; > > p_next_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item); > > - p_sw->priv = NULL; > > + if (p_sw->priv_release) > > + p_sw->priv_release(p_sw); > > } > > > > if (p_lash->switches) { > > diff --git a/opensm/opensm/osm_ucast_updn.c b/opensm/opensm/osm_ucast_updn.c > > index bb9ccda..dc5f459 100644 > > --- a/opensm/opensm/osm_ucast_updn.c > > +++ b/opensm/opensm/osm_ucast_updn.c > > @@ -404,10 +404,13 @@ static struct updn_node *create_updn_node(osm_switch_t * sw) > > return u; > > } > > > > -static void delete_updn_node(struct updn_node *u) > > +static void updn_sw_priv_release(osm_switch_t *sw) > > { > > - u->sw->priv = NULL; > > - free(u); > > + if (sw->priv) > > + free(sw->priv); > > + > > + sw->priv_release = NULL; > > + sw->priv = NULL; > > } > > > > /********************************************************************** > > @@ -589,6 +592,8 @@ static int updn_lid_matrices(void *ctx) > > item != cl_qmap_end(&p_updn->p_osm->subn.sw_guid_tbl); > > item = cl_qmap_next(item)) { > > p_sw = (osm_switch_t *)item; > > + if (p_sw->priv_release) > > + p_sw->priv_release(p_sw); > > p_sw->priv = create_updn_node(p_sw); > > if (!p_sw->priv) { > > 
OSM_LOG(&(p_updn->p_osm->log), OSM_LOG_ERROR, "ERR AA0C: " > > @@ -596,6 +601,7 @@ static int updn_lid_matrices(void *ctx) > > OSM_LOG_EXIT(&p_updn->p_osm->log); > > return -1; > > } > > + p_sw->priv_release = updn_sw_priv_release; > > } > > > > /* First setup root nodes */ > > @@ -653,7 +659,8 @@ static int updn_lid_matrices(void *ctx) > > item != cl_qmap_end(&p_updn->p_osm->subn.sw_guid_tbl); > > item = cl_qmap_next(item)) { > > p_sw = (osm_switch_t *) item; > > - delete_updn_node(p_sw->priv); > > + if (p_sw->priv_release) > > + p_sw->priv_release(p_sw); > > } > > > > OSM_LOG_EXIT(&p_updn->p_osm->log); > > -- > > 1.5.6.GIT > > > > > From sashak at voltaire.com Tue Sep 22 13:33:56 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 22 Sep 2009 23:33:56 +0300 Subject: [ofa-general] Re: osm_link_mgr.c:link_mgr_get_smsl question In-Reply-To: References: <20090829204508.GH21238@me> <20090830120011.GG21909@me> Message-ID: <20090922203356.GK24398@me> On 07:32 Thu 17 Sep , Hal Rosenstock wrote: > > Is that (lids in place) always the case ? I don't see immediately how it could be not. > What about if the sets of PortInfo > for LID fail. Set can fail, but internal OpenSM port_lid_tbl will be up to date. Sasha From hal.rosenstock at gmail.com Tue Sep 22 13:44:44 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Tue, 22 Sep 2009 16:44:44 -0400 Subject: [ofa-general] Re: osm_link_mgr.c:link_mgr_get_smsl question In-Reply-To: <20090922203356.GK24398@me> References: <20090829204508.GH21238@me> <20090830120011.GG21909@me> <20090922203356.GK24398@me> Message-ID: On Tue, Sep 22, 2009 at 4:33 PM, Sasha Khapyorsky wrote: > On 07:32 Thu 17 Sep , Hal Rosenstock wrote: > > > > Is that (lids in place) always the case ? > > I don't see immediately how it could be not. > > > What about if the sets of PortInfo > > for LID fail. > > Set can fail, but internal OpenSM port_lid_tbl will be up to date. 
> Yeah, the port lid table will be OK but port's PortInfo won't (so base LID/LMC will be broken) for this scenario but it wouldn't affect this code in this way. So I don't have any theories as to how this could occur. Do you ? -- Hal > > Sasha > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jenos at ncsa.uiuc.edu Tue Sep 22 14:30:09 2009 From: jenos at ncsa.uiuc.edu (Jeremy Enos) Date: Tue, 22 Sep 2009 16:30:09 -0500 Subject: [ofa-general] Fedora 10 OFED support plans In-Reply-To: <200909131229.29887.jackm@dev.mellanox.co.il> References: <4A8E4854.2060909@ncsa.uiuc.edu> <20090910193454.GB7552@obsidianresearch.com> <4AA95AF8.9020905@ncsa.uiuc.edu> <200909131229.29887.jackm@dev.mellanox.co.il> Message-ID: <4AB941E1.90304@ncsa.uiuc.edu> An HTML attachment was scrubbed... URL: From sean.hefty at intel.com Tue Sep 22 15:27:23 2009 From: sean.hefty at intel.com (Sean Hefty) Date: Tue, 22 Sep 2009 15:27:23 -0700 Subject: [ofa-general] RE: [PATCH/RFC] IB/mad: Fix lock-lock-timer deadlock in RMPP code In-Reply-To: References: Message-ID: <9DA1536B0B4943E7BC52280C977F1D23@amr.corp.intel.com> >OK so how about something like this? Just hold the lock to mark the >items on the list as being canceled, and then actually cancel the >delayed work without the lock. I think this doesn't leave any races or >holes where the delayed work can mess up the cancel. This looks good to me. Thanks for looking at this. Reviewed-by: Sean Hefty From rdreier at cisco.com Tue Sep 22 16:15:46 2009 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 22 Sep 2009 16:15:46 -0700 Subject: [ofa-general] Re: [PATCH] mlx4: confiugre cache line size In-Reply-To: <20090922185415.GA13020@mtls03> (Eli Cohen's message of "Tue, 22 Sep 2009 21:54:15 +0300") References: <20090916110302.GA32767@mtls03> <20090922185415.GA13020@mtls03> Message-ID: > I agree with you on both comments. Would you like me to resend or will > you make the necessary changes? please resend, thanks. 
From hnrose at comcast.net Tue Sep 22 16:59:36 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Tue, 22 Sep 2009 19:59:36 -0400 Subject: [ofa-general] [PATCH] infiniband-diags/ibportstate.c: Eliminate uninitialized variable compile warning Message-ID: <20090922235936.GA6016@comcast.net> Signed-off-by: Hal Rosenstock --- diff --git a/infiniband-diags/src/ibportstate.c b/infiniband-diags/src/ibportstate.c index 6fb97a8..55e1dd5 100644 --- a/infiniband-diags/src/ibportstate.c +++ b/infiniband-diags/src/ibportstate.c @@ -208,7 +208,7 @@ int main(int argc, char **argv) int state, physstate, lwe, lws, lwa, lse, lss, lsa; int peerlocalportnum, peerlwe, peerlws, peerlwa, peerlse, peerlss, peerlsa; - int width, peerwidth, peerspeed; + int width = 255, peerwidth, peerspeed; uint8_t data[IB_SMP_DATA_SIZE]; ib_portid_t peerportid = { 0 }; int portnum = 0; From sashak at voltaire.com Tue Sep 22 19:53:20 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Wed, 23 Sep 2009 05:53:20 +0300 Subject: [ofa-general] Re: osm_link_mgr.c:link_mgr_get_smsl question In-Reply-To: References: <20090829204508.GH21238@me> <20090830120011.GG21909@me> <20090922203356.GK24398@me> Message-ID: <20090923025320.GM24398@me> On 16:44 Tue 22 Sep , Hal Rosenstock wrote: > > Yeah, the port lid table will be OK but port's PortInfo won't (so base > LID/LMC will be broken) for this scenario but it wouldn't affect this code > in this way. So I don't have any theories as to how this could occur. Do you > ? I don't, but if you are able to reproduce such case you may try to debug this. 
Sasha From karun.sharma at qlogic.com Tue Sep 22 22:52:42 2009 From: karun.sharma at qlogic.com (Karun Sharma) Date: Wed, 23 Sep 2009 00:52:42 -0500 Subject: [ofa-general] Problem while running ib tests In-Reply-To: <20090922085620.52b2ac17.weiny2@llnl.gov> References: <4C2744E8AD2982428C5BFE523DF8CDCB45E91CD86B@MNEXMB1.qlogic.org> <20090922085620.52b2ac17.weiny2@llnl.gov> Message-ID: <4C2744E8AD2982428C5BFE523DF8CDCB45E91CD937@MNEXMB1.qlogic.org> I am referring to ib_send_bw, ib_read_bw, ib_write_bw etc. Usage: ib_read_bw start a server and wait for connection ib_read_bw connect to server at Options: -p, --port= listen on/connect to port (default 18515) -d, --ib-dev= use IB device (default first device found) -i, --ib-port= use port of IB device (default 1) -m, --mtu= mtu size (256 - 4096. default for hermon is 2048) -o, --outs= num of outstanding read/atom(default 4) -s, --size= size of message to exchange (default 65536) -a, --all Run sizes from 2 till 2^23 -t, --tx-depth= size of tx queue (default 100) -n, --iters= number of exchanges (at least 2, default 1000) -u, --qp-timeout= QP timeout, timeout value is 4 usec * 2 ^(timeout), default 14 -S, --sl= SL (default 0) -x, --gid-index= test uses GID with GID index taken from command line (for RDMAoE index should be 0) -b, --bidirectional measure bidirectional bandwidth (default unidirectional) -V, --version display version number -e, --events sleep on CQ events (default poll) -F, --CPU-freq do not fail even if cpufreq_ondemand module is loaded Regards Karun -----Original Message----- From: Ira Weiny [mailto:weiny2 at llnl.gov] Sent: Tuesday, September 22, 2009 9:26 PM To: Karun Sharma Cc: Sneha Mistry; general at lists.openfabrics.org Subject: Re: [ofa-general] Problem while running ib tests On Tue, 22 Sep 2009 04:54:22 -0500 Karun Sharma wrote: > Please use "-F" option while running the tests. It will ignore the "Conflicting CPU frequency" errors. 
You will still see these messages on your screen, but with "-F", you will also see the results. Which commands are you referring to with the "-F" option? I just did a pull from git://git.openfabrics.org/~mst/perftest.git and I don't see a -F for the commands there. Have these tools moved? Or are you speaking of other tools? I also run into this problem and I have resorted to turning off cpuspeed. Thanks, Ira > > Regards > Karun > > -----Original Message----- > From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Sneha Mistry > Sent: Tuesday, September 22, 2009 2:35 PM > To: general at lists.openfabrics.org > Subject: [ofa-general] Problem while running ib tests > > Hi, > > I have two Dual port HCAa card.Which I have installed in > same PC. > > I am using OpenSuse 10.3 and installed OFED 1.4. > > If I try to run any IB bandwidth test or latency test it end us with warning > "Conflicting CPU frequency values detected: 2394.000000 != 1596.000000". > > Is it an OFED related problem or Linux kernel problem. > > HCA are connected back to back and output of ibstat is as given below. 
> --------------------------------------------------------------------------------------------------------- > CA 'mthca0' > CA type: MT25208 (MT23108 compat mode) > Number of ports: 2 > Firmware version: 4.8.200 > Hardware version: 20 > Node GUID: 0x0002c90200283734 > System image GUID: 0x0002c90200283737 > Port 1: > State: Active > Physical state: LinkUp > Rate: 20 > Base lid: 1 > LMC: 0 > SM lid: 1 > Capability mask: 0x02510a6a > Port GUID: 0x0002c90200283735 > Port 2: > State: Initializing > Physical state: LinkUp > Rate: 20 > Base lid: 0 > LMC: 0 > SM lid: 0 > Capability mask: 0x02510a68 > Port GUID: 0x0002c90200283736 > CA 'mthca1' > CA type: MT25208 (MT23108 compat mode) > Number of ports: 2 > Firmware version: 4.8.200 > Hardware version: 20 > Node GUID: 0x0002c90200283730 > System image GUID: 0x0002c90200283733 > Port 1: > State: Initializing > Physical state: LinkUp > Rate: 20 > Base lid: 0 > LMC: 0 > SM lid: 0 > Capability mask: 0x02510a68 > Port GUID: 0x0002c90200283731 > Port 2: > State: Active > Physical state: LinkUp > Rate: 20 > Base lid: 3 > LMC: 0 > SM lid: 1 > Capability mask: 0x02510a68 > Port GUID: 0x0002c90200283732 > -------------------------------------------------------------------------------------------------------------------------------------- > > Even ibibnetdiscover is detecting one hope. > So there is nothing wrong if I try to test 2 HCA connected back to back. > > Please tell me what can be done for running ib tests. 
> > Thanks, > SGM > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://*lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://*openib.org/mailman/listinfo/openib-general > -- Ira Weiny Math Programmer/Computer Scientist Lawrence Livermore National Lab 925-423-8008 weiny2 at llnl.gov From jackm at dev.mellanox.co.il Tue Sep 22 23:33:51 2009 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Wed, 23 Sep 2009 09:33:51 +0300 Subject: [ofa-general] Fedora 11, kernel 2.6.31 In-Reply-To: References: Message-ID: <200909230933.51858.jackm@dev.mellanox.co.il> On Tuesday 22 September 2009 21:58, Nathan Stratton wrote: > > Having an issue with getting verbs working on 2.6.31. I am running Fedora > 11 with 2.6.31 and 1.1.2-0.1.gb00dc7d libibverbs. Everything looks great > until I run ibv_srq_pingpong to the server. It shows local/remote address > a bunch of times and then freezes. > You do have the latest firmware installed for your HCA. You are using the OFED user libraries with a non-OFED kernel. While I am surprised that it is giving you problems with libmthca, this combination has not been tested. You might try using the FC11 native userspace libs (libibverbs, libmthca), which you can find under http://cvs.fedoraproject.org/viewvc/F-11/ -Jack > I wanted to try OFED, but it does not work with 2.6.31.
:( > > [root at xen1 src]# lsmod > Module Size Used by > ib_ucm 13752 0 > rdma_ucm 12112 0 > ib_uverbs 32256 2 ib_ucm,rdma_ucm > 8021q 21200 0 > bonding 87140 0 > fuse 60016 0 > bridge 40488 0 > stp 2588 1 bridge > llc 6240 2 bridge,stp > ib_ipoib 68880 0 > igb 80620 0 > ib_mthca 123700 0 > 3w_9xxx 33092 0 > rng_core 4688 0 > > [root at xen1 src]# ibv_devices > device node GUID > ------ ---------------- > mthca0 0005ad00000327e8 > [root at xen1 src]# ibv_devinfo > hca_id: mthca0 > fw_ver: 3.5.0 > node_guid: 0005:ad00:0003:27e8 > sys_image_guid: 0005:ad00:0100:d050 > vendor_id: 0x02c9 > vendor_part_id: 23108 > hw_ver: 0xA1 > board_id: MT_0270110001 > phys_port_cnt: 2 > port: 1 > state: PORT_ACTIVE (4) > max_mtu: 2048 (4) > active_mtu: 2048 (4) > sm_lid: 2 > port_lid: 11 > port_lmc: 0x00 > > port: 2 > state: PORT_DOWN (1) > max_mtu: 2048 (4) > active_mtu: 512 (2) > sm_lid: 0 > port_lid: 0 > port_lmc: 0x00 > > [root at xen1 src]# ibv_srq_pingpong 10.13.0.220 > local address: LID 0x000b, QPN 0x470406, PSN 0x697578 > local address: LID 0x000b, QPN 0x470407, PSN 0x4eb139 > local address: LID 0x000b, QPN 0x470408, PSN 0x41b959 > local address: LID 0x000b, QPN 0x470409, PSN 0x3e0cfd > local address: LID 0x000b, QPN 0x47040a, PSN 0x3aff03 > local address: LID 0x000b, QPN 0x47040b, PSN 0x212873 > local address: LID 0x000b, QPN 0x47040c, PSN 0xecd920 > local address: LID 0x000b, QPN 0x47040d, PSN 0xb9fb63 > local address: LID 0x000b, QPN 0x47040e, PSN 0xa76195 > local address: LID 0x000b, QPN 0x47040f, PSN 0x9c74bc > local address: LID 0x000b, QPN 0x470410, PSN 0x5644e6 > local address: LID 0x000b, QPN 0x470411, PSN 0x5b2db8 > local address: LID 0x000b, QPN 0x470412, PSN 0x18420e > local address: LID 0x000b, QPN 0x470413, PSN 0x6a2a6c > local address: LID 0x000b, QPN 0x470414, PSN 0xd70ff0 > local address: LID 0x000b, QPN 0x470415, PSN 0xe3ac2a > remote address: LID 0x0004, QPN 0x5d0406, PSN 0xd6f70e > remote address: LID 0x0004, QPN 0x5d0407, PSN 0x2d2ee0 > remote address: 
LID 0x0004, QPN 0x5d0408, PSN 0x4bb6aa > remote address: LID 0x0004, QPN 0x5d0409, PSN 0x3821b1 > remote address: LID 0x0004, QPN 0x5d040a, PSN 0xc91470 > remote address: LID 0x0004, QPN 0x5d040b, PSN 0xd2a912 > remote address: LID 0x0004, QPN 0x5d0418, PSN 0x19ea09 > remote address: LID 0x0004, QPN 0x5d0419, PSN 0x5df7ce > remote address: LID 0x0004, QPN 0x5d041a, PSN 0x5c705a > remote address: LID 0x0004, QPN 0x5d041b, PSN 0x0b2fd4 > remote address: LID 0x0004, QPN 0x5d041c, PSN 0x3d0ae8 > remote address: LID 0x0004, QPN 0x5d041d, PSN 0xe4d55b > remote address: LID 0x0004, QPN 0x5d041e, PSN 0x0b87ab > remote address: LID 0x0004, QPN 0x5d041f, PSN 0xbc4f7c > remote address: LID 0x0004, QPN 0x5d0420, PSN 0x66c48a > remote address: LID 0x0004, QPN 0x5d0421, PSN 0xd77a86 > > > > > ><> > Nathan Stratton CTO, BlinkMind, Inc. > nathan at robotics.net nathan at blinkmind.com > http://www.robotics.net http://www.blinkmind.com > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > From ofedrnicuser at yahoo.com Wed Sep 23 00:28:36 2009 From: ofedrnicuser at yahoo.com (Bill N) Date: Wed, 23 Sep 2009 00:28:36 -0700 (PDT) Subject: [ofa-general] how to enable MPA CRC on Neteffects cards Message-ID: <595875.50676.qm@web111202.mail.gq1.yahoo.com> Hi, We want to use the MPA markers on the Neteffect adapters (which is disabled by default by the iw_nes driver). How can we enable it? Few minor changes in the MPA negotiation in nes_cm.c. Apart from that, how do I tell hardware to perform the MPA marker insertion & removal? 
Regards, Bill N From ofedrnicuser at yahoo.com Wed Sep 23 00:32:43 2009 From: ofedrnicuser at yahoo.com (Bill N) Date: Wed, 23 Sep 2009 07:32:43 +0000 (GMT) Subject: [ofa-general] how to enable MPA Markers on Neteffects cards In-Reply-To: <595875.50676.qm@web111202.mail.gq1.yahoo.com> Message-ID: <944911.91983.qm@web111210.mail.gq1.yahoo.com> Sorry, I want to enable MPA markers on Neteffect cards. I messed up the subject line. Parav --- On Wed, 9/23/09, Bill N wrote: > From: Bill N > Subject: [ofa-general] how to enable MPA CRC on Neteffects cards > To: "OFED General" > Date: Wednesday, September 23, 2009, 7:28 AM > Hi, > > We want to use the MPA markers on the Neteffect adapters > (which is disabled by default by the iw_nes driver). > > How can we enable it? > > Few minor changes in the MPA negotiation in nes_cm.c. > Apart from that, how do I tell hardware to perform the MPA > marker insertion & removal? > > Regards, > Bill N > From eli at mellanox.co.il Wed Sep 23 00:36:58 2009 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 23 Sep 2009 10:36:58 +0300 Subject: [ofa-general] [PATCH v2] mlx4: configure cache line size Message-ID: <20090923073658.GA23252@mtls03> ConnectX can work more efficiently if the CPU cache line size is configured to it at INIT_HCA. This patch configures the CPU cache line size. Signed-off-by: Eli Cohen --- As per Roland's comments, the following changes were made: 1. Remove #ifdef cache_line_size and include linux/cache.h 2.
Assume cache line size is a power of 2 and use ilog2 instead of order_base_2 drivers/net/mlx4/fw.c | 5 +++++ 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/drivers/net/mlx4/fw.c b/drivers/net/mlx4/fw.c index cee199c..3c16602 100644 --- a/drivers/net/mlx4/fw.c +++ b/drivers/net/mlx4/fw.c @@ -33,6 +33,7 @@ */ #include +#include #include "fw.h" #include "icm.h" @@ -698,6 +699,7 @@ int mlx4_INIT_HCA(struct mlx4_dev *dev, struct mlx4_init_hca_param *param) #define INIT_HCA_IN_SIZE 0x200 #define INIT_HCA_VERSION_OFFSET 0x000 #define INIT_HCA_VERSION 2 +#define INIT_HCA_CACHELINE_SZ_OFFSET 0x0e #define INIT_HCA_FLAGS_OFFSET 0x014 #define INIT_HCA_QPC_OFFSET 0x020 #define INIT_HCA_QPC_BASE_OFFSET (INIT_HCA_QPC_OFFSET + 0x10) @@ -735,6 +737,9 @@ int mlx4_INIT_HCA(struct mlx4_dev *dev, struct mlx4_init_hca_param *param) *((u8 *) mailbox->buf + INIT_HCA_VERSION_OFFSET) = INIT_HCA_VERSION; + *((u8 *) mailbox->buf + INIT_HCA_CACHELINE_SZ_OFFSET) = + (ilog2(cache_line_size()) - 4) << 5; + #if defined(__LITTLE_ENDIAN) *(inbox + INIT_HCA_FLAGS_OFFSET / 4) &= ~cpu_to_be32(1 << 1); #elif defined(__BIG_ENDIAN) -- 1.6.4.3 From monis at Voltaire.COM Wed Sep 23 02:01:02 2009 From: monis at Voltaire.COM (Moni Shoua) Date: Wed, 23 Sep 2009 12:01:02 +0300 Subject: [ofa-general] Problem while running ib tests In-Reply-To: References: Message-ID: <4AB9E3CE.4030409@Voltaire.COM> Sneha Mistry wrote: > Hi, > > I have two Dual port HCAa card.Which I have installed in > same PC. > > I am using OpenSuse 10.3 and installed OFED 1.4. > > If I try to run any IB bandwidth test or latency test it end us with warning > "Conflicting CPU frequency values detected: 2394.000000 != 1596.000000". > > Is it an OFED related problem or Linux kernel problem. > > HCA are connected back to back and output of ibstat is as given below. One more option to solve it (besides the -F) is to disable the power saving of the CPU. 
In power-saving mode, not all CPUs run at the same clock frequency, and this is the reason for the conflict. Power saving is disabled in the BIOS; the setting is sometimes called economy mode or performance mode (the other side of the trade-off). From bart.vanassche at gmail.com Wed Sep 23 02:20:50 2009 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Wed, 23 Sep 2009 11:20:50 +0200 Subject: [ofa-general] Problem while running ib tests In-Reply-To: <4AB9E3CE.4030409@Voltaire.COM> References: <4AB9E3CE.4030409@Voltaire.COM> Message-ID: On Wed, Sep 23, 2009 at 11:01 AM, Moni Shoua wrote: > > Sneha Mistry wrote: > > I have two Dual port HCAa card.Which I have installed in > > same PC. > > > > I am using OpenSuse 10.3 and installed OFED 1.4. > > > > If I try to run any IB bandwidth test or latency test it end us with warning > > "Conflicting CPU frequency values detected: 2394.000000 != 1596.000000". > > > > Is it an OFED related problem or Linux kernel problem. > > > > HCA are connected back to back and output of ibstat is as given below. > > One more option to solve it (besides the -F) is to disable the power saving of the CPU. > In power-saving mode, not all CPUs run at the same clock frequency, and this is the reason for the conflict. > Power saving is disabled in the BIOS; the setting is sometimes called economy mode or performance mode (the other side of the trade-off). An alternative to configuring CPU frequency scaling via the BIOS is to modify the settings under /sys/devices/system/cpu/cpu0/cpufreq. This doesn't even require a reboot. Bart.
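The "Conflicting CPU frequency values detected" warning quoted in this thread comes down to per-CPU clock readings that disagree. Below is a minimal sketch of such a consistency check, assuming the per-CPU readings (in MHz) have already been collected into an array. `freqs_conflict` and its tolerance parameter are hypothetical names chosen for illustration; this is not the perftest implementation.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not taken from perftest): returns 1 if any
 * per-CPU reading differs from the first reading by more than
 * tol_mhz, and 0 otherwise. With fewer than two readings there is
 * nothing to compare, so the result is 0.
 */
int freqs_conflict(const double *mhz, size_t n, double tol_mhz)
{
    size_t i;

    assert(mhz != NULL || n == 0);

    for (i = 1; i < n; i++) {
        double diff = mhz[i] - mhz[0];

        if (diff < 0)
            diff = -diff;   /* absolute difference without libm */
        if (diff > tol_mhz)
            return 1;
    }
    return 0;
}
```

With the two values from the warning above (2394.0 and 1596.0) and a tolerance of 1 MHz, the helper reports a conflict. Once frequency scaling is turned off and all CPUs report the same value, it does not.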
From vlad at lists.openfabrics.org Wed Sep 23 03:13:45 2009 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Wed, 23 Sep 2009 03:13:45 -0700 (PDT) Subject: [ofa-general] ofa_1_5_kernel 20090923-0200 daily build status Message-ID: <20090923101345.7628AE62174@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git git_branch: ofed_kernel_1_5 Common build parameters: Passed: Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.26 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.27 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18-128.el5 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.27 Passed on x86_64 with linux-2.6.9-78.ELsmp Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.26 Passed on ia64 with linux-2.6.25 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Failed: Build failed on x86_64 with linux-2.6.18-164.el5 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration -fno-delete-null-pointer-checks -fwrapv -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 
-mno-3dnow -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(connection)" -D"KBUILD_MODNAME=KBUILD_STR(rds)" -c -o /home/vlad/tmp/ofa_1_5_kernel-20090923-0200_linux-2.6.18-164.el5_x86_64_check/net/rds/.tmp_connection.o /home/vlad/tmp/ofa_1_5_kernel-20090923-0200_linux-2.6.18-164.el5_x86_64_check/net/rds/connection.c /home/vlad/tmp/ofa_1_5_kernel-20090923-0200_linux-2.6.18-164.el5_x86_64_check/net/rds/connection.c: In function 'rds_conn_bucket': /home/vlad/tmp/ofa_1_5_kernel-20090923-0200_linux-2.6.18-164.el5_x86_64_check/net/rds/connection.c:56: warning: passing argument 1 of 'inet_ehashfn' makes integer from pointer without a cast /home/vlad/tmp/ofa_1_5_kernel-20090923-0200_linux-2.6.18-164.el5_x86_64_check/net/rds/connection.c:56: error: too many arguments to function 'inet_ehashfn' make[3]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090923-0200_linux-2.6.18-164.el5_x86_64_check/net/rds/connection.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090923-0200_linux-2.6.18-164.el5_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_5_kernel-20090923-0200_linux-2.6.18-164.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-164.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- From sashak at voltaire.com Wed Sep 23 03:47:34 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Wed, 23 Sep 2009 13:47:34 +0300 Subject: [ofa-general] Re: [PATCH] infiniband-diags/ibportstate.c: Eliminate uninitialized variable compile warning In-Reply-To: <20090922235936.GA6016@comcast.net> References: <20090922235936.GA6016@comcast.net> Message-ID: <20090923104734.GP24398@me> On 19:59 Tue 22 Sep , Hal Rosenstock wrote: > > Signed-off-by: Hal Rosenstock Applied. Thanks. 
Sasha From kalcher at kip.uni-heidelberg.de Wed Sep 23 03:54:29 2009 From: kalcher at kip.uni-heidelberg.de (Sebastian Kalcher) Date: Wed, 23 Sep 2009 12:54:29 +0200 Subject: [ofa-general] OFED interfering with Ethernet In-Reply-To: <20090920233813.GC22310@obsidianresearch.com> References: <20090919235701.14724oeyxigf9rqc@mail.kip.uni-heidelberg.de> <20090919230241.GB22310@obsidianresearch.com> <20090921011755.57846mh1c5eqopc8@mail.kip.uni-heidelberg.de> <20090920233813.GC22310@obsidianresearch.com> Message-ID: <4AB9FE65.4020405@kip.uni-heidelberg.de> Jason Gunthorpe wrote: > > Hmm, double weird.. > > bonding related perhaps? > Bonding was enabled on two eth devices at some point in time, but it was disabled before running with IB. I just found out, however, that passive LACP is still enabled in the HP switches. I will try to test without it ASAP. Sebastian From hal.rosenstock at gmail.com Wed Sep 23 04:00:33 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Wed, 23 Sep 2009 07:00:33 -0400 Subject: [ofa-general] Re: osm_link_mgr.c:link_mgr_get_smsl question In-Reply-To: <20090923025320.GM24398@me> References: <20090829204508.GH21238@me> <20090830120011.GG21909@me> <20090922203356.GK24398@me> <20090923025320.GM24398@me> Message-ID: On Tue, Sep 22, 2009 at 10:53 PM, Sasha Khapyorsky wrote: > On 16:44 Tue 22 Sep , Hal Rosenstock wrote: > > > > Yeah, the port lid table will be OK but port's PortInfo won't (so base > > LID/LMC will be broken) for this scenario but it wouldn't affect this code > > in this way. > Let me try this again... The port LID table is fine, but the lookup is done based on the LID in the received PortInfo, as it is the result of osm_physp_get_base_lid() (osm_link_mgr.c:link_mgr_get_smsl line 83). In the case of failed Sets, this is invalid, so LID 0 is used, and that's what causes the NULL p_src_port, which in turn causes the seg fault. So I'm back to: I can see two ways to fix this: 1. Replace with port GUID search 2.
Have osm_get_lash_sl handle NULL for p_src_port Maybe you see other ways to deal with this. Do you have a preferred approach ? -- Hal From sashak at voltaire.com Wed Sep 23 06:03:06 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Wed, 23 Sep 2009 16:03:06 +0300 Subject: [ofa-general] Re: osm_link_mgr.c:link_mgr_get_smsl question In-Reply-To: References: <20090829204508.GH21238@me> <20090830120011.GG21909@me> <20090922203356.GK24398@me> <20090923025320.GM24398@me> Message-ID: <20090923130306.GR24398@me> On 07:00 Wed 23 Sep , Hal Rosenstock wrote: > > Let me try this again... The port LID table is fine but the lookup is done > based on the LID in the received portInfo as it is the result of > osm_physp_get_base_lid() (osm_link_mgr.c:link_mgr_get_smsl line 83). Ok, makes sense. Assuming that the case is caused by a PortInfo Set failure, we can expect a new sweep: OpenSM will set the initialization errors flag. > In the > case of failed Sets, this is invalid so LID 0 is used and that's what causes > the NULL p_src_port which in turn causes the seg fault. > > So I'm back to: > > I can see two ways to fix this: > 1. Replace with port GUID search > 2. Have osm_get_lash_sl handle NULL for p_src_port > Maybe you see other ways to deal with this. > > Do you have a preferred approach ? Maybe, and I think that we discussed this already in this thread: setting SMSL is useless for a port whose LID is uninitialized, so something like: if (!p_src_port) return; or, to be even more detailed, if (!slid || !(p_src_port = osm_get_port_by_lid(&sm->p_subn, slid))) { OSM_LOG("print some verbose message\n"); return; } in link_mgr_get_smsl() should be good enough. Right?
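The guard proposed above can be sketched in isolation. This is a minimal, self-contained illustration using toy stand-in types: `toy_port`, `toy_subnet`, `toy_get_port_by_lid`, `toy_get_smsl` and the two `TOY_` constants are hypothetical names and are not OpenSM's data structures or API. It only shows the shape of the fix: a lookup that yields NULL for LID 0 (a LID that was never assigned), and a caller that falls back to a default SL rather than dereferencing that NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_LID    8   /* hypothetical table size, illustration only */
#define TOY_DEFAULT_SL 0   /* hypothetical fallback SL */

/* Hypothetical stand-in for a port object; not OpenSM's osm_port_t. */
struct toy_port {
    uint8_t sl;            /* SL a routing engine would choose for this port */
};

/* Hypothetical stand-in for the subnet's LID-indexed port table. */
struct toy_subnet {
    struct toy_port *port_by_lid[TOY_MAX_LID];  /* slot 0 (LID 0) stays empty */
};

/* Look up a port by LID; LID 0 and out-of-range LIDs yield NULL. */
static struct toy_port *toy_get_port_by_lid(const struct toy_subnet *subn,
                                            uint16_t lid)
{
    if (lid == 0 || lid >= TOY_MAX_LID)
        return NULL;
    return subn->port_by_lid[lid];
}

/*
 * Return the SL for the given source LID. If the LID is not (yet)
 * assigned, return the default instead of dereferencing a NULL port.
 */
uint8_t toy_get_smsl(const struct toy_subnet *subn, uint16_t slid)
{
    const struct toy_port *src;

    assert(subn != NULL);

    src = toy_get_port_by_lid(subn, slid);
    if (!src)
        return TOY_DEFAULT_SL;  /* nothing to look up; skip quietly */
    return src->sl;
}
```

Placing the check in the lookup's caller, as in the sketch, keeps the routing-engine side free of NULL handling. Checking the LID before making the call at all is the other option, and the choice between them is a design judgment.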
Sasha From hnrose at comcast.net Wed Sep 23 05:57:41 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Wed, 23 Sep 2009 08:57:41 -0400 Subject: [ofa-general] [PATCH] libibmad: Add support for PortXmitDiscardDetails Message-ID: <20090923125741.GA7307@comcast.net> Also, some additional commentary changes to mad.h and fields.c Signed-off-by: Hal Rosenstock --- diff --git a/libibmad/include/infiniband/mad.h b/libibmad/include/infiniband/mad.h index 94b64cf..cfa9105 100644 --- a/libibmad/include/infiniband/mad.h +++ b/libibmad/include/infiniband/mad.h @@ -168,6 +168,7 @@ enum GSI_ATTR_ID { IB_GSI_PORT_SAMPLES_CONTROL = 0x10, IB_GSI_PORT_SAMPLES_RESULT = 0x11, IB_GSI_PORT_COUNTERS = 0x12, + IB_GSI_PORT_XMIT_DISCARD_DETAILS = 0x16, IB_GSI_PORT_COUNTERS_EXT = 0x1D, IB_GSI_PORT_XMIT_DATA_SL = 0x36, IB_GSI_PORT_RCV_DATA_SL = 0x37, @@ -604,6 +605,9 @@ enum MAD_FIELDS { IB_CPI_TRAP_QP_F, IB_CPI_TRAP_QKEY_F, + /* + * PortXmitDataSL fields + */ IB_PC_XMT_DATA_SL_FIRST_F, IB_PC_XMT_DATA_SL0_F = IB_PC_XMT_DATA_SL_FIRST_F, IB_PC_XMT_DATA_SL1_F, @@ -623,6 +627,9 @@ enum MAD_FIELDS { IB_PC_XMT_DATA_SL15_F, IB_PC_XMT_DATA_SL_LAST_F, + /* + * PortRcvDataSL fields + */ IB_PC_RCV_DATA_SL_FIRST_F, IB_PC_RCV_DATA_SL0_F = IB_PC_RCV_DATA_SL_FIRST_F, IB_PC_RCV_DATA_SL1_F, @@ -642,6 +649,15 @@ enum MAD_FIELDS { IB_PC_RCV_DATA_SL15_F, IB_PC_RCV_DATA_SL_LAST_F, + /* + * PortXmitDiscardDetails fields + */ + IB_PC_XMT_INACT_DISC_F, + IB_PC_XMT_NEIGH_MTU_DISC_F, + IB_PC_XMT_SW_LIFE_DISC_F, + IB_PC_XMT_SW_HOL_DISC_F, + IB_PC_XMT_DISC_LAST_F, + IB_FIELD_LAST_ /* must be last */ }; @@ -963,7 +979,8 @@ MAD_EXPORT ib_mad_dump_fn mad_dump_node_type, mad_dump_sltovl, mad_dump_vlarbitration, mad_dump_nodedesc, mad_dump_nodeinfo, mad_dump_portinfo, mad_dump_switchinfo, mad_dump_perfcounters, mad_dump_perfcounters_ext, - mad_dump_perfcounters_xmt_sl, mad_dump_perfcounters_rcv_sl; + mad_dump_perfcounters_xmt_sl, mad_dump_perfcounters_rcv_sl, + mad_dump_perfcounters_xmt_disc; MAD_EXPORT int ibdebug; diff 
--git a/libibmad/src/dump.c b/libibmad/src/dump.c index 5151882..48f59ab 100644 --- a/libibmad/src/dump.c +++ b/libibmad/src/dump.c @@ -729,6 +729,16 @@ void mad_dump_perfcounters_rcv_sl(char *buf, int bufsz, void *val, int valsz) IB_PC_RCV_DATA_SL_LAST_F); } +void mad_dump_perfcounters_xmt_disc(char *buf, int bufsz, void *val, int valsz) +{ + int cnt; + + cnt = _dump_fields(buf, bufsz, val, IB_PC_EXT_PORT_SELECT_F, + IB_PC_EXT_XMT_BYTES_F); + _dump_fields(buf + cnt, bufsz - cnt, val, IB_PC_XMT_INACT_DISC_F, + IB_PC_XMT_DISC_LAST_F); +} + void xdump(FILE * file, char *msg, void *p, int size) { #define HEX(x) ((x) < 10 ? '0' + (x) : 'a' + ((x) -10)) diff --git a/libibmad/src/fields.c b/libibmad/src/fields.c index 5f30116..f274aff 100644 --- a/libibmad/src/fields.c +++ b/libibmad/src/fields.c @@ -406,6 +406,9 @@ static const ib_field_t ib_mad_f[] = { {BITSOFFS(520, 24), "TrapQP", mad_dump_hex}, {544, 32, "TrapQKey", mad_dump_hex}, + /* + * PortXmitDataSL fields + */ {32, 32, "XmtDataSL0", mad_dump_uint}, {64, 32, "XmtDataSL1", mad_dump_uint}, {96, 32, "XmtDataSL2", mad_dump_uint}, @@ -424,6 +427,9 @@ static const ib_field_t ib_mad_f[] = { {512, 32, "XmtDataSL15", mad_dump_uint}, {0, 0}, /* IB_PC_XMT_DATA_SL_LAST_F */ + /* + * PortRcvDataSL fields + */ {32, 32, "RcvDataSL0", mad_dump_uint}, {64, 32, "RcvDataSL1", mad_dump_uint}, {96, 32, "RcvDataSL2", mad_dump_uint}, @@ -442,6 +448,15 @@ static const ib_field_t ib_mad_f[] = { {512, 32, "RcvDataSL15", mad_dump_uint}, {0, 0}, /* IB_PC_RCV_DATA_SL_LAST_F */ + /* + * PortXmitDiscardDetails fields + */ + {32, 16, "PortInactiveDiscards", mad_dump_uint}, + {48, 16, "PortNeighborMTUDiscards", mad_dump_uint}, + {64, 16, "PortSwLifetimeLimitDiscards", mad_dump_uint}, + {80, 16, "PortSwHOQLifetimeLimitDiscards", mad_dump_uint}, + {0, 0}, /* IB_PC_XMT_DISC_LAST_F */ + {0, 0} /* IB_FIELD_LAST_ */ }; diff --git a/libibmad/src/libibmad.map b/libibmad/src/libibmad.map index b9a890c..2a6a253 100644 --- a/libibmad/src/libibmad.map +++ 
b/libibmad/src/libibmad.map @@ -24,6 +24,7 @@ IBMAD_1.3 { mad_dump_perfcounters_ext; mad_dump_perfcounters_xmt_sl; mad_dump_perfcounters_rcv_sl; + mad_dump_perfcounters_xmt_disc; mad_dump_physportstate; mad_dump_portcapmask; mad_dump_portinfo; From hnrose at comcast.net Wed Sep 23 06:00:17 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Wed, 23 Sep 2009 09:00:17 -0400 Subject: [ofa-general] [PATCH] infiniband-diags/pergquery: Add support for optional PortXmitDiscardDetails counter Message-ID: <20090923130017.GB7307@comcast.net> Signed-off-by: Hal Rosenstock --- diff --git a/infiniband-diags/man/perfquery.8 b/infiniband-diags/man/perfquery.8 index 2a80f30..4510e7d 100644 --- a/infiniband-diags/man/perfquery.8 +++ b/infiniband-diags/man/perfquery.8 @@ -1,4 +1,4 @@ -.TH PERFQUERY 8 "March 10, 2009" "OpenIB" "OpenIB Diagnostics" +.TH PERFQUERY 8 "September 21, 2009" "OpenIB" "OpenIB Diagnostics" .SH NAME perfquery \- query InfiniBand port counters @@ -6,6 +6,7 @@ perfquery \- query InfiniBand port counters .SH SYNOPSIS .B perfquery [\-d(ebug)] [\-G(uid)] [\-x|\-\-extended] [\-X|\-\-xmtsl] [\-S|\-\-rcvsl] +[\-D|\-\-xmtdisc] [-a(ll_ports)] [-l(oop_ports)] [-r(eset_after_read)] [-R(eset_only)] [\-C ca_name] [\-P ca_port] [\-t(imeout) timeout_ms] [\-V(ersion)] [\-h(elp)] [ [[port] [reset_mask]]] @@ -38,6 +39,9 @@ show transmit data SL counter. This is an optional counter for QoS. \fB\-S\fR, \fB\-\-rcvsl\fR show receive data SL counter. This is an optional counter for QoS. .TP +\fB\-D\fR, \fB\-\-xmtdisc\fR +show transmit discard details. This is an optional counter. +.TP \fB\-a\fR, \fB\-\-all_ports\fR show aggregated counters for all ports of the destination lid or reset all counters for all ports. 
If the destination lid diff --git a/infiniband-diags/src/perfquery.c b/infiniband-diags/src/perfquery.c index d70af9e..74f9235 100644 --- a/infiniband-diags/src/perfquery.c +++ b/infiniband-diags/src/perfquery.c @@ -344,7 +344,7 @@ static void reset_counters(int extended, int timeout, int mask, } static int reset, reset_only, all_ports, loop_ports, port, extended, xmt_sl, - rcv_sl; + rcv_sl, xmt_disc; void xmt_sl_query(ib_portid_t * portid, int port, int mask) { @@ -396,6 +396,33 @@ void rcv_sl_query(ib_portid_t * portid, int port, int mask) IBERROR("perfslreset"); } +void xmt_disc_query(ib_portid_t * portid, int port, int mask) +{ + char buf[1024]; + + if (reset_only) { + if (!performance_reset_via(pc, portid, port, mask, ibd_timeout, + IB_GSI_PORT_XMIT_DISCARD_DETAILS, + srcport)) + IBERROR("xmtdiscreset"); + return; + } + + if (!pma_query_via(pc, portid, port, ibd_timeout, + IB_GSI_PORT_XMIT_DISCARD_DETAILS, srcport)) + IBERROR("xmtdiscquery"); + + mad_dump_perfcounters_xmt_disc(buf, sizeof buf, pc, sizeof pc); + printf("# PortXmitDiscardDetails: %s port %d\n%s", portid2str(portid), + port, buf); + + if (reset) + if (!performance_reset_via(pc, portid, port, mask, ibd_timeout, + IB_GSI_PORT_XMIT_DISCARD_DETAILS, + srcport)) + IBERROR("xmtdiscreset"); +} + static int process_opt(void *context, int ch, char *optarg) { switch (ch) { @@ -408,6 +435,9 @@ static int process_opt(void *context, int ch, char *optarg) case 'S': rcv_sl = 1; break; + case 'D': + xmt_disc = 1; + break; case 'a': all_ports++; port = ALL_PORTS; @@ -446,6 +476,7 @@ int main(int argc, char **argv) {"extended", 'x', 0, NULL, "show extended port counters"}, {"xmtsl", 'X', 0, NULL, "show Xmt SL port counters"}, {"rcvsl", 'S', 0, NULL, "show Rcv SL port counters"}, + {"xmtdisc", 'D', 0, NULL, "show Xmt Discard Details"}, {"all_ports", 'a', 0, NULL, "show aggregated counters"}, {"loop_ports", 'l', 0, NULL, "iterate through each port"}, {"reset_after_read", 'r', 0, NULL, "reset counters after read"}, 
@@ -516,6 +547,11 @@ int main(int argc, char **argv) goto done; } + if (xmt_disc) { + xmt_disc_query(&portid, port, mask); + goto done; + } + if (all_ports_loop || (loop_ports && (all_ports || port == ALL_PORTS))) { if (smp_query_via(data, &portid, IB_ATTR_NODE_INFO, 0, 0, srcport) < 0) From hnrose at comcast.net Wed Sep 23 06:27:38 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Wed, 23 Sep 2009 09:27:38 -0400 Subject: [ofa-general] [PATCH] opensm/osm_link_mgr.c: In link_mgr_set_physp_pi, only call link_mgr_get_smsl when LID valid Message-ID: <20090923132738.GA11105@comcast.net> Fix seg fault which occurs when get_osm_switch_from_port is called with NULL port (which in this case was caused by calling cl_ptr_vector_get on port LID table with LID 0) Signed-off-by: Hal Rosenstock --- diff --git a/opensm/opensm/osm_link_mgr.c b/opensm/opensm/osm_link_mgr.c index c9bdfee..35f83e2 100644 --- a/opensm/opensm/osm_link_mgr.c +++ b/opensm/opensm/osm_link_mgr.c @@ -131,27 +131,32 @@ static int link_mgr_set_physp_pi(osm_sm_t * sm, IN osm_physp_t * p_physp, if (ib_switch_info_is_enhanced_port0(&p_node->sw->switch_info) == FALSE) { - /* Even for base port 0 we might have to set smsl - (if we are using lash routing) */ - smsl = link_mgr_get_smsl(sm, p_physp); - if (smsl != ib_port_info_get_master_smsl(p_old_pi)) { - send_set = TRUE; - OSM_LOG(sm->p_log, OSM_LOG_DEBUG, - "Setting SMSL to %d on port 0 GUID 0x%016" - PRIx64 "\n", smsl, - cl_ntoh64(osm_physp_get_port_guid - (p_physp))); - } else { - /* This means the switch doesn't support - enhanced port 0 and we don't need to - change SMSL. Can skip it. 
*/ - OSM_LOG(sm->p_log, OSM_LOG_DEBUG, - "Skipping port 0, GUID 0x%016" PRIx64 - "\n", - cl_ntoh64(osm_physp_get_port_guid - (p_physp))); - goto Exit; + /* Make sure LID is valid prior to calling link_mgr_get_smsl */ + if (osm_physp_get_base_lid(p_physp)) { + + /* Even for base port 0 we might have to set + smsl (if we are using lash routing) */ + smsl = link_mgr_get_smsl(sm, p_physp); + if (smsl != ib_port_info_get_master_smsl(p_old_pi)) { + send_set = TRUE; + OSM_LOG(sm->p_log, OSM_LOG_DEBUG, + "Setting SMSL to %d on port 0 " + "GUID 0x%016" PRIx64 "\n", smsl, + cl_ntoh64(osm_physp_get_port_guid + (p_physp))); + } else { + /* This means the switch doesn't support + enhanced port 0 and we don't need to + change SMSL. Can skip it. */ + OSM_LOG(sm->p_log, OSM_LOG_DEBUG, + "Skipping port 0, GUID 0x%016" + PRIx64 "\n", + cl_ntoh64(osm_physp_get_port_guid + (p_physp))); + goto Exit; + } } + } else esp0 = TRUE; } @@ -217,18 +222,22 @@ static int link_mgr_set_physp_pi(osm_sm_t * sm, IN osm_physp_t * p_physp, sizeof(p_pi->master_sm_base_lid))) send_set = TRUE; - smsl = link_mgr_get_smsl(sm, p_physp); - if (smsl != ib_port_info_get_master_smsl(p_old_pi)) { + /* Make sure LID is valid prior to calling link_mgr_get_smsl */ + if (osm_physp_get_base_lid(p_physp)) { + smsl = link_mgr_get_smsl(sm, p_physp); + if (smsl != ib_port_info_get_master_smsl(p_old_pi)) { - ib_port_info_set_master_smsl(p_pi, smsl); + ib_port_info_set_master_smsl(p_pi, smsl); - OSM_LOG(sm->p_log, OSM_LOG_DEBUG, - "Setting SMSL to %d on GUID 0x%016" - PRIx64 ", port %d\n", smsl, - cl_ntoh64(osm_physp_get_port_guid - (p_physp)), port_num); + OSM_LOG(sm->p_log, OSM_LOG_DEBUG, + "Setting SMSL to %d on GUID " + "0x%016" PRIx64 ", port %d\n", + smsl, + cl_ntoh64(osm_physp_get_port_guid + (p_physp)), port_num); - send_set = TRUE; + send_set = TRUE; + } } p_pi->m_key_lease_period = From hnrose at comcast.net Wed Sep 23 06:29:43 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Wed, 23 Sep 2009 09:29:43 -0400 
Subject: [ofa-general] [PATCHv2] infiniband-diags/perfquery: Add support for optional PortXmitDiscardDetails counter Message-ID: <20090923132943.GA11145@comcast.net> Signed-off-by: Hal Rosenstock --- Changes since v1: Fix typo in [PATCH] subject diff --git a/infiniband-diags/man/perfquery.8 b/infiniband-diags/man/perfquery.8 index 2a80f30..4510e7d 100644 --- a/infiniband-diags/man/perfquery.8 +++ b/infiniband-diags/man/perfquery.8 @@ -1,4 +1,4 @@ -.TH PERFQUERY 8 "March 10, 2009" "OpenIB" "OpenIB Diagnostics" +.TH PERFQUERY 8 "September 21, 2009" "OpenIB" "OpenIB Diagnostics" .SH NAME perfquery \- query InfiniBand port counters @@ -6,6 +6,7 @@ perfquery \- query InfiniBand port counters .SH SYNOPSIS .B perfquery [\-d(ebug)] [\-G(uid)] [\-x|\-\-extended] [\-X|\-\-xmtsl] [\-S|\-\-rcvsl] +[\-D|\-\-xmtdisc] [-a(ll_ports)] [-l(oop_ports)] [-r(eset_after_read)] [-R(eset_only)] [\-C ca_name] [\-P ca_port] [\-t(imeout) timeout_ms] [\-V(ersion)] [\-h(elp)] [ [[port] [reset_mask]]] @@ -38,6 +39,9 @@ show transmit data SL counter. This is an optional counter for QoS. \fB\-S\fR, \fB\-\-rcvsl\fR show receive data SL counter. This is an optional counter for QoS. .TP +\fB\-D\fR, \fB\-\-xmtdisc\fR +show transmit discard details. This is an optional counter. +.TP \fB\-a\fR, \fB\-\-all_ports\fR show aggregated counters for all ports of the destination lid or reset all counters for all ports. 
If the destination lid diff --git a/infiniband-diags/src/perfquery.c b/infiniband-diags/src/perfquery.c index d70af9e..74f9235 100644 --- a/infiniband-diags/src/perfquery.c +++ b/infiniband-diags/src/perfquery.c @@ -344,7 +344,7 @@ static void reset_counters(int extended, int timeout, int mask, } static int reset, reset_only, all_ports, loop_ports, port, extended, xmt_sl, - rcv_sl; + rcv_sl, xmt_disc; void xmt_sl_query(ib_portid_t * portid, int port, int mask) { @@ -396,6 +396,33 @@ void rcv_sl_query(ib_portid_t * portid, int port, int mask) IBERROR("perfslreset"); } +void xmt_disc_query(ib_portid_t * portid, int port, int mask) +{ + char buf[1024]; + + if (reset_only) { + if (!performance_reset_via(pc, portid, port, mask, ibd_timeout, + IB_GSI_PORT_XMIT_DISCARD_DETAILS, + srcport)) + IBERROR("xmtdiscreset"); + return; + } + + if (!pma_query_via(pc, portid, port, ibd_timeout, + IB_GSI_PORT_XMIT_DISCARD_DETAILS, srcport)) + IBERROR("xmtdiscquery"); + + mad_dump_perfcounters_xmt_disc(buf, sizeof buf, pc, sizeof pc); + printf("# PortXmitDiscardDetails: %s port %d\n%s", portid2str(portid), + port, buf); + + if (reset) + if (!performance_reset_via(pc, portid, port, mask, ibd_timeout, + IB_GSI_PORT_XMIT_DISCARD_DETAILS, + srcport)) + IBERROR("xmtdiscreset"); +} + static int process_opt(void *context, int ch, char *optarg) { switch (ch) { @@ -408,6 +435,9 @@ static int process_opt(void *context, int ch, char *optarg) case 'S': rcv_sl = 1; break; + case 'D': + xmt_disc = 1; + break; case 'a': all_ports++; port = ALL_PORTS; @@ -446,6 +476,7 @@ int main(int argc, char **argv) {"extended", 'x', 0, NULL, "show extended port counters"}, {"xmtsl", 'X', 0, NULL, "show Xmt SL port counters"}, {"rcvsl", 'S', 0, NULL, "show Rcv SL port counters"}, + {"xmtdisc", 'D', 0, NULL, "show Xmt Discard Details"}, {"all_ports", 'a', 0, NULL, "show aggregated counters"}, {"loop_ports", 'l', 0, NULL, "iterate through each port"}, {"reset_after_read", 'r', 0, NULL, "reset counters after read"}, 
@@ -516,6 +547,11 @@ int main(int argc, char **argv) goto done; } + if (xmt_disc) { + xmt_disc_query(&portid, port, mask); + goto done; + } + if (all_ports_loop || (loop_ports && (all_ports || port == ALL_PORTS))) { if (smp_query_via(data, &portid, IB_ATTR_NODE_INFO, 0, 0, srcport) < 0)

From eli at dev.mellanox.co.il Wed Sep 23 08:04:54 2009
From: eli at dev.mellanox.co.il (Eli Cohen)
Date: Wed, 23 Sep 2009 18:04:54 +0300
Subject: [ofa-general] Possible process deadlock in RMPP flow
Message-ID: <20090923150454.GA26150@mtls03>

Hi Sean,

one of our customers experiences problems when running ibnetdiscover.
The problem happens from time to time. Here is the call stack that he
gets:

ibnetdiscover D ffffffff80149b8d 0 26968 26544 (L-TLB)
 ffff8102c900bd88 0000000000000046 ffff81037e8e0000 ffff81037e8e02e8
 ffff8102c900bd78 000000000000000a ffff8102c5b50820 ffff81038a929820
 0000011837bf6105 0000000000000ede ffff8102c5b50a08 0000000100000000
Call Trace:
 [] wait_for_completion+0x79/0xa2
 [] default_wake_function+0x0/0xe
 [] :ib_mad:ib_cancel_rmpp_recvs+0x87/0xde
 [] :ib_mad:ib_unregister_mad_agent+0x30d/0x424
 [] :ib_umad:ib_umad_close+0x9d/0xd6
 [] __fput+0xae/0x198
 [] filp_close+0x5c/0x64
 [] put_files_struct+0x63/0xae
 [] do_exit+0x31c/0x911
 [] cpuset_exit+0x0/0x6c
 [] system_call+0x7e/0x83

Judging from the dump, it seems that the process is waiting on the call
to flush_workqueue() in ib_cancel_rmpp_recvs(). The package they use is
OFED 1.4.2.

Do you have any idea or suggestions how to sort this out?

Thanks.
From jgunthorpe at obsidianresearch.com Wed Sep 23 08:11:49 2009 From: jgunthorpe at obsidianresearch.com (Jason Gunthorpe) Date: Wed, 23 Sep 2009 09:11:49 -0600 Subject: [ofa-general] Problem while running ib tests In-Reply-To: <4AB9E3CE.4030409@Voltaire.COM> References: <4AB9E3CE.4030409@Voltaire.COM> Message-ID: <20090923151149.GJ22310@obsidianresearch.com> On Wed, Sep 23, 2009 at 12:01:02PM +0300, Moni Shoua wrote: > > If I try to run any IB bandwidth test or latency test it end us with warning > > "Conflicting CPU frequency values detected: 2394.000000 != 1596.000000". > One more option to solve it (besides the -F) is to disable the power > saving of the CPU. In the power saving mode not all CPUs operate at > the same Hz/sec rate (cycle rate) and this is the reason for the > conflict. Disabling the power saving mode is done in the BIOS and > it is sometimes called economy mode or performance mode (the other > side of the trade off) This whole thing should just be removed. x86-64 linux has a VDSO clock_gettime(CLOCK_MONOTONIC) that is fast, doesn't trap into the kernel, and doesn't suffer from these problems. Jason From weiny2 at llnl.gov Wed Sep 23 08:29:52 2009 From: weiny2 at llnl.gov (Ira Weiny) Date: Wed, 23 Sep 2009 08:29:52 -0700 Subject: [ofa-general] Problem while running ib tests In-Reply-To: <4C2744E8AD2982428C5BFE523DF8CDCB45E91CD937@MNEXMB1.qlogic.org> References: <4C2744E8AD2982428C5BFE523DF8CDCB45E91CD86B@MNEXMB1.qlogic.org> <20090922085620.52b2ac17.weiny2@llnl.gov> <4C2744E8AD2982428C5BFE523DF8CDCB45E91CD937@MNEXMB1.qlogic.org> Message-ID: <20090923082952.91c19567.weiny2@llnl.gov> OK, my bad. I remember the email thread now. Ido is the maintainer and the git repo changed. Just to complete this thread the new git tree is: git://git.openfabrics.org/~shamoya/perftest.git Sorry, Ira On Wed, 23 Sep 2009 00:52:42 -0500 Karun Sharma wrote: > I am referring to ib_send_bw, ib_read_bw, ib_write_bw etc. 
> > Usage: > ib_read_bw start a server and wait for connection > ib_read_bw connect to server at > > Options: > -p, --port= listen on/connect to port (default 18515) > -d, --ib-dev= use IB device (default first device found) > -i, --ib-port= use port of IB device (default 1) > -m, --mtu= mtu size (256 - 4096. default for hermon is 2048) > -o, --outs= num of outstanding read/atom(default 4) > -s, --size= size of message to exchange (default 65536) > -a, --all Run sizes from 2 till 2^23 > -t, --tx-depth= size of tx queue (default 100) > -n, --iters= number of exchanges (at least 2, default 1000) > -u, --qp-timeout= QP timeout, timeout value is 4 usec * 2 ^(timeout), default 14 > -S, --sl= SL (default 0) > -x, --gid-index= test uses GID with GID index taken from command line (for RDMAoE index should be 0) > -b, --bidirectional measure bidirectional bandwidth (default unidirectional) > -V, --version display version number > -e, --events sleep on CQ events (default poll) > -F, --CPU-freq do not fail even if cpufreq_ondemand module is loaded > > > Regards > Karun > > -----Original Message----- > From: Ira Weiny [mailto:weiny2 at llnl.gov] > Sent: Tuesday, September 22, 2009 9:26 PM > To: Karun Sharma > Cc: Sneha Mistry; general at lists.openfabrics.org > Subject: Re: [ofa-general] Problem while running ib tests > > On Tue, 22 Sep 2009 04:54:22 -0500 > Karun Sharma wrote: > > > Please use "-F" option while running the tests. It will ignore the "Conflicting CPU frequency" errors. You will still see these messages on your screen, but with "-F", you will also see the results. > > Which commands are you referring to with the "-F" option? > > I just did a pull from git://git.openfabrics.org/~mst/perftest.git and I don't see a -F for the commands there. Have these tools moved? Or are you speaking of other tools? > > I also run into this problem and I have resorted to turning off cpuspeed. 
> > Thanks, > Ira > > > > > Regards > > Karun > > > > -----Original Message----- > > From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Sneha Mistry > > Sent: Tuesday, September 22, 2009 2:35 PM > > To: general at lists.openfabrics.org > > Subject: [ofa-general] Problem while running ib tests > > > > Hi, > > > > I have two Dual port HCAa card.Which I have installed in > > same PC. > > > > I am using OpenSuse 10.3 and installed OFED 1.4. > > > > If I try to run any IB bandwidth test or latency test it end us with warning > > "Conflicting CPU frequency values detected: 2394.000000 != 1596.000000". > > > > Is it an OFED related problem or Linux kernel problem. > > > > HCA are connected back to back and output of ibstat is as given below. > > --------------------------------------------------------------------------------------------------------- > > CA 'mthca0' > > CA type: MT25208 (MT23108 compat mode) > > Number of ports: 2 > > Firmware version: 4.8.200 > > Hardware version: 20 > > Node GUID: 0x0002c90200283734 > > System image GUID: 0x0002c90200283737 > > Port 1: > > State: Active > > Physical state: LinkUp > > Rate: 20 > > Base lid: 1 > > LMC: 0 > > SM lid: 1 > > Capability mask: 0x02510a6a > > Port GUID: 0x0002c90200283735 > > Port 2: > > State: Initializing > > Physical state: LinkUp > > Rate: 20 > > Base lid: 0 > > LMC: 0 > > SM lid: 0 > > Capability mask: 0x02510a68 > > Port GUID: 0x0002c90200283736 > > CA 'mthca1' > > CA type: MT25208 (MT23108 compat mode) > > Number of ports: 2 > > Firmware version: 4.8.200 > > Hardware version: 20 > > Node GUID: 0x0002c90200283730 > > System image GUID: 0x0002c90200283733 > > Port 1: > > State: Initializing > > Physical state: LinkUp > > Rate: 20 > > Base lid: 0 > > LMC: 0 > > SM lid: 0 > > Capability mask: 0x02510a68 > > Port GUID: 0x0002c90200283731 > > Port 2: > > State: Active > > Physical state: LinkUp > > Rate: 20 > > Base lid: 3 > > LMC: 0 > > SM lid: 1 > > 
Capability mask: 0x02510a68 > > Port GUID: 0x0002c90200283732 > > -------------------------------------------------------------------------------------------------------------------------------------- > > > > Even ibibnetdiscover is detecting one hope. > > So there is nothing wrong if I try to test 2 HCA connected back to back. > > > > Please tell me what can be done for running ib tests. > > > > Thanks, > > SGM > > _______________________________________________ > > general mailing list > > general at lists.openfabrics.org > > http://**lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > To unsubscribe, please visit http://**openib.org/mailman/listinfo/openib-general > > _______________________________________________ > > general mailing list > > general at lists.openfabrics.org > > http://**lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > To unsubscribe, please visit http://**openib.org/mailman/listinfo/openib-general > > > > > -- > Ira Weiny > Math Programmer/Computer Scientist > Lawrence Livermore National Lab > 925-423-8008 > weiny2 at llnl.gov > -- Ira Weiny Math Programmer/Computer Scientist Lawrence Livermore National Lab 925-423-8008 weiny2 at llnl.gov From weiny2 at llnl.gov Wed Sep 23 08:32:20 2009 From: weiny2 at llnl.gov (Ira Weiny) Date: Wed, 23 Sep 2009 08:32:20 -0700 Subject: [ofa-general] Problem while running ib tests In-Reply-To: <20090923151149.GJ22310@obsidianresearch.com> References: <4AB9E3CE.4030409@Voltaire.COM> <20090923151149.GJ22310@obsidianresearch.com> Message-ID: <20090923083220.2c844367.weiny2@llnl.gov> On Wed, 23 Sep 2009 09:11:49 -0600 Jason Gunthorpe wrote: > On Wed, Sep 23, 2009 at 12:01:02PM +0300, Moni Shoua wrote: > > > > If I try to run any IB bandwidth test or latency test it end us with warning > > > "Conflicting CPU frequency values detected: 2394.000000 != 1596.000000". > > > One more option to solve it (besides the -F) is to disable the power > > saving of the CPU. 
In the power saving mode not all CPUs operate at > > the same Hz/sec rate (cycle rate) and this is the reason for the > > conflict. Disabling the power saving mode is done in the BIOS and > > it is sometimes called economy mode or performance mode (the other > > side of the trade off) > > This whole thing should just be removed. x86-64 linux has a VDSO > clock_gettime(CLOCK_MONOTONIC) that is fast, doesn't trap into the > kernel, and doesn't suffer from these problems. What about other architectures? I would be willing to do a quick patch but... Ira > > Jason > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://*lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://*openib.org/mailman/listinfo/openib-general > -- Ira Weiny Math Programmer/Computer Scientist Lawrence Livermore National Lab 925-423-8008 weiny2 at llnl.gov From jgunthorpe at obsidianresearch.com Wed Sep 23 08:41:28 2009 From: jgunthorpe at obsidianresearch.com (Jason Gunthorpe) Date: Wed, 23 Sep 2009 09:41:28 -0600 Subject: [ofa-general] Problem while running ib tests In-Reply-To: <20090923083220.2c844367.weiny2@llnl.gov> References: <4AB9E3CE.4030409@Voltaire.COM> <20090923151149.GJ22310@obsidianresearch.com> <20090923083220.2c844367.weiny2@llnl.gov> Message-ID: <20090923154128.GL22310@obsidianresearch.com> On Wed, Sep 23, 2009 at 08:32:20AM -0700, Ira Weiny wrote: > On Wed, 23 Sep 2009 09:11:49 -0600 > Jason Gunthorpe wrote: > > > On Wed, Sep 23, 2009 at 12:01:02PM +0300, Moni Shoua wrote: > > > > > > If I try to run any IB bandwidth test or latency test it end us with warning > > > > "Conflicting CPU frequency values detected: 2394.000000 != 1596.000000". > > > > > One more option to solve it (besides the -F) is to disable the power > > > saving of the CPU. In the power saving mode not all CPUs operate at > > > the same Hz/sec rate (cycle rate) and this is the reason for the > > > conflict. 
Disabling the power saving mode is done in the BIOS and
> > > it is sometimes called economy mode or performance mode (the other
> > > side of the trade off)
> >
> > This whole thing should just be removed. x86-64 linux has a VDSO
> > clock_gettime(CLOCK_MONOTONIC) that is fast, doesn't trap into the
> > kernel, and doesn't suffer from these problems.
>
> What about other architectures? I would be willing to do a quick patch but...

It varies. 32 bit x86 still doesn't have a VDSO for timekeeping for
some reason.. s390 and ppc are OK. But this is a fluffy diagnostic -
does it matter? More people seem to get tripped up by the wonky way to
keep time than I've ever seen it benefit.

Jason

From monis at Voltaire.COM Wed Sep 23 08:43:12 2009
From: monis at Voltaire.COM (Moni Shoua)
Date: Wed, 23 Sep 2009 18:43:12 +0300
Subject: [ofa-general] Problem while running ib tests
In-Reply-To: <20090923151149.GJ22310@obsidianresearch.com>
References: <4AB9E3CE.4030409@Voltaire.COM> <20090923151149.GJ22310@obsidianresearch.com>
Message-ID: <4ABA4210.7010704@Voltaire.COM>

> This whole thing should just be removed. x86-64 linux has a VDSO
> clock_gettime(CLOCK_MONOTONIC) that is fast, doesn't trap into the
> kernel, and doesn't suffer from these problems.
>
> Jason

In the man page for clock_gettime() I see that CLOCK_PROCESS_CPUTIME_ID
may be more accurate for (high) performance measurements. What do you
think?

CLOCK_MONOTONIC
    Clock that cannot be set and represents monotonic time since some
    unspecified starting point.

CLOCK_PROCESS_CPUTIME_ID
    High-resolution per-process timer from the CPU.
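
A minimal sketch, not taken from perftest, of how the two clocks named
above behave differently. It assumes a POSIX system on which both
CLOCK_MONOTONIC and CLOCK_PROCESS_CPUTIME_ID are available; the helper
names clock_ns() and measure_sleep() are made up for illustration.
CLOCK_MONOTONIC keeps advancing while the process sleeps, whereas
CLOCK_PROCESS_CPUTIME_ID only counts CPU time the process actually
consumed, so the two deltas diverge across a sleep.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* expose clock_gettime() and nanosleep() */
#endif

#include <stdint.h>
#include <time.h>

/* Read the given clock and return its value in nanoseconds (-1 on error). */
static int64_t clock_ns(clockid_t id)
{
	struct timespec ts;

	if (clock_gettime(id, &ts) != 0)
		return -1;	/* clock not available on this system */
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Hypothetical result type: how far each clock advanced. */
struct clock_delta {
	int64_t monotonic_ns;	/* elapsed time on CLOCK_MONOTONIC */
	int64_t cputime_ns;	/* CPU time consumed by this process */
};

/* Hypothetical helper: sleep for ms milliseconds and sample both clocks. */
static struct clock_delta measure_sleep(long ms)
{
	struct timespec req = {
		.tv_sec = ms / 1000,
		.tv_nsec = (ms % 1000) * 1000000L,
	};
	struct clock_delta d;
	int64_t m0 = clock_ns(CLOCK_MONOTONIC);
	int64_t c0 = clock_ns(CLOCK_PROCESS_CPUTIME_ID);

	nanosleep(&req, NULL);	/* sleeping consumes almost no CPU time */

	d.monotonic_ns = clock_ns(CLOCK_MONOTONIC) - m0;
	d.cputime_ns = clock_ns(CLOCK_PROCESS_CPUTIME_ID) - c0;
	return d;
}
```

Calling measure_sleep(50) should report a monotonic delta of roughly
50 ms or more and a much smaller CPU-time delta. On glibc versions
older than 2.17, linking may additionally need -lrt.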
From jgunthorpe at obsidianresearch.com Wed Sep 23 08:48:46 2009
From: jgunthorpe at obsidianresearch.com (Jason Gunthorpe)
Date: Wed, 23 Sep 2009 09:48:46 -0600
Subject: [ofa-general] Problem while running ib tests
In-Reply-To: <4ABA4210.7010704@Voltaire.COM>
References: <4AB9E3CE.4030409@Voltaire.COM> <20090923151149.GJ22310@obsidianresearch.com> <4ABA4210.7010704@Voltaire.COM>
Message-ID: <20090923154846.GM22310@obsidianresearch.com>

On Wed, Sep 23, 2009 at 06:43:12PM +0300, Moni Shoua wrote:

> In the man page for clock_gettime() I see that
> CLOCK_PROCESS_CPUTIME_ID may be more accurate for (high) performance
> measurements. What do you think?

The CPUTIME counters are of the 'time spent running code' variety,
they do not time absolute time and they are executed in the kernel.

CLOCK_MONOTONIC is the closest thing to tsc.

The man page is a bit misleading.. See the SUSv3 spec:

https://www.opengroup.org/onlinepubs/000095399/functions/clock_getres.html

Jason

From sean.hefty at intel.com Wed Sep 23 09:08:28 2009
From: sean.hefty at intel.com (Sean Hefty)
Date: Wed, 23 Sep 2009 09:08:28 -0700
Subject: [ofa-general] RE: Possible process deadlock in RMPP flow
In-Reply-To: <20090923150454.GA26150@mtls03>
References: <20090923150454.GA26150@mtls03>
Message-ID: <7A32EEE20DF5432CADB60B8F8B1E0093@amr.corp.intel.com>

>ibnetdiscover D ffffffff80149b8d 0 26968 26544
>(L-TLB)
> ffff8102c900bd88 0000000000000046 ffff81037e8e0000 ffff81037e8e02e8
> ffff8102c900bd78 000000000000000a ffff8102c5b50820 ffff81038a929820
> 0000011837bf6105 0000000000000ede ffff8102c5b50a08 0000000100000000
>Call Trace:
> [] wait_for_completion+0x79/0xa2
> [] default_wake_function+0x0/0xe
> [] :ib_mad:ib_cancel_rmpp_recvs+0x87/0xde
> [] :ib_mad:ib_unregister_mad_agent+0x30d/0x424
> [] :ib_umad:ib_umad_close+0x9d/0xd6
> [] __fput+0xae/0x198
> [] filp_close+0x5c/0x64
> [] put_files_struct+0x63/0xae
> [] do_exit+0x31c/0x911
> [] cpuset_exit+0x0/0x6c
> [] system_call+0x7e/0x83
>
>From the dump it seems that
the process is waits on the call to >flush_workqueue() in ib_cancel_rmpp_recvs(). The package they use is >OFED 1.4.2. Roland just submitted a patch in this area yesterday. I don't know if the patch would fix their issue, but it may be worth trying. What kernel does 1.4.2 map to? What RMPP messages does ibnetdiscover use? If the program is completing successfully, there may be a different race with the rmpp cleanup. I'll see if anything else stands out in that area. - Sean From hal.rosenstock at gmail.com Wed Sep 23 09:20:48 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Wed, 23 Sep 2009 12:20:48 -0400 Subject: [ofa-general] Re: Possible process deadlock in RMPP flow In-Reply-To: <7A32EEE20DF5432CADB60B8F8B1E0093@amr.corp.intel.com> References: <20090923150454.GA26150@mtls03> <7A32EEE20DF5432CADB60B8F8B1E0093@amr.corp.intel.com> Message-ID: On Wed, Sep 23, 2009 at 12:08 PM, Sean Hefty wrote: > >ibnetdiscover D ffffffff80149b8d 0 26968 26544 > >(L-TLB) > > ffff8102c900bd88 0000000000000046 ffff81037e8e0000 ffff81037e8e02e8 > > ffff8102c900bd78 000000000000000a ffff8102c5b50820 ffff81038a929820 > > 0000011837bf6105 0000000000000ede ffff8102c5b50a08 0000000100000000 > >Call Trace: > > [] wait_for_completion+0x79/0xa2 > > [] default_wake_function+0x0/0xe > > [] :ib_mad:ib_cancel_rmpp_recvs+0x87/0xde > > [] :ib_mad:ib_unregister_mad_agent+0x30d/0x424 > > [] :ib_umad:ib_umad_close+0x9d/0xd6 > > [] __fput+0xae/0x198 > > [] filp_close+0x5c/0x64 > > [] put_files_struct+0x63/0xae > > [] do_exit+0x31c/0x911 > > [] cpuset_exit+0x0/0x6c > > [] system_call+0x7e/0x83 > > > >From the dump it seems that the process is waits on the call to > >flush_workqueue() in ib_cancel_rmpp_recvs(). The package they use is > >OFED 1.4.2. > > Roland just submitted a patch in this area yesterday. I don't know if the > patch > would fix their issue, but it may be worth trying. What kernel does 1.4.2 > map > to? > > What RMPP messages does ibnetdiscover use? None AFAIK. 
-- Hal

> If the program is completing
> successfully, there may be a different race with the rmpp cleanup. I'll
> see if
> anything else stands out in that area.
>
> - Sean
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

From paravpandit at yahoo.com Wed Sep 23 10:22:20 2009
From: paravpandit at yahoo.com (Parav Pandit)
Date: Wed, 23 Sep 2009 10:22:20 -0700 (PDT)
Subject: [ofa-general] Is really MPA markers supported by Neteffect??
Message-ID: <795876.1677.qm@web30105.mail.mud.yahoo.com>

Hi,

Are MPA markers really supported by the Neteffect adapters?
Is any active maintainer of iWarp driver for Neteffect?

We are trying to enable this feature in the driver, but can't find any such configuration.
Chelsio and other adapters provide this feature as a command line argument.

Regards,
Parav Pandit

From eli at dev.mellanox.co.il Wed Sep 23 10:25:32 2009
From: eli at dev.mellanox.co.il (Eli Cohen)
Date: Wed, 23 Sep 2009 20:25:32 +0300
Subject: [ofa-general] Re: Possible process deadlock in RMPP flow
In-Reply-To: <7A32EEE20DF5432CADB60B8F8B1E0093@amr.corp.intel.com>
References: <20090923150454.GA26150@mtls03> <7A32EEE20DF5432CADB60B8F8B1E0093@amr.corp.intel.com>
Message-ID: <20090923172532.GA32223@mtls03>

On Wed, Sep 23, 2009 at 09:08:28AM -0700, Sean Hefty wrote:
>
> Roland just submitted a patch in this area yesterday. I don't know if the patch
> would fix their issue, but it may be worth trying. What kernel does 1.4.2 map
> to?

I think OFED 1.4.2 is based on kernel 2.6.27 but they're using RHEL 5.3.
Thanks, we'll try this.

>
> What RMPP messages does ibnetdiscover use? If the program is completing
> successfully, there may be a different race with the rmpp cleanup. I'll see if
> anything else stands out in that area.
>
> - Sean
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html

From chien.tin.tung at intel.com Wed Sep 23 10:36:57 2009
From: chien.tin.tung at intel.com (Tung, Chien Tin)
Date: Wed, 23 Sep 2009 10:36:57 -0700
Subject: [ofa-general] Is really MPA markers supported by Neteffect??
In-Reply-To: <795876.1677.qm@web30105.mail.mud.yahoo.com>
References: <795876.1677.qm@web30105.mail.mud.yahoo.com>
Message-ID: <603F8A3875DCE940BA37B49D0A6EA0AE03626B1C@azsmsx501.amr.corp.intel.com>

>Are MPA markers really supported by the Neteffect adapters?

our HW does not support markers.

>Is any active maintainer of iWarp driver for Neteffect?

We are still around as Intel. I will be happy to answer any
questions, Parav/Bill.

Chien

--
Chien Tung | chien.tin.tung at intel.com

From paravpandit at yahoo.com Wed Sep 23 11:04:39 2009
From: paravpandit at yahoo.com (Parav Pandit)
Date: Wed, 23 Sep 2009 11:04:39 -0700 (PDT)
Subject: [ofa-general] Is really MPA markers supported by Neteffect??
In-Reply-To: <603F8A3875DCE940BA37B49D0A6EA0AE03626B1C@azsmsx501.amr.corp.intel.com>
Message-ID: <977477.71433.qm@web30107.mail.mud.yahoo.com>

Hi Chien,

Thanks for the inputs.
I was looking at nes_hw.h and noticed one of the errors, named
NES_AEQE_AEID_LLP_RECEIVED_MARKER_AND_LENGTH_FIELDS_DONT_MATCH.
Given such a self-explanatory error flag, I thought that maybe, by enabling some setting, I could use the markers.

Regards,
Parav Pandit

--- On Wed, 9/23/09, Tung, Chien Tin wrote:

> From: Tung, Chien Tin
> Subject: RE: [ofa-general] Is really MPA markers supported by Neteffect??
> To: "Parav Pandit" , "general at lists.openfabrics.org"
> Date: Wednesday, September 23, 2009, 11:06 PM
>
> >Are MPA markers really supported by the Neteffect
> adapters?
>
> our HW does not support markers.
>
> >Is any active maintainer of iWarp driver for
> Neteffect?
> > We are still around as Intel.  I will be happy to > answer any > questions, Parav/Bill. > > Chien > > -- > Chien Tung | chien.tin.tung at intel.com > > > > From rdreier at cisco.com Wed Sep 23 11:09:52 2009 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 23 Sep 2009 11:09:52 -0700 Subject: [ofa-general] Re: [PATCH/RFC] IB/mad: Fix lock-lock-timer deadlock in RMPP code In-Reply-To: <9DA1536B0B4943E7BC52280C977F1D23@amr.corp.intel.com> (Sean Hefty's message of "Tue, 22 Sep 2009 15:27:23 -0700") References: <9DA1536B0B4943E7BC52280C977F1D23@amr.corp.intel.com> Message-ID: > Reviewed-by: Sean Hefty Thanks, I applied this. From vst at vlnb.net Wed Sep 23 12:11:29 2009 From: vst at vlnb.net (Vladislav Bolkhovitin) Date: Wed, 23 Sep 2009 23:11:29 +0400 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4A9FA945.4070408@vlnb.net> <4AA4F561.504@vlnb.net> <4AB7B0EC.20109@vlnb.net> Message-ID: <4ABA72E1.6030309@vlnb.net> Chris Worley, on 09/22/2009 02:00 AM wrote: > On Mon, Sep 21, 2009 at 10:59 AM, Vladislav Bolkhovitin wrote: >> Chris Worley, on 09/19/2009 01:31 AM wrote: >>> On Mon, Sep 7, 2009 at 5:58 AM, Vladislav Bolkhovitin >>> wrote: >>>> Chris Worley, on 09/06/2009 05:41 PM wrote: >>>>> On Sun, Sep 6, 2009 at 3:36 PM, Chris Worley wrote: >>>>>> On Sun, Sep 6, 2009 at 3:17 PM, Bart Van >>>>>> Assche >>>>>> wrote: >>>>>>> On Fri, Sep 4, 2009 at 1:20 AM, Chris Worley >>>>>>> wrote: >>>>>>>> On Thu, Sep 3, 2009 at 11:38 AM, Chris Worley >>>>>>>> wrote: >>>>>>>>> I've used a couple of initiators (different systems) w/ different >>>>>>>>> OSes, w/ different IB cards (all QDR) and different IB stacks >>>>>>>>> (built-in vs. OFED) and can repeat the problem in all but the >>>>>>>>> RHEL5.2/OFED 1.4.1 target and initiator (but, if the initiator is >>>>>>>>> WinOF and the target is RHEL5.2/OFED1.4.1, then the problem does >>>>>>>>> repeat). 
>>>>>>>> Here's a twist: I used the Ubuntu initiator w/ one of the RHEL >>>>>>>> targets, and the RHEL initiator (same machine as was running WinOF >>>>>>>> from the beginning of this thread) w/ one of the Ubuntu targets: in >>>>>>>> both cases, the problem does not repeat. >>>>>>>> >>>>>>>> That makes it sound like OFED is the cure on either side of the >>>>>>>> connection, but does not explain the issue w/ WinOF (which does fail >>>>>>>> w/ either Ununtu or RHEL targets). >>>>>>> These results are strange. Regarding the Linux-only tests, I was >>>>>>> assuming failure of a single component (Ubuntu SRP initiator, OFED SRP >>>>>>> initiator, Ubuntu IB driver, OFED IB driver or SRP target), but for >>>>>>> each of these components there is at least one test that passes and at >>>>>>> least one test that fails. So either my assumption is wrong or one of >>>>>>> the above test results is not repeatable. Do you have the time to >>>>>>> repeat the Linux-only tests ? >>>>>> Last night I was rerunning the RHEL5.2 initiator w/ Ubuntu client, and >>>>>> the problem repeated; now, I can't repeat the case where it didn't >>>>>> fail. Still, no errors, other than the eventual timeouts previously >>>>>> shown; the target thinks all is fine, the initiator is stuck. >>>>> ... and I haven't had any success w/ Ubuntu target and initiator, 8.10 >>>>> or >>>>> 9.04. >>>> 1. Try with kernel parameter maxcpus=1. It will somehow relax possible >>>> races >>>> you have, although not completely. >>> I finally got around to this test... 1 CPU works very well, w/o hangs >>> (will test all night to see if this holds true), 2 or more don't. >>> This is dual-socket NHM, so I can't specify more than one processor >>> w/o getting more than one socket. >> Where 1 CPU works well, on the target or initiator? > > That was on the target. > >> The race is on the >> corresponding host. 
>> >> I'd suggest you to reproduce the problem with the latest SCST trunk, lockdep >> enabled on the suspected host (better on both) and mgmt_minor trace level >> enabled on the target. Then, after the hang, let the system stay for about a >> half an hour, then send us with Bart (privately, compressed) kernel logs >> from both systems starting from the early boot messages. > > I believe I comprehensively tested w/ Lockdep and complete scst > messages dumps on the target (and lockdep on the initiator) and came > up with no messages or lock issues salient to the issue. > > If you think I should repeat this, I will. You didn't leave it for half an hour and didn't send us the logs, did you? But since Bart reproduced something similar, it isn't too important now, although still desired. >> If you have dmesg only output, please enable printk timestamps >> (CONFIG_PRINTK_TIME). > > Ubuntu has been pretty good about that. > > Thanks, > > Chris >>> Chris >>>> 2. Try with another hardware, including motherboard. You can have >>>> something >>>> like http://lkml.org/lkml/2007/7/31/558 (not exactly it, of course) >>>> >>>>> Chris >>>>>> Chris >>>>>>> Bart. >>>>>>> >> >
From nathan at robotics.net Wed Sep 23 14:26:47 2009 From: nathan at robotics.net (Nathan Stratton) Date: Wed, 23 Sep 2009 16:26:47 -0500 (CDT) Subject: [ofa-general] Fedora 11, kernel 2.6.31 In-Reply-To: <200909230933.51858.jackm@dev.mellanox.co.il> References: <200909230933.51858.jackm@dev.mellanox.co.il> Message-ID: On Wed, 23 Sep 2009, Jack Morgenstein wrote: > On Tuesday 22 September 2009 21:58, Nathan Stratton wrote: >> >> Having an issue with getting verbs working on 2.6.31. I am running Fedora >> 11 with 2.6.31 and 1.1.2-0.1.gb00dc7d libibverbs. Everything looks great >> until I run ibv_srq_pingpong to the server. It shows local/remote address >> a bunch of times and then freezes. >> > You do have the latest firmware installed for your HCA. Yes > You are using the OFED user libraries with a non-OFED kernel. While I am surprised > that it is giving you problems with libmthca, this combination has not been tested. How do I get a xen/OFED 2.6.31 kernel? Any way to get OFED to compile on 2.6.31?
-Nathan From weiny2 at llnl.gov Wed Sep 23 14:45:55 2009 From: weiny2 at llnl.gov (Ira Weiny) Date: Wed, 23 Sep 2009 14:45:55 -0700 Subject: [ofa-general] [PATCH] infiniband-diags/src/ibqueryerrors.c: fix bug when attempting a sub-fabric scan Message-ID: <20090923144555.9efa2c75.weiny2@llnl.gov> From: Ira Weiny Date: Wed, 23 Sep 2009 14:26:55 -0700 Subject: [PATCH] infiniband-diags/src/ibqueryerrors.c: fix bug when attempting a sub-fabric scan Also ibd_sm_id is never valid in this tool as the "-s" option is used for "suppress" Signed-off-by: Ira Weiny --- infiniband-diags/src/ibqueryerrors.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/infiniband-diags/src/ibqueryerrors.c b/infiniband-diags/src/ibqueryerrors.c index f73ca6f..892e539 100644 --- a/infiniband-diags/src/ibqueryerrors.c +++ b/infiniband-diags/src/ibqueryerrors.c @@ -441,8 +441,8 @@ int main(int argc, char **argv) } else if (switch_guid_str) { if ((resolved = ib_resolve_portid_str_via(&portid, switch_guid_str, - IB_DEST_GUID, ibd_sm_id, - ibmad_port)) >= 0) + IB_DEST_GUID, NULL, + ibmad_port)) < 0) IBWARN("Failed to resolve %s; attempting full scan\n", switch_guid_str); } -- 1.5.4.5 From weiny2 at llnl.gov Wed Sep 23 15:09:23 2009 From: weiny2 at llnl.gov (Ira Weiny) Date: Wed, 23 Sep 2009 15:09:23 -0700 Subject: [ofa-general] [PATCH] infiniband-diags/src/ibqueryerrors.c: Remove --all option and replace it with --switch, --ca, --router Message-ID: <20090923150923.a5281107.weiny2@llnl.gov> From: Ira Weiny Date: Wed, 23 Sep 2009 11:38:11 -0700 Subject: [PATCH] infiniband-diags/src/ibqueryerrors.c: Remove --all option and replace it with --switch, --ca, --router By default ibqueryerrors should print errors for all node types. Adding the other options allows for the limitation of this output. Also change the --switch option to be --node-guid which is really more accurate and use "-G" for better compliance with other utilities. 
"-S" is left in for backward compatibility for the time being. Update the man page Signed-off-by: Ira Weiny --- infiniband-diags/man/ibqueryerrors.8 | 62 +++++++++++++++++++----------- infiniband-diags/src/ibqueryerrors.c | 70 +++++++++++++++++++++++++--------- 2 files changed, 92 insertions(+), 40 deletions(-) diff --git a/infiniband-diags/man/ibqueryerrors.8 b/infiniband-diags/man/ibqueryerrors.8 index a327f3b..8f83a7b 100644 --- a/infiniband-diags/man/ibqueryerrors.8 +++ b/infiniband-diags/man/ibqueryerrors.8 @@ -5,7 +5,7 @@ ibqueryerrors.pl \- query and report non-zero IB port counters .SH SYNOPSIS .B ibqueryerrors.pl -[-a -c -r -R -C -P -s -S +[-s -c -r -C -P -s -G -D -d] .SH DESCRIPTION @@ -20,41 +20,59 @@ reported. .PP .TP -\fB\-a\fR -Report an action to take. Some of the counters are not errors in and of -themselves. This reports some more information on what the counters mean and -what actions can/should be taken if they are non-zero. +\fB\-s \fR +Suppress the errors listed in the comma separated list provided. .TP \fB\-c\fR Suppress some of the common "side effect" counters. These counters usually do not indicate an error condition and can be usually be safely ignored. .TP +\fB\-G \fR +Report results only for the node guid specified. +.TP +\fB\-S \fR +\-S is provided only for backward compatibility and works the same as "-G" +.TP +\fB\-D \fR +Report results only for the switch specified by the direct route path. +.TP \fB\-r\fR Report the port information. This includes LID, port, external port (if applicable), link speed setting, remote GUID, remote port, remote external port (if applicable), and remote node description information. .TP -\fB\-R\fR -Recalculate the ibnetdiscover information, ie do not use the cached -information. This option is slower but should be used if the diag tools have -not been used for some time or if there are other reasons to believe that -the fabric has changed. 
+\fB\-\-data\fR +Include the optional transmit and receive data counters. .TP -\fB\-s \fR -Suppress the errors listed in the comma separated list provided. +\fB\-\-switch\fR print data for switches only .TP -\fB\-S \fR -Report results only for the switch specified. (hex format) +\fB\-\-ca\fR print data for CA's only .TP -\fB\-D \fR -Report results only for the switch specified by the direct route path. +\fB\-\-router\fR print data for routers only .TP -\fB\-d\fR -Include the optional transmit and receive data counters. -.TP -\fB\-C \fR use the specified ca_name for the search. -.TP -\fB\-P \fR use the specified ca_port for the search. +\fB\-R\fR (This option is obsolete and does nothing) + +.SH COMMON OPTIONS +.PP +\-d raise the IB debugging level. + May be used several times (-ddd or -d -d -d). +.PP +\-e show send and receive errors (timeouts and others) +.PP +\-h show the usage message +.PP +\-v increase the application verbosity level. + May be used several times (-vv or -v -v -v) +.PP +\-V show the version info. + +# Other common flags: +.PP +\-C use the specified ca_name. +.PP +\-P use the specified ca_port. +.PP +\-t override the default timeout for the solicited mads. 
.SH AUTHOR diff --git a/infiniband-diags/src/ibqueryerrors.c b/infiniband-diags/src/ibqueryerrors.c index f73ca6f..ecfd662 100644 --- a/infiniband-diags/src/ibqueryerrors.c +++ b/infiniband-diags/src/ibqueryerrors.c @@ -59,12 +59,17 @@ static char *node_name_map_file = NULL; static nn_map_t *node_name_map = NULL; int data_counters = 0; int port_config = 0; -uint64_t switch_guid = 0; -char *switch_guid_str = NULL; +uint64_t node_guid = 0; +char *node_guid_str = NULL; int sup_total = 0; enum MAD_FIELDS *suppressed_fields = NULL; char *dr_path = NULL; -int all_nodes = 0; + +#define PRINT_ALL 0xFF /* all nodes default flag */ +uint8_t node_type_to_print = PRINT_ALL; +#define PRINT_SWITCH 0x1 +#define PRINT_CA 0x2 +#define PRINT_ROUTER 0x4 static unsigned int get_max(unsigned int num) { @@ -304,8 +309,21 @@ void print_node(ibnd_node_t * node, void *user_data) int header_printed = 0; int p = 0; int startport = 1; + int type = 0; + + switch (node->type) { + case IB_NODE_SWITCH: + type = PRINT_SWITCH; + break; + case IB_NODE_CA: + type = PRINT_CA; + break; + case IB_NODE_ROUTER: + type = PRINT_ROUTER; + break; + } - if (!all_nodes && node->type != IB_NODE_SWITCH) + if ((type & node_type_to_print) == 0) return; if (node->type == IB_NODE_SWITCH && node->smaenhsp0) @@ -361,11 +379,24 @@ static int process_opt(void *context, int ch, char *optarg) data_counters++; break; case 3: - all_nodes++; + if (node_type_to_print == PRINT_ALL) + node_type_to_print = 0; + node_type_to_print |= PRINT_SWITCH; + break; + case 4: + if (node_type_to_print == PRINT_ALL) + node_type_to_print = 0; + node_type_to_print |= PRINT_CA; + break; + case 5: + if (node_type_to_print == PRINT_ALL) + node_type_to_print = 0; + node_type_to_print |= PRINT_ROUTER; break; + case 'G': case 'S': - switch_guid_str = optarg; - switch_guid = strtoull(optarg, 0, 0); + node_guid_str = optarg; + node_guid = strtoull(optarg, 0, 0); break; case 'D': dr_path = strdup(optarg); @@ -399,8 +430,9 @@ int main(int argc, char 
**argv) {"suppress-common", 'c', 0, NULL, "suppress some of the common counters"}, {"node-name-map", 1, 1, "", "node name map file"}, - {"switch", 'S', 1, "", - "query only (hex format)"}, + {"node-guid", 'G', 1, "", "query only "}, + {"", 'S', 1, "", + "Same as \"-G\" for backward compatibility"}, {"Direct", 'D', 1, "", "query only switch specified by "}, {"report-port", 'r', 0, NULL, @@ -408,7 +440,9 @@ int main(int argc, char **argv) {"GNDN", 'R', 0, NULL, "(This option is obsolete and does nothing)"}, {"data", 2, 0, NULL, "include the data counters in the output"}, - {"all", 3, 0, NULL, "output all nodes (not just switches)"}, + {"switch", 3, 0, NULL, "print data for switches only"}, + {"ca", 4, 0, NULL, "print data for CA's only"}, + {"router", 5, 0, NULL, "print data for routers only"}, {0} }; char usage_args[] = ""; @@ -438,13 +472,13 @@ int main(int argc, char **argv) NULL, ibmad_port)) < 0) IBWARN("Failed to resolve %s; attempting full scan\n", dr_path); - } else if (switch_guid_str) { + } else if (node_guid_str) { if ((resolved = - ib_resolve_portid_str_via(&portid, switch_guid_str, + ib_resolve_portid_str_via(&portid, node_guid_str, IB_DEST_GUID, ibd_sm_id, ibmad_port)) >= 0) IBWARN("Failed to resolve %s; attempting full scan\n", - switch_guid_str); + node_guid_str); } if (resolved >= 0) @@ -463,13 +497,13 @@ int main(int argc, char **argv) report_suppressed(); - if (switch_guid_str) { - ibnd_node_t *node = ibnd_find_node_guid(fabric, switch_guid); + if (node_guid_str) { + ibnd_node_t *node = ibnd_find_node_guid(fabric, node_guid); if (node) print_node(node, NULL); else fprintf(stderr, "Failed to find node: %s\n", - switch_guid_str); + node_guid_str); } else if (dr_path) { ibnd_node_t *node = ibnd_find_node_dr(fabric, dr_path); uint8_t ni[IB_SMP_DATA_SIZE]; @@ -477,9 +511,9 @@ int main(int argc, char **argv) if (!smp_query_via(ni, &portid, IB_ATTR_NODE_INFO, 0, ibd_timeout, ibmad_port)) return -1; - mad_decode_field(ni, IB_NODE_GUID_F, &(switch_guid)); + 
mad_decode_field(ni, IB_NODE_GUID_F, &(node_guid)); - node = ibnd_find_node_guid(fabric, switch_guid); + node = ibnd_find_node_guid(fabric, node_guid); if (node) print_node(node, NULL); else -- 1.5.4.5 From weiny2 at llnl.gov Wed Sep 23 17:24:51 2009 From: weiny2 at llnl.gov (Ira Weiny) Date: Wed, 23 Sep 2009 17:24:51 -0700 Subject: [ofa-general] [PATCH] infiniband-diags: Fix IB network discovery from switch node. In-Reply-To: <4A9548AA.4020900@gmail.com> References: <4A9548AA.4020900@gmail.com> Message-ID: <20090923172451.fb20ab9b.weiny2@llnl.gov> Eli, On Wed, 26 Aug 2009 17:37:30 +0300 "Eli Dorfman (Voltaire)" wrote: > Subject: [PATCH] Fix IB network discovery from switch node. Sorry for the late inquiry on this but what exactly was the bug here? I just found that this change introduced a bug. The problem is that if you don't do this query, even when the first found node is a switch, the port you came into the switch on will not get reported properly. Here is what I mean. Running with the current master: 17:19:42 > ./iblinkinfo -S 0x000b8cffff00490c Switch 0x000b8cffff00490c MT47396 Infiniscale-III Mellanox Technologies: 8 1[ ] ==( 4X 2.5 Gbps Down/ Polling)==> [ ] "" ( ) ... 8 9[ ] ==( 4X 2.5 Gbps Down/ Polling)==> [ ] "" ( ) 8 10[ ] ==( 4X 5.0 Gbps Active/ LinkUp)==> 15 24[ ] "ISR9024D Voltaire" ( ) 8 11[ ] ==( 4X 2.5 Gbps Down/ Polling)==> [ ] "" ( ) 8 12[ ] ==( 4X 5.0 Gbps Active/ LinkUp)==> [ ] "" ( ) 8 13[ ] ==( 4X 2.5 Gbps Down/ Polling)==> [ ] "" ( ) ... The DR path "came in" on port 12 and is reported as Active/LinkUp but has no information on the other end. Here is what the output should look like with your change removed. 17:22:36 > ./iblinkinfo -S 0x000b8cffff00490c Switch 0x000b8cffff00490c MT47396 Infiniscale-III Mellanox Technologies: 8 1[ ] ==( 4X 2.5 Gbps Down/ Polling)==> [ ] "" ( ) ... 
8 9[ ] ==( 4X 2.5 Gbps Down/ Polling)==> [ ] "" ( ) 8 10[ ] ==( 4X 5.0 Gbps Active/ LinkUp)==> 15 24[ ] "ISR9024D Voltaire" ( ) 8 11[ ] ==( 4X 2.5 Gbps Down/ Polling)==> [ ] "" ( ) 8 12[ ] ==( 4X 5.0 Gbps Active/ LinkUp)==> 7 8[ ] "Cisco Switch SFS7000D" ( ) 8 13[ ] ==( 4X 2.5 Gbps Down/ Polling)==> [ ] "" ( ) ... This properly reports the other end of this link as another switch. Could you explain the problem a bit more so we can come up with a better solution? Thanks, Ira > > Signed-off-by: Eli Dorfman > --- > infiniband-diags/libibnetdisc/src/ibnetdisc.c | 16 +++++++++------- > 1 files changed, 9 insertions(+), 7 deletions(-) > > diff --git a/infiniband-diags/libibnetdisc/src/ibnetdisc.c b/infiniband-diags/libibnetdisc/src/ibnetdisc.c > index c69467e..779e659 100644 > --- a/infiniband-diags/libibnetdisc/src/ibnetdisc.c > +++ b/infiniband-diags/libibnetdisc/src/ibnetdisc.c > @@ -590,13 +590,15 @@ ibnd_fabric_t *ibnd_discover_fabric(struct ibmad_port * ibmad_port, > if (!port) > goto error; > > - rc = get_remote_node(ibmad_port, fabric, node, port, from, > - mad_get_field(node->info, 0, > - IB_NODE_LOCAL_PORT_F), 0); > - if (rc < 0) > - goto error; > - if (rc > 0) /* non-fatal error, nothing more to be done */ > - return ((ibnd_fabric_t *) fabric); > + if (node->node.type != IB_NODE_SWITCH) { > + rc = get_remote_node(ibmad_port, fabric, node, port, from, > + mad_get_field(node->info, 0, > + IB_NODE_LOCAL_PORT_F), 0); > + if (rc < 0) > + goto error; > + if (rc > 0) /* non-fatal error, nothing more to be done */ > + return ((ibnd_fabric_t *) fabric); > + } > > for (dist = 0; dist <= max_hops; dist++) { > > -- > 1.5.5 > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://*lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://*openib.org/mailman/listinfo/openib-general > -- Ira Weiny Math Programmer/Computer Scientist Lawrence Livermore National Lab 925-423-8008 
weiny2 at llnl.gov From vlad at lists.openfabrics.org Thu Sep 24 03:08:48 2009 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Thu, 24 Sep 2009 03:08:48 -0700 (PDT) Subject: [ofa-general] ofa_1_5_kernel 20090924-0200 daily build status Message-ID: <20090924100848.4C5CEE61F0C@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git git_branch: ofed_kernel_1_5 Common build parameters: Passed: Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.26 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.27 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18-128.el5 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.27 Passed on x86_64 with linux-2.6.9-78.ELsmp Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.26 Passed on ia64 with linux-2.6.25 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Failed: Build failed on x86_64 with linux-2.6.18-164.el5 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration -fno-delete-null-pointer-checks -fwrapv -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx 
-mno-sse2 -mno-3dnow -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(connection)" -D"KBUILD_MODNAME=KBUILD_STR(rds)" -c -o /home/vlad/tmp/ofa_1_5_kernel-20090924-0200_linux-2.6.18-164.el5_x86_64_check/net/rds/.tmp_connection.o /home/vlad/tmp/ofa_1_5_kernel-20090924-0200_linux-2.6.18-164.el5_x86_64_check/net/rds/connection.c /home/vlad/tmp/ofa_1_5_kernel-20090924-0200_linux-2.6.18-164.el5_x86_64_check/net/rds/connection.c: In function 'rds_conn_bucket': /home/vlad/tmp/ofa_1_5_kernel-20090924-0200_linux-2.6.18-164.el5_x86_64_check/net/rds/connection.c:56: warning: passing argument 1 of 'inet_ehashfn' makes integer from pointer without a cast /home/vlad/tmp/ofa_1_5_kernel-20090924-0200_linux-2.6.18-164.el5_x86_64_check/net/rds/connection.c:56: error: too many arguments to function 'inet_ehashfn' make[3]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090924-0200_linux-2.6.18-164.el5_x86_64_check/net/rds/connection.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090924-0200_linux-2.6.18-164.el5_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_5_kernel-20090924-0200_linux-2.6.18-164.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-164.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- From rdreier at cisco.com Thu Sep 24 11:03:17 2009 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 24 Sep 2009 11:03:17 -0700 Subject: [ofa-general] Re: [PATCH v2] mlx4: configure cache line size In-Reply-To: <20090923073658.GA23252@mtls03> (Eli Cohen's message of "Wed, 23 Sep 2009 10:36:58 +0300") References: <20090923073658.GA23252@mtls03> Message-ID: thanks, applied. 
From rdreier at cisco.com Thu Sep 24 11:57:07 2009
From: rdreier at cisco.com (Roland Dreier)
Date: Thu, 24 Sep 2009 11:57:07 -0700
Subject: [ofa-general] Re: [PATCH] mthca: Fix access to freed memory in catas processing
In-Reply-To: <200909211312.43359.jackm@dev.mellanox.co.il> (Jack Morgenstein's message of "Mon, 21 Sep 2009 13:12:42 +0300")
References: <200909211312.43359.jackm@dev.mellanox.co.il>
Message-ID:

Thanks, applied. I almost missed this one because you sent it to general@ and not linux-rdma at vger.kernel.org -- and I was using the fancy patchwork.kernel.org patch tracking stuff to see what I had to apply. So sending things to linux-rdma@ helps you too!

Thanks,
Roland

From rdreier at cisco.com Thu Sep 24 12:00:54 2009
From: rdreier at cisco.com (Roland Dreier)
Date: Thu, 24 Sep 2009 12:00:54 -0700
Subject: [ofa-general] Re: [PATCH V2] IB/ipoib: Do not turn on carrier to a non active port
In-Reply-To: <4AB74A78.2080108@Voltaire.COM> (Moni Shoua's message of "Mon, 21 Sep 2009 12:42:16 +0300")
References: <4AB74A78.2080108@Voltaire.COM>
Message-ID:

thanks, applied.

From hnrose at comcast.net Thu Sep 24 12:34:44 2009
From: hnrose at comcast.net (Hal Rosenstock)
Date: Thu, 24 Sep 2009 15:34:44 -0400
Subject: [ofa-general] [PATCH] ibsim/sim_cmd.c: Only relink port if remote port is currently linked
Message-ID: <20090924193444.GA15377@comcast.net>

When multiple switches are unlinked and then a switch is relinked, it should behave like a cable pull or power down of switch so it depends on the state of the remote peer port (as to linked or not). This is not represented in the IB port/port physical state and is additional state.
Signed-off-by: Hal Rosenstock --- diff --git a/ibsim/sim.h b/ibsim/sim.h index bf85875..52eb73b 100644 --- a/ibsim/sim.h +++ b/ibsim/sim.h @@ -210,6 +211,7 @@ struct Port { int remoteport; Node *previous_remotenode; int previous_remoteport; + int unlinked; int errrate; uint16_t errattr; Node *node; diff --git a/ibsim/sim_cmd.c b/ibsim/sim_cmd.c index cb6e639..d27ab0f 100644 --- a/ibsim/sim_cmd.c +++ b/ibsim/sim_cmd.c @@ -1,5 +1,6 @@ /* * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. + * Copyright (c) 2009 HNR Consulting. All rights reserved. * * This file is part of ibsim. * @@ -146,12 +147,18 @@ static int do_link(FILE * f, char *line) rport = node_get_port(rnode, rportnum); + if (rport->unlinked) { + lport->unlinked = 0; + return -1; + } + if (link_ports(lport, rport) < 0) return -fprintf(f, "# can't link: local/remote port are already connected\n"); lport->previous_remotenode = NULL; rport->previous_remotenode = NULL; + lport->unlinked = 0; return 0; } @@ -194,7 +201,7 @@ static int do_relink(FILE * f, char *line) numports++; // To make the for-loop below run up to last port else lportnum--; - + if (lportnum >= 0) { lport = ports + lnode->portsbase + lportnum; @@ -206,12 +213,18 @@ static int do_relink(FILE * f, char *line) rport = node_get_port(lport->previous_remotenode, lport->previous_remoteport); + if (rport->unlinked) { + lport->unlinked = 0; + return -1; + } + if (link_ports(lport, rport) < 0) return -fprintf(f, "# can't link: local/remote port are already connected\n"); lport->previous_remotenode = NULL; rport->previous_remotenode = NULL; + lport->unlinked = 0; return 1; } @@ -224,11 +237,17 @@ static int do_relink(FILE * f, char *line) rport = node_get_port(lport->previous_remotenode, lport->previous_remoteport); + if (rport->unlinked) { + lport->unlinked = 0; + continue; + } + if (link_ports(lport, rport) < 0) continue; lport->previous_remotenode = NULL; rport->previous_remotenode = NULL; + lport->unlinked = 0; relinked++; } @@ -246,6 
+265,7 @@ static void unlink_port(Node * lnode, Port * lport, Node * rnode, int rportnum) lport->previous_remoteport = lport->remoteport; rport->previous_remotenode = rport->remotenode; rport->previous_remoteport = rport->remoteport; + lport->unlinked = 1; lport->remotenode = rport->remotenode = 0; lport->remoteport = rport->remoteport = 0; @@ -406,6 +426,7 @@ static int do_unlink(FILE * f, char *line, int clear) if (portnum >= 0) { port = ports + node->portsbase + portnum; if (!clear && !port->remotenode) { + port->unlinked = 1; fprintf(f, "# port %d at nodeid \"%s\" is not linked\n", portnum, nodeid); return -1; @@ -420,8 +441,10 @@ static int do_unlink(FILE * f, char *line, int clear) for (port = ports + node->portsbase, e = port + numports; port < e; port++) { - if (!clear && !port->remotenode) + if (!clear && !port->remotenode) { + port->unlinked = 1; continue; + } if (port->remotenode) unlink_port(node, port, port->remotenode, port->remoteport); diff --git a/ibsim/sim_net.c b/ibsim/sim_net.c index 8a5d281..0092068 100644 --- a/ibsim/sim_net.c +++ b/ibsim/sim_net.c @@ -492,6 +492,7 @@ static void init_ports(Node * node, int type, int maxports) port->linkwidth = LINKWIDTH_4x; port->linkspeedena = netspeed; port->linkspeed = LINKSPEED_SDR; + port->unlinked = 0; size = (type == SWITCH_NODE && i) ? 
sw_pkey_size : ca_pkey_size; if (size) {

From rdreier at cisco.com Thu Sep 24 12:45:12 2009
From: rdreier at cisco.com (Roland Dreier)
Date: Thu, 24 Sep 2009 12:45:12 -0700
Subject: [ofa-general] [GIT PULL] please pull infiniband.git
Message-ID:

Linus, please pull from

    master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git for-linus

This tree is also available from kernel.org mirrors at:

    git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git for-linus

This will get the batch of RDMA/InfiniBand changes for the 2.6.32 merge window:

The following changes since commit 86d710146fb9975f04c505ec78caa43d227c1018:
  Linus Torvalds (1):
        Merge git://git.linux-nfs.org/projects/trondmy/nfs-2.6

are available in the git repository at:

  master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git for-linus

Eli Cohen (1):
      mlx4_core: Pass cache line size to device FW

Jack Morgenstein (1):
      IB/mthca: Fix access to freed memory in catastrophic event handling

Julia Lawall (1):
      RDMA/nes: Remove duplicate .ndo_set_mac_address field initialization

Moni Shoua (1):
      IPoIB: Don't turn on carrier for a non-active port

Roland Dreier (2):
      IB/mad: Fix lock-lock-timer deadlock in RMPP code
      Merge branches 'ipoib', 'mad', 'mlx4', 'mthca' and 'nes' into for-linus

 drivers/infiniband/core/mad_rmpp.c             |   17 +++++++++++++----
 drivers/infiniband/hw/mthca/mthca_catas.c      |   11 ++++++++---
 drivers/infiniband/hw/nes/nes_nic.c            |    1 -
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |    7 +++++++
 drivers/net/mlx4/fw.c                          |    5 +++++
 5 files changed, 33 insertions(+), 8 deletions(-)

From weiny2 at llnl.gov Thu Sep 24 23:50:11 2009
From: weiny2 at llnl.gov (Ira Weiny)
Date: Thu, 24 Sep 2009 23:50:11 -0700
Subject: [ofa-general] [PATCH] infiniband-diags/src/ibqueryerrors: Add clear errors and counters options
Message-ID: <20090924235011.a9a16022.weiny2@llnl.gov>

Sasha,

This applies after "infiniband-diags/src/ibqueryerrors: move --all option and replace it with --switch, --ca, --router"

From: Ira Weiny
Date: Thu, 24 Sep 2009 20:39:29 -0700 Subject: [PATCH] infiniband-diags/src/ibqueryerrors: Add clear errors and counters options Add -k and -K options to clear errors and counters. If both are specified they will both be cleared. Update man page In addition fix 2 bugs fix the printing of Xmt Wait errors properly skip the counter select field. Signed-off-by: Ira Weiny --- infiniband-diags/man/ibqueryerrors.8 | 20 +++++-- infiniband-diags/src/ibqueryerrors.c | 91 +++++++++++++++++++++++++++++---- 2 files changed, 94 insertions(+), 17 deletions(-) diff --git a/infiniband-diags/man/ibqueryerrors.8 b/infiniband-diags/man/ibqueryerrors.8 index 8f83a7b..56c6024 100644 --- a/infiniband-diags/man/ibqueryerrors.8 +++ b/infiniband-diags/man/ibqueryerrors.8 @@ -6,15 +6,14 @@ ibqueryerrors.pl \- query and report non-zero IB port counters .SH SYNOPSIS .B ibqueryerrors.pl [-s -c -r -C -P -s -G --D -d] +-D -d -k -K] .SH DESCRIPTION .PP -ibqueryerrors.pl reports the port counters of switches. This is similar to -ibcheckerrors with the additional ability to filter out selected errors, -include the optional transmit and receive data counters, report actions to -remedy a non-zero count, and report full link information for the link -reported. +ibqueryerrors.pl reports port counters. This is similar to ibcheckerrors with +the additional ability to filter out selected errors, include the optional +transmit and receive data counters, and report full link information for the +link reported. .SH OPTIONS @@ -50,6 +49,15 @@ Include the optional transmit and receive data counters. .TP \fB\-\-router\fR print data for routers only .TP +\fB\-\-clear\-errors\fR \fB\-k\fR Clear error counters after read. +\-k and \-K can be used together to clear both errors and counters. +.TP +\fB\-\-clear\-counts\fR \fB\-K\fR Clear data counters after read. +\fBCAUTION\fR clearing data counters will occur regardless of if they are +printed or not. 
This is because data counters are only \fBprinted\fR on ports +which have errors. This means if a port has 0 errors and the \-K option is +specified the data counters will be cleared without any printed output. +.TP \fB\-R\fR (This option is obsolete and does nothing) .SH COMMON OPTIONS diff --git a/infiniband-diags/src/ibqueryerrors.c b/infiniband-diags/src/ibqueryerrors.c index ecfd662..e379a42 100644 --- a/infiniband-diags/src/ibqueryerrors.c +++ b/infiniband-diags/src/ibqueryerrors.c @@ -64,6 +64,8 @@ char *node_guid_str = NULL; int sup_total = 0; enum MAD_FIELDS *suppressed_fields = NULL; char *dr_path = NULL; +int clear_errors = 0; +int clear_counts = 0; #define PRINT_ALL 0xFF /* all nodes default flag */ uint8_t node_type_to_print = PRINT_ALL; @@ -222,6 +224,10 @@ static void print_results(ibnd_node_t * node, uint8_t * pc, int portnum, if (suppress(i)) continue; + /* this is not a counter, skip it */ + if (i == IB_PC_COUNTER_SELECT2_F) + continue; + mad_decode_field(pc, i, (void *)&val); if (val) n += snprintf(str + n, 1024 - n, " [%s == %d]", @@ -232,7 +238,7 @@ static void print_results(ibnd_node_t * node, uint8_t * pc, int portnum, mad_decode_field(pc, IB_PC_XMT_WAIT_F, (void *)&val); if (val) n += snprintf(str + n, 1024 - n, " [%s == %d]", - mad_field_name(i), val); + mad_field_name(IB_PC_XMT_WAIT_F), val); } /* if we found errors. 
*/ @@ -264,13 +270,11 @@ static void print_results(ibnd_node_t * node, uint8_t * pc, int portnum, } } -static void print_port(ibnd_node_t * node, int portnum, int *header_printed) +static int query_cap_mask(ibnd_node_t * node, int portnum, uint16_t * cap_mask) { uint8_t pc[1024]; - uint16_t cap_mask; + uint16_t rc_cap_mask; ib_portid_t portid = { 0 }; - char *nodename = - remap_node_name(node_name_map, node->guid, node->nodedesc); if (node->type == IB_NODE_SWITCH) ib_portid_set(&portid, node->smalid, 0, 0); @@ -281,16 +285,31 @@ static void print_port(ibnd_node_t * node, int portnum, int *header_printed) if (!pma_query_via(pc, &portid, portnum, ibd_timeout, CLASS_PORT_INFO, ibmad_port)) { IBWARN("classportinfo query failed on %s, %s port %d", - nodename, portid2str(&portid), portnum); - goto cleanup; + remap_node_name(node_name_map, node->guid, + node->nodedesc), portid2str(&portid), portnum); + return (-1); } + /* ClassPortInfo should be supported as part of libibmad */ - memcpy(&cap_mask, pc + 2, sizeof(cap_mask)); /* CapabilityMask */ + memcpy(&rc_cap_mask, pc + 2, sizeof(rc_cap_mask)); /* CapabilityMask */ + + *cap_mask = ntohs(rc_cap_mask); + return (0); +} + +static void print_port(ib_portid_t * portid, uint16_t cap_mask, + ibnd_node_t * node, int portnum, int *header_printed) +{ + uint8_t pc[1024]; + char *nodename = + remap_node_name(node_name_map, node->guid, node->nodedesc); + + memset(pc, 0, 1024); - if (!pma_query_via(pc, &portid, portnum, ibd_timeout, + if (!pma_query_via(pc, portid, portnum, ibd_timeout, IB_GSI_PORT_COUNTERS, ibmad_port)) { IBWARN("IB_GSI_PORT_COUNTERS query failed on %s, %s port %d\n", - nodename, portid2str(&portid), portnum); + nodename, portid2str(portid), portnum); goto cleanup; } if (!(cap_mask & 0x1000)) { @@ -304,12 +323,38 @@ cleanup: free(nodename); } +static void clear_port(ib_portid_t * portid, uint16_t cap_mask, + ibnd_node_t * node, int port) +{ + uint8_t pc[1024]; + /* bits defined in Table 228 PortCounters 
CounterSelect and + * CounterSelect2 + */ + uint32_t mask = 0; + + if (!clear_errors && !clear_counts) + return; + + if (clear_errors) + mask |= 0x10FFF; + if (clear_counts) + mask |= 0xF000; + + if (!performance_reset_via(pc, portid, port, mask, ibd_timeout, + IB_GSI_PORT_COUNTERS, ibmad_port)) + IBERROR("Failed to reset errors %s port %d", + node->nodedesc, port); +} + void print_node(ibnd_node_t * node, void *user_data) { int header_printed = 0; int p = 0; int startport = 1; int type = 0; + int all_port_sup = 0; + ib_portid_t portid = { 0 }; + uint16_t cap_mask = 0; switch (node->type) { case IB_NODE_SWITCH: @@ -331,9 +376,25 @@ void print_node(ibnd_node_t * node, void *user_data) for (p = startport; p <= node->numports; p++) { if (node->ports[p]) { - print_port(node, p, &header_printed); + if (query_cap_mask(node, p, &cap_mask) < 0) + continue; + + if (cap_mask & 0x100) + all_port_sup = 1; + + if (node->type == IB_NODE_SWITCH) + ib_portid_set(&portid, node->smalid, 0, 0); + else + ib_portid_set(&portid, node->ports[p]->base_lid, 0, 0); + + print_port(&portid, cap_mask, node, p, &header_printed); + if (!all_port_sup) + clear_port(&portid, cap_mask, node, p); } } + + if (all_port_sup) + clear_port(&portid, cap_mask, node, 0xFF); } static void add_suppressed(enum MAD_FIELDS field) @@ -406,6 +467,12 @@ static int process_opt(void *context, int ch, char *optarg) break; case 'R': /* nop */ break; + case 'k': + clear_errors = 1; + break; + case 'K': + clear_counts = 1; + break; default: return -1; } @@ -443,6 +510,8 @@ int main(int argc, char **argv) {"switch", 3, 0, NULL, "print data for switches only"}, {"ca", 4, 0, NULL, "print data for CA's only"}, {"router", 5, 0, NULL, "print data for routers only"}, + {"clear-errors", 'k', 0, NULL, "Clear error counters after read"}, + {"clear-counts", 'K', 0, NULL, "Clear data counters after read"}, {0} }; char usage_args[] = ""; -- 1.5.4.5 From vlad at lists.openfabrics.org Fri Sep 25 03:07:55 2009 From: vlad at 
lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Fri, 25 Sep 2009 03:07:55 -0700 (PDT) Subject: [ofa-general] ofa_1_5_kernel 20090925-0200 daily build status Message-ID: <20090925100755.9AEA4E61FC9@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git git_branch: ofed_kernel_1_5 Common build parameters: Passed: Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.26 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.27 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18-128.el5 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.27 Passed on x86_64 with linux-2.6.9-78.ELsmp Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.26 Passed on ia64 with linux-2.6.25 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Failed: Build failed on x86_64 with linux-2.6.18-164.el5 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration -fno-delete-null-pointer-checks -fwrapv -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -g -fno-stack-protector 
-Wdeclaration-after-statement -Wno-pointer-sign -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(connection)" -D"KBUILD_MODNAME=KBUILD_STR(rds)" -c -o /home/vlad/tmp/ofa_1_5_kernel-20090925-0200_linux-2.6.18-164.el5_x86_64_check/net/rds/.tmp_connection.o /home/vlad/tmp/ofa_1_5_kernel-20090925-0200_linux-2.6.18-164.el5_x86_64_check/net/rds/connection.c /home/vlad/tmp/ofa_1_5_kernel-20090925-0200_linux-2.6.18-164.el5_x86_64_check/net/rds/connection.c: In function 'rds_conn_bucket': /home/vlad/tmp/ofa_1_5_kernel-20090925-0200_linux-2.6.18-164.el5_x86_64_check/net/rds/connection.c:56: warning: passing argument 1 of 'inet_ehashfn' makes integer from pointer without a cast /home/vlad/tmp/ofa_1_5_kernel-20090925-0200_linux-2.6.18-164.el5_x86_64_check/net/rds/connection.c:56: error: too many arguments to function 'inet_ehashfn' make[3]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090925-0200_linux-2.6.18-164.el5_x86_64_check/net/rds/connection.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_5_kernel-20090925-0200_linux-2.6.18-164.el5_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_5_kernel-20090925-0200_linux-2.6.18-164.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-164.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- From sashak at voltaire.com Fri Sep 25 06:09:08 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Fri, 25 Sep 2009 16:09:08 +0300 Subject: ib_types.h moving [was: Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm] In-Reply-To: <20090917132050.041b077d.weiny2@llnl.gov> References: <20090917101804.12e9e5ce.weiny2@llnl.gov> <20090917132050.041b077d.weiny2@llnl.gov> Message-ID: <20090925130908.GD26931@me> On 13:20 Thu 17 Sep , Ira Weiny wrote: > > Sasha, would you be willing to accept such a patch? First move ib_types.h to umad and then move the long inline functions into the lib and separate out the remaining header. 
>
> Or would you prefer a new library?  I think there is enough code there but I leave it up to you.

Basically, cleaning up the ib_types.h issue (which was raised repeatedly in
the past too) and bringing some order to the libibmad duplications would be
a nice thing.

However, I still do not clearly understand how things should be organized
properly (assuming all the history, ibutils dependencies, etc.). And sure, we
can try to find an optimal model, so let's discuss:

libibumad is an option. However, today this library only provides a layer
over the user_mad kernel API and is actually transparent to the MAD's
structure. Maybe complicating it by adding ib_types.h and some MAD field
access helpers is not a big deal, but it is a sort of disadvantage anyway.

Placing this stuff in a separate library/package is another possibility, but
the prospect of adding a new package doesn't make me happy.

In theory ib_types.h could also be merged with libibmad. However, to me the
current libibmad seems too heavy to be used for anything other than
infiniband-diags.

Any other options?

Now I would likely agree with Ira that moving ib_types.h to libibumad is the
least painful option. Do we have any better ideas?

Sasha

From sashak at voltaire.com Fri Sep 25 06:52:56 2009
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Fri, 25 Sep 2009 16:52:56 +0300
Subject: [ofa-general] Re: [PATCH 1/2] opensm: avoid LASH use-after-free when switch is deleted from fabric.
In-Reply-To: <1253651343.4776.1125.camel@sale659.sandia.gov>
References: <1251486496-24812-1-git-send-email-jaschut@sandia.gov> <1251486496-24812-2-git-send-email-jaschut@sandia.gov> <20090922185014.GF24398@me> <1253651343.4776.1125.camel@sale659.sandia.gov>
Message-ID: <20090925135256.GE26931@me>

Hi Jim,

On 14:29 Tue 22 Sep , Jim Schutt wrote:
>
> I'm working on another routing engine

That is interesting. What will be the key features of this new routing engine?

> that also uses osm_switch_t:priv
> to point to data that persists between calls to the routing engine, as
> LASH does.  And like LASH my objects have a pointer to the corresponding
> osm_switch_t.
>
> Since trying to implement this engine was my first experience with
> opensm, it checks these links before using them by making sure
> that the osm_switch_t my object references points back to my
> object, because I'm paranoid.  And my engine doesn't overwrite
> pointers it expects to be NULL, because I'm really paranoid.
>
> So under circumstances where I had two routing engines configured,
> if my engine failed over to LASH because of some problem caused by
> downing a switch in the fabric, then routing reverted back to my
> engine when the problem cleared up, non-NULL osm_switch_t:priv
> values would keep my engine from working.
>
> So I came up with this priv_release() business to provide a general
> way for the opensm core to clean up after unexpected behavior
> of a routing engine.

I think that such "debug-only" things are absolutely fine during
development, but don't add much at production run-time (the exception could
be some extremely complex flows, which would be better avoided for other
reasons :)). Also, in some cases such extra validations may hide bugs.

> In the event you remove the 'p_sw->priv = NULL' line as the fix
> to the use-after-free issue,

I guess that it is the simplest way to fix this for now.

> and I get my routing engine into
> good enough shape to submit, should I resubmit this patch too,

If it is necessary for your code.

> or should I be less paranoid and remove the extra checks in
> my engine?

If these are only "debug" cases, I would suggest cleaning this up after the
code stabilizes. But let's see then...

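The back-pointer checks Jim describes can be sketched roughly as follows. This is only an illustration under stated assumptions: struct sm_switch and struct eng_switch are hypothetical stand-ins for osm_switch_t and an engine's per-switch object, and eng_attach() / eng_link_ok() are invented names, not OpenSM API.

```c
#include <stddef.h>

/* Hypothetical stand-in for osm_switch_t: only the 'priv' slot matters here. */
struct sm_switch {
	void *priv;		/* slot a routing engine may use for private data */
};

/* Hypothetical per-switch object owned by a routing engine. */
struct eng_switch {
	struct sm_switch *p_sw;	/* back pointer to the core switch object */
};

/*
 * Attach only when the slot is free ("don't overwrite pointers expected
 * to be NULL"). Returns 0 on success, -1 if 'priv' is already occupied.
 */
static int eng_attach(struct eng_switch *es, struct sm_switch *sw)
{
	if (sw->priv != NULL)
		return -1;
	es->p_sw = sw;
	sw->priv = es;
	return 0;
}

/* The link is considered valid only when both objects point at each other. */
static int eng_link_ok(const struct eng_switch *es)
{
	return es->p_sw != NULL && es->p_sw->priv == es;
}
```

Note that eng_link_ok() itself dereferences p_sw, so it cannot detect a switch that has already been freed. That is the same hazard as the 'sw->p_sw->priv = NULL' line discussed in this thread.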
Sasha From hal.rosenstock at gmail.com Fri Sep 25 07:07:28 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Fri, 25 Sep 2009 10:07:28 -0400 Subject: [ofa-general] [PATCH] infiniband-diags/src/ibqueryerrors: Add clear errors and counters options In-Reply-To: <20090924235011.a9a16022.weiny2@llnl.gov> References: <20090924235011.a9a16022.weiny2@llnl.gov> Message-ID: Ira, See one minor comment below: On Fri, Sep 25, 2009 at 2:50 AM, Ira Weiny wrote: > Sasha, > > This applies after > > "infiniband-diags/src/ibqueryerrors: move --all option and replace it > with > --switch, --ca, --router" > > > From: Ira Weiny > Date: Thu, 24 Sep 2009 20:39:29 -0700 > Subject: [PATCH] infiniband-diags/src/ibqueryerrors: Add clear errors and > counters options > > Add -k and -K options to clear errors and counters. If both are > specified they will both be cleared. > Nice efficiency improvement over running a subsequent ibclearerrors/counters :-) > > Update man page > > In addition fix 2 bugs > fix the printing of Xmt Wait errors > properly skip the counter select field. 
> > Signed-off-by: Ira Weiny > --- > infiniband-diags/man/ibqueryerrors.8 | 20 +++++-- > infiniband-diags/src/ibqueryerrors.c | 91 > +++++++++++++++++++++++++++++---- > 2 files changed, 94 insertions(+), 17 deletions(-) > > > diff --git a/infiniband-diags/src/ibqueryerrors.c > b/infiniband-diags/src/ibqueryerrors.c > index ecfd662..e379a42 100644 > --- a/infiniband-diags/src/ibqueryerrors.c > +++ b/infiniband-diags/src/ibqueryerrors.c > > +static void clear_port(ib_portid_t * portid, uint16_t cap_mask, > + ibnd_node_t * node, int port) > +{ > + uint8_t pc[1024]; > + /* bits defined in Table 228 PortCounters CounterSelect and > + * CounterSelect2 > + */ > + uint32_t mask = 0; > + > + if (!clear_errors && !clear_counts) > + return; > + > + if (clear_errors) > + mask |= 0x10FFF; > Since PortXmitWait setting is new, shouldn't the setting of this bit in the mask be conditionalized on the CapabilityMask indicating that this is supported ? That seems safer to me. -- Hal > + if (clear_counts) > + mask |= 0xF000; > + > + if (!performance_reset_via(pc, portid, port, mask, ibd_timeout, > + IB_GSI_PORT_COUNTERS, ibmad_port)) > + IBERROR("Failed to reset errors %s port %d", > + node->nodedesc, port); > +} > + > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sashak at voltaire.com Fri Sep 25 07:50:45 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Fri, 25 Sep 2009 17:50:45 +0300 Subject: [ofa-general] [PATCH] opensm/osm_ucast_lash: fix use after free bug In-Reply-To: <20090925135256.GE26931@me> References: <1251486496-24812-1-git-send-email-jaschut@sandia.gov> <1251486496-24812-2-git-send-email-jaschut@sandia.gov> <20090922185014.GF24398@me> <1253651343.4776.1125.camel@sale659.sandia.gov> <20090925135256.GE26931@me> Message-ID: <20090925145045.GF26931@me> When LASH runs its switch structures cleanup OpenSM can rediscover a subnet and 'p_sw' pointer may refer already freed memory, so don't touch it, just free our own stuff. 
(Note also that for valids OpenSM switches objects' 'priv' pointers are cleared on lash_cleanup()). Signed-off-by: Sasha Khapyorsky --- opensm/opensm/osm_ucast_lash.c | 5 +---- 1 files changed, 1 insertions(+), 4 deletions(-) diff --git a/opensm/opensm/osm_ucast_lash.c b/opensm/opensm/osm_ucast_lash.c index dbc6bcc..3c424cb 100644 --- a/opensm/opensm/osm_ucast_lash.c +++ b/opensm/opensm/osm_ucast_lash.c @@ -628,8 +628,7 @@ static switch_t *switch_create(lash_t * p_lash, unsigned id, osm_switch_t * p_sw } sw->p_sw = p_sw; - if (p_sw) - p_sw->priv = sw; + p_sw->priv = sw; if (osm_mesh_node_create(p_lash, sw)) { free(sw->dij_channels); @@ -644,8 +643,6 @@ static void switch_delete(lash_t *p_lash, switch_t * sw) { if (sw->dij_channels) free(sw->dij_channels); - if (sw->p_sw) - sw->p_sw->priv = NULL; free(sw); } -- 1.6.5.rc1 From weiny2 at llnl.gov Fri Sep 25 09:33:58 2009 From: weiny2 at llnl.gov (Ira Weiny) Date: Fri, 25 Sep 2009 09:33:58 -0700 Subject: [ofa-general] [PATCH V2] infiniband-diags/src/ibqueryerrors: Add clear errors and counters options In-Reply-To: References: <20090924235011.a9a16022.weiny2@llnl.gov> Message-ID: <20090925093358.f9f747d4.weiny2@llnl.gov> On Fri, 25 Sep 2009 10:07:28 -0400 Hal Rosenstock wrote: > Ira, > See one minor comment below: > > > On Fri, Sep 25, 2009 at 2:50 AM, Ira Weiny wrote: > > > Sasha, > > > > This applies after > > > > "infiniband-diags/src/ibqueryerrors: move --all option and replace it > > with > > --switch, --ca, --router" > > > > > > From: Ira Weiny > > Date: Thu, 24 Sep 2009 20:39:29 -0700 > > Subject: [PATCH] infiniband-diags/src/ibqueryerrors: Add clear errors and > > counters options > > > > Add -k and -K options to clear errors and counters. If both are > > specified they will both be cleared. 
> > > > Nice efficiency improvement over running a subsequent ibclearerrors/counters > :-) > > > > > > Update man page > > > > In addition fix 2 bugs > > fix the printing of Xmt Wait errors > > properly skip the counter select field. > > > > Signed-off-by: Ira Weiny > > --- > > infiniband-diags/man/ibqueryerrors.8 | 20 +++++-- > > infiniband-diags/src/ibqueryerrors.c | 91 > > +++++++++++++++++++++++++++++---- > > 2 files changed, 94 insertions(+), 17 deletions(-) > > > > > > > > diff --git a/infiniband-diags/src/ibqueryerrors.c > > b/infiniband-diags/src/ibqueryerrors.c > > index ecfd662..e379a42 100644 > > --- a/infiniband-diags/src/ibqueryerrors.c > > +++ b/infiniband-diags/src/ibqueryerrors.c > > > > > > > +static void clear_port(ib_portid_t * portid, uint16_t cap_mask, > > + ibnd_node_t * node, int port) > > +{ > > + uint8_t pc[1024]; > > + /* bits defined in Table 228 PortCounters CounterSelect and > > + * CounterSelect2 > > + */ > > + uint32_t mask = 0; > > + > > + if (!clear_errors && !clear_counts) > > + return; > > + > > + if (clear_errors) > > + mask |= 0x10FFF; > > > Since PortXmitWait setting is new, shouldn't the setting of this bit in the > mask be conditionalized on the CapabilityMask indicating that this is > supported ? That seems safer to me. Yes, I forgot about that. I passed the cap_mask in! ;-) V2 is below. Ira From: Ira Weiny Date: Thu, 24 Sep 2009 20:39:29 -0700 Subject: [PATCH] infiniband-diags/src/ibqueryerrors: Add clear errors and counters options V2 add check for XMT_WAIT support on clear Add -k and -K options to clear errors and counters. If both are specified they will both be cleared. Update man page In addition fix 2 bugs fix the printing of Xmt Wait errors properly skip the counter select field. 
Signed-off-by: Ira Weiny --- infiniband-diags/man/ibqueryerrors.8 | 20 +++++-- infiniband-diags/src/ibqueryerrors.c | 94 ++++++++++++++++++++++++++++++---- 2 files changed, 97 insertions(+), 17 deletions(-) diff --git a/infiniband-diags/man/ibqueryerrors.8 b/infiniband-diags/man/ibqueryerrors.8 index 8f83a7b..56c6024 100644 --- a/infiniband-diags/man/ibqueryerrors.8 +++ b/infiniband-diags/man/ibqueryerrors.8 @@ -6,15 +6,14 @@ ibqueryerrors.pl \- query and report non-zero IB port counters .SH SYNOPSIS .B ibqueryerrors.pl [-s -c -r -C -P -s -G --D -d] +-D -d -k -K] .SH DESCRIPTION .PP -ibqueryerrors.pl reports the port counters of switches. This is similar to -ibcheckerrors with the additional ability to filter out selected errors, -include the optional transmit and receive data counters, report actions to -remedy a non-zero count, and report full link information for the link -reported. +ibqueryerrors.pl reports port counters. This is similar to ibcheckerrors with +the additional ability to filter out selected errors, include the optional +transmit and receive data counters, and report full link information for the +link reported. .SH OPTIONS @@ -50,6 +49,15 @@ Include the optional transmit and receive data counters. .TP \fB\-\-router\fR print data for routers only .TP +\fB\-\-clear\-errors\fR \fB\-k\fR Clear error counters after read. +\-k and \-K can be used together to clear both errors and counters. +.TP +\fB\-\-clear\-counts\fR \fB\-K\fR Clear data counters after read. +\fBCAUTION\fR clearing data counters will occur regardless of if they are +printed or not. This is because data counters are only \fBprinted\fR on ports +which have errors. This means if a port has 0 errors and the \-K option is +specified the data counters will be cleared without any printed output. 
+.TP \fB\-R\fR (This option is obsolete and does nothing) .SH COMMON OPTIONS diff --git a/infiniband-diags/src/ibqueryerrors.c b/infiniband-diags/src/ibqueryerrors.c index ecfd662..f36cf0d 100644 --- a/infiniband-diags/src/ibqueryerrors.c +++ b/infiniband-diags/src/ibqueryerrors.c @@ -64,6 +64,8 @@ char *node_guid_str = NULL; int sup_total = 0; enum MAD_FIELDS *suppressed_fields = NULL; char *dr_path = NULL; +int clear_errors = 0; +int clear_counts = 0; #define PRINT_ALL 0xFF /* all nodes default flag */ uint8_t node_type_to_print = PRINT_ALL; @@ -222,6 +224,10 @@ static void print_results(ibnd_node_t * node, uint8_t * pc, int portnum, if (suppress(i)) continue; + /* this is not a counter, skip it */ + if (i == IB_PC_COUNTER_SELECT2_F) + continue; + mad_decode_field(pc, i, (void *)&val); if (val) n += snprintf(str + n, 1024 - n, " [%s == %d]", @@ -232,7 +238,7 @@ static void print_results(ibnd_node_t * node, uint8_t * pc, int portnum, mad_decode_field(pc, IB_PC_XMT_WAIT_F, (void *)&val); if (val) n += snprintf(str + n, 1024 - n, " [%s == %d]", - mad_field_name(i), val); + mad_field_name(IB_PC_XMT_WAIT_F), val); } /* if we found errors. 
*/ @@ -264,13 +270,11 @@ static void print_results(ibnd_node_t * node, uint8_t * pc, int portnum, } } -static void print_port(ibnd_node_t * node, int portnum, int *header_printed) +static int query_cap_mask(ibnd_node_t * node, int portnum, uint16_t * cap_mask) { uint8_t pc[1024]; - uint16_t cap_mask; + uint16_t rc_cap_mask; ib_portid_t portid = { 0 }; - char *nodename = - remap_node_name(node_name_map, node->guid, node->nodedesc); if (node->type == IB_NODE_SWITCH) ib_portid_set(&portid, node->smalid, 0, 0); @@ -281,16 +285,31 @@ static void print_port(ibnd_node_t * node, int portnum, int *header_printed) if (!pma_query_via(pc, &portid, portnum, ibd_timeout, CLASS_PORT_INFO, ibmad_port)) { IBWARN("classportinfo query failed on %s, %s port %d", - nodename, portid2str(&portid), portnum); - goto cleanup; + remap_node_name(node_name_map, node->guid, + node->nodedesc), portid2str(&portid), portnum); + return (-1); } + /* ClassPortInfo should be supported as part of libibmad */ - memcpy(&cap_mask, pc + 2, sizeof(cap_mask)); /* CapabilityMask */ + memcpy(&rc_cap_mask, pc + 2, sizeof(rc_cap_mask)); /* CapabilityMask */ + + *cap_mask = ntohs(rc_cap_mask); + return (0); +} + +static void print_port(ib_portid_t * portid, uint16_t cap_mask, + ibnd_node_t * node, int portnum, int *header_printed) +{ + uint8_t pc[1024]; + char *nodename = + remap_node_name(node_name_map, node->guid, node->nodedesc); + + memset(pc, 0, 1024); - if (!pma_query_via(pc, &portid, portnum, ibd_timeout, + if (!pma_query_via(pc, portid, portnum, ibd_timeout, IB_GSI_PORT_COUNTERS, ibmad_port)) { IBWARN("IB_GSI_PORT_COUNTERS query failed on %s, %s port %d\n", - nodename, portid2str(&portid), portnum); + nodename, portid2str(portid), portnum); goto cleanup; } if (!(cap_mask & 0x1000)) { @@ -304,12 +323,41 @@ cleanup: free(nodename); } +static void clear_port(ib_portid_t * portid, uint16_t cap_mask, + ibnd_node_t * node, int port) +{ + uint8_t pc[1024]; + /* bits defined in Table 228 PortCounters 
CounterSelect and + * CounterSelect2 + */ + uint32_t mask = 0; + + if (!clear_errors && !clear_counts) + return; + + if (clear_errors) { + mask |= 0xFFF; + if (cap_mask & 0x1000) + mask |= 0x10000; + } + if (clear_counts) + mask |= 0xF000; + + if (!performance_reset_via(pc, portid, port, mask, ibd_timeout, + IB_GSI_PORT_COUNTERS, ibmad_port)) + IBERROR("Failed to reset errors %s port %d", + node->nodedesc, port); +} + void print_node(ibnd_node_t * node, void *user_data) { int header_printed = 0; int p = 0; int startport = 1; int type = 0; + int all_port_sup = 0; + ib_portid_t portid = { 0 }; + uint16_t cap_mask = 0; switch (node->type) { case IB_NODE_SWITCH: @@ -331,9 +379,25 @@ void print_node(ibnd_node_t * node, void *user_data) for (p = startport; p <= node->numports; p++) { if (node->ports[p]) { - print_port(node, p, &header_printed); + if (query_cap_mask(node, p, &cap_mask) < 0) + continue; + + if (cap_mask & 0x100) + all_port_sup = 1; + + if (node->type == IB_NODE_SWITCH) + ib_portid_set(&portid, node->smalid, 0, 0); + else + ib_portid_set(&portid, node->ports[p]->base_lid, 0, 0); + + print_port(&portid, cap_mask, node, p, &header_printed); + if (!all_port_sup) + clear_port(&portid, cap_mask, node, p); } } + + if (all_port_sup) + clear_port(&portid, cap_mask, node, 0xFF); } static void add_suppressed(enum MAD_FIELDS field) @@ -406,6 +470,12 @@ static int process_opt(void *context, int ch, char *optarg) break; case 'R': /* nop */ break; + case 'k': + clear_errors = 1; + break; + case 'K': + clear_counts = 1; + break; default: return -1; } @@ -443,6 +513,8 @@ int main(int argc, char **argv) {"switch", 3, 0, NULL, "print data for switches only"}, {"ca", 4, 0, NULL, "print data for CA's only"}, {"router", 5, 0, NULL, "print data for routers only"}, + {"clear-errors", 'k', 0, NULL, "Clear error counters after read"}, + {"clear-counts", 'K', 0, NULL, "Clear data counters after read"}, {0} }; char usage_args[] = ""; -- 1.5.4.5 From sean.hefty at intel.com Fri Sep 25 
10:19:15 2009 From: sean.hefty at intel.com (Sean Hefty) Date: Fri, 25 Sep 2009 10:19:15 -0700 Subject: ib_types.h moving [was: Re: [ofa-general] [RFC] 3/5: IB ACM: libibacm] In-Reply-To: <20090925130908.GD26931@me> References: <20090917101804.12e9e5ce.weiny2@llnl.gov> <20090917132050.041b077d.weiny2@llnl.gov> <20090925130908.GD26931@me> Message-ID: >Now I likely would agree with Ira that moving ib_types.h to libibumad >is a least painful option. Do we have a better ideas? Just a random thought, but what about longer term adding a second set of interfaces to libibumad? Basically, something more like the kernel ib_sa. I don't know that we need a new library just to expand the interface. For ib_types.h, I'd rather see it broken up into separate header files, at least some of which get distributed with libibumad. - Sean From hal.rosenstock at gmail.com Fri Sep 25 16:11:31 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Fri, 25 Sep 2009 19:11:31 -0400 Subject: [ofa-general] Re: [PATCH] opensm/osm_ucast_lash: fix use after free bug In-Reply-To: <20090925145045.GF26931@me> References: <1251486496-24812-1-git-send-email-jaschut@sandia.gov> <1251486496-24812-2-git-send-email-jaschut@sandia.gov> <20090922185014.GF24398@me> <1253651343.4776.1125.camel@sale659.sandia.gov> <20090925135256.GE26931@me> <20090925145045.GF26931@me> Message-ID: On 9/25/09, Sasha Khapyorsky wrote: > > When LASH runs its switch structures cleanup OpenSM can rediscover a > subnet and 'p_sw' pointer may refer already freed memory, so don't touch > it, just free our own stuff. (Note also that for valids OpenSM switches > objects' 'priv' pointers are cleared on lash_cleanup()). 
> > Signed-off-by: Sasha Khapyorsky Tested-by: Hal Rosenstock > --- > opensm/opensm/osm_ucast_lash.c | 5 +---- > 1 files changed, 1 insertions(+), 4 deletions(-) > > diff --git a/opensm/opensm/osm_ucast_lash.c b/opensm/opensm/osm_ucast_lash.c > index dbc6bcc..3c424cb 100644 > --- a/opensm/opensm/osm_ucast_lash.c > +++ b/opensm/opensm/osm_ucast_lash.c > @@ -628,8 +628,7 @@ static switch_t *switch_create(lash_t * p_lash, unsigned > id, osm_switch_t * p_sw > } > > sw->p_sw = p_sw; > - if (p_sw) > - p_sw->priv = sw; > + p_sw->priv = sw; > > if (osm_mesh_node_create(p_lash, sw)) { > free(sw->dij_channels); > @@ -644,8 +643,6 @@ static void switch_delete(lash_t *p_lash, switch_t * sw) > { > if (sw->dij_channels) > free(sw->dij_channels); > - if (sw->p_sw) > - sw->p_sw->priv = NULL; > free(sw); > } > > -- > 1.6.5.rc1 > > From vlad at lists.openfabrics.org Sat Sep 26 03:07:02 2009 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Sat, 26 Sep 2009 03:07:02 -0700 (PDT) Subject: [ofa-general] ofa_1_5_kernel 20090926-0200 daily build status Message-ID: <20090926100702.61C43E62040@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git git_branch: ofed_kernel_1_5 Common build parameters: Passed: Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.26 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.27 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18-164.el5 Passed on x86_64 with linux-2.6.18-128.el5 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.26 
Passed on x86_64 with linux-2.6.27 Passed on x86_64 with linux-2.6.9-78.ELsmp Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.26 Passed on ia64 with linux-2.6.25 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Failed: From bugzilla-daemon at bugzilla.kernel.org Sat Sep 26 07:54:37 2009 From: bugzilla-daemon at bugzilla.kernel.org (bugzilla-daemon at bugzilla.kernel.org) Date: Sat, 26 Sep 2009 14:54:37 GMT Subject: [ofa-general] [Bug 14235] New: SRP initiator lockup Message-ID: http://bugzilla.kernel.org/show_bug.cgi?id=14235 Summary: SRP initiator lockup Product: Drivers Version: 2.5 Platform: All OS/Version: Linux Tree: Mainline Status: NEW Severity: high Priority: P1 Component: Infiniband/RDMA AssignedTo: drivers_infiniband-rdma at kernel-bugs.osdl.org ReportedBy: bart.vanassche at gmail.com Regression: No If an SRP target processes SRP I/O slow enough, the SRP initiator locks up. This issue is 100% reproducible with the following setup: Target: * Kernel 2.6.30.4 with SCST patches applied and kernel debugging enabled. * SCST r1153 with EXTRA_CFLAGS += -DCONFIG_SCST_TRACING -DCONFIG_SCST_DEBUG -g added in srpt/src/Makefile and with EXTRA_CFLAGS += -DCONFIG_SCST_TRACING added in scst/src/Makefile. * ib_srpt loaded with kernel module parameters thread=0 and processing_delay_in_us=500. Initiator: * Kernel 2.6.31.1 with kernel debugging enabled. 
* SRP login has been performed as follows: rmmod ib_srp; modprobe ib_srp; ibsrpdm -c | while read target_info; do echo "${target_info}"; echo "${target_info}" > /sys/class/infiniband_srp/srp-mlx4_0-1/add_target; done * After SRP login succeeded the following fio command was started: fio --rw=rw --bs=64M --rwmixread=100 --numjobs=1 --iodepth=1 --sync=0 --direct=1 --ioengine=sync --filename=/dev/${srp_initiator_device} --name=test --loops=1000 --runtime=600 --size=2G After a few minutes fio locked up (I/O rate dropped from 1500 MB/s to 0 MB/s) and the following kernel message started appearing periodically: INFO: task fio:6389 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. fio D 0000000000000000 0 6389 6388 0x00000000 ffff880071dc5bd8 0000000000000046 ffff880071dc5b08 000000018107764d 0000000000012cc0 000000000000de20 0000000000000001 ffff880070cd8000 ffff880070cd83b0 0000000100000000 000000010001193e ffff88007fb99050 Call Trace: [] ? _spin_unlock_irqrestore+0x65/0x80 [] io_schedule+0x37/0x50 [] __blockdev_direct_IO+0x692/0xd80 [] ? get_super+0x27/0xc0 [] blkdev_direct_IO+0x49/0x50 [] ? blkdev_get_blocks+0x0/0xc0 [] generic_file_aio_read+0x679/0x690 [] ? __dentry_open+0x13a/0x340 [] do_sync_read+0xf1/0x140 [] ? trace_hardirqs_on_caller+0x14d/0x1a0 [] ? autoremove_wake_function+0x0/0x40 [] ? trace_hardirqs_on_caller+0x14d/0x1a0 [] ? trace_hardirqs_on+0xd/0x10 [] vfs_read+0xc8/0x180 [] sys_read+0x50/0x90 [] system_call_fastpath+0x16/0x1b no locks held by fio/6389. -- Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. 
From hnrose at comcast.net Sat Sep 26 14:17:26 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Sat, 26 Sep 2009 17:17:26 -0400 Subject: [ofa-general] [PATCH] infiniband-diags/perfquery.c: Fix extended counter reset mask Message-ID: <20090926211726.GA29861@comcast.net> to not have any bits on for reserved components Signed-off-by: Hal Rosenstock --- diff --git a/infiniband-diags/src/perfquery.c b/infiniband-diags/src/perfquery.c index d70af9e..5d4046b 100644 --- a/infiniband-diags/src/perfquery.c +++ b/infiniband-diags/src/perfquery.c @@ -91,6 +91,8 @@ struct perf_count perf_count = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; struct perf_count_ext perf_count_ext = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; +int not_def_mask = 0; + #define ALL_PORTS 0xFF /* Notes: IB semantics is to cap counters if count has exceeded limits. @@ -337,8 +339,10 @@ static void reset_counters(int extended, int timeout, int mask, IB_GSI_PORT_COUNTERS, srcport)) IBERROR("perf reset"); } else { - if (!performance_reset_via(pc, portid, port, mask, timeout, - IB_GSI_PORT_COUNTERS_EXT, srcport)) + if (!performance_reset_via(pc, portid, port, + not_def_mask ? 
mask : mask & 0xff, + timeout, IB_GSI_PORT_COUNTERS_EXT, + srcport)) IBERROR("perf ext reset"); } } @@ -476,8 +480,10 @@ int main(int argc, char **argv) if (argc > 1) port = strtoul(argv[1], 0, 0); - if (argc > 2) + if (argc > 2) { mask = strtoul(argv[2], 0, 0); + not_def_mask = 1; + } srcport = mad_rpc_open_port(ibd_ca, ibd_ca_port, mgmt_classes, 4); if (!srcport) From vlad at lists.openfabrics.org Sun Sep 27 03:09:20 2009 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Sun, 27 Sep 2009 03:09:20 -0700 (PDT) Subject: [ofa-general] ofa_1_5_kernel 20090927-0200 daily build status Message-ID: <20090927100920.7BD7FE620E2@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git git_branch: ofed_kernel_1_5 Common build parameters: Passed: Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.26 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.27 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18-164.el5 Passed on x86_64 with linux-2.6.18-128.el5 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.27 Passed on x86_64 with linux-2.6.9-78.ELsmp Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.26 Passed on ia64 with linux-2.6.25 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Failed: 
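The counter-reset masks from the two infiniband-diags patches earlier in this digest (ibqueryerrors V2 and the perfquery fix) can be sketched as plain functions. The constants are copied from the posted diffs. The function names and the CAP_XMT_WAIT macro are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Bit the V2 patch tests as "cap_mask & 0x1000" before selecting XmtWait. */
#define CAP_XMT_WAIT 0x1000

/*
 * Build the CounterSelect/CounterSelect2 mask the way the V2 clear_port()
 * does: error counters always, the XmtWait bit only when the capability
 * mask reports support, data counters on request.
 */
static uint32_t clear_mask(int clear_errors, int clear_counts, uint16_t cap_mask)
{
	uint32_t mask = 0;

	if (clear_errors) {
		mask |= 0xFFF;
		if (cap_mask & CAP_XMT_WAIT)
			mask |= 0x10000;
	}
	if (clear_counts)
		mask |= 0xF000;
	return mask;
}

/*
 * perfquery fix: for the extended counters, a mask the user did not supply
 * is cut down to its low 8 bits; a user-supplied mask is passed unchanged.
 */
static uint32_t ext_reset_mask(uint32_t mask, int user_supplied)
{
	return user_supplied ? mask : (mask & 0xff);
}
```

For example, clear_mask(1, 0, 0) leaves the 0x10000 bit off, which is the behaviour Hal asked for on ports that do not report XmtWait support.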
From sashak at voltaire.com Sun Sep 27 12:36:16 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 27 Sep 2009 21:36:16 +0200 Subject: [ofa-general] Re: [PATCH 1/2] opensm: avoid LASH use-after-free when switch is deleted from fabric. In-Reply-To: <1253917245.4776.1224.camel@sale659.sandia.gov> References: <1251486496-24812-1-git-send-email-jaschut@sandia.gov> <1251486496-24812-2-git-send-email-jaschut@sandia.gov> <20090922185014.GF24398@me> <1253651343.4776.1125.camel@sale659.sandia.gov> <20090925135256.GE26931@me> <1253917245.4776.1224.camel@sale659.sandia.gov> Message-ID: <20090927193616.GH26931@me> On 16:20 Fri 25 Sep , Jim Schutt wrote: > > On Fri, 2009-09-25 at 07:52 -0600, Sasha Khapyorsky wrote: > > Hi Jim, > > > > On 14:29 Tue 22 Sep , Jim Schutt wrote: > > > > > > I'm working on another routing engine > > > > That is interesting. What will be a key features of this new routing > > engine? > > It's designed to provide the following functionality on a 3D torus: > - routing that is free of credit loops > - two levels of QoS, assuming switches support 8 data VLs > - ability to route around a single failed switch, and/or multiple failed > links, without > - introducing credit loops > - changing path SL values > - short run times, with good scaling properties as fabric size > increases Sounds great :). 
Sasha

From sashak at voltaire.com Sun Sep 27 12:46:51 2009
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sun, 27 Sep 2009 21:46:51 +0200
Subject: [ofa-general] Re: [PATCH] opensm/osm_mesh.c: Add dump_mesh routine at OSM_LOG_DEBUG log level
In-Reply-To: <20090922183858.GA1984@comcast.net>
References: <20090922183858.GA1984@comcast.net>
Message-ID: <20090927194651.GI26931@me>

On 14:38 Tue 22 Sep , Hal Rosenstock wrote:
>
> Signed-off-by: Hal Rosenstock
> ---
> diff --git a/opensm/opensm/osm_mesh.c b/opensm/opensm/osm_mesh.c
> index 260e2f8..beb6bd7 100644
> --- a/opensm/opensm/osm_mesh.c
> +++ b/opensm/opensm/osm_mesh.c
> @@ -1565,6 +1565,39 @@ err:
>  	return -1;
>  }
>
> +static void dump_mesh(lash_t *p_lash)
> +{
> +	osm_log_t *p_log = &p_lash->p_osm->log;
> +	int sw;
> +	int num_switches = p_lash->num_switches;
> +	int dimension;
> +	int i, j, k;
> +	switch_t *s, *s2;
> +	char buf[256], *p;
> +
> +	OSM_LOG_ENTER(p_log);
> +
> +	for (sw = 0; sw < num_switches; sw++) {
> +		p = buf;
> +		s = p_lash->switches[sw];
> +		dimension = s->node->dimension;
> +		p += sprintf(p, "[");
> +		for (i = 0; i < dimension; i++)
> +			p += sprintf(p, "%2d%s", s->node->coord[i],
> +				     (i == dimension - 1) ? "]" : ",");

I think that you can move the ']' printing out of the loop; just place it as
the last character (it may be easier using snprintf()).

> +		for (j = 0; j < s->node->num_links; j++) {
> +			s2 = p_lash->switches[s->node->links[j]->switch_id];
> +			p += sprintf(p, " [%d]->[", j);
> +			for (k = 0; k < dimension; k++)
> +				p += sprintf(p, "%2d%s", s2->node->coord[k],
> +					     (k == dimension - 1) ? "] " : ",");
> +		}

With the sprintf() calls above, should we care about potential 'buf'
overflow (and probably use snprintf() instead)?

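Both review comments above (closing ']' outside the loop, bounded writes) could look roughly like this. It is a minimal sketch, not the code that was eventually merged: buf_append() and format_coord() are hypothetical helpers, and only the standard vsnprintf() call is assumed.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical helper: append formatted text at offset 'off' in buf[size]
 * without ever writing past the end. Returns the new offset, clamped to
 * size - 1 once the buffer is full.
 */
static size_t buf_append(char *buf, size_t size, size_t off, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (size == 0 || off >= size - 1)
		return off;		/* no room left, output stays truncated */

	va_start(ap, fmt);
	n = vsnprintf(buf + off, size - off, fmt, ap);
	va_end(ap);

	if (n < 0)
		return off;		/* formatting error, report no progress */
	if ((size_t)n >= size - off)
		return size - 1;	/* truncated, buffer is full and terminated */
	return off + (size_t)n;
}

/* Format coordinates as "[ 1, 2, 3]", writing the ']' after the loop. */
static size_t format_coord(char *buf, size_t size, size_t off,
			   const int *coord, int dimension)
{
	int i;

	off = buf_append(buf, size, off, "[");
	for (i = 0; i < dimension; i++)
		off = buf_append(buf, size, off, "%2d%s", coord[i],
				 i == dimension - 1 ? "" : ",");
	return buf_append(buf, size, off, "]");
}
```

When the buffer is too small the output is silently cut short rather than overrun, which seems acceptable for a debug-level dump. A caller that cares can compare the returned offset against size - 1.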
Sasha > + OSM_LOG(p_log, OSM_LOG_DEBUG, "%s\n", buf); > + } > + > + OSM_LOG_EXIT(p_log); > +} > + > /* > * osm_do_mesh_analysis > */ > @@ -1653,6 +1686,9 @@ int osm_do_mesh_analysis(lash_t *p_lash) > OSM_LOG(p_log, OSM_LOG_INFO, "%s", buf); > } > > + if (osm_log_is_active(p_log, OSM_LOG_DEBUG)) > + dump_mesh(p_lash); > + > done: > mesh_delete(mesh); > OSM_LOG_EXIT(p_log); > From khris4 at gmail.com Sun Sep 27 13:42:40 2009 From: khris4 at gmail.com (chris) Date: Sun, 27 Sep 2009 13:42:40 -0700 Subject: [ofa-general] help install ofed 1.4 on Centos 5.2 Message-ID: <9e14a1260909271342s5d5e34bbt851297a2fedcd988@mail.gmail.com> Hello, I been trying to install ofed 1.4 on centos 5.3 for two days now can't i get it work. I don't understand why it won't compile. here is my out put hopeful someone can help. /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/kernel_addons/backport/2.6.18_FC6/include/linux/skbuff.h:96: error: redefinition of 'skb_copy_from_linear_data_offset' include/linux/skbuff.h:1504: error: previous definition of 'skb_copy_from_linear_data_offset' was here In file included from include/linux/netdevice.h:672, from /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/kernel_addons/backport/2.6.18_FC6/include/linux/netdevice.h:4, from include/linux/inetdevice.h:7, from /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/kernel_addons/backport/2.6.18_FC6/include/linux/inetdevice.h:4, from /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/drivers/infiniband/core/addr.c:37: /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/kernel_addons/backport/2.6.18_FC6/include/linux/interrupt.h:5: error: conflicting types for 'irq_handler_t' include/linux/interrupt.h:67: error: previous declaration of 'irq_handler_t' was here In file included from include/net/route.h:33, from /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/kernel_addons/backport/2.6.18_FC6/include/net/route.h:4, from /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/drivers/infiniband/core/addr.c:42: 
/var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/kernel_addons/backport/2.6.18_FC6/include/linux/ip.h:7: error: redefinition of 'ip_hdr' include/linux/ip.h:109: error: previous definition of 'ip_hdr' was here make[4]: *** [/var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/drivers/infiniband/core/addr.o] Error 1 make[3]: *** [/var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/drivers/infiniband/core] Error 2 make[2]: *** [/var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/drivers/infiniband] Error 2 make[1]: *** [_module_/var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4] Error 2 make[1]: Leaving directory `/usr/src/kernels/2.6.18-164.el5-x86_64' make: *** [kernel] Error 2 error: Bad exit status from /var/tmp/rpm-tmp.79424 (%build) RPM build errors: Bad exit status from /var/tmp/rpm-tmp.79424 (%build) -------------- next part -------------- An HTML attachment was scrubbed... URL: From jon at opengridcomputing.com Sun Sep 27 20:14:36 2009 From: jon at opengridcomputing.com (Jon Mason) Date: Sun, 27 Sep 2009 22:14:36 -0500 Subject: [ofa-general] help install ofed 1.4 on Centos 5.2 In-Reply-To: <9e14a1260909271342s5d5e34bbt851297a2fedcd988@mail.gmail.com> References: <9e14a1260909271342s5d5e34bbt851297a2fedcd988@mail.gmail.com> Message-ID: <20090928031434.GA16515@opengridcomputing.com> On Sun, Sep 27, 2009 at 01:42:40PM -0700, chris wrote: > Hello, > > > I been trying to install ofed 1.4 on centos 5.3 for two days now can't i get > it work. I don't understand why it won't compile. here is my out put hopeful > someone can help. Centos updated their kernel recently to be similar to what is shipping in RHEL 5.4. This breaks the OFED backports. You can either use the 2.6.18-128.el5 kernel, or download the latest nightly build of OFED 1.5. 
Thanks, Jon > /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/kernel_addons/backport/2.6.18_FC6/include/linux/skbuff.h:96: > error: redefinition of 'skb_copy_from_linear_data_offset' > include/linux/skbuff.h:1504: error: previous definition of > 'skb_copy_from_linear_data_offset' was here > In file included from include/linux/netdevice.h:672, > from > /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/kernel_addons/backport/2.6.18_FC6/include/linux/netdevice.h:4, > from include/linux/inetdevice.h:7, > from > /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/kernel_addons/backport/2.6.18_FC6/include/linux/inetdevice.h:4, > from > /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/drivers/infiniband/core/addr.c:37: > /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/kernel_addons/backport/2.6.18_FC6/include/linux/interrupt.h:5: > error: conflicting types for 'irq_handler_t' > include/linux/interrupt.h:67: error: previous declaration of 'irq_handler_t' > was here > In file included from include/net/route.h:33, > from > /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/kernel_addons/backport/2.6.18_FC6/include/net/route.h:4, > from > /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/drivers/infiniband/core/addr.c:42: > /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/kernel_addons/backport/2.6.18_FC6/include/linux/ip.h:7: > error: redefinition of 'ip_hdr' > include/linux/ip.h:109: error: previous definition of 'ip_hdr' was here > make[4]: *** > [/var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/drivers/infiniband/core/addr.o] > Error 1 > make[3]: *** > [/var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/drivers/infiniband/core] Error 2 > make[2]: *** [/var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/drivers/infiniband] > Error 2 > make[1]: *** [_module_/var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4] Error 2 > make[1]: Leaving directory `/usr/src/kernels/2.6.18-164.el5-x86_64' > make: *** [kernel] Error 2 > error: Bad exit status from /var/tmp/rpm-tmp.79424 (%build) > > > RPM build errors: > Bad exit status from /var/tmp/rpm-tmp.79424 (%build) > 
From bart.vanassche at gmail.com Sun Sep 27 23:27:38 2009
From: bart.vanassche at gmail.com (Bart Van Assche)
Date: Mon, 28 Sep 2009 08:27:38 +0200
Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs
In-Reply-To:
References: <4AA4F561.504@vlnb.net>
Message-ID:

On Mon, Sep 21, 2009 at 8:06 PM, Bart Van Assche wrote:
> My hypothesis is that in your setup running ib_srpt with thread=0
> resulted in SRPT's completion queue handler (srpt_completion()), which
> keeps running as long as more completion queue elements can be
> processed, took up too much time, and that this finally resulted in
> remote SRP initiator disconnects.

An update regarding the SRPT and SRP issues reported in this e-mail thread:
* At least on my setup, the latest SRPT revision (r1153 or later) runs fine -- all issues reported in this (really long) e-mail thread should have been resolved in that SRPT revision.
* Unfortunately this doesn't mean that the SRP initiator lockup is gone. I'm still able to trigger this issue, both with the SRP initiator included in the mainstream Linux kernel and with the SRP initiator included in OFED. More details can be found here: http://bugzilla.kernel.org/show_bug.cgi?id=14235 and https://bugs.openfabrics.org/show_bug.cgi?id=1745.

Bart.
From vlad at dev.mellanox.co.il Mon Sep 28 00:42:04 2009 From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky) Date: Mon, 28 Sep 2009 09:42:04 +0200 Subject: [ofa-general] help install ofed 1.4 on Centos 5.2 In-Reply-To: <9e14a1260909271342s5d5e34bbt851297a2fedcd988@mail.gmail.com> References: <9e14a1260909271342s5d5e34bbt851297a2fedcd988@mail.gmail.com> Message-ID: <4AC068CC.1050604@dev.mellanox.co.il> chris wrote: > Hello, > > > I been trying to install ofed 1.4 on centos 5.3 for two days now can't > i get it work. I don't understand why it won't compile. here is my out > put hopeful someone can help. > /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/kernel_addons/backport/2.6.18_FC6/include/linux/skbuff.h:96: > error: redefinition of 'skb_copy_from_linear_data_offset' > include/linux/skbuff.h:1504: error: previous definition of > 'skb_copy_from_linear_data_offset' was here > In file included from include/linux/netdevice.h:672, > from > /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/kernel_addons/backport/2.6.18_FC6/include/linux/netdevice.h:4, > from include/linux/inetdevice.h:7, > from > /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/kernel_addons/backport/2.6.18_FC6/include/linux/inetdevice.h:4, > from > /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/drivers/infiniband/core/addr.c:37: > /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/kernel_addons/backport/2.6.18_FC6/include/linux/interrupt.h:5: > error: conflicting types for 'irq_handler_t' > include/linux/interrupt.h:67: error: previous declaration of > 'irq_handler_t' was here > In file included from include/net/route.h:33, > from > /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/kernel_addons/backport/2.6.18_FC6/include/net/route.h:4, > from > /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/drivers/infiniband/core/addr.c:42: > /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/kernel_addons/backport/2.6.18_FC6/include/linux/ip.h:7: > error: redefinition of 'ip_hdr' > include/linux/ip.h:109: error: previous definition of 'ip_hdr' was here > make[4]: *** 
> [/var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/drivers/infiniband/core/addr.o] > Error 1 > make[3]: *** > [/var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/drivers/infiniband/core] > Error 2 > make[2]: *** > [/var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4/drivers/infiniband] Error 2 > make[1]: *** [_module_/var/tmp/OFED_topdir/BUILD/ofa_kernel-1.4] Error 2 > make[1]: Leaving directory `/usr/src/kernels/2.6.18-164.el5-x86_64' > make: *** [kernel] Error 2 > error: Bad exit status from /var/tmp/rpm-tmp.79424 (%build) > > > RPM build errors: > Bad exit status from /var/tmp/rpm-tmp.79424 (%build) > > Hi Chris, Kernel 2.6.18-164.el5 comes from RHEL5.4 and is not supported by OFED-1.4.X. You should try OFED-1.5. Regards, Vladimir From vlad at lists.openfabrics.org Mon Sep 28 03:08:47 2009 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Mon, 28 Sep 2009 03:08:47 -0700 (PDT) Subject: [ofa-general] ofa_1_5_kernel 20090928-0200 daily build status Message-ID: <20090928100847.AEE21E620FD@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git git_branch: ofed_kernel_1_5 Common build parameters: Passed: Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.26 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.27 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18-164.el5 Passed on x86_64 with linux-2.6.18-128.el5 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.27 Passed on x86_64 with linux-2.6.9-78.ELsmp Passed on x86_64 with 
linux-2.6.9-67.ELsmp Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.26 Passed on ia64 with linux-2.6.25 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Failed: From hoot at ptpnow.com Mon Sep 28 04:35:28 2009 From: hoot at ptpnow.com (Hoot Thompson) Date: Mon, 28 Sep 2009 07:35:28 -0400 Subject: [ofa-general] Srp in OFED 1.5 Message-ID: <702D40EEAC0B45A5B1733B90456C3248@ptpdesk> Will the srp module be available in the OFED 1.5 release? If so, when? From bart.vanassche at gmail.com Mon Sep 28 04:53:30 2009 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Mon, 28 Sep 2009 13:53:30 +0200 Subject: [ofa-general] Srp in OFED 1.5 In-Reply-To: <702D40EEAC0B45A5B1733B90456C3248@ptpdesk> References: <702D40EEAC0B45A5B1733B90456C3248@ptpdesk> Message-ID: On Mon, Sep 28, 2009 at 1:35 PM, Hoot Thompson wrote: > Will the srp module be available in the OFED 1.5 release?  If so, when? Are you referring in the above to the SRP initiator or SRP target ? Bart. From hoot at ptpnow.com Mon Sep 28 05:06:26 2009 From: hoot at ptpnow.com (Hoot Thompson) Date: Mon, 28 Sep 2009 08:06:26 -0400 Subject: [ofa-general] Srp in OFED 1.5 In-Reply-To: References: <702D40EEAC0B45A5B1733B90456C3248@ptpdesk> Message-ID: <5E9DD2E59DA544D4A4AA8B0AA95A2608@ptpdesk> Initiator.... Thanks for the quick response. -----Original Message----- From: Bart Van Assche [mailto:bart.vanassche at gmail.com] Sent: Monday, September 28, 2009 7:54 AM To: Hoot Thompson Cc: general at lists.openfabrics.org Subject: Re: [ofa-general] Srp in OFED 1.5 On Mon, Sep 28, 2009 at 1:35 PM, Hoot Thompson wrote: > Will the srp module be available in the OFED 1.5 release?  If so, when? Are you referring in the above to the SRP initiator or SRP target ? Bart. 
From worleys at gmail.com Mon Sep 28 05:35:46 2009 From: worleys at gmail.com (Chris Worley) Date: Mon, 28 Sep 2009 06:35:46 -0600 Subject: [Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs In-Reply-To: References: <4AA4F561.504@vlnb.net> Message-ID: Bart, I really appreciate all your work here. It looks like you've really bullet-proofed scst, and well defined where ib_srp has issues. Thanks, Chris On Mon, Sep 28, 2009 at 12:27 AM, Bart Van Assche wrote: > On Mon, Sep 21, 2009 at 8:06 PM, Bart Van Assche > wrote: >> My hypothesis is that in your setup running ib_srpt with thread=0 >> resulted in SRPT's completion queue handler (srpt_completion()), which >> keeps running as long as more completion queue elements can be >> processed, took up too much time, and that this finally resulted in >> remote SRP initiator disconnects. > > An update regarding the SRPT and SRP issues reported in this e-mail thread: > * At least on my setup, the latest SRPT revision (r1153 or later) runs > fine  -- all issues reported in this (really long) e-mail thread > should have been resolved in that SRPT revision. > * Unfortunately this doesn't mean that the SRP initiator lockup is > gone. I'm still able to trigger this issue, both with the SRP > initiator included in the mainstream Linux kernel and with the SRP > initiator included in OFED. More details can be found here: > http://bugzilla.kernel.org/show_bug.cgi?id=14235 and > https://bugs.openfabrics.org/show_bug.cgi?id=1745. > > Bart. > From brian at sun.com Mon Sep 28 07:15:08 2009 From: brian at sun.com (Brian J. 
Murrell) Date: Mon, 28 Sep 2009 10:15:08 -0400 Subject: [ofa-general] help install ofed 1.4 on Centos 5.2 In-Reply-To: <4AC068CC.1050604@dev.mellanox.co.il> References: <9e14a1260909271342s5d5e34bbt851297a2fedcd988@mail.gmail.com> <4AC068CC.1050604@dev.mellanox.co.il> Message-ID: <1254147308.17199.50.camel@pc.interlinx.bc.ca> On Mon, 2009-09-28 at 09:42 +0200, Vladimir Sokolovsky wrote: > > Hi Chris, > Kernel 2.6.18-164.el5 comes from RHEL5.4 and is not supported by OFED-1.4.X. > You should try OFED-1.5. This is a problem we run into with Lustre somewhat frequently. The issue is that deploying OFED 1.5 (i.e. beta software) in a production environment is completely unacceptable, yet leaving one's systems open to kernel vulnerabilities is equally unacceptable. By not backporting support for current kernels to the latest stable OFED release you are putting people between a rock and a hard place. b. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 197 bytes Desc: This is a digitally signed message part URL: From viktor at viktormauch.de Mon Sep 28 08:08:03 2009 From: viktor at viktormauch.de (viktor at viktormauch.de) Date: Mon, 28 Sep 2009 17:08:03 +0200 (CEST) Subject: [ofa-general] update problems after 1.5-daily install on CentOS 5.3 Message-ID: <2133828037.333213.1254150483721.JavaMail.tomcat55@mrmseu2.kundenserver.de> An HTML attachment was scrubbed... URL: From rdreier at cisco.com Mon Sep 28 09:27:58 2009 From: rdreier at cisco.com (Roland Dreier) Date: Mon, 28 Sep 2009 09:27:58 -0700 Subject: [ofa-general] [Bug 14235] New: SRP initiator lockup In-Reply-To: (bugzilla-daemon@bugzilla.kernel.org's message of "Sat, 26 Sep 2009 14:54:37 GMT") References: Message-ID: > If an SRP target processes SRP I/O slow enough, the SRP initiator locks up. > INFO: task fio:6389 blocked for more than 120 seconds. > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
> fio D 0000000000000000 0 6389 6388 0x00000000 > ffff880071dc5bd8 0000000000000046 ffff880071dc5b08 000000018107764d > 0000000000012cc0 000000000000de20 0000000000000001 ffff880070cd8000 > ffff880070cd83b0 0000000100000000 000000010001193e ffff88007fb99050 > Call Trace: > [] ? _spin_unlock_irqrestore+0x65/0x80 > [] io_schedule+0x37/0x50 > [] __blockdev_direct_IO+0x692/0xd80 > [] ? get_super+0x27/0xc0 > [] blkdev_direct_IO+0x49/0x50 > [] ? blkdev_get_blocks+0x0/0xc0 > [] generic_file_aio_read+0x679/0x690 > [] ? __dentry_open+0x13a/0x340 > [] do_sync_read+0xf1/0x140 > [] ? trace_hardirqs_on_caller+0x14d/0x1a0 > [] ? autoremove_wake_function+0x0/0x40 > [] ? trace_hardirqs_on_caller+0x14d/0x1a0 > [] ? trace_hardirqs_on+0xd/0x10 > [] vfs_read+0xc8/0x180 > [] sys_read+0x50/0x90 > [] system_call_fastpath+0x16/0x1b > no locks held by fio/6389. It will probably be a while until I can get the time to build an scst test set up to reproduce this unfortunately. So we'll have to debug this with your set up for the moment. I don't have a good idea of where in the SRP initiator the problem could be... the non-error path for ordinary SCSI commands is pretty trivial. Presumably slowing down the target means that the queue of outstanding commands fills up, but they should complete and let things make progress. I guess the possibilities are a bug higher up in the block or SCSI stack, or some accounting problem in SRP. You could try adding printks to srp_queuecommand() to see that all SCSI commands are sent on the SRP connection and also add tracing to srp_process_rsp() to make sure there's a matching call to ->scsi_done for each SCSI command. And also we should make sure there's no disconnections or task management commands or anything like that confusing things ... there is definitely more room for bugs in the parts of the SRP driver that handle exceptions. - R. 
From bart.vanassche at gmail.com Mon Sep 28 09:38:47 2009 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Mon, 28 Sep 2009 18:38:47 +0200 Subject: [ofa-general] Srp in OFED 1.5 In-Reply-To: <5E9DD2E59DA544D4A4AA8B0AA95A2608@ptpdesk> References: <702D40EEAC0B45A5B1733B90456C3248@ptpdesk> <5E9DD2E59DA544D4A4AA8B0AA95A2608@ptpdesk> Message-ID: On Mon, Sep 28, 2009 at 2:06 PM, Hoot Thompson wrote: > > From: Bart Van Assche [mailto:bart.vanassche at gmail.com] > > Sent: Monday, September 28, 2009 7:54 AM > > To: Hoot Thompson > > Cc: general at lists.openfabrics.org > > Subject: Re: [ofa-general] Srp in OFED 1.5 > > On Mon, Sep 28, 2009 at 1:35 PM, Hoot Thompson wrote: > > Will the srp module be available in the OFED 1.5 release?  If so, when? > > Are you referring in the above to the SRP initiator or SRP target ? > Initiator.... I do not have official information about the availability of SRP in OFED 1.5. But as far as I can see in the OFED 1.5. release notes document (which is work in progress), SRP will be included in OFED 1.5. Source: http://www.openfabrics.org/git/?p=~tziporet/docs.git;a=blob;f=OFED_release_notes.txt;h=9d179994122bb55aed0c683f8064067cbcadefa1;hb=96df01ce1a84667e1a1baa1b66a1793284efe57a. Bart. From hoot at ptpnow.com Mon Sep 28 10:09:13 2009 From: hoot at ptpnow.com (Hoot Thompson) Date: Mon, 28 Sep 2009 13:09:13 -0400 Subject: [ofa-general] Srp in OFED 1.5 In-Reply-To: References: <702D40EEAC0B45A5B1733B90456C3248@ptpdesk> <5E9DD2E59DA544D4A4AA8B0AA95A2608@ptpdesk> Message-ID: <2F898141C6324BF591DA48464A6EA2B1@ptpdesk> Thanks for the feedback. 
Hoot

-----Original Message-----
From: Bart Van Assche [mailto:bart.vanassche at gmail.com]
Sent: Monday, September 28, 2009 12:39 PM
To: Hoot Thompson
Cc: general at lists.openfabrics.org
Subject: Re: [ofa-general] Srp in OFED 1.5

On Mon, Sep 28, 2009 at 2:06 PM, Hoot Thompson wrote:
> > From: Bart Van Assche [mailto:bart.vanassche at gmail.com]
> > Sent: Monday, September 28, 2009 7:54 AM
> > To: Hoot Thompson
> > Cc: general at lists.openfabrics.org
> > Subject: Re: [ofa-general] Srp in OFED 1.5
> > On Mon, Sep 28, 2009 at 1:35 PM, Hoot Thompson wrote:
> > Will the srp module be available in the OFED 1.5 release?  If so, when?
> > Are you referring in the above to the SRP initiator or SRP target ?
> Initiator....

I do not have official information about the availability of SRP in OFED 1.5. But as far as I can see in the OFED 1.5. release notes document (which is work in progress), SRP will be included in OFED 1.5. Source: http://www.openfabrics.org/git/?p=~tziporet/docs.git;a=blob;f=OFED_release_notes.txt;h=9d179994122bb55aed0c683f8064067cbcadefa1;hb=96df01ce1a84667e1a1baa1b66a1793284efe57a.

Bart.

From jgunthorpe at obsidianresearch.com Mon Sep 28 10:14:29 2009
From: jgunthorpe at obsidianresearch.com (Jason Gunthorpe)
Date: Mon, 28 Sep 2009 11:14:29 -0600
Subject: [ofa-general] help install ofed 1.4 on Centos 5.2
In-Reply-To: <1254147308.17199.50.camel@pc.interlinx.bc.ca>
References: <9e14a1260909271342s5d5e34bbt851297a2fedcd988@mail.gmail.com> <4AC068CC.1050604@dev.mellanox.co.il> <1254147308.17199.50.camel@pc.interlinx.bc.ca>
Message-ID: <20090928171429.GW19540@obsidianresearch.com>

On Mon, Sep 28, 2009 at 10:15:08AM -0400, Brian J. Murrell wrote:
> This is a problem we run into with Lustre somewhat frequently.
>
> The issue is that deploying OFED 1.5 (i.e. beta software) in a
> production environment is completely unacceptable, yet leaving one's
> systems open to kernel vulnerabilities is equally unacceptable.
Why aren't you just using the IB support directly in RH 5.4?

From the release notes:

8.3.1. Open Fabrics Enterprise Distribution (OFED) Drivers

The OpenFabrics Alliance Enterprise Distribution (OFED) is a collection of Infiniband and iWARP hardware diagnostic utilities, the Infiniband fabric management daemon, Infiniband/iWARP kernel module loader, and libraries and development packages for writing applications that use Remote Direct Memory Access (RDMA) technology. Red Hat Enterprise Linux uses the OFED software stack as its complete stack for Infiniband/iWARP/RDMA hardware support.

In Red Hat Enterprise Linux 5.4, the following portions of OFED have been updated to the upstream version 1.4.1-rc3

* Remote Direct Memory Access (RDMA) headers (BZ#476301)
* Reliable Datagram Sockets (RDS) protocol (BZ#477065, BZ#506907)
* Sockets Direct Protocol (SDP) (BZ#476301)
* SCSI RDMA Protocol (SRP) (BZ#476301)
* IP over InfiniBand (IPoIB) (BZ#434779, BZ#466086, BZ#506907)

Additionally, the following OFED drivers have been updated to the upstream version 1.4.1-rc3:

* The cxgb3 and iw_cxgb3 drivers for the Chelsio T3 Family of network devices (BZ#476301, BZ#504906)
* The driver for mthca-based InfiniBand HCA (Host Channel Adapter) (BZ#476301, BZ#506097)
* qlgc_vnic driver (BZ#476301)

> By not backporting support for current kernels to the latest stable OFED
> release you are putting people between a rock and a hard place.

OFED has an identity problem. Many people seem to think it is a back port project, but it isn't run that way. It is more like an upstream fork, testing effort and backport project rolled into one.

Jason

From hoot at ptpnow.com Mon Sep 28 04:16:42 2009
From: hoot at ptpnow.com (Hoot Thompson)
Date: Mon, 28 Sep 2009 07:16:42 -0400
Subject: [ofa-general] srp availability in OFED1.5
Message-ID: <1254136602.10604.1.camel@leftknee>

Will the srp module be included in the OFED 1.5 release? If so, when?
From pavel at ucw.cz Mon Sep 28 13:49:23 2009 From: pavel at ucw.cz (Pavel Machek) Date: Mon, 28 Sep 2009 22:49:23 +0200 Subject: [ofa-general] Re: [GIT PULL] please pull ummunotify In-Reply-To: References: <20090915113434.GF1328@ucw.cz> Message-ID: <20090928204923.GA1960@elf.ucw.cz> On Tue 2009-09-15 07:57:56, Roland Dreier wrote: > > > I don't remember seeing discussion of this on lkml. Yes it is in > > -next... > > eg http://lkml.org/lkml/2009/7/31/197 and followups, or search for v2 > and earlier patches. Well... it seems little overspecialized. Just modifying libc to provide hooks you want looks like better solution. > > Basically it allows app to 'trace itself'? ...with interesting mmap() > > interface, exporting int to userspace, hoping it behaves atomically...? > > Yes, it allows app to trace what the kernel does to memory mappings. I > don't believe there's any real issue to atomicity of mmap'ed memory, > since userspace really just tests whether read value is == to old read > value or not. That still needs memory barriers etc.. to ensure reliable operation, no? Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html From jgunthorpe at obsidianresearch.com Mon Sep 28 14:40:57 2009 From: jgunthorpe at obsidianresearch.com (Jason Gunthorpe) Date: Mon, 28 Sep 2009 15:40:57 -0600 Subject: [ofa-general] Re: [GIT PULL] please pull ummunotify In-Reply-To: <20090928204923.GA1960@elf.ucw.cz> References: <20090915113434.GF1328@ucw.cz> <20090928204923.GA1960@elf.ucw.cz> Message-ID: <20090928214057.GX19540@obsidianresearch.com> On Mon, Sep 28, 2009 at 10:49:23PM +0200, Pavel Machek wrote: > > > I don't remember seeing discussion of this on lkml. Yes it is in > > > -next... > > > > eg http://lkml.org/lkml/2009/7/31/197 and followups, or search for v2 > > and earlier patches. > Well... it seems little overspecialized. 
Just modifying libc to
> provide hooks you want looks like better solution.

That is what MPI people are doing today and their feedback is that it doesn't work - there are a lot of ways to mess with memory and no good choices to hook the raw syscalls and keep sensible performance.

The main focus of this is high performance MPI apps, so lower overhead on critical paths like memory allocation is part of the point. It is meant to go hand-in-hand with the specialized RDMA memory pinning interfaces.

> > > Basically it allows app to 'trace itself'? ...with interesting mmap()
> > > interface, exporting int to userspace, hoping it behaves atomically...?
> >
> > Yes, it allows app to trace what the kernel does to memory mappings. I
> > don't believe there's any real issue to atomicity of mmap'ed memory,
> > since userspace really just tests whether read value is == to old read
> > value or not.
>
> That still needs memory barriers etc.. to ensure reliable operation,
> no?

No, I don't think so. The application is expected to provide sequencing of some sort between the memory call (mmap/munmap/brk/etc) and the int check - usually just by running in the same thread, or through some kind of locking scheme. As long as the mmu notifiers run immediately in the same context as the mmap/etc then it should be fine.

For example, the most common problem to solve looks like this:

  x = mmap(...)
  do RDMA with x
  [..]
  munmap(x);
  [..]
  y = mmap(..);
  do RDMA with y

If by chance x == y things explode. So this API puts the int test directly before 'do RDMA with'.

Due to the above kind of argument the net requirement is either to completely synchronously (and with low overhead) hook every mmap/munmap/brk/etc call into the kernel and do the accounting work, or have a very low overhead check every time the memory region is about to be used.
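The low-cost check before each use of a memory region can be sketched as a counter comparison in front of a registration cache. This is a minimal single-threaded sketch under assumptions: struct reg_cache, reg_cache_lookup() and reg_cache_store() are hypothetical names, the counter is an ordinary variable standing in for whatever value the kernel would export, and none of this is the ummunotify interface itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical one-entry registration cache; not the ummunotify API. */
struct reg_cache {
	const volatile uint64_t *counter; /* stand-in for a kernel-exported change counter */
	uint64_t last_seen;               /* counter value at the last check */
	void *addr;                       /* start of the cached registration */
	size_t len;                       /* length of the cached registration */
	int valid;                        /* non-zero while the entry may be reused */
};

/*
 * Called directly before "do RDMA with addr". Returns 1 if the cached
 * registration may be reused, 0 if the caller has to register again.
 */
static int reg_cache_lookup(struct reg_cache *c, void *addr, size_t len)
{
	uint64_t now = *c->counter;

	if (now != c->last_seen) {
		/* Some mapping changed since the last check. This sketch drops
		 * the whole entry; a real consumer would presumably look at
		 * which ranges were affected and invalidate only those. */
		c->valid = 0;
		c->last_seen = now;
	}
	return c->valid && c->addr == addr && c->len >= len;
}

/* Record a fresh registration together with the counter value it was made under. */
static void reg_cache_store(struct reg_cache *c, void *addr, size_t len)
{
	c->addr = addr;
	c->len = len;
	c->valid = 1;
	c->last_seen = *c->counter;
}
```

In the x == y scenario above, the munmap()/mmap() pair would bump the counter, so the lookup for y fails even though the address matches and the stale registration for x is not reused.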
Jason From arlin.r.davis at intel.com Mon Sep 28 15:08:10 2009 From: arlin.r.davis at intel.com (Arlin Davis) Date: Mon, 28 Sep 2009 15:08:10 -0700 Subject: [ofa-general] [PATCH 1/3] uDAPL v2: scm: improve serialization of destroy and state changes Message-ID: <8DA143243B144A72BFF45F7B683E66C4@amr.corp.intel.com> WinOF testing with slightly different scheduler and verbs showed some issues with cleanup. Add better protection around destroy and move state change before socket send to insure correct state in multi-thread environment targeting the same device on send and recv. Change DCM_RTU_PENDING to DCM_REP_PENDING and and add static definition to local routines for better readability. Signed-off-by: Arlin Davis --- dapl/openib_common/dapl_ib_common.h | 4 +- dapl/openib_scm/cm.c | 125 +++++++++++++++-------------------- 2 files changed, 56 insertions(+), 73 deletions(-) diff --git a/dapl/openib_common/dapl_ib_common.h b/dapl/openib_common/dapl_ib_common.h index 3cd8885..671073b 100644 --- a/dapl/openib_common/dapl_ib_common.h +++ b/dapl/openib_common/dapl_ib_common.h @@ -265,7 +265,7 @@ typedef enum dapl_cm_state DCM_INIT, DCM_LISTEN, DCM_CONN_PENDING, - DCM_RTU_PENDING, + DCM_REP_PENDING, DCM_ACCEPTING, DCM_ACCEPTING_DATA, DCM_ACCEPTED, @@ -356,7 +356,7 @@ STATIC _INLINE_ char * dapl_cm_state_str(IN int st) "CM_INIT", "CM_LISTEN", "CM_CONN_PENDING", - "CM_RTU_PENDING", + "CM_REP_PENDING", "CM_ACCEPTING", "CM_ACCEPTING_DATA", "CM_ACCEPTED", diff --git a/dapl/openib_scm/cm.c b/dapl/openib_scm/cm.c index 2403918..87f5446 100644 --- a/dapl/openib_scm/cm.c +++ b/dapl/openib_scm/cm.c @@ -46,6 +46,11 @@ * **************************************************************************/ +#if defined(_WIN32) +#define FD_SETSIZE 1024 +#define DAPL_FD_SETSIZE FD_SETSIZE +#endif + #include "dapl.h" #include "dapl_adapter_util.h" #include "dapl_evd_util.h" @@ -314,12 +319,6 @@ void dapls_ib_cm_free(dp_ib_cm_handle_t cm_ptr, DAPL_EP *ep) cm_ptr->ep = NULL; } - /* close socket if still 
active */ - if (cm_ptr->socket != DAPL_INVALID_SOCKET) { - shutdown(cm_ptr->socket, SHUT_RDWR); - closesocket(cm_ptr->socket); - cm_ptr->socket = DAPL_INVALID_SOCKET; - } dapl_os_unlock(&cm_ptr->lock); goto notify_thread; @@ -404,25 +403,15 @@ DAT_RETURN dapli_socket_disconnect(dp_ib_cm_handle_t cm_ptr) return DAT_SUCCESS; dapl_os_lock(&cm_ptr->lock); - if ((cm_ptr->state == DCM_INIT) || - (cm_ptr->state == DCM_DISCONNECTED) || - (cm_ptr->state == DCM_DESTROY)) { + if (cm_ptr->state != DCM_CONNECTED) { dapl_os_unlock(&cm_ptr->lock); return DAT_SUCCESS; - } else { - /* send disc date, close socket, schedule destroy */ - if (cm_ptr->socket != DAPL_INVALID_SOCKET) { - if (send(cm_ptr->socket, (char *)&disc_data, - sizeof(disc_data), 0) == -1) - dapl_log(DAPL_DBG_TYPE_WARN, - " cm_disc: write error = %s\n", - strerror(errno)); - shutdown(cm_ptr->socket, SHUT_RDWR); - closesocket(cm_ptr->socket); - cm_ptr->socket = DAPL_INVALID_SOCKET; - } - cm_ptr->state = DCM_DISCONNECTED; } + + /* send disc date, close socket, schedule destroy */ + dapls_modify_qp_state(ep_ptr->qp_handle, IBV_QPS_ERR, 0,0,0); + cm_ptr->state = DCM_DISCONNECTED; + send(cm_ptr->socket, (char *)&disc_data, sizeof(disc_data), 0); dapl_os_unlock(&cm_ptr->lock); /* disconnect events for RC's only */ @@ -472,6 +461,8 @@ static void dapli_socket_connected(dp_ib_cm_handle_t cm_ptr, int err) dapl_log(DAPL_DBG_TYPE_WARN, " CONN_PENDING: NODELAY setsockopt: %s\n", strerror(errno)); + + cm_ptr->state = DCM_REP_PENDING; /* send qp info and pdata to remote peer */ exp = sizeof(ib_cm_msg_t) - DCM_MAX_PDATA_SIZE; @@ -509,9 +500,6 @@ static void dapli_socket_connected(dp_ib_cm_handle_t cm_ptr, int err) htonll(cm_ptr->msg.saddr.ib.gid.global.subnet_prefix), (unsigned long long) htonll(cm_ptr->msg.saddr.ib.gid.global.interface_id)); - - /* queue up to work thread to avoid blocking consumer */ - cm_ptr->state = DCM_RTU_PENDING; return; bail: @@ -745,14 +733,14 @@ static void dapli_socket_connect_rtu(dp_ib_cm_handle_t 
cm_ptr) dapl_dbg_log(DAPL_DBG_TYPE_EP, " connect_rtu: send RTU\n"); /* complete handshake after final QP state change, Just ver+op */ + cm_ptr->state = DCM_CONNECTED; cm_ptr->msg.op = ntohs(DCM_RTU); if (send(cm_ptr->socket, (char *)&cm_ptr->msg, 4, 0) == -1) { dapl_log(DAPL_DBG_TYPE_ERR, " CONN_RTU: write error = %s\n", strerror(errno)); goto bail; } - /* init cm_handle and post the event with private data */ - cm_ptr->state = DCM_CONNECTED; + /* post the event with private data */ event = IB_CME_CONNECTED; dapl_dbg_log(DAPL_DBG_TYPE_EP, " ACTIVE: connected!\n"); @@ -807,9 +795,7 @@ ud_bail: #endif { ep_ptr->cm_handle = cm_ptr; /* only RC, multi CR's on UD */ - dapl_evd_connection_callback(cm_ptr, - event, - cm_ptr->msg.p_data, ep_ptr); + dapl_evd_connection_callback(cm_ptr, event, cm_ptr->msg.p_data, ep_ptr); } return; @@ -883,7 +869,7 @@ dapli_socket_listen(DAPL_IA * ia_ptr, DAT_CONN_QUAL serviceID, DAPL_SP * sp_ptr) ntohs(serviceID), cm_ptr, cm_ptr->socket); return dat_status; - bail: +bail: dapl_dbg_log(DAPL_DBG_TYPE_CM, " listen: ERROR on conn_qual 0x%x\n", serviceID); dapls_ib_cm_free(cm_ptr, cm_ptr->ep); @@ -1026,7 +1012,7 @@ bail: * queue on work thread to receive RTU information to avoid blocking * user thread. 
*/ -DAT_RETURN +static DAT_RETURN dapli_socket_accept_usr(DAPL_EP * ep_ptr, DAPL_CR * cr_ptr, DAT_COUNT p_size, DAT_PVOID p_data) { @@ -1108,10 +1094,14 @@ dapli_socket_accept_usr(DAPL_EP * ep_ptr, local.daddr.so = ia_ptr->hca_ptr->hca_address; ((struct sockaddr_in *)&local.daddr.so)->sin_port = htons((uint16_t)cm_ptr->sp->conn_qual); + cm_ptr->ep = ep_ptr; + cm_ptr->hca = ia_ptr->hca_ptr; + cm_ptr->state = DCM_ACCEPTED; local.p_size = htons(p_size); iov[0].iov_base = (void *)&local; iov[0].iov_len = exp; + if (p_size) { iov[1].iov_base = p_data; iov[1].iov_len = p_size; @@ -1139,14 +1129,9 @@ dapli_socket_accept_usr(DAPL_EP * ep_ptr, (unsigned long long) htonll(local.saddr.ib.gid.global.interface_id)); - /* save state and reference to EP, queue for RTU data */ - cm_ptr->ep = ep_ptr; - cm_ptr->hca = ia_ptr->hca_ptr; - cm_ptr->state = DCM_ACCEPTED; - dapl_dbg_log(DAPL_DBG_TYPE_EP, " PASSIVE: accepted!\n"); return DAT_SUCCESS; - bail: +bail: dapls_ib_cm_free(cm_ptr, cm_ptr->ep); dapls_modify_qp_state(ep_ptr->qp_handle, IBV_QPS_ERR, 0, 0, 0); return DAT_INTERNAL_ERROR; @@ -1155,7 +1140,7 @@ dapli_socket_accept_usr(DAPL_EP * ep_ptr, /* * PASSIVE: read RTU from active peer, post CONN event */ -void dapli_socket_accept_rtu(dp_ib_cm_handle_t cm_ptr) +static void dapli_socket_accept_rtu(dp_ib_cm_handle_t cm_ptr) { int len; ib_cm_events_t event = IB_CME_CONNECTED; @@ -1221,8 +1206,9 @@ void dapli_socket_accept_rtu(dp_ib_cm_handle_t cm_ptr) closesocket(cm_ptr->socket); cm_ptr->socket = DAPL_INVALID_SOCKET; cm_ptr->state = DCM_RELEASED; - } else { + } else #endif + { cm_ptr->ep->cm_handle = cm_ptr; /* only RC, multi CR's on UD */ dapls_cr_callback(cm_ptr, event, NULL, cm_ptr->sp); } @@ -1399,19 +1385,12 @@ dapls_ib_remove_conn_listener(IN DAPL_IA * ia_ptr, IN DAPL_SP * sp_ptr) /* close accepted socket, free cm_srvc_handle and return */ if (cm_ptr != NULL) { - if (cm_ptr->socket != DAPL_INVALID_SOCKET) { - shutdown(cm_ptr->socket, SHUT_RDWR); - closesocket(cm_ptr->socket); - 
cm_ptr->socket = DAPL_INVALID_SOCKET; - } /* cr_thread will free */ + dapl_os_lock(&cm_ptr->lock); cm_ptr->state = DCM_DESTROY; sp_ptr->cm_srvc_handle = NULL; - if (send(cm_ptr->hca->ib_trans.scm[1], - "w", sizeof "w", 0) == -1) - dapl_log(DAPL_DBG_TYPE_CM, - " cm_destroy: thread wakeup error = %s\n", - strerror(errno)); + send(cm_ptr->hca->ib_trans.scm[1], "w", sizeof "w", 0); + dapl_os_unlock(&cm_ptr->lock); } return DAT_SUCCESS; } @@ -1492,29 +1471,26 @@ dapls_ib_reject_connection(IN dp_ib_cm_handle_t cm_ptr, if (psize > DCM_MAX_PDATA_SIZE) return DAT_LENGTH_ERROR; - /* write reject data to indicate reject */ - if (cm_ptr->socket != DAPL_INVALID_SOCKET) { - cm_ptr->msg.op = htons(DCM_REJ_USER); - cm_ptr->msg.p_size = htons(psize); - - iov[0].iov_base = (void *)&cm_ptr->msg; - iov[0].iov_len = sizeof(ib_cm_msg_t) - DCM_MAX_PDATA_SIZE; - if (psize) { - iov[1].iov_base = pdata; - iov[1].iov_len = psize; - writev(cm_ptr->socket, iov, 2); - } else { - writev(cm_ptr->socket, iov, 1); - } + dapl_os_lock(&cm_ptr->lock); - shutdown(cm_ptr->socket, SHUT_RDWR); - closesocket(cm_ptr->socket); - cm_ptr->socket = DAPL_INVALID_SOCKET; + /* write reject data to indicate reject */ + cm_ptr->msg.op = htons(DCM_REJ_USER); + cm_ptr->msg.p_size = htons(psize); + + iov[0].iov_base = (void *)&cm_ptr->msg; + iov[0].iov_len = sizeof(ib_cm_msg_t) - DCM_MAX_PDATA_SIZE; + if (psize) { + iov[1].iov_base = pdata; + iov[1].iov_len = psize; + writev(cm_ptr->socket, iov, 2); + } else { + writev(cm_ptr->socket, iov, 1); } /* cr_thread will destroy CR */ cm_ptr->state = DCM_DESTROY; send(cm_ptr->hca->ib_trans.scm[1], "w", sizeof "w", 0); + dapl_os_unlock(&cm_ptr->lock); return DAT_SUCCESS; } @@ -1734,19 +1710,25 @@ void cr_thread(void *arg) next_cr = dapl_llist_next_entry(&hca_ptr->ib_trans.list, (DAPL_LLIST_ENTRY *) & cr->entry); + dapl_os_lock(&cr->lock); if (cr->state == DCM_DESTROY || hca_ptr->ib_trans.cr_state != IB_THREAD_RUN) { + dapl_os_unlock(&cr->lock); 
 			dapl_llist_remove_entry(&hca_ptr->ib_trans.list,
 						(DAPL_LLIST_ENTRY *) & cr->entry);
 			dapl_dbg_log(DAPL_DBG_TYPE_CM,
 				     " CR FREE: %p ep=%p st=%d sock=%d\n",
 				     cr, cr->ep, cr->state, cr->socket);
+			shutdown(cr->socket, SHUT_RDWR);
+			closesocket(cr->socket);
 			dapl_os_free(cr, sizeof(*cr));
 			continue;
 		}
-		if (cr->socket == DAPL_INVALID_SOCKET)
+		if (cr->socket == DAPL_INVALID_SOCKET) {
+			dapl_os_unlock(&cr->lock);
 			continue;
+		}
 		event = (cr->state == DCM_CONN_PENDING) ?
 			DAPL_FD_WRITE : DAPL_FD_READ;
@@ -1757,10 +1739,11 @@ void cr_thread(void *arg)
 				 " -> %s\n", cr->state, cr->socket,
 				 inet_ntoa(((struct sockaddr_in *)
 					    &cr->msg.daddr.so)->sin_addr));
+			dapl_os_unlock(&cr->lock);
 			dapls_ib_cm_free(cr, cr->ep);
 			continue;
 		}
-
+		dapl_os_unlock(&cr->lock);
 		dapl_dbg_log(DAPL_DBG_TYPE_CM,
 			     " poll cr=%p, sck=%d\n", cr, cr->socket);
 		dapl_os_unlock(&hca_ptr->ib_trans.lock);
@@ -1784,7 +1767,7 @@ void cr_thread(void *arg)
 			case DCM_ACCEPTED:
 				dapli_socket_accept_rtu(cr);
 				break;
-			case DCM_RTU_PENDING:
+			case DCM_REP_PENDING:
 				dapli_socket_connect_rtu(cr);
 				break;
 			case DCM_CONNECTED:
@@ -1846,7 +1829,7 @@ void cr_thread(void *arg)
 	dapl_os_unlock(&hca_ptr->ib_trans.lock);
 	dapl_os_free(set, sizeof(struct dapl_fd_set));

-      out:
+out:
 	hca_ptr->ib_trans.cr_state = IB_THREAD_EXIT;
 	dapl_dbg_log(DAPL_DBG_TYPE_UTIL, " cr_thread(hca %p) exit\n", hca_ptr);
 }
-- 
1.5.2.5

From arlin.r.davis at intel.com Mon Sep 28 15:08:20 2009
From: arlin.r.davis at intel.com (Davis, Arlin R)
Date: Mon, 28 Sep 2009 15:08:20 -0700
Subject: [ofa-general] [PATCH 3/3] uDAPL v2: scm: tighten up socket options to insure similiar behavior on Windows and Linux.
Message-ID:

Add IPPROTO_TCP to create socket. Specify device IP address when
binding instead of INADDR_ANY and remove setsockopt REUSEADDR on
the listen socket to avoid any issues with portability. Don't want
duplicate port bindings.
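
The socket setup described above can be sketched roughly as follows, assuming plain POSIX sockets rather than the uDAPL wrappers. listen_on() and bound_port() are hypothetical helpers made up for this illustration, not uDAPL functions, and the caller-supplied address stands in for the HCA address that the patch takes from ia_ptr->hca_ptr->hca_address. The point is only that, with SO_REUSEADDR left unset and a specific local address, a second bind to the same address and port is expected to fail instead of silently sharing the port.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of uDAPL): create a TCP listen socket bound
 * to one specific local IPv4 address. SO_REUSEADDR is deliberately NOT set,
 * so a duplicate bind on the same address/port should be refused.
 * Returns the socket descriptor, or -1 with errno set.
 */
static int listen_on(const struct in_addr *local, unsigned short port)
{
	struct sockaddr_in addr;
	int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);

	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_port = htons(port);	/* 0 lets the stack pick a free port */
	addr.sin_addr = *local;		/* specific address, not INADDR_ANY */

	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    listen(fd, 128) < 0) {
		int saved = errno;	/* close() may overwrite errno */

		close(fd);
		errno = saved;
		return -1;
	}
	return fd;
}

/* Hypothetical helper: report the local port a socket is bound to (0 on error). */
static unsigned short bound_port(int fd)
{
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);

	if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0)
		return 0;
	return ntohs(addr.sin_port);
}
```

Calling listen_on() twice with the same address and port should leave the first listener intact and return -1 with errno set to EADDRINUSE on the second call, which is the "no duplicate port bindings" behavior the patch description asks for.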

Signed-off-by: Arlin Davis
---
 dapl/openib_scm/cm.c |   10 +++-------
 1 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/dapl/openib_scm/cm.c b/dapl/openib_scm/cm.c
index 87f5446..dae1781 100644
--- a/dapl/openib_scm/cm.c
+++ b/dapl/openib_scm/cm.c
@@ -531,7 +531,7 @@ dapli_socket_connect(DAPL_EP * ep_ptr,

 	/* create, connect, sockopt, and exchange QP information */
 	if ((cm_ptr->socket =
-	     socket(AF_INET, SOCK_STREAM, 0)) == DAPL_INVALID_SOCKET) {
+	     socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)) == DAPL_INVALID_SOCKET) {
 		dapl_os_free(cm_ptr, sizeof(*cm_ptr));
 		return DAT_INSUFFICIENT_RESOURCES;
 	}
@@ -815,7 +815,6 @@ dapli_socket_listen(DAPL_IA * ia_ptr, DAT_CONN_QUAL serviceID, DAPL_SP * sp_ptr)
 {
 	struct sockaddr_in addr;
 	ib_cm_srvc_handle_t cm_ptr = NULL;
-	int opt = 1;
 	DAT_RETURN dat_status = DAT_SUCCESS;

 	dapl_dbg_log(DAPL_DBG_TYPE_EP,
@@ -831,19 +830,16 @@ dapli_socket_listen(DAPL_IA * ia_ptr, DAT_CONN_QUAL serviceID, DAPL_SP * sp_ptr)
 	/* bind, listen, set sockopt, accept, exchange data */
 	if ((cm_ptr->socket =
-	     socket(AF_INET, SOCK_STREAM, 0)) == DAPL_INVALID_SOCKET) {
+	     socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)) == DAPL_INVALID_SOCKET) {
 		dapl_log(DAPL_DBG_TYPE_ERR,
 			 " ERR: listen socket create: %s\n",
 			 strerror(errno));
 		dat_status = DAT_INSUFFICIENT_RESOURCES;
 		goto bail;
 	}

-	setsockopt(cm_ptr->socket, SOL_SOCKET, SO_REUSEADDR,
-		   (char *)&opt, sizeof(opt));
-
 	addr.sin_port = htons(serviceID);
 	addr.sin_family = AF_INET;
-	addr.sin_addr.s_addr = INADDR_ANY;
+	addr.sin_addr = ((struct sockaddr_in *) &ia_ptr->hca_ptr->hca_address)->sin_addr;

 	if ((bind(cm_ptr->socket, (struct sockaddr *)&addr, sizeof(addr)) < 0)
 	    || (listen(cm_ptr->socket, 128) < 0)) {
-- 
1.5.2.5

From arlin.r.davis at intel.com Mon Sep 28 15:08:16 2009
From: arlin.r.davis at intel.com (Davis, Arlin R)
Date: Mon, 28 Sep 2009 15:08:16 -0700
Subject: [ofa-general] [PATCH 2/3] uDAPL v2: cma: improve serialization of destroy and event processing
Message-ID:

WinOF testing with slightly different
scheduler and verbs showed some issues with cleanup.
Add better protection around destroy and event
processing thread. Remove destroy flag and add
refs counting to conn objects to block destroy until
all references are cleared. Add locking around
ref counting and passive and active event processing.

Signed-off-by: Arlin Davis
---
 dapl/openib_cma/cm.c           |  264 ++++++++++++++++++---------------------
 dapl/openib_cma/dapl_ib_util.h |    2 +-
 dapl/openib_cma/device.c       |   17 ++-
 3 files changed, 133 insertions(+), 150 deletions(-)

diff --git a/dapl/openib_cma/cm.c b/dapl/openib_cma/cm.c
index 545190d..40634b2 100644
--- a/dapl/openib_cma/cm.c
+++ b/dapl/openib_cma/cm.c
@@ -163,6 +163,7 @@ dp_ib_cm_handle_t dapls_ib_cm_create(DAPL_EP *ep)
 	dapl_os_memzero(conn, sizeof(*conn));
 	dapl_os_lock_init(&conn->lock);
+	conn->refs++;

 	/* create CM_ID, bind to local device, create QP */
 	if (rdma_create_id(g_cm_events,
 			   &cm_id, (void *)conn, RDMA_PS_TCP)) {
@@ -189,46 +190,37 @@ dp_ib_cm_handle_t dapls_ib_cm_create(DAPL_EP *ep)
 }

 /*
- * Called from consumer thread via dat_ep_free().
- * CANNOT be called from the async event processing thread
- * dapli_cma_event_cb() since a cm_id reference is held and
- * a deadlock will occur.
+ * Only called from consumer thread via dat_ep_free()
+ * accept, reject, or connect.
+ * Cannot be called from callback thread.
+ * rdma_destroy_id will block until rdma_get_cm_event is acked.
*/ - void dapls_ib_cm_free(dp_ib_cm_handle_t conn, DAPL_EP *ep) { - struct rdma_cm_id *cm_id; - - if (conn == NULL) - return; - dapl_dbg_log(DAPL_DBG_TYPE_CM, - " destroy_conn: conn %p id %d\n", conn, conn->cm_id); + " destroy_conn: conn %p id %d\n", + conn, conn->cm_id); dapl_os_lock(&conn->lock); - conn->destroy = 1; + conn->refs--; + dapl_os_unlock(&conn->lock); - if (ep != NULL) { + /* block until event thread complete */ + while (conn->refs) + dapl_os_sleep_usec(10000); + + if (ep) { ep->cm_handle = NULL; ep->qp_handle = NULL; ep->qp_state = IB_QP_STATE_ERROR; } - cm_id = conn->cm_id; - conn->cm_id = NULL; - dapl_os_unlock(&conn->lock); - - /* - * rdma_destroy_id will force synchronization with async CM event - * thread since it blocks until the in-process event reference - * is cleared during our event processing call exit. - */ - if (cm_id) { - if (cm_id->qp) - rdma_destroy_qp(cm_id); - - rdma_destroy_id(cm_id); + if (conn->cm_id) { + if (conn->cm_id->qp) + rdma_destroy_qp(conn->cm_id); + rdma_destroy_id(conn->cm_id); } + dapl_os_free(conn, sizeof(*conn)); } @@ -255,6 +247,7 @@ static struct dapl_cm_id *dapli_req_recv(struct dapl_cm_id *conn, event->id->context = new_conn; /* update CM_ID context */ new_conn->sp = conn->sp; new_conn->hca = conn->hca; + new_conn->refs++; /* Get requesters connect data, setup for accept */ new_conn->params.responder_resources = @@ -308,17 +301,14 @@ static struct dapl_cm_id *dapli_req_recv(struct dapl_cm_id *conn, static void dapli_cm_active_cb(struct dapl_cm_id *conn, struct rdma_cm_event *event) { + DAPL_OS_LOCK *lock = &conn->lock; + ib_cm_events_t ib_cm_event; + const void *pdata = NULL; + dapl_dbg_log(DAPL_DBG_TYPE_CM, " active_cb: conn %p id %d event %d\n", conn, conn->cm_id, event->event); - dapl_os_lock(&conn->lock); - if (conn->destroy) { - dapl_os_unlock(&conn->lock); - return; - } - dapl_os_unlock(&conn->lock); - /* There is a chance that we can get events after * the consumer calls disconnect in a pending state * 
since the IB CM and uDAPL states are not shared. @@ -340,64 +330,53 @@ static void dapli_cm_active_cb(struct dapl_cm_id *conn, conn->ep->param.ep_state = DAT_EP_STATE_DISCONNECTED; dapl_os_unlock(&conn->ep->header.lock); + dapl_os_lock(lock); switch (event->event) { case RDMA_CM_EVENT_UNREACHABLE: case RDMA_CM_EVENT_CONNECT_ERROR: - { + dapl_log(DAPL_DBG_TYPE_WARN, + "dapl_cma_active: CONN_ERR event=0x%x" + " status=%d %s DST %s, %d\n", + event->event, event->status, + (event->status == -ETIMEDOUT) ? "TIMEOUT" : "", + inet_ntoa(((struct sockaddr_in *) + &conn->cm_id->route.addr.dst_addr)-> + sin_addr), + ntohs(((struct sockaddr_in *) + &conn->cm_id->route.addr.dst_addr)-> + sin_port)); + + /* per DAT SPEC provider always returns UNREACHABLE */ + ib_cm_event = IB_CME_DESTINATION_UNREACHABLE; + break; + case RDMA_CM_EVENT_REJECTED: + dapl_dbg_log(DAPL_DBG_TYPE_CM, + " dapli_cm_active_handler: REJECTED reason=%d\n", + event->status); + + /* valid REJ from consumer will always contain private data */ + if (event->status == 28 && + event->param.conn.private_data_len) { + ib_cm_event = IB_CME_DESTINATION_REJECT_PRIVATE_DATA; + pdata = + (unsigned char *)event->param.conn. + private_data + + sizeof(struct dapl_pdata_hdr); + } else { + ib_cm_event = IB_CME_DESTINATION_REJECT; dapl_log(DAPL_DBG_TYPE_WARN, - "dapl_cma_active: CONN_ERR event=0x%x" - " status=%d %s DST %s, %d\n", - event->event, event->status, - (event->status == -ETIMEDOUT) ? "TIMEOUT" : "", + "dapl_cma_active: non-consumer REJ," + " reason=%d, DST %s, %d\n", + event->status, inet_ntoa(((struct sockaddr_in *) - &conn->cm_id->route.addr.dst_addr)-> - sin_addr), + &conn->cm_id->route.addr. 
+ dst_addr)->sin_addr), ntohs(((struct sockaddr_in *) - &conn->cm_id->route.addr.dst_addr)-> - sin_port)); - - /* per DAT SPEC provider always returns UNREACHABLE */ - dapl_evd_connection_callback(conn, - IB_CME_DESTINATION_UNREACHABLE, - NULL, conn->ep); - break; - } - case RDMA_CM_EVENT_REJECTED: - { - ib_cm_events_t cm_event; - unsigned char *pdata = NULL; - - dapl_dbg_log(DAPL_DBG_TYPE_CM, - " dapli_cm_active_handler: REJECTED reason=%d\n", - event->status); - - /* valid REJ from consumer will always contain private data */ - if (event->status == 28 && - event->param.conn.private_data_len) { - cm_event = - IB_CME_DESTINATION_REJECT_PRIVATE_DATA; - pdata = - (unsigned char *)event->param.conn. - private_data + - sizeof(struct dapl_pdata_hdr); - } else { - cm_event = IB_CME_DESTINATION_REJECT; - dapl_log(DAPL_DBG_TYPE_WARN, - "dapl_cma_active: non-consumer REJ," - " reason=%d, DST %s, %d\n", - event->status, - inet_ntoa(((struct sockaddr_in *) - &conn->cm_id->route.addr. - dst_addr)->sin_addr), - ntohs(((struct sockaddr_in *) - &conn->cm_id->route.addr. - dst_addr)->sin_port)); - } - dapl_evd_connection_callback(conn, cm_event, pdata, - conn->ep); - break; + &conn->cm_id->route.addr. 
+ dst_addr)->sin_port)); } + break; case RDMA_CM_EVENT_ESTABLISHED: dapl_dbg_log(DAPL_DBG_TYPE_CM, " active_cb: cm_id %d PORT %d CONNECTED to %s!\n", @@ -414,58 +393,51 @@ static void dapli_cm_active_cb(struct dapl_cm_id *conn, conn->ep->param.local_port_qual = PORT_TO_SID(rdma_get_src_port(conn->cm_id)); - dapl_evd_connection_callback(conn, IB_CME_CONNECTED, - event->param.conn.private_data, - conn->ep); + ib_cm_event = IB_CME_CONNECTED; + pdata = event->param.conn.private_data; break; - case RDMA_CM_EVENT_DISCONNECTED: dapl_dbg_log(DAPL_DBG_TYPE_CM, " active_cb: DISC EVENT - EP %p\n",conn->ep); rdma_disconnect(conn->cm_id); /* required for DREP */ + ib_cm_event = IB_CME_DISCONNECTED; /* validate EP handle */ - if (!DAPL_BAD_HANDLE(conn->ep, DAPL_MAGIC_EP)) - dapl_evd_connection_callback(conn, - IB_CME_DISCONNECTED, - NULL, conn->ep); + if (DAPL_BAD_HANDLE(conn->ep, DAPL_MAGIC_EP)) + conn = NULL; break; default: dapl_dbg_log(DAPL_DBG_TYPE_ERR, " dapli_cm_active_cb_handler: Unexpected CM " "event %d on ID 0x%p\n", event->event, conn->cm_id); + conn = NULL; break; } - return; + dapl_os_unlock(lock); + if (conn) + dapl_evd_connection_callback(conn, ib_cm_event, pdata, conn->ep); } static void dapli_cm_passive_cb(struct dapl_cm_id *conn, struct rdma_cm_event *event) { - struct dapl_cm_id *new_conn; - + ib_cm_events_t ib_cm_event; + struct dapl_cm_id *conn_recv = conn; + const void *pdata = NULL; + dapl_dbg_log(DAPL_DBG_TYPE_CM, " passive_cb: conn %p id %d event %d\n", conn, event->id, event->event); dapl_os_lock(&conn->lock); - if (conn->destroy) { - dapl_os_unlock(&conn->lock); - return; - } - dapl_os_unlock(&conn->lock); switch (event->event) { case RDMA_CM_EVENT_CONNECT_REQUEST: /* create new conn object with new conn_id from event */ - new_conn = dapli_req_recv(conn, event); - - if (new_conn) - dapls_cr_callback(new_conn, - IB_CME_CONNECTION_REQUEST_PENDING, - event->param.conn.private_data, - new_conn->sp); + conn_recv = dapli_req_recv(conn, event); + ib_cm_event 
= IB_CME_CONNECTION_REQUEST_PENDING; + pdata = event->param.conn.private_data; break; case RDMA_CM_EVENT_UNREACHABLE: case RDMA_CM_EVENT_CONNECT_ERROR: @@ -479,29 +451,22 @@ static void dapli_cm_passive_cb(struct dapl_cm_id *conn, sin_addr), ntohs(((struct sockaddr_in *) &conn->cm_id->route.addr. dst_addr)->sin_port)); - - dapls_cr_callback(conn, IB_CME_DESTINATION_UNREACHABLE, - NULL, conn->sp); + ib_cm_event = IB_CME_DESTINATION_UNREACHABLE; break; - case RDMA_CM_EVENT_REJECTED: - { - /* will alwasys be abnormal NON-consumer from active side */ - dapl_log(DAPL_DBG_TYPE_WARN, - "dapl_cm_passive: non-consumer REJ, reason=%d," - " DST %s, %d\n", - event->status, - inet_ntoa(((struct sockaddr_in *) - &conn->cm_id->route.addr.dst_addr)-> - sin_addr), - ntohs(((struct sockaddr_in *) - &conn->cm_id->route.addr.dst_addr)-> - sin_port)); - - dapls_cr_callback(conn, IB_CME_DESTINATION_REJECT, - NULL, conn->sp); - break; - } + /* will alwasys be abnormal NON-consumer from active side */ + dapl_log(DAPL_DBG_TYPE_WARN, + "dapl_cm_passive: non-consumer REJ, reason=%d," + " DST %s, %d\n", + event->status, + inet_ntoa(((struct sockaddr_in *) + &conn->cm_id->route.addr.dst_addr)-> + sin_addr), + ntohs(((struct sockaddr_in *) + &conn->cm_id->route.addr.dst_addr)-> + sin_port)); + ib_cm_event = IB_CME_DESTINATION_REJECT; + break; case RDMA_CM_EVENT_ESTABLISHED: dapl_dbg_log(DAPL_DBG_TYPE_CM, " passive_cb: cm_id %p PORT %d CONNECTED from 0x%x!\n", @@ -511,26 +476,27 @@ static void dapli_cm_passive_cb(struct dapl_cm_id *conn, ntohl(((struct sockaddr_in *) &conn->cm_id->route.addr.dst_addr)-> sin_addr.s_addr)); - - dapls_cr_callback(conn, IB_CME_CONNECTED, NULL, conn->sp); - + ib_cm_event = IB_CME_CONNECTED; break; case RDMA_CM_EVENT_DISCONNECTED: rdma_disconnect(conn->cm_id); /* required for DREP */ + ib_cm_event = IB_CME_DISCONNECTED; /* validate SP handle context */ - if (!DAPL_BAD_HANDLE(conn->sp, DAPL_MAGIC_PSP) || - !DAPL_BAD_HANDLE(conn->sp, DAPL_MAGIC_RSP)) - 
dapls_cr_callback(conn, - IB_CME_DISCONNECTED, NULL, conn->sp); + if (DAPL_BAD_HANDLE(conn->sp, DAPL_MAGIC_PSP) && + DAPL_BAD_HANDLE(conn->sp, DAPL_MAGIC_RSP)) + conn_recv = NULL; break; default: dapl_dbg_log(DAPL_DBG_TYPE_ERR, " passive_cb: " "Unexpected CM event %d on ID 0x%p\n", event->event, conn->cm_id); + conn_recv = NULL; break; } - return; + dapl_os_unlock(&conn->lock); + if (conn_recv) + dapls_cr_callback(conn_recv, ib_cm_event, pdata, conn_recv->sp); } /************************ DAPL provider entry points **********************/ @@ -713,6 +679,7 @@ dapls_ib_setup_conn_listener(IN DAPL_IA * ia_ptr, dapl_os_memzero(conn, sizeof(*conn)); dapl_os_lock_init(&conn->lock); + conn->refs++; /* create CM_ID, bind to local device, create QP */ if (rdma_create_id @@ -1196,10 +1163,8 @@ ib_cm_events_t dapls_ib_get_cm_event(IN DAT_EVENT_NUMBER dat_event_num) void dapli_cma_event_cb(void) { struct rdma_cm_event *event; - - dapl_dbg_log(DAPL_DBG_TYPE_UTIL, " cm_event()\n"); - - /* process one CM event, fairness */ + + /* process one CM event, fairness, non-blocking */ if (!rdma_get_cm_event(g_cm_events, &event)) { struct dapl_cm_id *conn; @@ -1212,6 +1177,16 @@ void dapli_cma_event_cb(void) dapl_dbg_log(DAPL_DBG_TYPE_CM, " cm_event: EVENT=%d ID=%p LID=%p CTX=%p\n", event->event, event->id, event->listen_id, conn); + + /* cm_free is blocked waiting for ack */ + dapl_os_lock(&conn->lock); + if (!conn->refs) { + dapl_os_unlock(&conn->lock); + rdma_ack_cm_event(event); + return; + } + conn->refs++; + dapl_os_unlock(&conn->lock); switch (event->event) { case RDMA_CM_EVENT_ADDR_RESOLVED: @@ -1317,15 +1292,20 @@ void dapli_cma_event_cb(void) #endif break; default: - dapl_dbg_log(DAPL_DBG_TYPE_WARN, + dapl_dbg_log(DAPL_DBG_TYPE_CM, " cm_event: UNEXPECTED EVENT=%p ID=%p CTX=%p\n", event->event, event->id, event->id->context); break; } + /* ack event, unblocks destroy_cm_id in consumer threads */ rdma_ack_cm_event(event); - } + + dapl_os_lock(&conn->lock); + conn->refs--; + 
dapl_os_unlock(&conn->lock); + } } /* diff --git a/dapl/openib_cma/dapl_ib_util.h b/dapl/openib_cma/dapl_ib_util.h index 35900e7..309db53 100755 --- a/dapl/openib_cma/dapl_ib_util.h +++ b/dapl/openib_cma/dapl_ib_util.h @@ -58,7 +58,7 @@ struct dapl_cm_id { DAPL_OS_LOCK lock; - int destroy; + int refs; int arp_retries; int arp_timeout; int route_retries; diff --git a/dapl/openib_cma/device.c b/dapl/openib_cma/device.c index c1c1ee2..e9ec733 100644 --- a/dapl/openib_cma/device.c +++ b/dapl/openib_cma/device.c @@ -57,7 +57,7 @@ struct dapl_llist_entry *g_hca_list; #include "..\..\..\..\..\etc\user\comp_channel.cpp" #include -struct ibvw_windata windata; +static COMP_SET ufds; static int getipaddr_netdev(char *name, char *addr, int addr_len) { @@ -101,14 +101,12 @@ release: static int dapls_os_init(void) { - return ibvw_get_windata(&windata, IBVW_WINDATA_VERSION); + return CompSetInit(&ufds); } static void dapls_os_release(void) { - if (windata.comp_mgr) - ibvw_release_windata(&windata, IBVW_WINDATA_VERSION); - windata.comp_mgr = NULL; + CompSetCleanup(&ufds); } static int dapls_config_cm_channel(struct rdma_event_channel *channel) @@ -131,7 +129,7 @@ static int dapls_config_comp_channel(struct ibv_comp_channel *channel) static int dapls_thread_signal(void) { - CompManagerCancel(windata.comp_mgr); + CompSetCancel(&ufds); return 0; } #else // _WIN64 || WIN32 @@ -611,11 +609,16 @@ void dapli_thread(void *arg) g_ib_thread_state == IB_THREAD_RUN; dapl_os_lock(&g_hca_lock)) { + CompSetZero(&ufds); + CompSetAdd(&g_cm_events->channel, &ufds); + idx = 0; hca = dapl_llist_is_empty(&g_hca_list) ? 
 			NULL : dapl_llist_peek_head(&g_hca_list);
 		while (hca) {
+			CompSetAdd(&hca->ib_ctx->channel, &ufds);
+			CompSetAdd(&hca->ib_cq->comp_channel, &ufds);
 			uhca[idx++] = hca;
 			hca = dapl_llist_next_entry(&g_hca_list,
 						    (DAPL_LLIST_ENTRY *)
@@ -624,7 +627,7 @@ void dapli_thread(void *arg)
 		cnt = idx;

 		dapl_os_unlock(&g_hca_lock);
-		ret = CompManagerPoll(windata.comp_mgr, INFINITE, &channel);
+		ret = CompSetPoll(&ufds, INFINITE);

 		dapl_dbg_log(DAPL_DBG_TYPE_UTIL,
 			     " ib_thread(%d) poll_event 0x%x\n",
-- 
1.5.2.5

From khris4 at gmail.com Mon Sep 28 20:33:12 2009
From: khris4 at gmail.com (chris)
Date: Mon, 28 Sep 2009 20:33:12 -0700
Subject: [ofa-general] help install ofed 1.4 on Centos 5.2
Message-ID: <9e14a1260909282033h7adafad4q9a2a81d943e428c9@mail.gmail.com>

Hey Guys,

So after compiling and installing everything from OFED 1.5, I get this
issue when trying to start iscsi with ib_iser.

Linux localhost.localdomain 2.6.18-164.el5xen #1 SMP Thu Sep 3 04:03:03 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

Starting iSCSI initiator service: FATAL: Error inserting ib_iser (/lib/modules/2.6.18-164.el5xen/kernel/drivers/infiniband/ulp/iser/ib_iser.ko): Unknown symbol in module, or unknown parameter (see dmesg)
[ OK ]
Setting up iSCSI targets: iscsiadm: No records found!
[ OK ] iscsi: registered transport (tcp) ib_iser: disagrees about version of symbol ib_fmr_pool_unmap ib_iser: Unknown symbol ib_fmr_pool_unmap ib_iser: disagrees about version of symbol ib_create_cq ib_iser: Unknown symbol ib_create_cq ib_iser: disagrees about version of symbol rdma_resolve_addr ib_iser: Unknown symbol rdma_resolve_addr ib_iser: disagrees about version of symbol ib_create_fmr_pool ib_iser: Unknown symbol ib_create_fmr_pool ib_iser: disagrees about version of symbol ib_dereg_mr ib_iser: Unknown symbol ib_dereg_mr ib_iser: disagrees about version of symbol rdma_disconnect ib_iser: Unknown symbol rdma_disconnect ib_iser: disagrees about version of symbol rdma_resolve_route ib_iser: Unknown symbol rdma_resolve_route ib_iser: disagrees about version of symbol rdma_create_qp ib_iser: Unknown symbol rdma_create_qp ib_iser: disagrees about version of symbol ib_destroy_cq ib_iser: Unknown symbol ib_destroy_cq ib_iser: disagrees about version of symbol rdma_create_id ib_iser: Unknown symbol rdma_create_id ib_iser: disagrees about version of symbol rdma_destroy_qp ib_iser: Unknown symbol rdma_destroy_qp ib_iser: disagrees about version of symbol ib_get_dma_mr ib_iser: Unknown symbol ib_get_dma_mr ib_iser: disagrees about version of symbol ib_alloc_pd ib_iser: Unknown symbol ib_alloc_pd ib_iser: disagrees about version of symbol rdma_connect ib_iser: Unknown symbol rdma_connect ib_iser: disagrees about version of symbol rdma_destroy_id ib_iser: Unknown symbol rdma_destroy_id ib_iser: disagrees about version of symbol ib_dealloc_pd ib_iser: Unknown symbol ib_dealloc_pd ib_iser: disagrees about version of symbol ib_fmr_pool_map_phys ib_iser: Unknown symbol ib_fmr_pool_map_phys ib_iser: disagrees about version of symbol ib_fmr_pool_unmap ib_iser: Unknown symbol ib_fmr_pool_unmap ib_iser: disagrees about version of symbol ib_create_cq ib_iser: Unknown symbol ib_create_cq ib_iser: disagrees about version of symbol rdma_resolve_addr ib_iser: Unknown symbol 
rdma_resolve_addr
ib_iser: disagrees about version of symbol ib_create_fmr_pool
ib_iser: Unknown symbol ib_create_fmr_pool
ib_iser: disagrees about version of symbol ib_dereg_mr
ib_iser: Unknown symbol ib_dereg_mr
ib_iser: disagrees about version of symbol rdma_disconnect
ib_iser: Unknown symbol rdma_disconnect
ib_iser: disagrees about version of symbol rdma_resolve_route
ib_iser: Unknown symbol rdma_resolve_route
ib_iser: disagrees about version of symbol rdma_create_qp
ib_iser: Unknown symbol rdma_create_qp
ib_iser: disagrees about version of symbol ib_destroy_cq
ib_iser: Unknown symbol ib_destroy_cq
ib_iser: disagrees about version of symbol rdma_create_id
ib_iser: Unknown symbol rdma_create_id
ib_iser: disagrees about version of symbol rdma_destroy_qp
ib_iser: Unknown symbol rdma_destroy_qp
ib_iser: disagrees about version of symbol ib_get_dma_mr
ib_iser: Unknown symbol ib_get_dma_mr
ib_iser: disagrees about version of symbol ib_alloc_pd
ib_iser: Unknown symbol ib_alloc_pd
ib_iser: disagrees about version of symbol rdma_connect
ib_iser: Unknown symbol rdma_connect
ib_iser: disagrees about version of symbol rdma_destroy_id
ib_iser: Unknown symbol rdma_destroy_id
ib_iser: disagrees about version of symbol ib_dealloc_pd
ib_iser: Unknown symbol ib_dealloc_pd
ib_iser: disagrees about
version of symbol ib_fmr_pool_map_phys
ib_iser: Unknown symbol ib_fmr_pool_map_phys

From khris4 at gmail.com Mon Sep 28 21:56:27 2009
From: khris4 at gmail.com (Chris)
Date: Mon, 28 Sep 2009 21:56:27 -0700
Subject: [ofa-general] Ib_iser error with OFED 1.5.1 and centos 5.3
Message-ID: <416D52E5-D9C3-478A-93A1-252C1952F73A@gmail.com>

Hey Guys,

So after compiling and installing everything from OFED 1.5, I get this
issue when trying to start iscsi with ib_iser.

Linux localhost.localdomain 2.6.18-164.el5xen #1 SMP Thu Sep 3 04:03:03 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

Starting iSCSI initiator service: FATAL: Error inserting ib_iser (/lib/modules/2.6.18-164.el5xen/kernel/drivers/infiniband/ulp/iser/ib_iser.ko): Unknown symbol in module, or unknown parameter (see dmesg)
[ OK ]
Setting up iSCSI targets: iscsiadm: No records found!
[ OK ]

iscsi: registered transport (tcp)
ib_iser: disagrees about version of symbol ib_fmr_pool_unmap
ib_iser: Unknown symbol ib_fmr_pool_unmap
ib_iser: disagrees about version of symbol ib_create_cq
ib_iser: Unknown symbol ib_create_cq
ib_iser: disagrees about version of symbol rdma_resolve_addr
ib_iser: Unknown symbol rdma_resolve_addr
ib_iser: disagrees about version of symbol ib_create_fmr_pool
ib_iser: Unknown symbol ib_create_fmr_pool
ib_iser: disagrees about version of symbol ib_dereg_mr
ib_iser: Unknown symbol ib_dereg_mr
ib_iser: disagrees about version of symbol rdma_disconnect
ib_iser: Unknown symbol rdma_disconnect
ib_iser: disagrees about version of symbol rdma_resolve_route
ib_iser: Unknown symbol rdma_resolve_route
ib_iser: disagrees about version of symbol rdma_create_qp
ib_iser: Unknown symbol rdma_create_qp
ib_iser: disagrees about version of symbol ib_destroy_cq
ib_iser: Unknown symbol ib_destroy_cq
ib_iser: disagrees about version of symbol rdma_create_id
ib_iser: Unknown symbol rdma_create_id
ib_iser: disagrees about version of symbol
rdma_destroy_qp ib_iser: Unknown symbol rdma_destroy_qp ib_iser: disagrees about version of symbol ib_get_dma_mr ib_iser: Unknown symbol ib_get_dma_mr ib_iser: disagrees about version of symbol ib_alloc_pd ib_iser: Unknown symbol ib_alloc_pd ib_iser: disagrees about version of symbol rdma_connect ib_iser: Unknown symbol rdma_connect ib_iser: disagrees about version of symbol rdma_destroy_id ib_iser: Unknown symbol rdma_destroy_id ib_iser: disagrees about version of symbol ib_dealloc_pd ib_iser: Unknown symbol ib_dealloc_pd ib_iser: disagrees about version of symbol ib_fmr_pool_map_phys ib_iser: Unknown symbol ib_fmr_pool_map_phys ib_iser: disagrees about version of symbol ib_fmr_pool_unmap ib_iser: Unknown symbol ib_fmr_pool_unmap ib_iser: disagrees about version of symbol ib_create_cq ib_iser: Unknown symbol ib_create_cq ib_iser: disagrees about version of symbol rdma_resolve_addr ib_iser: Unknown symbol rdma_resolve_addr ib_iser: disagrees about version of symbol ib_create_fmr_pool ib_iser: Unknown symbol ib_create_fmr_pool ib_iser: disagrees about version of symbol ib_dereg_mr ib_iser: Unknown symbol ib_dereg_mr ib_iser: disagrees about version of symbol rdma_disconnect ib_iser: Unknown symbol rdma_disconnect ib_iser: disagrees about version of symbol rdma_resolve_route ib_iser: Unknown symbol rdma_resolve_route ib_iser: disagrees about version of symbol rdma_create_qp ib_iser: Unknown symbol rdma_create_qp ib_iser: disagrees about version of symbol ib_destroy_cq ib_iser: Unknown symbol ib_destroy_cq ib_iser: disagrees about version of symbol rdma_create_id ib_iser: Unknown symbol rdma_create_id ib_iser: disagrees about version of symbol rdma_destroy_qp ib_iser: Unknown symbol rdma_destroy_qp ib_iser: disagrees about version of symbol ib_get_dma_mr ib_iser: Unknown symbol ib_get_dma_mr ib_iser: disagrees about version of symbol ib_alloc_pd ib_iser: Unknown symbol ib_alloc_pd ib_iser: disagrees about version of symbol rdma_connect ib_iser: Unknown symbol 
rdma_connect ib_iser: disagrees about version of symbol rdma_destroy_id ib_iser: Unknown symbol rdma_destroy_id ib_iser: disagrees about version of symbol ib_dealloc_pd ib_iser: Unknown symbol ib_dealloc_pd ib_iser: disagrees about version of symbol ib_fmr_pool_map_phys ib_iser: Unknown symbol ib_fmr_pool_map_phys ib_iser: disagrees about version of symbol ib_fmr_pool_unmap ib_iser: Unknown symbol ib_fmr_pool_unmap ib_iser: disagrees about version of symbol ib_create_cq ib_iser: Unknown symbol ib_create_cq ib_iser: disagrees about version of symbol rdma_resolve_addr ib_iser: Unknown symbol rdma_resolve_addr ib_iser: disagrees about version of symbol ib_create_fmr_pool ib_iser: Unknown symbol ib_create_fmr_pool ib_iser: disagrees about version of symbol ib_dereg_mr ib_iser: Unknown symbol ib_dereg_mr ib_iser: disagrees about version of symbol rdma_disconnect ib_iser: Unknown symbol rdma_disconnect ib_iser: disagrees about version of symbol rdma_resolve_route ib_iser: Unknown symbol rdma_resolve_route ib_iser: disagrees about version of symbol rdma_create_qp ib_iser: Unknown symbol rdma_create_qp ib_iser: disagrees about version of symbol ib_destroy_cq ib_iser: Unknown symbol ib_destroy_cq ib_iser: disagrees about version of symbol rdma_create_id ib_iser: Unknown symbol rdma_create_id ib_iser: disagrees about version of symbol rdma_destroy_qp ib_iser: Unknown symbol rdma_destroy_qp ib_iser: disagrees about version of symbol ib_get_dma_mr ib_iser: Unknown symbol ib_get_dma_mr ib_iser: disagrees about version of symbol ib_alloc_pd ib_iser: Unknown symbol ib_alloc_pd ib_iser: disagrees about version of symbol rdma_connect ib_iser: Unknown symbol rdma_connect ib_iser: disagrees about version of symbol rdma_destroy_id ib_iser: Unknown symbol rdma_destroy_id ib_iser: disagrees about version of symbol ib_dealloc_pd ib_iser: Unknown symbol ib_dealloc_pd ib_iser: disagrees about version of symbol ib_fmr_pool_map_phys ib_iser: Unknown symbol ib_fmr_pool_map_phys ib_iser: 
disagrees about version of symbol ib_fmr_pool_unmap ib_iser: Unknown symbol ib_fmr_pool_unmap ib_iser: disagrees about version of symbol ib_create_cq ib_iser: Unknown symbol ib_create_cq ib_iser: disagrees about version of symbol rdma_resolve_addr ib_iser: Unknown symbol rdma_resolve_addr ib_iser: disagrees about version of symbol ib_create_fmr_pool ib_iser: Unknown symbol ib_create_fmr_pool ib_iser: disagrees about version of symbol ib_dereg_mr ib_iser: Unknown symbol ib_dereg_mr ib_iser: disagrees about version of symbol rdma_disconnect ib_iser: Unknown symbol rdma_disconnect ib_iser: disagrees about version of symbol rdma_resolve_route ib_iser: Unknown symbol rdma_resolve_route ib_iser: disagrees about version of symbol rdma_create_qp ib_iser: Unknown symbol rdma_create_qp ib_iser: disagrees about version of symbol ib_destroy_cq ib_iser: Unknown symbol ib_destroy_cq ib_iser: disagrees about version of symbol rdma_create_id ib_iser: Unknown symbol rdma_create_id ib_iser: disagrees about version of symbol rdma_destroy_qp ib_iser: Unknown symbol rdma_destroy_qp ib_iser: disagrees about version of symbol ib_get_dma_mr ib_iser: Unknown symbol ib_get_dma_mr ib_iser: disagrees about version of symbol ib_alloc_pd ib_iser: Unknown symbol ib_alloc_pd ib_iser: disagrees about version of symbol rdma_connect ib_iser: Unknown symbol rdma_connect ib_iser: disagrees about version of symbol rdma_destroy_id ib_iser: Unknown symbol rdma_destroy_id ib_iser: disagrees about version of symbol ib_dealloc_pd ib_iser: Unknown symbol ib_dealloc_pd ib_iser: disagrees about version of symbol ib_fmr_pool_map_phys ib_iser: Unknown symbol ib_fmr_pool_map_phys ib_iser: disagrees about version of symbol ib_fmr_pool_unmap ib_iser: Unknown symbol ib_fmr_pool_unmap ib_iser: disagrees about version of symbol ib_create_cq ib_iser: Unknown symbol ib_create_cq ib_iser: disagrees about version of symbol rdma_resolve_addr ib_iser: Unknown symbol rdma_resolve_addr ib_iser: disagrees about version of 
symbol ib_create_fmr_pool ib_iser: Unknown symbol ib_create_fmr_pool ib_iser: disagrees about version of symbol ib_dereg_mr ib_iser: Unknown symbol ib_dereg_mr ib_iser: disagrees about version of symbol rdma_disconnect ib_iser: Unknown symbol rdma_disconnect ib_iser: disagrees about version of symbol rdma_resolve_route ib_iser: Unknown symbol rdma_resolve_route ib_iser: disagrees about version of symbol rdma_create_qp ib_iser: Unknown symbol rdma_create_qp ib_iser: disagrees about version of symbol ib_destroy_cq ib_iser: Unknown symbol ib_destroy_cq ib_iser: disagrees about version of symbol rdma_create_id ib_iser: Unknown symbol rdma_create_id ib_iser: disagrees about version of symbol rdma_destroy_qp ib_iser: Unknown symbol rdma_destroy_qp ib_iser: disagrees about version of symbol ib_get_dma_mr ib_iser: Unknown symbol ib_get_dma_mr ib_iser: disagrees about version of symbol ib_alloc_pd ib_iser: Unknown symbol ib_alloc_pd ib_iser: disagrees about version of symbol rdma_connect ib_iser: Unknown symbol rdma_connect ib_iser: disagrees about version of symbol rdma_destroy_id ib_iser: Unknown symbol rdma_destroy_id ib_iser: disagrees about version of symbol ib_dealloc_pd ib_iser: Unknown symbol ib_dealloc_pd ib_iser: disagrees about version of symbol ib_fmr_pool_map_phys ib_iser: Unknown symbol ib_fmr_pool_map_phys ib_iser: disagrees about version of symbol ib_fmr_pool_unmap ib_iser: Unknown symbol ib_fmr_pool_unmap ib_iser: disagrees about version of symbol ib_create_cq ib_iser: Unknown symbol ib_create_cq ib_iser: disagrees about version of symbol rdma_resolve_addr ib_iser: Unknown symbol rdma_resolve_addr ib_iser: disagrees about version of symbol ib_create_fmr_pool ib_iser: Unknown symbol ib_create_fmr_pool ib_iser: disagrees about version of symbol ib_dereg_mr ib_iser: Unknown symbol ib_dereg_mr ib_iser: disagrees about version of symbol rdma_disconnect ib_iser: Unknown symbol rdma_disconnect ib_iser: disagrees about version of symbol rdma_resolve_route ib_iser: 
Unknown symbol rdma_resolve_route ib_iser: disagrees about version of symbol rdma_create_qp ib_iser: Unknown symbol rdma_create_qp ib_iser: disagrees about version of symbol ib_destroy_cq ib_iser: Unknown symbol ib_destroy_cq ib_iser: disagrees about version of symbol rdma_create_id ib_iser: Unknown symbol rdma_create_id ib_iser: disagrees about version of symbol rdma_destroy_qp ib_iser: Unknown symbol rdma_destroy_qp ib_iser: disagrees about version of symbol ib_get_dma_mr ib_iser: Unknown symbol ib_get_dma_mr ib_iser: disagrees about version of symbol ib_alloc_pd ib_iser: Unknown symbol ib_alloc_pd ib_iser: disagrees about version of symbol rdma_connect ib_iser: Unknown symbol rdma_connect ib_iser: disagrees about version of symbol rdma_destroy_id ib_iser: Unknown symbol rdma_destroy_id ib_iser: disagrees about version of symbol ib_dealloc_pd ib_iser: Unknown symbol ib_dealloc_pd ib_iser: disagrees about version of symbol ib_fmr_pool_map_phys ib_iser: Unknown symbol ib_fmr_pool_map_phys Sent from my iPhone -------------- next part -------------- An HTML attachment was scrubbed... URL: From khris4 at gmail.com Mon Sep 28 22:03:59 2009 From: khris4 at gmail.com (Chris) Date: Mon, 28 Sep 2009 22:03:59 -0700 Subject: [ofa-general] Re: general Digest, Vol 32, Issue 108 In-Reply-To: <20090929050452.59AE1E621D7@openfabrics.org> References: <20090929050452.59AE1E621D7@openfabrics.org> Message-ID: <0D401559-D16C-4DA9-85A3-B6851C1A1276@gmail.com> Sorry for the double post I didn't get a confirmation for the one. 
So I thought it didn't go thru Sent from my iPhone From vlad at lists.openfabrics.org Tue Sep 29 03:16:49 2009 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Tue, 29 Sep 2009 03:16:49 -0700 (PDT) Subject: [ofa-general] ofa_1_5_kernel 20090929-0200 daily build status Message-ID: <20090929101650.05A2DE61C31@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git git_branch: ofed_kernel_1_5 Common build parameters: Passed: Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.26 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.22 Passed
on i686 with linux-2.6.27 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18-164.el5 Passed on x86_64 with linux-2.6.18-128.el5 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.27 Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on x86_64 with linux-2.6.9-78.ELsmp Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.26 Passed on ia64 with linux-2.6.25 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Failed: From hnrose at comcast.net Tue Sep 29 03:53:02 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Tue, 29 Sep 2009 06:53:02 -0400 Subject: [ofa-general] [PATCHv2] opensm/osm_mesh.c: Add dump_mesh routine at OSM_LOG_DEBUG level Message-ID: <20090929105301.GA26638@comcast.net> Signed-off-by: Hal Rosenstock --- Changes since v1: Use snprintf rather than sprintf Also, moved output of ] diff --git a/opensm/opensm/osm_mesh.c b/opensm/opensm/osm_mesh.c index 260e2f8..53f0f58 100644 --- a/opensm/opensm/osm_mesh.c +++ b/opensm/opensm/osm_mesh.c @@ -1565,6 +1565,63 @@ err: return -1; } +static void dump_mesh(lash_t *p_lash) +{ + osm_log_t *p_log = &p_lash->p_osm->log; + int sw; + int num_switches = p_lash->num_switches; + int dimension; + int i, j, k, n; + switch_t *s, *s2; + char buf[256]; + + OSM_LOG_ENTER(p_log); + + for (sw = 0; sw < num_switches; sw++) { + s = p_lash->switches[sw]; + dimension = s->node->dimension; + n = sprintf(buf, "["); + for (i = 0; i < dimension; i++) { + n += snprintf(buf + n, sizeof(buf) - n, + "%2d", 
s->node->coord[i]); + if (n > sizeof(buf)) + n = sizeof(buf); + if (i != dimension - 1) { + n += snprintf(buf + n, sizeof(buf) - n, "%s", ","); + if (n > sizeof(buf)) + n = sizeof(buf); + } + } + n += snprintf(buf + n, sizeof(buf) - n, "]"); + if (n > sizeof(buf)) + n = sizeof(buf); + for (j = 0; j < s->node->num_links; j++) { + s2 = p_lash->switches[s->node->links[j]->switch_id]; + n += snprintf(buf + n, sizeof(buf) - n, " [%d]->[", j); + if (n > sizeof(buf)) + n = sizeof(buf); + for (k = 0; k < dimension; k++) { + n += snprintf(buf + n, sizeof(buf) - n, "%2d", + s2->node->coord[k]); + if (n > sizeof(buf)) + n = sizeof(buf); + if (k != dimension - 1) { + n += snprintf(buf + n, sizeof(buf) - n, + ","); + if (n > sizeof(buf)) + n = sizeof(buf); + } + } + n += snprintf(buf + n, sizeof(buf) - n, "]"); + if (n > sizeof(buf)) + n = sizeof(buf); + } + OSM_LOG(p_log, OSM_LOG_DEBUG, "%s\n", buf); + } + + OSM_LOG_EXIT(p_log); +} + /* * osm_do_mesh_analysis */ @@ -1653,6 +1710,9 @@ int osm_do_mesh_analysis(lash_t *p_lash) OSM_LOG(p_log, OSM_LOG_INFO, "%s", buf); } + if (osm_log_is_active(p_log, OSM_LOG_DEBUG)) + dump_mesh(p_lash); + done: mesh_delete(mesh); OSM_LOG_EXIT(p_log); From sashak at voltaire.com Tue Sep 29 04:02:15 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 29 Sep 2009 13:02:15 +0200 Subject: [ofa-general] [PATCH] opensm/osm_get_port_by_lid(): use faster cl_ptr_vector_get() Message-ID: <20090929110215.GK26931@me> Use faster cl_ptr_vector_get() call instead of cl_ptr_vector_at(). In this way eliminate 'stat' variable needs. 
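For readers unfamiliar with the two accessor styles, here is a minimal sketch of the difference the commit message relies on. It is not part of the patch and does not use complib: `example_vec`, `example_vec_at` and `example_vec_get` are hypothetical stand-ins, written on the assumption (suggested by the patch adding its own size check) that the "at" style validates the index and reports a status while the "get" style trusts the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature pointer vector; NOT the complib cl_ptr_vector_t. */
struct example_vec {
	void **items;
	size_t size;
};

/*
 * Checked accessor: validates the index, reports a status code and
 * hands the element back through an out-parameter.
 */
static int example_vec_at(const struct example_vec *v, size_t i, void **out)
{
	if (i >= v->size)
		return -1;	/* index out of range */
	*out = v->items[i];
	return 0;
}

/*
 * Unchecked accessor: the caller must already know that i < size,
 * so no status variable is needed at the call site.
 */
static void *example_vec_get(const struct example_vec *v, size_t i)
{
	assert(i < v->size);
	return v->items[i];
}
```

With the unchecked form, a single range test before the loop replaces a status check on every iteration, which is the shape of the change in the diff that follows.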
Signed-off-by: Sasha Khapyorsky --- opensm/opensm/osm_subnet.c | 8 ++++---- 1 files changed, 4 insertions(+), 4 deletions(-) diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c index b475031..67bc7e1 100644 --- a/opensm/opensm/osm_subnet.c +++ b/opensm/opensm/osm_subnet.c @@ -646,22 +646,22 @@ osm_port_t *osm_get_port_by_guid(IN osm_subn_t const *p_subn, IN ib_net64_t guid osm_port_t *osm_get_port_by_lid(IN osm_subn_t const * subn, IN ib_net16_t lid) { osm_port_t *port = NULL; - ib_api_status_t stat; uint16_t base_lid; uint8_t lmc; lid = cl_ntoh16(lid); + if (lid >= cl_ptr_vector_get_size(&subn->port_lid_tbl)) + return NULL; /* Loop on lmc from 0 up through max LMC possible */ for (lmc = 0; lmc <= IB_PORT_LMC_MAX; lmc++) { /* Calculate a base LID assuming this is the real LMC */ base_lid = lid & ~((1 << lmc) - 1); - stat = cl_ptr_vector_at(&subn->port_lid_tbl, base_lid, - (void *)&port); + port = cl_ptr_vector_get(&subn->port_lid_tbl, base_lid); /* Determine if base LID "tested" is the real base LID */ /* This is true if the LMC "tested" is the port's actual LMC */ - if (stat == CL_SUCCESS && port && lmc == osm_port_get_lmc(port)) + if (port && lmc == osm_port_get_lmc(port)) return port; } -- 1.6.5.rc1 From sashak at voltaire.com Tue Sep 29 04:02:44 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 29 Sep 2009 13:02:44 +0200 Subject: [ofa-general] [PATCH] opensm/osm_get_port_by_lid(): speedup a port lookup In-Reply-To: <20090929110215.GK26931@me> References: <20090929110215.GK26931@me> Message-ID: <20090929110244.GL26931@me> Speedup a port lookup over LMC array - it is not necessary to match LMC exactly for found port because base lid should be equal to requested lid masked value, so '<=' comparison should be enough and we don't need to loop up to an actual port's lmc value match. 
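The "requested lid masked value" mentioned above can be illustrated with a small standalone sketch. It is not part of the patch: `example_base_lid`, `example_lid_count` and `EXAMPLE_LMC_MAX` are hypothetical names, and the value 7 is an assumption based on LMC being a 3-bit field. The mask expression itself is the one used in the loop being patched.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-in for IB_PORT_LMC_MAX (LMC is a 3-bit field). */
#define EXAMPLE_LMC_MAX 7

/*
 * Clear the low `lmc` bits of `lid`. If `lmc` is the port's real LMC,
 * the result is the port's base LID.
 */
static uint16_t example_base_lid(uint16_t lid, uint8_t lmc)
{
	assert(lmc <= EXAMPLE_LMC_MAX);
	return (uint16_t)(lid & ~((1u << lmc) - 1u));
}

/* Number of consecutive LIDs owned by a port with the given LMC. */
static unsigned example_lid_count(uint8_t lmc)
{
	assert(lmc <= EXAMPLE_LMC_MAX);
	return 1u << lmc;
}
```

For example, a port with base LID 0x10 and LMC 2 owns LIDs 0x10 through 0x13. Masking the requested LID 0x13 with LMC 0, 1 and 2 gives 0x13, 0x12 and 0x10, so the loop reaches the base LID once the tested LMC is 2.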
Signed-off-by: Sasha Khapyorsky
---
 opensm/opensm/osm_subnet.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c
index 67bc7e1..30f8af5 100644
--- a/opensm/opensm/osm_subnet.c
+++ b/opensm/opensm/osm_subnet.c
@@ -661,7 +661,7 @@ osm_port_t *osm_get_port_by_lid(IN osm_subn_t const * subn, IN ib_net16_t lid)
 		port = cl_ptr_vector_get(&subn->port_lid_tbl, base_lid);
 		/* Determine if base LID "tested" is the real base LID */
 		/* This is true if the LMC "tested" is the port's actual LMC */
-		if (port && lmc == osm_port_get_lmc(port))
+		if (port && lmc <= osm_port_get_lmc(port))
 			return port;
 	}
 
-- 
1.6.5.rc1

From sashak at voltaire.com Tue Sep 29 04:03:08 2009
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 29 Sep 2009 13:03:08 +0200
Subject: [ofa-general] [PATCH] opensm/osm_get_port_by_lid(): don't bother with lmc
In-Reply-To: <20090929110244.GL26931@me>
References: <20090929110215.GK26931@me> <20090929110244.GL26931@me>
Message-ID: <20090929110308.GM26931@me>

Since subn->port_lid_tbl vector is filled for all port's LIDs in
accordance with its LMC value, so we don't need to bother with LMC
tracking and instead can just return a pointer indexed by requested lid.
Obviously it speeds this helper up significantly.
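The idea described above can be sketched in a few lines of standalone C.
This is an illustration under stated assumptions, not the OpenSM
implementation: toy_port, lid_tbl, toy_assign_lids and
toy_get_port_by_lid are hypothetical names, and the sketch assumes a base
LID aligned to 2^LMC. Once every LID a port owns has its own table slot,
a bounds check plus one index is all the lookup needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TBL_SIZE 64	/* hypothetical table capacity for this sketch */

/* Toy stand-in for osm_port_t: only the fields the sketch needs. */
struct toy_port {
	uint16_t base_lid;
	uint8_t lmc;
};

/* Toy stand-in for subn->port_lid_tbl: one slot per LID. */
static struct toy_port *lid_tbl[TBL_SIZE];

/* Fill one slot for each of the 2^lmc LIDs the port owns. */
static void toy_assign_lids(struct toy_port *port)
{
	uint16_t lid, count = (uint16_t)(1u << port->lmc);

	/* assumption of the sketch: base LID is aligned to 2^lmc */
	assert((port->base_lid & (count - 1)) == 0);
	assert(port->base_lid + count <= TBL_SIZE);

	for (lid = port->base_lid; lid < port->base_lid + count; lid++)
		lid_tbl[lid] = port;
}

/* Lookup in the spirit of the patched helper: bounds check, then index. */
static struct toy_port *toy_get_port_by_lid(uint16_t lid)
{
	if (lid < TBL_SIZE)
		return lid_tbl[lid];
	return NULL;
}
```

No LMC loop is needed because the work was already done when the table
was filled; the trade-off is one table slot per LID instead of one per
port.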
Signed-off-by: Sasha Khapyorsky
---
 opensm/opensm/osm_subnet.c |   21 ++-------------------
 1 files changed, 2 insertions(+), 19 deletions(-)

diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c
index 30f8af5..97b62c2 100644
--- a/opensm/opensm/osm_subnet.c
+++ b/opensm/opensm/osm_subnet.c
@@ -645,26 +645,9 @@ osm_port_t *osm_get_port_by_guid(IN osm_subn_t const *p_subn, IN ib_net64_t guid
  **********************************************************************/
 osm_port_t *osm_get_port_by_lid(IN osm_subn_t const * subn, IN ib_net16_t lid)
 {
-	osm_port_t *port = NULL;
-	uint16_t base_lid;
-	uint8_t lmc;
-
 	lid = cl_ntoh16(lid);
-	if (lid >= cl_ptr_vector_get_size(&subn->port_lid_tbl))
-		return NULL;
-
-	/* Loop on lmc from 0 up through max LMC possible */
-	for (lmc = 0; lmc <= IB_PORT_LMC_MAX; lmc++) {
-		/* Calculate a base LID assuming this is the real LMC */
-		base_lid = lid & ~((1 << lmc) - 1);
-
-		port = cl_ptr_vector_get(&subn->port_lid_tbl, base_lid);
-		/* Determine if base LID "tested" is the real base LID */
-		/* This is true if the LMC "tested" is the port's actual LMC */
-		if (port && lmc <= osm_port_get_lmc(port))
-			return port;
-	}
-
+	if (lid < cl_ptr_vector_get_size(&subn->port_lid_tbl))
+		return cl_ptr_vector_get(&subn->port_lid_tbl, lid);
 	return NULL;
 }
 
-- 
1.6.5.rc1

From sashak at voltaire.com Tue Sep 29 04:04:23 2009
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 29 Sep 2009 13:04:23 +0200
Subject: [ofa-general] Re: [PATCH] infiniband-diags/src/ibqueryerrors.c: fix bug when attempting a sub-fabric scan
In-Reply-To: <20090923144555.9efa2c75.weiny2@llnl.gov>
References: <20090923144555.9efa2c75.weiny2@llnl.gov>
Message-ID: <20090929110423.GN26931@me>

On 14:45 Wed 23 Sep , Ira Weiny wrote:
>
> From: Ira Weiny
> Date: Wed, 23 Sep 2009 14:26:55 -0700
> Subject: [PATCH] infiniband-diags/src/ibqueryerrors.c: fix bug when attempting a sub-fabric scan
>
> Also ibd_sm_id is never valid in this tool as the "-s" option is used
> for "suppress"
>
>
> Signed-off-by: Ira Weiny

Applied. Thanks.

Sasha

From tziporet at dev.mellanox.co.il Tue Sep 29 06:28:42 2009
From: tziporet at dev.mellanox.co.il (Tziporet Koren)
Date: Tue, 29 Sep 2009 15:28:42 +0200
Subject: [ofa-general] srp availability in OFED1.5
In-Reply-To: <1254136602.10604.1.camel@leftknee>
References: <1254136602.10604.1.camel@leftknee>
Message-ID: <4AC20B8A.4050404@mellanox.co.il>

Hoot Thompson wrote:
> Will the srp module be included in the OFED 1.5 release? If so, when?
>
>
>
SRP is already in OFED 1.5
If there are any problems please let Vu know about it

Tziporet

From slavas at Voltaire.COM Tue Sep 29 06:53:18 2009
From: slavas at Voltaire.COM (Slava Strebkov)
Date: Tue, 29 Sep 2009 15:53:18 +0200
Subject: [ofa-general] [PATCH 1/2 v4] opensm: Storage organization for multicast groups
Message-ID: <4AC2114E.3010303@Voltaire.COM>

Main purpose is to prepare infrastructure for (many) mgids to one mlid
compression.
Proposed the following changes:
1. Element in mlid array is now a multicast group box.
2. mgrp_box keeps a list of mgroups sharing same mlid. With introduction
   of compression, there will be many multicast groups per mlid. Current
   implementation keeps one mgid to one mlid ratio.
3. mgrp_box has a map of ports sharing same mlid. Ports sorted by port
   guid. Port map is necessary for building spanning tree per mgroup_box,
   not just for single mgroup.
4. Element in port map keeps a list of mgroups opened by this port. This
   allows quick deletion of mgroups when port changes state to DOWN.
5. Multicast processing functions use mgroup_box object instead of mgroup.
Signed-off-by: Slava Strebkov --- opensm/include/opensm/osm_multicast.h | 130 ++++++++++++++++++++++++++++++-- opensm/include/opensm/osm_subnet.h | 49 ++++++++++--- opensm/opensm/osm_drop_mgr.c | 2 +- opensm/opensm/osm_mcast_mgr.c | 110 +++++++++++++-------------- opensm/opensm/osm_multicast.c | 108 ++++++++++++++++++++++++-- opensm/opensm/osm_qos_policy.c | 39 ++++++---- opensm/opensm/osm_sa.c | 32 +++----- opensm/opensm/osm_sa_mcmember_record.c | 53 ++++++++++--- opensm/opensm/osm_sa_path_record.c | 32 ++++++-- opensm/opensm/osm_subnet.c | 33 +++++++-- 10 files changed, 440 insertions(+), 148 deletions(-) diff --git a/opensm/include/opensm/osm_multicast.h b/opensm/include/opensm/osm_multicast.h index 32bcb78..d4daf4b 100644 --- a/opensm/include/opensm/osm_multicast.h +++ b/opensm/include/opensm/osm_multicast.h @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * @@ -97,8 +97,8 @@ BEGIN_C_DECLS */ typedef struct osm_mgrp { cl_fmap_item_t map_item; + cl_list_item_t box_item; ib_net16_t mlid; - osm_mtree_node_t *p_root; cl_qmap_t mcm_port_tbl; ib_member_rec_t mcmember_rec; boolean_t well_known; @@ -109,15 +109,13 @@ typedef struct osm_mgrp { * map_item * Map Item for fmap linkage. Must be first element!! * +* box_item +* List Item for the group in mgroup box +* * mlid * The network ordered LID of this Multicast Group (must be * >= 0xC000). * -* p_root -* Pointer to the root "tree node" in the single spanning tree -* for this multicast group. The nodes of the tree represent -* switches. Member ports are not represented in the tree. -* * mcm_port_tbl * Table (sorted by port GUID) of osm_mcm_port_t objects * representing the member ports of this multicast group. 
@@ -133,6 +131,71 @@ typedef struct osm_mgrp { * SEE ALSO *********/ +/****s* OpenSM: Multicast Group Holder/osm_mgrp_box_t +* NAME +* osm_mgrp_box_t +* +* DESCRIPTION +* Holder for mgroups. +* +* The osm_mgrp_box_t object should be treated as opaque and should +* be manipulated only through the provided functions. +* +* SYNOPSIS +*/ +typedef struct osm_mgrp_box { + cl_qmap_t mgrp_port_map; + cl_qlist_t mgrp_list; + ib_net16_t mlid; + osm_mtree_node_t *p_root; +} osm_mgrp_box_t; +/* +* FIELDS +* mgrp_port_map +* Map sorted by GUID of osm_mgrp_port_t objects represents +* ports to be routed with same mlid +* +* mgrp_list +* List of mgroups having same mlid +* +* mlid +* The network ordered LID of this Multicast Group (must be +* >= 0xC000). +* +* p_root +* Pointer to the root "tree node" in the single spanning tree +* for this multicast group. The nodes of the tree represent +* switches. Member ports are not represented in the tree. +* +* SEE ALSO +*********/ +/****s* OpenSM: Multicast group Port /osm_mgrp_port_t +* NAME +* osm_mgrp_port_t +* +* DESCRIPTION +* Holder for pointers to mgroups and port guid. +* +* +* SYNOPSIS +*/ +typedef struct osm_mgrp_port { + cl_map_item_t guid_item; + unsigned num_groups; + osm_port_t *p_port; +} osm_mgrp_port_t; +/* +* FIELDS +* guid_item +* Map for ports. Must be first element +* +* num_mgroups +* Number of mgroups opened by this port +* +* p_mcm_port +* pointer to osm_mcm_port_t object +* +*/ /****f* OpenSM: Multicast Group/osm_mgrp_new * NAME * osm_mgrp_new @@ -382,5 +445,58 @@ void osm_mgrp_remove_port(osm_subn_t * subn, osm_log_t * log, osm_mgrp_t * mgrp, osm_mcm_port_t * mcm_port, ib_member_rec_t * mcmr); void osm_mgrp_cleanup(osm_subn_t * subn, osm_mgrp_t * mpgr); +/****f* OpenSM: Multicast Group Box /osm_mgrp_box_new +* NAME +* osm_mgrp_box_new +* +* DESCRIPTION +* Allocates and initializes a Multicast Group Box for use. 
+* +* SYNOPSIS +*/ +osm_mgrp_box_t *osm_mgrp_box_new(IN osm_subn_t * p_subn, + IN ib_net16_t mlid); +/* +* PARAMETERS +* p_subn +* (in) pointer to osm_subnet +* mlid +* [in] Multicast LID for this multicast group box. +* +* RETURN VALUES +* pointer to initialized osm_mgrp_box_t +* or NULL, if unsuccessful +* +* SEE ALSO +* Multicast Group Box, osm_mgrp_box_delete +*********/ +/****f* OpenSM: Multicast Group Box /osm_mgrp_box_delete +* NAME +* osm_mgrp_box_delete +* +* DESCRIPTION +* Removes entry from array of boxes +* Removes port from mgroup port list +* +* SYNOPSIS +*/ +void osm_mgrp_box_delete(IN osm_subn_t * p_subn, + IN ib_net16_t mlid); +/* +* PARAMETERS +* p_subn +* [in] Pointer to osm_subnet +* +* mlid +* [in] box's mlid +* +* RETURN VALUES +* None. +* +* NOTES +* +* SEE ALSO +* +*********/ END_C_DECLS #endif /* _OSM_MULTICAST_H_ */ diff --git a/opensm/include/opensm/osm_subnet.h b/opensm/include/opensm/osm_subnet.h index 6c20de8..fe4695f 100644 --- a/opensm/include/opensm/osm_subnet.h +++ b/opensm/include/opensm/osm_subnet.h @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2008 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * Copyright (c) 2008 Xsigo Systems Inc. All rights reserved. 
@@ -69,6 +69,7 @@ BEGIN_C_DECLS #define OSM_SUBNET_VECTOR_CAPACITY 256 struct osm_opensm; struct osm_qos_policy; +struct osm_mgrp_box; /****h* OpenSM/Subnet * NAME @@ -513,7 +514,7 @@ typedef struct osm_subn { boolean_t coming_out_of_standby; unsigned need_update; cl_fmap_t mgrp_mgid_tbl; - void *mgroups[IB_LID_MCAST_END_HO - IB_LID_MCAST_START_HO + 1]; + void *mboxes[IB_LID_MCAST_END_HO - IB_LID_MCAST_START_HO + 1]; } osm_subn_t; /* * FIELDS @@ -634,8 +635,8 @@ typedef struct osm_subn { * This flag should be on during first non-master heavy * (including pre-master discovery stage) * -* mgroups -* Array of pointers to all Multicast Group objects in the subnet. +* mboxes +* Array of pointers to all Multicast Group Box objects in the subnet. * Indexed by MLID offset from base MLID. * * SEE ALSO @@ -935,21 +936,21 @@ struct osm_port *osm_get_port_by_guid(IN osm_subn_t const *p_subn, * osm_port_t *********/ -/****f* OpenSM: Subnet/osm_get_mgrp_by_mlid +/****f* OpenSM: Subnet/osm_get_mgrp_box_by_mlid * NAME -* osm_get_mgrp_by_mlid +* osm_get_mgrp_box_by_mlid * * DESCRIPTION -* The looks for the given multicast group in the subnet table by mlid. +* The looks for the given multicast group box in the subnet table by mlid. * NOTE: this code is not thread safe. Need to grab the lock before * calling it. * * SYNOPSIS */ static inline -struct osm_mgrp *osm_get_mgrp_by_mlid(osm_subn_t const *p_subn, ib_net16_t mlid) +struct osm_mgrp_box *osm_get_mgrp_box_by_mlid(osm_subn_t const *p_subn, ib_net16_t mlid) { - return p_subn->mgroups[cl_ntoh16(mlid) - IB_LID_MCAST_START_HO]; + return p_subn->mboxes[cl_ntoh16(mlid) - IB_LID_MCAST_START_HO]; } /* * PARAMETERS @@ -960,7 +961,7 @@ struct osm_mgrp *osm_get_mgrp_by_mlid(osm_subn_t const *p_subn, ib_net16_t mlid) * [in] The multicast group mlid in network order * * RETURN VALUES -* The multicast group structure pointer if found. NULL otherwise. +* The multicast group box structure pointer if found. NULL otherwise. 
*********/ /****f* OpenSM: Helper/osm_get_physp_by_mad_addr @@ -1116,5 +1117,33 @@ int osm_subn_write_conf_file(char *file_name, IN osm_subn_opt_t * const p_opt); *********/ int osm_subn_verify_config(osm_subn_opt_t * const p_opt); +ib_net16_t osm_mgrp_box_get_mlid(IN struct osm_mgrp_box *p_mgrp_box); + +/****f* OpenSM: Subnet/osm_mgrp_box_get_mlid_by_mgid +* NAME +* osm_mgrp_box_get_mlid_by_mgid +* +* DESCRIPTION +* The looks for multicast group by mgid. Returns mlid of found group +* or 0 if no group found. +* NOTE: this code is not thread safe. Need to grab the lock before +* calling it. +* +* SYNOPSIS +*/ +ib_net16_t osm_mgrp_box_get_mlid_by_mgid(IN osm_subn_t const *p_subn, + IN const ib_gid_t * const p_mgid); +/* +* PARAMETERS +* p_subn +* [in] Pointer to an osm_subn_t object +* +* p_mgid +* [in] Pointer to multicast group mgid +* +* RETURN VALUES +* The multicast group mlid if found. 0 otherwise. +*********/ + END_C_DECLS #endif /* _OSM_SUBNET_H_ */ diff --git a/opensm/opensm/osm_drop_mgr.c b/opensm/opensm/osm_drop_mgr.c index 4f98cc9..c86ee72 100644 --- a/opensm/opensm/osm_drop_mgr.c +++ b/opensm/opensm/osm_drop_mgr.c @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2008 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * Copyright (c) 2008 Xsigo Systems Inc. All rights reserved. diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c index 3894677..4fbae91 100644 --- a/opensm/opensm/osm_mcast_mgr.c +++ b/opensm/opensm/osm_mcast_mgr.c @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2009 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * Copyright (c) 2008 Xsigo Systems Inc. 
All rights reserved. @@ -111,14 +111,14 @@ static void mcast_mgr_purge_tree_node(IN osm_mtree_node_t * p_mtn) /********************************************************************** **********************************************************************/ -static void mcast_mgr_purge_tree(osm_sm_t * sm, IN osm_mgrp_t * p_mgrp) +static void mcast_mgr_purge_tree(osm_sm_t * sm, IN osm_mgrp_box_t * p_mgrp_box) { OSM_LOG_ENTER(sm->p_log); - if (p_mgrp->p_root) - mcast_mgr_purge_tree_node(p_mgrp->p_root); + if (p_mgrp_box->p_root) + mcast_mgr_purge_tree_node(p_mgrp_box->p_root); - p_mgrp->p_root = NULL; + p_mgrp_box->p_root = NULL; OSM_LOG_EXIT(sm->p_log); } @@ -126,28 +126,26 @@ static void mcast_mgr_purge_tree(osm_sm_t * sm, IN osm_mgrp_t * p_mgrp) /********************************************************************** **********************************************************************/ static float osm_mcast_mgr_compute_avg_hops(osm_sm_t * sm, - const osm_mgrp_t * p_mgrp, + const osm_mgrp_box_t * p_mgrp_box, const osm_switch_t * p_sw) { float avg_hops = 0; uint32_t hops = 0; uint32_t num_ports = 0; - const osm_mcm_port_t *p_mcm_port; - const cl_qmap_t *p_mcm_tbl; + const osm_mgrp_port_t *p_box_port; OSM_LOG_ENTER(sm->p_log); - p_mcm_tbl = &p_mgrp->mcm_port_tbl; /* For each member of the multicast group, compute the number of hops to its base LID. 
*/ - for (p_mcm_port = (osm_mcm_port_t *) cl_qmap_head(p_mcm_tbl); - p_mcm_port != (osm_mcm_port_t *) cl_qmap_end(p_mcm_tbl); - p_mcm_port = - (osm_mcm_port_t *) cl_qmap_next(&p_mcm_port->map_item)) { - hops += osm_switch_get_port_least_hops(p_sw, p_mcm_port->port); + for (p_box_port = (osm_mgrp_port_t *) cl_qmap_head(&p_mgrp_box->mgrp_port_map); + p_box_port != (osm_mgrp_port_t *) cl_qmap_end(&p_mgrp_box->mgrp_port_map); + p_box_port = + (osm_mgrp_port_t *) cl_qmap_next(&p_box_port->guid_item)) { + hops += osm_switch_get_port_least_hops(p_sw, p_box_port->p_port); num_ports++; } @@ -168,27 +166,27 @@ static float osm_mcast_mgr_compute_avg_hops(osm_sm_t * sm, of the group HCAs **********************************************************************/ static float osm_mcast_mgr_compute_max_hops(osm_sm_t * sm, - const osm_mgrp_t * p_mgrp, + const osm_mgrp_box_t * p_mgrp_box, const osm_switch_t * p_sw) { uint32_t max_hops = 0; uint32_t hops = 0; - const osm_mcm_port_t *p_mcm_port; - const cl_qmap_t *p_mcm_tbl; + const osm_mgrp_port_t *p_box_port; + const cl_qmap_t *p_box_port_tbl; OSM_LOG_ENTER(sm->p_log); - p_mcm_tbl = &p_mgrp->mcm_port_tbl; + p_box_port_tbl = &p_mgrp_box->mgrp_port_map; /* For each member of the multicast group, compute the number of hops to its base LID. */ - for (p_mcm_port = (osm_mcm_port_t *) cl_qmap_head(p_mcm_tbl); - p_mcm_port != (osm_mcm_port_t *) cl_qmap_end(p_mcm_tbl); - p_mcm_port = - (osm_mcm_port_t *) cl_qmap_next(&p_mcm_port->map_item)) { - hops = osm_switch_get_port_least_hops(p_sw, p_mcm_port->port); + for (p_box_port = (osm_mgrp_port_t *) cl_qmap_head(p_box_port_tbl); + p_box_port != (osm_mgrp_port_t *) cl_qmap_end(p_box_port_tbl); + p_box_port = + (osm_mgrp_port_t *) cl_qmap_next(&p_box_port->guid_item)) { + hops = osm_switch_get_port_least_hops(p_sw, p_box_port->p_port); if (hops > max_hops) max_hops = hops; } @@ -210,7 +208,7 @@ static float osm_mcast_mgr_compute_max_hops(osm_sm_t * sm, of the multicast group. 
**********************************************************************/ static osm_switch_t *mcast_mgr_find_optimal_switch(osm_sm_t * sm, - const osm_mgrp_t * p_mgrp) + const osm_mgrp_box_t * p_mgrp_box) { cl_qmap_t *p_sw_tbl; const osm_switch_t *p_sw; @@ -227,7 +225,7 @@ static osm_switch_t *mcast_mgr_find_optimal_switch(osm_sm_t * sm, p_sw_tbl = &sm->p_subn->sw_guid_tbl; - CL_ASSERT(!osm_mgrp_is_empty(p_mgrp)); + CL_ASSERT(!osm_mgrp_is_empty(p_mgrp_box)); for (p_sw = (osm_switch_t *) cl_qmap_head(p_sw_tbl); p_sw != (osm_switch_t *) cl_qmap_end(p_sw_tbl); @@ -236,9 +234,9 @@ static osm_switch_t *mcast_mgr_find_optimal_switch(osm_sm_t * sm, continue; if (use_avg_hops) - hops = osm_mcast_mgr_compute_avg_hops(sm, p_mgrp, p_sw); + hops = osm_mcast_mgr_compute_avg_hops(sm, p_mgrp_box, p_sw); else - hops = osm_mcast_mgr_compute_max_hops(sm, p_mgrp, p_sw); + hops = osm_mcast_mgr_compute_max_hops(sm, p_mgrp_box, p_sw); OSM_LOG(sm->p_log, OSM_LOG_DEBUG, "Switch 0x%016" PRIx64 ", hops = %f\n", @@ -267,7 +265,7 @@ static osm_switch_t *mcast_mgr_find_optimal_switch(osm_sm_t * sm, This function returns the existing or optimal root swtich for the tree. **********************************************************************/ static osm_switch_t *mcast_mgr_find_root_switch(osm_sm_t * sm, - const osm_mgrp_t * p_mgrp) + const osm_mgrp_box_t * p_mgrp_box) { const osm_switch_t *p_sw = NULL; @@ -279,7 +277,7 @@ static osm_switch_t *mcast_mgr_find_root_switch(osm_sm_t * sm, the root will be always on the first switch attached to it. - Very bad ... */ - p_sw = mcast_mgr_find_optimal_switch(sm, p_mgrp); + p_sw = mcast_mgr_find_optimal_switch(sm, p_mgrp_box); OSM_LOG_EXIT(sm->p_log); return (osm_switch_t *) p_sw; @@ -354,7 +352,7 @@ static int mcast_mgr_set_mft_block(osm_sm_t * sm, IN osm_switch_t * p_sw, spanning tree that eminate from this switch. On input, the p_list contains the group members that must be routed from this switch. 
**********************************************************************/ -static void mcast_mgr_subdivide(osm_sm_t * sm, osm_mgrp_t * p_mgrp, +static void mcast_mgr_subdivide(osm_sm_t * sm, osm_mgrp_box_t * p_mgrp_box, osm_switch_t * p_sw, cl_qlist_t * p_list, cl_qlist_t * list_array, uint8_t array_size) { @@ -365,7 +363,7 @@ static void mcast_mgr_subdivide(osm_sm_t * sm, osm_mgrp_t * p_mgrp, OSM_LOG_ENTER(sm->p_log); - mlid_ho = cl_ntoh16(osm_mgrp_get_mlid(p_mgrp)); + mlid_ho = cl_ntoh16(osm_mgrp_box_get_mlid(p_mgrp_box)); /* For Multicast Groups, we want not to count on previous @@ -455,7 +453,7 @@ static void mcast_mgr_purge_list(osm_sm_t * sm, cl_qlist_t * p_list) The function returns the newly created mtree node element. **********************************************************************/ -static osm_mtree_node_t *mcast_mgr_branch(osm_sm_t * sm, osm_mgrp_t * p_mgrp, +static osm_mtree_node_t *mcast_mgr_branch(osm_sm_t * sm, osm_mgrp_box_t * p_mgrp_box, osm_switch_t * p_sw, cl_qlist_t * p_list, uint8_t depth, uint8_t upstream_port, @@ -481,7 +479,7 @@ static osm_mtree_node_t *mcast_mgr_branch(osm_sm_t * sm, osm_mgrp_t * p_mgrp, node_guid = osm_node_get_node_guid(p_sw->p_node); node_guid_ho = cl_ntoh64(node_guid); - mlid_ho = cl_ntoh16(osm_mgrp_get_mlid(p_mgrp)); + mlid_ho = cl_ntoh16(osm_mgrp_box_get_mlid(p_mgrp_box)); OSM_LOG(sm->p_log, OSM_LOG_VERBOSE, "Routing MLID 0x%X through switch 0x%" PRIx64 @@ -558,7 +556,7 @@ static osm_mtree_node_t *mcast_mgr_branch(osm_sm_t * sm, osm_mgrp_t * p_mgrp, for (i = 0; i < max_children; i++) cl_qlist_init(&list_array[i]); - mcast_mgr_subdivide(sm, p_mgrp, p_sw, p_list, list_array, max_children); + mcast_mgr_subdivide(sm, p_mgrp_box, p_sw, p_list, list_array, max_children); p_tbl = osm_switch_get_mcast_tbl_ptr(p_sw); @@ -641,7 +639,7 @@ static osm_mtree_node_t *mcast_mgr_branch(osm_sm_t * sm, osm_mgrp_t * p_mgrp, CL_ASSERT(p_remote_physp); p_mtn->child_array[i] = - mcast_mgr_branch(sm, p_mgrp, p_remote_node->sw, + 
mcast_mgr_branch(sm, p_mgrp_box, p_remote_node->sw, p_port_list, depth, osm_physp_get_port_num (p_remote_physp), p_max_depth); @@ -677,11 +675,10 @@ Exit: /********************************************************************** **********************************************************************/ static ib_api_status_t mcast_mgr_build_spanning_tree(osm_sm_t * sm, - osm_mgrp_t * p_mgrp) + osm_mgrp_box_t * p_mgrp_box) { - const cl_qmap_t *p_mcm_tbl; - const osm_mcm_port_t *p_mcm_port; uint32_t num_ports; + const osm_mgrp_port_t *p_mgrp_port; cl_qlist_t port_list; osm_switch_t *p_sw; osm_mcast_work_obj_t *p_wobj; @@ -699,14 +696,13 @@ static ib_api_status_t mcast_mgr_build_spanning_tree(osm_sm_t * sm, on multicast forwarding table information if the user wants to preserve existing multicast routes. */ - mcast_mgr_purge_tree(sm, p_mgrp); + mcast_mgr_purge_tree(sm, p_mgrp_box); - p_mcm_tbl = &p_mgrp->mcm_port_tbl; - num_ports = cl_qmap_count(p_mcm_tbl); + num_ports = cl_qmap_count(&p_mgrp_box->mgrp_port_map); if (num_ports == 0) { OSM_LOG(sm->p_log, OSM_LOG_VERBOSE, "MLID 0x%X has no members - nothing to do\n", - cl_ntoh16(osm_mgrp_get_mlid(p_mgrp))); + cl_ntoh16(osm_mgrp_box_get_mlid(p_mgrp_box))); goto Exit; } @@ -726,11 +722,11 @@ static ib_api_status_t mcast_mgr_build_spanning_tree(osm_sm_t * sm, Locate the switch around which to create the spanning tree for this multicast group. */ - p_sw = mcast_mgr_find_root_switch(sm, p_mgrp); + p_sw = mcast_mgr_find_root_switch(sm, p_mgrp_box); if (p_sw == NULL) { OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A08: " "Unable to locate a suitable switch for group 0x%X\n", - cl_ntoh16(osm_mgrp_get_mlid(p_mgrp))); + cl_ntoh16(osm_mgrp_box_get_mlid(p_mgrp_box))); status = IB_ERROR; goto Exit; } @@ -738,20 +734,20 @@ static ib_api_status_t mcast_mgr_build_spanning_tree(osm_sm_t * sm, /* Build the first "subset" containing all member ports. 
*/ - for (p_mcm_port = (osm_mcm_port_t *) cl_qmap_head(p_mcm_tbl); - p_mcm_port != (osm_mcm_port_t *) cl_qmap_end(p_mcm_tbl); - p_mcm_port = - (osm_mcm_port_t *) cl_qmap_next(&p_mcm_port->map_item)) { + for (p_mgrp_port = (osm_mgrp_port_t *) cl_qmap_head(&p_mgrp_box->mgrp_port_map); + p_mgrp_port != (osm_mgrp_port_t *) cl_qmap_end(&p_mgrp_box->mgrp_port_map); + p_mgrp_port = + (osm_mgrp_port_t *) cl_qmap_next(&p_mgrp_port->guid_item)) { /* Acquire the port object for this port guid, then create the new worker object to build the list. */ - p_wobj = mcast_work_obj_new(p_mcm_port->port); + p_wobj = mcast_work_obj_new(p_mgrp_port->p_port); if (p_wobj == NULL) { OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A10: " "Insufficient memory to route port 0x%016" PRIx64 "\n", - cl_ntoh64(osm_port_get_guid(p_mcm_port->port))); + cl_ntoh64(p_mgrp_port->p_port->guid)); continue; } @@ -759,12 +755,12 @@ static ib_api_status_t mcast_mgr_build_spanning_tree(osm_sm_t * sm, } count = cl_qlist_count(&port_list); - p_mgrp->p_root = mcast_mgr_branch(sm, p_mgrp, p_sw, &port_list, 0, 0, + p_mgrp_box->p_root = mcast_mgr_branch(sm, p_mgrp_box, p_sw, &port_list, 0, 0, &max_depth); OSM_LOG(sm->p_log, OSM_LOG_VERBOSE, "Configured MLID 0x%X for %u ports, max tree depth = %u\n", - cl_ntoh16(osm_mgrp_get_mlid(p_mgrp)), count, max_depth); + cl_ntoh16(osm_mgrp_box_get_mlid(p_mgrp_box)), count, max_depth); Exit: OSM_LOG_EXIT(sm->p_log); @@ -971,7 +967,7 @@ Exit: static ib_api_status_t mcast_mgr_process_mlid(osm_sm_t * sm, uint16_t mlid) { ib_api_status_t status = IB_SUCCESS; - osm_mgrp_t *mgrp; + osm_mgrp_box_t *p_mgrp_box; OSM_LOG_ENTER(sm->p_log); @@ -983,9 +979,9 @@ static ib_api_status_t mcast_mgr_process_mlid(osm_sm_t * sm, uint16_t mlid) port in the group. 
*/ mcast_mgr_clear(sm, mlid); - mgrp = osm_get_mgrp_by_mlid(sm->p_subn, cl_hton16(mlid)); - if (mgrp) { - status = mcast_mgr_build_spanning_tree(sm, mgrp); + p_mgrp_box = osm_get_mgrp_box_by_mlid(sm->p_subn, cl_hton16(mlid)); + if (p_mgrp_box) { + status = mcast_mgr_build_spanning_tree(sm, p_mgrp_box); if (status != IB_SUCCESS) OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A17: " "Unable to create spanning tree (%s) for mlid " @@ -1065,7 +1061,7 @@ int osm_mcast_mgr_process(osm_sm_t * sm) for (i = 0; i <= sm->p_subn->max_mcast_lid_ho - IB_LID_MCAST_START_HO; i++) - if (sm->p_subn->mgroups[i] || sm->mlids_req[i]) + if (sm->p_subn->mboxes[i] || sm->mlids_req[i]) mcast_mgr_process_mlid(sm, i + IB_LID_MCAST_START_HO); memset(sm->mlids_req, 0, sm->mlids_req_max); diff --git a/opensm/opensm/osm_multicast.c b/opensm/opensm/osm_multicast.c index 5a10003..01c90d8 100644 --- a/opensm/opensm/osm_multicast.c +++ b/opensm/opensm/osm_multicast.c @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. 
* @@ -51,6 +51,18 @@ #include #include +static osm_mgrp_port_t *osm_mgrp_port_new(osm_port_t *p_port) +{ + osm_mgrp_port_t *p_mgrp_port = + (osm_mgrp_port_t *) malloc(sizeof(osm_mgrp_port_t)); + if (!p_mgrp_port) { + return NULL; + } + memset(p_mgrp_port, 0, sizeof(*p_mgrp_port)); + p_mgrp_port->p_port = p_port; + return p_mgrp_port; +} + /********************************************************************** **********************************************************************/ void osm_mgrp_delete(IN osm_mgrp_t * p_mgrp) @@ -69,8 +81,6 @@ void osm_mgrp_delete(IN osm_mgrp_t * p_mgrp) (osm_mcm_port_t *) cl_qmap_next(&p_mcm_port->map_item); osm_mcm_port_delete(p_mcm_port); } - /* destroy the mtree_node structure */ - osm_mtree_destroy(p_mgrp->p_root); free(p_mgrp); } @@ -99,8 +109,6 @@ void osm_mgrp_cleanup(osm_subn_t * subn, osm_mgrp_t * mgrp) if (mgrp->full_members) return; - osm_mtree_destroy(mgrp->p_root); - mgrp->p_root = NULL; while (cl_qmap_count(&mgrp->mcm_port_tbl)) { mcm_port = (osm_mcm_port_t *)cl_qmap_head(&mgrp->mcm_port_tbl); @@ -114,7 +122,6 @@ void osm_mgrp_cleanup(osm_subn_t * subn, osm_mgrp_t * mgrp) return; cl_fmap_remove_item(&subn->mgrp_mgid_tbl, &mgrp->map_item); - subn->mgroups[cl_ntoh16(mgrp->mlid) - IB_LID_MCAST_START_HO] = NULL; free(mgrp); } @@ -157,6 +164,7 @@ osm_mcm_port_t *osm_mgrp_add_port(IN osm_subn_t * subn, osm_log_t * log, cl_map_item_t *prev_item; uint8_t prev_join_state = 0, join_state = mcmr->scope_state; uint8_t prev_scope; + osm_mgrp_box_t *p_mgrp_box; if (osm_log_is_active(log, OSM_LOG_VERBOSE)) { char gid_str[INET6_ADDRSTRLEN]; @@ -193,7 +201,20 @@ osm_mcm_port_t *osm_mgrp_add_port(IN osm_subn_t * subn, osm_log_t * log, prev_join_state | join_state); } else { cl_qlist_insert_tail(&port->mcm_list, &mcm_port->list_item); - osm_sm_reroute_mlid(&subn->p_osm->sm, mgrp->mlid); + p_mgrp_box = osm_get_mgrp_box_by_mlid(subn, mgrp->mlid); + osm_mgrp_port_t *p_mgrp_port = (osm_mgrp_port_t *) + cl_qmap_get(&p_mgrp_box->mgrp_port_map, 
ib_gid_get_guid(&mcm_port->port_gid)); + if (p_mgrp_port == + (osm_mgrp_port_t *) cl_qmap_end(&p_mgrp_box->mgrp_port_map)) { + /* new port to mlid */ + p_mgrp_port = osm_mgrp_port_new(mcm_port->port); + if (!p_mgrp_port) { + return NULL; + } + cl_qmap_insert(&p_mgrp_box->mgrp_port_map, + ib_gid_get_guid(&mcm_port->port_gid), &p_mgrp_port->guid_item); + } + osm_sm_reroute_mlid(&subn->p_osm->sm, p_mgrp_box->mlid); } /* o15.0.1.11: copy the join state */ @@ -214,6 +235,7 @@ void osm_mgrp_remove_port(osm_subn_t * subn, osm_log_t * log, osm_mgrp_t * mgrp, { uint8_t join_state = mcmr->scope_state & 0xf; uint8_t port_join_state, new_join_state; + osm_mgrp_box_t *p_mgrp_box; /* * according to the same o15-0.1.14 we get the stored @@ -222,6 +244,7 @@ void osm_mgrp_remove_port(osm_subn_t * subn, osm_log_t * log, osm_mgrp_t * mgrp, */ port_join_state = mcm_port->scope_state & 0x0F; new_join_state = port_join_state & ~join_state; + p_mgrp_box = osm_get_mgrp_box_by_mlid(subn, mgrp->mlid); if (osm_log_is_active(log, OSM_LOG_VERBOSE)) { char gid_str[INET6_ADDRSTRLEN]; @@ -242,14 +265,27 @@ void osm_mgrp_remove_port(osm_subn_t * subn, osm_log_t * log, osm_mgrp_t * mgrp, port_join_state, new_join_state); mcmr->scope_state = mcm_port->scope_state; } else { + osm_mgrp_port_t *p_mgrp_port; mcmr->scope_state = mcm_port->scope_state; OSM_LOG(log, OSM_LOG_DEBUG, "removing port 0x%" PRIx64 "\n", cl_ntoh64(mcm_port->port->guid)); cl_qmap_remove_item(&mgrp->mcm_port_tbl, &mcm_port->map_item); cl_qlist_remove_item(&mcm_port->port->mcm_list, &mcm_port->list_item); + p_mgrp_port = (osm_mgrp_port_t *) + cl_qmap_get(&p_mgrp_box->mgrp_port_map, mcm_port->port->guid); + if (p_mgrp_port != + (osm_mgrp_port_t *) cl_qmap_end(&p_mgrp_box->mgrp_port_map)) { + p_mgrp_port->num_groups--; + if (0 == p_mgrp_port->num_groups) { + /* No mgroups registered on this port for current mlid */ + cl_qmap_remove_item(&p_mgrp_box->mgrp_port_map, + &p_mgrp_port->guid_item); + free(p_mgrp_port); + } + } 
osm_mcm_port_delete(mcm_port); - osm_sm_reroute_mlid(&subn->p_osm->sm, mgrp->mlid); + osm_sm_reroute_mlid(&subn->p_osm->sm, p_mgrp_box->mlid); } /* no more full members so the group will be deleted after re-route @@ -258,6 +294,12 @@ void osm_mgrp_remove_port(osm_subn_t * subn, osm_log_t * log, osm_mgrp_t * mgrp, !(new_join_state & IB_JOIN_STATE_FULL) && --mgrp->full_members == 0) { mgrp_send_notice(subn, log, mgrp, 67); + cl_qlist_remove_item(&p_mgrp_box->mgrp_list, &mgrp->box_item); + if (0 == cl_qlist_count(&p_mgrp_box->mgrp_list)) { + /* empty mgrp_box */ + osm_mgrp_box_delete(subn,p_mgrp_box->mlid); + } + osm_mgrp_cleanup(subn, mgrp); } } @@ -266,8 +308,16 @@ void osm_mgrp_delete_port(osm_subn_t * subn, osm_log_t * log, osm_mgrp_t * mgrp, ib_net64_t port_guid) { ib_member_rec_t mcmrec; - cl_map_item_t *item = cl_qmap_get(&mgrp->mcm_port_tbl, port_guid); + osm_mgrp_box_t *p_mgrp_box; + osm_mgrp_port_t *p_mgrp_port; + cl_map_item_t *item = cl_qmap_get(&mgrp->mcm_port_tbl, port_guid); + p_mgrp_box = osm_get_mgrp_box_by_mlid(subn, mgrp->mlid); + p_mgrp_port = (osm_mgrp_port_t *) + cl_qmap_remove(&p_mgrp_box->mgrp_port_map, port_guid); + if (p_mgrp_port != (osm_mgrp_port_t *)cl_qmap_end(&p_mgrp_box->mgrp_port_map)) { + free(p_mgrp_port); + } if (item != cl_qmap_end(&mgrp->mcm_port_tbl)) { mcmrec.scope_state = 0xf; osm_mgrp_remove_port(subn, log, mgrp, (osm_mcm_port_t *) item, @@ -296,3 +346,43 @@ boolean_t osm_mgrp_is_port_present(IN const osm_mgrp_t * p_mgrp, *pp_mcm_port = NULL; return FALSE; } + +/********************************************************************** + **********************************************************************/ +osm_mgrp_box_t *osm_mgrp_box_new(IN osm_subn_t * p_subn,ib_net16_t mlid) +{ + osm_mgrp_box_t *p_mgrp_box; + p_mgrp_box = + p_subn->mboxes[cl_ntoh16(mlid) - IB_LID_MCAST_START_HO] = + (osm_mgrp_box_t *) calloc(1,sizeof(*p_mgrp_box)); + if (!p_mgrp_box) + return NULL; + p_mgrp_box->mlid = mlid; + 
cl_qmap_init(&p_mgrp_box->mgrp_port_map); + cl_qlist_init(&p_mgrp_box->mgrp_list); + return p_mgrp_box; +} + +/********************************************************************** + **********************************************************************/ +void osm_mgrp_box_delete(IN osm_subn_t *p_subn, ib_net16_t mlid) +{ + osm_mgrp_port_t *p_osm_mgr_port; + cl_map_item_t *p_item; + osm_mgrp_box_t *p_mgrp_box = + p_subn->mboxes[cl_ntoh16(mlid) - IB_LID_MCAST_START_HO]; + p_item = cl_qmap_head(&p_mgrp_box->mgrp_port_map); + /* Delete ports shared same MLID */ + while (p_item != cl_qmap_end(&p_mgrp_box->mgrp_port_map)) { + p_osm_mgr_port = (osm_mgrp_port_t *) p_item; + cl_qmap_remove_item(&p_mgrp_box->mgrp_port_map, p_item); + p_item = cl_qmap_head(&p_mgrp_box->mgrp_port_map); + free(p_osm_mgr_port); + } + /* Remove mgrp from this MLID */ + cl_qlist_remove_all(&p_mgrp_box->mgrp_list); + /* Destroy the mtree_node structure */ + osm_mtree_destroy(p_mgrp_box->p_root); + p_subn->mboxes[cl_ntoh16(mlid) - IB_LID_MCAST_START_HO] = NULL; + free(p_mgrp_box); +} diff --git a/opensm/opensm/osm_qos_policy.c b/opensm/opensm/osm_qos_policy.c index 9b72293..6c0a1e6 100644 --- a/opensm/opensm/osm_qos_policy.c +++ b/opensm/opensm/osm_qos_policy.c @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * Copyright (c) 2008 Xsigo Systems Inc. All rights reserved. 
@@ -773,6 +773,8 @@ static void __qos_policy_validate_pkey( uint32_t flow; uint8_t hop; osm_mgrp_t * p_mgrp; + osm_mgrp_box_t * p_mgrp_box; + cl_list_item_t *p_item; if (!p_qos_policy || !p_qos_match_rule || !p_prtn) return; @@ -796,28 +798,33 @@ static void __qos_policy_validate_pkey( if (!p_prtn->mlid) return; - p_mgrp = osm_get_mgrp_by_mlid(p_qos_policy->p_subn, p_prtn->mlid); - if (!p_mgrp) { + p_mgrp_box = osm_get_mgrp_box_by_mlid(p_qos_policy->p_subn, p_prtn->mlid); + if (!p_mgrp_box) { OSM_LOG(&p_qos_policy->p_subn->p_osm->log, OSM_LOG_ERROR, - "ERR AC16: MCast group for partition with " + "ERR AC16: MCast group box for partition with " "pkey 0x%04X not found\n", cl_ntoh16(p_prtn->pkey)); return; } - CL_ASSERT((cl_ntoh16(p_mgrp->mcmember_rec.pkey) & 0x7fff) == - (cl_ntoh16(p_prtn->pkey) & 0x7fff)); - - ib_member_get_sl_flow_hop(p_mgrp->mcmember_rec.sl_flow_hop, - &sl, &flow, &hop); - if (sl != p_prtn->sl) { - OSM_LOG(&p_qos_policy->p_subn->p_osm->log, OSM_LOG_DEBUG, - "Updating MCGroup (MLID 0x%04x) SL to " - "match partition SL (%u)\n", - cl_hton16(p_mgrp->mcmember_rec.mlid), - p_prtn->sl); - p_mgrp->mcmember_rec.sl_flow_hop = + p_item = cl_qlist_head(&p_mgrp_box->mgrp_list); + while (p_item != cl_qlist_end(&p_mgrp_box->mgrp_list)) { + p_mgrp = (osm_mgrp_t *) PARENT_STRUCT(p_item, osm_mgrp_t, + box_item); + p_item = cl_qlist_next(p_item); + CL_ASSERT((cl_ntoh16(p_mgrp->mcmember_rec.pkey) & 0x7fff) == + (cl_ntoh16(p_prtn->pkey) & 0x7fff)); + ib_member_get_sl_flow_hop(p_mgrp->mcmember_rec.sl_flow_hop, + &sl, &flow, &hop); + if (sl != p_prtn->sl) { + OSM_LOG(&p_qos_policy->p_subn->p_osm->log, OSM_LOG_DEBUG, + "Updating MCGroup (MLID 0x%04x) SL to " + "match partition SL (%u)\n", + cl_hton16(p_mgrp->mcmember_rec.mlid), + p_prtn->sl); + p_mgrp->mcmember_rec.sl_flow_hop = ib_member_set_sl_flow_hop(p_prtn->sl, flow, hop); + } } } diff --git a/opensm/opensm/osm_sa.c b/opensm/opensm/osm_sa.c index 02737c2..a5d8945 100644 --- a/opensm/opensm/osm_sa.c +++ 
b/opensm/opensm/osm_sa.c @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * Copyright (c) 2008 Xsigo Systems Inc. All rights reserved. @@ -706,18 +706,17 @@ static void sa_dump_all_sa(osm_opensm_t * p_osm, FILE * file) { struct opensm_dump_context dump_context; osm_mgrp_t *p_mgrp; - int i; dump_context.p_osm = p_osm; dump_context.file = file; OSM_LOG(&p_osm->log, OSM_LOG_DEBUG, "Dump multicast\n"); cl_plock_acquire(&p_osm->lock); - for (i = 0; i <= p_osm->subn.max_mcast_lid_ho - IB_LID_MCAST_START_HO; - i++) { - p_mgrp = p_osm->subn.mgroups[i]; - if (p_mgrp) - sa_dump_one_mgrp(p_mgrp, &dump_context); + p_mgrp = (osm_mgrp_t*)cl_fmap_head(&p_osm->subn.mgrp_mgid_tbl); + while (p_mgrp != (osm_mgrp_t*)cl_fmap_end(&p_osm->subn.mgrp_mgid_tbl)) { + sa_dump_one_mgrp(p_mgrp, &dump_context); + p_mgrp = (osm_mgrp_t*) cl_fmap_next(&p_mgrp->map_item); } + OSM_LOG(&p_osm->log, OSM_LOG_DEBUG, "Dump inform\n"); cl_qlist_apply_func(&p_osm->subn.sa_infr_list, sa_dump_one_inform, &dump_context); @@ -740,22 +739,15 @@ static osm_mgrp_t *load_mcgroup(osm_opensm_t * p_osm, ib_net16_t mlid, unsigned well_known) { ib_net64_t comp_mask; - osm_mgrp_t *p_mgrp; + osm_mgrp_t *p_mgrp = NULL; + cl_fmap_item_t *p_fitem; cl_plock_excl_acquire(&p_osm->lock); - p_mgrp = osm_get_mgrp_by_mlid(&p_osm->subn, mlid); - if (p_mgrp) { - if (!memcmp(&p_mgrp->mcmember_rec.mgid, &p_mcm_rec->mgid, - sizeof(ib_gid_t))) { - OSM_LOG(&p_osm->log, OSM_LOG_DEBUG, - "mgrp %04x is already here.", cl_ntoh16(mlid)); - goto _out; - } - OSM_LOG(&p_osm->log, OSM_LOG_VERBOSE, - "mlid %04x is already used by another MC group. 
Will " - "request clients reregistration.\n", cl_ntoh16(mlid)); - p_mgrp = NULL; + p_fitem = cl_fmap_get(&p_osm->subn.mgrp_mgid_tbl, &p_mcm_rec->mgid); + if (p_fitem != cl_fmap_end(&p_osm->subn.mgrp_mgid_tbl)) { + OSM_LOG(&p_osm->log, OSM_LOG_DEBUG, + "mgrp %04x is already here.", cl_ntoh16(mlid)); goto _out; } diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c index 8f7816b..b39f986 100644 --- a/opensm/opensm/osm_sa_mcmember_record.c +++ b/opensm/opensm/osm_sa_mcmember_record.c @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * Copyright (c) 2008 Xsigo Systems Inc. All rights reserved. @@ -121,12 +121,12 @@ static ib_net16_t get_new_mlid(osm_sa_t * sa, ib_net16_t requested_mlid) if (requested_mlid && cl_ntoh16(requested_mlid) >= IB_LID_MCAST_START_HO && cl_ntoh16(requested_mlid) <= p_subn->max_mcast_lid_ho - && !osm_get_mgrp_by_mlid(p_subn, requested_mlid)) + && !osm_get_mgrp_box_by_mlid(p_subn, requested_mlid)) return requested_mlid; max = p_subn->max_mcast_lid_ho - IB_LID_MCAST_START_HO + 1; for (i = 0; i < max; i++) - if (!sa->p_subn->mgroups[i]) + if (!sa->p_subn->mboxes[i]) return cl_hton16(i + IB_LID_MCAST_START_HO); return 0; @@ -730,10 +730,11 @@ ib_api_status_t osm_mcmr_rcv_create_new_mgrp(IN osm_sa_t * sa, IN const osm_physp_t * p_physp, OUT osm_mgrp_t ** pp_mgrp) { - ib_net16_t mlid; + ib_net16_t mlid,existed_mlid; unsigned zero_mgid, i; uint8_t scope; ib_gid_t *p_mgid; + osm_mgrp_box_t *p_mgrp_box; ib_api_status_t status = IB_SUCCESS; ib_member_rec_t mcm_rec = *p_recvd_mcmember_rec; /* copy for modifications */ @@ -811,6 +812,10 @@ ib_api_status_t osm_mcmr_rcv_create_new_mgrp(IN osm_sa_t * sa, goto Exit; } + /* check if there is mgrp_box matched to requested mgid */ + if (0 != 
(existed_mlid = osm_mgrp_box_get_mlid_by_mgid(sa->p_subn, p_mgid))) { + mlid = existed_mlid; + } /* create a new MC Group */ *pp_mgrp = osm_mgrp_new(mlid); if (*pp_mgrp == NULL) { @@ -833,11 +838,28 @@ ib_api_status_t osm_mcmr_rcv_create_new_mgrp(IN osm_sa_t * sa, (*pp_mgrp)->mcmember_rec.pkt_life &= 0x3f; (*pp_mgrp)->mcmember_rec.pkt_life |= 2 << 6; /* exactly */ + /* get mgrp_box for selected mlid */ + p_mgrp_box = osm_get_mgrp_box_by_mlid(sa->p_subn, mlid); + if (!p_mgrp_box) { + OSM_LOG(sa->p_log, OSM_LOG_DEBUG, + "Creating new mgrp_box for mlid:0x%04x\n", + cl_ntoh16(mlid)); + p_mgrp_box = osm_mgrp_box_new(sa->p_subn, mlid); + if (!p_mgrp_box) { + OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 1B08: " + "osm_mgrp_box_new failed\n"); + osm_mgrp_delete(*pp_mgrp); + free_mlid(sa, mlid); + status = IB_INSUFFICIENT_MEMORY; + goto Exit; + } + } + /* Insert the new group in the data base */ cl_fmap_insert(&sa->p_subn->mgrp_mgid_tbl, &(*pp_mgrp)->mcmember_rec.mgid, &(*pp_mgrp)->map_item); - sa->p_subn->mgroups[cl_ntoh16(mlid) - IB_LID_MCAST_START_HO] = *pp_mgrp; - + sa->p_subn->mboxes[cl_ntoh16(mlid) - IB_LID_MCAST_START_HO] = p_mgrp_box; + cl_qlist_insert_tail(&p_mgrp_box->mgrp_list, &(*pp_mgrp)->box_item); Exit: OSM_LOG_EXIT(sa->p_log); return status; @@ -1173,6 +1195,13 @@ static void mcmr_rcv_join_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) goto Exit; } + if (is_new_group) { + osm_mgrp_port_t *p_mgrp_port; + osm_mgrp_box_t *p_mgrp_box = osm_get_mgrp_box_by_mlid(sa->p_subn, p_mgrp->mlid); + p_mgrp_port = (osm_mgrp_port_t *) + cl_qmap_get(&p_mgrp_box->mgrp_port_map, portguid); + p_mgrp_port->num_groups++; + } /* Release the lock as we don't need it. 
*/ CL_PLOCK_RELEASE(sa->p_lock); @@ -1386,7 +1415,6 @@ static void mcmr_query_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) osm_physp_t *p_req_physp; boolean_t trusted_req; osm_mgrp_t *p_mgrp; - int i; OSM_LOG_ENTER(sa->p_log); @@ -1415,12 +1443,11 @@ static void mcmr_query_mgrp(IN osm_sa_t * sa, IN osm_madw_t * p_madw) CL_PLOCK_ACQUIRE(sa->p_lock); /* simply go over all MCGs and match */ - for (i = 0; i <= sa->p_subn->max_mcast_lid_ho - IB_LID_MCAST_START_HO; - i++) { - p_mgrp = sa->p_subn->mgroups[i]; - if (p_mgrp) - mcmr_by_comp_mask(sa, p_rcvd_rec, comp_mask, p_mgrp, - p_req_physp, trusted_req, &rec_list); + p_mgrp = (osm_mgrp_t *) cl_fmap_head(&sa->p_subn->mgrp_mgid_tbl); + while (p_mgrp != (osm_mgrp_t *) cl_fmap_end(&sa->p_subn->mgrp_mgid_tbl)) { + mcmr_by_comp_mask(sa, p_rcvd_rec, comp_mask, p_mgrp, + p_req_physp, trusted_req, &rec_list); + p_mgrp = (osm_mgrp_t *) cl_fmap_next(&p_mgrp->map_item); } CL_PLOCK_RELEASE(sa->p_lock); diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c index 75d9516..6a63092 100644 --- a/opensm/opensm/osm_sa_path_record.c +++ b/opensm/opensm/osm_sa_path_record.c @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2006 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * Copyright (c) 2008 Xsigo Systems Inc. All rights reserved. 
@@ -1433,12 +1433,13 @@ static void pr_rcv_process_pair(IN osm_sa_t * sa, IN const osm_madw_t * p_madw, /********************************************************************** **********************************************************************/ -static osm_mgrp_t *pr_get_mgrp(IN osm_sa_t * sa, IN const osm_madw_t * p_madw) +static osm_mgrp_box_t *pr_get_mgrp_box(IN osm_sa_t * sa, IN const osm_madw_t * p_madw) { ib_path_rec_t *p_pr; const ib_sa_mad_t *p_sa_mad; ib_net64_t comp_mask; osm_mgrp_t *mgrp = NULL; + osm_mgrp_box_t *mgrp_box = NULL; p_sa_mad = osm_madw_get_sa_mad_ptr(p_madw); p_pr = (ib_path_rec_t *) ib_sa_mad_get_payload_ptr(p_sa_mad); @@ -1454,6 +1455,8 @@ static osm_mgrp_t *pr_get_mgrp(IN osm_sa_t * sa, IN const osm_madw_t * p_madw) sizeof gid_str)); goto Exit; } + if (mgrp) + mgrp_box = osm_get_mgrp_box_by_mlid(sa->p_subn, mgrp->mlid); if (comp_mask & IB_PR_COMPMASK_DLID) { if (mgrp) { @@ -1465,18 +1468,18 @@ static osm_mgrp_t *pr_get_mgrp(IN osm_sa_t * sa, IN const osm_madw_t * p_madw) "MC group MLID 0x%x does not match " "PathRecord destination LID 0x%x\n", mgrp->mlid, p_pr->dlid); - mgrp = NULL; + mgrp_box = NULL; goto Exit; } } else - if (!(mgrp = osm_get_mgrp_by_mlid(sa->p_subn, p_pr->dlid))) + if (!(mgrp_box = osm_get_mgrp_box_by_mlid(sa->p_subn, p_pr->dlid))) OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 1F11: " "No MC group found for PathRecord " "destination LID 0x%x\n", p_pr->dlid); } Exit: - return mgrp; + return mgrp_box; } /********************************************************************** @@ -1691,20 +1694,31 @@ McastDest: OSM_LOG(sa->p_log, OSM_LOG_DEBUG, "Multicast destination requested\n"); { osm_mgrp_t *p_mgrp = NULL; - ib_api_status_t status; + ib_api_status_t status = IB_SUCCESS; osm_pr_item_t *p_pr_item; uint32_t flow_label; uint8_t sl; uint8_t hop_limit; + cl_list_item_t *p_item; + osm_mgrp_box_t *p_mgrp_box = NULL; /* First, get the MC info */ - p_mgrp = pr_get_mgrp(sa, p_madw); + p_mgrp_box = pr_get_mgrp_box(sa, p_madw); - if 
(!p_mgrp) + if (!p_mgrp_box) goto Unlock; /* Make sure the rest of the PathRecord matches the MC group attributes */ - status = pr_match_mgrp_attributes(sa, p_madw, p_mgrp); + for (p_item = cl_qlist_head(&p_mgrp_box->mgrp_list); + p_item != cl_qlist_end(&p_mgrp_box->mgrp_list); + p_item = cl_qlist_next(p_item)) { + p_mgrp = (osm_mgrp_t*)PARENT_STRUCT(p_item, osm_mgrp_t, + box_item); + status = pr_match_mgrp_attributes(sa, p_madw, p_mgrp); + if (status == IB_SUCCESS) + break; + } + if (status != IB_SUCCESS) { OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 1F19: " "MC group attributes don't match PathRecord request\n"); diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c index 8d63a75..61d766a 100644 --- a/opensm/opensm/osm_subnet.c +++ b/opensm/opensm/osm_subnet.c @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2008 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * Copyright (c) 2008 Xsigo Systems Inc. All rights reserved. 
@@ -430,6 +430,7 @@ void osm_subn_destroy(IN osm_subn_t * const p_subn) osm_prtn_t *p_prtn, *p_next_prtn; osm_mgrp_t *p_mgrp; osm_infr_t *p_infr, *p_next_infr; + osm_mgrp_box_t *p_mgrp_box; /* it might be a good idea to de-allocate all known objects */ p_next_node = (osm_node_t *) cl_qmap_head(&p_subn->node_guid_tbl); @@ -471,14 +472,19 @@ void osm_subn_destroy(IN osm_subn_t * const p_subn) osm_prtn_delete(&p_prtn); } - cl_fmap_remove_all(&p_subn->mgrp_mgid_tbl); for (i = 0; i <= p_subn->max_mcast_lid_ho - IB_LID_MCAST_START_HO; i++) { - p_mgrp = p_subn->mgroups[i]; - p_subn->mgroups[i] = NULL; - if (p_mgrp) - osm_mgrp_delete(p_mgrp); + p_mgrp_box = p_subn->mboxes[i]; + if (p_mgrp_box) + osm_mgrp_box_delete(p_subn, p_mgrp_box->mlid); + } + + p_mgrp = (osm_mgrp_t*)cl_fmap_head(&p_subn->mgrp_mgid_tbl); + while (p_mgrp != (osm_mgrp_t*)cl_fmap_end(&p_subn->mgrp_mgid_tbl)) { + cl_fmap_remove_item(&p_subn->mgrp_mgid_tbl, (cl_fmap_item_t*)p_mgrp); + osm_mgrp_delete(p_mgrp); + p_mgrp = (osm_mgrp_t*)cl_fmap_head(&p_subn->mgrp_mgid_tbl); } p_next_infr = (osm_infr_t *) cl_qlist_head(&p_subn->sa_infr_list); @@ -1655,3 +1661,18 @@ int osm_subn_write_conf_file(char *file_name, IN osm_subn_opt_t *const p_opts) return 0; } + +ib_net16_t osm_mgrp_box_get_mlid(IN struct osm_mgrp_box *p_mgrp_box) +{ + return (p_mgrp_box->mlid); +} + +ib_net16_t osm_mgrp_box_get_mlid_by_mgid(IN osm_subn_t const *p_subn, + IN const ib_gid_t * const p_mgid) +{ + osm_mgrp_t *p_mgrp = (osm_mgrp_t*)cl_fmap_get(&p_subn->mgrp_mgid_tbl, p_mgid); + if (p_mgrp != (osm_mgrp_t*)cl_fmap_end(&p_subn->mgrp_mgid_tbl)) { + return p_mgrp->mlid; + } + return 0; +} -- 1.6.3.3 From slavas at Voltaire.COM Tue Sep 29 06:54:12 2009 From: slavas at Voltaire.COM (Slava Strebkov) Date: Tue, 29 Sep 2009 15:54:12 +0200 Subject: [ofa-general] [PATCH 2/2 v4] opensm: Compression of multicast group according to pkey Message-ID: <4AC21184.4010103@Voltaire.COM> Additional data structure added: 1. 
Map of all partition keys opened in the fabric. 2. Map of all multicast group boxes sharing the same pkey. MLID assignment for multicast groups works in the usual manner, allocating a free entry for each newly created group. The proposed compression algorithm starts working only when there are no more free entries in the mlid array. The MLID for a new multicast group is then chosen from the pkey-indexed map according to the requested pkey: the MLID shared by the fewest ports is given to the newly created multicast group. Signed-off-by: Slava Strebkov --- opensm/include/opensm/osm_multicast.h | 135 ++++++++++++++++++++++++++++++++ opensm/include/opensm/osm_subnet.h | 36 +++++++++ opensm/opensm/osm_multicast.c | 103 ++++++++++++++++++++++++ opensm/opensm/osm_sa_mcmember_record.c | 28 ++++--- opensm/opensm/osm_subnet.c | 8 ++ 5 files changed, 299 insertions(+), 11 deletions(-) diff --git a/opensm/include/opensm/osm_multicast.h b/opensm/include/opensm/osm_multicast.h index d4daf4b..1ced651 100644 --- a/opensm/include/opensm/osm_multicast.h +++ b/opensm/include/opensm/osm_multicast.h @@ -148,6 +148,7 @@ typedef struct osm_mgrp_box { cl_qlist_t mgrp_list; ib_net16_t mlid; osm_mtree_node_t *p_root; + cl_map_item_t mlid_item; } osm_mgrp_box_t; /* * FIELDS @@ -167,6 +168,9 @@ typedef struct osm_mgrp_box { * for this multicast group. The nodes of the tree represent * switches. Member ports are not represented in the tree. * +* mlid_item +* Map item for linkage in the map of boxes sharing the same pkey. +* * SEE ALSO *********/ /****s* OpenSM: Multicast group Port /osm_mgrp_port_t @@ -498,5 +502,136 @@ void osm_mgrp_box_delete(IN osm_subn_t * p_subn, * SEE ALSO * *********/ + +/****f* OpenSM: Subnet/osm_mlid_pkey_delete +* NAME +* osm_mlid_pkey_delete +* +* DESCRIPTION +* Frees the objects. +* +* SYNOPSIS +*/ +void osm_mlid_pkey_delete(osm_mlid_pkey_t *p_mlid_pkey); +/* +* PARAMETERS +* p_mlid_pkey +* [in] Pointer to an osm_mlid_pkey_t object +* +* RETURN VALUES +* None. 
+* +* NOTES +* +* SEE ALSO +* osm_mlid_pkey_new +*********/ + +/****f* OpenSM: Subnet/osm_mlid_pkey_new +* NAME +* osm_mlid_pkey_new +* +* DESCRIPTION +* Creates new object of osm_mlid_pkey_t. +* +* SYNOPSIS +*/ +osm_mlid_pkey_t *osm_mlid_pkey_new(IN ib_net16_t pkey); +/* +* PARAMETERS +* pkey +* [in] Partition key for the object +* +* RETURN VALUES +* Pointer to osm_mlid_pkey_t, or NULL. +* +* SEE ALSO +* osm_mlid_pkey_delete +*********/ + +/****f* OpenSM: Subnet/osm_mlid_pkey_add_box +* NAME +* osm_mlid_pkey_add_box +* +* DESCRIPTION +* Adds osm_mgrp_box_t object to map +* +* SYNOPSIS +*/ +void osm_mlid_pkey_add_box(osm_mgrp_box_t *p_mgrp_box, + ib_net16_t pkey, osm_subn_t *p_subn); +/* +* PARAMETERS +* p_mgrp_box +* [in] Pointer to osm_mgrp_box_t +* +* pkey +* [in] Partition key for the object +* +* p_subn +* [in] Pointer to an osm_subn_t object +* +* RETURN VALUES +* None. +* +* SEE ALSO +* osm_mlid_pkey_remove_box +*********/ + +/****f* OpenSM: Subnet/osm_mlid_pkey_remove_box +* NAME +* osm_mlid_pkey_remove_box +* +* DESCRIPTION +* removes osm_mgrp_box_t object from map +* +* SYNOPSIS +*/ +void osm_mlid_pkey_remove_box(osm_mgrp_box_t *p_mgrp_box, + ib_net16_t pkey, osm_subn_t *p_subn); +/* +* PARAMETERS +* p_mgrp_box +* [in] Pointer to osm_mgrp_box_t +* +* pkey +* [in] Partition key for the object +* +* p_subn +* [in] Pointer to an osm_subn_t object +* +* RETURN VALUES +* None. 
+* +* SEE ALSO +* osm_mlid_pkey_add_box +*********/ + +/****f* OpenSM: Subnet/osm_mlid_pkey_get_existed_mlid +* NAME +* osm_mlid_pkey_get_existed_mlid +* +* DESCRIPTION +* Returns the in-use mlid with the minimum number of ports, matched by pkey +* +* SYNOPSIS +*/ +ib_net16_t osm_mlid_pkey_get_existed_mlid(IN osm_subn_t *p_subn, IN + ib_net16_t pkey); +/* +* PARAMETERS +* +* p_subn +* [in] Pointer to an osm_subn_t object +* +* pkey +* [in] Partition key for the object +* +* RETURN VALUES +* matched mlid or 0 if not found +* +* SEE ALSO +* osm_mlid_pkey_add_box +*********/ END_C_DECLS #endif /* _OSM_MULTICAST_H_ */ diff --git a/opensm/include/opensm/osm_subnet.h b/opensm/include/opensm/osm_subnet.h index fe4695f..d6ed9da 100644 --- a/opensm/include/opensm/osm_subnet.h +++ b/opensm/include/opensm/osm_subnet.h @@ -129,6 +129,37 @@ typedef struct osm_qos_options { * *********/ +/****s* OpenSM: Subnet/osm_mlid_pkey_t +* NAME +* osm_mlid_pkey_t +* +* DESCRIPTION +* Structure combining all MLIDs opened on the same pkey value. +* Used for mgid to mlid compression +* +* SYNOPSIS +*/ +typedef struct osm_mlid_pkey { + cl_map_item_t pkey_item; + ib_net16_t pkey; + cl_qmap_t mlid_box_map; +} osm_mlid_pkey_t; +/* +* FIELDS +* pkey_item +* Map Item for qmap linkage. Must be first element!! +* Indexed by pkey. +* +* pkey +* Partition key (P_Key) for multicast group(s). +* +* mlid_box_map +* Map of osm_mgrp_box_t objects. Indexed by mlid +* +* SEE ALSO +* osm_mgrp_box_t +*********/ + /****s* OpenSM: Subnet/osm_subn_opt_t * NAME * osm_subn_opt_t @@ -515,6 +546,7 @@ typedef struct osm_subn { unsigned need_update; cl_fmap_t mgrp_mgid_tbl; void *mboxes[IB_LID_MCAST_END_HO - IB_LID_MCAST_START_HO + 1]; + cl_qmap_t mlid_pkey_tbl; } osm_subn_t; /* * FIELDS @@ -639,6 +671,10 @@ typedef struct osm_subn { * Array of pointers to all Multicast Group Box objects in the subnet. * Indexed by MLID offset from base MLID. * +* mlid_pkey_tbl +* Map of osm_mlid_pkey_t objects. Arranged by mgrp pkey value. 
+* Contains MLIDs for mgroups with same pkey. +* * SEE ALSO * Subnet object *********/ diff --git a/opensm/opensm/osm_multicast.c b/opensm/opensm/osm_multicast.c index 01c90d8..7e25b10 100644 --- a/opensm/opensm/osm_multicast.c +++ b/opensm/opensm/osm_multicast.c @@ -386,3 +386,106 @@ void osm_mgrp_box_delete(IN osm_subn_t *p_subn, ib_net16_t mlid) p_subn->mboxes[cl_ntoh16(mlid) - IB_LID_MCAST_START_HO] = NULL; free(p_mgrp_box); } + +/********************************************************************** +**********************************************************************/ +void osm_mlid_pkey_delete(osm_mlid_pkey_t *p_mlid_pkey) +{ + cl_qmap_remove_all(&p_mlid_pkey->mlid_box_map); + free(p_mlid_pkey); +} + +/********************************************************************** +**********************************************************************/ +osm_mlid_pkey_t *osm_mlid_pkey_new(ib_net16_t pkey) +{ + osm_mlid_pkey_t *p_mlid_pkey = calloc(1,sizeof(osm_mlid_pkey_t)); + if (!p_mlid_pkey) { + return NULL; + } + cl_qmap_init(&p_mlid_pkey->mlid_box_map); + p_mlid_pkey->pkey = pkey; + return p_mlid_pkey; +} + +/********************************************************************** +**********************************************************************/ +void osm_mlid_pkey_add_box(osm_mgrp_box_t *p_mgrp_box, + ib_net16_t pkey, osm_subn_t *p_subn) +{ + osm_mlid_pkey_t *p_mlid_pkey = (osm_mlid_pkey_t*) + cl_qmap_get(&p_subn->mlid_pkey_tbl,0x7fff & cl_ntoh16(pkey)); + if (p_mlid_pkey != (osm_mlid_pkey_t*)cl_qmap_end(&p_subn->mlid_pkey_tbl)) { + cl_qmap_insert(&p_mlid_pkey->mlid_box_map, p_mgrp_box->mlid, + &p_mgrp_box->mlid_item); + } + else { + p_mlid_pkey = osm_mlid_pkey_new(pkey); + if (p_mlid_pkey) { + cl_qmap_insert(&p_mlid_pkey->mlid_box_map, p_mgrp_box->mlid, + &p_mgrp_box->mlid_item); + cl_qmap_insert(&p_subn->mlid_pkey_tbl, 0x7fff & cl_ntoh16(pkey), + &p_mlid_pkey->pkey_item); + } + } +} + 
+/********************************************************************** +**********************************************************************/ +void osm_mlid_pkey_remove_box(osm_mgrp_box_t *p_mgrp_box, + ib_net16_t pkey, osm_subn_t *p_subn) +{ + osm_mlid_pkey_t *p_mlid_pkey = (osm_mlid_pkey_t*) + cl_qmap_get(&p_subn->mlid_pkey_tbl, 0x7fff & cl_ntoh16(pkey)); + if (p_mlid_pkey != (osm_mlid_pkey_t*)cl_qmap_end(&p_subn->mlid_pkey_tbl)) { + cl_qmap_remove_item(&p_mlid_pkey->mlid_box_map, &p_mgrp_box->mlid_item); + if (!cl_qmap_count(&p_mlid_pkey->mlid_box_map)) { + /* no more groups with given pkey exist */ + osm_mlid_pkey_delete(p_mlid_pkey); + } + } +} + +/********************************************************************** +**********************************************************************/ +static ib_net16_t osm_mlid_pkey_get_mlid(IN osm_mlid_pkey_t *p_mlid_pkey) +{ + cl_map_item_t *p_item; + osm_mgrp_box_t *p_mgrp_box; + osm_mgrp_box_t *p_matched_box = NULL; + size_t port_count = 0; + for (p_item = cl_qmap_head(&p_mlid_pkey->mlid_box_map); + p_item != cl_qmap_end(&p_mlid_pkey->mlid_box_map); + p_item = cl_qmap_next(p_item)) { + p_mgrp_box = (osm_mgrp_box_t*) + PARENT_STRUCT(p_item, osm_mgrp_box_t,mlid_item); + if (!port_count) { + /* init p_matched_holder and count */ + port_count = cl_qmap_count(&p_mgrp_box->mgrp_port_map); + p_matched_box = p_mgrp_box; + } + else { + if (port_count > cl_qmap_count(&p_mgrp_box->mgrp_port_map)) { + port_count = cl_qmap_count(&p_mgrp_box->mgrp_port_map); + p_matched_box = p_mgrp_box; + } + } + } + if (p_matched_box) { + return p_matched_box->mlid; + } + return 0; +} + +/********************************************************************** +**********************************************************************/ +ib_net16_t osm_mlid_pkey_get_existed_mlid(IN osm_subn_t *p_subn, IN ib_net16_t pkey) +{ + osm_mlid_pkey_t *p_mlid_pkey = + (osm_mlid_pkey_t*)cl_qmap_get(&p_subn->mlid_pkey_tbl, 0x7fff & cl_ntoh16(pkey)); + if 
(p_mlid_pkey != (osm_mlid_pkey_t*)cl_qmap_end(&p_subn->mlid_pkey_tbl)) { + /* found object with mgroups matching the requested pkey */ + return osm_mlid_pkey_get_mlid(p_mlid_pkey); + } + return 0; +} diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c index b39f986..fad8248 100644 --- a/opensm/opensm/osm_sa_mcmember_record.c +++ b/opensm/opensm/osm_sa_mcmember_record.c @@ -730,13 +730,14 @@ ib_api_status_t osm_mcmr_rcv_create_new_mgrp(IN osm_sa_t * sa, IN const osm_physp_t * p_physp, OUT osm_mgrp_t ** pp_mgrp) { - ib_net16_t mlid,existed_mlid; + ib_net16_t mlid; unsigned zero_mgid, i; uint8_t scope; ib_gid_t *p_mgid; osm_mgrp_box_t *p_mgrp_box; ib_api_status_t status = IB_SUCCESS; ib_member_rec_t mcm_rec = *p_recvd_mcmember_rec; /* copy for modifications */ + boolean_t new_mlid = TRUE; OSM_LOG_ENTER(sa->p_log); @@ -754,15 +755,22 @@ ib_api_status_t osm_mcmr_rcv_create_new_mgrp(IN osm_sa_t * sa, */ mlid = get_new_mlid(sa, mcm_rec.mlid); if (mlid == 0) { - OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 1B19: " - "get_new_mlid failed request mlid 0x%04x\n", - cl_ntoh16(mcm_rec.mlid)); - status = IB_SA_MAD_STATUS_NO_RESOURCES; - goto Exit; + /* try to add mcgroup to an existing mlid */ + mlid = osm_mlid_pkey_get_existed_mlid(sa->p_subn, mcm_rec.pkey); + if (mlid == 0) { + OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 1B19: " + "get_new_mlid failed request mlid 0x%04x\n", + cl_ntoh16(mcm_rec.mlid)); + status = IB_SA_MAD_STATUS_NO_RESOURCES; + goto Exit; + } + new_mlid = FALSE; + OSM_LOG(sa->p_log, OSM_LOG_DEBUG, + "Found existed mlid 0x%X\n", cl_ntoh16(mlid)); } OSM_LOG(sa->p_log, OSM_LOG_DEBUG, - "Obtained new mlid 0x%X\n", cl_ntoh16(mlid)); + "Obtained mlid 0x%X\n", cl_ntoh16(mlid)); /* we need to create the new MGID if it was not defined */ if (zero_mgid) { @@ -812,10 +820,6 @@ ib_api_status_t osm_mcmr_rcv_create_new_mgrp(IN osm_sa_t * sa, goto Exit; } - /* check if there is mgrp_box matched to requested mgid */ - if (0 != (existed_mlid = 
osm_mgrp_box_get_mlid_by_mgid(sa->p_subn, p_mgid))) { - mlid = existed_mlid; - } /* create a new MC Group */ *pp_mgrp = osm_mgrp_new(mlid); if (*pp_mgrp == NULL) { @@ -859,6 +863,8 @@ ib_api_status_t osm_mcmr_rcv_create_new_mgrp(IN osm_sa_t * sa, cl_fmap_insert(&sa->p_subn->mgrp_mgid_tbl, &(*pp_mgrp)->mcmember_rec.mgid, &(*pp_mgrp)->map_item); sa->p_subn->mboxes[cl_ntoh16(mlid) - IB_LID_MCAST_START_HO] = p_mgrp_box; + if (new_mlid) + osm_mlid_pkey_add_box(p_mgrp_box,(*pp_mgrp)->mcmember_rec.pkey, sa->p_subn); cl_qlist_insert_tail(&p_mgrp_box->mgrp_list, &(*pp_mgrp)->box_item); Exit: OSM_LOG_EXIT(sa->p_log); diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c index 61d766a..02989fa 100644 --- a/opensm/opensm/osm_subnet.c +++ b/opensm/opensm/osm_subnet.c @@ -416,6 +416,7 @@ void osm_subn_construct(IN osm_subn_t * const p_subn) cl_qmap_init(&p_subn->rtr_guid_tbl); cl_qmap_init(&p_subn->prtn_pkey_tbl); cl_fmap_init(&p_subn->mgrp_mgid_tbl, compar_mgids); + cl_qmap_init(&p_subn->mlid_pkey_tbl); } /********************************************************************** @@ -431,6 +432,7 @@ void osm_subn_destroy(IN osm_subn_t * const p_subn) osm_mgrp_t *p_mgrp; osm_infr_t *p_infr, *p_next_infr; osm_mgrp_box_t *p_mgrp_box; + osm_mlid_pkey_t *p_mlid_pkey; /* it might be a good idea to de-allocate all known objects */ p_next_node = (osm_node_t *) cl_qmap_head(&p_subn->node_guid_tbl); @@ -472,6 +474,12 @@ void osm_subn_destroy(IN osm_subn_t * const p_subn) osm_prtn_delete(&p_prtn); } + p_mlid_pkey = (osm_mlid_pkey_t*)cl_qmap_head(&p_subn->mlid_pkey_tbl); + while (p_mlid_pkey != (osm_mlid_pkey_t*)cl_qmap_end(&p_subn->mlid_pkey_tbl)) { + cl_qmap_remove_item(&p_subn->mlid_pkey_tbl, (cl_map_item_t*)p_mlid_pkey); + osm_mlid_pkey_delete(p_mlid_pkey); + p_mlid_pkey = (osm_mlid_pkey_t*)cl_qmap_head(&p_subn->mlid_pkey_tbl); + } for (i = 0; i <= p_subn->max_mcast_lid_ho - IB_LID_MCAST_START_HO; i++) { -- 1.6.3.3 From sashak at voltaire.com Tue Sep 29 08:41:45 2009 From: 
sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 29 Sep 2009 17:41:45 +0200 Subject: [ofa-general] Re: [PATCH] infiniband-diags/src/ibqueryerrors.c: fix bug when attempting a sub-fabric scan In-Reply-To: <20090923144555.9efa2c75.weiny2@llnl.gov> References: <20090923144555.9efa2c75.weiny2@llnl.gov> Message-ID: <20090929154145.GA9465@me> On 14:45 Wed 23 Sep , Ira Weiny wrote: > > From: Ira Weiny > Date: Wed, 23 Sep 2009 14:26:55 -0700 > Subject: [PATCH] infiniband-diags/src/ibqueryerrors.c: fix bug when attempting a sub-fabric scan > > Also ibd_sm_id is never valid in this tool as the "-s" option is used > for "suppress" > > Signed-off-by: Ira Weiny > --- > infiniband-diags/src/ibqueryerrors.c | 4 ++-- > 1 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/infiniband-diags/src/ibqueryerrors.c b/infiniband-diags/src/ibqueryerrors.c > index f73ca6f..892e539 100644 > --- a/infiniband-diags/src/ibqueryerrors.c > +++ b/infiniband-diags/src/ibqueryerrors.c > @@ -441,8 +441,8 @@ int main(int argc, char **argv) > } else if (switch_guid_str) { > if ((resolved = > ib_resolve_portid_str_via(&portid, switch_guid_str, > - IB_DEST_GUID, ibd_sm_id, > - ibmad_port)) >= 0) > + IB_DEST_GUID, NULL, BTW, why should 'ibd_sm_id' be replaced by NULL? 
Sasha > + ibmad_port)) < 0) > IBWARN("Failed to resolve %s; attempting full scan\n", > switch_guid_str); > } > -- > 1.5.4.5 > From sashak at voltaire.com Tue Sep 29 08:49:49 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 29 Sep 2009 17:49:49 +0200 Subject: [ofa-general] Re: [PATCH] infiniband-diags/src/ibqueryerrors.c: Remove --all option and replace it with --switch, --ca, --router In-Reply-To: <20090923150923.a5281107.weiny2@llnl.gov> References: <20090923150923.a5281107.weiny2@llnl.gov> Message-ID: <20090929154949.GB9465@me> Hi Ira, On 15:09 Wed 23 Sep , Ira Weiny wrote: > > From: Ira Weiny > Date: Wed, 23 Sep 2009 11:38:11 -0700 > Subject: [PATCH] infiniband-diags/src/ibqueryerrors.c: Remove --all option and replace it with --switch, --ca, --router > > By default ibqueryerrors should print errors for all node types. > Adding the other options allows for the limitation of this output. > > Also change the --switch option to be --node-guid which is really more > accurate and use "-G" for better compliance with other utilities. "-S" > is left in for backward compatibility for the time being. > > Update the man page > > Signed-off-by: Ira Weiny Applied with comments below. Thanks. > --- > infiniband-diags/man/ibqueryerrors.8 | 62 +++++++++++++++++++----------- > infiniband-diags/src/ibqueryerrors.c | 70 +++++++++++++++++++++++++--------- > 2 files changed, 92 insertions(+), 40 deletions(-) > > diff --git a/infiniband-diags/man/ibqueryerrors.8 b/infiniband-diags/man/ibqueryerrors.8 > index a327f3b..8f83a7b 100644 > --- a/infiniband-diags/man/ibqueryerrors.8 > +++ b/infiniband-diags/man/ibqueryerrors.8 > @@ -5,7 +5,7 @@ ibqueryerrors.pl \- query and report non-zero IB port counters > > .SH SYNOPSIS > .B ibqueryerrors.pl > -[-a -c -r -R -C -P -s -S > +[-s -c -r -C -P -s -G '-s' is listed twice. I'm fixing this. > -D -d] > > .SH DESCRIPTION > @@ -20,41 +20,59 @@ reported. > > .PP > .TP > -\fB\-a\fR > -Report an action to take. 
Some of the counters are not errors in and of > -themselves. This reports some more information on what the counters mean and > -what actions can/should be taken if they are non-zero. > +\fB\-s \fR > +Suppress the errors listed in the comma separated list provided. > .TP > \fB\-c\fR > Suppress some of the common "side effect" counters. These counters usually do > not indicate an error condition and can be usually be safely ignored. > .TP > +\fB\-G \fR > +Report results only for the node guid specified. > +.TP > +\fB\-S \fR > +\-S is provided only for backward compatibility and works the same as "-G" > +.TP > +\fB\-D \fR > +Report results only for the switch specified by the direct route path. > +.TP > \fB\-r\fR > Report the port information. This includes LID, port, external port (if > applicable), link speed setting, remote GUID, remote port, remote external port > (if applicable), and remote node description information. > .TP > -\fB\-R\fR > -Recalculate the ibnetdiscover information, ie do not use the cached > -information. This option is slower but should be used if the diag tools have > -not been used for some time or if there are other reasons to believe that > -the fabric has changed. > +\fB\-\-data\fR > +Include the optional transmit and receive data counters. > .TP > -\fB\-s \fR > -Suppress the errors listed in the comma separated list provided. > +\fB\-\-switch\fR print data for switches only > .TP > -\fB\-S \fR > -Report results only for the switch specified. (hex format) > +\fB\-\-ca\fR print data for CA's only > .TP > -\fB\-D \fR > -Report results only for the switch specified by the direct route path. > +\fB\-\-router\fR print data for routers only > .TP > -\fB\-d\fR > -Include the optional transmit and receive data counters. > -.TP > -\fB\-C \fR use the specified ca_name for the search. > -.TP > -\fB\-P \fR use the specified ca_port for the search. 
> +\fB\-R\fR (This option is obsolete and does nothing) > + > +.SH COMMON OPTIONS > +.PP > +\-d raise the IB debugging level. > + May be used several times (-ddd or -d -d -d). > +.PP > +\-e show send and receive errors (timeouts and others) > +.PP > +\-h show the usage message > +.PP > +\-v increase the application verbosity level. > + May be used several times (-vv or -v -v -v) > +.PP > +\-V show the version info. > + > +# Other common flags: > +.PP > +\-C use the specified ca_name. > +.PP > +\-P use the specified ca_port. > +.PP > +\-t override the default timeout for the solicited mads. > > > .SH AUTHOR > diff --git a/infiniband-diags/src/ibqueryerrors.c b/infiniband-diags/src/ibqueryerrors.c > index f73ca6f..ecfd662 100644 > --- a/infiniband-diags/src/ibqueryerrors.c > +++ b/infiniband-diags/src/ibqueryerrors.c > @@ -59,12 +59,17 @@ static char *node_name_map_file = NULL; > static nn_map_t *node_name_map = NULL; > int data_counters = 0; > int port_config = 0; > -uint64_t switch_guid = 0; > -char *switch_guid_str = NULL; > +uint64_t node_guid = 0; > +char *node_guid_str = NULL; > int sup_total = 0; > enum MAD_FIELDS *suppressed_fields = NULL; > char *dr_path = NULL; > -int all_nodes = 0; > + > +#define PRINT_ALL 0xFF /* all nodes default flag */ > +uint8_t node_type_to_print = PRINT_ALL; > +#define PRINT_SWITCH 0x1 > +#define PRINT_CA 0x2 > +#define PRINT_ROUTER 0x4 > > static unsigned int get_max(unsigned int num) > { > @@ -304,8 +309,21 @@ void print_node(ibnd_node_t * node, void *user_data) > int header_printed = 0; > int p = 0; > int startport = 1; > + int type = 0; > + > + switch (node->type) { > + case IB_NODE_SWITCH: > + type = PRINT_SWITCH; > + break; > + case IB_NODE_CA: > + type = PRINT_CA; > + break; > + case IB_NODE_ROUTER: > + type = PRINT_ROUTER; > + break; > + } > > - if (!all_nodes && node->type != IB_NODE_SWITCH) > + if ((type & node_type_to_print) == 0) > return; > > if (node->type == IB_NODE_SWITCH && node->smaenhsp0) > @@ -361,11 +379,24 @@ 
static int process_opt(void *context, int ch, char *optarg) > data_counters++; > break; > case 3: > - all_nodes++; > + if (node_type_to_print == PRINT_ALL) > + node_type_to_print = 0; > + node_type_to_print |= PRINT_SWITCH; > + break; > + case 4: > + if (node_type_to_print == PRINT_ALL) > + node_type_to_print = 0; > + node_type_to_print |= PRINT_CA; > + break; > + case 5: > + if (node_type_to_print == PRINT_ALL) > + node_type_to_print = 0; > + node_type_to_print |= PRINT_ROUTER; Instead of repeating the 'node_type_to_print' check, its setup could be done as: node_type_to_print = 0; process_options()... if (!node_type_to_print) node_type_to_print = ALL; Adding this as a separate patch. > break; > + case 'G': > case 'S': > - switch_guid_str = optarg; > - switch_guid = strtoull(optarg, 0, 0); > + node_guid_str = optarg; > + node_guid = strtoull(optarg, 0, 0); > break; > case 'D': > dr_path = strdup(optarg); Some generic thoughts. When -D, -S or -G is given and the port cannot be resolved, should this be an error instead of falling back to full fabric discovery and error querying? When such a port is resolved, shouldn't we skip even the partial discovery as useless? What about unifying the -G, -D, -L options (target address type) usage with the other infiniband-diags tools (implemented already with ibdiag_common)?
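A minimal sketch of the node_type_to_print setup suggested earlier in this mail. The helper names add_type() and finish_types() are hypothetical stand-ins for process_opt() and the post-parse step in main(); this is an illustration under those assumptions, not the ibqueryerrors code itself:

```c
#include <stdint.h>

#define PRINT_SWITCH 0x1
#define PRINT_CA     0x2
#define PRINT_ROUTER 0x4
#define PRINT_ALL    0xFF	/* all nodes default flag */

/*
 * Hypothetical stand-in for the option callback: the mask starts at 0
 * and each --switch/--ca/--router option (ch 3, 4, 5) just ORs in its bit.
 */
static uint8_t add_type(uint8_t mask, int ch)
{
	switch (ch) {
	case 3:
		mask |= PRINT_SWITCH;
		break;
	case 4:
		mask |= PRINT_CA;
		break;
	case 5:
		mask |= PRINT_ROUTER;
		break;
	}
	return mask;
}

/*
 * Hypothetical post-parse step: if no type option was given at all,
 * fall back to printing every node type.
 */
static uint8_t finish_types(uint8_t mask)
{
	return mask ? mask : PRINT_ALL;
}
```

With this shape the "was a default still in place?" test happens once, after option parsing, instead of being repeated in every case.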
Sasha > @@ -399,8 +430,9 @@ int main(int argc, char **argv) > {"suppress-common", 'c', 0, NULL, > "suppress some of the common counters"}, > {"node-name-map", 1, 1, "", "node name map file"}, > - {"switch", 'S', 1, "", > - "query only (hex format)"}, > + {"node-guid", 'G', 1, "", "query only "}, > + {"", 'S', 1, "", > + "Same as \"-G\" for backward compatibility"}, > {"Direct", 'D', 1, "", > "query only switch specified by "}, > {"report-port", 'r', 0, NULL, > @@ -408,7 +440,9 @@ int main(int argc, char **argv) > {"GNDN", 'R', 0, NULL, > "(This option is obsolete and does nothing)"}, > {"data", 2, 0, NULL, "include the data counters in the output"}, > - {"all", 3, 0, NULL, "output all nodes (not just switches)"}, > + {"switch", 3, 0, NULL, "print data for switches only"}, > + {"ca", 4, 0, NULL, "print data for CA's only"}, > + {"router", 5, 0, NULL, "print data for routers only"}, > {0} > }; > char usage_args[] = ""; > @@ -438,13 +472,13 @@ int main(int argc, char **argv) > NULL, ibmad_port)) < 0) > IBWARN("Failed to resolve %s; attempting full scan\n", > dr_path); > - } else if (switch_guid_str) { > + } else if (node_guid_str) { > if ((resolved = > - ib_resolve_portid_str_via(&portid, switch_guid_str, > + ib_resolve_portid_str_via(&portid, node_guid_str, > IB_DEST_GUID, ibd_sm_id, > ibmad_port)) >= 0) > IBWARN("Failed to resolve %s; attempting full scan\n", > - switch_guid_str); > + node_guid_str); > } > > if (resolved >= 0) > @@ -463,13 +497,13 @@ int main(int argc, char **argv) > > report_suppressed(); > > - if (switch_guid_str) { > - ibnd_node_t *node = ibnd_find_node_guid(fabric, switch_guid); > + if (node_guid_str) { > + ibnd_node_t *node = ibnd_find_node_guid(fabric, node_guid); > if (node) > print_node(node, NULL); > else > fprintf(stderr, "Failed to find node: %s\n", > - switch_guid_str); > + node_guid_str); > } else if (dr_path) { > ibnd_node_t *node = ibnd_find_node_dr(fabric, dr_path); > uint8_t ni[IB_SMP_DATA_SIZE]; > @@ -477,9 +511,9 @@ int main(int 
argc, char **argv) > if (!smp_query_via(ni, &portid, IB_ATTR_NODE_INFO, 0, > ibd_timeout, ibmad_port)) > return -1; > - mad_decode_field(ni, IB_NODE_GUID_F, &(switch_guid)); > + mad_decode_field(ni, IB_NODE_GUID_F, &(node_guid)); > > - node = ibnd_find_node_guid(fabric, switch_guid); > + node = ibnd_find_node_guid(fabric, node_guid); > if (node) > print_node(node, NULL); > else > -- > 1.5.4.5 > From sashak at voltaire.com Tue Sep 29 08:49:58 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 29 Sep 2009 17:49:58 +0200 Subject: [ofa-general] [PATCH] infiniband-diags/ibqueryerrors: simplify node_type_to_print setup. In-Reply-To: <20090923150923.a5281107.weiny2@llnl.gov> References: <20090923150923.a5281107.weiny2@llnl.gov> Message-ID: <20090929154958.GC9465@me> Simplify node_type_to_print setup. Signed-off-by: Sasha Khapyorsky --- infiniband-diags/src/ibqueryerrors.c | 13 +++++-------- 1 files changed, 5 insertions(+), 8 deletions(-) diff --git a/infiniband-diags/src/ibqueryerrors.c b/infiniband-diags/src/ibqueryerrors.c index 8a3c236..e67d338 100644 --- a/infiniband-diags/src/ibqueryerrors.c +++ b/infiniband-diags/src/ibqueryerrors.c @@ -64,12 +64,12 @@ char *node_guid_str = NULL; int sup_total = 0; enum MAD_FIELDS *suppressed_fields = NULL; char *dr_path = NULL; +uint8_t node_type_to_print = 0; -#define PRINT_ALL 0xFF /* all nodes default flag */ -uint8_t node_type_to_print = PRINT_ALL; #define PRINT_SWITCH 0x1 #define PRINT_CA 0x2 #define PRINT_ROUTER 0x4 +#define PRINT_ALL 0xFF /* all nodes default flag */ static unsigned int get_max(unsigned int num) { @@ -379,18 +379,12 @@ static int process_opt(void *context, int ch, char *optarg) data_counters++; break; case 3: - if (node_type_to_print == PRINT_ALL) - node_type_to_print = 0; node_type_to_print |= PRINT_SWITCH; break; case 4: - if (node_type_to_print == PRINT_ALL) - node_type_to_print = 0; node_type_to_print |= PRINT_CA; break; case 5: - if (node_type_to_print == PRINT_ALL) - node_type_to_print =
0; node_type_to_print |= PRINT_ROUTER; break; case 'G': @@ -453,6 +447,9 @@ int main(int argc, char **argv) argc -= optind; argv += optind; + if (!node_type_to_print) + node_type_to_print = PRINT_ALL; + if (ibverbose) ibnd_debug(1); -- 1.6.5.rc1 From dorfman.eli at gmail.com Tue Sep 29 09:16:21 2009 From: dorfman.eli at gmail.com (Eli Dorfman (Voltaire)) Date: Tue, 29 Sep 2009 18:16:21 +0200 Subject: [ofa-general] [PATCH] infiniband-diags: Fix IB network discovery from switch node. In-Reply-To: <20090923172451.fb20ab9b.weiny2@llnl.gov> References: <4A9548AA.4020900@gmail.com> <20090923172451.fb20ab9b.weiny2@llnl.gov> Message-ID: <4AC232D5.2060806@gmail.com> Ira Weiny wrote: > Eli, > > On Wed, 26 Aug 2009 17:37:30 +0300 > "Eli Dorfman (Voltaire)" wrote: > >> Subject: [PATCH] Fix IB network discovery from switch node. > > Sorry for the late inquiry on this but what exactly was the bug here? Sorry for the late response. The problem is related to wrong discovery when running from the switch. 
Without the patch ibnetdiscover finds only local switch 4036% ibnetdiscover ibwarn: [2833] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0,0) ibwarn: [2833] get_remote_node: NodeInfo on DR path slid 0; dlid 0; 0,0 failed, skipping port # # Topology file: generated on Tue Sep 29 15:29:50 2009 # # Max of 1 hops discovered # Initiated from node 0008f1050010006e port 0008f1050010006e vendid=0x8f1 devid=0x5a5a sysimgguid=0x8f1050010006f switchguid=0x8f1050010006e(8f1050010006e) Switch 36 "S-0008f1050010006e" # "Voltaire 4036 - 36 QDR ports switch" enhanced port 0 lid 1 lmc 0 With the patch we see the switch is connected to 2 HCAs # # Topology file: generated on Tue Sep 29 15:19:24 2009 # # Max of 1 hops discovered # Initiated from node 0008f1050010006e port 0008f1050010006e vendid=0x8f1 devid=0x5a5a sysimgguid=0x8f1050010006f switchguid=0x8f1050010006e(8f1050010006e) Switch 36 "S-0008f1050010006e" # "Voltaire 4036 - 36 QDR ports switch" enhanced port 0 lid 1 lmc 0 [24] "H-0008f104039a0198"[2](8f104039a019a) # "luna6 HCA-1" lid 3 4xQDR [29] "H-0008f1040399f444"[2](8f1040399f446) # "localhost HCA-1" lid 2 4xQDR vendid=0x2c9 devid=0x673c sysimgguid=0x8f1040399f447 caguid=0x8f1040399f444 Ca 2 "H-0008f1040399f444" # "localhost HCA-1" [2](8f1040399f446) "S-0008f1050010006e"[29] # lid 2 lmc 0 "Voltaire 4036 - 36 QDR ports switch" lid 1 4xQDR vendid=0x2c9 devid=0x673c sysimgguid=0x8f104039a019b caguid=0x8f104039a0198 Ca 2 "H-0008f104039a0198" # "luna6 HCA-1" [2](8f104039a019a) "S-0008f1050010006e"[24] # lid 3 lmc 0 "Voltaire 4036 - 36 QDR ports switch" lid 1 4xQDR > > I just found that this change introduced a bug. The problem is that if you > don't do this query, even when the first found node is a switch, the port you > came into the switch on will not get reported properly. Here is what I mean. 
> > Running with the current master: > > 17:19:42 > ./iblinkinfo -S 0x000b8cffff00490c > Switch 0x000b8cffff00490c MT47396 Infiniscale-III Mellanox Technologies: > 8 1[ ] ==( 4X 2.5 Gbps Down/ Polling)==> [ ] "" ( ) > ... > 8 9[ ] ==( 4X 2.5 Gbps Down/ Polling)==> [ ] "" ( ) > 8 10[ ] ==( 4X 5.0 Gbps Active/ LinkUp)==> 15 24[ ] "ISR9024D Voltaire" ( ) > 8 11[ ] ==( 4X 2.5 Gbps Down/ Polling)==> [ ] "" ( ) > 8 12[ ] ==( 4X 5.0 Gbps Active/ LinkUp)==> [ ] "" ( ) > 8 13[ ] ==( 4X 2.5 Gbps Down/ Polling)==> [ ] "" ( ) > ... > > The DR path "came in" on port 12 and is reported as Active/LinkUp but has no > information on the other end. Here is what the output should look like with > your change removed. > > 17:22:36 > ./iblinkinfo -S 0x000b8cffff00490c > Switch 0x000b8cffff00490c MT47396 Infiniscale-III Mellanox Technologies: > 8 1[ ] ==( 4X 2.5 Gbps Down/ Polling)==> [ ] "" ( ) > ... > 8 9[ ] ==( 4X 2.5 Gbps Down/ Polling)==> [ ] "" ( ) > 8 10[ ] ==( 4X 5.0 Gbps Active/ LinkUp)==> 15 24[ ] "ISR9024D Voltaire" ( ) > 8 11[ ] ==( 4X 2.5 Gbps Down/ Polling)==> [ ] "" ( ) > 8 12[ ] ==( 4X 5.0 Gbps Active/ LinkUp)==> 7 8[ ] "Cisco Switch SFS7000D" ( ) > 8 13[ ] ==( 4X 2.5 Gbps Down/ Polling)==> [ ] "" ( ) > ... > > This properly reports the other end of this link as another switch. > > Could you explain the problem a bit more so we can come up with a better > solution? I think that the problem is related to NodeInfo:LocalPort which is 0 in case of a switch. I see that get_remote_node() sends direct route MAD to switch with path 0,0 and that fails (at least for Mellanox IS4 switch chips). 
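To illustrate why the path ends up as "0,0": a minimal sketch, assuming a simplified direct-route path in which entry 0 is always the local node and each hop appends one exit port. The struct and the helpers extend_path(), extend_path_guarded() and path_str() are hypothetical stand-ins, not the libibnetdisc or libibmad code:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_HOPS 64

/*
 * Hypothetical, simplified direct-route path: p[0] is always 0
 * (the local node) and cnt is the number of hops after it.
 */
struct dr_path {
	int cnt;
	uint8_t p[MAX_HOPS];
};

/* Append one hop; returns the new hop count, or -1 if the path is full. */
static int extend_path(struct dr_path *path, int portnum)
{
	if (path->cnt + 1 >= MAX_HOPS)
		return -1;
	path->p[++path->cnt] = (uint8_t)portnum;
	return path->cnt;
}

/* Only extend for a real exit port; port 0 leaves the path unchanged. */
static int extend_path_guarded(struct dr_path *path, int portnum)
{
	if (portnum <= 0)
		return path->cnt;
	return extend_path(path, portnum);
}

/* Render the path as a comma separated list, e.g. "0,24", into buf. */
static void path_str(const struct dr_path *path, char *buf, size_t len)
{
	size_t off = 0;
	int i;

	buf[0] = '\0';
	for (i = 0; i <= path->cnt && off < len; i++)
		off += snprintf(buf + off, len - off, "%s%d",
				i ? "," : "", path->p[i]);
}
```

In this sketch, starting from the local path "0" and extending with a LocalPort of 0 gives "0,0", while the guarded variant leaves the path at "0" and only appends real port numbers such as 24.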
Another way to bypass this may be as follows: diff --git a/infiniband-diags/libibnetdisc/src/ibnetdisc.c b/infiniband-diags/libibnetdisc/src/ibnetdisc.c index 1e93ff8..3dd0dc6 100644 --- a/infiniband-diags/libibnetdisc/src/ibnetdisc.c +++ b/infiniband-diags/libibnetdisc/src/ibnetdisc.c @@ -461,7 +461,7 @@ get_remote_node(struct ibnd_fabric *fabric, struct ibnd_node *node, struct ibnd_ != IB_PORT_PHYS_STATE_LINKUP) return -1; - if (extend_dpath(fabric, path, portnum) < 0) + if (portnum > 0 && extend_dpath(fabric, path, portnum) < 0) return -1; if (query_node(fabric, &node_buf, &port_buf, path)) { Please check whether this is OK and I can send a new patch. Thanks, Eli > > Thanks, > Ira > >> Signed-off-by: Eli Dorfman >> --- >> infiniband-diags/libibnetdisc/src/ibnetdisc.c | 16 +++++++++------- >> 1 files changed, 9 insertions(+), 7 deletions(-) >> >> diff --git a/infiniband-diags/libibnetdisc/src/ibnetdisc.c b/infiniband-diags/libibnetdisc/src/ibnetdisc.c >> index c69467e..779e659 100644 >> --- a/infiniband-diags/libibnetdisc/src/ibnetdisc.c >> +++ b/infiniband-diags/libibnetdisc/src/ibnetdisc.c >> @@ -590,13 +590,15 @@ ibnd_fabric_t *ibnd_discover_fabric(struct ibmad_port * ibmad_port, >> if (!port) >> goto error; >> >> - rc = get_remote_node(ibmad_port, fabric, node, port, from, >> - mad_get_field(node->info, 0, >> - IB_NODE_LOCAL_PORT_F), 0); >> - if (rc < 0) >> - goto error; >> - if (rc > 0) /* non-fatal error, nothing more to be done */ >> - return ((ibnd_fabric_t *) fabric); >> + if (node->node.type != IB_NODE_SWITCH) { >> + rc = get_remote_node(ibmad_port, fabric, node, port, from, >> + mad_get_field(node->info, 0, >> + IB_NODE_LOCAL_PORT_F), 0); >> + if (rc < 0) >> + goto error; >> + if (rc > 0) /* non-fatal error, nothing more to be done */ >> + return ((ibnd_fabric_t *) fabric); >> + } >> >> for (dist = 0; dist <= max_hops; dist++) { >> >> -- >> 1.5.5 >> >> _______________________________________________ >> general mailing list >> general at 
lists.openfabrics.org >> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general >> >> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general >> > > From jsquyres at cisco.com Tue Sep 29 09:16:39 2009 From: jsquyres at cisco.com (Jeff Squyres) Date: Tue, 29 Sep 2009 12:16:39 -0400 Subject: [ofa-general] This list expires... tomorrow? Message-ID: What happens to this list after tomorrow? (i.e., general at lists.openfabrics.org ) Will mails bounce? The intent is that all mails to the "general" list should be sent to the linux-rdma list instead, right? -- Jeff Squyres jsquyres at cisco.com From jon at opengridcomputing.com Tue Sep 29 09:24:57 2009 From: jon at opengridcomputing.com (Jon Mason) Date: Tue, 29 Sep 2009 11:24:57 -0500 Subject: [ofa-general] This list expires... tomorrow? In-Reply-To: References: Message-ID: <20090929162457.GC10188@opengridcomputing.com> On Tue, Sep 29, 2009 at 12:16:39PM -0400, Jeff Squyres wrote: > What happens to this list after tomorrow? (i.e., > general at lists.openfabrics.org) Will mails bounce? > > The intent is that all mails to the "general" list should be sent to the > linux-rdma list instead, right? Can we set up an auto forward on the e-mail server? > > -- > Jeff Squyres > jsquyres at cisco.com From jsquyres at cisco.com Tue Sep 29 10:25:13 2009 From: jsquyres at cisco.com (Jeff Squyres) Date: Tue, 29 Sep 2009 13:25:13 -0400 Subject: [ofa-general] This list expires... tomorrow? In-Reply-To: <20090929162457.GC10188@opengridcomputing.com> References: <20090929162457.GC10188@opengridcomputing.com> Message-ID: It seems like we did that last time and people are still mailing to openib at openib.org . Why not cut the cord completely?
Perhaps setup an auto-responder saying "your was discarded; all mail should be re-sent to linux- rdma at ...etc." People will make the mistake of mailing to the old/non- existent address once and then update their addressbooks. Plus, won't it be harder on the spam controls on the new list to recognize a variety of old names for this list? ** My $0.02: out with the old, in with the new. Otherwise we continue a chain of really old/outdated email addresses and infrastructure. On Sep 29, 2009, at 12:24 PM, Jon Mason wrote: > On Tue, Sep 29, 2009 at 12:16:39PM -0400, Jeff Squyres wrote: > > What happens to this list after tomorrow? (i.e., > > general at lists.openfabrics.org) Will mails bounce? > > > > The intent is that all mails to the "general" list should be sent > to the > > linux-rdma list instead, right? > > Can we set up an auto forward on the e-mail server? > > > > > -- > > Jeff Squyres > > jsquyres at cisco.com > > > > _______________________________________________ > > general mailing list > > general at lists.openfabrics.org > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > -- Jeff Squyres jsquyres at cisco.com From weiny2 at llnl.gov Tue Sep 29 10:25:56 2009 From: weiny2 at llnl.gov (Ira Weiny) Date: Tue, 29 Sep 2009 10:25:56 -0700 Subject: [ofa-general] Re: [PATCH] infiniband-diags/src/ibqueryerrors.c: fix bug when attempting a sub-fabric scan In-Reply-To: <20090929154145.GA9465@me> References: <20090923144555.9efa2c75.weiny2@llnl.gov> <20090929154145.GA9465@me> Message-ID: <20090929102556.140506c7.weiny2@llnl.gov> On Tue, 29 Sep 2009 17:41:45 +0200 Sasha Khapyorsky wrote: > On 14:45 Wed 23 Sep , Ira Weiny wrote: > > > > From: Ira Weiny > > Date: Wed, 23 Sep 2009 14:26:55 -0700 > > Subject: [PATCH] infiniband-diags/src/ibqueryerrors.c: fix bug when attempting a sub-fabric scan > > > > Also ibd_sm_id is never valid in this tool as the "-s" option is used 
> > for "suppress" To answer your question below: ibd_sm_id is never going to be overridden, because the "-s" option is used for "suppress". So NULL should be used as the default for now. I did not want to change the -s option now. Some day I can fix this up to be consistent with the other tools. Sorry. > > > > Signed-off-by: Ira Weiny > > --- > > infiniband-diags/src/ibqueryerrors.c | 4 ++-- > > 1 files changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/infiniband-diags/src/ibqueryerrors.c b/infiniband-diags/src/ibqueryerrors.c > > index f73ca6f..892e539 100644 > > --- a/infiniband-diags/src/ibqueryerrors.c > > +++ b/infiniband-diags/src/ibqueryerrors.c > > @@ -441,8 +441,8 @@ int main(int argc, char **argv) > > } else if (switch_guid_str) { > > if ((resolved = > > ib_resolve_portid_str_via(&portid, switch_guid_str, > > - IB_DEST_GUID, ibd_sm_id, > > - ibmad_port)) >= 0) > > + IB_DEST_GUID, NULL, > > BTW, why should 'ibd_sm_id' be replaced by NULL? See above, Ira > > Sasha > > > + ibmad_port)) < 0) > > IBWARN("Failed to resolve %s; attempting full scan\n", > > switch_guid_str); > > } > > -- > > 1.5.4.5 > > -- Ira Weiny Math Programmer/Computer Scientist Lawrence Livermore National Lab 925-423-8008 weiny2 at llnl.gov From weiny2 at llnl.gov Tue Sep 29 10:46:24 2009 From: weiny2 at llnl.gov (Ira Weiny) Date: Tue, 29 Sep 2009 10:46:24 -0700 Subject: [ofa-general] Re: [PATCH] infiniband-diags/src/ibqueryerrors.c: Remove --all option and replace it with --switch, --ca, --router In-Reply-To: <20090929154949.GB9465@me> References: <20090923150923.a5281107.weiny2@llnl.gov> <20090929154949.GB9465@me> Message-ID: <20090929104624.7cb664fe.weiny2@llnl.gov> On Tue, 29 Sep 2009 17:49:49 +0200 Sasha Khapyorsky wrote: > Hi Ira, > > On 15:09 Wed 23 Sep , Ira Weiny wrote: > > > > From: Ira Weiny > > Date: Wed, 23 Sep 2009 11:38:11 -0700 > > Subject: [PATCH] infiniband-diags/src/ibqueryerrors.c: Remove --all option and replace it with --switch, --ca, --router > >
> > By default ibqueryerrors should print errors for all node types. > > Adding the other options allows for the limitation of this output. > > > > Also change the --switch option to be --node-guid which is really more > > accurate and use "-G" for better compliance with other utilities. "-S" > > is left in for backward compatibility for the time being. > > > > Update the man page > > > > Signed-off-by: Ira Weiny > > Applied with comments below. Thanks. > > > --- > > infiniband-diags/man/ibqueryerrors.8 | 62 +++++++++++++++++++----------- > > infiniband-diags/src/ibqueryerrors.c | 70 +++++++++++++++++++++++++--------- > > 2 files changed, 92 insertions(+), 40 deletions(-) > > > > diff --git a/infiniband-diags/man/ibqueryerrors.8 b/infiniband-diags/man/ibqueryerrors.8 > > index a327f3b..8f83a7b 100644 > > --- a/infiniband-diags/man/ibqueryerrors.8 > > +++ b/infiniband-diags/man/ibqueryerrors.8 > > @@ -5,7 +5,7 @@ ibqueryerrors.pl \- query and report non-zero IB port counters > > > > .SH SYNOPSIS > > .B ibqueryerrors.pl > > -[-a -c -r -R -C -P -s -S > > +[-s -c -r -C -P -s -G > > '-s' is listed twice. I'm fixing this. Thanks. 
[snip] > > > > - if (!all_nodes && node->type != IB_NODE_SWITCH) > > + if ((type & node_type_to_print) == 0) > > return; > > > > if (node->type == IB_NODE_SWITCH && node->smaenhsp0) > > @@ -361,11 +379,24 @@ static int process_opt(void *context, int ch, char *optarg) > > data_counters++; > > break; > > case 3: > > - all_nodes++; > > + if (node_type_to_print == PRINT_ALL) > > + node_type_to_print = 0; > > + node_type_to_print |= PRINT_SWITCH; > > + break; > > + case 4: > > + if (node_type_to_print == PRINT_ALL) > > + node_type_to_print = 0; > > + node_type_to_print |= PRINT_CA; > > + break; > > + case 5: > > + if (node_type_to_print == PRINT_ALL) > > + node_type_to_print = 0; > > + node_type_to_print |= PRINT_ROUTER; > > Instead of repeating 'node_type_to_print' check its setup could be done > as: > > node_type_to_print = 0; > process_options()... > if (!node_type_to_print) > node_type_to_print = ALL; > > Adding this as separate patch. Yes that patch is better. Thanks, > > > break; > > + case 'G': > > case 'S': > > - switch_guid_str = optarg; > > - switch_guid = strtoull(optarg, 0, 0); > > + node_guid_str = optarg; > > + node_guid = strtoull(optarg, 0, 0); > > break; > > case 'D': > > dr_path = strdup(optarg); > > Some generic thoughts. > > When -D, -S, -G and port cannot be resolved should this be an error > instead of full fabric discovery and errors querying? No this is not good for 2 reasons. 1) if the SA is down the tool will still work by searching (slower but works) This is really important because diags are used when things are not working. So the SA being down or slow is a real possibility.[*] 2) See below about link information... [*] of course this does not apply to the -D option but it will never fail to resolve. > > When such port was resolved shouldn't we skip even partial discover as > useless? It is not useless. The link information is printed if the "-r" option is specified, like so... 
10:32:43 > ./ibqueryerrors -S 0xb8cffff00490c -r Errors for 0xb8cffff00490c "MT47396 Infiniscale-III Mellanox Technologies" GUID 0xb8cffff00490c port 10: [LinkDowned == 1] Link info: 8 10[ ] ==( 4X 5.0 Gbps Active/ LinkUp)==> 0x0008f10400411b18 15 24[ ] "ISR9024D Voltaire" ( ) GUID 0xb8cffff00490c port 12: [RcvSwRelayErrors == 19] [XmtDiscards == 1] Link info: 8 12[ ] ==( 4X 5.0 Gbps Active/ LinkUp)==> [ ] "" ( ) GUID 0xb8cffff00490c port 20: [RcvSwRelayErrors == 5] Link info: 8 20[ ] ==( 4X 5.0 Gbps Active/ LinkUp)==> 0x0002c902002268c4 10 2[ ] "woprjr4" ( ) The partial discover gives us this information. > > What about unifying -G, -D, -L options (target address type) usage with > other intiniband-diags tools (implemented already with ibdiag_common)? Well... At the risk of being flamed... I personally do not like the way these options are implemented. For example: 10:37:13 > smpquery -G 0xb8cffff00490c nodeinfo smpquery: iberror: failed: operation '0xb8cffff00490c' not supported WTF? Ah, here is the "correct" syntax 10:37:21 > smpquery -G nodeinfo 0xb8cffff00490c # Node info: Lid 8 BaseVers:........................1 ... This "correct" syntax is weird to me. Why is the GUID being specified as "nodeinfo"? I admit it might be the way I process commands but I find myself skipping around the command line to get the syntax right. Another minor difference is that iblinkinfo and ibqueryerrors have default behaviors if run without any address. So it seems to make sense for the -G to be an option with a parameter. I will admit it can be made to work either way, and I can "fix" it if you like. 
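Coming back to the fallback point made earlier in this mail, the resolve-then-fall-back behaviour can be sketched roughly as follows. This is only an illustration: the resolve_fn type, choose_scan() and the two stub resolvers are hypothetical names, and the real tool calls ib_resolve_portid_str_via() rather than a callback:

```c
#include <stddef.h>
#include <stdio.h>

enum scan_mode { SCAN_PARTIAL, SCAN_FULL };

/* Hypothetical resolver type: returns >= 0 on success, < 0 on failure. */
typedef int (*resolve_fn)(const char *target);

/*
 * Pick a partial scan when the target resolves; otherwise warn and
 * fall back to a full fabric scan, so the tool still works (more
 * slowly) when the SA is down or not answering.
 */
static enum scan_mode choose_scan(resolve_fn resolve, const char *target)
{
	if (!target)
		return SCAN_FULL;	/* no address given: default behaviour */
	if (resolve(target) >= 0)
		return SCAN_PARTIAL;
	fprintf(stderr, "Failed to resolve %s; attempting full scan\n",
		target);
	return SCAN_FULL;
}

/* Stub resolvers used only to exercise the sketch. */
static int resolve_ok(const char *target)
{
	(void)target;
	return 0;
}

static int resolve_sa_down(const char *target)
{
	(void)target;
	return -1;
}
```

The design choice is that a failed lookup degrades to the slower path instead of aborting, which matters for a diagnostic tool that is mostly run when something is already broken.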
Ira > > Sasha > > > @@ -399,8 +430,9 @@ int main(int argc, char **argv) > > {"suppress-common", 'c', 0, NULL, > > "suppress some of the common counters"}, > > {"node-name-map", 1, 1, "", "node name map file"}, > > - {"switch", 'S', 1, "", > > - "query only (hex format)"}, > > + {"node-guid", 'G', 1, "", "query only "}, > > + {"", 'S', 1, "", > > + "Same as \"-G\" for backward compatibility"}, > > {"Direct", 'D', 1, "", > > "query only switch specified by "}, > > {"report-port", 'r', 0, NULL, > > @@ -408,7 +440,9 @@ int main(int argc, char **argv) > > {"GNDN", 'R', 0, NULL, > > "(This option is obsolete and does nothing)"}, > > {"data", 2, 0, NULL, "include the data counters in the output"}, > > - {"all", 3, 0, NULL, "output all nodes (not just switches)"}, > > + {"switch", 3, 0, NULL, "print data for switches only"}, > > + {"ca", 4, 0, NULL, "print data for CA's only"}, > > + {"router", 5, 0, NULL, "print data for routers only"}, > > {0} > > }; > > char usage_args[] = ""; > > @@ -438,13 +472,13 @@ int main(int argc, char **argv) > > NULL, ibmad_port)) < 0) > > IBWARN("Failed to resolve %s; attempting full scan\n", > > dr_path); > > - } else if (switch_guid_str) { > > + } else if (node_guid_str) { > > if ((resolved = > > - ib_resolve_portid_str_via(&portid, switch_guid_str, > > + ib_resolve_portid_str_via(&portid, node_guid_str, > > IB_DEST_GUID, ibd_sm_id, > > ibmad_port)) >= 0) > > IBWARN("Failed to resolve %s; attempting full scan\n", > > - switch_guid_str); > > + node_guid_str); > > } > > > > if (resolved >= 0) > > @@ -463,13 +497,13 @@ int main(int argc, char **argv) > > > > report_suppressed(); > > > > - if (switch_guid_str) { > > - ibnd_node_t *node = ibnd_find_node_guid(fabric, switch_guid); > > + if (node_guid_str) { > > + ibnd_node_t *node = ibnd_find_node_guid(fabric, node_guid); > > if (node) > > print_node(node, NULL); > > else > > fprintf(stderr, "Failed to find node: %s\n", > > - switch_guid_str); > > + node_guid_str); > > } else if (dr_path) { > > 
ibnd_node_t *node = ibnd_find_node_dr(fabric, dr_path); > > uint8_t ni[IB_SMP_DATA_SIZE]; > > @@ -477,9 +511,9 @@ int main(int argc, char **argv) > > if (!smp_query_via(ni, &portid, IB_ATTR_NODE_INFO, 0, > > ibd_timeout, ibmad_port)) > > return -1; > > - mad_decode_field(ni, IB_NODE_GUID_F, &(switch_guid)); > > + mad_decode_field(ni, IB_NODE_GUID_F, &(node_guid)); > > > > - node = ibnd_find_node_guid(fabric, switch_guid); > > + node = ibnd_find_node_guid(fabric, node_guid); > > if (node) > > print_node(node, NULL); > > else > > -- > > 1.5.4.5 > > -- Ira Weiny Math Programmer/Computer Scientist Lawrence Livermore National Lab 925-423-8008 weiny2 at llnl.gov From khris4 at gmail.com Tue Sep 29 10:56:39 2009 From: khris4 at gmail.com (Chris Andrews) Date: Tue, 29 Sep 2009 10:56:39 -0700 Subject: [ofa-general] Ib_iser error with OFED 1.5.1 and centos 5.3 Message-ID: <1254246999.4931.0.camel@chris-desktop> Can someone please give me a little insight on this issue? From weiny2 at llnl.gov Tue Sep 29 10:51:02 2009 From: weiny2 at llnl.gov (Ira Weiny) Date: Tue, 29 Sep 2009 10:51:02 -0700 Subject: [ofa-general] This list expires... tomorrow? In-Reply-To: References: <20090929162457.GC10188@opengridcomputing.com> Message-ID: <20090929105102.77b6e126.weiny2@llnl.gov> +1 Even though I have forgotten to email the new list once already. Have the old list spank me and I will learn. Ira On Tue, 29 Sep 2009 13:25:13 -0400 Jeff Squyres wrote: > It seems like we did that last time and people are still mailing to openib at openib.org > . > > Why not cut the cord completely? Perhaps setup an auto-responder > saying "your was discarded; all mail should be re-sent to linux- > rdma at ...etc." People will make the mistake of mailing to the old/non- > existent address once and then update their addressbooks. > > Plus, won't it be harder on the spam controls on the new list to > recognize a variety of old names for this list? > > ** My $0.02: out with the old, in with the new.
Otherwise we continue > a chain of really old/outdated email addresses and infrastructure. > > > > On Sep 29, 2009, at 12:24 PM, Jon Mason wrote: > > > On Tue, Sep 29, 2009 at 12:16:39PM -0400, Jeff Squyres wrote: > > > What happens to this list after tomorrow? (i.e., > > > general at lists.openfabrics.org) Will mails bounce? > > > > > > The intent is that all mails to the "general" list should be sent > > to the > > > linux-rdma list instead, right? > > > > Can we set up an auto forward on the e-mail server? > > > > > > > > -- > > > Jeff Squyres > > > jsquyres at cisco.com > > > > > -- > Jeff Squyres > jsquyres at cisco.com > -- Ira Weiny Math Programmer/Computer Scientist Lawrence Livermore National Lab 925-423-8008 weiny2 at llnl.gov From jgunthorpe at obsidianresearch.com Tue Sep 29 11:04:55 2009 From: jgunthorpe at obsidianresearch.com (Jason Gunthorpe) Date: Tue, 29 Sep 2009 12:04:55 -0600 Subject: [ofa-general] This list expires... tomorrow? In-Reply-To: References: <20090929162457.GC10188@opengridcomputing.com> Message-ID: <20090929180455.GZ19540@obsidianresearch.com> On Tue, Sep 29, 2009 at 01:25:13PM -0400, Jeff Squyres wrote: > Plus, won't it be harder on the spam controls on the new list to > recognize a variety of old names for this list? Many spam controls happen at the SMTP session level; forwarding messages will defeat that.
Jason

From Jeffrey.C.Becker at nasa.gov Tue Sep 29 12:06:36 2009
From: Jeffrey.C.Becker at nasa.gov (Jeff Becker)
Date: Tue, 29 Sep 2009 12:06:36 -0700
Subject: [ofa-general] This list expires... tomorrow?
In-Reply-To:
References:
Message-ID: <4AC25ABC.2000002@nasa.gov>

Hi all. I propose the following plan to "shutdown" the general list:

1) unsubscribe all current subscribers
2) set the list to discard any incoming messages with an auto-discard
message that points you to linux-rdma at vger.kernel.org

Please send comments/suggestions. Thanks.

-jeff

Jeff Squyres wrote:
> What happens to this list after tomorrow? (i.e., general at lists.openfabrics.org
> ) Will mails bounce?
>
> The intent is that all mails to the "general" list should be sent to
> the linux-rdma list instead, right?
>
>

From jsquyres at cisco.com Tue Sep 29 12:15:36 2009
From: jsquyres at cisco.com (Jeff Squyres)
Date: Tue, 29 Sep 2009 15:15:36 -0400
Subject: [ofa-general] This list expires... tomorrow?
In-Reply-To: <4AC25ABC.2000002@nasa.gov>
References: <4AC25ABC.2000002@nasa.gov>
Message-ID: <4645DED9-EE70-4642-AACB-2D0D1A9D7248@cisco.com>

+1

On Sep 29, 2009, at 3:06 PM, Jeff Becker wrote:
> Hi all. I propose the following plan to "shutdown" the general list:
>
> 1) unsubscribe all current subscribers
> 2) set the list to discard any incoming messages with an auto-discard
> message that points you to linux-rdma at vger.kernel.org
>
> Please send comments/suggestions. Thanks.
>
> -jeff
>
> Jeff Squyres wrote:
> > What happens to this list after tomorrow? (i.e., general at lists.openfabrics.org
> > ) Will mails bounce?
> >
> > The intent is that all mails to the "general" list should be sent to
> > the linux-rdma list instead, right?
> >
> >
>

--
Jeff Squyres
jsquyres at cisco.com

From sashak at voltaire.com Tue Sep 29 12:28:31 2009
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 29 Sep 2009 21:28:31 +0200
Subject: [ofa-general] This list expires... tomorrow?
In-Reply-To: <4AC25ABC.2000002@nasa.gov>
References: <4AC25ABC.2000002@nasa.gov>
Message-ID: <20090929192831.GD9465@me>

On 12:06 Tue 29 Sep , Jeff Becker wrote:
> Hi all. I propose the following plan to "shutdown" the general list:
>
> 1) unsubscribe all current subscribers
> 2) set the list to discard any incoming messages with an auto-discard
> message that points you to linux-rdma at vger.kernel.org

Seems like a good plan to me.

Sasha

From hal.rosenstock at gmail.com Tue Sep 29 12:31:11 2009
From: hal.rosenstock at gmail.com (Hal Rosenstock)
Date: Tue, 29 Sep 2009 15:31:11 -0400
Subject: [ofa-general] This list expires... tomorrow?
In-Reply-To: <4AC25ABC.2000002@nasa.gov>
References: <4AC25ABC.2000002@nasa.gov>
Message-ID:

On Tue, Sep 29, 2009 at 3:06 PM, Jeff Becker wrote:
> Hi all. I propose the following plan to "shutdown" the general list:
>
> 1) unsubscribe all current subscribers
> 2) set the list to discard any incoming messages with an auto-discard
> message that points you to linux-rdma at vger.kernel.org
>
> Please send comments/suggestions.

It's probably just me but I'm not ready yet. I haven't been able to
post a patch to linux-rdma yet :-(

-- Hal

> Thanks.
>
> -jeff
>
> Jeff Squyres wrote:
>> What happens to this list after tomorrow? (i.e., general at lists.openfabrics.org
>> ) Will mails bounce?
>>
>> The intent is that all mails to the "general" list should be sent to
>> the linux-rdma list instead, right?
>>
>>
>
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
>

From jon at opengridcomputing.com Tue Sep 29 12:35:00 2009
From: jon at opengridcomputing.com (Jon Mason)
Date: Tue, 29 Sep 2009 14:35:00 -0500
Subject: [ofa-general] This list expires... tomorrow?
In-Reply-To: <4AC25ABC.2000002@nasa.gov>
References: <4AC25ABC.2000002@nasa.gov>
Message-ID: <20090929193500.GD10188@opengridcomputing.com>

On Tue, Sep 29, 2009 at 12:06:36PM -0700, Jeff Becker wrote:
> Hi all. I propose the following plan to "shutdown" the general list:
>
> 1) unsubscribe all current subscribers
> 2) set the list to discard any incoming messages with an auto-discard
> message that points you to linux-rdma at vger.kernel.org
>
> Please send comments/suggestions. Thanks.

Works for me.

Thanks,
Jon

>
> -jeff
>
> Jeff Squyres wrote:
> > What happens to this list after tomorrow? (i.e., general at lists.openfabrics.org
> > ) Will mails bounce?
> >
> > The intent is that all mails to the "general" list should be sent to
> > the linux-rdma list instead, right?
> >
> >
>
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

From vlad at dev.mellanox.co.il Tue Sep 29 12:57:25 2009
From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky)
Date: Tue, 29 Sep 2009 21:57:25 +0200
Subject: [ofa-general] Re: general Digest, Vol 32, Issue 108
In-Reply-To: <0D401559-D16C-4DA9-85A3-B6851C1A1276@gmail.com>
References: <20090929050452.59AE1E621D7@openfabrics.org> <0D401559-D16C-4DA9-85A3-B6851C1A1276@gmail.com>
Message-ID: <4AC266A5.1030107@dev.mellanox.co.il>

Chris wrote:
>>
>> Hey Guys,
>>
>>
>> So after compiling and installing everything from OFED 1.5 I get this
>> issue when trying to start iscsi with ib_iser.
>>
>>
>> Linux localhost.localdomain 2.6.18-164.el5xen #1 SMP Thu Sep 3 04:03:03 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
>>
>> Starting iSCSI initiator service: FATAL: Error inserting ib_iser
>> (/lib/modules/2.6.18-164.el5xen/kernel/drivers/infiniband/ulp/iser/ib_iser.ko):
>> Unknown symbol in module, or unknown parameter (see dmesg)
>> [ OK ]
>> Setting up iSCSI targets: iscsiadm: No records found!
>> [ OK ]
>>
>> iscsi: registered transport (tcp)
>> ib_iser: disagrees about version of symbol ib_fmr_pool_unmap
>> ib_iser: Unknown symbol ib_fmr_pool_unmap
>> ib_iser: disagrees about version of symbol ib_create_cq
>> ib_iser: Unknown symbol ib_create_cq
>> ib_iser: disagrees about version of symbol rdma_resolve_addr
>> ib_iser: Unknown symbol rdma_resolve_addr
>> ib_iser: disagrees about version of symbol ib_create_fmr_pool
>> ib_iser: Unknown symbol ib_create_fmr_pool
>> ib_iser: disagrees about version of symbol ib_dereg_mr

Hi Chris,

OFED-1.5 supports ib_iser on kernel 2.6.30 only.
So, after the OFED-1.5 installation you have updated IB kernel modules under
/lib/modules/2.6.18-164.el5xen/update/kernel/drivers/infiniband/
except ib_iser (which comes with the RHEL installation).
This prevents you from loading the ib_iser module.

So, if you want to use ib_iser you should uninstall OFED-1.5 and install the
IB packages that come with RHEL.

Regards,
Vladimir

From Jeffrey.C.Becker at nasa.gov Tue Sep 29 13:05:31 2009
From: Jeffrey.C.Becker at nasa.gov (Jeff Becker)
Date: Tue, 29 Sep 2009 13:05:31 -0700
Subject: [ofa-general] This list expires... tomorrow?
In-Reply-To:
References: <4AC25ABC.2000002@nasa.gov>
Message-ID: <4AC2688B.4080305@nasa.gov>

Hi Hal

Hal Rosenstock wrote:
> On Tue, Sep 29, 2009 at 3:06 PM, Jeff Becker wrote:
>
>> Hi all.
I propose the following plan to "shutdown" the general list:
>>
>> 1) unsubscribe all current subscribers
>> 2) set the list to discard any incoming messages with an auto-discard
>> message that points you to linux-rdma at vger.kernel.org
>>
>> Please send comments/suggestions.
>>
>
> It's probably just me but I'm not ready yet. I haven't been able to
> post a patch to linux-rdma yet :-(
>

Please let me know when you are able to do this. Thanks.

-jeff

> -- Hal
>
>
>> Thanks.
>>
>> -jeff
>>
>> Jeff Squyres wrote:
>>
>>> What happens to this list after tomorrow? (i.e., general at lists.openfabrics.org
>>> ) Will mails bounce?
>>>
>>> The intent is that all mails to the "general" list should be sent to
>>> the linux-rdma list instead, right?
>>>
>>>
>>>
>> _______________________________________________
>> general mailing list
>> general at lists.openfabrics.org
>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>>
>> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
>>
>>

From rdr at iol.unh.edu Tue Sep 29 13:25:36 2009
From: rdr at iol.unh.edu (Robert D. Russell)
Date: Tue, 29 Sep 2009 16:25:36 -0400 (EDT)
Subject: [ofa-general] getting path to backport directory (fwd)
Message-ID:

A general question on getting the correct backport.

On my machine, if I do "uname -r" I get
    2.6.18-128.el5
If I do "cat /etc/redhat-release" I get
    CentOs release 5.3 (Final)
If I look in "/usr/src/ofa_kernel/kernel_addons/backport"
the subdirectory I need to use for the current kernel is:
    2.6.18-EL5.3

My question: Is there somewhere in the system where I can
find (or generate) the string "2.6.18-EL5.3"?

I want to put that in my scripts so they will automatically
pick it up whenever we change versions (as we just did when
going to Centos -- it used to be 2.6.18-EL5.2 in the RedHat
version we were running before). At present I have to edit
these scripts by hand, and that's a lousy way to do business.
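One possible way to generate that string, sketched in C under a loudly stated assumption: the rule below (kernel base version from `uname -r`, then "-EL", then the release number from /etc/redhat-release) is inferred only from the single example in this message, 2.6.18-128.el5 plus release 5.3 giving 2.6.18-EL5.3. It may not hold for other distributions or OFED releases. The function name backport_dir_name is hypothetical and is not part of OFED.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of OFED): build a backport directory name
 * such as "2.6.18-EL5.3" from
 *   krel   - kernel release as printed by `uname -r`, e.g. "2.6.18-128.el5"
 *   distro - contents of /etc/redhat-release, e.g. "CentOS release 5.3 (Final)"
 * Assumed rule: <kernel base version>-EL<distro release number>.
 * Returns 0 on success, -1 if the inputs do not match that pattern or the
 * result does not fit in out.
 */
int backport_dir_name(const char *krel, const char *distro,
                      char *out, size_t outlen)
{
    const char *dash, *rel;
    size_t baselen, rellen = 0;
    int n;

    assert(krel && distro && out);

    dash = strchr(krel, '-');          /* end of the base version */
    rel = strstr(distro, "release ");  /* start of the release field */
    if (!dash || dash == krel || !rel)
        return -1;
    baselen = (size_t)(dash - krel);

    rel += strlen("release ");
    /* the release number is the run of digits and dots after "release " */
    while (isdigit((unsigned char)rel[rellen]) || rel[rellen] == '.')
        rellen++;
    if (rellen == 0)
        return -1;

    n = snprintf(out, outlen, "%.*s-EL%.*s",
                 (int)baselen, krel, (int)rellen, rel);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

A caller would take krel from the release field filled in by uname(2) and distro from the first line of /etc/redhat-release, then check that the resulting directory actually exists under kernel_addons/backport before using it.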
Thanks,
Bob Russell

From cl at linux-foundation.org Tue Sep 29 13:25:30 2009
From: cl at linux-foundation.org (Christoph Lameter)
Date: Tue, 29 Sep 2009 16:25:30 -0400 (EDT)
Subject: [ofa-general] getting path to backport directory (fwd)
In-Reply-To:
References:
Message-ID:

On Tue, 29 Sep 2009, Robert D. Russell wrote:

> My question: Is there somewhere in the system where I can
> find (or generate) the string "2.6.18-EL5.3"?

Rebuild the kernel with another version string?

> I want to put that in my scripts so they will automatically
> pick it up whenever we change versions (as we just did when
> going to Centos -- it used to be 2.6.18-EL5.2 in the RedHat
> version we were running before). At present I have to edit
> these scripts by hand, and that's a lousy way to do business.

Create a new "uname" script that outputs what you want?

From rdreier at cisco.com Tue Sep 29 14:19:49 2009
From: rdreier at cisco.com (Roland Dreier)
Date: Tue, 29 Sep 2009 14:19:49 -0700
Subject: [ofa-general] This list expires... tomorrow?
In-Reply-To: <4AC25ABC.2000002@nasa.gov> (Jeff Becker's message of "Tue, 29 Sep 2009 12:06:36 -0700")
References: <4AC25ABC.2000002@nasa.gov>
Message-ID:

> Hi all. I propose the following plan to "shutdown" the general list:
>
> 1) unsubscribe all current subscribers
> 2) set the list to discard any incoming messages with an auto-discard
> message that points you to linux-rdma at vger.kernel.org

Sounds like a perfect plan to me... sorry for not pushing on this
sooner, thanks Jeff for following up.

- R.

From rdreier at cisco.com Tue Sep 29 14:20:27 2009
From: rdreier at cisco.com (Roland Dreier)
Date: Tue, 29 Sep 2009 14:20:27 -0700
Subject: [ofa-general] This list expires... tomorrow?
In-Reply-To: (Hal Rosenstock's message of "Tue, 29 Sep 2009 15:31:11 -0400")
References: <4AC25ABC.2000002@nasa.gov>
Message-ID:

> It's probably just me but I'm not ready yet. I haven't been able to
> post a patch to linux-rdma yet :-(

What is going wrong when you try?

- R.
From sashak at voltaire.com Tue Sep 29 14:53:38 2009 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 29 Sep 2009 23:53:38 +0200 Subject: [ofa-general] Re: [PATCH] ibsim/sim_cmd.c: Only relink port if remote port is currently linked In-Reply-To: <20090924193444.GA15377@comcast.net> References: <20090924193444.GA15377@comcast.net> Message-ID: <20090929215338.GA17846@me> Hi Hal, On 15:34 Thu 24 Sep , Hal Rosenstock wrote: > > When multiple switches are unlinked and then a switch is relinked, > it should behave like a cable pull or power down of switch so it > depends on the state of the remote peer port (as to linked or not). > This is not represented in the IB port/port physical state and is > additional state. I'm not sure that I understand what this patch tries to achieve - I cannot see any changes related to port physical state handling. I can only see that you try to prevent linking with previously unlinked ports, and it is not clear for me why. Could you explain? > > Signed-off-by: Hal Rosenstock > --- > > diff --git a/ibsim/sim.h b/ibsim/sim.h > index bf85875..52eb73b 100644 > --- a/ibsim/sim.h > +++ b/ibsim/sim.h > @@ -210,6 +211,7 @@ struct Port { > int remoteport; > Node *previous_remotenode; > int previous_remoteport; > + int unlinked; Do you really need this flag? Existence of non NULL previous_remotenode pointer should be good indication. > int errrate; > uint16_t errattr; > Node *node; > diff --git a/ibsim/sim_cmd.c b/ibsim/sim_cmd.c > index cb6e639..d27ab0f 100644 > --- a/ibsim/sim_cmd.c > +++ b/ibsim/sim_cmd.c > @@ -1,5 +1,6 @@ > /* > * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. > + * Copyright (c) 2009 HNR Consulting. All rights reserved. > * > * This file is part of ibsim. > * > @@ -146,12 +147,18 @@ static int do_link(FILE * f, char *line) > > rport = node_get_port(rnode, rportnum); > > + if (rport->unlinked) { > + lport->unlinked = 0; > + return -1; > + } > + Why? 
> if (link_ports(lport, rport) < 0) > return -fprintf(f, > "# can't link: local/remote port are already connected\n"); > > lport->previous_remotenode = NULL; > rport->previous_remotenode = NULL; > + lport->unlinked = 0; > > return 0; > } > @@ -194,7 +201,7 @@ static int do_relink(FILE * f, char *line) > numports++; // To make the for-loop below run up to last port > else > lportnum--; > - > + > if (lportnum >= 0) { > lport = ports + lnode->portsbase + lportnum; > > @@ -206,12 +213,18 @@ static int do_relink(FILE * f, char *line) > rport = node_get_port(lport->previous_remotenode, > lport->previous_remoteport); > > + if (rport->unlinked) { > + lport->unlinked = 0; > + return -1; > + } > + Why? > if (link_ports(lport, rport) < 0) > return -fprintf(f, > "# can't link: local/remote port are already connected\n"); > > lport->previous_remotenode = NULL; > rport->previous_remotenode = NULL; > + lport->unlinked = 0; > > return 1; > } > @@ -224,11 +237,17 @@ static int do_relink(FILE * f, char *line) > rport = node_get_port(lport->previous_remotenode, > lport->previous_remoteport); > > + if (rport->unlinked) { > + lport->unlinked = 0; > + continue; > + } > + Ditto. 
Sasha > if (link_ports(lport, rport) < 0) > continue; > > lport->previous_remotenode = NULL; > rport->previous_remotenode = NULL; > + lport->unlinked = 0; > > relinked++; > } > @@ -246,6 +265,7 @@ static void unlink_port(Node * lnode, Port * lport, Node * rnode, int rportnum) > lport->previous_remoteport = lport->remoteport; > rport->previous_remotenode = rport->remotenode; > rport->previous_remoteport = rport->remoteport; > + lport->unlinked = 1; > > lport->remotenode = rport->remotenode = 0; > lport->remoteport = rport->remoteport = 0; > @@ -406,6 +426,7 @@ static int do_unlink(FILE * f, char *line, int clear) > if (portnum >= 0) { > port = ports + node->portsbase + portnum; > if (!clear && !port->remotenode) { > + port->unlinked = 1; > fprintf(f, "# port %d at nodeid \"%s\" is not linked\n", > portnum, nodeid); > return -1; > @@ -420,8 +441,10 @@ static int do_unlink(FILE * f, char *line, int clear) > > for (port = ports + node->portsbase, e = port + numports; port < e; > port++) { > - if (!clear && !port->remotenode) > + if (!clear && !port->remotenode) { > + port->unlinked = 1; > continue; > + } > if (port->remotenode) > unlink_port(node, port, port->remotenode, > port->remoteport); > diff --git a/ibsim/sim_net.c b/ibsim/sim_net.c > index 8a5d281..0092068 100644 > --- a/ibsim/sim_net.c > +++ b/ibsim/sim_net.c > @@ -492,6 +492,7 @@ static void init_ports(Node * node, int type, int maxports) > port->linkwidth = LINKWIDTH_4x; > port->linkspeedena = netspeed; > port->linkspeed = LINKSPEED_SDR; > + port->unlinked = 0; > > size = (type == SWITCH_NODE && i) ? sw_pkey_size : ca_pkey_size; > if (size) { From hal.rosenstock at gmail.com Tue Sep 29 14:54:10 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Tue, 29 Sep 2009 17:54:10 -0400 Subject: [ofa-general] This list expires... tomorrow? In-Reply-To: References: <4AC25ABC.2000002@nasa.gov> Message-ID: On Tue, Sep 29, 2009 at 5:20 PM, Roland Dreier wrote: > > > It's probably just me but I'm not ready yet. 
I haven't been able to > > post a patch to linux-rdma yet :-( > > What is going wrong when you try? It disappears into the ether without any response. I can see it getting a status=Sent out of my SMTP relay to the linux-rdma list saying "Message accepted for delivery": Sep 29 06:53:53 hal sm-msp-queue[26670]: n8TAr2Ae026642: to=general at lists.openfabrics.org,linux-rdma at vger.linux.org,sashak at voltaire.com, ctladdr=hnrose (502/502), delay=00:00:51, xdelay=00:00:00, mailer=relay, pri=182536, relay=[127.0.0.1] [127.0.0.1], dsn=2.0.0, stat=Sent (n8TArlkX026673 Message accepted for delivery) Sep 29 15:24:19 hal sendmail[28326]: n8TJOIEP028326: from=hnrose, size=2481, class=0, nrcpts=1, msgid=<20090929192417.GA28293 at comcast.net>, relay=hnrose at localhost -- Hal > > - R. > From hal.rosenstock at gmail.com Tue Sep 29 15:05:12 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Tue, 29 Sep 2009 18:05:12 -0400 Subject: [ofa-general] Re: [PATCH] ibsim/sim_cmd.c: Only relink port if remote port is currently linked In-Reply-To: <20090929215338.GA17846@me> References: <20090924193444.GA15377@comcast.net> <20090929215338.GA17846@me> Message-ID: Hi Sasha, On Tue, Sep 29, 2009 at 5:53 PM, Sasha Khapyorsky wrote: > Hi Hal, > > On 15:34 Thu 24 Sep     , Hal Rosenstock wrote: >> >> When multiple switches are unlinked and then a switch is relinked, >> it should behave like a cable pull or power down of switch so it >> depends on the state of the remote peer port (as to linked or not). >> This is not represented in the IB port/port physical state and is >> additional state. > > I'm not sure that I understand what this patch tries to achieve - I > cannot see any changes related to port physical state handling. I can > only see that you try to prevent linking with previously unlinked ports, > and it is not clear for me why. Could you explain? The failure scenario is to unlink 2 connected switches and then relink the first one. 
It then relinks the second one even though it still should be unlinked. > >> >> Signed-off-by: Hal Rosenstock >> --- >> >> diff --git a/ibsim/sim.h b/ibsim/sim.h >> index bf85875..52eb73b 100644 >> --- a/ibsim/sim.h >> +++ b/ibsim/sim.h >> @@ -210,6 +211,7 @@ struct Port { >>       int remoteport; >>       Node *previous_remotenode; >>       int previous_remoteport; >> +     int unlinked; > > Do you really need this flag? Existence of non NULL previous_remotenode > pointer should be good indication. That's how I started (using previous_remotenode) but it didn't work correctly for all cases. It worked with the simple case above (unlink 2 switches and relink the first). It didn't work with a 3 switch case. -- Hal >>       int errrate; >>       uint16_t errattr; >>       Node *node; >> diff --git a/ibsim/sim_cmd.c b/ibsim/sim_cmd.c >> index cb6e639..d27ab0f 100644 >> --- a/ibsim/sim_cmd.c >> +++ b/ibsim/sim_cmd.c >> @@ -1,5 +1,6 @@ >>  /* >>   * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. >> + * Copyright (c) 2009 HNR Consulting. All rights reserved. >>   * >>   * This file is part of ibsim. >>   * >> @@ -146,12 +147,18 @@ static int do_link(FILE * f, char *line) >> >>       rport = node_get_port(rnode, rportnum); >> >> +     if (rport->unlinked) { >> +             lport->unlinked = 0; >> +             return -1; >> +     } >> + > > Why? 
> >>       if (link_ports(lport, rport) < 0) >>               return -fprintf(f, >>                               "# can't link: local/remote port are already connected\n"); >> >>       lport->previous_remotenode = NULL; >>       rport->previous_remotenode = NULL; >> +     lport->unlinked = 0; >> >>       return 0; >>  } >> @@ -194,7 +201,7 @@ static int do_relink(FILE * f, char *line) >>               numports++;     // To make the for-loop below run up to last port >>       else >>               lportnum--; >> - >> + >>       if (lportnum >= 0) { >>               lport = ports + lnode->portsbase + lportnum; >> >> @@ -206,12 +213,18 @@ static int do_relink(FILE * f, char *line) >>               rport = node_get_port(lport->previous_remotenode, >>                                     lport->previous_remoteport); >> >> +             if (rport->unlinked) { >> +                     lport->unlinked = 0; >> +                     return -1; >> +             } >> + > > Why? > >>               if (link_ports(lport, rport) < 0) >>                       return -fprintf(f, >>                                       "# can't link: local/remote port are already connected\n"); >> >>               lport->previous_remotenode = NULL; >>               rport->previous_remotenode = NULL; >> +             lport->unlinked = 0; >> >>               return 1; >>       } >> @@ -224,11 +237,17 @@ static int do_relink(FILE * f, char *line) >>               rport = node_get_port(lport->previous_remotenode, >>                                     lport->previous_remoteport); >> >> +             if (rport->unlinked) { >> +                     lport->unlinked = 0; >> +                     continue; >> +             } >> + > > Ditto. 
> > Sasha > >>               if (link_ports(lport, rport) < 0) >>                       continue; >> >>               lport->previous_remotenode = NULL; >>               rport->previous_remotenode = NULL; >> +             lport->unlinked = 0; >> >>               relinked++; >>       } >> @@ -246,6 +265,7 @@ static void unlink_port(Node * lnode, Port * lport, Node * rnode, int rportnum) >>       lport->previous_remoteport = lport->remoteport; >>       rport->previous_remotenode = rport->remotenode; >>       rport->previous_remoteport = rport->remoteport; >> +     lport->unlinked = 1; >> >>       lport->remotenode = rport->remotenode = 0; >>       lport->remoteport = rport->remoteport = 0; >> @@ -406,6 +426,7 @@ static int do_unlink(FILE * f, char *line, int clear) >>       if (portnum >= 0) { >>               port = ports + node->portsbase + portnum; >>               if (!clear && !port->remotenode) { >> +                     port->unlinked = 1; >>                       fprintf(f, "# port %d at nodeid \"%s\" is not linked\n", >>                               portnum, nodeid); >>                       return -1; >> @@ -420,8 +441,10 @@ static int do_unlink(FILE * f, char *line, int clear) >> >>       for (port = ports + node->portsbase, e = port + numports; port < e; >>            port++) { >> -             if (!clear && !port->remotenode) >> +             if (!clear && !port->remotenode) { >> +                     port->unlinked = 1; >>                       continue; >> +             } >>               if (port->remotenode) >>                       unlink_port(node, port, port->remotenode, >>                                   port->remoteport); >> diff --git a/ibsim/sim_net.c b/ibsim/sim_net.c >> index 8a5d281..0092068 100644 >> --- a/ibsim/sim_net.c >> +++ b/ibsim/sim_net.c >> @@ -492,6 +492,7 @@ static void init_ports(Node * node, int type, int maxports) >>               port->linkwidth = LINKWIDTH_4x; >>               port->linkspeedena = netspeed; >>            
   port->linkspeed = LINKSPEED_SDR; >> +             port->unlinked = 0; >> >>               size = (type == SWITCH_NODE && i) ? sw_pkey_size : ca_pkey_size; >>               if (size) { > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > From jaschut at sandia.gov Tue Sep 29 15:13:20 2009 From: jaschut at sandia.gov (Jim Schutt) Date: Tue, 29 Sep 2009 16:13:20 -0600 Subject: [ofa-general] Re: [PATCH] ibsim/sim_cmd.c: Only relink port if remote port is currently linked In-Reply-To: <20090929215338.GA17846@me> References: <20090924193444.GA15377@comcast.net> <20090929215338.GA17846@me> Message-ID: <1254262400.4776.1456.camel@sale659.sandia.gov> Hi Sasha, On Tue, 2009-09-29 at 15:53 -0600, Sasha Khapyorsky wrote: > Hi Hal, > > On 15:34 Thu 24 Sep , Hal Rosenstock wrote: > > > > When multiple switches are unlinked and then a switch is relinked, > > it should behave like a cable pull or power down of switch so it > > depends on the state of the remote peer port (as to linked or not). > > This is not represented in the IB port/port physical state and is > > additional state. > > I'm not sure that I understand what this patch tries to achieve - I > cannot see any changes related to port physical state handling. I can > only see that you try to prevent linking with previously unlinked ports, > and it is not clear for me why. Could you explain? This patch makes it possible to use ibsim to test failure scenarios where multiple switches fail or are powered off, then replaced or powered on, using the ibsim unlink/relink commands to simulate switch power-down/power-up. Without this patch ibsim doesn't behave like a real fabric under the conditions Hal described. 
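The power-down/power-up behavior discussed in this thread can be sketched as a small stand-alone model. This is only an illustration of the idea, not the ibsim code and not the patch itself: struct port, link_ports, unlink_node and relink_node are simplified, hypothetical stand-ins, peers are tracked by pointer rather than by node and port number, and the relink path clears the flag on every port of the node, which is a simplification of what the patch does.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for ibsim's struct Port; not the real definition. */
struct port {
    struct port *remote;          /* current peer, NULL when the link is down */
    struct port *previous_remote; /* peer remembered by the last unlink */
    int unlinked;                 /* 1 while this port's node is "powered off" */
};

/* Connect two ports; fails if either side already has a peer. */
int link_ports(struct port *l, struct port *r)
{
    assert(l != r);
    if (l->remote || r->remote)
        return -1;
    l->remote = r;
    r->remote = l;
    return 0;
}

/* Simulate powering off a node: drop every link and mark its ports. */
void unlink_node(struct port *ports, size_t nports)
{
    for (size_t i = 0; i < nports; i++) {
        struct port *l = &ports[i], *r = l->remote;

        l->unlinked = 1;
        if (!r)
            continue;   /* already down: only record the power-off */
        l->previous_remote = r;
        r->previous_remote = l;
        l->remote = r->remote = NULL;
    }
}

/* Simulate powering a node back on; returns the number of links restored. */
int relink_node(struct port *ports, size_t nports)
{
    int relinked = 0;

    for (size_t i = 0; i < nports; i++) {
        struct port *l = &ports[i], *r = l->previous_remote;

        l->unlinked = 0;    /* this node is powered on again */
        if (!r || r->unlinked)
            continue;       /* no remembered peer, or peer is still off */
        if (link_ports(l, r) < 0)
            continue;
        l->previous_remote = NULL;
        r->previous_remote = NULL;
        relinked++;
    }
    return relinked;
}
```

With two connected switches A and B, calling unlink_node on both and then relink_node on A alone leaves the A-B link down in this model, because B's port is still flagged; the link comes back only once relink_node is also called on B.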
-- Jim > > > > > Signed-off-by: Hal Rosenstock > > --- > > > > diff --git a/ibsim/sim.h b/ibsim/sim.h > > index bf85875..52eb73b 100644 > > --- a/ibsim/sim.h > > +++ b/ibsim/sim.h > > @@ -210,6 +211,7 @@ struct Port { > > int remoteport; > > Node *previous_remotenode; > > int previous_remoteport; > > + int unlinked; > > Do you really need this flag? Existence of non NULL previous_remotenode > pointer should be good indication. > > > int errrate; > > uint16_t errattr; > > Node *node; > > diff --git a/ibsim/sim_cmd.c b/ibsim/sim_cmd.c > > index cb6e639..d27ab0f 100644 > > --- a/ibsim/sim_cmd.c > > +++ b/ibsim/sim_cmd.c > > @@ -1,5 +1,6 @@ > > /* > > * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. > > + * Copyright (c) 2009 HNR Consulting. All rights reserved. > > * > > * This file is part of ibsim. > > * > > @@ -146,12 +147,18 @@ static int do_link(FILE * f, char *line) > > > > rport = node_get_port(rnode, rportnum); > > > > + if (rport->unlinked) { > > + lport->unlinked = 0; > > + return -1; > > + } > > + > > Why? > > > if (link_ports(lport, rport) < 0) > > return -fprintf(f, > > "# can't link: local/remote port are already connected\n"); > > > > lport->previous_remotenode = NULL; > > rport->previous_remotenode = NULL; > > + lport->unlinked = 0; > > > > return 0; > > } > > @@ -194,7 +201,7 @@ static int do_relink(FILE * f, char *line) > > numports++; // To make the for-loop below run up to last port > > else > > lportnum--; > > - > > + > > if (lportnum >= 0) { > > lport = ports + lnode->portsbase + lportnum; > > > > @@ -206,12 +213,18 @@ static int do_relink(FILE * f, char *line) > > rport = node_get_port(lport->previous_remotenode, > > lport->previous_remoteport); > > > > + if (rport->unlinked) { > > + lport->unlinked = 0; > > + return -1; > > + } > > + > > Why? 
> > > if (link_ports(lport, rport) < 0) > > return -fprintf(f, > > "# can't link: local/remote port are already connected\n"); > > > > lport->previous_remotenode = NULL; > > rport->previous_remotenode = NULL; > > + lport->unlinked = 0; > > > > return 1; > > } > > @@ -224,11 +237,17 @@ static int do_relink(FILE * f, char *line) > > rport = node_get_port(lport->previous_remotenode, > > lport->previous_remoteport); > > > > + if (rport->unlinked) { > > + lport->unlinked = 0; > > + continue; > > + } > > + > > Ditto. > > Sasha > > > if (link_ports(lport, rport) < 0) > > continue; > > > > lport->previous_remotenode = NULL; > > rport->previous_remotenode = NULL; > > + lport->unlinked = 0; > > > > relinked++; > > } > > @@ -246,6 +265,7 @@ static void unlink_port(Node * lnode, Port * lport, Node * rnode, int rportnum) > > lport->previous_remoteport = lport->remoteport; > > rport->previous_remotenode = rport->remotenode; > > rport->previous_remoteport = rport->remoteport; > > + lport->unlinked = 1; > > > > lport->remotenode = rport->remotenode = 0; > > lport->remoteport = rport->remoteport = 0; > > @@ -406,6 +426,7 @@ static int do_unlink(FILE * f, char *line, int clear) > > if (portnum >= 0) { > > port = ports + node->portsbase + portnum; > > if (!clear && !port->remotenode) { > > + port->unlinked = 1; > > fprintf(f, "# port %d at nodeid \"%s\" is not linked\n", > > portnum, nodeid); > > return -1; > > @@ -420,8 +441,10 @@ static int do_unlink(FILE * f, char *line, int clear) > > > > for (port = ports + node->portsbase, e = port + numports; port < e; > > port++) { > > - if (!clear && !port->remotenode) > > + if (!clear && !port->remotenode) { > > + port->unlinked = 1; > > continue; > > + } > > if (port->remotenode) > > unlink_port(node, port, port->remotenode, > > port->remoteport); > > diff --git a/ibsim/sim_net.c b/ibsim/sim_net.c > > index 8a5d281..0092068 100644 > > --- a/ibsim/sim_net.c > > +++ b/ibsim/sim_net.c > > @@ -492,6 +492,7 @@ static void init_ports(Node * 
node, int type, int maxports) > > port->linkwidth = LINKWIDTH_4x; > > port->linkspeedena = netspeed; > > port->linkspeed = LINKSPEED_SDR; > > + port->unlinked = 0; > > > > size = (type == SWITCH_NODE && i) ? sw_pkey_size : ca_pkey_size; > > if (size) { > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > From hal.rosenstock at gmail.com Tue Sep 29 15:22:54 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Tue, 29 Sep 2009 18:22:54 -0400 Subject: [ofa-general] Re: [PATCH] ibsim/sim_cmd.c: Only relink port if remote port is currently linked In-Reply-To: <20090929215338.GA17846@me> References: <20090924193444.GA15377@comcast.net> <20090929215338.GA17846@me> Message-ID: Hi again Sasha, In my previous post, I missed answering some of your (implied) questions. On Tue, Sep 29, 2009 at 5:53 PM, Sasha Khapyorsky wrote: > Hi Hal, > > On 15:34 Thu 24 Sep     , Hal Rosenstock wrote: >> >> When multiple switches are unlinked and then a switch is relinked, >> it should behave like a cable pull or power down of switch so it >> depends on the state of the remote peer port (as to linked or not). >> This is not represented in the IB port/port physical state and is >> additional state. > > I'm not sure that I understand what this patch tries to achieve - I > cannot see any changes related to port physical state handling. Right; that's because there is none. My point was that this condition (e.g. simulated power off switch) cannot be represented in IB port state or port physical state. > I can only see that you try to prevent linking with previously unlinked ports, Yes. -- Hal > and it is not clear for me why. Could you explain? 
> >> >> Signed-off-by: Hal Rosenstock >> --- >> >> diff --git a/ibsim/sim.h b/ibsim/sim.h >> index bf85875..52eb73b 100644 >> --- a/ibsim/sim.h >> +++ b/ibsim/sim.h >> @@ -210,6 +211,7 @@ struct Port { >>       int remoteport; >>       Node *previous_remotenode; >>       int previous_remoteport; >> +     int unlinked; > > Do you really need this flag? Existence of non NULL previous_remotenode > pointer should be good indication. > >>       int errrate; >>       uint16_t errattr; >>       Node *node; >> diff --git a/ibsim/sim_cmd.c b/ibsim/sim_cmd.c >> index cb6e639..d27ab0f 100644 >> --- a/ibsim/sim_cmd.c >> +++ b/ibsim/sim_cmd.c >> @@ -1,5 +1,6 @@ >>  /* >>   * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. >> + * Copyright (c) 2009 HNR Consulting. All rights reserved. >>   * >>   * This file is part of ibsim. >>   * >> @@ -146,12 +147,18 @@ static int do_link(FILE * f, char *line) >> >>       rport = node_get_port(rnode, rportnum); >> >> +     if (rport->unlinked) { >> +             lport->unlinked = 0; >> +             return -1; >> +     } >> + > > Why? 
> >>       if (link_ports(lport, rport) < 0) >>               return -fprintf(f, >>                               "# can't link: local/remote port are already connected\n"); >> >>       lport->previous_remotenode = NULL; >>       rport->previous_remotenode = NULL; >> +     lport->unlinked = 0; >> >>       return 0; >>  } >> @@ -194,7 +201,7 @@ static int do_relink(FILE * f, char *line) >>               numports++;     // To make the for-loop below run up to last port >>       else >>               lportnum--; >> - >> + >>       if (lportnum >= 0) { >>               lport = ports + lnode->portsbase + lportnum; >> >> @@ -206,12 +213,18 @@ static int do_relink(FILE * f, char *line) >>               rport = node_get_port(lport->previous_remotenode, >>                                     lport->previous_remoteport); >> >> +             if (rport->unlinked) { >> +                     lport->unlinked = 0; >> +                     return -1; >> +             } >> + > > Why? > >>               if (link_ports(lport, rport) < 0) >>                       return -fprintf(f, >>                                       "# can't link: local/remote port are already connected\n"); >> >>               lport->previous_remotenode = NULL; >>               rport->previous_remotenode = NULL; >> +             lport->unlinked = 0; >> >>               return 1; >>       } >> @@ -224,11 +237,17 @@ static int do_relink(FILE * f, char *line) >>               rport = node_get_port(lport->previous_remotenode, >>                                     lport->previous_remoteport); >> >> +             if (rport->unlinked) { >> +                     lport->unlinked = 0; >> +                     continue; >> +             } >> + > > Ditto. 
> > Sasha > >>               if (link_ports(lport, rport) < 0) >>                       continue; >> >>               lport->previous_remotenode = NULL; >>               rport->previous_remotenode = NULL; >> +             lport->unlinked = 0; >> >>               relinked++; >>       } >> @@ -246,6 +265,7 @@ static void unlink_port(Node * lnode, Port * lport, Node * rnode, int rportnum) >>       lport->previous_remoteport = lport->remoteport; >>       rport->previous_remotenode = rport->remotenode; >>       rport->previous_remoteport = rport->remoteport; >> +     lport->unlinked = 1; >> >>       lport->remotenode = rport->remotenode = 0; >>       lport->remoteport = rport->remoteport = 0; >> @@ -406,6 +426,7 @@ static int do_unlink(FILE * f, char *line, int clear) >>       if (portnum >= 0) { >>               port = ports + node->portsbase + portnum; >>               if (!clear && !port->remotenode) { >> +                     port->unlinked = 1; >>                       fprintf(f, "# port %d at nodeid \"%s\" is not linked\n", >>                               portnum, nodeid); >>                       return -1; >> @@ -420,8 +441,10 @@ static int do_unlink(FILE * f, char *line, int clear) >> >>       for (port = ports + node->portsbase, e = port + numports; port < e; >>            port++) { >> -             if (!clear && !port->remotenode) >> +             if (!clear && !port->remotenode) { >> +                     port->unlinked = 1; >>                       continue; >> +             } >>               if (port->remotenode) >>                       unlink_port(node, port, port->remotenode, >>                                   port->remoteport); >> diff --git a/ibsim/sim_net.c b/ibsim/sim_net.c >> index 8a5d281..0092068 100644 >> --- a/ibsim/sim_net.c >> +++ b/ibsim/sim_net.c >> @@ -492,6 +492,7 @@ static void init_ports(Node * node, int type, int maxports) >>               port->linkwidth = LINKWIDTH_4x; >>               port->linkspeedena = netspeed; >>            
   port->linkspeed = LINKSPEED_SDR; >> +             port->unlinked = 0; >> >>               size = (type == SWITCH_NODE && i) ? sw_pkey_size : ca_pkey_size; >>               if (size) { > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > From hal.rosenstock at gmail.com Tue Sep 29 15:25:25 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Tue, 29 Sep 2009 18:25:25 -0400 Subject: [ofa-general] This list expires... tomorrow? In-Reply-To: <4AC25ABC.2000002@nasa.gov> References: <4AC25ABC.2000002@nasa.gov> Message-ID: On Tue, Sep 29, 2009 at 3:06 PM, Jeff Becker wrote: > Hi all. I propose the following plan to "shutdown" the general list: > > 1) unsubscribe all current subscribers > 2) set the list to discard any incoming messages with an auto-discard > message that points you to linux-rdma at vger.kernel.org > > Please send comments/suggestions. Care should be taken on any patches not cross posted (to linux-rdma) once the cutover takes place. There are quite a number of outstanding patches on general only. -- Hal >Thanks. > > -jeff > > Jeff Squyres wrote: >> What happens to this list after tomorrow?  (i.e., general at lists.openfabrics.org >> )  Will mails bounce? >> >> The intent is that all mails to the "general" list should be sent to >> the linux-rdma list instead, right? >> >> > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > From weiny2 at llnl.gov Tue Sep 29 16:48:42 2009 From: weiny2 at llnl.gov (Ira Weiny) Date: Tue, 29 Sep 2009 16:48:42 -0700 Subject: [ofa-general] [PATCH] infiniband-diags: Fix IB network discovery from switch node. 
In-Reply-To: <4AC232D5.2060806@gmail.com> References: <4A9548AA.4020900@gmail.com> <20090923172451.fb20ab9b.weiny2@llnl.gov> <4AC232D5.2060806@gmail.com> Message-ID: <20090929164842.c1ab7d06.weiny2@llnl.gov> On Tue, 29 Sep 2009 18:16:21 +0200 "Eli Dorfman (Voltaire)" wrote: > Ira Weiny wrote: > > Eli, > > > > On Wed, 26 Aug 2009 17:37:30 +0300 > > "Eli Dorfman (Voltaire)" wrote: > > > >> Subject: [PATCH] Fix IB network discovery from switch node. > > > > Sorry for the late inquiry on this but what exactly was the bug here? > > Sorry for the late response. > The problem is related to wrong discovery when running from the switch. > Without the patch ibnetdiscover finds only local switch Ok I see. [snip] > > I think that the problem is related to NodeInfo:LocalPort which is 0 in case of a switch. > I see that get_remote_node() sends direct route MAD to switch with path 0,0 and that fails (at least for Mellanox IS4 switch chips). > Another way to bypass this may be as follows: > > diff --git a/infiniband-diags/libibnetdisc/src/ibnetdisc.c b/infiniband-diags/libibnetdisc/src/ibnetdisc.c > index 1e93ff8..3dd0dc6 100644 > --- a/infiniband-diags/libibnetdisc/src/ibnetdisc.c > +++ b/infiniband-diags/libibnetdisc/src/ibnetdisc.c > @@ -461,7 +461,7 @@ get_remote_node(struct ibnd_fabric *fabric, struct ibnd_node *node, struct ibnd_ > != IB_PORT_PHYS_STATE_LINKUP) > return -1; > > - if (extend_dpath(fabric, path, portnum) < 0) > + if (portnum > 0 && extend_dpath(fabric, path, portnum) < 0) > return -1; > > if (query_node(fabric, &node_buf, &port_buf, path)) { > > > Please check whether this is OK and I can send a new patch. > This seems to fix my issue. Here is a patch against master which works for me. If you want to verify that would be great. Thanks for helping me out, Ira From: Ira Weiny Date: Tue, 22 Sep 2009 11:08:28 -0700 Subject: [PATCH] infiniband-diags/libibnetdisc/src/ibnetdisc.c: fix bug in single node processing. 
Eli fixed an issue with running ibnetdiscover from a switch but it introduced a bug in processing a single switch: 17:19:42 > ./iblinkinfo -S 0x000b8cffff00490c Switch 0x000b8cffff00490c MT47396 Infiniscale-III Mellanox Technologies: ... 8 11[ ] ==( 4X 2.5 Gbps Down/ Polling)==> [ ] "" ( ) 8 12[ ] ==( 4X 5.0 Gbps Active/ LinkUp)==> [ ] "" ( ) 8 13[ ] ==( 4X 2.5 Gbps Down/ Polling)==> [ ] "" ( ) ... The port we "come in on" when discovering the switch is not reported properly. This patch, suggested by Eli, reverses Eli's patch and fixes his original bug in a way which does not introduce the above issue. Signed-off-by: Ira Weiny --- infiniband-diags/libibnetdisc/src/ibnetdisc.c | 18 ++++++++---------- 1 files changed, 8 insertions(+), 10 deletions(-) diff --git a/infiniband-diags/libibnetdisc/src/ibnetdisc.c b/infiniband-diags/libibnetdisc/src/ibnetdisc.c index 97e369c..96f72c5 100644 --- a/infiniband-diags/libibnetdisc/src/ibnetdisc.c +++ b/infiniband-diags/libibnetdisc/src/ibnetdisc.c @@ -506,7 +506,7 @@ static int get_remote_node(struct ibmad_port *ibmad_port, != IB_PORT_PHYS_STATE_LINKUP) return 1; /* positive == non-fatal error */ - if (extend_dpath(ibmad_port, fabric, path, portnum) < 0) + if (portnum > 0 && extend_dpath(ibmad_port, fabric, path, portnum) < 0) return -1; if (query_node(ibmad_port, fabric, &node_buf, &port_buf, path)) { @@ -600,15 +600,13 @@ ibnd_fabric_t *ibnd_discover_fabric(struct ibmad_port * ibmad_port, if (!port) goto error; - if (node->type != IB_NODE_SWITCH) { - rc = get_remote_node(ibmad_port, fabric, node, port, from, - mad_get_field(node->info, 0, - IB_NODE_LOCAL_PORT_F), 0); - if (rc < 0) - goto error; - if (rc > 0) /* non-fatal error, nothing more to be done */ - return ((ibnd_fabric_t *) fabric); - } + rc = get_remote_node(ibmad_port, fabric, node, port, from, + mad_get_field(node->info, 0, + IB_NODE_LOCAL_PORT_F), 0); + if (rc < 0) + goto error; + if (rc > 0) /* non-fatal error, nothing more to be done */ + return ((ibnd_fabric_t 
*) fabric); for (dist = 0; dist <= max_hops; dist++) { -- 1.5.4.5 From Jeffrey.C.Becker at nasa.gov Tue Sep 29 17:40:08 2009 From: Jeffrey.C.Becker at nasa.gov (Jeff Becker) Date: Tue, 29 Sep 2009 17:40:08 -0700 Subject: [ofa-general] This list expires... tomorrow? In-Reply-To: References: <4AC25ABC.2000002@nasa.gov> Message-ID: <4AC2A8E8.5040404@nasa.gov> Hal Rosenstock wrote: > On Tue, Sep 29, 2009 at 3:06 PM, Jeff Becker wrote: > >> Hi all. I propose the following plan to "shutdown" the general list: >> >> 1) unsubscribe all current subscribers >> 2) set the list to discard any incoming messages with an auto-discard >> message that points you to linux-rdma at vger.kernel.org >> >> Please send comments/suggestions. >> > > Care should be taken on any patches not cross posted (to linux-rdma) > once the cutover takes place. There are quite a number of outstanding > patches on general only. > From tomorrow on, the general list will continue to exist with searchable archives, but no new messages will be accepted. People who try to send to general will be told to send to linux-rdma at vger.kernel.org instead. If someone posted a patch to general before the switch and it hasn't been accepted, they can repost to the new list. Hope this works for everyone. Thanks -jeff -jeff > -- Hal > > >> Thanks. >> >> -jeff >> >> Jeff Squyres wrote: >> >>> What happens to this list after tomorrow? (i.e., general at lists.openfabrics.org >>> ) Will mails bounce? >>> >>> The intent is that all mails to the "general" list should be sent to >>> the linux-rdma list instead, right? 
>>> From or.gerlitz at gmail.com Tue Sep 29 21:04:02 2009 From: or.gerlitz at gmail.com (Or Gerlitz) Date: Wed, 30 Sep 2009 06:04:02 +0200 Subject: [ofa-general] This list expires... tomorrow? In-Reply-To: <4AC2A8E8.5040404@nasa.gov> References: <4AC25ABC.2000002@nasa.gov> <4AC2A8E8.5040404@nasa.gov> Message-ID: <15ddcffd0909292104g3490cc9bwb3543a7f92d010ba@mail.gmail.com> On Wed, Sep 30, 2009 at 2:40 AM, Jeff Becker wrote: > From tomorrow on, the general list will continue to exist with > searchable archives, but no new messages will be accepted. People who > try to send to general will be told to send to > linux-rdma at vger.kernel.org instead. If someone posted a patch to general > before the switch and it hasn't been accepted, they can repost to the > new list. Hope this works for everyone. Thanks sounds perfect to me, thanks for driving this Jeff and Roland. Or From tziporet at dev.mellanox.co.il Tue Sep 29 21:41:41 2009 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Wed, 30 Sep 2009 06:41:41 +0200 Subject: [ofa-general] help install ofed 1.4 on Centos 5.2 In-Reply-To: <20090928171429.GW19540@obsidianresearch.com> References: <9e14a1260909271342s5d5e34bbt851297a2fedcd988@mail.gmail.com> <4AC068CC.1050604@dev.mellanox.co.il> <1254147308.17199.50.camel@pc.interlinx.bc.ca> <20090928171429.GW19540@obsidianresearch.com> Message-ID: <4AC2E185.7070805@mellanox.co.il> Jason Gunthorpe wrote: > On Mon, Sep 28, 2009 at 10:15:08AM -0400, Brian J. Murrell wrote: > > >> This is a problem we run into with Lustre somewhat frequently. >> >> The issue is that deploying OFED 1.5 (i.e. 
beta software) in a >> production environment is completely unacceptable, yet leaving one's >> systems open to kernel vulnerabilities is equally unacceptable. >> > > Why aren't you just using the IB support directly in RH 5.4? > > I agree that best here is to take OFED coming from the distro. Tziporet > > From dorfman.eli at gmail.com Wed Sep 30 01:33:55 2009 From: dorfman.eli at gmail.com (Eli Dorfman (Voltaire)) Date: Wed, 30 Sep 2009 10:33:55 +0200 Subject: [ofa-general] [PATCH] infiniband-diags: Fix IB network discovery from switch node. In-Reply-To: <20090929164842.c1ab7d06.weiny2@llnl.gov> References: <4A9548AA.4020900@gmail.com> <20090923172451.fb20ab9b.weiny2@llnl.gov> <4AC232D5.2060806@gmail.com> <20090929164842.c1ab7d06.weiny2@llnl.gov> Message-ID: <4AC317F3.50304@gmail.com> Ira Weiny wrote: > On Tue, 29 Sep 2009 18:16:21 +0200 > "Eli Dorfman (Voltaire)" wrote: > >> Ira Weiny wrote: >>> Eli, >>> >>> On Wed, 26 Aug 2009 17:37:30 +0300 >>> "Eli Dorfman (Voltaire)" wrote: >>> >>>> Subject: [PATCH] Fix IB network discovery from switch node. >>> Sorry for the late inquiry on this but what exactly was the bug here? >> Sorry for the late response. >> The problem is related to wrong discovery when running from the switch. >> Without the patch ibnetdiscover finds only local switch > > Ok I see. > > [snip] > >> I think that the problem is related to NodeInfo:LocalPort which is 0 in case of a switch. >> I see that get_remote_node() sends direct route MAD to switch with path 0,0 and that fails (at least for Mellanox IS4 switch chips). 
>> Another way to bypass this may be as follows: >> >> diff --git a/infiniband-diags/libibnetdisc/src/ibnetdisc.c b/infiniband-diags/libibnetdisc/src/ibnetdisc.c >> index 1e93ff8..3dd0dc6 100644 >> --- a/infiniband-diags/libibnetdisc/src/ibnetdisc.c >> +++ b/infiniband-diags/libibnetdisc/src/ibnetdisc.c >> @@ -461,7 +461,7 @@ get_remote_node(struct ibnd_fabric *fabric, struct ibnd_node *node, struct ibnd_ >> != IB_PORT_PHYS_STATE_LINKUP) >> return -1; >> >> - if (extend_dpath(fabric, path, portnum) < 0) >> + if (portnum > 0 && extend_dpath(fabric, path, portnum) < 0) >> return -1; >> >> if (query_node(fabric, &node_buf, &port_buf, path)) { >> >> >> Please check whether this is OK and I can send a new patch. >> > > This seems to fix my issue. Here is a patch against master which works for > me. If you want to verify that would be great. Verified this again and it works. Sasha, please apply this patch. Thanks, Eli > > Thanks for helping me out, > Ira > > From: Ira Weiny > Date: Tue, 22 Sep 2009 11:08:28 -0700 > Subject: [PATCH] infiniband-diags/libibnetdisc/src/ibnetdisc.c: fix bug in single node processing. > > Eli fixed an issue with running ibnetdiscover from a switch but it > introduced a bug in processing a single switch: > > 17:19:42 > ./iblinkinfo -S 0x000b8cffff00490c > Switch 0x000b8cffff00490c MT47396 Infiniscale-III Mellanox Technologies: > ... > 8 11[ ] ==( 4X 2.5 Gbps Down/ Polling)==> [ ] "" ( ) > 8 12[ ] ==( 4X 5.0 Gbps Active/ LinkUp)==> [ ] "" ( ) > 8 13[ ] ==( 4X 2.5 Gbps Down/ Polling)==> [ ] "" ( ) > ... > > The port we "come in on" when discovering the switch is not reported properly. > > This patch, suggested by Eli, reverses Eli's patch and fixes his original > bug in a way which does not introduce the above issue. 
> > Signed-off-by: Ira Weiny > --- > infiniband-diags/libibnetdisc/src/ibnetdisc.c | 18 ++++++++---------- > 1 files changed, 8 insertions(+), 10 deletions(-) > > diff --git a/infiniband-diags/libibnetdisc/src/ibnetdisc.c b/infiniband-diags/libibnetdisc/src/ibnetdisc.c > index 97e369c..96f72c5 100644 > --- a/infiniband-diags/libibnetdisc/src/ibnetdisc.c > +++ b/infiniband-diags/libibnetdisc/src/ibnetdisc.c > @@ -506,7 +506,7 @@ static int get_remote_node(struct ibmad_port *ibmad_port, > != IB_PORT_PHYS_STATE_LINKUP) > return 1; /* positive == non-fatal error */ > > - if (extend_dpath(ibmad_port, fabric, path, portnum) < 0) > + if (portnum > 0 && extend_dpath(ibmad_port, fabric, path, portnum) < 0) > return -1; > > if (query_node(ibmad_port, fabric, &node_buf, &port_buf, path)) { > @@ -600,15 +600,13 @@ ibnd_fabric_t *ibnd_discover_fabric(struct ibmad_port * ibmad_port, > if (!port) > goto error; > > - if (node->type != IB_NODE_SWITCH) { > - rc = get_remote_node(ibmad_port, fabric, node, port, from, > - mad_get_field(node->info, 0, > - IB_NODE_LOCAL_PORT_F), 0); > - if (rc < 0) > - goto error; > - if (rc > 0) /* non-fatal error, nothing more to be done */ > - return ((ibnd_fabric_t *) fabric); > - } > + rc = get_remote_node(ibmad_port, fabric, node, port, from, > + mad_get_field(node->info, 0, > + IB_NODE_LOCAL_PORT_F), 0); > + if (rc < 0) > + goto error; > + if (rc > 0) /* non-fatal error, nothing more to be done */ > + return ((ibnd_fabric_t *) fabric); > > for (dist = 0; dist <= max_hops; dist++) { > From eli at mellanox.co.il Wed Sep 30 02:07:01 2009 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 30 Sep 2009 11:07:01 +0200 Subject: [ofa-general] [PATCH] mlx4: remove limitation on LSO header size Message-ID: <20090930090701.GA2385@mtls03> Current code has a limitation as for the size of an LSO header not allowed to cross a 64 byte boundary. This patch removes this limitation by setting the WQE RR for large headers thus allowing LSO headers of any size. 
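The 64-byte check that this patch turns from a hard error into a flag can be illustrated with a small standalone sketch. It only mirrors the arithmetic of `build_lso_seg()` as shown in the diff; `LSO_SEG_FIXED`, `lso_halign()` and `lso_needs_blh()` are hypothetical names, and the 4-byte fixed part of the LSO segment is an assumption made for illustration, not something stated in this thread.

```c
#include <stddef.h>

/* Round x up to the next multiple of a (a must be a power of two). */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((size_t)(a) - 1))

/*
 * Hypothetical stand-in for "sizeof *wqe" in build_lso_seg():
 * the fixed part of the LSO segment, ASSUMED to be 4 bytes here.
 */
enum { LSO_SEG_FIXED = 4 };

/* Space taken by the LSO segment plus header, in 16-byte units. */
static size_t lso_halign(size_t hlen)
{
	return ALIGN_UP(LSO_SEG_FIXED + hlen, 16);
}

/*
 * Nonzero when the aligned header extends past 64 bytes. Before the
 * patch this case returned -EINVAL; with the patch it sets *blh.
 */
static int lso_needs_blh(size_t hlen)
{
	return lso_halign(hlen) > 64;
}
```

Under the assumed 4-byte fixed part, a 60-byte header still fits in 64 bytes, while a 61-byte header rounds up to 80 bytes and would need the flag.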
The extra buffer reserved for MLX4_IB_QP_LSO QPs has been doubled, from 64 to 128 bytes, assuming this is a reasonable upper limit on header length. Also, this patch will cause IB_DEVICE_UD_TSO to be set only for FW versions that set MLX4_DEV_CAP_FLAG_BLH; e.g. FW version 2.6.000 and higher. Signed-off-by: Eli Cohen --- drivers/infiniband/hw/mlx4/main.c | 2 +- drivers/infiniband/hw/mlx4/qp.c | 17 +++++++---------- include/linux/mlx4/device.h | 1 + 3 files changed, 9 insertions(+), 11 deletions(-) diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c index 3cb3f47..e596537 100644 --- a/drivers/infiniband/hw/mlx4/main.c +++ b/drivers/infiniband/hw/mlx4/main.c @@ -103,7 +103,7 @@ static int mlx4_ib_query_device(struct ib_device *ibdev, props->device_cap_flags |= IB_DEVICE_UD_AV_PORT_ENFORCE; if (dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_IPOIB_CSUM) props->device_cap_flags |= IB_DEVICE_UD_IP_CSUM; - if (dev->dev->caps.max_gso_sz) + if (dev->dev->caps.max_gso_sz && dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_BLH) props->device_cap_flags |= IB_DEVICE_UD_TSO; if (dev->dev->caps.bmme_flags & MLX4_BMME_FLAG_RESERVED_LKEY) props->device_cap_flags |= IB_DEVICE_LOCAL_DMA_LKEY; diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c index 219b103..1b356cf 100644 --- a/drivers/infiniband/hw/mlx4/qp.c +++ b/drivers/infiniband/hw/mlx4/qp.c @@ -261,7 +261,7 @@ static int send_wqe_overhead(enum ib_qp_type type, u32 flags) case IB_QPT_UD: return sizeof (struct mlx4_wqe_ctrl_seg) + sizeof (struct mlx4_wqe_datagram_seg) + - ((flags & MLX4_IB_QP_LSO) ? 64 : 0); + ((flags & MLX4_IB_QP_LSO) ? 
128 : 0); case IB_QPT_UC: return sizeof (struct mlx4_wqe_ctrl_seg) + sizeof (struct mlx4_wqe_raddr_seg); @@ -1467,16 +1467,11 @@ static void __set_data_seg(struct mlx4_wqe_data_seg *dseg, struct ib_sge *sg) static int build_lso_seg(struct mlx4_wqe_lso_seg *wqe, struct ib_send_wr *wr, struct mlx4_ib_qp *qp, unsigned *lso_seg_len, - __be32 *lso_hdr_sz) + __be32 *lso_hdr_sz, int *blh) { unsigned halign = ALIGN(sizeof *wqe + wr->wr.ud.hlen, 16); - /* - * This is a temporary limitation and will be removed in - * a forthcoming FW release: - */ - if (unlikely(halign > 64)) - return -EINVAL; + *blh = unlikely(halign > 64) ? 1 : 0; if (unlikely(!(qp->flags & MLX4_IB_QP_LSO) && wr->num_sge > qp->sq.max_gs - (halign >> 4))) @@ -1523,6 +1518,7 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, __be32 *lso_wqe; __be32 uninitialized_var(lso_hdr_sz); int i; + int blh = 0; spin_lock_irqsave(&qp->sq.lock, flags); @@ -1616,7 +1612,7 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, size += sizeof (struct mlx4_wqe_datagram_seg) / 16; if (wr->opcode == IB_WR_LSO) { - err = build_lso_seg(wqe, wr, qp, &seglen, &lso_hdr_sz); + err = build_lso_seg(wqe, wr, qp, &seglen, &lso_hdr_sz, &blh); if (unlikely(err)) { *bad_wr = wr; goto out; @@ -1687,7 +1683,8 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, } ctrl->owner_opcode = mlx4_ib_opcode[wr->opcode] | - (ind & qp->sq.wqe_cnt ? cpu_to_be32(1 << 31) : 0); + (ind & qp->sq.wqe_cnt ? cpu_to_be32(1 << 31) : 0) | + (blh ? 
cpu_to_be32(1 << 6) : 0); stamp = ind + qp->sq_spare_wqes; ind += DIV_ROUND_UP(size * 16, 1U << qp->sq.wqe_shift); diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h index ce7cc6c..e92d1bf 100644 --- a/include/linux/mlx4/device.h +++ b/include/linux/mlx4/device.h @@ -61,6 +61,7 @@ enum { MLX4_DEV_CAP_FLAG_BAD_PKEY_CNTR = 1 << 8, MLX4_DEV_CAP_FLAG_BAD_QKEY_CNTR = 1 << 9, MLX4_DEV_CAP_FLAG_DPDP = 1 << 12, + MLX4_DEV_CAP_FLAG_BLH = 1 << 15, MLX4_DEV_CAP_FLAG_MEM_WINDOW = 1 << 16, MLX4_DEV_CAP_FLAG_APM = 1 << 17, MLX4_DEV_CAP_FLAG_ATOMIC = 1 << 18, -- 1.6.4.3 From vlad at lists.openfabrics.org Wed Sep 30 03:13:35 2009 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Wed, 30 Sep 2009 03:13:35 -0700 (PDT) Subject: [ofa-general] ofa_1_5_kernel 20090930-0200 daily build status Message-ID: <20090930101335.AEFA8E62410@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git git_branch: ofed_kernel_1_5 Common build parameters: Passed: Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.26 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.27 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18-164.el5 Passed on x86_64 with linux-2.6.18-128.el5 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.27 Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on x86_64 with linux-2.6.9-78.ELsmp Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.19 Passed on ia64 with 
linux-2.6.24 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.26 Passed on ia64 with linux-2.6.25 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Failed: From hal.rosenstock at gmail.com Wed Sep 30 06:40:28 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Wed, 30 Sep 2009 09:40:28 -0400 Subject: [ofa-general] [PATCH 2/2 v4] opensm: Compression of multicast group according to pkey In-Reply-To: <4AC21184.4010103@Voltaire.COM> References: <4AC21184.4010103@Voltaire.COM> Message-ID: On Tue, Sep 29, 2009 at 9:54 AM, Slava Strebkov wrote: > Additional data structure added: > 1. Map of all partition keys opened in the fabric. > 2. Map of all multicast group boxes shared same pkey. > MLID assignment for multicast groups works in a usual manner, > allocating free entry for newly created group. > Proposed compression algorithm starts working when there are no more > free entries in the mlid array. List of MLIDs for new multicast group > will be chosen from the pkey indexed map according to the requested > pkey. MLID which shares minimum number of ports will be given to newly > created multicast group. Other suitability criteria aside from minimum number of ports (which is debatable), are MTU and rate matching. Are MTU and rate also checked (in addition to pkey) ? If not, IMO these checks should be added. -- Hal > Signed-off-by: Slava Strebkov From hal.rosenstock at gmail.com Wed Sep 30 06:44:12 2009 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Wed, 30 Sep 2009 09:44:12 -0400 Subject: [ofa-general] This list expires... tomorrow? In-Reply-To: <4AC2A8E8.5040404@nasa.gov> References: <4AC25ABC.2000002@nasa.gov> <4AC2A8E8.5040404@nasa.gov> Message-ID: On Tue, Sep 29, 2009 at 8:40 PM, Jeff Becker wrote: > Hal Rosenstock wrote: >> >> On Tue, Sep 29, 2009 at 3:06 PM, Jeff Becker >> wrote: >> >>> >>> Hi all. 
I propose the following plan to "shutdown" the general list: >>> >>> 1) unsubscribe all current subscribers >>> 2) set the list to discard any incoming messages with an auto-discard >>> message that points you to linux-rdma at vger.kernel.org >>> >>> Please send comments/suggestions. >>> >> >> Care should be taken on any patches not cross posted (to linux-rdma) >> once the cutover takes place. There are quite a number of outstanding >> patches on general only. >> > > From tomorrow on, the general list will continue to exist with > searchable archives, but no new messages will be accepted. People who > try to send to general will be told to send to > linux-rdma at vger.kernel.org instead. If someone posted a patch to general > before the switch and it hasn't been accepted, they can repost to the > new list. Hope this works for everyone. Thanks Sure; these could be reposted to linux-rdma if needed. I was trying to say that care should be taken to check the email addresses prior to hitting reply/reply all so threads will be moved over to linux-rdma. -- Hal From cl at linux-foundation.org Wed Sep 30 07:27:47 2009 From: cl at linux-foundation.org (Christoph Lameter) Date: Wed, 30 Sep 2009 10:27:47 -0400 (EDT) Subject: [ofa-general] ofa_1_5_kernel 20090930-0200 daily build status In-Reply-To: <20090930101335.AEFA8E62410@openfabrics.org> References: <20090930101335.AEFA8E62410@openfabrics.org> Message-ID: No 2.6.28 39 30 31 32? 
On Wed, 30 Sep 2009, Vladimir Sokolovsky (Mellanox) wrote: > This email was generated automatically, please do not reply > > > git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git > git_branch: ofed_kernel_1_5 > > Common build parameters: > > Passed: > Passed on i686 with linux-2.6.19 > Passed on i686 with linux-2.6.18 > Passed on i686 with linux-2.6.21.1 > Passed on i686 with linux-2.6.24 > Passed on i686 with linux-2.6.26 > Passed on i686 with linux-2.6.22 > Passed on i686 with linux-2.6.27 > Passed on x86_64 with linux-2.6.16.60-0.21-smp > Passed on x86_64 with linux-2.6.18 > Passed on x86_64 with linux-2.6.18-164.el5 > Passed on x86_64 with linux-2.6.18-128.el5 > Passed on x86_64 with linux-2.6.18-93.el5 > Passed on x86_64 with linux-2.6.20 > Passed on x86_64 with linux-2.6.19 > Passed on x86_64 with linux-2.6.21.1 > Passed on x86_64 with linux-2.6.24 > Passed on x86_64 with linux-2.6.22 > Passed on x86_64 with linux-2.6.25 > Passed on x86_64 with linux-2.6.26 > Passed on x86_64 with linux-2.6.27 > Passed on x86_64 with linux-2.6.9-67.ELsmp > Passed on x86_64 with linux-2.6.9-78.ELsmp > Passed on ia64 with linux-2.6.18 > Passed on ia64 with linux-2.6.21.1 > Passed on ia64 with linux-2.6.19 > Passed on ia64 with linux-2.6.24 > Passed on ia64 with linux-2.6.23 > Passed on ia64 with linux-2.6.22 > Passed on ia64 with linux-2.6.26 > Passed on ia64 with linux-2.6.25 > Passed on ppc64 with linux-2.6.18 > Passed on ppc64 with linux-2.6.19 > > Failed: > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > From jon at opengridcomputing.com Wed Sep 30 07:37:24 2009 From: jon at opengridcomputing.com (Jon Mason) Date: Wed, 30 Sep 2009 09:37:24 -0500 Subject: [ofa-general] ofa_1_5_kernel 20090930-0200 daily build status In-Reply-To: References: 
<20090930101335.AEFA8E62410@openfabrics.org> Message-ID: <20090930143722.GA31116@opengridcomputing.com> On Wed, Sep 30, 2009 at 10:27:47AM -0400, Christoph Lameter wrote: > No 2.6.28 39 30 31 32? For OFED 1.5, the newest supported kernel is 2.6.30. However, there should be support for 2.6.28, 2.6.29, and 2.6.30 in the nightly builds. There should also be support for the SLES11 kernel (2.6.27.19-5-default). Vlad, can these be added? Thanks, Jon > > On Wed, 30 Sep 2009, Vladimir Sokolovsky (Mellanox) wrote: > > > This email was generated automatically, please do not reply > > > > > > git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git > > git_branch: ofed_kernel_1_5 > > > > Common build parameters: > > > > Passed: > > Passed on i686 with linux-2.6.19 > > Passed on i686 with linux-2.6.18 > > Passed on i686 with linux-2.6.21.1 > > Passed on i686 with linux-2.6.24 > > Passed on i686 with linux-2.6.26 > > Passed on i686 with linux-2.6.22 > > Passed on i686 with linux-2.6.27 > > Passed on x86_64 with linux-2.6.16.60-0.21-smp > > Passed on x86_64 with linux-2.6.18 > > Passed on x86_64 with linux-2.6.18-164.el5 > > Passed on x86_64 with linux-2.6.18-128.el5 > > Passed on x86_64 with linux-2.6.18-93.el5 > > Passed on x86_64 with linux-2.6.20 > > Passed on x86_64 with linux-2.6.19 > > Passed on x86_64 with linux-2.6.21.1 > > Passed on x86_64 with linux-2.6.24 > > Passed on x86_64 with linux-2.6.22 > > Passed on x86_64 with linux-2.6.25 > > Passed on x86_64 with linux-2.6.26 > > Passed on x86_64 with linux-2.6.27 > > Passed on x86_64 with linux-2.6.9-67.ELsmp > > Passed on x86_64 with linux-2.6.9-78.ELsmp > > Passed on ia64 with linux-2.6.18 > > Passed on ia64 with linux-2.6.21.1 > > Passed on ia64 with linux-2.6.19 > > Passed on ia64 with linux-2.6.24 > > Passed on ia64 with linux-2.6.23 > > Passed on ia64 with linux-2.6.22 > > Passed on ia64 with linux-2.6.26 > > Passed on ia64 with linux-2.6.25 > > Passed on ppc64 with linux-2.6.18 > > Passed on ppc64 with 
linux-2.6.19 > > > > Failed: From mingo at elte.hu Wed Sep 30 02:44:56 2009 From: mingo at elte.hu (Ingo Molnar) Date: Wed, 30 Sep 2009 11:44:56 +0200 Subject: [ofa-general] Re: [GIT PULL] please pull ummunotify In-Reply-To: <20090929171332.GD14405@elf.ucw.cz> References: <1253187028.8439.2.camel@twins> <1253198976.14935.27.camel@laptop> <20090929171332.GD14405@elf.ucw.cz> Message-ID: <20090930094456.GD24621@elte.hu> * Pavel Machek wrote: > On Thu 2009-09-17 08:45:29, Roland Dreier wrote: > > > > [...] > > OK. It would be nice to tie into something more general, but I > > think I agree -- perf counters are missing the filtering and the "no > > lost events" that ummunotify does have. [...] Performance events filtering is being worked on and now with the proper non-DoS limit you've added you can lose events too, don't you? So it's all a question of how much buffering to add - and with perf events too you can buffer an arbitrarily large number of events. > > [...] And I'm not sure it's worth messing up the perf counters > > design just to jam one more not totally related thing in. Nobody suggested details for any redesign yet (so far it seems like a perfect match, to me at least) so i'm wondering what messup you are referring to. > I believe that extending perf counters to do what you want is better > than adding one more, very strange, user<->kernel interface. Agreed. 
Lemme react to the original description of the code: > git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git ummunotify > > This will get "ummunotify," a new character device that allows a > userspace library to register for MMU notifications; this is > particularly useful for MPI implementations (message passing libraries > used in HPC) to be able to keep track of what wacky things consumers > do to their memory mappings. I test-pulled this code and had a look at it. I think this could be done in a simpler, less limited, more generic, more useful form by using some variation of perf events. You should be able to get all that you want by adding two TRACE_EVENT() tracepoints and using the existing perf event syscall to get the events to user-space. Meaning that this: 9 files changed, 1060 insertions(+), 1 deletions(-) Would be replaced with something like: 2 files changed, 100 insertions(+), 0 deletions(-) [ the +100 lines would (roughly) add tracepoints to invalidate_page and invalidate_range_start. (possibly via mmu_notifier_register() like the ummunotify code does) Most of that linecount would be comments. ] Another upside, beyond the reduction in complexity, is that we'd have one less special char driver based ABI. Which is a big plus in my opinion, especially if this goes towards HPC folks and if it's used for real. Why should such an MM capability be hidden behind a character device and an ioctl? The perf event approach is beneficial to non-HPC as well: MM instrumentation for example - page range invalidates are interesting to all sorts of modes of analysis. A question: what is the typical size/scope of the rbtree of the watched regions of memory in practical (test) deployments of the ummunotify code?
Per tracepoint filtering is possible via the perf event patches Li Zefan has posted to lkml recently, under this subject: [PATCH 0/6] perf trace: Add filter support They are still being worked on but it's very clear that flexible in-kernel filtering support will be a natural part of the perf event design in the very near future, so if that alone is your reason not to use it, it would be better if you helped us complete/test the filter support and use that, instead of a parallel framework. Or if that's not desirable or not possible, or if there's any other technical roadblock, i'd like to know the particulars of that. Thanks, Ingo From pavel at ucw.cz Tue Sep 29 10:13:32 2009 From: pavel at ucw.cz (Pavel Machek) Date: Tue, 29 Sep 2009 19:13:32 +0200 Subject: [ofa-general] Re: [GIT PULL] please pull ummunotify In-Reply-To: References: <1253187028.8439.2.camel@twins> <1253198976.14935.27.camel@laptop> Message-ID: <20090929171332.GD14405@elf.ucw.cz> On Thu 2009-09-17 08:45:29, Roland Dreier wrote: > > > > > Hmm, or are you saying you can only get 1 event per registered range and > > > > allocate the thing on registration? That'd need some registration limit > > > > to avoid DoS scenarios. > > > > > > Yes, that's what I do. You're right, I should add a limit... although > > > there are lots of ways for userspace to consume arbitrary amounts of > > > kernel resources already. > > > > It'd be good to work at reducing that number, not adding to it ;-) > > Yes, definitely. I'll add a quick ummunotify module parameter that > limits the number of registrations per process. > > > But yeah, I currently don't see a very nice match to perf counters. > > OK. It would be nice to tie into something more general, but I think I > agree -- perf counters are missing the filtering and the "no lost > events" that ummunotify does have. And I'm not sure it's worth messing > up the perf counters design just to jam one more not totally related > thing in.
I believe that extending perf counters to do what you want is better than adding one more, very strange, user<->kernel interface. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html From hnrose at comcast.net Wed Sep 30 08:43:41 2009 From: hnrose at comcast.net (Hal Rosenstock) Date: Wed, 30 Sep 2009 11:43:41 -0400 Subject: [ofa-general] [PATCH] opensm/osm_sa_lft_record.c: In lftr_rcv_new_lftr, handle osm_switch_get_lft_block failure Message-ID: <20090930154341.GA31709@comcast.net>
Signed-off-by: Hal Rosenstock
---
diff --git a/opensm/opensm/osm_sa_lft_record.c b/opensm/opensm/osm_sa_lft_record.c
index d092129..828b277 100644
--- a/opensm/opensm/osm_sa_lft_record.c
+++ b/opensm/opensm/osm_sa_lft_record.c
@@ -99,8 +99,12 @@ static ib_api_status_t lftr_rcv_new_lftr(IN osm_sa_t * sa,
 	p_rec_item->rec.block_num = cl_hton16(block);
 
 	/* copy the lft block */
-	osm_switch_get_lft_block(p_sw, block, p_rec_item->rec.lft);
-
+	if (!osm_switch_get_lft_block(p_sw, block, p_rec_item->rec.lft)) {
+		OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 4403: "
+			"osm_switch_get_lft_block failed\n");
+		status = IB_INSUFFICIENT_RESOURCES;
+		goto Exit;
+	}
 	cl_qlist_insert_tail(p_list, &p_rec_item->list_item);
 
 Exit:
From ssufficool at sbcounty.gov Wed Sep 30 08:46:37 2009 From: ssufficool at sbcounty.gov (Sufficool, Stanley) Date: Wed, 30 Sep 2009 08:46:37 -0700 Subject: [ofa-general] This list expires... tomorrow? In-Reply-To: Message-ID: Someone should probably update openfabrics.org to reflect the new list under Developer Resources / Linux. - Stan Sufficool > -----Original Message----- > From: general-bounces at lists.openfabrics.org > [mailto:general-bounces at lists.openfabrics.org] On Behalf Of > Hal Rosenstock > Sent: Wednesday, September 30, 2009 6:44 AM > To: Jeff Becker > Cc: OpenFabrics General > Subject: Re: [ofa-general] This list expires... tomorrow?
> > > On Tue, Sep 29, 2009 at 8:40 PM, Jeff Becker > wrote: > > Hal Rosenstock wrote: > >> > >> On Tue, Sep 29, 2009 at 3:06 PM, Jeff Becker > >> > >> wrote: > >> > >>> > >>> Hi all. I propose the following plan to "shutdown" the > general list: > >>> > >>> 1) unsubscribe all current subscribers > >>> 2) set the list to discard any incoming messages with an > >>> auto-discard message that points you to linux-rdma at vger.kernel.org > >>> > >>> Please send comments/suggestions. > >>> > >> > >> Care should be taken on any patches not cross posted (to > linux-rdma) > >> once the cutover takes place. There are quite a number of > outstanding > >> patches on general only. > >> > > > > From tomorrow on, the general list will continue to exist with > > searchable archives, but no new messages will be accepted. > People who > > try to send to general will be told to send to > > linux-rdma at vger.kernel.org instead. If someone posted a patch to > > general before the switch and it hasn't been accepted, they > can repost > > to the new list. Hope this works for everyone. Thanks > > Sure; these could be reposted to linux-rdma if needed. > > I was trying to say that care should be taken to check the > email addresses prior to hitting reply/reply all so threads > will be moved over to linux-rdma. 
> > -- Hal > > > From jgunthorpe at obsidianresearch.com Wed Sep 30 09:02:32 2009 From: jgunthorpe at obsidianresearch.com (Jason Gunthorpe) Date: Wed, 30 Sep 2009 10:02:32 -0600 Subject: [ofa-general] Re: [GIT PULL] please pull ummunotify In-Reply-To: <20090930094456.GD24621@elte.hu> References: <1253187028.8439.2.camel@twins> <1253198976.14935.27.camel@laptop> <20090929171332.GD14405@elf.ucw.cz> <20090930094456.GD24621@elte.hu> Message-ID: <20090930160232.GZ22310@obsidianresearch.com> On Wed, Sep 30, 2009 at 11:44:56AM +0200, Ingo Molnar wrote: > > > OK. It would be nice to tie into something more general, but I > > > think I agree -- perf counters are missing the filtering and the "no > > > lost events" that ummunotify does have. [...] > > Performance events filtering is being worked on and now with the proper > non-DoS limit you've added you can lose events too, don't you? So it's > all a question of how much buffering to add - and with perf events too > you can buffer an arbitrarily large number of events. No, ummunotify does not lose events; that is the fundamental difference between it and all tracing schemes. Every call to ibv_reg_mr is paired with a call to ummunotify to create a matching watcher. Both calls allocate some kernel memory; if one fails, the entire operation fails and userspace can do whatever it does on memory allocation failure. After that point the scheme is perfectly lossless. Performance event filtering would use the same kind of kernel memory: call ibv_reg_mr, then install a filter; both allocate kernel memory, and if one fails the op fails. But then when the ring buffer overflows you've lost events. All the tracing schemes are lossy, since they lose events when the ring buffer fills up.
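The distinction drawn above can be illustrated with a small userspace toy model. This is only a sketch under stated assumptions: `event_ring`, `ring_push`, `watch`, `watch_notify` and `deliver_all` are hypothetical names invented for illustration and are not taken from ummunotify or from perf events. The ring has a fixed number of slots shared by all regions, so it can drop events; the watcher state is allocated at registration time, so a later notification never needs memory and cannot be dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model; none of these names come from ummunotify or perf. */

#define RING_SLOTS 4

/* Tracing-style delivery: one fixed-size ring shared by every region. */
struct event_ring {
    uint64_t slot[RING_SLOTS];
    size_t   used;   /* slots currently holding an unread event */
    size_t   lost;   /* events dropped because the ring was full */
};

/* Returns 1 if the event was stored, 0 if it had to be dropped. */
static int ring_push(struct event_ring *r, uint64_t tag)
{
    assert(r != NULL);
    if (r->used == RING_SLOTS) {
        r->lost++;
        return 0;
    }
    r->slot[r->used++] = tag;
    return 1;
}

/* Watcher-style delivery: the state exists from registration onwards. */
struct watch {
    uint64_t tag;
    int      invalidated;
};

/* Marking a watch is idempotent and allocates nothing, so it cannot fail. */
static void watch_notify(struct watch *w)
{
    assert(w != NULL);
    w->invalidated = 1;
}

/* Send one invalidation per watch down both paths; return the ring's losses. */
static size_t deliver_all(struct event_ring *r, struct watch *w, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        ring_push(r, w[i].tag);
        watch_notify(&w[i]);
    }
    return r->lost;
}
```

With more invalidations than ring slots and no reader draining the ring, the ring reports losses while every watch is still marked. The cost of the watcher model is paid at registration, which is where a failure can be reported to the caller.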
So to do that we either need to make a recovery scheme of some sort, or make trace points that are blocking.

So, here is a concrete proposal how ummunotify could be absorbed by perf events tracing, with filters.

- The filter expression must be able to trigger on an MMU event, triggering on the intersection of the MMU event address range and filter expression address range.
- The traces must be chosen so that there is exactly one filter expression per ibv_reg_mr region
- Each filter has a clearable saturating counter that increments every time the filter matches an event
- Each filter has a 64 bit user space assigned tag.
- An API similar to ummunotify exists:
    struct perf_filter_tag foo[100];
    int rc = perf_filters_read_and_clear_non_zero_counters(foo,100);
- Optionally - the mmap ring would contain only 64 bit user space filter tags, not trace events.

This would then duplicate the functions of ummunotify, including the lossless collection of events. The flow would more or less be the same:

  struct my_data *ptr = calloc();
  ptr->reg_handle = ibv_reg_mr(base,len);
  ptr->filter_handle = perf_filter_register("string matching base->len",ptr);
  [..]
  // fast path
  if (atomically(perf_map->head) != last_perf_map_head) {
      struct perf_filter_tag foo[100];
      int rc = perf_filters_read_and_clear_non_zero_counters(foo,100);
      for (unsigned int i = 0; i != rc; i++)
          ((struct my_data *)foo[i])->invalid = 1;
      perf_empty_mmap_ring(perf_map);
  }

If 'optionally' is done then the app can trundle through the mmap and only use the above syscall loop if the mmap overflows. That would be quite ideal.

It also must be guaranteed that when a trace point is hit the mmap atomics are updated and visible to another user space thread before the trace point returns - otherwise it is not synchronous enough and will be racy.

> A question: what is the typical size/scope of the rbtree of the watched
> regions of memory in practical (test) deployments of the ummunotify
> code?

Jeff can you comment?
IIRC it is many tens (hundreds?) of thousands of watches. > Per tracepoint filtering is possible via the perf event patches Li Zefan > has posted to lkml recently, under this subject: Performance of the filter add is probably a bit of a concern.. Regards, Jason From Jeffrey.C.Becker at nasa.gov Wed Sep 30 09:48:20 2009 From: Jeffrey.C.Becker at nasa.gov (Jeff Becker) Date: Wed, 30 Sep 2009 09:48:20 -0700 Subject: [ofa-general] This list expires... tomorrow? In-Reply-To: <15ddcffd0909292104g3490cc9bwb3543a7f92d010ba@mail.gmail.com> References: <4AC25ABC.2000002@nasa.gov> <4AC2A8E8.5040404@nasa.gov> <15ddcffd0909292104g3490cc9bwb3543a7f92d010ba@mail.gmail.com> Message-ID: <4AC38BD4.2010805@nasa.gov> Or Gerlitz wrote: > On Wed, Sep 30, 2009 at 2:40 AM, Jeff Becker wrote: > >> From tomorrow on, the general list will continue to exist with >> searchable archives, but no new messages will be accepted. People who >> try to send to general will be told to send to >> linux-rdma at vger.kernel.org instead. If someone posted a patch to general >> before the switch and it hasn't been accepted, they can repost to the >> new list. Hope this works for everyone. Thanks >> > > > sounds perfect to me, thanks for driving this Jeff and Roland. > > Or > No problem. I'm about to "pull the trigger". Shortly after you see this message, you will be unsubscribed from this list, and future posts to the list will be discarded with an auto-reply note redirecting you to linux-rdma at vger.kernel.org. The list will continue to exist so that its archives can be accessible. OK - here goes... 
-jeff 10,9,8,7,6,5,4,3,2,1, :-) From rdreier at cisco.com Wed Sep 30 10:06:32 2009 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 30 Sep 2009 10:06:32 -0700 Subject: [ofa-general] Re: [GIT PULL] please pull ummunotify In-Reply-To: <20090930094456.GD24621@elte.hu> (Ingo Molnar's message of "Wed, 30 Sep 2009 11:44:56 +0200") References: <1253187028.8439.2.camel@twins> <1253198976.14935.27.camel@laptop> <20090929171332.GD14405@elf.ucw.cz> <20090930094456.GD24621@elte.hu> Message-ID: > Performance events filtering is being worked on and now with the proper > non-DoS limit you've added you can lose events too, don't you? So it's > all a question of how much buffering to add - and with perf events too > you can buffer an arbitrarily large number of events. No, the idea for non-DoS for ummunotify is that we would limit the number of regions the application can register; so an application might hit the limit up front, but there would be no runtime loss of events once a region was registered successfully. > I think this could be done in a simpler, less limited, more generic, > more useful form by using some variation of perf events. > > You should be able to get all that you want by adding two TRACE_EVENT() > tracepoints and using the existing perf event syscall to get the events > to user-space. Yes, I would like to use perf events too. Would it be plausible to create a way for userspace to create a "counter" for each address range being watched? Then events would not be lost, because those counters would become non-zero. > Meaning that this: > 9 files changed, 1060 insertions(+), 1 deletions(-) Note that lots of the files touched here are in Documentation or are one-line changes to Makefiles etc. - R.
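
The per-range "counter" idea raised in this thread can be sketched as a userspace toy model. Everything here is an assumption made for illustration: `range_counter`, `record_invalidate` and `read_and_clear` are hypothetical names, not part of the perf events ABI or of ummunotify, and the 8-bit counter width is arbitrary. The point is only that a saturating counter per watched range keeps the fact "this range was invalidated" no matter how many invalidations arrive before userspace looks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model of per-range counters; not the perf events ABI. */
struct range_counter {
    uintptr_t start;   /* first byte of the watched range (inclusive) */
    uintptr_t end;     /* one past the last byte (exclusive) */
    uint64_t  tag;     /* value chosen by userspace at registration */
    uint8_t   count;   /* saturates at UINT8_MAX instead of wrapping */
};

/* Two half-open ranges intersect when each starts before the other ends. */
static int ranges_overlap(uintptr_t a_start, uintptr_t a_end,
                          uintptr_t b_start, uintptr_t b_end)
{
    return a_start < b_end && b_start < a_end;
}

/* Count an invalidation of [start, end) against every overlapping range. */
static void record_invalidate(struct range_counter *rc, size_t n,
                              uintptr_t start, uintptr_t end)
{
    assert(start < end);
    for (size_t i = 0; i < n; i++) {
        if (ranges_overlap(rc[i].start, rc[i].end, start, end) &&
            rc[i].count < UINT8_MAX)
            rc[i].count++;
    }
}

/*
 * Copy the tags of ranges with a non-zero counter into out (at most max
 * entries), reset those counters, and return how many tags were copied.
 * Ranges that did not fit keep their counters for the next call.
 */
static size_t read_and_clear(struct range_counter *rc, size_t n,
                             uint64_t *out, size_t max)
{
    size_t found = 0;

    for (size_t i = 0; i < n && found < max; i++) {
        if (rc[i].count != 0) {
            out[found++] = rc[i].tag;
            rc[i].count = 0;
        }
    }
    return found;
}
```

In this model the number of invalidations is not preserved once a counter saturates, but the set of affected ranges always is, which is the property a registration cache needs in order to drop stale entries.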