From vlad at lists.openfabrics.org Fri Aug 1 02:54:47 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Fri, 1 Aug 2008 02:54:47 -0700 (PDT) Subject: [ofa-general] ofa_1_4_kernel 20080801-0200 daily build status Message-ID: <20080801095447.C410CE60359@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_4/linux-2.6.git git_branch: ofed_kernel Common build parameters: Passed: Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.16 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.17 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.18-53.el5 Passed on x86_64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.22.5-31-default Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on ia64 with linux-2.6.16 Passed on ia64 with linux-2.6.16.21-0.8-default Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.17 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.25 Passed on ia64 with linux-2.6.26 Passed on ia64 with linux-2.6.24 Passed on ppc64 with linux-2.6.16 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.17 Passed 
on ppc64 with linux-2.6.18-8.el5 Passed on ppc64 with linux-2.6.24 Passed on ppc64 with linux-2.6.19 Failed: From devesh28 at gmail.com Fri Aug 1 04:03:49 2008 From: devesh28 at gmail.com (Devesh Sharma) Date: Fri, 1 Aug 2008 16:33:49 +0530 Subject: [ofa-general] ***SPAM*** OFED-1.3 RDMA CM, IB_ACCESS_LOCAL_WRITE flag missing Message-ID: <309a667c0808010403r4036bc51u8a167954a6fe9739@mail.gmail.com> Hello all, while creating QP using rdma_create_qp(), I am not seeing any where it is setting IB_ACCESS_LOCAL_WRITE flag other then for IW QPs. Is it for some specific reason its just a mistake? -Devesh -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.hefty at intel.com Fri Aug 1 09:37:11 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Fri, 1 Aug 2008 09:37:11 -0700 Subject: [ofa-general] ***SPAM*** OFED-1.3 RDMA CM, IB_ACCESS_LOCAL_WRITE flag missing In-Reply-To: <309a667c0808010403r4036bc51u8a167954a6fe9739@mail.gmail.com> References: <309a667c0808010403r4036bc51u8a167954a6fe9739@mail.gmail.com> Message-ID: <000001c8f3f4$d8f0ab00$bb37170a@amr.corp.intel.com> > while creating QP using rdma_create_qp(), I am not seeing any where > it is setting IB_ACCESS_LOCAL_WRITE flag other then for IW QPs. Is > it for some specific reason its just a mistake? Commit 1ca8d15619f725e223c19137350b0336b9196193 (dated July 22nd) removed this for iWarp QPs. The qp_access_flags is only used to set remote permissions, so it should not be being set. - Sean From swise at opengridcomputing.com Fri Aug 1 11:10:10 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Fri, 01 Aug 2008 13:10:10 -0500 Subject: [ofa-general] [PATCH 2.6.27] RDMA/cxgb3: Fix up MW access rights. Message-ID: <20080801181010.3736.44993.stgit@dell3.ogc.int> From: Steve Wise - MWs don't have local read/write permissions. - Set the MW_BIND enabled bit if a MR has MW_BIND access. 
Signed-off-by: Steve Wise --- drivers/infiniband/hw/cxgb3/cxio_hal.c | 6 +++--- drivers/infiniband/hw/cxgb3/iwch_provider.h | 7 +++++++ drivers/infiniband/hw/cxgb3/iwch_qp.c | 2 +- 3 files changed, 11 insertions(+), 4 deletions(-) diff --git a/drivers/infiniband/hw/cxgb3/cxio_hal.c b/drivers/infiniband/hw/cxgb3/cxio_hal.c index f6d5747..4dcf08b 100644 --- a/drivers/infiniband/hw/cxgb3/cxio_hal.c +++ b/drivers/infiniband/hw/cxgb3/cxio_hal.c @@ -725,9 +725,9 @@ static int __cxio_tpt_op(struct cxio_rdev *rdev_p, u32 reset_tpt_entry, V_TPT_STAG_TYPE(type) | V_TPT_PDID(pdid)); BUG_ON(page_size >= 28); tpt.flags_pagesize_qpid = cpu_to_be32(V_TPT_PERM(perm) | - F_TPT_MW_BIND_ENABLE | - V_TPT_ADDR_TYPE((zbva ? TPT_ZBTO : TPT_VATO)) | - V_TPT_PAGE_SIZE(page_size)); + ((perm & TPT_MW_BIND) ? F_TPT_MW_BIND_ENABLE : 0) | + V_TPT_ADDR_TYPE((zbva ? TPT_ZBTO : TPT_VATO)) | + V_TPT_PAGE_SIZE(page_size)); tpt.rsvd_pbl_addr = reset_tpt_entry ? 0 : cpu_to_be32(V_TPT_PBL_ADDR(PBL_OFF(rdev_p, pbl_addr)>>3)); tpt.len = cpu_to_be32(len); diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.h b/drivers/infiniband/hw/cxgb3/iwch_provider.h index f5ceca0..a237d49 100644 --- a/drivers/infiniband/hw/cxgb3/iwch_provider.h +++ b/drivers/infiniband/hw/cxgb3/iwch_provider.h @@ -293,9 +293,16 @@ static inline u32 iwch_ib_to_tpt_access(int acc) return (acc & IB_ACCESS_REMOTE_WRITE ? TPT_REMOTE_WRITE : 0) | (acc & IB_ACCESS_REMOTE_READ ? TPT_REMOTE_READ : 0) | (acc & IB_ACCESS_LOCAL_WRITE ? TPT_LOCAL_WRITE : 0) | + (acc & IB_ACCESS_MW_BIND ? TPT_MW_BIND : 0) | TPT_LOCAL_READ; } +static inline u32 iwch_ib_to_tpt_bind_access(int acc) +{ + return (acc & IB_ACCESS_REMOTE_WRITE ? TPT_REMOTE_WRITE : 0) | + (acc & IB_ACCESS_REMOTE_READ ? 
TPT_REMOTE_READ : 0); +} + enum iwch_mmid_state { IWCH_STAG_STATE_VALID, IWCH_STAG_STATE_INVALID diff --git a/drivers/infiniband/hw/cxgb3/iwch_qp.c b/drivers/infiniband/hw/cxgb3/iwch_qp.c index 8939716..3e4585c 100644 --- a/drivers/infiniband/hw/cxgb3/iwch_qp.c +++ b/drivers/infiniband/hw/cxgb3/iwch_qp.c @@ -565,7 +565,7 @@ int iwch_bind_mw(struct ib_qp *qp, wqe->bind.type = TPT_VATO; /* TBD: check perms */ - wqe->bind.perms = iwch_ib_to_tpt_access(mw_bind->mw_access_flags); + wqe->bind.perms = iwch_ib_to_tpt_bind_access(mw_bind->mw_access_flags); wqe->bind.mr_stag = cpu_to_be32(mw_bind->mr->lkey); wqe->bind.mw_stag = cpu_to_be32(mw->rkey); wqe->bind.mw_len = cpu_to_be32(mw_bind->length); From swise at opengridcomputing.com Fri Aug 1 12:43:35 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Fri, 01 Aug 2008 14:43:35 -0500 Subject: [ofa-general] [PATCH 2.6.27] RDMA/cxgb3: Deadlock initializing the iw_cxgb3 device. Message-ID: <20080801194334.7950.33820.stgit@dell3.ogc.int> From: Steve Wise Running 'ifconfig up' on the cxgb3 interface with iw_cxgb3 loaded causes a deadlock. Apparently the rtnl lock is already held in this path. Function fw_supports_fastreg() was introduced in 2.6.27 to conditionally set the IB_DEVICE_MEM_MGT_EXTENSIONS bit iff the firmware was at 7.0 or greater. This function acquires the rtnl lock and thus can cause a deadlock. Further, if iw_cxgb3 is loaded _after_ the nic interface is brought up, then the deadlock does not occur and thus fw_supports_fastreg() does indeed need to grab the rtnl lock in that path. It turns out this code is all useless anyway. The low level driver will NOT allow the open if the firmware isn't 7.0, so iw_cxgb3 can always set the MEM_MGT_EXTENSIONS bit. Simplify... 
Signed-off-by: Steve Wise --- drivers/infiniband/hw/cxgb3/iwch_provider.c | 28 +++------------------------ 1 files changed, 3 insertions(+), 25 deletions(-) diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c b/drivers/infiniband/hw/cxgb3/iwch_provider.c index b89640a..c8888b8 100644 --- a/drivers/infiniband/hw/cxgb3/iwch_provider.c +++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c @@ -1187,28 +1187,6 @@ static ssize_t show_rev(struct device *dev, struct device_attribute *attr, return sprintf(buf, "%d\n", iwch_dev->rdev.t3cdev_p->type); } -static int fw_supports_fastreg(struct iwch_dev *iwch_dev) -{ - struct ethtool_drvinfo info; - struct net_device *lldev = iwch_dev->rdev.t3cdev_p->lldev; - char *cp, *next; - unsigned fw_maj, fw_min; - - rtnl_lock(); - lldev->ethtool_ops->get_drvinfo(lldev, &info); - rtnl_unlock(); - - next = info.fw_version+1; - cp = strsep(&next, "."); - sscanf(cp, "%i", &fw_maj); - cp = strsep(&next, "."); - sscanf(cp, "%i", &fw_min); - - PDBG("%s maj %u min %u\n", __func__, fw_maj, fw_min); - - return fw_maj > 6 || (fw_maj == 6 && fw_min > 0); -} - static ssize_t show_fw_ver(struct device *dev, struct device_attribute *attr, char *buf) { struct iwch_dev *iwch_dev = container_of(dev, struct iwch_dev, @@ -1325,12 +1303,12 @@ int iwch_register_device(struct iwch_dev *dev) memset(&dev->ibdev.node_guid, 0, sizeof(dev->ibdev.node_guid)); memcpy(&dev->ibdev.node_guid, dev->rdev.t3cdev_p->lldev->dev_addr, 6); dev->ibdev.owner = THIS_MODULE; - dev->device_cap_flags = IB_DEVICE_LOCAL_DMA_LKEY | IB_DEVICE_MEM_WINDOW; + dev->device_cap_flags = IB_DEVICE_LOCAL_DMA_LKEY | + IB_DEVICE_MEM_WINDOW | + IB_DEVICE_MEM_MGT_EXTENSIONS; /* cxgb3 supports STag 0. 
*/ dev->ibdev.local_dma_lkey = 0; - if (fw_supports_fastreg(dev)) - dev->device_cap_flags |= IB_DEVICE_MEM_MGT_EXTENSIONS; dev->ibdev.uverbs_cmd_mask = (1ull << IB_USER_VERBS_CMD_GET_CONTEXT) |
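For readers following Steve Wise's "Deadlock initializing the iw_cxgb3 device" patch in this digest: the removed fw_supports_fastreg() skipped the first character of the ethtool firmware version string, parsed the major and minor numbers, and returned true for anything newer than 6.0. A minimal user-space sketch of that comparison follows. The function name fw_newer_than_6_0 and the "T 7.0.0" style input are assumptions for illustration, not the driver's code.

```c
#include <stdio.h>

/*
 * User-space sketch of the comparison made by the removed
 * fw_supports_fastreg(). The one-character prefix and the
 * "major.minor.micro" layout are assumptions for illustration,
 * mirroring the driver's "info.fw_version + 1".
 *
 * Returns 1 if the version is newer than 6.0, 0 if it is not,
 * and -1 if the string cannot be parsed.
 */
static int fw_newer_than_6_0(const char *fw_version)
{
    unsigned int fw_maj = 0, fw_min = 0;

    if (fw_version == NULL || fw_version[0] == '\0')
        return -1;

    /* Skip the leading prefix character, then read "maj.min". */
    if (sscanf(fw_version + 1, "%u.%u", &fw_maj, &fw_min) != 2)
        return -1;

    /* Same comparison as the removed kernel function. */
    return fw_maj > 6 || (fw_maj == 6 && fw_min > 0);
}
```

With the assumed format, "T 7.0.0" and "T 6.1.0" pass the check while "T 6.0.0" does not. The patch drops the check entirely because, per the commit message, the low level driver will not allow the open unless the firmware is 7.0.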
From vlad at lists.openfabrics.org Sat Aug 2 02:55:06 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Sat, 2 Aug 2008 02:55:06 -0700 (PDT) Subject: [ofa-general] ofa_1_4_kernel 20080802-0200 daily build status Message-ID: <20080802095506.5196EE6039D@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_4/linux-2.6.git git_branch: ofed_kernel Common build parameters: Passed: Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.16 Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.17 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.18-53.el5 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.22.5-31-default Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on ia64 with linux-2.6.16 Passed on ia64 with linux-2.6.16.21-0.8-default Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.17 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.26 Passed on ia64 with linux-2.6.25 Passed on ia64 with linux-2.6.24 Passed on ppc64 with linux-2.6.16 Passed on ppc64 with linux-2.6.17 Passed on ppc64 with linux-2.6.18 Passed 
on ppc64 with linux-2.6.18-8.el5 Passed on ppc64 with linux-2.6.24 Passed on ppc64 with linux-2.6.19 Failed: From a.beregalov at gmail.com Sat Aug 2 03:29:09 2008 From: a.beregalov at gmail.com (Alexander Beregalov) Date: Sat, 2 Aug 2008 14:29:09 +0400 Subject: [ofa-general] [PATCH] Infiniband/ipath: fix warnings Message-ID: <20080802102909.GC4856@orion> From: Alexander Beregalov infiniband/ipath : fix warnings ipath_driver.c:1260: warning: format '%Lx' expects type 'long long unsigned int', but argument 6 has type 'long unsigned int' ipath_driver.c:1459: warning: format '%Lx' expects type 'long long unsigned int', but argument 4 has type 'u64' ipath_intr.c:358: warning: format '%Lx' expects type 'long long unsigned int', but argument 3 has type 'u64' ipath_intr.c:358: warning: format '%Lu' expects type 'long long unsigned int', but argument 6 has type 'u64' ipath_intr.c:1119: warning: format '%Lx' expects type 'long long unsigned int', but argument 5 has type 'u64' ipath_intr.c:1119: warning: format '%Lx' expects type 'long long unsigned int', but argument 3 has type 'u64' ipath_intr.c:1123: warning: format '%Lx' expects type 'long long unsigned int', but argument 3 has type 'u64' ipath_intr.c:1130: warning: format '%Lx' expects type 'long long unsigned int', but argument 4 has type 'u64' ipath_iba7220.c:1032: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64' ipath_iba7220.c:1045: warning: format '%llX' expects type 'long long unsigned int', but argument 3 has type 'u64' ipath_iba7220.c:2506: warning: format '%Lu' expects type 'long long unsigned int', but argument 4 has type 'u64' Signed-off-by: Alexander Beregalov Cc: Roland Dreier Cc: Sean Hefty Cc: Hal Rosenstock --- drivers/infiniband/hw/ipath/ipath_driver.c | 5 +++-- drivers/infiniband/hw/ipath/ipath_iba7220.c | 7 ++++--- drivers/infiniband/hw/ipath/ipath_intr.c | 12 ++++++++---- 3 files changed, 15 insertions(+), 9 deletions(-) diff --git 
a/drivers/infiniband/hw/ipath/ipath_driver.c b/drivers/infiniband/hw/ipath/ipath_driver.c index daad09a..896b6ab 100644 --- a/drivers/infiniband/hw/ipath/ipath_driver.c +++ b/drivers/infiniband/hw/ipath/ipath_driver.c @@ -1259,7 +1259,7 @@ reloop: */ ipath_cdbg(ERRPKT, "Error Pkt, but no eflags! egrbuf" " %x, len %x hdrq+%x rhf: %Lx\n", - etail, tlen, l, + etail, tlen, l, (unsigned long long) le64_to_cpu(*(__le64 *) rhf_addr)); if (ipath_debug & __IPATH_ERRPKTDBG) { u32 j, *d, dw = rsize-2; @@ -1457,7 +1457,8 @@ static void ipath_reset_availshadow(struct ipath_devdata *dd) 0xaaaaaaaaaaaaaaaaULL); /* All BUSY bits in qword */ if (oldval != dd->ipath_pioavailshadow[i]) ipath_dbg("shadow[%d] was %Lx, now %lx\n", - i, oldval, dd->ipath_pioavailshadow[i]); + i, (unsigned long long)oldval, + dd->ipath_pioavailshadow[i]); } spin_unlock_irqrestore(&ipath_pioavail_lock, flags); } diff --git a/drivers/infiniband/hw/ipath/ipath_iba7220.c b/drivers/infiniband/hw/ipath/ipath_iba7220.c index fadbfbf..124eac1 100644 --- a/drivers/infiniband/hw/ipath/ipath_iba7220.c +++ b/drivers/infiniband/hw/ipath/ipath_iba7220.c @@ -1032,7 +1032,7 @@ static int ipath_7220_bringup_serdes(struct ipath_devdata *dd) ipath_cdbg(VERBOSE, "done: xgxs=%llx from %llx\n", (unsigned long long) ipath_read_kreg64(dd, dd->ipath_kregs->kr_xgxsconfig), - prev_val); + (unsigned long long)prev_val); guid = be64_to_cpu(dd->ipath_guid); @@ -1042,7 +1042,8 @@ static int ipath_7220_bringup_serdes(struct ipath_devdata *dd) ipath_dbg("No GUID for heartbeat, faking %llx\n", (unsigned long long)guid); } else - ipath_cdbg(VERBOSE, "Wrote %llX to HRTBT_GUID\n", guid); + ipath_cdbg(VERBOSE, "Wrote %llX to HRTBT_GUID\n", + (unsigned long long)guid); ipath_write_kreg(dd, dd->ipath_kregs->kr_hrtbt_guid, guid); return ret; } @@ -2505,7 +2506,7 @@ done: if (dd->ipath_flags & IPATH_IB_AUTONEG_INPROG) { ipath_dbg("Did not get to DDR INIT (%x) after %Lu msecs\n", ipath_ib_state(dd, dd->ipath_lastibcstat), - 
jiffies_to_msecs(jiffies)-startms); + (unsigned long long)jiffies_to_msecs(jiffies)-startms); dd->ipath_flags &= ~IPATH_IB_AUTONEG_INPROG; if (dd->ipath_autoneg_tries == IPATH_AUTONEG_TRIES) { dd->ipath_flags |= IPATH_IB_AUTONEG_FAILED; diff --git a/drivers/infiniband/hw/ipath/ipath_intr.c b/drivers/infiniband/hw/ipath/ipath_intr.c index 26900b3..94b2711 100644 --- a/drivers/infiniband/hw/ipath/ipath_intr.c +++ b/drivers/infiniband/hw/ipath/ipath_intr.c @@ -356,9 +356,10 @@ static void handle_e_ibstatuschanged(struct ipath_devdata *dd, dd->ipath_cregs->cr_iblinkerrrecovcnt); if (linkrecov != dd->ipath_lastlinkrecov) { ipath_dbg("IB linkrecov up %Lx (%s %s) recov %Lu\n", - ibcs, ib_linkstate(dd, ibcs), + (unsigned long long)ibcs, + ib_linkstate(dd, ibcs), ipath_ibcstatus_str[ltstate], - linkrecov); + (unsigned long long)linkrecov); /* and no more until active again */ dd->ipath_lastlinkrecov = 0; ipath_set_linkstate(dd, IPATH_IB_LINKDOWN); @@ -1118,9 +1119,11 @@ irqreturn_t ipath_intr(int irq, void *data) if (unlikely(istat & ~dd->ipath_i_bitsextant)) ipath_dev_err(dd, "interrupt with unknown interrupts %Lx set\n", + (unsigned long long) istat & ~dd->ipath_i_bitsextant); else if (istat & ~INFINIPATH_I_ERROR) /* errors do own printing */ - ipath_cdbg(VERBOSE, "intr stat=0x%Lx\n", istat); + ipath_cdbg(VERBOSE, "intr stat=0x%Lx\n", + (unsigned long long)istat); if (istat & INFINIPATH_I_ERROR) { ipath_stats.sps_errints++; @@ -1128,7 +1131,8 @@ irqreturn_t ipath_intr(int irq, void *data) dd->ipath_kregs->kr_errorstatus); if (!estat) dev_info(&dd->pcidev->dev, "error interrupt (%Lx), " - "but no error bits set!\n", istat); + "but no error bits set!\n", + (unsigned long long)istat); else if (estat == -1LL) /* * should we try clearing all, or hope next read From devesh28 at gmail.com Sat Aug 2 03:02:12 2008 From: devesh28 at gmail.com (Devesh Sharma) Date: Sat, 2 Aug 2008 15:32:12 +0530 Subject: ***SPAM*** Re: [ofa-general] ***SPAM*** OFED-1.3 RDMA CM, IB_ACCESS_LOCAL_WRITE 
flag missing
In-Reply-To: <000001c8f3f4$d8f0ab00$bb37170a@amr.corp.intel.com>
References: <309a667c0808010403r4036bc51u8a167954a6fe9739@mail.gmail.com> <000001c8f3f4$d8f0ab00$bb37170a@amr.corp.intel.com>
Message-ID: <309a667c0808020302h6e340c88xc662064c400795d3@mail.gmail.com>

Thanks for replying,
Can you explain me in a bit more detail, because if QP dose not have a IB_ACCESS_LOCAL_WRITE permission, according to IB spec, HCA should generate Local Protection Error while processing the WRs. Is it assumed that mthca driver (or some other provider driver) will set IB_ACCESS_LOCAL_WRITE by itself, even if its not requested?

On Fri, Aug 1, 2008 at 10:07 PM, Sean Hefty wrote:
> > while creating QP using rdma_create_qp(), I am not seeing any where
> > it is setting IB_ACCESS_LOCAL_WRITE flag other then for IW QPs. Is
> > it for some specific reason its just a mistake?
>
> Commit 1ca8d15619f725e223c19137350b0336b9196193 (dated July 22nd) removed this
> for iWarp QPs. The qp_access_flags is only used to set remote permissions, so
> it should not be being set.
>
> - Sean
>
>

From swise at opengridcomputing.com Sat Aug 2 13:33:34 2008
From: swise at opengridcomputing.com (Steve Wise)
Date: Sat, 02 Aug 2008 15:33:34 -0500
Subject: [ofa-general] [ANNOUNCE] libcxgb3-1.2.2 availability
Message-ID: <4894C49E.1070702@opengridcomputing.com>

Version 1.2.2 of libcxgb3 is now available at:

http://www.openfabrics.org/downloads/cxgb3/libcxgb3-1.2.2.tar.gz

This will be included in the ofed-1.4 distribution, but will also work with ofed-1.3.1. This version relaxes the firmware requirements and thus works with versions 5.x, 6.x and the up and coming 7.x for ofed-1.4.

Thanks,
Steve.

From ronli.voltaire at gmail.com Sun Aug 3 01:30:34 2008
From: ronli.voltaire at gmail.com (Ron Livne)
Date: Sun, 3 Aug 2008 11:30:34 +0300
Subject: Fwd: [ofa-general] [PATCH 3/3 v3] ib/uverbs: add support for create_qp_expanded in uverbs
In-Reply-To:
References:
Message-ID: <3b5e77ad0808030130s63c2501bq6331c07494a7f158@mail.gmail.com>

Hi Vlad,
I'd like this patch to be added to OFED 1.4
This patch series is based on Jack M XRC patch series.
Can you please tell me what libibverbs, libmlx4 and librdmacm version are going to be in OFED 1.4, so I can make the proper modifications for the patches for these libraries in order to get them in OFED 1.4

Roland, Do you have any remarks for those patches?
Thank you, Ron ---------- Forwarded message ---------- From: Ron Livne Date: Sun, Jul 27, 2008 at 6:23 PM Subject: [ofa-general] [PATCH 3/3 v3] ib/uverbs: add support for create_qp_expanded in uverbs To: Roland Drier Cc: Olga Shern , general list This patch adds support for create_qp_expanded to the uverbs. It uses the reserved bitmap in ib_uverbs_create_qp to transfer the new creation flags from the user space to the kernel. Changes in v2: Minimized code duplication by adding the function ib_uverbs_create_qp_common. LSO now can not be activated through user space. Changes in v3: Added compatibility for old libibverbs. Added field __u32 create_flags to struct ib_uverbs_create_qp_expanded. Deleted the function ib_uverbs_create_qp_common from v2. Signed-off-by: Ron Livne uevent.uobject, cmd.user_handle, file->ucontext, &qp_lock_key); + down_write(&obj->uevent.uobject.mutex); + + srq = (cmd.is_srq && cmd.qp_type != IB_QPT_XRC) ? + idr_read_srq(cmd.srq_handle, file->ucontext) : NULL; + xrcd = cmd.qp_type == IB_QPT_XRC ? + idr_read_xrcd(cmd.srq_handle, file->ucontext, &xrcd_uobj) : NULL; + pd = idr_read_pd(cmd.pd_handle, file->ucontext); + scq = idr_read_cq(cmd.send_cq_handle, file->ucontext, 0); + rcq = cmd.recv_cq_handle == cmd.send_cq_handle ? + scq : idr_read_cq(cmd.recv_cq_handle, file->ucontext, 1); + + if (!pd || !scq || !rcq || (cmd.is_srq && !srq) || + (cmd.qp_type == IB_QPT_XRC && !xrcd)) { + ret = -EINVAL; + goto err_put; + } + + attr.event_handler = ib_uverbs_qp_event_handler; + attr.qp_context = file; + attr.send_cq = scq; + attr.recv_cq = rcq; + attr.srq = srq; + attr.sq_sig_type = cmd.sq_sig_all ? 
IB_SIGNAL_ALL_WR : IB_SIGNAL_REQ_WR; + attr.qp_type = cmd.qp_type; + attr.xrc_domain = xrcd; + attr.create_flags = 0; + + attr.cap.max_send_wr = cmd.max_send_wr; + attr.cap.max_recv_wr = cmd.max_recv_wr; + attr.cap.max_send_sge = cmd.max_send_sge; + attr.cap.max_recv_sge = cmd.max_recv_sge; + attr.cap.max_inline_data = cmd.max_inline_data; + + obj->uevent.events_reported = 0; + INIT_LIST_HEAD(&obj->uevent.event_list); + INIT_LIST_HEAD(&obj->mcast_list); + + qp = pd->device->create_qp(pd, &attr, &udata); + if (IS_ERR(qp)) { + ret = PTR_ERR(qp); + goto err_put; + } + + qp->device = pd->device; + qp->pd = pd; + qp->send_cq = attr.send_cq; + qp->recv_cq = attr.recv_cq; + qp->srq = attr.srq; + qp->uobject = &obj->uevent.uobject; + qp->event_handler = attr.event_handler; + qp->qp_context = attr.qp_context; + qp->qp_type = attr.qp_type; + qp->xrcd = attr.xrc_domain; + atomic_inc(&pd->usecnt); + atomic_inc(&attr.send_cq->usecnt); + atomic_inc(&attr.recv_cq->usecnt); + if (attr.srq) + atomic_inc(&attr.srq->usecnt); + else if (attr.xrc_domain) + atomic_inc(&attr.xrc_domain->usecnt); + + obj->uevent.uobject.object = qp; + ret = idr_add_uobj(&ib_uverbs_qp_idr, &obj->uevent.uobject); + if (ret) + goto err_destroy; + + memset(&resp, 0, sizeof resp); + resp.qpn = qp->qp_num; + resp.qp_handle = obj->uevent.uobject.id; + resp.max_recv_sge = attr.cap.max_recv_sge; + resp.max_send_sge = attr.cap.max_send_sge; + resp.max_recv_wr = attr.cap.max_recv_wr; + resp.max_send_wr = attr.cap.max_send_wr; + resp.max_inline_data = attr.cap.max_inline_data; + + if (copy_to_user((void __user *) (unsigned long) cmd.response, + &resp, sizeof resp)) { + ret = -EFAULT; + goto err_copy; + } + + put_pd_read(pd); + put_cq_read(scq); + if (rcq != scq) + put_cq_read(rcq); + if (srq) + put_srq_read(srq); + if (xrcd) + put_xrcd_read(xrcd_uobj); + + mutex_lock(&file->mutex); + list_add_tail(&obj->uevent.uobject.list, &file->ucontext->qp_list); + mutex_unlock(&file->mutex); + + obj->uevent.uobject.live = 1; + + 
up_write(&obj->uevent.uobject.mutex); + + return in_len; + +err_copy: + idr_remove_uobj(&ib_uverbs_qp_idr, &obj->uevent.uobject); + +err_destroy: + ib_destroy_qp(qp); + +err_put: + if (pd) + put_pd_read(pd); + if (scq) + put_cq_read(scq); + if (rcq && rcq != scq) + put_cq_read(rcq); + if (srq) + put_srq_read(srq); + if (xrcd) + put_xrcd_read(xrcd_uobj); + + put_uobj_write(&obj->uevent.uobject); + return ret; +} + +ssize_t ib_uverbs_create_qp_expanded(struct ib_uverbs_file *file, const char __user *buf, int in_len, int out_len) { - struct ib_uverbs_create_qp cmd; + struct ib_uverbs_create_qp_expanded cmd; struct ib_uverbs_create_qp_resp resp; struct ib_udata udata; struct ib_uqp_object *obj; @@ -1078,7 +1232,6 @@ ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file, goto err_put; } - attr.create_flags = 0; attr.event_handler = ib_uverbs_qp_event_handler; attr.qp_context = file; attr.send_cq = scq; @@ -1087,7 +1240,7 @@ ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file, attr.sq_sig_type = cmd.sq_sig_all ? 
IB_SIGNAL_ALL_WR : IB_SIGNAL_REQ_WR; attr.qp_type = cmd.qp_type; attr.xrc_domain = xrcd; - attr.create_flags = 0; + attr.create_flags = cmd.create_flags; attr.cap.max_send_wr = cmd.max_send_wr; attr.cap.max_recv_wr = cmd.max_recv_wr; diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c index 1a96c35..cb435be 100644 --- a/drivers/infiniband/core/uverbs_main.c +++ b/drivers/infiniband/core/uverbs_main.c @@ -117,6 +117,7 @@ static ssize_t (*uverbs_cmd_table[])(struct ib_uverbs_file *file, [IB_USER_VERBS_CMD_QUERY_XRC_RCV_QP] = ib_uverbs_query_xrc_rcv_qp, [IB_USER_VERBS_CMD_REG_XRC_RCV_QP] = ib_uverbs_reg_xrc_rcv_qp, [IB_USER_VERBS_CMD_UNREG_XRC_RCV_QP] = ib_uverbs_unreg_xrc_rcv_qp, + [IB_USER_VERBS_CMD_CREATE_QP_EXPANDED] = ib_uverbs_create_qp_expanded, }; static struct vfsmount *uverbs_event_mnt; diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c index 030f696..f954533 100644 --- a/drivers/infiniband/hw/mlx4/main.c +++ b/drivers/infiniband/hw/mlx4/main.c @@ -728,7 +728,8 @@ static void *mlx4_ib_add(struct mlx4_dev *dev) (1ull << IB_USER_VERBS_CMD_MODIFY_XRC_RCV_QP) | (1ull << IB_USER_VERBS_CMD_QUERY_XRC_RCV_QP) | (1ull << IB_USER_VERBS_CMD_REG_XRC_RCV_QP) | - (1ull << IB_USER_VERBS_CMD_UNREG_XRC_RCV_QP); + (1ull << IB_USER_VERBS_CMD_UNREG_XRC_RCV_QP) | + (1ull << IB_USER_VERBS_CMD_CREATE_QP_EXPANDED); } if (init_node_data(ibdev)) diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c index 0d3f770..c65f88b 100644 --- a/drivers/infiniband/hw/mlx4/qp.c +++ b/drivers/infiniband/hw/mlx4/qp.c @@ -518,9 +518,6 @@ static int create_qp_common(struct mlx4_ib_dev *dev, struct ib_pd *pd, } else { qp->sq_no_prefetch = 0; - if (init_attr->create_flags & IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK) - qp->flags |= MLX4_IB_QP_BLOCK_MULTICAST_LOOPBACK; - if (init_attr->create_flags & IB_QP_CREATE_IPOIB_UD_LSO) qp->flags |= MLX4_IB_QP_LSO; @@ -559,6 +556,10 @@ static int create_qp_common(struct 
mlx4_ib_dev *dev, struct ib_pd *pd, } } + if (init_attr->create_flags & + IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK) + qp->flags |= MLX4_IB_QP_BLOCK_MULTICAST_LOOPBACK; + err = mlx4_qp_alloc(dev->dev, sqpn, &qp->mqp); if (err) goto err_wrid; @@ -705,8 +706,11 @@ struct ib_qp *mlx4_ib_create_qp(struct ib_pd *pd, IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK)) return ERR_PTR(-EINVAL); - if (init_attr->create_flags && - (pd->uobject || init_attr->qp_type != IB_QPT_UD)) + if (init_attr->create_flags && init_attr->qp_type != IB_QPT_UD) + return ERR_PTR(-EINVAL); + + if ((init_attr->create_flags & IB_QP_CREATE_IPOIB_UD_LSO) && + pd->uobject) return ERR_PTR(-EINVAL); switch (init_attr->qp_type) { diff --git a/include/rdma/ib_user_verbs.h b/include/rdma/ib_user_verbs.h index 0df90d8..300474f 100644 --- a/include/rdma/ib_user_verbs.h +++ b/include/rdma/ib_user_verbs.h @@ -90,6 +90,7 @@ enum { IB_USER_VERBS_CMD_QUERY_XRC_RCV_QP, IB_USER_VERBS_CMD_REG_XRC_RCV_QP, IB_USER_VERBS_CMD_UNREG_XRC_RCV_QP, + IB_USER_VERBS_CMD_CREATE_QP_EXPANDED, }; /* @@ -411,6 +412,27 @@ struct ib_uverbs_create_qp { __u64 driver_data[0]; }; +struct ib_uverbs_create_qp_expanded { + __u64 response; + __u64 user_handle; + __u32 pd_handle; + __u32 send_cq_handle; + __u32 recv_cq_handle; + __u32 srq_handle; + __u32 max_send_wr; + __u32 max_recv_wr; + __u32 max_send_sge; + __u32 max_recv_sge; + __u32 max_inline_data; + __u8 sq_sig_all; + __u8 qp_type; + __u8 is_srq; + __u8 reserved; + __u32 reserved1; + __u32 create_flags; + __u64 driver_data[0]; +}; + struct ib_uverbs_create_qp_resp { __u32 qp_handle; __u32 qpn; _______________________________________________ general mailing list general at lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From dotanba at gmail.com Sun Aug 3 01:56:13 2008 From: dotanba at gmail.com (Dotan Barak) Date: Sun, 3 Aug 2008 11:56:13 +0300 Subject: ***SPAM*** Re: [ofa-general] 
***SPAM*** OFED-1.3 RDMA CM, IB_ACCESS_LOCAL_WRITE flag missing In-Reply-To: <309a667c0808020302h6e340c88xc662064c400795d3@mail.gmail.com> References: <309a667c0808010403r4036bc51u8a167954a6fe9739@mail.gmail.com> <000001c8f3f4$d8f0ab00$bb37170a@amr.corp.intel.com> <309a667c0808020302h6e340c88xc662064c400795d3@mail.gmail.com> Message-ID: <2f3bf9a60808030156p26c0e292ge0e212e887b8fbf@mail.gmail.com> On Sat, Aug 2, 2008 at 1:02 PM, Devesh Sharma wrote: > Thanks for replying, > Can you explain me in a bit more detail, because if QP dose not have a > IB_ACCESS_LOCAL_WRITE permission, according to IB spec, HCA should generate > Local Protection Error while processing the WRs. Is it assumed that mthca > driver (or some other provider driver) will set IB_ACCESS_LOCAL_WRITE by > itself, even if its not requested? The protection flag in the QP attributes is only specify which incoming remote operations are supported (Read/Write/Atomic). The IB_ACCESS_LOCAL_WRITE is enabled (or not) in the Memory Region. Dotan > > On Fri, Aug 1, 2008 at 10:07 PM, Sean Hefty wrote: >> >> > while creating QP using rdma_create_qp(), I am not seeing any where >> > it is setting IB_ACCESS_LOCAL_WRITE flag other then for IW QPs. Is >> > it for some specific reason its just a mistake? >> >> Commit 1ca8d15619f725e223c19137350b0336b9196193 (dated July 22nd) removed >> this >> for iWarp QPs. The qp_access_flags is only used to set remote >> permissions, so >> it should not be being set. 
>> >> - Sean >> > > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit > http://openib.org/mailman/listinfo/openib-general > From vlad at lists.openfabrics.org Sun Aug 3 02:13:35 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Sun, 3 Aug 2008 02:13:35 -0700 (PDT) Subject: [ofa-general] ofa_1_4_kernel 20080803-0200 daily build status Message-ID: <20080803091335.C48A4E609F0@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_4/linux-2.6.git git_branch: ofed_kernel Common build parameters: Passed: Failed: Build failed on i686 with linux-2.6.16 Build failed on i686 with linux-2.6.19 Build failed on i686 with linux-2.6.18 Build failed on i686 with linux-2.6.17 Build failed on i686 with linux-2.6.21.1 Build failed on i686 with linux-2.6.22 Build failed on x86_64 with linux-2.6.16.43-0.3-smp Log: Hunk #4 succeeded at 2717 (offset -122 lines). Hunk #5 succeeded at 3642 (offset -144 lines). Hunk #6 succeeded at 3737 (offset -144 lines). Hunk #7 succeeded at 3764 (offset -144 lines). patching file include/rdma/ib_verbs.h Hunk #1 succeeded at 1193 (offset 84 lines). Patch core_1sysfs_1_to_2_6_24.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.16 Log: Hunk #4 succeeded at 2717 (offset -122 lines). Hunk #5 succeeded at 3642 (offset -144 lines). Hunk #6 succeeded at 3737 (offset -144 lines). Hunk #7 succeeded at 3764 (offset -144 lines). patching file include/rdma/ib_verbs.h Hunk #1 succeeded at 1193 (offset 84 lines). 
Patch core_1sysfs_1_to_2_6_24.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.16.21-0.8-smp Log: Hunk #4 succeeded at 2717 (offset -122 lines). Hunk #5 succeeded at 3642 (offset -144 lines). Hunk #6 succeeded at 3737 (offset -144 lines). Hunk #7 succeeded at 3764 (offset -144 lines). patching file include/rdma/ib_verbs.h Hunk #1 succeeded at 1193 (offset 84 lines). Patch core_1sysfs_1_to_2_6_24.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.17 Log: Hunk #4 succeeded at 2717 (offset -122 lines). Hunk #5 succeeded at 3642 (offset -144 lines). Hunk #6 succeeded at 3737 (offset -144 lines). Hunk #7 succeeded at 3764 (offset -144 lines). patching file include/rdma/ib_verbs.h Hunk #1 succeeded at 1193 (offset 84 lines). Patch core_1sysfs_1_to_2_6_24.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18 Log: Importing patch /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.18_x86_64_check/kernel_patches/backport/2.6.18/core_cm_to_2_6_21.patch (stored as core_cm_to_2_6_21.patch) /usr/bin/quilt --quiltrc /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.18_x86_64_check/patches/quiltrc push patches/core_cm_to_2_6_21.patch Applying patch core_cm_to_2_6_21.patch patching file drivers/infiniband/core/cm.c Hunk #1 FAILED at 3705. 
1 out of 1 hunk FAILED -- rejects in file drivers/infiniband/core/cm.c Patch core_cm_to_2_6_21.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.16.60-0.21-smp Log: Hunk #4 succeeded at 2717 (offset -122 lines). Hunk #5 succeeded at 3642 (offset -144 lines). Hunk #6 succeeded at 3737 (offset -144 lines). Hunk #7 succeeded at 3764 (offset -144 lines). patching file include/rdma/ib_verbs.h Hunk #1 succeeded at 1193 (offset 84 lines). Patch core_1sysfs_1_to_2_6_24.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-1.2798.fc6 Log: Importing patch /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.18-1.2798.fc6_x86_64_check/kernel_patches/backport/2.6.18_FC6/core_cm_to_2_6_21.patch (stored as core_cm_to_2_6_21.patch) /usr/bin/quilt --quiltrc /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.18-1.2798.fc6_x86_64_check/patches/quiltrc push patches/core_cm_to_2_6_21.patch Applying patch core_cm_to_2_6_21.patch patching file drivers/infiniband/core/cm.c Hunk #1 FAILED at 3705. 
1 out of 1 hunk FAILED -- rejects in file drivers/infiniband/core/cm.c Patch core_cm_to_2_6_21.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-8.el5 Log: Importing patch /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.18-8.el5_x86_64_check/kernel_patches/backport/2.6.18_FC6/core_cm_to_2_6_21.patch (stored as core_cm_to_2_6_21.patch) /usr/bin/quilt --quiltrc /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.18-8.el5_x86_64_check/patches/quiltrc push patches/core_cm_to_2_6_21.patch Applying patch core_cm_to_2_6_21.patch patching file drivers/infiniband/core/cm.c Hunk #1 FAILED at 3705. 1 out of 1 hunk FAILED -- rejects in file drivers/infiniband/core/cm.c Patch core_cm_to_2_6_21.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-53.el5 Log: Importing patch /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.18-53.el5_x86_64_check/kernel_patches/backport/2.6.18-EL5.1/core_cm_to_2_6_21.patch (stored as core_cm_to_2_6_21.patch) /usr/bin/quilt --quiltrc /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.18-53.el5_x86_64_check/patches/quiltrc push patches/core_cm_to_2_6_21.patch Applying patch core_cm_to_2_6_21.patch patching file drivers/infiniband/core/cm.c Hunk #1 FAILED at 3705. 
1 out of 1 hunk FAILED -- rejects in file drivers/infiniband/core/cm.c Patch core_cm_to_2_6_21.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.19 Log: Importing patch /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.19_x86_64_check/kernel_patches/backport/2.6.19/core_cm_to_2_6_21.patch (stored as core_cm_to_2_6_21.patch) /usr/bin/quilt --quiltrc /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.19_x86_64_check/patches/quiltrc push patches/core_cm_to_2_6_21.patch Applying patch core_cm_to_2_6_21.patch patching file drivers/infiniband/core/cm.c Hunk #1 FAILED at 3705. 1 out of 1 hunk FAILED -- rejects in file drivers/infiniband/core/cm.c Patch core_cm_to_2_6_21.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.20 Log: Importing patch /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.20_x86_64_check/kernel_patches/backport/2.6.20/core_cm_to_2_6_21.patch (stored as core_cm_to_2_6_21.patch) /usr/bin/quilt --quiltrc /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.20_x86_64_check/patches/quiltrc push patches/core_cm_to_2_6_21.patch Applying patch core_cm_to_2_6_21.patch patching file drivers/infiniband/core/cm.c Hunk #1 FAILED at 3705. 
1 out of 1 hunk FAILED -- rejects in file drivers/infiniband/core/cm.c Patch core_cm_to_2_6_21.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-93.el5 Log: Importing patch /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.18-93.el5_x86_64_check/kernel_patches/backport/2.6.18-EL5.2/core_cm_to_2_6_21.patch (stored as core_cm_to_2_6_21.patch) /usr/bin/quilt --quiltrc /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.18-93.el5_x86_64_check/patches/quiltrc push patches/core_cm_to_2_6_21.patch Applying patch core_cm_to_2_6_21.patch patching file drivers/infiniband/core/cm.c Hunk #1 FAILED at 3705. 1 out of 1 hunk FAILED -- rejects in file drivers/infiniband/core/cm.c Patch core_cm_to_2_6_21.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.21.1 Log: Importing patch /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.21.1_x86_64_check/kernel_patches/backport/2.6.21/core_cm_to_2_6_21.patch (stored as core_cm_to_2_6_21.patch) /usr/bin/quilt --quiltrc /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.21.1_x86_64_check/patches/quiltrc push patches/core_cm_to_2_6_21.patch Applying patch core_cm_to_2_6_21.patch patching file drivers/infiniband/core/cm.c Hunk #1 FAILED at 3705. 1 out of 1 hunk FAILED -- rejects in file drivers/infiniband/core/cm.c Patch core_cm_to_2_6_21.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.22.5-31-default Log: Hunk #4 succeeded at 1381 (offset 5 lines). Hunk #5 succeeded at 1453 (offset 5 lines). Hunk #6 succeeded at 1661 (offset 5 lines). Hunk #7 succeeded at 1686 (offset 5 lines). 
Hunk #8 succeeded at 1762 (offset 5 lines). Hunk #9 succeeded at 2367 (offset 5 lines). Patch sdp_0090_revert_to_2_6_24.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.22 Log: Hunk #4 succeeded at 1381 (offset 5 lines). Hunk #5 succeeded at 1453 (offset 5 lines). Hunk #6 succeeded at 1661 (offset 5 lines). Hunk #7 succeeded at 1686 (offset 5 lines). Hunk #8 succeeded at 1762 (offset 5 lines). Hunk #9 succeeded at 2367 (offset 5 lines). Patch sdp_0090_revert_to_2_6_24.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.24 Log: Hunk #4 succeeded at 1381 (offset 5 lines). Hunk #5 succeeded at 1453 (offset 5 lines). Hunk #6 succeeded at 1661 (offset 5 lines). Hunk #7 succeeded at 1686 (offset 5 lines). Hunk #8 succeeded at 1762 (offset 5 lines). Hunk #9 succeeded at 2367 (offset 5 lines). 
Patch sdp_0090_revert_to_2_6_24.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.25 Log: /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.25_x86_64_check/net/rds/ib_recv.c:799: error: 'struct ib_wc' has no member named 'imm_data' /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.25_x86_64_check/net/rds/ib_recv.c:799: error: 'struct ib_wc' has no member named 'imm_data' /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.25_x86_64_check/net/rds/ib_recv.c:799: error: 'struct ib_wc' has no member named 'imm_data' /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.25_x86_64_check/net/rds/ib_recv.c:799: error: 'struct ib_wc' has no member named 'imm_data' make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.25_x86_64_check/net/rds/ib_recv.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.25_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.25_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.25' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.26 Log: /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.26_x86_64_check/net/rds/ib_recv.c:799: error: 'struct ib_wc' has no member named 'imm_data' /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.26_x86_64_check/net/rds/ib_recv.c:799: error: 'struct ib_wc' has no member named 'imm_data' /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.26_x86_64_check/net/rds/ib_recv.c:799: error: 'struct ib_wc' has no member named 'imm_data' /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.26_x86_64_check/net/rds/ib_recv.c:799: error: 'struct ib_wc' has no member named 'imm_data' make[3]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.26_x86_64_check/net/rds/ib_recv.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.26_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.26_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.26' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-55.ELsmp Log: Hunk #4 succeeded at 2717 (offset -122 lines). Hunk #5 succeeded at 3642 (offset -144 lines). Hunk #6 succeeded at 3737 (offset -144 lines). Hunk #7 succeeded at 3764 (offset -144 lines). patching file include/rdma/ib_verbs.h Hunk #1 succeeded at 1193 (offset 84 lines). Patch core_1sysfs_1_to_2_6_24.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-42.ELsmp Log: Hunk #4 succeeded at 2717 (offset -122 lines). Hunk #5 succeeded at 3642 (offset -144 lines). Hunk #6 succeeded at 3737 (offset -144 lines). Hunk #7 succeeded at 3764 (offset -144 lines). patching file include/rdma/ib_verbs.h Hunk #1 succeeded at 1193 (offset 84 lines). Patch core_1sysfs_1_to_2_6_24.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-67.ELsmp Log: Hunk #4 succeeded at 2717 (offset -122 lines). Hunk #5 succeeded at 3642 (offset -144 lines). Hunk #6 succeeded at 3737 (offset -144 lines). Hunk #7 succeeded at 3764 (offset -144 lines). patching file include/rdma/ib_verbs.h Hunk #1 succeeded at 1193 (offset 84 lines). 
Patch core_1sysfs_1_to_2_6_24.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.16.21-0.8-default Log: Hunk #4 succeeded at 2717 (offset -122 lines). Hunk #5 succeeded at 3642 (offset -144 lines). Hunk #6 succeeded at 3737 (offset -144 lines). Hunk #7 succeeded at 3764 (offset -144 lines). patching file include/rdma/ib_verbs.h Hunk #1 succeeded at 1193 (offset 84 lines). Patch core_1sysfs_1_to_2_6_24.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.16 Log: Hunk #4 succeeded at 2717 (offset -122 lines). Hunk #5 succeeded at 3642 (offset -144 lines). Hunk #6 succeeded at 3737 (offset -144 lines). Hunk #7 succeeded at 3764 (offset -144 lines). patching file include/rdma/ib_verbs.h Hunk #1 succeeded at 1193 (offset 84 lines). Patch core_1sysfs_1_to_2_6_24.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.17 Log: Hunk #4 succeeded at 2717 (offset -122 lines). Hunk #5 succeeded at 3642 (offset -144 lines). Hunk #6 succeeded at 3737 (offset -144 lines). Hunk #7 succeeded at 3764 (offset -144 lines). patching file include/rdma/ib_verbs.h Hunk #1 succeeded at 1193 (offset 84 lines). 
Patch core_1sysfs_1_to_2_6_24.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.18 Log: Importing patch /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.18_ia64_check/kernel_patches/backport/2.6.18/core_cm_to_2_6_21.patch (stored as core_cm_to_2_6_21.patch) /usr/bin/quilt --quiltrc /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.18_ia64_check/patches/quiltrc push patches/core_cm_to_2_6_21.patch Applying patch core_cm_to_2_6_21.patch patching file drivers/infiniband/core/cm.c Hunk #1 FAILED at 3705. 1 out of 1 hunk FAILED -- rejects in file drivers/infiniband/core/cm.c Patch core_cm_to_2_6_21.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.19 Log: Importing patch /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.19_ia64_check/kernel_patches/backport/2.6.19/core_cm_to_2_6_21.patch (stored as core_cm_to_2_6_21.patch) /usr/bin/quilt --quiltrc /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.19_ia64_check/patches/quiltrc push patches/core_cm_to_2_6_21.patch Applying patch core_cm_to_2_6_21.patch patching file drivers/infiniband/core/cm.c Hunk #1 FAILED at 3705. 
1 out of 1 hunk FAILED -- rejects in file drivers/infiniband/core/cm.c Patch core_cm_to_2_6_21.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.21.1 Log: Importing patch /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.21.1_ia64_check/kernel_patches/backport/2.6.21/core_cm_to_2_6_21.patch (stored as core_cm_to_2_6_21.patch) /usr/bin/quilt --quiltrc /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.21.1_ia64_check/patches/quiltrc push patches/core_cm_to_2_6_21.patch Applying patch core_cm_to_2_6_21.patch patching file drivers/infiniband/core/cm.c Hunk #1 FAILED at 3705. 1 out of 1 hunk FAILED -- rejects in file drivers/infiniband/core/cm.c Patch core_cm_to_2_6_21.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.22 Log: Hunk #4 succeeded at 1381 (offset 5 lines). Hunk #5 succeeded at 1453 (offset 5 lines). Hunk #6 succeeded at 1661 (offset 5 lines). Hunk #7 succeeded at 1686 (offset 5 lines). Hunk #8 succeeded at 1762 (offset 5 lines). Hunk #9 succeeded at 2367 (offset 5 lines). Patch sdp_0090_revert_to_2_6_24.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.23 Log: Hunk #4 succeeded at 1381 (offset 5 lines). Hunk #5 succeeded at 1453 (offset 5 lines). Hunk #6 succeeded at 1661 (offset 5 lines). Hunk #7 succeeded at 1686 (offset 5 lines). Hunk #8 succeeded at 1762 (offset 5 lines). Hunk #9 succeeded at 2367 (offset 5 lines). 
Patch sdp_0090_revert_to_2_6_24.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.24 Log: Hunk #4 succeeded at 1381 (offset 5 lines). Hunk #5 succeeded at 1453 (offset 5 lines). Hunk #6 succeeded at 1661 (offset 5 lines). Hunk #7 succeeded at 1686 (offset 5 lines). Hunk #8 succeeded at 1762 (offset 5 lines). Hunk #9 succeeded at 2367 (offset 5 lines). Patch sdp_0090_revert_to_2_6_24.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.26 Log: /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.26_ia64_check/net/rds/ib_recv.c:799: error: 'struct ib_wc' has no member named 'imm_data' /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.26_ia64_check/net/rds/ib_recv.c:799: error: 'struct ib_wc' has no member named 'imm_data' /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.26_ia64_check/net/rds/ib_recv.c:799: error: 'struct ib_wc' has no member named 'imm_data' /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.26_ia64_check/net/rds/ib_recv.c:799: error: 'struct ib_wc' has no member named 'imm_data' make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.26_ia64_check/net/rds/ib_recv.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.26_ia64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.26_ia64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.26' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.25 Log: /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.25_ia64_check/net/rds/ib_recv.c:799: error: 'struct ib_wc' has no member named 'imm_data' 
/home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.25_ia64_check/net/rds/ib_recv.c:799: error: 'struct ib_wc' has no member named 'imm_data' /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.25_ia64_check/net/rds/ib_recv.c:799: error: 'struct ib_wc' has no member named 'imm_data' /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.25_ia64_check/net/rds/ib_recv.c:799: error: 'struct ib_wc' has no member named 'imm_data' make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.25_ia64_check/net/rds/ib_recv.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.25_ia64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.25_ia64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.25' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.16 Log: Hunk #4 succeeded at 2717 (offset -122 lines). Hunk #5 succeeded at 3642 (offset -144 lines). Hunk #6 succeeded at 3737 (offset -144 lines). Hunk #7 succeeded at 3764 (offset -144 lines). patching file include/rdma/ib_verbs.h Hunk #1 succeeded at 1193 (offset 84 lines). Patch core_1sysfs_1_to_2_6_24.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.18 Log: Importing patch /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.18_ppc64_check/kernel_patches/backport/2.6.18/core_cm_to_2_6_21.patch (stored as core_cm_to_2_6_21.patch) /usr/bin/quilt --quiltrc /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.18_ppc64_check/patches/quiltrc push patches/core_cm_to_2_6_21.patch Applying patch core_cm_to_2_6_21.patch patching file drivers/infiniband/core/cm.c Hunk #1 FAILED at 3705. 
1 out of 1 hunk FAILED -- rejects in file drivers/infiniband/core/cm.c Patch core_cm_to_2_6_21.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.18-8.el5 Log: Importing patch /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.18-8.el5_ppc64_check/kernel_patches/backport/2.6.18_FC6/core_cm_to_2_6_21.patch (stored as core_cm_to_2_6_21.patch) /usr/bin/quilt --quiltrc /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.18-8.el5_ppc64_check/patches/quiltrc push patches/core_cm_to_2_6_21.patch Applying patch core_cm_to_2_6_21.patch patching file drivers/infiniband/core/cm.c Hunk #1 FAILED at 3705. 1 out of 1 hunk FAILED -- rejects in file drivers/infiniband/core/cm.c Patch core_cm_to_2_6_21.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.17 Log: Hunk #4 succeeded at 2717 (offset -122 lines). Hunk #5 succeeded at 3642 (offset -144 lines). Hunk #6 succeeded at 3737 (offset -144 lines). Hunk #7 succeeded at 3764 (offset -144 lines). patching file include/rdma/ib_verbs.h Hunk #1 succeeded at 1193 (offset 84 lines). Patch core_1sysfs_1_to_2_6_24.patch does not apply (enforce with -f) Failed executing /usr/bin/quilt ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.19 Log: Importing patch /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.19_ppc64_check/kernel_patches/backport/2.6.19/core_cm_to_2_6_21.patch (stored as core_cm_to_2_6_21.patch) /usr/bin/quilt --quiltrc /home/vlad/tmp/ofa_1_4_kernel-20080803-0200_linux-2.6.19_ppc64_check/patches/quiltrc push patches/core_cm_to_2_6_21.patch Applying patch core_cm_to_2_6_21.patch patching file drivers/infiniband/core/cm.c Hunk #1 FAILED at 3705. 
1 out of 1 hunk FAILED -- rejects in file drivers/infiniband/core/cm.c
Patch core_cm_to_2_6_21.patch does not apply (enforce with -f)
Failed executing /usr/bin/quilt
----------------------------------------------------------------------------------
Build failed on ppc64 with linux-2.6.24
Log:
Hunk #4 succeeded at 1381 (offset 5 lines).
Hunk #5 succeeded at 1453 (offset 5 lines).
Hunk #6 succeeded at 1661 (offset 5 lines).
Hunk #7 succeeded at 1686 (offset 5 lines).
Hunk #8 succeeded at 1762 (offset 5 lines).
Hunk #9 succeeded at 2367 (offset 5 lines).
Patch sdp_0090_revert_to_2_6_24.patch does not apply (enforce with -f)
Failed executing /usr/bin/quilt
----------------------------------------------------------------------------------

From ronli at voltaire.com Sun Aug 3 03:36:24 2008
From: ronli at voltaire.com (Ron Livne)
Date: Sun, 3 Aug 2008 13:36:24 +0300
Subject: [ofa-general] RE: [PATCH 1/1 v2] librdmacm: add support for create qp expanded (with changelog this time)
In-Reply-To: <000601c8f330$aa81c430$8498070a@amr.corp.intel.com>
References: <000601c8f330$aa81c430$8498070a@amr.corp.intel.com>
Message-ID:

Hi,

>+int rdma_create_qp_expanded(struct rdma_cm_id *id, struct ibv_pd *pd,
>+                            struct ibv_qp_init_attr *qp_init_attr,
>+                            uint32_t create_flags);

>Can we eliminate expanded and just use 'create_flags'?

If I eliminate the expanded parameter, I will have to call
ibv_create_qp_expanded with create_flags = 0.
That is not a good idea, because it would not be compatible with an older kernel.

Ron

-----Original Message-----
From: Sean Hefty [mailto:sean.hefty at intel.com]
Sent: Thursday, July 31, 2008 8:13 PM
To: Ron Livne
Cc: general at lists.openfabrics.org; Olga Shern
Subject: RE: [PATCH 1/1 v2] librdmacm: add support for create qp expanded (with changelog this time)

thanks - I will set this aside until all necessary changes are in libibverbs.
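A minimal sketch of one way to read the "just use 'create_flags'" suggestion, under the assumption that a zero create_flags value can always be served by the legacy command. The common helper picks the path from the flags alone, so callers that pass no flags never touch the expanded command and would stay compatible with an older kernel, which is the concern raised in Ron's reply above. The two stub functions and the qp_path enum are hypothetical stand-ins, not libibverbs or librdmacm API; they only record which path was chosen.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical marker for which creation path was taken. */
enum qp_path { QP_PATH_LEGACY, QP_PATH_EXPANDED };

/* Hypothetical stand-in for the existing create-QP call (no flags). */
static enum qp_path stub_create_qp(void)
{
	return QP_PATH_LEGACY;
}

/* Hypothetical stand-in for the proposed expanded create-QP call. */
static enum qp_path stub_create_qp_expanded(uint32_t create_flags)
{
	/* In this sketch the expanded path is only reached with flags set. */
	assert(create_flags != 0);
	return QP_PATH_EXPANDED;
}

/*
 * Dispatch on create_flags alone: zero flags take the legacy path,
 * any non-zero flags take the expanded path. No separate "expanded"
 * argument is needed.
 */
static enum qp_path create_qp_common(uint32_t create_flags)
{
	return create_flags ? stub_create_qp_expanded(create_flags)
			    : stub_create_qp();
}
```

With this shape, rdma_create_qp() would pass 0 and rdma_create_qp_expanded() would pass its flags through. Whether a caller of the expanded entry point with create_flags == 0 should also fall back to the legacy command is a design choice the thread leaves open.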
>Adds a new function: int rdma_create_qp_expanded
>which uses the ibv_create_qp_expanded function in libibverbs and uses
>it similarly to ibv_create_qp, with the difference of creation flags.
>
>Changes in v2:
>Added compatibility to old libibverbs.
>
>Signed-off-by: Ron Livne
>
>diff --git a/include/rdma/rdma_cma.h b/include/rdma/rdma_cma.h
>index a516ab8..34c6b9f 100644
>--- a/include/rdma/rdma_cma.h
>+++ b/include/rdma/rdma_cma.h
>@@ -296,6 +296,10 @@ int rdma_resolve_route(struct rdma_cm_id *id, int
>timeout_ms);
> int rdma_create_qp(struct rdma_cm_id *id, struct ibv_pd *pd,
>                    struct ibv_qp_init_attr *qp_init_attr);
>
>+int rdma_create_qp_expanded(struct rdma_cm_id *id, struct ibv_pd *pd,
>+                            struct ibv_qp_init_attr *qp_init_attr,
>+                            uint32_t create_flags);
>+
> /**
>  * rdma_destroy_qp - Deallocate a QP.
>  * @id: RDMA identifier.
>diff --git a/src/cma.c b/src/cma.c
>index ecb41bc..4e2da76 100644
>--- a/src/cma.c
>+++ b/src/cma.c
>@@ -783,33 +783,49 @@ static int ucma_init_ud_qp(struct cma_id_private
>*id_priv, struct ibv_qp *qp)
>         return ibv_modify_qp(qp, &qp_attr, IBV_QP_STATE | IBV_QP_SQ_PSN);
> }
>
>-int rdma_create_qp(struct rdma_cm_id *id, struct ibv_pd *pd,
>-                   struct ibv_qp_init_attr *qp_init_attr)
>+static int rdma_create_qp_common(struct rdma_cm_id *id, struct ibv_pd *pd,
>+                                 struct ibv_qp_init_attr *qp_init_attr,
>+                                 uint32_t create_flags, int expanded)

Can we eliminate expanded and just use 'create_flags'? The patch uses spaces in
place of tabs, but I can fix that up.

> {
>-        struct cma_id_private *id_priv;
>-        struct ibv_qp *qp;
>-        int ret;
>-
>-        id_priv = container_of(id, struct cma_id_private, id);
>-        if (id->verbs != pd->context)
>-                return -EINVAL;
>-
>-        qp = ibv_create_qp(pd, qp_init_attr);
>-        if (!qp)
>-                return -ENOMEM;
>+        struct cma_id_private *id_priv;
>+        struct ibv_qp *qp;
>+        int ret;
>+
>+        id_priv = container_of(id, struct cma_id_private, id);
>+        if (id->verbs != pd->context)
>+                return -EINVAL;
>+
>+        qp = expanded ?
>+                ibv_create_qp_expanded(pd, qp_init_attr, create_flags) :
>+                ibv_create_qp(pd, qp_init_attr);
>+        if (!qp)
>+                return -ENOMEM;
>+
>+        if (ucma_is_ud_ps(id->ps))
>+                ret = ucma_init_ud_qp(id_priv, qp);
>+        else
>+                ret = ucma_init_conn_qp(id_priv, qp);
>+        if (ret)
>+                goto err;
>+
>+        id->qp = qp;
>+        return 0;
>+err:
>+        ibv_destroy_qp(qp);
>+        return ret;
>+}
>
>-        if (ucma_is_ud_ps(id->ps))
>-                ret = ucma_init_ud_qp(id_priv, qp);
>-        else
>-                ret = ucma_init_conn_qp(id_priv, qp);
>-        if (ret)
>-                goto err;
>+int rdma_create_qp_expanded(struct rdma_cm_id *id, struct ibv_pd *pd,
>+                            struct ibv_qp_init_attr *qp_init_attr,
>+                            uint32_t create_flags)
>+{
>+        return rdma_create_qp_common(id, pd, qp_init_attr, create_flags, 1);
>+}
>
>-        id->qp = qp;
>-        return 0;
>-err:
>-        ibv_destroy_qp(qp);
>-        return ret;
>+int rdma_create_qp(struct rdma_cm_id *id, struct ibv_pd *pd,
>+                   struct ibv_qp_init_attr *qp_init_attr)
>+{
>+        return rdma_create_qp_common(id, pd, qp_init_attr, 0, 0);
> }
>
> void rdma_destroy_qp(struct rdma_cm_id *id)
>diff --git a/src/librdmacm.map b/src/librdmacm.map
>index cb94efe..b237eda 100644
>--- a/src/librdmacm.map
>+++ b/src/librdmacm.map
>@@ -28,5 +28,6 @@ RDMACM_1.0 {
>         rdma_get_local_addr;
>         rdma_get_peer_addr;
>         rdma_migrate_id;
>+        rdma_create_qp_expanded;
>         local: *;
> };

From jackm at dev.mellanox.co.il Sun Aug 3 04:12:50 2008
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Sun, 3 Aug 2008 14:12:50 +0300
Subject: Fwd: [ofa-general] [PATCH 3/3 v3] ib/uverbs: add support for create_qp_expanded in uverbs
In-Reply-To: <3b5e77ad0808030130s63c2501bq6331c07494a7f158@mail.gmail.com>
References: <3b5e77ad0808030130s63c2501bq6331c07494a7f158@mail.gmail.com>
Message-ID: <200808031412.50264.jackm@dev.mellanox.co.il>

Ron,
I have not yet modified the OFED 1.4 patches to what I submitted to the list.
I'm waiting until all the changes settle down and are finalized.

The patches will need to be adapted for OFED 1.4 if you want them in now.
(The OFED 1.4 tree is currently like the 1.4 tree, except for the alignment
bug fixes in the ABI).

- Jack

On Sunday 03 August 2008 11:30, Ron Livne wrote:
> Hi Vlad,
> I'd like this patch to be added to OFED 1.4.
> This patch series is based on Jack M's XRC patch series.
>
> Can you please tell me which libibverbs, libmlx4 and librdmacm versions
> are going to be in OFED 1.4, so I can make the proper modifications
> to the patches for these libraries in order to get them into OFED 1.4?
>
> Roland,
> Do you have any remarks on those patches?
>
> Thank you,
> Ron
>
>
> ---------- Forwarded message ----------
> From: Ron Livne
> Date: Sun, Jul 27, 2008 at 6:23 PM
> Subject: [ofa-general] [PATCH 3/3 v3] ib/uverbs: add support for
> create_qp_expanded in uverbs
> To: Roland Drier
> Cc: Olga Shern , general list
>
>
>
> This patch adds support for create_qp_expanded
> to the uverbs.
> It uses the reserved bitmap in ib_uverbs_create_qp
> to transfer the new creation flags from the user space
> to the kernel.
>
> Changes in v2:
> Minimized code duplication by adding the function
> ib_uverbs_create_qp_common.
>
> LSO now cannot be activated through user space.
>
> Changes in v3:
> Added compatibility for old libibverbs.
> Added field __u32 create_flags to struct ib_uverbs_create_qp_expanded.
> Deleted the function ib_uverbs_create_qp_common from v2.
> > Signed-off-by: Ron Livne > diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h > index b55f0d7..ae9f9a8 100644 > --- a/drivers/infiniband/core/uverbs.h > +++ b/drivers/infiniband/core/uverbs.h > @@ -214,6 +214,7 @@ IB_UVERBS_DECLARE_CMD(modify_xrc_rcv_qp); > IB_UVERBS_DECLARE_CMD(query_xrc_rcv_qp); > IB_UVERBS_DECLARE_CMD(reg_xrc_rcv_qp); > IB_UVERBS_DECLARE_CMD(unreg_xrc_rcv_qp); > +IB_UVERBS_DECLARE_CMD(create_qp_expanded); > > > #endif /* UVERBS_H */ > diff --git a/drivers/infiniband/core/uverbs_cmd.c > b/drivers/infiniband/core/uverbs_cmd.c > index 4402a07..a9c1485 100644 > --- a/drivers/infiniband/core/uverbs_cmd.c > +++ b/drivers/infiniband/core/uverbs_cmd.c > @@ -1030,10 +1030,164 @@ ssize_t ib_uverbs_destroy_cq(struct > ib_uverbs_file *file, > } > > ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file, > + const char __user *buf, int in_len, > + int out_len) > +{ > + struct ib_uverbs_create_qp cmd; > + struct ib_uverbs_create_qp_resp resp; > + struct ib_udata udata; > + struct ib_uqp_object *obj; > + struct ib_pd *pd; > + struct ib_cq *scq, *rcq; > + struct ib_srq *srq; > + struct ib_qp *qp; > + struct ib_qp_init_attr attr; > + struct ib_xrcd *xrcd; > + struct ib_uobject *xrcd_uobj; > + int ret; > + > + if (out_len < sizeof resp) > + return -ENOSPC; > + > + if (copy_from_user(&cmd, buf, sizeof cmd)) > + return -EFAULT; > + > + INIT_UDATA(&udata, buf + sizeof cmd, > + (unsigned long) cmd.response + sizeof resp, > + in_len - sizeof cmd, out_len - sizeof resp); > + > + obj = kmalloc(sizeof *obj, GFP_KERNEL); > + if (!obj) > + return -ENOMEM; > + > + init_uobj(&obj->uevent.uobject, cmd.user_handle, > file->ucontext, &qp_lock_key); > + down_write(&obj->uevent.uobject.mutex); > + > + srq = (cmd.is_srq && cmd.qp_type != IB_QPT_XRC) ? > + idr_read_srq(cmd.srq_handle, file->ucontext) : NULL; > + xrcd = cmd.qp_type == IB_QPT_XRC ? 
> + idr_read_xrcd(cmd.srq_handle, file->ucontext, > &xrcd_uobj) : NULL; > + pd = idr_read_pd(cmd.pd_handle, file->ucontext); > + scq = idr_read_cq(cmd.send_cq_handle, file->ucontext, 0); > + rcq = cmd.recv_cq_handle == cmd.send_cq_handle ? > + scq : idr_read_cq(cmd.recv_cq_handle, file->ucontext, 1); > + > + if (!pd || !scq || !rcq || (cmd.is_srq && !srq) || > + (cmd.qp_type == IB_QPT_XRC && !xrcd)) { > + ret = -EINVAL; > + goto err_put; > + } > + > + attr.event_handler = ib_uverbs_qp_event_handler; > + attr.qp_context = file; > + attr.send_cq = scq; > + attr.recv_cq = rcq; > + attr.srq = srq; > + attr.sq_sig_type = cmd.sq_sig_all ? IB_SIGNAL_ALL_WR : > IB_SIGNAL_REQ_WR; > + attr.qp_type = cmd.qp_type; > + attr.xrc_domain = xrcd; > + attr.create_flags = 0; > + > + attr.cap.max_send_wr = cmd.max_send_wr; > + attr.cap.max_recv_wr = cmd.max_recv_wr; > + attr.cap.max_send_sge = cmd.max_send_sge; > + attr.cap.max_recv_sge = cmd.max_recv_sge; > + attr.cap.max_inline_data = cmd.max_inline_data; > + > + obj->uevent.events_reported = 0; > + INIT_LIST_HEAD(&obj->uevent.event_list); > + INIT_LIST_HEAD(&obj->mcast_list); > + > + qp = pd->device->create_qp(pd, &attr, &udata); > + if (IS_ERR(qp)) { > + ret = PTR_ERR(qp); > + goto err_put; > + } > + > + qp->device = pd->device; > + qp->pd = pd; > + qp->send_cq = attr.send_cq; > + qp->recv_cq = attr.recv_cq; > + qp->srq = attr.srq; > + qp->uobject = &obj->uevent.uobject; > + qp->event_handler = attr.event_handler; > + qp->qp_context = attr.qp_context; > + qp->qp_type = attr.qp_type; > + qp->xrcd = attr.xrc_domain; > + atomic_inc(&pd->usecnt); > + atomic_inc(&attr.send_cq->usecnt); > + atomic_inc(&attr.recv_cq->usecnt); > + if (attr.srq) > + atomic_inc(&attr.srq->usecnt); > + else if (attr.xrc_domain) > + atomic_inc(&attr.xrc_domain->usecnt); > + > + obj->uevent.uobject.object = qp; > + ret = idr_add_uobj(&ib_uverbs_qp_idr, &obj->uevent.uobject); > + if (ret) > + goto err_destroy; > + > + memset(&resp, 0, sizeof resp); > + resp.qpn 
= qp->qp_num; > + resp.qp_handle = obj->uevent.uobject.id; > + resp.max_recv_sge = attr.cap.max_recv_sge; > + resp.max_send_sge = attr.cap.max_send_sge; > + resp.max_recv_wr = attr.cap.max_recv_wr; > + resp.max_send_wr = attr.cap.max_send_wr; > + resp.max_inline_data = attr.cap.max_inline_data; > + > + if (copy_to_user((void __user *) (unsigned long) cmd.response, > + &resp, sizeof resp)) { > + ret = -EFAULT; > + goto err_copy; > + } > + > + put_pd_read(pd); > + put_cq_read(scq); > + if (rcq != scq) > + put_cq_read(rcq); > + if (srq) > + put_srq_read(srq); > + if (xrcd) > + put_xrcd_read(xrcd_uobj); > + > + mutex_lock(&file->mutex); > + list_add_tail(&obj->uevent.uobject.list, &file->ucontext->qp_list); > + mutex_unlock(&file->mutex); > + > + obj->uevent.uobject.live = 1; > + > + up_write(&obj->uevent.uobject.mutex); > + > + return in_len; > + > +err_copy: > + idr_remove_uobj(&ib_uverbs_qp_idr, &obj->uevent.uobject); > + > +err_destroy: > + ib_destroy_qp(qp); > + > +err_put: > + if (pd) > + put_pd_read(pd); > + if (scq) > + put_cq_read(scq); > + if (rcq && rcq != scq) > + put_cq_read(rcq); > + if (srq) > + put_srq_read(srq); > + if (xrcd) > + put_xrcd_read(xrcd_uobj); > + > + put_uobj_write(&obj->uevent.uobject); > + return ret; > +} > + > +ssize_t ib_uverbs_create_qp_expanded(struct ib_uverbs_file *file, > const char __user *buf, int in_len, > int out_len) > { > - struct ib_uverbs_create_qp cmd; > + struct ib_uverbs_create_qp_expanded cmd; > struct ib_uverbs_create_qp_resp resp; > struct ib_udata udata; > struct ib_uqp_object *obj; > @@ -1078,7 +1232,6 @@ ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file, > goto err_put; > } > > - attr.create_flags = 0; > attr.event_handler = ib_uverbs_qp_event_handler; > attr.qp_context = file; > attr.send_cq = scq; > @@ -1087,7 +1240,7 @@ ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file, > attr.sq_sig_type = cmd.sq_sig_all ? 
IB_SIGNAL_ALL_WR : > IB_SIGNAL_REQ_WR; > attr.qp_type = cmd.qp_type; > attr.xrc_domain = xrcd; > - attr.create_flags = 0; > + attr.create_flags = cmd.create_flags; > > attr.cap.max_send_wr = cmd.max_send_wr; > attr.cap.max_recv_wr = cmd.max_recv_wr; > diff --git a/drivers/infiniband/core/uverbs_main.c > b/drivers/infiniband/core/uverbs_main.c > index 1a96c35..cb435be 100644 > --- a/drivers/infiniband/core/uverbs_main.c > +++ b/drivers/infiniband/core/uverbs_main.c > @@ -117,6 +117,7 @@ static ssize_t (*uverbs_cmd_table[])(struct > ib_uverbs_file *file, > [IB_USER_VERBS_CMD_QUERY_XRC_RCV_QP] = ib_uverbs_query_xrc_rcv_qp, > [IB_USER_VERBS_CMD_REG_XRC_RCV_QP] = ib_uverbs_reg_xrc_rcv_qp, > [IB_USER_VERBS_CMD_UNREG_XRC_RCV_QP] = ib_uverbs_unreg_xrc_rcv_qp, > + [IB_USER_VERBS_CMD_CREATE_QP_EXPANDED] = ib_uverbs_create_qp_expanded, > }; > > static struct vfsmount *uverbs_event_mnt; > diff --git a/drivers/infiniband/hw/mlx4/main.c > b/drivers/infiniband/hw/mlx4/main.c > index 030f696..f954533 100644 > --- a/drivers/infiniband/hw/mlx4/main.c > +++ b/drivers/infiniband/hw/mlx4/main.c > @@ -728,7 +728,8 @@ static void *mlx4_ib_add(struct mlx4_dev *dev) > (1ull << IB_USER_VERBS_CMD_MODIFY_XRC_RCV_QP) | > (1ull << IB_USER_VERBS_CMD_QUERY_XRC_RCV_QP) | > (1ull << IB_USER_VERBS_CMD_REG_XRC_RCV_QP) | > - (1ull << IB_USER_VERBS_CMD_UNREG_XRC_RCV_QP); > + (1ull << IB_USER_VERBS_CMD_UNREG_XRC_RCV_QP) | > + (1ull << IB_USER_VERBS_CMD_CREATE_QP_EXPANDED); > } > > if (init_node_data(ibdev)) > diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c > index 0d3f770..c65f88b 100644 > --- a/drivers/infiniband/hw/mlx4/qp.c > +++ b/drivers/infiniband/hw/mlx4/qp.c > @@ -518,9 +518,6 @@ static int create_qp_common(struct mlx4_ib_dev > *dev, struct ib_pd *pd, > } else { > qp->sq_no_prefetch = 0; > > - if (init_attr->create_flags & > IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK) > - qp->flags |= MLX4_IB_QP_BLOCK_MULTICAST_LOOPBACK; > - > if (init_attr->create_flags & 
IB_QP_CREATE_IPOIB_UD_LSO) > qp->flags |= MLX4_IB_QP_LSO; > > @@ -559,6 +556,10 @@ static int create_qp_common(struct mlx4_ib_dev > *dev, struct ib_pd *pd, > } > } > > + if (init_attr->create_flags & > + IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK) > + qp->flags |= MLX4_IB_QP_BLOCK_MULTICAST_LOOPBACK; > + > err = mlx4_qp_alloc(dev->dev, sqpn, &qp->mqp); > if (err) > goto err_wrid; > @@ -705,8 +706,11 @@ struct ib_qp *mlx4_ib_create_qp(struct ib_pd *pd, > IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK)) > return ERR_PTR(-EINVAL); > > - if (init_attr->create_flags && > - (pd->uobject || init_attr->qp_type != IB_QPT_UD)) > + if (init_attr->create_flags && init_attr->qp_type != IB_QPT_UD) > + return ERR_PTR(-EINVAL); > + > + if ((init_attr->create_flags & IB_QP_CREATE_IPOIB_UD_LSO) && > + pd->uobject) > return ERR_PTR(-EINVAL); > > switch (init_attr->qp_type) { > diff --git a/include/rdma/ib_user_verbs.h b/include/rdma/ib_user_verbs.h > index 0df90d8..300474f 100644 > --- a/include/rdma/ib_user_verbs.h > +++ b/include/rdma/ib_user_verbs.h > @@ -90,6 +90,7 @@ enum { > IB_USER_VERBS_CMD_QUERY_XRC_RCV_QP, > IB_USER_VERBS_CMD_REG_XRC_RCV_QP, > IB_USER_VERBS_CMD_UNREG_XRC_RCV_QP, > + IB_USER_VERBS_CMD_CREATE_QP_EXPANDED, > }; > > /* > @@ -411,6 +412,27 @@ struct ib_uverbs_create_qp { > __u64 driver_data[0]; > }; > > +struct ib_uverbs_create_qp_expanded { > + __u64 response; > + __u64 user_handle; > + __u32 pd_handle; > + __u32 send_cq_handle; > + __u32 recv_cq_handle; > + __u32 srq_handle; > + __u32 max_send_wr; > + __u32 max_recv_wr; > + __u32 max_send_sge; > + __u32 max_recv_sge; > + __u32 max_inline_data; > + __u8 sq_sig_all; > + __u8 qp_type; > + __u8 is_srq; > + __u8 reserved; > + __u32 reserved1; > + __u32 create_flags; > + __u64 driver_data[0]; > +}; > + > struct ib_uverbs_create_qp_resp { > __u32 qp_handle; > __u32 qpn; > _______________________________________________ > general mailing list > general at lists.openfabrics.org > 
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
>

From jackm at dev.mellanox.co.il Sun Aug 3 04:17:34 2008
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Sun, 3 Aug 2008 14:17:34 +0300
Subject: [ofa-general] RE: [PATCH 1/1 v2] librdmacm: add support for create qp expanded (with changelog this time)
In-Reply-To:
References: <000601c8f330$aa81c430$8498070a@amr.corp.intel.com>
Message-ID: <200808031417.34787.jackm@dev.mellanox.co.il>

On Sunday 03 August 2008 13:36, Ron Livne wrote:
> If I'll eliminate the expanded parameter,
> I'll have to call ibv_create_qp_expanded with create_flags = 0.
> This is not a good idea because it will not be compatible with an older
> kernel.
>
Not so.

In rdma_create_qp_common, just change:
+ qp = expanded ?
+ ibv_create_qp_expanded(pd, qp_init_attr, create_flags) :
+ ibv_create_qp(pd, qp_init_attr);

to
+ qp = create_flags ?
+ ibv_create_qp_expanded(pd, qp_init_attr, create_flags) :
+ ibv_create_qp(pd, qp_init_attr);

(i.e., just use the old ibv_create_qp if there are no create_flags set).

- Jack

From ronli.voltaire at gmail.com Sun Aug 3 04:27:04 2008
From: ronli.voltaire at gmail.com (Ron Livne)
Date: Sun, 3 Aug 2008 14:27:04 +0300
Subject: [ofa-general] RE: [PATCH 1/1 v2] librdmacm: add support for create qp expanded (with changelog this time)
In-Reply-To: <200808031417.34787.jackm@dev.mellanox.co.il>
References: <000601c8f330$aa81c430$8498070a@amr.corp.intel.com> <200808031417.34787.jackm@dev.mellanox.co.il>
Message-ID: <3b5e77ad0808030427kc9c25flcf069f620c196381@mail.gmail.com>

>+ qp = create_flags ?
>+ ibv_create_qp_expanded(pd, qp_init_attr, create_flags) :
>+ ibv_create_qp(pd, qp_init_attr);

What if the user wants to call rdma_create_qp_expanded with create_flags = 0,
with the intention that ibv_create_qp_expanded will be called with create_flags = 0?
I know there's a pretty slim chance anyone would want to do so,
because it does the same thing.
But still...

Ron

On Sun, Aug 3, 2008 at 2:17 PM, Jack Morgenstein wrote:
> On Sunday 03 August 2008 13:36, Ron Livne wrote:
>> If I'll eliminate the expanded parameter,
>> I'll have to call ibv_create_qp_expanded with create_flags = 0.
>> This is not a good idea because it will not be compatible with an older
>> kernel.
>>
> Not so.
>
> In rdma_create_qp_common, just change:
> + qp = expanded ?
> + ibv_create_qp_expanded(pd, qp_init_attr, create_flags) :
> + ibv_create_qp(pd, qp_init_attr);
>
> to
> + qp = create_flags ?
> + ibv_create_qp_expanded(pd, qp_init_attr, create_flags) :
> + ibv_create_qp(pd, qp_init_attr);
>
> (i.e., just use the old ibv_create_qp if there are no create_flags set).
>
> - Jack

From jackm at dev.mellanox.co.il Sun Aug 3 04:51:02 2008
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Sun, 3 Aug 2008 14:51:02 +0300
Subject: [ofa-general] RE: [PATCH 1/1 v2] librdmacm: add support for create qp expanded (with changelog this time)
In-Reply-To: <3b5e77ad0808030427kc9c25flcf069f620c196381@mail.gmail.com>
References: <200808031417.34787.jackm@dev.mellanox.co.il> <3b5e77ad0808030427kc9c25flcf069f620c196381@mail.gmail.com>
Message-ID: <200808031451.02599.jackm@dev.mellanox.co.il>

On Sunday 03 August 2008 14:27, Ron Livne wrote:
> What if the user wants to call rdma_create_qp_expanded with create_flags = 0,
> with the intention that ibv_create_qp_expanded will be called with create_flags = 0?
> I know there's a pretty slim chance anyone would want to do so,
> because it does the same thing.
> But still...
>
> Ron
>
I've come around to agreeing with you. Since this is an internal function we are talking about,
we're probably better off having a separate flag which indicates whether it was invoked by
rdma_create_qp() or rdma_create_qp_expanded(). That way, we do not limit the future behavior
of ibv_create_qp_expanded() (by having the librdmacm library depend on knowledge of the internal
behavior of ibv_create_qp_expanded()).
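
To make the two options in this thread concrete, here is a minimal sketch,
not the librdmacm code. The names enum qp_path, choose_by_flag() and
choose_by_value() are hypothetical stand-ins for the branch inside
rdma_create_qp_common(); no real verbs call is made.

```c
#include <stdint.h>

/* Hypothetical stand-in for "which verbs call would be issued". */
enum qp_path { QP_PATH_LEGACY, QP_PATH_EXPANDED };

/*
 * Option kept in the patch: the caller states explicitly which entry
 * point was used, so create_flags == 0 can still take the expanded path.
 */
static enum qp_path choose_by_flag(uint32_t create_flags, int expanded)
{
	(void)create_flags;	/* passed through untouched, never inspected */
	return expanded ? QP_PATH_EXPANDED : QP_PATH_LEGACY;
}

/*
 * Alternative that was proposed: infer the path from the flag value.
 * Here create_flags == 0 always falls back to the legacy call.
 */
static enum qp_path choose_by_value(uint32_t create_flags)
{
	return create_flags ? QP_PATH_EXPANDED : QP_PATH_LEGACY;
}
```

With the explicit flag, choose_by_flag(0, 1) selects the expanded path,
while choose_by_value(0) can only ever select the legacy one. That
difference is the coupling to the verbs-level behavior being discussed.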
- Jack From tziporet at dev.mellanox.co.il Sun Aug 3 05:19:12 2008 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Sun, 03 Aug 2008 15:19:12 +0300 Subject: Fwd: [ofa-general] [PATCH 3/3 v3] ib/uverbs: add support for create_qp_expanded in uverbs In-Reply-To: <200808031412.50264.jackm@dev.mellanox.co.il> References: <3b5e77ad0808030130s63c2501bq6331c07494a7f158@mail.gmail.com> <200808031412.50264.jackm@dev.mellanox.co.il> Message-ID: <4895A240.4060603@mellanox.co.il> Jack Morgenstein wrote: > I have not yet modified the OFED 1.4 patches to what I submitted to the list. > I'm waiting until all the changes settle down and are finalized. > > The patches will need to be adapted for OFED 1.4 if you want them in now. > (The OFED 1.4 tree is currently like the 1.4 tree, except for the alignment > bug fixes in the ABI). > > Jack Please do it in the coming days for the beta Thanks Tziporet From vlad at mellanox.co.il Sun Aug 3 06:30:38 2008 From: vlad at mellanox.co.il (Vladimir Sokolovsky) Date: Sun, 03 Aug 2008 16:30:38 +0300 Subject: [ofa-general] OFED build problems on Centos 5.2 In-Reply-To: <4892306A.2000307@harr.org> References: <4892306A.2000307@harr.org> Message-ID: <4895B2FE.2070900@mellanox.co.il> Cameron Harr wrote: > Vladimir, I too am seeing the same problem when I try to build all > packages or something with iSer.: > RPM build errors: > user vlad does not exist - using root > group vlad does not exist - using root > user vlad does not exist - using root > group vlad does not exist - using root > File not found: > /var/tmp/OFED/lib/modules/2.6.18-92.1.6.el5/updates/kernel/drivers/scsi/iscsi_tcp.ko > > File not found: > /var/tmp/OFED/lib/modules/2.6.18-92.1.6.el5/updates/kernel/drivers/scsi/libiscsi.ko > > File not found: > /var/tmp/OFED/lib/modules/2.6.18-92.1.6.el5/updates/kernel/drivers/scsi/scsi_transport_iscsi.ko > > File not found: > /var/tmp/OFED/lib/modules/2.6.18-92.1.6.el5/updates/kernel/net/rds > File not found: > 
/var/tmp/OFED/lib/modules/2.6.18-92.1.6.el5/updates/kernel/drivers/net/cxgb3 > > File not found: > /var/tmp/OFED/lib/modules/2.6.18-92.1.6.el5/updates/kernel/drivers/net/mlx4 > > > Do you have any update on it, or would you like me to send you my build > log? > > Also, in playing around, I've found that I can compile my general Ofed > configuration of mthca drivers and SRPT without errors, but the ib_srpt > module comes up missing in updates/kernel/drivers/infiniband/ulp/srpt > directory. I've scanned all the built rpms, and it's not there. Oddly > enough, when the rpm is being built, I can see it included: > + test -e > /var/tmp/OFED/lib/modules/2.6.18-92.1.6.el5/updates/kernel/drivers/infiniband/ib_srpt.ko > > > It just seem like it and some of the other modules are never actually > packaged. > > Thanks, > Cameron Hi Cameron, Check that you have kernel-devel-2.6.18-92.1.6.el5 RPM installed. Do you have /lib/modules/2.6.18-92.1.6.el5/build link? Regards, Vladimir From galak at kernel.crashing.org Sun Aug 3 07:58:14 2008 From: galak at kernel.crashing.org (Kumar Gala) Date: Sun, 3 Aug 2008 09:58:14 -0500 Subject: [ofa-general] Re: [PATCH v2] powerpc: move include files to arch/powerpc/include/asm In-Reply-To: <20080801152030.ff10b6b2.sfr@canb.auug.org.au> References: <20080801152030.ff10b6b2.sfr@canb.auug.org.au> Message-ID: On Aug 1, 2008, at 12:20 AM, Stephen Rothwell wrote: > from include/asm-powerpc. This is the result of a > > mkdir arch/powerpc/include/asm > git mv include/asm-powerpc/* arch/powerpc/include/asm > > Followed by a few documentation/comment fixups and a couple of places > where was being used explicitly. Of the latter only > one was outside the arch code and it is a driver only built for > powerpc. > > Signed-off-by: Stephen Rothwell > --- > > v2 don't change other arch files - the fixups are only in comments > anyway. > > This patch can be applied with "git am" - the full patch is way to bug > for our mailing lists. 
>
> This has been built for all the powerpc defconfigs including
> all{no,mod,yes}config. There was only one failure, but that is
> expected anyway (I had to apply patches for the iommu and hfcmulti
> breakages).

Paul, what's the plan for this change? If this is something that will
go in so can we get a tree with it so we can base other patches on it
(like the PPC_MERGE cleanup)?

- k

From sfr at canb.auug.org.au Sun Aug 3 08:00:31 2008
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Mon, 4 Aug 2008 01:00:31 +1000
Subject: [ofa-general] Re: [PATCH v2] powerpc: move include files to arch/powerpc/include/asm
In-Reply-To: <20080801152030.ff10b6b2.sfr@canb.auug.org.au>
References: <20080801152030.ff10b6b2.sfr@canb.auug.org.au>
Message-ID: <20080804010031.44657ace.sfr@canb.auug.org.au>

On Fri, 1 Aug 2008 15:20:30 +1000 Stephen Rothwell wrote:
>
> This patch can be applied with "git am" - the full patch is way to bug
                                                                     ^^^
Freudian slip :-)

--
Cheers,
Stephen Rothwell sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 197 bytes
Desc: not available
URL:

From sfr at canb.auug.org.au Sun Aug 3 08:02:38 2008
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Mon, 4 Aug 2008 01:02:38 +1000
Subject: [ofa-general] Re: [PATCH v2] powerpc: move include files to arch/powerpc/include/asm
In-Reply-To:
References: <20080801152030.ff10b6b2.sfr@canb.auug.org.au>
Message-ID: <20080804010238.a493e1c0.sfr@canb.auug.org.au>

On Sun, 3 Aug 2008 09:58:14 -0500 Kumar Gala wrote:
>
> Paul, what's the plan for this change? If this is something that will
> go in so can we get a tree with it so we can base other patches on it
> (like the PPC_MERGE cleanup)?

It looks like Linus is happy to take these patches, so if Paul is game he
could just shove it in powerpc-{next,master,merge} and ask Linus to pull
it. The sooner the better in that case.
--
Cheers,
Stephen Rothwell sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From alekseys at voltaire.com Sun Aug 3 08:18:49 2008
From: alekseys at voltaire.com (Aleksey Senin)
Date: Sun, 03 Aug 2008 18:18:49 +0300
Subject: [ofa-general] RE: [PATCH/RFC] RDMA/cma: Remove padding arrays by using struct sockaddr_storage
In-Reply-To: <001901c8f340$833452c0$8498070a@amr.corp.intel.com>
References: <1217416924.32115.9.camel@linux-zn6t.site> <001901c8f340$833452c0$8498070a@amr.corp.intel.com>
Message-ID: <1217776729.2381.1.camel@linux-zn6t.site>

On Thu, 2008-07-31 at 12:06 -0700, Sean Hefty wrote:
> >There are a few places that the RDMA CM code handles IPv6 by doing
> >
> >	struct sockaddr addr;
> >	u8 pad[sizeof(struct sockaddr_in6) -
> >	       sizeof(struct sockaddr)];
> >
> >This is fragile and ugly; handle this in a better way with just
> >
> >	struct sockaddr_storage addr;
> >
> >Signed-off-by: Roland Dreier
> >---
> >Any objections to merging the cleanup below?
> > nope - thanks > > This is the patch above plus modifications in addr.c Signed-off-by: Aleksey Senin --- drivers/infiniband/core/addr.c | 4 ++-- drivers/infiniband/core/cma.c | 4 +--- drivers/infiniband/core/ucma.c | 10 ++++------ 3 files changed, 7 insertions(+), 11 deletions(-) diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c index 09a2bec..c5b623b 100644 --- a/drivers/infiniband/core/addr.c +++ b/drivers/infiniband/core/addr.c @@ -49,8 +49,8 @@ MODULE_LICENSE("Dual BSD/GPL"); struct addr_req { struct list_head list; - struct sockaddr src_addr; - struct sockaddr dst_addr; + struct sockaddr_storage src_addr; + struct sockaddr_storage dst_addr; struct rdma_dev_addr *addr; struct rdma_addr_client *client; void *context; diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index e980ff3..a16510b 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -155,9 +155,7 @@ struct cma_multicast { } multicast; struct list_head list; void *context; - struct sockaddr addr; - u8 pad[sizeof(struct sockaddr_in6) - - sizeof(struct sockaddr)]; + struct sockaddr_storage addr; }; struct cma_work { diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c index b41dd26..d5a825f 100644 --- a/drivers/infiniband/core/ucma.c +++ b/drivers/infiniband/core/ucma.c @@ -81,9 +81,7 @@ struct ucma_multicast { u64 uid; struct list_head list; - struct sockaddr addr; - u8 pad[sizeof(struct sockaddr_in6) - - sizeof(struct sockaddr)]; + struct sockaddr_storage addr; }; struct ucma_event { @@ -913,7 +911,7 @@ static ssize_t ucma_join_multicast(struct ucma_file *file, mc->uid = cmd.uid; memcpy(&mc->addr, &cmd.addr, sizeof cmd.addr); - ret = rdma_join_multicast(ctx->cm_id, &mc->addr, mc); + ret = rdma_join_multicast(ctx->cm_id, (struct sockaddr *) &mc->addr, mc); if (ret) goto err2; @@ -929,7 +927,7 @@ static ssize_t ucma_join_multicast(struct ucma_file *file, return 0; err3: - 
rdma_leave_multicast(ctx->cm_id, &mc->addr); + rdma_leave_multicast(ctx->cm_id, (struct sockaddr *) &mc->addr); ucma_cleanup_mc_events(mc); err2: mutex_lock(&mut); @@ -975,7 +973,7 @@ static ssize_t ucma_leave_multicast(struct ucma_file *file, goto out; } - rdma_leave_multicast(mc->ctx->cm_id, &mc->addr); + rdma_leave_multicast(mc->ctx->cm_id, (struct sockaddr *) &mc->addr); mutex_lock(&mc->ctx->file->mut); ucma_cleanup_mc_events(mc); list_del(&mc->list); -- 1.5.6.dirty From sashak at voltaire.com Sun Aug 3 10:09:46 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 3 Aug 2008 20:09:46 +0300 Subject: [ofa-general] Re: [PATCH V2] ibsim: Add a Node Description query drop error. In-Reply-To: <20080731132053.17cd8915.weiny2@llnl.gov> References: <20080730174011.5d80e036.weiny2@llnl.gov> <20080731180323.GV14872@sashak.voltaire.com> <20080731132053.17cd8915.weiny2@llnl.gov> Message-ID: <20080803170946.GF15644@sashak.voltaire.com> Hi Ira, On 13:20 Thu 31 Jul , Ira Weiny wrote: > > Like this? Not exactly. I meant possibility to specify any attribute to drop. Like this. 
Sasha diff --git a/ibsim/sim.h b/ibsim/sim.h index 936bb85..f989252 100644 --- a/ibsim/sim.h +++ b/ibsim/sim.h @@ -207,6 +207,7 @@ struct Port { Node *remotenode; int remoteport; int errrate; + uint16_t errattr; Node *node; Portcounters portcounters; uint16_t *pkey_tbl; diff --git a/ibsim/sim_cmd.c b/ibsim/sim_cmd.c index a6aab9d..e1c2384 100644 --- a/ibsim/sim_cmd.c +++ b/ibsim/sim_cmd.c @@ -199,6 +199,7 @@ static int do_seterror(FILE * f, char *line) char *nodeid = 0, name[NAMELEN], *sp, *orig = 0; int portnum = -1; // def - all ports int numports, set = 0, rate = 0; + uint16_t attr = 0; if (strsep(&s, "\"")) orig = strsep(&s, "\""); @@ -240,6 +241,13 @@ static int do_seterror(FILE * f, char *line) } DEBUG("error rate is %d", rate); + + strsep(&s, " \t"); + if (s) { + attr = strtoul(s, 0, 0); + DEBUG("error attr is %u", attr); + } + numports = node->numports; if (node->type == SWITCH_NODE) @@ -250,12 +258,14 @@ static int do_seterror(FILE * f, char *line) if (portnum >= 0) { port = ports + node->portsbase + portnum; port->errrate = rate; + port->errattr = attr; return 1; } for (port = ports + node->portsbase, e = port + numports; port < e; port++) { port->errrate = rate; + port->errattr = attr; set++; } @@ -708,7 +718,8 @@ static int dump_help(FILE * f) fprintf(f, "\tGuid \"nodeid\" : set GUID value for this node\n"); fprintf(f, "\tGuid \"nodeid\"[port] : set GUID value for this port\n"); fprintf(f, - "\tError \"nodeid\"[port] : set error rate for port/node\n"); + "\tError \"nodeid\"[port] [attribute]: set error rate for" + "\n\t\t\tport/node, optionally for specified attribute\n"); fprintf(f, "\tBaselid \"nodeid\"[port] [lmc] : change port's lid (lmc)\n"); fprintf(f, "\tVerbose [newlevel] - show/set simulator verbosity\n"); diff --git a/ibsim/sim_mad.c b/ibsim/sim_mad.c index b8ce2ab..b66c697 100644 --- a/ibsim/sim_mad.c +++ b/ibsim/sim_mad.c @@ -1188,7 +1188,8 @@ int process_packet(Client * cl, void *p, int size, Client ** dcl) return sizeof(*r); // forward 
only } - if (port->errrate && (random() % 100) < port->errrate) { + if (port->errrate && (!port->errattr || port->errattr == rpc.attr.id) && + (random() % 100) < port->errrate) { VERB("drop pkt due error rate %d", port->errrate); goto _dropped; } From sashak at voltaire.com Sun Aug 3 10:23:08 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 3 Aug 2008 20:23:08 +0300 Subject: [ofa-general] [PATCH] opensm: add OSM_EVENT_ID_SUBNET_UP event Message-ID: <20080803172308.GG15644@sashak.voltaire.com> Signed-off-by: Sasha Khapyorsky --- opensm/include/opensm/osm_event_plugin.h | 1 + opensm/opensm/osm_state_mgr.c | 2 ++ 2 files changed, 3 insertions(+), 0 deletions(-) diff --git a/opensm/include/opensm/osm_event_plugin.h b/opensm/include/opensm/osm_event_plugin.h index 0626b86..e44a78a 100644 --- a/opensm/include/opensm/osm_event_plugin.h +++ b/opensm/include/opensm/osm_event_plugin.h @@ -69,6 +69,7 @@ typedef enum { OSM_EVENT_ID_PORT_DATA_COUNTERS, OSM_EVENT_ID_PORT_SELECT, OSM_EVENT_ID_TRAP, + OSM_EVENT_ID_SUBNET_UP, OSM_EVENT_ID_MAX } osm_epi_event_id_t; diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c index 54ac1d9..5c5167f 100644 --- a/opensm/opensm/osm_state_mgr.c +++ b/opensm/opensm/osm_state_mgr.c @@ -1238,6 +1238,8 @@ _repeat_discovery: */ cl_event_signal(&sm->subnet_up_event); + osm_opensm_report_event(sm->p_subn->p_osm, OSM_EVENT_ID_SUBNET_UP, NULL); + /* if we got a signal to force heavy sweep or errors * in the middle of the sweep - try another sweep. 
*/ if (sm->p_subn->force_heavy_sweep -- 1.5.5.1.178.g1f811 From sean.hefty at intel.com Sun Aug 3 22:58:00 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Sun, 3 Aug 2008 22:58:00 -0700 Subject: [ofa-general] RE: [PATCH 1/1 v2] librdmacm: add support for create qp expanded (with changelog this time) In-Reply-To: <200808031451.02599.jackm@dev.mellanox.co.il> References: <200808031417.34787.jackm@dev.mellanox.co.il> <3b5e77ad0808030427kc9c25flcf069f620c196381@mail.gmail.com> <200808031451.02599.jackm@dev.mellanox.co.il> Message-ID: <000001c8f5f7$0db17500$9dfd070a@amr.corp.intel.com> >I've come around to agreeing with you. Since this is an internal >function we are talking about, >we're probably better off having a separate flag which indicates >whether it was invoked by >rmda_create_qp() or rdma_create_qp_expanded(). That way, we do not >limit the future behavior >of ibv_create_qp_expanded() (by having the librdmarc library depend on >knowledge of the internal >behavior of ibv_create_qp_expanded() ). This is debatable, but rdma_create_qp_ex() with create_flags = 0 (or 1 -1 or whatever) should give the exact same functionality as calling rdma_create_qp(). The expanded call should subsume all functionality as the existing call. I believe the verbs interface should be similar. This is more about the behavior of the API, than the internal implementation. - Sean From Robert at saq.co.uk Mon Aug 4 00:14:40 2008 From: Robert at saq.co.uk (Robert Dunkley) Date: Mon, 4 Aug 2008 08:14:40 +0100 Subject: [ofa-general] OFED build problems on Centos 5.2 References: <4892306A.2000307@harr.org> <4895B2FE.2070900@mellanox.co.il> Message-ID: Hi Cameron, A similar problem happened to me. Centos installs a very slightly newer *.1.6 kernel when you install the kernel source. Have you double checked that the header/devel kernel module exactly matches the actual kernel version you are running ? 
(uname -r)

If they don't quite match then try updating the older one and OFED will
hopefully then install fine. If you are using Xen don't forget to
upgrade the Xen kernel too.

Rob

-----Original Message-----
From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Vladimir Sokolovsky
Sent: 03 August 2008 14:31
To: Cameron Harr
Cc: general at lists.openfabrics.org
Subject: Re: [ofa-general] OFED build problems on Centos 5.2

Cameron Harr wrote:
> Vladimir, I too am seeing the same problem when I try to build all
> packages or something with iSer.:
> RPM build errors:
> user vlad does not exist - using root
> group vlad does not exist - using root
> user vlad does not exist - using root
> group vlad does not exist - using root
> File not found:
> /var/tmp/OFED/lib/modules/2.6.18-92.1.6.el5/updates/kernel/drivers/scsi/iscsi_tcp.ko
>
> File not found:
> /var/tmp/OFED/lib/modules/2.6.18-92.1.6.el5/updates/kernel/drivers/scsi/libiscsi.ko
>
> File not found:
> /var/tmp/OFED/lib/modules/2.6.18-92.1.6.el5/updates/kernel/drivers/scsi/scsi_transport_iscsi.ko
>
> File not found:
> /var/tmp/OFED/lib/modules/2.6.18-92.1.6.el5/updates/kernel/net/rds
> File not found:
> /var/tmp/OFED/lib/modules/2.6.18-92.1.6.el5/updates/kernel/drivers/net/cxgb3
>
> File not found:
> /var/tmp/OFED/lib/modules/2.6.18-92.1.6.el5/updates/kernel/drivers/net/mlx4
>
>
> Do you have any update on it, or would you like me to send you my build
> log?
>
> Also, in playing around, I've found that I can compile my general Ofed
> configuration of mthca drivers and SRPT without errors, but the ib_srpt
> module comes up missing in updates/kernel/drivers/infiniband/ulp/srpt
> directory. I've scanned all the built rpms, and it's not there.
Oddly
> enough, when the rpm is being built, I can see it included:
> + test -e
> /var/tmp/OFED/lib/modules/2.6.18-92.1.6.el5/updates/kernel/drivers/infiniband/ib_srpt.ko
>
>
> It just seem like it and some of the other modules are never actually
> packaged.
>
> Thanks,
> Cameron

Hi Cameron,
Check that you have kernel-devel-2.6.18-92.1.6.el5 RPM installed.
Do you have /lib/modules/2.6.18-92.1.6.el5/build link?

Regards,
Vladimir

The SAQ Group
Registered Office: 18 Chapel Street, Petersfield, Hampshire. GU32 3DZ
SEMTEC Limited trading as SAQ is Registered in England & Wales
Company Number: 06481952
http://www.saqnet.co.uk AS29219
SAQ Group Delivers high quality, honestly priced communication and I.T. services to UK Business.
DSL : Domains : Email : Hosting : CoLo : Servers : Racks : Transit : Backups : Managed Networks : Remote Support.
Find us in http://www.thebestof.co.uk/petersfield

From jackm at dev.mellanox.co.il Mon Aug 4 00:19:30 2008
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Mon, 4 Aug 2008 10:19:30 +0300
Subject: [ofa-general] RE: [PATCH 1/1 v2] librdmacm: add support for create qp expanded (with changelog this time)
In-Reply-To: <000001c8f5f7$0db17500$9dfd070a@amr.corp.intel.com>
References: <200808031451.02599.jackm@dev.mellanox.co.il> <000001c8f5f7$0db17500$9dfd070a@amr.corp.intel.com>
Message-ID: <200808041019.30536.jackm@dev.mellanox.co.il>

On Monday 04 August 2008 08:58, Sean Hefty wrote:
> This is debatable, but rdma_create_qp_ex() with create_flags = 0 (or 1
> -1 or whatever) should give the exact same functionality as calling
> rdma_create_qp(). The expanded call should subsume all functionality as
> the existing call. I believe the verbs interface should be similar.
>
> This is more about the behavior of the API, than the internal
> implementation.
>

Won't it be confusing if the rdma_create_qp_ex() call with create_flags = 0
succeeds, while the same call with create_flags != 0 fails (in the case where
userlevel is running against an older kernel which does not have the
ib_uverbs_create_qp_ex() interface)?

I prefer to keep the ex interface separate -- there is no point in using the
qp_ex call if there is no intent to use the expanded feature.

- Jack

From sashak at voltaire.com Mon Aug 4 00:52:15 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Mon, 4 Aug 2008 10:52:15 +0300
Subject: [ofa-general] [PATCH] opensm/osm_sa_class_port_info.c: fix over bound array access
Message-ID: <20080804075215.GA12324@sashak.voltaire.com>

__msecs_to_rtv_table[] buffer is accessed over its bounds. The patch fixes
this and also makes this array const.

Signed-off-by: Sasha Khapyorsky
---
 opensm/opensm/osm_sa_class_port_info.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/opensm/opensm/osm_sa_class_port_info.c b/opensm/opensm/osm_sa_class_port_info.c
index a43524f..4535d07 100644
--- a/opensm/opensm/osm_sa_class_port_info.c
+++ b/opensm/opensm/osm_sa_class_port_info.c
@@ -57,7 +57,8 @@
 #define MAX_MSECS_TO_RTV 24
 /* Precalculated table in msec (index is related to encoded value) */
 /* 4.096 usec * 2 ** n (where n = 8 - 31) */
-static uint32_t __msecs_to_rtv_table[MAX_MSECS_TO_RTV] = { 1, 2, 4, 8,
+const static uint32_t __msecs_to_rtv_table[MAX_MSECS_TO_RTV] = {
+	1, 2, 4, 8,
 	16, 33, 67, 134,
 	268, 536, 1073, 2147,
 	4294, 8589, 17179, 34359,
@@ -110,7 +111,7 @@ __osm_cpi_rcv_respond(IN osm_sa_t * sa,
 	/* Calculate encoded response time value */
 	/* transaction timeout is in msec */
 	if (sa->p_subn->opt.transaction_timeout >
-	    __msecs_to_rtv_table[MAX_MSECS_TO_RTV])
+	    __msecs_to_rtv_table[MAX_MSECS_TO_RTV - 1])
 		rtv = MAX_MSECS_TO_RTV - 1;
 	else {
 		for (rtv = 0; rtv < MAX_MSECS_TO_RTV; rtv++) {
--
1.5.5.1.178.g1f811

From
sashak at voltaire.com Mon Aug 4 01:26:20 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 4 Aug 2008 11:26:20 +0300 Subject: [ofa-general] [PATCH] osmtest/osmt_service.c: fix over bound array access In-Reply-To: <20080804075215.GA12324@sashak.voltaire.com> References: <20080804075215.GA12324@sashak.voltaire.com> Message-ID: <20080804082620.GD12324@sashak.voltaire.com> id[] buffer is accessed over its bounds. Signed-off-by: Sasha Khapyorsky --- opensm/osmtest/osmt_service.c | 9 +++++---- 1 files changed, 5 insertions(+), 4 deletions(-) diff --git a/opensm/osmtest/osmt_service.c b/opensm/osmtest/osmt_service.c index ce13500..738a1d8 100644 --- a/opensm/osmtest/osmt_service.c +++ b/opensm/osmtest/osmt_service.c @@ -1211,7 +1211,7 @@ ib_api_status_t osmt_run_service_records_flow(IN osmtest_t * const p_osmt) OSM_LOG_ENTER(&p_osmt->log); /* Init Service names */ - for (i = 0; i <= 6; i++) { + for (i = 0; i < 7; i++) { #ifdef __WIN__ uint64_t rand_val = rand() - (uint64_t) i; #else @@ -1223,6 +1223,7 @@ ib_api_status_t osmt_run_service_records_flow(IN osmtest_t * const p_osmt) "osmt.srvc.%" PRIu64 ".%" PRIu64, rand_val, pid); /*printf("-I- Service Name is : %s, ID is : 0x%" PRIx64 "\n",service_name[i],id[i]); */ } + status = osmt_register_service(p_osmt, cl_ntoh64(id[0]), /* IN ib_net64_t service_id, */ IB_DEFAULT_PKEY, /* IN ib_net16_t service_pkey, */ 0xFFFFFFFF, /* IN ib_net32_t service_lease, */ @@ -1377,12 +1378,12 @@ ib_api_status_t osmt_run_service_records_flow(IN osmtest_t * const p_osmt) goto Exit; } - /* Bad Flow of Get with invalid Service ID: id[7] */ - status = osmt_get_service_by_id(p_osmt, 0, cl_ntoh64(id[7]), &srv_rec); + /* Bad Flow of Get with invalid Service ID: id[6] */ + status = osmt_get_service_by_id(p_osmt, 0, cl_ntoh64(id[6]), &srv_rec); if (status != IB_SUCCESS) { OSM_LOG(&p_osmt->log, OSM_LOG_ERROR, "ERR 4A20: " "Found service: id: 0x%016" PRIx64 " " - "that is invalid\n", id[7]); + "that is invalid\n", id[6]); status = IB_ERROR; 
goto Exit; } -- 1.5.5.1.178.g1f811 From kliteyn at dev.mellanox.co.il Mon Aug 4 01:45:38 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Mon, 04 Aug 2008 11:45:38 +0300 Subject: [ofa-general] [PATCH] opensm/osm_qos_parser.l: add 'noinput' lexer option to remove compiler warning Message-ID: <4896C1B2.4010707@dev.mellanox.co.il> Signed-off-by: Yevgeny Kliteynik --- opensm/opensm/osm_qos_parser.l | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/opensm/opensm/osm_qos_parser.l b/opensm/opensm/osm_qos_parser.l index 32e8ab3..40e061d 100644 --- a/opensm/opensm/osm_qos_parser.l +++ b/opensm/opensm/osm_qos_parser.l @@ -121,7 +121,7 @@ static void reset_new_line_flags(); %} -%option nounput +%option nounput noinput QOS_ULPS_START qos\-ulps QOS_ULPS_END end\-qos\-ulps -- 1.5.1.4 From alekseys at voltaire.com Mon Aug 4 02:06:47 2008 From: alekseys at voltaire.com (Aleksey Senin) Date: Mon, 04 Aug 2008 12:06:47 +0300 Subject: [ofa-general] [PATCH RFC] replace sockaddr with sockaddr_storage in struct rdma_addr Message-ID: <1217840807.23992.5.camel@linux-zn6t.site> Here is addition patch, that replace padding arrays in rdma_addr structure with sockaddr_storage and insert necessary castings Signed-off-by: Aleksey Senin --- drivers/infiniband/core/cma.c | 33 +++++++++++++++++---------------- include/rdma/rdma_cm.h | 8 ++------ 2 files changed, 19 insertions(+), 22 deletions(-) diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index a16510b..d951896 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -784,8 +784,8 @@ static void cma_cancel_operation(struct rdma_id_private *id_priv, cma_cancel_route(id_priv); break; case CMA_LISTEN: - if (cma_any_addr(&id_priv->id.route.addr.src_addr) && - !id_priv->cma_dev) + if (cma_any_addr((struct sockaddr *) &id_priv->id.route.addr.src_addr) + && !id_priv->cma_dev) cma_cancel_listens(id_priv); break; default: @@ -1024,7 +1024,7 @@ static struct 
rdma_id_private *cma_new_conn_id(struct rdma_cm_id *listen_id, rt->path_rec[1] = *ib_event->param.req_rcvd.alternate_path; ib_addr_set_dgid(&rt->addr.dev_addr, &rt->path_rec[0].dgid); - ret = rdma_translate_ip(&id->route.addr.src_addr, + ret = rdma_translate_ip((struct sockaddr *) &id->route.addr.src_addr, &id->route.addr.dev_addr); if (ret) goto destroy_id; @@ -1062,7 +1062,7 @@ static struct rdma_id_private *cma_new_udp_id(struct rdma_cm_id *listen_id, cma_save_net_info(&id->route.addr, &listen_id->route.addr, ip_ver, port, src, dst); - ret = rdma_translate_ip(&id->route.addr.src_addr, + ret = rdma_translate_ip((struct sockaddr *) &id->route.addr.src_addr, &id->route.addr.dev_addr); if (ret) goto err; @@ -1375,7 +1375,7 @@ static int cma_ib_listen(struct rdma_id_private *id_priv) if (IS_ERR(id_priv->cm_id.ib)) return PTR_ERR(id_priv->cm_id.ib); - addr = &id_priv->id.route.addr.src_addr; + addr = (struct sockaddr *) &id_priv->id.route.addr.src_addr; svc_id = cma_get_service_id(id_priv->id.ps, addr); if (cma_any_addr(addr)) ret = ib_cm_listen(id_priv->cm_id.ib, svc_id, 0, NULL); @@ -1441,7 +1441,7 @@ static void cma_listen_on_dev(struct rdma_id_private *id_priv, dev_id_priv->state = CMA_ADDR_BOUND; memcpy(&id->route.addr.src_addr, &id_priv->id.route.addr.src_addr, - ip_addr_size(&id_priv->id.route.addr.src_addr)); + ip_addr_size((struct sockaddr *) &id_priv->id.route.addr.src_addr)); cma_attach_to_dev(dev_id_priv, cma_dev); list_add_tail(&dev_id_priv->listen_list, &id_priv->listen_list); @@ -1561,13 +1561,14 @@ static int cma_query_ib_route(struct rdma_id_private *id_priv, int timeout_ms, path_rec.pkey = cpu_to_be16(ib_addr_get_pkey(&addr->dev_addr)); path_rec.numb_path = 1; path_rec.reversible = 1; - path_rec.service_id = cma_get_service_id(id_priv->id.ps, &addr->dst_addr); + path_rec.service_id = cma_get_service_id(id_priv->id.ps, + (struct sockaddr *) &addr->dst_addr); comp_mask = IB_SA_PATH_REC_DGID | IB_SA_PATH_REC_SGID | IB_SA_PATH_REC_PKEY | 
IB_SA_PATH_REC_NUMB_PATH | IB_SA_PATH_REC_REVERSIBLE | IB_SA_PATH_REC_SERVICE_ID; - if (addr->src_addr.sa_family == AF_INET) { + if (addr->src_addr.ss_family == AF_INET) { path_rec.qos_class = cpu_to_be16((u16) id_priv->tos); comp_mask |= IB_SA_PATH_REC_QOS_CLASS; } else { @@ -1846,7 +1847,7 @@ static int cma_resolve_loopback(struct rdma_id_private *id_priv) ib_addr_get_sgid(&id_priv->id.route.addr.dev_addr, &gid); ib_addr_set_dgid(&id_priv->id.route.addr.dev_addr, &gid); - if (cma_zero_addr(&id_priv->id.route.addr.src_addr)) { + if (cma_zero_addr((struct sockaddr *) &id_priv->id.route.addr.src_addr)) { src_in = (struct sockaddr_in *)&id_priv->id.route.addr.src_addr; dst_in = (struct sockaddr_in *)&id_priv->id.route.addr.dst_addr; src_in->sin_family = dst_in->sin_family; @@ -1895,7 +1896,7 @@ int rdma_resolve_addr(struct rdma_cm_id *id, struct sockaddr *src_addr, if (cma_any_addr(dst_addr)) ret = cma_resolve_loopback(id_priv); else - ret = rdma_resolve_ip(&addr_client, &id->route.addr.src_addr, + ret = rdma_resolve_ip(&addr_client, (struct sockaddr *) &id->route.addr.src_addr, dst_addr, &id->route.addr.dev_addr, timeout_ms, addr_handler, id_priv); if (ret) @@ -2019,11 +2020,11 @@ static int cma_use_port(struct idr *ps, struct rdma_id_private *id_priv) * We don't support binding to any address if anyone is bound to * a specific address on the same port. 
*/ - if (cma_any_addr(&id_priv->id.route.addr.src_addr)) + if (cma_any_addr((struct sockaddr *) &id_priv->id.route.addr.src_addr)) return -EADDRNOTAVAIL; hlist_for_each_entry(cur_id, node, &bind_list->owners, node) { - if (cma_any_addr(&cur_id->id.route.addr.src_addr)) + if (cma_any_addr((struct sockaddr *) &cur_id->id.route.addr.src_addr)) return -EADDRNOTAVAIL; cur_sin = (struct sockaddr_in *) &cur_id->id.route.addr.src_addr; @@ -2058,7 +2059,7 @@ static int cma_get_port(struct rdma_id_private *id_priv) } mutex_lock(&lock); - if (cma_any_port(&id_priv->id.route.addr.src_addr)) + if (cma_any_port((struct sockaddr *) &id_priv->id.route.addr.src_addr)) ret = cma_alloc_any_port(ps, id_priv); else ret = cma_use_port(ps, id_priv); @@ -2230,7 +2231,7 @@ static int cma_resolve_ib_udp(struct rdma_id_private *id_priv, req.path = route->path_rec; req.service_id = cma_get_service_id(id_priv->id.ps, - &route->addr.dst_addr); + (struct sockaddr *) &route->addr.dst_addr); req.timeout_ms = 1 << (CMA_CM_RESPONSE_TIMEOUT - 8); req.max_cm_retries = CMA_MAX_CM_RETRIES; @@ -2281,7 +2282,7 @@ static int cma_connect_ib(struct rdma_id_private *id_priv, req.alternate_path = &route->path_rec[1]; req.service_id = cma_get_service_id(id_priv->id.ps, - &route->addr.dst_addr); + (struct sockaddr *) &route->addr.dst_addr); req.qp_num = id_priv->qp_num; req.qp_type = IB_QPT_RC; req.starting_psn = id_priv->seq_num; @@ -2665,7 +2666,7 @@ static int cma_join_ib_multicast(struct rdma_id_private *id_priv, if (ret) return ret; - cma_set_mgid(id_priv, &mc->addr, &rec.mgid); + cma_set_mgid(id_priv, (struct sockaddr *) &mc->addr, &rec.mgid); if (id_priv->id.ps == RDMA_PS_UDP) rec.qkey = cpu_to_be32(RDMA_UDP_QKEY); ib_addr_get_sgid(dev_addr, &rec.port_gid); diff --git a/include/rdma/rdma_cm.h b/include/rdma/rdma_cm.h index df7faf0..c6b2962 100644 --- a/include/rdma/rdma_cm.h +++ b/include/rdma/rdma_cm.h @@ -71,12 +71,8 @@ enum rdma_port_space { }; struct rdma_addr { - struct sockaddr src_addr; - u8 
src_pad[sizeof(struct sockaddr_in6) - - sizeof(struct sockaddr)]; - struct sockaddr dst_addr; - u8 dst_pad[sizeof(struct sockaddr_in6) - - sizeof(struct sockaddr)]; + struct sockaddr_storage src_addr; + struct sockaddr_storage dst_addr; struct rdma_dev_addr dev_addr; }; -- 1.5.6.dirty From sashak at voltaire.com Mon Aug 4 02:09:12 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 4 Aug 2008 12:09:12 +0300 Subject: [ofa-general] Re: [PATCH] opensm/osm_qos_parser.l: add 'noinput' lexer option to remove compiler warning In-Reply-To: <4896C1B2.4010707@dev.mellanox.co.il> References: <4896C1B2.4010707@dev.mellanox.co.il> Message-ID: <20080804090912.GG12324@sashak.voltaire.com> On 11:45 Mon 04 Aug , Yevgeny Kliteynik wrote: > > Signed-off-by: Yevgeny Kliteynik Applied. Thanks. Sasha From vlad at lists.openfabrics.org Mon Aug 4 02:54:42 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Mon, 4 Aug 2008 02:54:42 -0700 (PDT) Subject: [ofa-general] ofa_1_4_kernel 20080804-0200 daily build status Message-ID: <20080804095442.C3A0BE60846@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_4/linux-2.6.git git_branch: ofed_kernel Common build parameters: Passed: Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.16 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.17 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.18-53.el5 Passed on x86_64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.21.1 
Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.22.5-31-default Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on ia64 with linux-2.6.16 Passed on ia64 with linux-2.6.16.21-0.8-default Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.17 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.26 Passed on ia64 with linux-2.6.25 Passed on ia64 with linux-2.6.24 Passed on ppc64 with linux-2.6.16 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.17 Passed on ppc64 with linux-2.6.18-8.el5 Passed on ppc64 with linux-2.6.24 Passed on ppc64 with linux-2.6.19 Failed: From olga.shern at gmail.com Mon Aug 4 01:01:19 2008 From: olga.shern at gmail.com (Olga Shern (Voltaire)) Date: Mon, 4 Aug 2008 11:01:19 +0300 Subject: [ofa-general] ***SPAM*** IPoIB bug in 2.6.27 RC 1 Message-ID: Hi, We have found deadlock in IPoIB in 2.6.27_rc1 Bug description: https://bugs.openfabrics.org/show_bug.cgi?id=1114 Best Regards, Olga From cameron at harr.org Mon Aug 4 07:12:23 2008 From: cameron at harr.org (Cameron Harr) Date: Mon, 04 Aug 2008 08:12:23 -0600 Subject: [ofa-general] OFED build problems on Centos 5.2 In-Reply-To: References: <4892306A.2000307@harr.org> <4895B2FE.2070900@mellanox.co.il> Message-ID: <48970E47.2060805@harr.org> An HTML attachment was scrubbed... 
URL: From kliteyn at dev.mellanox.co.il Mon Aug 4 07:34:58 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Mon, 04 Aug 2008 17:34:58 +0300 Subject: [ofa-general] [PATCH] opensm/Makefile.am: fixing compilation error with osm_version.h Message-ID: <48971392.7060603@dev.mellanox.co.il> Hi Sasha, Fixing compilation error: "No rule to make target `/../include/opensm/osm_version.h', needed by `all-am'. Stop." Signed-off-by: Yevgeny Kliteynik --- opensm/opensm/Makefile.am | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/opensm/opensm/Makefile.am b/opensm/opensm/Makefile.am index 0974cac..42dd898 100644 --- a/opensm/opensm/Makefile.am +++ b/opensm/opensm/Makefile.am @@ -142,7 +142,7 @@ opensminclude_HEADERS = \ $(srcdir)/../include/opensm/osm_switch.h \ $(srcdir)/../include/opensm/osm_ucast_mgr.h \ $(srcdir)/../include/opensm/osm_vl15intf.h \ - $(builddir)/../include/opensm/osm_version.h + $(top_builddir)/include/opensm/osm_version.h BUILT_SOURCES = osm_version osm_version: -- 1.5.1.4 From sashak at voltaire.com Mon Aug 4 07:45:02 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 4 Aug 2008 17:45:02 +0300 Subject: [ofa-general] Re: [PATCH] opensm/Makefile.am: fixing compilation error with osm_version.h In-Reply-To: <48971392.7060603@dev.mellanox.co.il> References: <48971392.7060603@dev.mellanox.co.il> Message-ID: <20080804144501.GA14872@sashak.voltaire.com> On 17:34 Mon 04 Aug , Yevgeny Kliteynik wrote: > Hi Sasha, > > Fixing compilation error: > "No rule to make target `/../include/opensm/osm_version.h', > needed by `all-am'. Stop." Interesting, you auto*tools don't generate 'builddir' variable? Which version are you using? Could you send me your Makefile.in? 
Sasha > > Signed-off-by: Yevgeny Kliteynik > --- > opensm/opensm/Makefile.am | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/opensm/opensm/Makefile.am b/opensm/opensm/Makefile.am > index 0974cac..42dd898 100644 > --- a/opensm/opensm/Makefile.am > +++ b/opensm/opensm/Makefile.am > @@ -142,7 +142,7 @@ opensminclude_HEADERS = \ > $(srcdir)/../include/opensm/osm_switch.h \ > $(srcdir)/../include/opensm/osm_ucast_mgr.h \ > $(srcdir)/../include/opensm/osm_vl15intf.h \ > - $(builddir)/../include/opensm/osm_version.h > + $(top_builddir)/include/opensm/osm_version.h > > BUILT_SOURCES = osm_version > osm_version: > -- > 1.5.1.4 > From sashak at voltaire.com Mon Aug 4 07:47:45 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 4 Aug 2008 17:47:45 +0300 Subject: [ofa-general] [PATCH] opensm: cleanup osm_sweep_fail_ctrl Message-ID: <20080804144745.GB14872@sashak.voltaire.com> Cleanup osm_sweep_fail_ctrl - just similar to other controllers. Signed-off-by: Sasha Khapyorsky --- opensm/include/opensm/osm_sm.h | 3 +-- opensm/opensm/Makefile.am | 4 +--- opensm/opensm/osm_sm.c | 17 +++++++++++++---- 3 files changed, 15 insertions(+), 9 deletions(-) diff --git a/opensm/include/opensm/osm_sm.h b/opensm/include/opensm/osm_sm.h index ba18b2e..f44f8a2 100644 --- a/opensm/include/opensm/osm_sm.h +++ b/opensm/include/opensm/osm_sm.h @@ -58,7 +58,6 @@ #include #include #include -#include #include #include #include @@ -131,7 +130,7 @@ typedef struct osm_sm { osm_sm_mad_ctrl_t mad_ctrl; osm_lid_mgr_t lid_mgr; osm_ucast_mgr_t ucast_mgr; - osm_sweep_fail_ctrl_t sweep_fail_ctrl; + cl_disp_reg_handle_t sweep_fail_disp_h; cl_disp_reg_handle_t ni_disp_h; cl_disp_reg_handle_t pi_disp_h; cl_disp_reg_handle_t nd_disp_h; diff --git a/opensm/opensm/Makefile.am b/opensm/opensm/Makefile.am index 0974cac..f5e752f 100644 --- a/opensm/opensm/Makefile.am +++ b/opensm/opensm/Makefile.am @@ -52,8 +52,7 @@ opensm_SOURCES = main.c osm_console_io.c osm_console.c osm_db_files.c 
\ osm_sa_sw_info_record.c osm_service.c \ osm_slvl_map_rcv.c osm_sm.c osm_sminfo_rcv.c \ osm_sm_mad_ctrl.c osm_sm_state_mgr.c osm_state_mgr.c \ - osm_subnet.c \ - osm_sweep_fail_ctrl.c osm_sw_info_rcv.c osm_switch.c \ + osm_subnet.c osm_sw_info_rcv.c osm_switch.c \ osm_prtn.c osm_prtn_config.c osm_qos.c osm_router.c \ osm_trap_rcv.c osm_ucast_mgr.c osm_ucast_updn.c \ osm_ucast_lash.c osm_ucast_file.c osm_ucast_ftree.c \ @@ -138,7 +137,6 @@ opensminclude_HEADERS = \ $(srcdir)/../include/opensm/st.h \ $(srcdir)/../include/opensm/osm_stats.h \ $(srcdir)/../include/opensm/osm_subnet.h \ - $(srcdir)/../include/opensm/osm_sweep_fail_ctrl.h \ $(srcdir)/../include/opensm/osm_switch.h \ $(srcdir)/../include/opensm/osm_ucast_mgr.h \ $(srcdir)/../include/opensm/osm_vl15intf.h \ diff --git a/opensm/opensm/osm_sm.c b/opensm/opensm/osm_sm.c index 3548935..98d0b1b 100644 --- a/opensm/opensm/osm_sm.c +++ b/opensm/opensm/osm_sm.c @@ -144,6 +144,14 @@ static void sm_sweep(void *arg) cl_timer_start(&sm->sweep_timer, sm->p_subn->opt.sweep_interval * 1000); } +static void sweep_fail_process(IN void *context, IN void *p_data) +{ + osm_sm_t *sm = context; + + OSM_LOG(sm->p_log, OSM_LOG_DEBUG, "light sweep failed\n"); + sm->p_subn->force_heavy_sweep = TRUE; +} + /********************************************************************** **********************************************************************/ void osm_sm_construct(IN osm_sm_t * const p_sm) @@ -162,7 +170,6 @@ void osm_sm_construct(IN osm_sm_t * const p_sm) osm_sm_mad_ctrl_construct(&p_sm->mad_ctrl); osm_lid_mgr_construct(&p_sm->lid_mgr); osm_ucast_mgr_construct(&p_sm->ucast_mgr); - osm_sweep_fail_ctrl_construct(&p_sm->sweep_fail_ctrl); } /********************************************************************** @@ -209,7 +216,7 @@ void osm_sm_shutdown(IN osm_sm_t * const p_sm) cl_disp_unregister(p_sm->slvl_disp_h); cl_disp_unregister(p_sm->vla_disp_h); cl_disp_unregister(p_sm->pkey_disp_h); - 
osm_sweep_fail_ctrl_destroy(&p_sm->sweep_fail_ctrl); + cl_disp_unregister(p_sm->sweep_fail_disp_h); OSM_LOG_EXIT(p_sm->p_log); } @@ -312,8 +319,10 @@ osm_sm_init(IN osm_sm_t * const p_sm, if (status != IB_SUCCESS) goto Exit; - status = osm_sweep_fail_ctrl_init(&p_sm->sweep_fail_ctrl, p_sm); - if (status != IB_SUCCESS) + p_sm->sweep_fail_disp_h = cl_disp_register(p_disp, + OSM_MSG_LIGHT_SWEEP_FAIL, + sweep_fail_process, p_sm); + if (p_sm->sweep_fail_disp_h == CL_DISP_INVALID_HANDLE) goto Exit; p_sm->ni_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_NODE_INFO, -- 1.5.4.rc2.60.gb2e62 From sashak at voltaire.com Mon Aug 4 07:48:19 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 4 Aug 2008 17:48:19 +0300 Subject: [ofa-general] [PATCH] opensm: remove osm_sweep_fail_ctrl.[ch] files In-Reply-To: <20080804144745.GB14872@sashak.voltaire.com> References: <20080804144745.GB14872@sashak.voltaire.com> Message-ID: <20080804144819.GC14872@sashak.voltaire.com> Signed-off-by: Sasha Khapyorsky --- opensm/include/opensm/osm_sweep_fail_ctrl.h | 204 --------------------------- opensm/opensm/osm_sweep_fail_ctrl.c | 112 --------------- 2 files changed, 0 insertions(+), 316 deletions(-) delete mode 100644 opensm/include/opensm/osm_sweep_fail_ctrl.h delete mode 100644 opensm/opensm/osm_sweep_fail_ctrl.c diff --git a/opensm/include/opensm/osm_sweep_fail_ctrl.h b/opensm/include/opensm/osm_sweep_fail_ctrl.h deleted file mode 100644 index 12832c0..0000000 --- a/opensm/include/opensm/osm_sweep_fail_ctrl.h +++ /dev/null @@ -1,204 +0,0 @@ -/* - * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. 
You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_sweep_fail_ctrl_t. - * This object represents a controller that - * handles transport failures during sweeps. - * This object is part of the OpenSM family of objects. - */ - -#ifndef _OSM_SWEEP_FAIL_CTRL_H_ -#define _OSM_SWEEP_FAIL_CTRL_H_ - -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/Sweep Fail Controller -* NAME -* Sweep Fail Controller -* -* DESCRIPTION -* The Sweep Fail Controller object encapsulates -* the information needed to handle transport failures during -* sweeps. 
-* -* The Sweep Fail Controller object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Steve King, Intel -* -*********/ -struct osm_sm; -/****s* OpenSM: Sweep Fail Controller/osm_sweep_fail_ctrl_t -* NAME -* osm_sweep_fail_ctrl_t -* -* DESCRIPTION -* Sweep Fail Controller structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. -* -* SYNOPSIS -*/ -typedef struct osm_sweep_fail_ctrl { - struct osm_sm *sm; - cl_disp_reg_handle_t h_disp; -} osm_sweep_fail_ctrl_t; -/* -* FIELDS -* sm -* Pointer to the sm object. -* -* h_disp -* Handle returned from dispatcher registration. -* -* SEE ALSO -* Sweep Fail Controller object -* Sweep Failr object -*********/ - -/****f* OpenSM: Sweep Fail Controller/osm_sweep_fail_ctrl_construct -* NAME -* osm_sweep_fail_ctrl_construct -* -* DESCRIPTION -* This function constructs a Sweep Fail Controller object. -* -* SYNOPSIS -*/ -void osm_sweep_fail_ctrl_construct(IN osm_sweep_fail_ctrl_t * const p_ctrl); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to a Sweep Fail Controller -* object to construct. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Allows calling osm_sweep_fail_ctrl_init, osm_sweep_fail_ctrl_destroy -* -* Calling osm_sweep_fail_ctrl_construct is a prerequisite to calling any other -* method except osm_sweep_fail_ctrl_init. -* -* SEE ALSO -* Sweep Fail Controller object, osm_sweep_fail_ctrl_init, -* osm_sweep_fail_ctrl_destroy -*********/ - -/****f* OpenSM: Sweep Fail Controller/osm_sweep_fail_ctrl_destroy -* NAME -* osm_sweep_fail_ctrl_destroy -* -* DESCRIPTION -* The osm_sweep_fail_ctrl_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_sweep_fail_ctrl_destroy(IN osm_sweep_fail_ctrl_t * const p_ctrl); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to the object to destroy. 
-* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* Sweep Fail Controller object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_sweep_fail_ctrl_construct or osm_sweep_fail_ctrl_init. -* -* SEE ALSO -* Sweep Fail Controller object, osm_sweep_fail_ctrl_construct, -* osm_sweep_fail_ctrl_init -*********/ - -/****f* OpenSM: Sweep Fail Controller/osm_sweep_fail_ctrl_init -* NAME -* osm_sweep_fail_ctrl_init -* -* DESCRIPTION -* The osm_sweep_fail_ctrl_init function initializes a -* Sweep Fail Controller object for use. -* -* SYNOPSIS -*/ -ib_api_status_t -osm_sweep_fail_ctrl_init(IN osm_sweep_fail_ctrl_t * const p_ctrl, - IN struct osm_sm * sm); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to an osm_sweep_fail_ctrl_t object to initialize. -* -* sm -* [in] Pointer to the SM object. -* -* RETURN VALUES -* CL_SUCCESS if the Sweep Fail Controller object was initialized -* successfully. -* -* NOTES -* Allows calling other Sweep Fail Controller methods. -* -* SEE ALSO -* Sweep Fail Controller object, osm_sweep_fail_ctrl_construct, -* osm_sweep_fail_ctrl_destroy -*********/ - -END_C_DECLS -#endif /* _OSM_SWEEP_FAIL_CTRL_H_ */ diff --git a/opensm/opensm/osm_sweep_fail_ctrl.c b/opensm/opensm/osm_sweep_fail_ctrl.c deleted file mode 100644 index 63809de..0000000 --- a/opensm/opensm/osm_sweep_fail_ctrl.c +++ /dev/null @@ -1,112 +0,0 @@ -/* - * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. 
You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Implementation of osm_sweep_fail_ctrl_t. - */ - -#if HAVE_CONFIG_H -# include -#endif /* HAVE_CONFIG_H */ - -#include -#include -#include -#include - -/********************************************************************** - **********************************************************************/ -static void __osm_sweep_fail_ctrl_disp_callback(IN void *context, - IN void *p_data) -{ - osm_sweep_fail_ctrl_t *const p_ctrl = (osm_sweep_fail_ctrl_t *) context; - - OSM_LOG_ENTER(p_ctrl->sm->p_log); - - UNUSED_PARAM(p_data); - /* - Notify the state manager that we had a light sweep failure. 
- */ - p_ctrl->sm->p_subn->force_heavy_sweep = TRUE; - - OSM_LOG_EXIT(p_ctrl->sm->p_log); -} - -/********************************************************************** - **********************************************************************/ -void osm_sweep_fail_ctrl_construct(IN osm_sweep_fail_ctrl_t * const p_ctrl) -{ - memset(p_ctrl, 0, sizeof(*p_ctrl)); - p_ctrl->h_disp = CL_DISP_INVALID_HANDLE; -} - -/********************************************************************** - **********************************************************************/ -void osm_sweep_fail_ctrl_destroy(IN osm_sweep_fail_ctrl_t * const p_ctrl) -{ - CL_ASSERT(p_ctrl); - cl_disp_unregister(p_ctrl->h_disp); -} - -/********************************************************************** - **********************************************************************/ -ib_api_status_t -osm_sweep_fail_ctrl_init(IN osm_sweep_fail_ctrl_t * const p_ctrl, - IN osm_sm_t * const sm) -{ - ib_api_status_t status = IB_SUCCESS; - - OSM_LOG_ENTER(sm->p_log); - - osm_sweep_fail_ctrl_construct(p_ctrl); - p_ctrl->sm = sm; - - p_ctrl->h_disp = cl_disp_register(sm->p_disp, - OSM_MSG_LIGHT_SWEEP_FAIL, - __osm_sweep_fail_ctrl_disp_callback, - p_ctrl); - - if (p_ctrl->h_disp == CL_DISP_INVALID_HANDLE) { - OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 3501: " - "Dispatcher registration failed\n"); - status = IB_INSUFFICIENT_RESOURCES; - goto Exit; - } - -Exit: - OSM_LOG_EXIT(sm->p_log); - return (status); -} -- 1.5.4.rc2.60.gb2e62 From vlad at mellanox.co.il Mon Aug 4 07:55:13 2008 From: vlad at mellanox.co.il (Vladimir Sokolovsky) Date: Mon, 4 Aug 2008 17:55:13 +0300 Subject: [ofa-general] RE: [PATCH] opensm/Makefile.am: fixing compilation error withosm_version.h In-Reply-To: <20080804144501.GA14872@sashak.voltaire.com> Message-ID: <5D49E7A8952DC44FB38C38FA0D758EAD0C8F0B@mtlexch01.mtl.com> > -----Original Message----- > From: Sasha Khapyorsky [mailto:sashak at voltaire.com] > Sent: Monday, August 04, 2008 5:45 PM > To: Yevgeny 
Kliteynik > Cc: OpenIB; Vladimir Sokolovsky > Subject: Re: [PATCH] opensm/Makefile.am: fixing compilation > error withosm_version.h > > > On 17:34 Mon 04 Aug , Yevgeny Kliteynik wrote: > > Hi Sasha, > > > > Fixing compilation error: > > "No rule to make target `/../include/opensm/osm_version.h', > > needed by `all-am'. Stop." > > Interesting, you auto*tools don't generate 'builddir' > variable? Which version are you using? Could you send me your > Makefile.in? > > Sasha > Attached Makefile.in from SLES10 and RHEL5.2. Regards, Vladimir -------------- next part -------------- A non-text attachment was scrubbed... Name: Makefile.in.SLES10 Type: application/octet-stream Size: 200341 bytes Desc: Makefile.in.SLES10 URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Makefile.in.RHEL5.2 Type: application/octet-stream Size: 200353 bytes Desc: Makefile.in.RHEL5.2 URL: From sashak at voltaire.com Mon Aug 4 08:06:17 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 4 Aug 2008 18:06:17 +0300 Subject: [ofa-general] Re: [PATCH] opensm/Makefile.am: fixing compilation error withosm_version.h In-Reply-To: <5D49E7A8952DC44FB38C38FA0D758EAD0C8F0B@mtlexch01.mtl.com> References: <20080804144501.GA14872@sashak.voltaire.com> <5D49E7A8952DC44FB38C38FA0D758EAD0C8F0B@mtlexch01.mtl.com> Message-ID: <20080804150616.GD14872@sashak.voltaire.com> On 17:55 Mon 04 Aug , Vladimir Sokolovsky wrote: > > Attached Makefile.in from SLES10 and RHEL5.2. Strange, both don't have builddir definition at all. 
I really like those auto*tools :) Sasha From sashak at voltaire.com Mon Aug 4 08:07:12 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 4 Aug 2008 18:07:12 +0300 Subject: [ofa-general] Re: [PATCH] opensm/Makefile.am: fixing compilation error with osm_version.h In-Reply-To: <48971392.7060603@dev.mellanox.co.il> References: <48971392.7060603@dev.mellanox.co.il> Message-ID: <20080804150712.GE14872@sashak.voltaire.com> On 17:34 Mon 04 Aug , Yevgeny Kliteynik wrote: > Hi Sasha, > > Fixing compilation error: > "No rule to make target `/../include/opensm/osm_version.h', > needed by `all-am'. Stop." > > Signed-off-by: Yevgeny Kliteynik Applied. Thanks. Sasha From ronli at voltaire.com Mon Aug 4 11:17:12 2008 From: ronli at voltaire.com (Ron Livne) Date: Mon, 4 Aug 2008 18:17:12 +0000 (UTC) Subject: [ofa-general] [PATCH] ib/core: fix for send multicast group send leave retry Message-ID: Until now, there was a retry mechanism only when joining a multicast group failed. This patch adds a mechanism that retries leaving a multicast group before giving up.
Signed-off-by: Ron Livne diff --git a/drivers/infiniband/core/multicast.c b/drivers/infiniband/core/multicast.c index 107f170..9aba771 100644 --- a/drivers/infiniband/core/multicast.c +++ b/drivers/infiniband/core/multicast.c @@ -106,6 +106,7 @@ struct mcast_group { struct ib_sa_query *query; int query_id; u16 pkey_index; + int retries; }; struct mcast_member { @@ -540,9 +541,16 @@ static void join_handler(int status, struct ib_sa_mcmember_rec *rec, static void leave_handler(int status, struct ib_sa_mcmember_rec *rec, void *context) { + __u8 leave_state; struct mcast_group *group = context; - mcast_work_handler(&group->work); + leave_state = get_leave_state(group); + if (status && (group->retries > 0)) { + send_leave(group, leave_state); + group->retries--; + } + if (!status || (group->retries == 0)) + mcast_work_handler(&group->work); } static struct mcast_group *acquire_group(struct mcast_port *port, @@ -565,6 +573,7 @@ static struct mcast_group *acquire_group(struct mcast_port *port, if (!group) return NULL; + group->retries = 3; group->port = port; group->rec.mgid = *mgid; group->pkey_index = MCAST_INVALID_PKEY_INDEX; From yosefe at Voltaire.COM Mon Aug 4 08:45:53 2008 From: yosefe at Voltaire.COM (Yossi Etigin) Date: Mon, 04 Aug 2008 18:45:53 +0300 Subject: [ofa-general] [PATCH] ipiob: fix rtnl deadlock Message-ID: <48972431.7000804@Voltaire.COM> This fixes bug #1114 in bugzilla, which is a deadlock between ipoib_stop and mcast_join_task. ipoib_stop is called with rtnl_lock, and flushes ipoib_workqueue. the flush operation might wait for mcast_join_task to finish, which in turn might wait for rtnl_lock. 
-- Index: b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c =================================================================== --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 2008-08-04 18:09:33.000000000 +0300 +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 2008-08-04 18:39:08.000000000 +0300 @@ -504,6 +504,7 @@ struct ipoib_dev_priv *priv = container_of(work, struct ipoib_dev_priv, mcast_join_task.work); struct net_device *dev = priv->dev; + int ret; if (!test_bit(IPOIB_MCAST_RUN, &priv->flags)) return; @@ -577,9 +578,16 @@ priv->mcast_mtu = IPOIB_UD_MTU(ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu)); if (!ipoib_cm_admin_enabled(dev)) { - rtnl_lock(); - dev_set_mtu(dev, min(priv->mcast_mtu, priv->admin_mtu)); - rtnl_unlock(); + /* Avoid deadlock with ipoib_stop */ + while (!(ret = rtnl_trylock()) && + test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags)) + yield(); + + if (ret) { + dev_set_mtu(dev, min(priv->mcast_mtu, priv->admin_mtu)); + rtnl_unlock(); + } else + ipoib_dbg_mcast(priv, "ignoring mtu setup because device is down\n"); } ipoib_dbg_mcast(priv, "successfully joined all multicast groups\n"); -- --Yossi From sean.hefty at intel.com Mon Aug 4 09:31:52 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Mon, 4 Aug 2008 09:31:52 -0700 Subject: [ofa-general] [PATCH] ib/core: fix for send multicast group send leave retry In-Reply-To: References: Message-ID: <000001c8f64f$9a36a1a0$bb37170a@amr.corp.intel.com> > struct mcast_member { >@@ -540,9 +541,16 @@ static void join_handler(int status, struct >ib_sa_mcmember_rec *rec, > static void leave_handler(int status, struct ib_sa_mcmember_rec *rec, > void *context) > { >+ __u8 leave_state; > struct mcast_group *group = context; > >- mcast_work_handler(&group->work); >+ leave_state = get_leave_state(group); I don't think this works as expected. If you look in mcast_work_handler(), the group's join_state is adjusted before send_leave is called. 
Leave_state here will be different (likely 0) than the leave_state sent in the original leave request. >+ if (status && (group->retries > 0)) { >+ send_leave(group, leave_state); >+ group->retries--; >+ } >+ if (!status || (group->retries == 0)) I think this should just be an 'else'. >+ mcast_work_handler(&group->work); > } - Sean From sean.hefty at intel.com Mon Aug 4 09:45:05 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Mon, 4 Aug 2008 09:45:05 -0700 Subject: [ofa-general] RE: [PATCH 1/1 v2] librdmacm: add support for create qp expanded (with changelog this time) In-Reply-To: <200808041019.30536.jackm@dev.mellanox.co.il> References: <200808031451.02599.jackm@dev.mellanox.co.il> <000001c8f5f7$0db17500$9dfd070a@amr.corp.intel.com> <200808041019.30536.jackm@dev.mellanox.co.il> Message-ID: <000101c8f651$72e9d4d0$bb37170a@amr.corp.intel.com> >Won't it be confusing if the rdma_create_qp_ex() call with create_flags >= 0 >succeeds, while the same call with create_flags != 0 fails (in the case >where >userlevel is running against an older kernel which does not have the >ib_uverbs_create_qp_ex() interface)? As soon as we introduce new flags, we will have a situation where create_flags=X works, but create_flags=Y won't. (Either because of the different kernels, or devices.) This isn't really any different. >I prefer to keep the ex interface separate -- there is no point in >using the >qp_ex call if there is no intent to use the expanded feature. I can envision an application that allows user controlled parameters where calling qp_ex would always work, exactly so the user doesn't have to do an if statement to see which call to invoke. As soon as an ex type function is introduced, you could argue that the existing call should be deprecated. 
- Sean From rdreier at cisco.com Mon Aug 4 10:44:39 2008 From: rdreier at cisco.com (Roland Dreier) Date: Mon, 04 Aug 2008 10:44:39 -0700 Subject: [ofa-general] [PATCH] ipiob: fix rtnl deadlock In-Reply-To: <48972431.7000804@Voltaire.COM> (Yossi Etigin's message of "Mon, 04 Aug 2008 18:45:53 +0300") References: <48972431.7000804@Voltaire.COM> Message-ID: > ipoib_stop is called with rtnl_lock, and flushes ipoib_workqueue. > the flush operation might wait for mcast_join_task to finish, which > in turn might wait for rtnl_lock. when did we introduce this bug? > + /* Avoid deadlock with ipoib_stop */ > + while (!(ret = rtnl_trylock()) && > + test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags)) > + yield(); > + > + if (ret) { > + dev_set_mtu(dev, min(priv->mcast_mtu, priv->admin_mtu)); > + rtnl_unlock(); > + } else > + ipoib_dbg_mcast(priv, "ignoring mtu setup because device is down\n"); this is rather horrible looking... is there any way we can avoid the loop on trylock? - R. From cameron at harr.org Mon Aug 4 10:53:59 2008 From: cameron at harr.org (Cameron Harr) Date: Mon, 04 Aug 2008 11:53:59 -0600 Subject: [ofa-general] OFED build problems on Centos 5.2 In-Reply-To: <48970E47.2060805@harr.org> References: <4892306A.2000307@harr.org> <4895B2FE.2070900@mellanox.co.il> <48970E47.2060805@harr.org> Message-ID: <48974237.8080306@harr.org> An HTML attachment was scrubbed... URL: From rdreier at cisco.com Mon Aug 4 11:03:55 2008 From: rdreier at cisco.com (Roland Dreier) Date: Mon, 04 Aug 2008 11:03:55 -0700 Subject: [ofa-general] [PATCH RFC] replace sockaddr with sockaddr_storage in struct rdma_addr In-Reply-To: <1217840807.23992.5.camel@linux-zn6t.site> (Aleksey Senin's message of "Mon, 04 Aug 2008 12:06:47 +0300") References: <1217840807.23992.5.camel@linux-zn6t.site> Message-ID: thanks, I missed the padding in struct rdma_addr. I rolled this into the other sockaddr_storage patch.
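For readers following the sockaddr_storage change: the motivation is not spelled out in this part of the thread, but one well-known reason for preferring struct sockaddr_storage is that a plain struct sockaddr is too small to hold an IPv6 address. A minimal userspace sketch of that size difference follows; the helper names store_addr and sockaddr_too_small_for_ipv6 are made up for illustration and do not come from the kernel patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: copy an address of any family into generic storage.
 * Returns 0 on success, -1 if the source would not fit. */
static int store_addr(struct sockaddr_storage *dst, const void *src, size_t len)
{
	assert(dst != NULL && src != NULL);
	if (len > sizeof(*dst))
		return -1;
	memset(dst, 0, sizeof(*dst));
	memcpy(dst, src, len);
	return 0;
}

/* Hypothetical helper: report whether an IPv6 socket address is larger
 * than a plain struct sockaddr, i.e. whether it would overflow one. */
static int sockaddr_too_small_for_ipv6(void)
{
	return sizeof(struct sockaddr_in6) > sizeof(struct sockaddr);
}
```

A caller would fill a struct sockaddr_in or struct sockaddr_in6, pass it to store_addr() with its size, and then read ss_family from the storage to decide how to interpret it.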
From yosefe at Voltaire.COM Mon Aug 4 11:04:50 2008 From: yosefe at Voltaire.COM (Yossi Etigin) Date: Mon, 04 Aug 2008 21:04:50 +0300 Subject: [ofa-general] [PATCH] ipiob: fix rtnl deadlock In-Reply-To: References: <48972431.7000804@Voltaire.COM> Message-ID: <489744C2.8040400@Voltaire.COM> Roland Dreier wrote: > > ipoib_stop is called with rtnl_lock, and flushes ipoib_workqueue. > > the flush operation might wait for mcast_join_task to finish, which > > in turn might wait for rtnl_lock. > > when did we introduce this bug? http://www.openfabrics.org/git/?p=ofed_1_4/linux-2.6.git;a=commit;h=529024117628d0037644a20b4870c61d63cea2a1 > > > + /* Avoid deadlock with ipoib_stop */ > > + while (!(ret = rtnl_trylock()) && > > + test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags)) > > + yield(); > > + > > + if (ret) { > > + dev_set_mtu(dev, min(priv->mcast_mtu, priv->admin_mtu)); > > + rtnl_unlock(); > > + } else > > + ipoib_dbg_mcast(priv, "ignoring mtu setup because device is down\n"); > > this is rather horrible looking... is there any way we can avoid the > loop on trylock? > We can just give up if we can't get the lock, as is done in drivers/net/cxgb3/cxgb3_main.c. Other solutions might get messy, because you have no control over when the lock is actually taken, so you can't set any flags and such. Such solutions might be: flush the queue sometime later, or set the MTU sometime later on another workqueue. > - R. -- --Yossi From rdreier at cisco.com Mon Aug 4 11:05:12 2008 From: rdreier at cisco.com (Roland Dreier) Date: Mon, 04 Aug 2008 11:05:12 -0700 Subject: [ofa-general] [PATCH 2.6.27] RDMA/cxgb3: Fix QP capabilities.
In-Reply-To: <20080731202135.18293.57833.stgit@dell3.ogc.int> (Steve Wise's message of "Thu, 31 Jul 2008 15:21:35 -0500") References: <20080731202135.18293.57833.stgit@dell3.ogc.int> Message-ID: thanks, applied From rdreier at cisco.com Mon Aug 4 11:06:08 2008 From: rdreier at cisco.com (Roland Dreier) Date: Mon, 04 Aug 2008 11:06:08 -0700 Subject: [ofa-general] [PATCH 2.6.27] RDMA/cxgb3: Fix up MW access rights. In-Reply-To: <20080801181010.3736.44993.stgit@dell3.ogc.int> (Steve Wise's message of "Fri, 01 Aug 2008 13:10:10 -0500") References: <20080801181010.3736.44993.stgit@dell3.ogc.int> Message-ID: thanks, applied From rdreier at cisco.com Mon Aug 4 11:08:53 2008 From: rdreier at cisco.com (Roland Dreier) Date: Mon, 04 Aug 2008 11:08:53 -0700 Subject: [ofa-general] Re: [PATCH 2.6.27] RDMA/cxgb3: Deadlock initializing the iw_cxgb3 device. In-Reply-To: <20080801194334.7950.33820.stgit@dell3.ogc.int> (Steve Wise's message of "Fri, 01 Aug 2008 14:43:35 -0500") References: <20080801194334.7950.33820.stgit@dell3.ogc.int> Message-ID: thanks, applied From rdreier at cisco.com Mon Aug 4 11:09:27 2008 From: rdreier at cisco.com (Roland Dreier) Date: Mon, 04 Aug 2008 11:09:27 -0700 Subject: [ofa-general] [ANNOUNCE] libcxgb3-1.2.2 availability In-Reply-To: <4894C49E.1070702@opengridcomputing.com> (Steve Wise's message of "Sat, 02 Aug 2008 15:33:34 -0500") References: <4894C49E.1070702@opengridcomputing.com> Message-ID: > This version relaxes the firmware requirements and thus works with > versions 5.x, 6.x and the up and coming 7.x for ofed-1.4. awesome... I'll update the Debian packages.
From rdreier at cisco.com Mon Aug 4 11:12:50 2008 From: rdreier at cisco.com (Roland Dreier) Date: Mon, 04 Aug 2008 11:12:50 -0700 Subject: [ofa-general] Re: [PATCH] Infiniband/ipath: fix warnings In-Reply-To: <20080802102909.GC4856@orion> (Alexander Beregalov's message of "Sat, 2 Aug 2008 14:29:09 +0400") References: <20080802102909.GC4856@orion> Message-ID: Thanks for doing this work, applied. From jon at opengridcomputing.com Mon Aug 4 12:28:48 2008 From: jon at opengridcomputing.com (Jon Mason) Date: Mon, 4 Aug 2008 14:28:48 -0500 Subject: [ofa-general] [PATCH] rds: fix compile breakage on ofed_2_6_27 tree Message-ID: <20080804192848.GD28069@opengridcomputing.com> RDS does not compile on 2.6.25 and 2.6.27 kernels due to a broken reference to a recently modified data struct. The patch below modifies the reference to point to the new location. Signed-Off-By: Jon Mason diff --git a/net/rds/ib_recv.c b/net/rds/ib_recv.c index 6b3b476..9f72556 100644 --- a/net/rds/ib_recv.c +++ b/net/rds/ib_recv.c @@ -796,7 +796,7 @@ void rds_ib_recv_cq_comp_handler(struct ib_cq *cq, void *context) while (ib_poll_cq(cq, 1, &wc) > 0) { rdsdebug("wc wr_id 0x%llx status %u byte_len %u imm_data %u\n", (unsigned long long)wc.wr_id, wc.status, wc.byte_len, - be32_to_cpu(wc.imm_data)); + be32_to_cpu(wc.ex.imm_data)); rds_ib_stats_inc(s_ib_rx_cq_event); recv = &ic->i_recvs[rds_ib_ring_oldest(&ic->i_recv_ring)]; diff --git a/net/rds/ib_send.c b/net/rds/ib_send.c index 865301a..43d3faa 100644 --- a/net/rds/ib_send.c +++ b/net/rds/ib_send.c @@ -195,7 +195,7 @@ void rds_ib_send_cq_comp_handler(struct ib_cq *cq, void *context) while (ib_poll_cq(cq, 1, &wc) > 0 ) { rdsdebug("wc wr_id 0x%llx status %u byte_len %u imm_data %u\n", (unsigned long long)wc.wr_id, wc.status, wc.byte_len, - be32_to_cpu(wc.imm_data)); + be32_to_cpu(wc.ex.imm_data)); rds_ib_stats_inc(s_ib_tx_cq_event); if (wc.wr_id == RDS_IB_ACK_WR_ID) { From weiny2 at llnl.gov Mon Aug 4 14:05:37 2008 From: weiny2 at llnl.gov (Ira Weiny) 
Date: Mon, 4 Aug 2008 14:05:37 -0700 Subject: [ofa-general] Re: [PATCH V2] ibsim: Add a Node Description query drop error. In-Reply-To: <20080803170946.GF15644@sashak.voltaire.com> References: <20080730174011.5d80e036.weiny2@llnl.gov> <20080731180323.GV14872@sashak.voltaire.com> <20080731132053.17cd8915.weiny2@llnl.gov> <20080803170946.GF15644@sashak.voltaire.com> Message-ID: <20080804140537.3d39a837.weiny2@llnl.gov> On Sun, 3 Aug 2008 20:09:46 +0300 Sasha Khapyorsky wrote: > Hi Ira, > > On 13:20 Thu 31 Jul , Ira Weiny wrote: > > > > Like this? > > Not exactly. I meant possibility to specify any attribute to drop. Like > this. > > Sasha > > diff --git a/ibsim/sim.h b/ibsim/sim.h > index 936bb85..f989252 100644 > --- a/ibsim/sim.h > +++ b/ibsim/sim.h This did not quite work (had to add an exception in pc_updated). Here is a revised version with some additions to the help message for ease of use. Ira >From 4d1fb5b5ba24584e27d09e51e29745f986f84a32 Mon Sep 17 00:00:00 2001 From: Sasha Khapyorsky Date: Sun, 3 Aug 2008 20:09:46 +0300 Subject: [PATCH] ibsim: Add a Node Description query drop error. Hi Ira, On 13:20 Thu 31 Jul , Ira Weiny wrote: > > Like this? Not exactly. I meant possibility to specify any attribute to drop. Like this. 
Sasha Signed-off-by: Ira Weiny --- ibsim/sim.h | 1 + ibsim/sim_cmd.c | 24 ++++++++++++++++++++++-- ibsim/sim_mad.c | 7 +++++-- 3 files changed, 28 insertions(+), 4 deletions(-) diff --git a/ibsim/sim.h b/ibsim/sim.h index 936bb85..f989252 100644 --- a/ibsim/sim.h +++ b/ibsim/sim.h @@ -207,6 +207,7 @@ struct Port { Node *remotenode; int remoteport; int errrate; + uint16_t errattr; Node *node; Portcounters portcounters; uint16_t *pkey_tbl; diff --git a/ibsim/sim_cmd.c b/ibsim/sim_cmd.c index a6aab9d..6bf1e29 100644 --- a/ibsim/sim_cmd.c +++ b/ibsim/sim_cmd.c @@ -199,6 +199,7 @@ static int do_seterror(FILE * f, char *line) char *nodeid = 0, name[NAMELEN], *sp, *orig = 0; int portnum = -1; // def - all ports int numports, set = 0, rate = 0; + uint16_t attr = 0; if (strsep(&s, "\"")) orig = strsep(&s, "\""); @@ -240,6 +241,13 @@ static int do_seterror(FILE * f, char *line) } DEBUG("error rate is %d", rate); + + strsep(&s, " \t"); + if (s) { + attr = strtoul(s, 0, 0); + DEBUG("error attr is %u", attr); + } + numports = node->numports; if (node->type == SWITCH_NODE) @@ -250,12 +258,14 @@ static int do_seterror(FILE * f, char *line) if (portnum >= 0) { port = ports + node->portsbase + portnum; port->errrate = rate; + port->errattr = attr; return 1; } for (port = ports + node->portsbase, e = port + numports; port < e; port++) { port->errrate = rate; + port->errattr = attr; set++; } @@ -415,8 +425,11 @@ static void dump_switch(FILE * f, Switch * sw) static void dump_comment(Port * port, char *comment) { + int n = 0; if (port->errrate) - sprintf(comment, "\t# err_rate %d", port->errrate); + n += sprintf(comment, "\t# err_rate %d", port->errrate); + if (port->errattr) + n += sprintf(comment+n, "\t# err_attr %d", port->errattr); } static void dump_port(FILE * f, Port * port, int type) @@ -708,7 +721,14 @@ static int dump_help(FILE * f) fprintf(f, "\tGuid \"nodeid\" : set GUID value for this node\n"); fprintf(f, "\tGuid \"nodeid\"[port] : set GUID value for this port\n"); 
fprintf(f, - "\tError \"nodeid\"[port] : set error rate for port/node\n"); + "\tError \"nodeid\"[port] [attribute]: set error rate for\n" + "\t\t\tport/node, optionally for specified attribute ID\n" + "\t\t\tSome common attribute IDs:\n" + "\t\t\t\tNodeDescription : 16\n" + "\t\t\t\tNodeInfo : 17\n" + "\t\t\t\tSwitchInfo : 18\n" + "\t\t\t\tPortInfo : 19\n" + ); fprintf(f, "\tBaselid \"nodeid\"[port] [lmc] : change port's lid (lmc)\n"); fprintf(f, "\tVerbose [newlevel] - show/set simulator verbosity\n"); diff --git a/ibsim/sim_mad.c b/ibsim/sim_mad.c index b8ce2ab..0ac2c1e 100644 --- a/ibsim/sim_mad.c +++ b/ibsim/sim_mad.c @@ -612,7 +612,9 @@ static int pc_updated(Port ** srcport, Port * destport) ADDVAL64(destpc->ext_xmit_data, madsize_div_4); ADDVAL64(destpc->ext_xmit_pkts, 1); - if (destport->errrate && (random() % 100) < destport->errrate) { + if (destport->errrate && + !destport->errattr && + (random() % 100) < destport->errrate) { pc_add_error_errs_rcv(destport); VERB("drop pkt due error rate %d", destport->errrate); return 0; @@ -1188,7 +1190,8 @@ int process_packet(Client * cl, void *p, int size, Client ** dcl) return sizeof(*r); // forward only } - if (port->errrate && (random() % 100) < port->errrate) { + if (port->errrate && (!port->errattr || port->errattr == rpc.attr.id) && + (random() % 100) < port->errrate) { VERB("drop pkt due error rate %d", port->errrate); goto _dropped; } -- 1.5.4.5 From jon at opengridcomputing.com Mon Aug 4 14:16:10 2008 From: jon at opengridcomputing.com (Jon Mason) Date: Mon, 4 Aug 2008 16:16:10 -0500 Subject: [ofa-general] [PATCH] rds: fix call to sk_alloc Message-ID: <20080804211610.GF28069@opengridcomputing.com> The OFED-1.4 RDS calls to sk_alloc are broken, thus preventing rds from establishing a connection. sk_alloc function parameters changed between kernel versions 2.6.23 and 2.6.24. The current code erroneously checks for 2.6.26 for this change. 
The workaround code for no KERNEL_HAS_PROTO_REGISTER lacks a function parameter (as it is #defined over the standard sk_alloc call), and that function's call to sk_alloc has parameters 3 and 4 swapped. With the patch below, RDS will start working again. Signed-Off-By: Jon Mason diff --git a/net/rds/af_rds.c b/net/rds/af_rds.c index e32f502..3b7ad33 100644 --- a/net/rds/af_rds.c +++ b/net/rds/af_rds.c @@ -399,11 +399,11 @@ static struct proto_ops rds_proto_ops = { }; #ifndef KERNEL_HAS_PROTO_REGISTER -static struct sock *sk_alloc_compat(int pf, gfp_t gfp, struct proto *prot) +static struct sock *sk_alloc_compat(int pf, gfp_t gfp, struct proto *prot, int zero_it) { struct rds_sock *rs; - sk = sk_alloc(pf, gfp, 1, NULL); + sk = sk_alloc(pf, gfp, prot, zero_it); if (sk == NULL) return NULL; @@ -455,7 +455,7 @@ static int __rds_create(struct socket *sock, struct sock *sk, int protocol) return 0; } -#if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 26) +#if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 24) static int rds_create(struct socket *sock, int protocol) { struct sock *sk; From jon at opengridcomputing.com Mon Aug 4 16:13:11 2008 From: jon at opengridcomputing.com (Jon Mason) Date: Mon, 4 Aug 2008 18:13:11 -0500 Subject: [ofa-general] RDS iWARP bcopy support Message-ID: <20080804231311.GH28069@opengridcomputing.com> Hey Andy, After looking at the OFED-1.4 kernel git tree, I noticed that the RDS iWARP bcopy patches were not included. Could you please submit them for inclusion in the next OFED 1.4 release? The relevant patches in Olaf's future-20080715 git tree are: commit 215108c9db89901968be501b0fc2a2e1c5773197 Author: Jon Mason Date: Tue Jul 15 00:02:00 2008 -0700 RDS: add iWARP bcopy support This patch adds partial iWARP support to RDS. It covers bcopy mode only; RDMA will be added in a later patch.
Signed-off-by: Jon Mason Signed-off-by: Olaf Kirch commit ff1d3f1ae3865365f702f8f30200744f166c7029 Author: Olaf Kirch Date: Tue Jul 15 00:01:56 2008 -0700 RDS: Move IB RNR tuning to a function This patch moves the IB RNR tuning code out of rds_ib_connect_complete into a function of its own. Signed-off-by: Olaf Kirch Thanks, Jon From jon at opengridcomputing.com Mon Aug 4 16:22:18 2008 From: jon at opengridcomputing.com (Jon Mason) Date: Mon, 4 Aug 2008 18:22:18 -0500 Subject: [ofa-general] [PATCH] rds: support for IB_DEVICE_LOCAL_DMA_LKEY (resend) Message-ID: <20080804232218.GI28069@opengridcomputing.com> This is a resend of the patch based on the ofed-1.4 2.6.27-rc1 git tree. Please apply on top of the previous patches sent out today. For iWARP, there is a limitation where syncs to remote memory need write permission. By allowing remote write, there is a potential security risk where all memory is available to remote clients. Using the local_dma_lkey removes the necessity of remote write permission on local memory regions. The patch below converts the usage of dma_mr's to dma_local_lkey and removes the allocation of dma_mr's (if IB_DEVICE_LOCAL_DMA_LKEY is supported). Also, Chelsio has a limitation of not being able to access DMA MR regions that reside in memory above 4GB. So with this patch, rds bcopy will work on systems with more than 4GB of RAM. For IB, using local_dma_lkey removes the need for DMA MR allocations (presuming that the driver supports IB_DEVICE_LOCAL_DMA_LKEY).
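As a rough illustration of the selection rule described above, and not taken from the patch itself, the choice between the device-wide local DMA lkey and a per-device DMA MR lkey can be sketched in plain userspace C. Every name here (mock_device, mock_mr, MOCK_DEVICE_LOCAL_DMA_LKEY, pick_lkey) is a hypothetical stand-in; the real capability flag and structures live in the kernel's RDMA headers and are not reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in capability bit for illustration only; this is NOT the real
 * value of IB_DEVICE_LOCAL_DMA_LKEY. */
#define MOCK_DEVICE_LOCAL_DMA_LKEY (1u << 0)

struct mock_mr {
	uint32_t lkey;            /* lkey of an explicitly allocated DMA MR */
};

struct mock_device {
	uint32_t cap_flags;       /* device capability bits */
	uint32_t local_dma_lkey;  /* device-wide lkey, valid only if the bit is set */
};

/* Prefer the device-wide local DMA lkey when the device advertises it,
 * so no DMA MR is needed; otherwise fall back to the MR's lkey. */
static uint32_t pick_lkey(const struct mock_device *dev, const struct mock_mr *mr)
{
	assert(dev != NULL);
	if (dev->cap_flags & MOCK_DEVICE_LOCAL_DMA_LKEY)
		return dev->local_dma_lkey;
	assert(mr != NULL);   /* without the capability an MR must exist */
	return mr->lkey;
}
```

The point of the sketch is that when the capability is present the MR pointer may legitimately be NULL, which matches the description's claim that the DMA MR allocation can be skipped in that case.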
Signed-Off-By: Jon Mason diff --git a/net/rds/ib.c b/net/rds/ib.c index 6c5328f..775a41e 100644 --- a/net/rds/ib.c +++ b/net/rds/ib.c @@ -79,6 +79,7 @@ void rds_ib_add_one(struct ib_device *device) spin_lock_init(&rds_ibdev->spinlock); + rds_ibdev->dma_local_lkey = !!(dev_attr->device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY); rds_ibdev->max_wrs = dev_attr->max_qp_wr; rds_ibdev->max_sge = min(dev_attr->max_sge, RDS_IB_MAX_SGE); @@ -95,18 +96,21 @@ void rds_ib_add_one(struct ib_device *device) if (IS_ERR(rds_ibdev->pd)) goto free_dev; - if (device->node_type != RDMA_NODE_RNIC) { - rds_ibdev->mr = ib_get_dma_mr(rds_ibdev->pd, - IB_ACCESS_LOCAL_WRITE); - } else { - /* Why does it have to have these permissions? */ - rds_ibdev->mr = ib_get_dma_mr(rds_ibdev->pd, - IB_ACCESS_REMOTE_READ | - IB_ACCESS_REMOTE_WRITE | - IB_ACCESS_LOCAL_WRITE); - } - if (IS_ERR(rds_ibdev->mr)) - goto err_pd; + if (!rds_ibdev->dma_local_lkey) { + if (device->node_type != RDMA_NODE_RNIC) { + rds_ibdev->mr = ib_get_dma_mr(rds_ibdev->pd, + IB_ACCESS_LOCAL_WRITE); + } else { + /* Why does it have to have these permissions? 
*/ + rds_ibdev->mr = ib_get_dma_mr(rds_ibdev->pd, + IB_ACCESS_REMOTE_READ | + IB_ACCESS_REMOTE_WRITE | + IB_ACCESS_LOCAL_WRITE); + } + if (IS_ERR(rds_ibdev->mr)) + goto err_pd; + } else + rds_ibdev->mr = NULL; rds_ibdev->mr_pool = rds_ib_create_mr_pool(rds_ibdev); if (IS_ERR(rds_ibdev->mr_pool)) { @@ -122,7 +126,8 @@ void rds_ib_add_one(struct ib_device *device) goto free_attr; err_mr: - ib_dereg_mr(rds_ibdev->mr); + if (!rds_ibdev->dma_local_lkey) + ib_dereg_mr(rds_ibdev->mr); err_pd: ib_dealloc_pd(rds_ibdev->pd); free_dev: @@ -148,7 +153,9 @@ void rds_ib_remove_one(struct ib_device *device) if (rds_ibdev->mr_pool) rds_ib_destroy_mr_pool(rds_ibdev->mr_pool); - ib_dereg_mr(rds_ibdev->mr); + if (rds_ibdev->mr) + ib_dereg_mr(rds_ibdev->mr); + ib_dealloc_pd(rds_ibdev->pd); list_del(&rds_ibdev->list); diff --git a/net/rds/ib.h b/net/rds/ib.h index 2a0682f..d4e19bd 100644 --- a/net/rds/ib.h +++ b/net/rds/ib.h @@ -125,7 +125,8 @@ struct rds_ib_connection { /* Protocol version specific information */ unsigned int i_flowctl : 1, /* enable/disable flow ctl */ i_iwarp : 1, /* this is actually iWARP not IB */ - i_fastreg : 1; /* device supports fastreg */ + i_fastreg : 1, /* device supports fastreg */ + i_dma_local_lkey : 1; /* Batched completions */ unsigned int i_unsignaled_wrs; @@ -157,7 +158,8 @@ struct rds_ib_device { unsigned int max_fmrs; int max_sge; unsigned int max_wrs; - unsigned int use_fastreg : 1; + unsigned int use_fastreg : 1, + dma_local_lkey : 1; spinlock_t spinlock; }; @@ -232,6 +234,10 @@ static void inline rds_ib_dma_sync_sg_for_device(struct ib_device *dev, } #define ib_dma_sync_sg_for_device rds_ib_dma_sync_sg_for_device +static inline u32 rds_ib_local_dma_lkey(struct rds_ib_connection *ic) +{ + return (ic->i_dma_local_lkey ? 
ic->i_cm_id->device->local_dma_lkey : ic->i_mr->lkey); +} /* ib.c */ extern struct rds_transport rds_ib_transport; diff --git a/net/rds/ib_cm.c b/net/rds/ib_cm.c index 6e9db6a..c413b6d 100644 --- a/net/rds/ib_cm.c +++ b/net/rds/ib_cm.c @@ -458,6 +458,7 @@ static int rds_ib_cm_handle_connect(struct rdma_cm_id *cm_id, /* Remember whether this is IB or iWARP */ ic->i_iwarp = (cm_id->device->node_type == RDMA_NODE_RNIC); ic->i_fastreg = rds_ibdev->use_fastreg; + ic->i_dma_local_lkey = rds_ibdev->dma_local_lkey; /* We got halfway through setting up the ib_connection, if we * fail now, we have to take the long route out of this mess. */ @@ -613,6 +614,7 @@ out: int rds_ib_conn_connect(struct rds_connection *conn) { struct rds_ib_connection *ic = conn->c_transport_data; + struct rds_ib_device *rds_ibdev; struct sockaddr_in src, dest; int ret; @@ -640,8 +642,12 @@ int rds_ib_conn_connect(struct rds_connection *conn) goto out; } + rds_ibdev = ib_get_client_data(ic->i_cm_id->device, &rds_ib_client); + /* Now check the device type and set i_iwarp */ ic->i_iwarp = (ic->i_cm_id->device->node_type == RDMA_NODE_RNIC); + ic->i_fastreg = rds_ibdev->use_fastreg; + ic->i_dma_local_lkey = rds_ibdev->dma_local_lkey; dest.sin_family = AF_INET; dest.sin_addr.s_addr = (__force u32)conn->c_faddr; diff --git a/net/rds/ib_recv.c b/net/rds/ib_recv.c index 6b3b476..5c69d92 100644 --- a/net/rds/ib_recv.c +++ b/net/rds/ib_recv.c @@ -97,12 +97,12 @@ void rds_ib_recv_init_ring(struct rds_ib_connection *ic) sge = rds_ib_data_sge(ic, recv->r_sge); sge->addr = 0; sge->length = RDS_FRAG_SIZE; - sge->lkey = ic->i_mr->lkey; + sge->lkey = rds_ib_local_dma_lkey(ic); sge = rds_ib_header_sge(ic, recv->r_sge); sge->addr = ic->i_recv_hdrs_dma + (i * sizeof(struct rds_header)); sge->length = sizeof(struct rds_header); - sge->lkey = ic->i_mr->lkey; + sge->lkey = rds_ib_local_dma_lkey(ic); } } @@ -364,7 +364,7 @@ void rds_ib_recv_init_ack(struct rds_ib_connection *ic) sge->addr = ic->i_ack_dma; sge->length = 
sizeof(struct rds_header); - sge->lkey = ic->i_mr->lkey; + sge->lkey = rds_ib_local_dma_lkey(ic); wr->sg_list = sge; wr->num_sge = 1; diff --git a/net/rds/ib_send.c b/net/rds/ib_send.c index 865301a..1b51526 100644 --- a/net/rds/ib_send.c +++ b/net/rds/ib_send.c @@ -144,12 +144,12 @@ void rds_ib_send_init_ring(struct rds_ib_connection *ic) send->s_wr.ex.imm_data = 0; sge = rds_ib_data_sge(ic, send->s_sge); - sge->lkey = ic->i_mr->lkey; + sge->lkey = rds_ib_local_dma_lkey(ic); sge = rds_ib_header_sge(ic, send->s_sge); sge->addr = ic->i_send_hdrs_dma + (i * sizeof(struct rds_header)); sge->length = sizeof(struct rds_header); - sge->lkey = ic->i_mr->lkey; + sge->lkey = rds_ib_local_dma_lkey(ic); } } @@ -425,7 +425,7 @@ rds_ib_xmit_populate_wr(struct rds_ib_connection *ic, sge = rds_ib_data_sge(ic, send->s_sge); sge->addr = buffer; sge->length = length; - sge->lkey = ic->i_mr->lkey; + sge->lkey = rds_ib_local_dma_lkey(ic); sge = rds_ib_header_sge(ic, send->s_sge); } else { @@ -437,7 +437,7 @@ rds_ib_xmit_populate_wr(struct rds_ib_connection *ic, sge->addr = ic->i_send_hdrs_dma + (pos * sizeof(struct rds_header)); sge->length = sizeof(struct rds_header); - sge->lkey = ic->i_mr->lkey; + sge->lkey = rds_ib_local_dma_lkey(ic); } /* @@ -791,7 +791,7 @@ int rds_ib_xmit_rdma(struct rds_connection *conn, struct rds_rdma_op *op) len = sg_dma_len(scat); send->s_sge[j].addr = sg_dma_address(scat); send->s_sge[j].length = len; - send->s_sge[j].lkey = ic->i_mr->lkey; + send->s_sge[j].lkey = rds_ib_local_dma_lkey(ic); sent += len; rdsdebug("ic %p sent %d remote_addr %llu\n", ic, sent, remote_addr); From ralph.campbell at qlogic.com Mon Aug 4 16:59:11 2008 From: ralph.campbell at qlogic.com (Ralph Campbell) Date: Mon, 04 Aug 2008 16:59:11 -0700 Subject: [ofa-general] Re: [ewg] [PATCH] IB/core: Add support for Receive Core Affinity In-Reply-To: <20080715161331.GA8667@mtls03> References: <20080715161331.GA8667@mtls03> Message-ID: <1217894351.620.244.camel@chromite.mv.qlogic.com> On 
Tue, 2008-07-15 at 19:13 +0300, Eli Cohen wrote: > Add the capability flag IB_DEVICE_IPOIB_RCA to denote devices which > support distribution of received packets to multiple receive queues. > This results in better utilization of the system CPU cores by > distributing interrupt handling between the cores. The patch adds a > new verb, ib_create_qp_range(), to create a range of QPs with > specific alignment requirements that should be used by a consumer > for the different receive queues. > > Signed-off-by: Eli Cohen > --- > drivers/infiniband/core/verbs.c | 39 ++++++++++++++++++++++++++++++++++++++- > include/rdma/ib_verbs.h | 30 +++++++++++++++++++++++++++++- > 2 files changed, 67 insertions(+), 2 deletions(-) > > diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c > index a7da9be..871fb1e 100644 > --- a/drivers/infiniband/core/verbs.c > +++ b/drivers/infiniband/core/verbs.c > @@ -280,6 +280,39 @@ EXPORT_SYMBOL(ib_destroy_srq); > > /* Queue pairs */ > > +int ib_create_qp_range(struct ib_pd *pd, struct ib_qp_init_attr *qp_init_attr, > + int nqps, int align, struct ib_qp *list[]) It just seems wrong to me to require the caller to specify the alignment restrictions. Isn't this HCA specific? Is IPoIB really going to know whether or not the QP numbers returned by this call are "aligned" or not? What if I call ib_create_qp_range() with nqps=3 and align=0? Also, in ib_verbs.h, struct ib_qp_attr now has a struct rca_attr field. I don't see why the struct rca_attr field is needed for ib_modify_qp(). It seems to me that this information should be stored as part of the QP info when creating the N QPs. Why should the verbs caller need to know about this? The values are determined by the HCA when the QPs are created.

From eli at mellanox.co.il Mon Aug 4 23:10:44 2008
From: eli at mellanox.co.il (Eli Cohen)
Date: Tue, 05 Aug 2008 09:10:44 +0300
Subject: [ofa-general] Re: [ewg] [PATCH] IB/core: Add support for Receive Core Affinity
In-Reply-To: <1217894351.620.244.camel@chromite.mv.qlogic.com>
References: <20080715161331.GA8667@mtls03> <1217894351.620.244.camel@chromite.mv.qlogic.com>
Message-ID: <1217916644.13782.8.camel@mtls03>

> > +int ib_create_qp_range(struct ib_pd *pd, struct ib_qp_init_attr *qp_init_attr,
> > +                       int nqps, int align, struct ib_qp *list[])
>
> It just seems wrong to me to require the caller to specify the alignment
> restrictions. Isn't this HCA specific?
I agree with you about this, but since my previous posts on the issue
did not receive much attention, I did not want to change my
implementation before the issue has been discussed.

> Is IPoIB really going to know
> whether or not the QP numbers returned by this call are "aligned"
> or not? What if I call ib_create_qp_range() with nqps=3 and align=0?
I am not sure I understand your argument here: in this case you create 3
consecutive QPs with no other restrictions on the number of the
first.

>
> Also, in ib_verbs.h, struct ib_qp_attr now has a struct rca_attr field.
> I don't see why the struct rca_attr field is needed for ib_modify_qp().
> It seems to me that this information should be stored as part of the
> QP info when creating the N QPs. Why should the verbs caller need to
> know about this? The values are determined by the HCA when the QPs
> are created.
>
I totally agree with you here too and I also sent an email about that
some time ago. Again, I want to trigger a discussion to close on the API
before re-implementing. I hope to get more opinions and then re-work the
patch.
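
The direction both messages lean toward, where the driver rather than the caller picks the alignment, might look roughly like the user-space sketch below. Everything in it is an assumption made for illustration: the helper names are hypothetical, `align` is taken to mean "the first QP number must be a multiple of this value" (with 0 meaning no restriction, as in the nqps=3, align=0 case discussed above), and the power-of-two rule merely stands in for whatever a particular HCA actually requires. It is not the code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest power of two >= n, for n > 0. Stand-in for an HCA-specific rule. */
static uint32_t roundup_pow2(uint32_t n)
{
	uint32_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Hypothetical helper: first QP number >= next_free that satisfies 'align'.
 * align == 0 (or 1) means "no restriction on the first QP number".
 */
uint32_t qpn_range_base(uint32_t next_free, uint32_t align)
{
	if (align <= 1)
		return next_free;
	return (next_free + align - 1) / align * align;
}

/*
 * Hypothetical driver-side variant: the caller passes only the number of
 * QPs it wants; the alignment is derived internally and never exposed.
 */
uint32_t qpn_range_base_for_rca(uint32_t next_free, uint32_t nqps)
{
	/* QP numbers are 24 bits wide, so larger counts make no sense. */
	assert(nqps > 0 && nqps <= (1u << 24));
	return qpn_range_base(next_free, roundup_pow2(nqps));
}
```

Under these assumptions, a request for 3 QPs when the next free number is 0x45 would start at 0x48, and the consumer (IPoIB, for instance) would not need to know why.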
From vlad at mellanox.co.il Mon Aug 4 23:17:41 2008 From: vlad at mellanox.co.il (Vladimir Sokolovsky) Date: Tue, 05 Aug 2008 09:17:41 +0300 Subject: [ofa-general] [PATCH] rds: fix compile breakage on ofed_2_6_27 tree In-Reply-To: <20080804192848.GD28069@opengridcomputing.com> References: <20080804192848.GD28069@opengridcomputing.com> Message-ID: <4897F085.7040303@mellanox.co.il> Jon Mason wrote: > RDS does not compile on 2.6.25 and 2.6.27 kernels due to a broken > reference to a recently modified data struct. The patch below modifies > the reference to point to the new location. > > Signed-Off-By: Jon Mason > > diff --git a/net/rds/ib_recv.c b/net/rds/ib_recv.c > index 6b3b476..9f72556 100644 > --- a/net/rds/ib_recv.c > +++ b/net/rds/ib_recv.c > @@ -796,7 +796,7 @@ void rds_ib_recv_cq_comp_handler(struct ib_cq *cq, void *context) > while (ib_poll_cq(cq, 1, &wc) > 0) { > rdsdebug("wc wr_id 0x%llx status %u byte_len %u imm_data %u\n", > (unsigned long long)wc.wr_id, wc.status, wc.byte_len, > - be32_to_cpu(wc.imm_data)); > + be32_to_cpu(wc.ex.imm_data)); > rds_ib_stats_inc(s_ib_rx_cq_event); > > recv = &ic->i_recvs[rds_ib_ring_oldest(&ic->i_recv_ring)]; > diff --git a/net/rds/ib_send.c b/net/rds/ib_send.c > index 865301a..43d3faa 100644 > --- a/net/rds/ib_send.c > +++ b/net/rds/ib_send.c > @@ -195,7 +195,7 @@ void rds_ib_send_cq_comp_handler(struct ib_cq *cq, void *context) > while (ib_poll_cq(cq, 1, &wc) > 0 ) { > rdsdebug("wc wr_id 0x%llx status %u byte_len %u imm_data %u\n", > (unsigned long long)wc.wr_id, wc.status, wc.byte_len, > - be32_to_cpu(wc.imm_data)); > + be32_to_cpu(wc.ex.imm_data)); > rds_ib_stats_inc(s_ib_tx_cq_event); > > if (wc.wr_id == RDS_IB_ACK_WR_ID) { Hi, I applied this patch to git://git.openfabrics.org/ofed_1_4/linux-2.6.git ofed_2_6_27 Thanks, Vladimir From devesh28 at gmail.com Mon Aug 4 23:25:15 2008 From: devesh28 at gmail.com (Devesh Sharma) Date: Tue, 5 Aug 2008 11:55:15 +0530 Subject: ***SPAM*** Re: [ofa-general] ***SPAM*** OFED-1.3 
RDMA CM, IB_ACCESS_LOCAL_WRITE flag missing
In-Reply-To: <2f3bf9a60808030156p26c0e292ge0e212e887b8fbf@mail.gmail.com>
References: <309a667c0808010403r4036bc51u8a167954a6fe9739@mail.gmail.com> <000001c8f3f4$d8f0ab00$bb37170a@amr.corp.intel.com> <309a667c0808020302h6e340c88xc662064c400795d3@mail.gmail.com> <2f3bf9a60808030156p26c0e292ge0e212e887b8fbf@mail.gmail.com>
Message-ID: <309a667c0808042325l55f08601ya9b7300ca89e7f1e@mail.gmail.com>

Yes, you are right. I misunderstood it; corrected now. Thanks for replying.

On Sun, Aug 3, 2008 at 2:26 PM, Dotan Barak wrote:
> On Sat, Aug 2, 2008 at 1:02 PM, Devesh Sharma wrote:
> > Thanks for replying,
> > Can you explain this to me in a bit more detail? If the QP does not have
> > IB_ACCESS_LOCAL_WRITE permission then, according to the IB spec, the HCA
> > should generate a Local Protection Error while processing the WRs. Is it
> > assumed that the mthca driver (or some other provider driver) will set
> > IB_ACCESS_LOCAL_WRITE by itself, even if it's not requested?
> The protection flag in the QP attributes only specifies which
> incoming remote operations are supported (Read/Write/Atomic).
>
> IB_ACCESS_LOCAL_WRITE is enabled (or not) in the Memory Region.
>
> Dotan
>
> >
> > On Fri, Aug 1, 2008 at 10:07 PM, Sean Hefty wrote:
> >>
> >> > while creating QP using rdma_create_qp(), I am not seeing anywhere
> >> > it is setting IB_ACCESS_LOCAL_WRITE flag other than for IW QPs. Is
> >> > it for some specific reason or is it just a mistake?
> >>
> >> Commit 1ca8d15619f725e223c19137350b0336b9196193 (dated July 22nd) removed
> >> this for iWarp QPs. The qp_access_flags is only used to set remote
> >> permissions, so it should not be being set.
> >>
> >> - Sean
> >>
> >
> >
> > _______________________________________________
> > general mailing list
> > general at lists.openfabrics.org
> > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> >
> > To unsubscribe, please visit
> > http://openib.org/mailman/listinfo/openib-general
> >

From iuzzolin at nmia.com Tue Aug 5 01:29:50 2008
From: iuzzolin at nmia.com (Harold/Carlyn Iuzzolino)
Date: Tue, 5 Aug 2008 02:29:50 -0600
Subject: [ofa-general] NULL isn't defined in dat_strerror.c
Message-ID: <200808050829.m758Toa15472@gandalf.iuzzolino.com>

Dear Openfabrics, general at lists.openfabrics.org

Summary of bug: In subroutine dat_strerror.c NULL isn't defined.

[root at treebeard BUILD]# uname -a
Linux treebeard 2.6.25-14.fc9.x86_64 #1 SMP Thu May 1 06:06:21 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux

In /tmp/OFED.16989.logs/compat-dapl.rpmbuild.log, compilation stops with the following complaint:

gcc -DHAVE_CONFIG_H -I. -I. -I. -Wall -g -D_GNU_SOURCE -DOS_RELEASE=131078 -I./dat/include/ -I./dat/udat/ -I./dat/udat/linux -I./dat/common/ -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -MT dat_udat_libdat_la-dat_strerror.lo -MD -MP -MF .deps/dat_udat_libdat_la-dat_strerror.Tpo -c dat/common/dat_strerror.c -fPIC -DPIC -o .libs/dat_udat_libdat_la-dat_strerror.o
dat/common/dat_strerror.c: In function 'dat_strerror':
dat/common/dat_strerror.c:621: error: 'NULL' undeclared (first use in this function)
dat/common/dat_strerror.c:621: error: (Each undeclared identifier is reported only once
dat/common/dat_strerror.c:621: error: for each function it appears in.)
make[2]: *** [dat_udat_libdat_la-dat_strerror.lo] Error 1

In the directory /var/tmp/OFED_topdir/BUILD/compat-dapl-1.2.8/dat/common/
The following files use NULL.
[root at treebeard common]# grep -l NULL *c dat_api.c dat_dictionary.c dat_dr.c dat_sr.c dat_strerror.c Here are the include files used in those subroutines: [root at treebeard common]# grep include *c dat_api.c:#include "dat_osd.h" dat_api.c:#include "dat_init.h" dat_api.c:#include dat_dictionary.c:#include "dat_dictionary.h" dat_dr.c:#include "dat_dr.h" dat_dr.c:#include "dat_dictionary.h" dat_sr.c:#include "dat_sr.h" dat_sr.c:#include "dat_dictionary.h" dat_sr.c:#include "udat_sr_parser.h" dat_strerror.c:#include dat_strerror.c:#include Which of those .h files do you need to include in dat_strerror.c so that NULL gets defined? Carlyn Iuzzolino From vlad at mellanox.co.il Tue Aug 5 01:28:21 2008 From: vlad at mellanox.co.il (Vladimir Sokolovsky) Date: Tue, 05 Aug 2008 11:28:21 +0300 Subject: [ofa-general] [PATCH] rds: fix call to sk_alloc In-Reply-To: <20080804211610.GF28069@opengridcomputing.com> References: <20080804211610.GF28069@opengridcomputing.com> Message-ID: <48980F25.8020605@mellanox.co.il> Jon Mason wrote: > The OFED-1.4 RDS calls to sk_alloc are broken, thus preventing rds from > establishing a connection. sk_alloc function parameters changed between > kernel versions 2.6.23 and 2.6.24. The current code erroneously checks > for 2.6.26 for this change. The workaround code for no > KERNEL_HAS_PROTO_REGISTER lacks a function parameter (as it is #defined > over the standard sk_alloc call), and that functions call to sk_alloc > has parameters 3 and 4 swapped. > > With the patch below, RDS will start working again. 
> > Signed-Off-By: Jon Mason > > diff --git a/net/rds/af_rds.c b/net/rds/af_rds.c > index e32f502..3b7ad33 100644 > --- a/net/rds/af_rds.c > +++ b/net/rds/af_rds.c > @@ -399,11 +399,11 @@ static struct proto_ops rds_proto_ops = { > }; > > #ifndef KERNEL_HAS_PROTO_REGISTER > -static struct sock *sk_alloc_compat(int pf, gfp_t gfp, struct proto *prot) > +static struct sock *sk_alloc_compat(int pf, gfp_t gfp, struct proto *prot, int zero_it) > { > struct rds_sock *rs; > > - sk = sk_alloc(pf, gfp, 1, NULL); > + sk = sk_alloc(pf, gfp, prot, zero_it); > if (sk == NULL) > return NULL; > > @@ -455,7 +455,7 @@ static int __rds_create(struct socket *sock, struct sock *sk, int protocol) > return 0; > } > > -#if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 26) > +#if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 24) > static int rds_create(struct socket *sock, int protocol) > { > struct sock *sk; Applied to git://git.openfabrics.org/ofed_1_4/linux-2.6.git ofed_2_6_27 Thanks, Vladimir From vlad at lists.openfabrics.org Tue Aug 5 02:55:50 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Tue, 5 Aug 2008 02:55:50 -0700 (PDT) Subject: [ofa-general] ofa_1_4_kernel 20080805-0200 daily build status Message-ID: <20080805095550.5CA7AE60C93@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_4/linux-2.6.git git_branch: ofed_kernel Common build parameters: Passed: Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.16 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.17 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.18-8.el5 Passed on x86_64 
with linux-2.6.18-53.el5
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.18-93.el5
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on x86_64 with linux-2.6.24
Passed on x86_64 with linux-2.6.25
Passed on x86_64 with linux-2.6.26
Passed on x86_64 with linux-2.6.9-67.ELsmp
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on ia64 with linux-2.6.16
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on ia64 with linux-2.6.18
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.17
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.22
Passed on ia64 with linux-2.6.26
Passed on ia64 with linux-2.6.24
Passed on ia64 with linux-2.6.25
Passed on ppc64 with linux-2.6.16
Passed on ppc64 with linux-2.6.17
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.18-8.el5
Passed on ppc64 with linux-2.6.24
Passed on ppc64 with linux-2.6.19

Failed:

From alekseys at voltaire.com Tue Aug 5 04:14:47 2008
From: alekseys at voltaire.com (Aleksey Senin)
Date: Tue, 05 Aug 2008 14:14:47 +0300
Subject: [ofa-general] [RFC PATCH] IPv6 preparation by using sockaddr in functions call
Message-ID: <1217934887.8803.16.camel@linux-zn6t.site>

In order to prepare the RDMA CM to work with IPv6, these functions are
changed to take a struct sockaddr * argument instead of a
struct sockaddr_in *:
    addr_resolve_remote
    addr_resolve_local
    rdma_resolve_ip
The changes in process_req are a side effect of the modifications to the
functions above.

I would like to get comments on how to implement address resolution.
Should I extend the existing functions (addr_resolve_local, etc.) and
make all the changes inside them, or rename the existing ones to
addr4_resolve_XXX and write a wrapper for them that inspects sa_family
and calls the proper function?
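
The second option in the question (per-family helpers behind a wrapper that dispatches on sa_family) can be sketched in user space as below. This is only an illustration under stated assumptions: the helper names follow the addr4_resolve_XXX naming suggested above but are hypothetical, their bodies are placeholders rather than real resolution code, and the return values 4 and 6 exist only to make the dispatch observable.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Placeholder for the existing IPv4 path (hypothetical name). */
static int addr4_resolve_local_sketch(const struct sockaddr_in *dst)
{
	(void)dst;
	return 4;
}

/* Placeholder for a new IPv6 path (hypothetical name). */
static int addr6_resolve_local_sketch(const struct sockaddr_in6 *dst)
{
	(void)dst;
	return 6;
}

/*
 * Wrapper: callers always pass a generic struct sockaddr, and the
 * address family decides which helper handles it.
 */
int addr_resolve_local_sketch(const struct sockaddr *dst)
{
	assert(dst != NULL);

	switch (dst->sa_family) {
	case AF_INET:
		return addr4_resolve_local_sketch((const struct sockaddr_in *)dst);
	case AF_INET6:
		return addr6_resolve_local_sketch((const struct sockaddr_in6 *)dst);
	default:
		/* Unknown family: report it instead of guessing. */
		return -EAFNOSUPPORT;
	}
}
```

One argument for this shape is that the casts to the family-specific structures happen in exactly one place, instead of being repeated at every field access.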
Signed-off-by: Aleksey Senin --- drivers/infiniband/core/addr.c | 46 ++++++++++++++++++++-------------------- 1 files changed, 23 insertions(+), 23 deletions(-) diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c index c5b623b..b59ad53 100644 --- a/drivers/infiniband/core/addr.c +++ b/drivers/infiniband/core/addr.c @@ -171,12 +171,12 @@ static void addr_send_arp(struct sockaddr_in *dst_in) ip_rt_put(rt); } -static int addr_resolve_remote(struct sockaddr_in *src_in, - struct sockaddr_in *dst_in, +static int addr_resolve_remote(struct sockaddr *src_in, + struct sockaddr *dst_in, struct rdma_dev_addr *addr) { - __be32 src_ip = src_in->sin_addr.s_addr; - __be32 dst_ip = dst_in->sin_addr.s_addr; + __be32 src_ip = ((struct sockaddr_in *)src_in)->sin_addr.s_addr; + __be32 dst_ip = ((struct sockaddr_in *)dst_in)->sin_addr.s_addr; struct flowi fl; struct rtable *rt; struct neighbour *neigh; @@ -207,8 +207,8 @@ static int addr_resolve_remote(struct sockaddr_in *src_in, } if (!src_ip) { - src_in->sin_family = dst_in->sin_family; - src_in->sin_addr.s_addr = rt->rt_src; + src_in->sa_family = dst_in->sa_family; + ((struct sockaddr_in *)src_in)->sin_addr.s_addr = rt->rt_src; } ret = rdma_copy_addr(addr, neigh->dev, neigh->ha); @@ -223,7 +223,7 @@ out: static void process_req(struct work_struct *work) { struct addr_req *req, *temp_req; - struct sockaddr_in *src_in, *dst_in; + struct sockaddr *src_in, *dst_in; struct list_head done_list; INIT_LIST_HEAD(&done_list); @@ -231,8 +231,8 @@ static void process_req(struct work_struct *work) mutex_lock(&lock); list_for_each_entry_safe(req, temp_req, &req_list, list) { if (req->status == -ENODATA) { - src_in = (struct sockaddr_in *) &req->src_addr; - dst_in = (struct sockaddr_in *) &req->dst_addr; + src_in = (struct sockaddr *) &req->src_addr; + dst_in = (struct sockaddr *) &req->dst_addr; req->status = addr_resolve_remote(src_in, dst_in, req->addr); if (req->status && time_after_eq(jiffies, req->timeout)) @@ 
-251,20 +251,20 @@ static void process_req(struct work_struct *work) list_for_each_entry_safe(req, temp_req, &done_list, list) { list_del(&req->list); - req->callback(req->status, &req->src_addr, req->addr, - req->context); + req->callback(req->status, (struct sockaddr *) &req->src_addr, \ + req->addr, req->context); put_client(req->client); kfree(req); } } -static int addr_resolve_local(struct sockaddr_in *src_in, - struct sockaddr_in *dst_in, +static int addr_resolve_local(struct sockaddr *src_in, + struct sockaddr *dst_in, struct rdma_dev_addr *addr) { struct net_device *dev; - __be32 src_ip = src_in->sin_addr.s_addr; - __be32 dst_ip = dst_in->sin_addr.s_addr; + __be32 src_ip = ((struct sockaddr_in *)src_in)->sin_addr.s_addr; + __be32 dst_ip = ((struct sockaddr_in *)dst_in)->sin_addr.s_addr; int ret; dev = ip_dev_find(&init_net, dst_ip); @@ -272,15 +272,15 @@ static int addr_resolve_local(struct sockaddr_in *src_in, return -EADDRNOTAVAIL; if (ipv4_is_zeronet(src_ip)) { - src_in->sin_family = dst_in->sin_family; - src_in->sin_addr.s_addr = dst_ip; + src_in->sa_family = dst_in->sa_family; + ((struct sockaddr_in *)src_in)->sin_addr.s_addr = dst_ip; ret = rdma_copy_addr(addr, dev, dev->dev_addr); } else if (ipv4_is_loopback(src_ip)) { - ret = rdma_translate_ip((struct sockaddr *)dst_in, addr); + ret = rdma_translate_ip(dst_in, addr); if (!ret) memcpy(addr->dst_dev_addr, dev->dev_addr, MAX_ADDR_LEN); } else { - ret = rdma_translate_ip((struct sockaddr *)src_in, addr); + ret = rdma_translate_ip(src_in, addr); if (!ret) memcpy(addr->dst_dev_addr, dev->dev_addr, MAX_ADDR_LEN); } @@ -296,7 +296,7 @@ int rdma_resolve_ip(struct rdma_addr_client *client, struct rdma_dev_addr *addr, void *context), void *context) { - struct sockaddr_in *src_in, *dst_in; + struct sockaddr *src_in, *dst_in; struct addr_req *req; int ret = 0; @@ -313,8 +313,8 @@ int rdma_resolve_ip(struct rdma_addr_client *client, req->client = client; atomic_inc(&client->refcount); - src_in = (struct 
sockaddr_in *) &req->src_addr;
-        dst_in = (struct sockaddr_in *) &req->dst_addr;
+        src_in = (struct sockaddr *) &req->src_addr;
+        dst_in = (struct sockaddr *) &req->dst_addr;
         req->status = addr_resolve_local(src_in, dst_in, addr);
         if (req->status == -EADDRNOTAVAIL)
@@ -328,7 +328,7 @@ int rdma_resolve_ip(struct rdma_addr_client *client,
         case -ENODATA:
                 req->timeout = msecs_to_jiffies(timeout_ms) + jiffies;
                 queue_req(req);
-                addr_send_arp(dst_in);
+                addr_send_arp((struct sockaddr_in *)dst_in);
                 break;
         default:
                 ret = req->status;
-- 
1.5.6.dirty

From mar at pism.pl Tue Aug 5 06:33:10 2008
From: mar at pism.pl (mar at pism.pl)
Date: Tue, 5 Aug 2008 15:33:10 +0200 (CEST)
Subject: [ofa-general] 1.3.1 - building with no installation
Message-ID: <24590.195.224.154.166.1217943190.squirrel@poczta.pism.pl>

Hi all,

I'm struggling to build ofed-1.3.1 rpm's on a machine having another ofed
version installed (and in use). install.sh insists like crazy on wiping
all packages first and I can't find any workaround for that. Is there any
known?

What I need to do is to build packages on a build host and install them on
a bunch of separate hosts. Build host should not be touched in any way -
rpm's not deleted, updated, installed etc. Also building without root
privileges would be welcomed warmly. That's a bit crazy that, to get these
rpm's built the proper way, I also get the build machine devastated as a bonus.

What happened with the old, good build.sh script? Any advice?

Regards,

Marcin Mogielnicki

From kovlensky at interia.pl Tue Aug 5 06:58:34 2008
From: kovlensky at interia.pl (kovlensky at interia.pl)
Date: 05 Aug 2008 15:58:34 +0200
Subject: [ofa-general] ***SPAM*** what are inmpications of opensmd going down
Message-ID: <20080805135835.CA99DE435EB@f11.poczta.interia.pl>

Hi all,

I'm trying to find any information on opensmd disappearing from the network.

Let's assume we've got a fully running and operational network, with lids assigned.
My understanding is that opensmd does nothing then and hca's with lids assigned can happily survive without it?

In other words - having opensmd going down will stop new nodes from joining the network (ports will stay in init state), but will not disrupt the ones which were up?

Also opensmd, when configured to not reassign lids, will not break it when going back up, as it will preserve current setup. Right?

Conclusion - taking opensmd down and up should not influence ib network in any way with exception of new hca's appearing.

All above comes from my logic and, as I cannot find any data to confirm that, could anybody confirm that or point at any error there? I'm considering migrating to opensm and this information is vital to me.

Thanks in advance,

Retek Kovlensky

From hal.rosenstock at gmail.com Tue Aug 5 08:06:39 2008
From: hal.rosenstock at gmail.com (Hal Rosenstock)
Date: Tue, 5 Aug 2008 11:06:39 -0400
Subject: [ofa-general] ***SPAM*** what are inmpications of opensmd going down
In-Reply-To: <20080805135835.CA99DE435EB@f11.poczta.interia.pl>
References: <20080805135835.CA99DE435EB@f11.poczta.interia.pl>
Message-ID:

Hi,

On Tue, Aug 5, 2008 at 9:58 AM, wrote:
> Hi all,
>
> I'm trying to find any information on opensmd disappearing from the network.

In general, SM should always be present.

> Let's assume we've got a fully running and operational network, with lids assigned. My understanding is that opensmd does nothing then and hca's with lids assigned can happily survive without it?

Yes and no. SA requests (like path requests and multicast) and subnet
changes will not be serviced during that time.

> In other words - having opensmd going down will stop new nodes from joining the network (ports will stay in init state), but will not disrupt the ones which were up?

Yes.

> Also opensmd, when configured to not reassign lids, will not break it when going back up, as it will preserve current setup. Right?

Yes, as long as no new nodes are present which cause a lid conflict
(e.g. subnet merge when SM was offline).

> Conclusion - taking opensmd down and up should not influence ib network in any way with exception of new hca's appearing.

Also switches or links which fail when SM is offline.

-- Hal

> All above comes from my logic and, as I cannot find any data to confirm that, could anybody confirm that or point at any error there? I'm considering migrating to opensm and this information is vital to me.
>
> Thanks in advance,
>
> Retek Kovlensky
>

From vlad at mellanox.co.il Tue Aug 5 08:37:40 2008
From: vlad at mellanox.co.il (Vladimir Sokolovsky)
Date: Tue, 05 Aug 2008 18:37:40 +0300
Subject: [ofa-general] 1.3.1 - building with no installation
In-Reply-To: <24590.195.224.154.166.1217943190.squirrel@poczta.pism.pl>
References: <24590.195.224.154.166.1217943190.squirrel@poczta.pism.pl>
Message-ID: <489873C4.5010503@mellanox.co.il>

mar at pism.pl wrote:
> Hi all,
>
> I'm struggling to build ofed-1.3.1 rpm's on a machine having another ofed
> version installed (and in use). install.sh insists like crazy on wiping
> all packages first and I can't find any workaround for that. Is there any
> known?
>
> What I need to do is to build packages on a build host and install them on
> a bunch of separate hosts. Build host should not be touched in any way -
> rpm's not deleted, updated, installed etc.
Also building without root > privileges would be welcomed warmly. That's bit crazy that, to get these > rpm's built proper way, I also get build machine devastated as a bonus. > > What happened with old, good build.sh script? Any advice? > > Regards, > > Marcin Mogielnicki > Hi Marcin, The decision to change the install and remove the option for build without install came from the distros (both Novell & Redhat) in order to have a standard RPM build and ease the distros to integrate OFED. This change was explained in the mailing list and we explained it will remove the build option. We will NOT return this back into OFED since there were many objections to it. So, there is no "build" option starting from OFED-1.3. You should first install OFED on the build host (root privileges required). This will create binary RPMs under OFED-1.3.X/RPMS. Then install OFED on other hosts. Regards, Vladimir From kliteyn at dev.mellanox.co.il Tue Aug 5 08:41:41 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Tue, 05 Aug 2008 18:41:41 +0300 Subject: [ofa-general] [PATCH] opensm/Makefile.am: Fix dependency for 'make -j2' Message-ID: <489874B5.2050607@dev.mellanox.co.il> Fix dependency for 'make -j2' - QoS parser generated files weren't specified correctly. 
Signed-off-by: Yevgeny Kliteynik --- opensm/opensm/Makefile.am | 7 ++++--- 1 files changed, 4 insertions(+), 3 deletions(-) diff --git a/opensm/opensm/Makefile.am b/opensm/opensm/Makefile.am index 76f6b23..f748024 100644 --- a/opensm/opensm/Makefile.am +++ b/opensm/opensm/Makefile.am @@ -59,13 +59,14 @@ opensm_SOURCES = main.c osm_console_io.c osm_console.c osm_db_files.c \ osm_vl15intf.c osm_vl_arb_rcv.c \ st.c osm_perfmgr.c osm_perfmgr_db.c \ osm_event_plugin.c osm_dump.c \ - osm_qos_parser_y.c osm_qos_parser_l.c osm_qos_policy.c + $(srcdir)/osm_qos_parser_y.c $(srcdir)/osm_qos_parser_l.c \ + osm_qos_policy.c -osm_qos_parser_y.c: $(srcdir)/osm_qos_parser.y $(srcdir)/../include/opensm/osm_qos_policy.h +$(srcdir)/osm_qos_parser_y.c: $(srcdir)/osm_qos_parser.y $(srcdir)/../include/opensm/osm_qos_policy.h $(YACC) -d -o $(srcdir)/osm_qos_parser_y.c -p__qos_parser_ $(srcdir)/osm_qos_parser.y mv -f $(srcdir)/osm_qos_parser_y.h $(srcdir)/../include/opensm/osm_qos_parser_y.h -osm_qos_parser_l.c: $(srcdir)/osm_qos_parser.l $(srcdir)/../include/opensm/osm_qos_policy.h osm_qos_parser_y.c +$(srcdir)/osm_qos_parser_l.c: $(srcdir)/osm_qos_parser.l $(srcdir)/../include/opensm/osm_qos_policy.h osm_qos_parser_y.c $(LEX) -P__qos_parser_ -o$(srcdir)/osm_qos_parser_l.c $(srcdir)/osm_qos_parser.l if OSMV_OPENIB -- 1.5.1.4 From kliteyn at dev.mellanox.co.il Tue Aug 5 08:43:51 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Tue, 05 Aug 2008 18:43:51 +0300 Subject: [ofa-general] Re: [PATCH] opensm/Makefile.am: fixing compilation error withosm_version.h In-Reply-To: <20080804150616.GD14872@sashak.voltaire.com> References: <20080804144501.GA14872@sashak.voltaire.com> <5D49E7A8952DC44FB38C38FA0D758EAD0C8F0B@mtlexch01.mtl.com> <20080804150616.GD14872@sashak.voltaire.com> Message-ID: <48987537.3080407@dev.mellanox.co.il> Sasha Khapyorsky wrote: > I really like those auto*tools :) But wait, there's more!!! 
:-)

>
> Sasha
>

From alekseys at voltaire.com Tue Aug 5 09:07:39 2008
From: alekseys at voltaire.com (Aleksey Senin)
Date: Tue, 05 Aug 2008 19:07:39 +0300
Subject: [ofa-general] [PATCH RFC] IPv6 address support in rdma_translate_ip
Message-ID: <1217952459.5025.7.camel@linux-zn6t.site>

The problem is that IPv6 addresses are not caught by this function. The
kernel has the ip_dev_find function for IPv4, but no analog for IPv6. The
solution is to call ipv6_chk_addr for each network device found in the
system. A small problem is that the same action (ip_dev_find) is executed
in another place (addr_resolve_local), so it may be better to write a
local, static, non-exported ipv6_dev_find function and use it from
addr_resolve_local and rdma_translate_ip instead of writing the same code
twice.

Signed-off-by: Aleksey Senin
---
 drivers/infiniband/core/addr.c | 30 +++++++++++++++++++++++-------
 1 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
index b59ad53..f95d21f 100644
--- a/drivers/infiniband/core/addr.c
+++ b/drivers/infiniband/core/addr.c
@@ -41,6 +41,7 @@
 #include
 #include
 #include
+#include
 #include
 
 MODULE_AUTHOR("Sean Hefty");
@@ -113,15 +114,30 @@ EXPORT_SYMBOL(rdma_copy_addr);
 int rdma_translate_ip(struct sockaddr *addr, struct rdma_dev_addr *dev_addr)
 {
         struct net_device *dev;
-        __be32 ip = ((struct sockaddr_in *) addr)->sin_addr.s_addr;
-        int ret;
+        int ret = -EADDRNOTAVAIL;
 
-        dev = ip_dev_find(&init_net, ip);
-        if (!dev)
-                return -EADDRNOTAVAIL;
+        switch (addr->sa_family) {
+        case AF_INET:
+                dev = ip_dev_find(&init_net,
+                        ((struct sockaddr_in *) addr)->sin_addr.s_addr);
 
-        ret = rdma_copy_addr(dev_addr, dev, NULL);
-        dev_put(dev);
+                if (!dev)
+                        return -EADDRNOTAVAIL;
+
+                ret = rdma_copy_addr(dev_addr, dev, NULL);
+                dev_put(dev);
+                break;
+        case AF_INET6:
+                for_each_netdev(&init_net, dev) {
+                        if (ipv6_chk_addr(&init_net, &((struct sockaddr_in6 *) addr)->sin6_addr, dev, 1)) {
+                                ret =
rdma_copy_addr(dev_addr, dev, NULL);
+                                break;
+                        }
+                }
+                break;
+        default:
+                break;
+        }
         return ret;
 }
 EXPORT_SYMBOL(rdma_translate_ip);
-- 
1.5.6.dirty

From weiny2 at llnl.gov Tue Aug 5 09:23:47 2008
From: weiny2 at llnl.gov (Ira Weiny)
Date: Tue, 5 Aug 2008 09:23:47 -0700
Subject: [ofa-general] ***SPAM*** what are inmpications of opensmd going down
In-Reply-To:
References: <20080805135835.CA99DE435EB@f11.poczta.interia.pl>
Message-ID: <20080805092347.4503c102.weiny2@llnl.gov>

On Tue, 5 Aug 2008 11:06:39 -0400 "Hal Rosenstock" wrote:
> Hi,
>
> On Tue, Aug 5, 2008 at 9:58 AM, wrote:
> > Hi all,
> >
> > I'm trying to find any information on opensmd disappearing from the network.
>
> In general, SM should always be present.
>
> > Let's assume we've got a fully running and operational network, with lids assigned. My understanding is that opensmd does nothing then and hca's with lids assigned can happily survive without it?
>
> Yes and no. SA requests (like path requests and multicast) and subnet
> changes will not be serviced during that time.
>
> > In other words - having opensmd going down will stop new nodes from joining the network (ports will stay in init state), but will not disrupt the ones which were up?
>
> Yes.
>
> > Also opensmd, when configured to not reassign lids, will not break it when going back up, as it will preserve current setup. Right?
>
> Yes, as long as no new nodes are present which cause a lid conflict
> (e.g. subnet merge when SM was offline).
>
> > Conclusion - taking opensmd down and up should not influence ib network in any way with exception of new hca's appearing.
>
> Also switches or links which fail when SM is offline.
>
> -- Hal
>
> > All above comes from my logic and, as I cannot find any data to confirm that, could anybody confirm that or point at any error there? I'm considering migrating to opensm and this information is vital to me.
> >

I agree with everything Hal said.
But I am going to assume that you only want to be able to restart OpenSM
rather than actually run without it? If so I can tell you that we have
done that often and have not seen any detriment to performance on a
"normal"[*] basis. ([*] Obviously there might be bugs and corner cases.)

If you are proposing to run without an SM for an extended time, that will
probably cause a problem.

Ira

From ralph.campbell at qlogic.com Tue Aug 5 10:19:04 2008
From: ralph.campbell at qlogic.com (Ralph Campbell)
Date: Tue, 05 Aug 2008 10:19:04 -0700
Subject: [ofa-general] Re: [ewg] [PATCH] IB/core: Add support for Receive Core Affinity
In-Reply-To: <1217916644.13782.8.camel@mtls03>
References: <20080715161331.GA8667@mtls03> <1217894351.620.244.camel@chromite.mv.qlogic.com> <1217916644.13782.8.camel@mtls03>
Message-ID: <1217956744.620.263.camel@chromite.mv.qlogic.com>

On Tue, 2008-08-05 at 09:10 +0300, Eli Cohen wrote:
> > > +int ib_create_qp_range(struct ib_pd *pd, struct ib_qp_init_attr *qp_init_attr,
> > > +                       int nqps, int align, struct ib_qp *list[])
> >
> > It just seems wrong to me to require the caller to specify the alignment
> > restrictions. Isn't this HCA specific?
> I agree with you about this, but since my previous posts on the issue
> did not receive too much attention, I did not want to change my
> implementation before the issue has been discussed.
>
> > Is IPoIB really going to know
> > whether or not the QP numbers returned by this call are "aligned"
> > or not? What if I call ib_create_qp_range() with nqps=3 and align=0?
> I am not sure I understand your argument here: in this case you create 3
> consecutive QPs with no other restrictions on the number of the
> first.
My point is that I am guessing that the ib_modify_qp() will return an error unless the QPs are created with the right alignment but the caller has no way of knowing what the right alignment value is and it shouldn't need to know since the ib_create_qp_range() could have an argument or flag which says the created QPs should be usable for receive affinity. The HCA driver can then use whatever alignment is needed and mark the QP struct as being part of a group. The ib_modify_qp() probably only needs a flag to say enable receive affinity.
From akepner at sgi.com Tue Aug 5 13:31:06 2008 From: akepner at sgi.com (akepner at sgi.com) Date: Tue, 5 Aug 2008 13:31:06 -0700 Subject: [ofa-general] OFED 1.3 hang in cm_destroy_id() Message-ID: <20080805203106.GP27415@sgi.com> Eli, Or, I've gotten a report of a hang very similar to one reported in: http://lists.openfabrics.org/pipermail/general/2008-June/052275.html Here's the backtrace of the hung ipoib task: STACK TRACE FOR TASK: 0xe00003600b070000 (ipoib) 0 schedule+0x26ec [0xa0000001005a12ac] 1 wait_for_completion+0x14c [0xa0000001005a198c] 2 cm_destroy_id+0x66c [0xa00000021531e72c] 3 ib_destroy_cm_id+0x2c [0xa0000002153210cc] 4 ipoib_cm_tx_reap+0x17c [0xa000000215719abc] 5 run_workqueue+0x1dc [0xa0000001000c7f1c] 6 worker_thread+0x1bc [0xa0000001000c963c] 7 kthread+0x23c [0xa0000001000d39dc] 8 kernel_thread_helper+0xcc [0xa0000001000133ec] 9 start_kernel_thread+0x1c [0xa0000001000094bc] The cm_id->state is IB_CM_TIMEWAIT, and the refcount is 1. This is an ia64 system with an MT23108, running OFED 1.3. Haven't yet been able to reproduce this, however. -- Arthur
From tziporet at dev.mellanox.co.il Wed Aug 6 00:27:57 2008 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Wed, 06 Aug 2008 10:27:57 +0300 Subject: [ofa-general] OFED 1.3 hang in cm_destroy_id() In-Reply-To: <20080805203106.GP27415@sgi.com> References: <20080805203106.GP27415@sgi.com> Message-ID: <4899527D.5090904@mellanox.co.il> akepner at sgi.com wrote: > Eli, Or, > > I've gotten a report of a hang very similar to one reported in: > http://lists.openfabrics.org/pipermail/general/2008-June/052275.html > > Here's the backtrace of the hung ipoib task: > > STACK TRACE FOR TASK: 0xe00003600b070000 (ipoib) > > 0 schedule+0x26ec [0xa0000001005a12ac] > 1 wait_for_completion+0x14c [0xa0000001005a198c] > 2 cm_destroy_id+0x66c [0xa00000021531e72c] > 3 ib_destroy_cm_id+0x2c [0xa0000002153210cc] > 4 ipoib_cm_tx_reap+0x17c [0xa000000215719abc] > 5 run_workqueue+0x1dc [0xa0000001000c7f1c] > 6 worker_thread+0x1bc [0xa0000001000c963c] > 7 kthread+0x23c [0xa0000001000d39dc] > 8 kernel_thread_helper+0xcc
[0xa0000001000133ec] > 9 start_kernel_thread+0x1c [0xa0000001000094bc] > > > The cm_id->state is IB_CM_TIMEWAIT, and the refcount is 1. > > This is an ia64 system with an MT23108, running OFED 1.3. > > Haven't yet been able to reproduce this, however. > > Can you try 1.3.1? We have fixed several bugs there Tziporet From eli at mellanox.co.il Wed Aug 6 01:32:03 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 6 Aug 2008 11:32:03 +0300 Subject: [ofa-general] [PATCH] mlx4_ib: Allow 4K messages for UD QPs Message-ID: <20080806083203.GA7768@mtls03> Current code limits UD QPs message size to 2K while MTU is set to 4K. This patch sets message size to 4K. Signed-off-by: Alex Naslednikov Signed-off-by: Eli Cohen --- drivers/infiniband/hw/mlx4/qp.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c index f7bc7dd..f29dbb7 100644 --- a/drivers/infiniband/hw/mlx4/qp.c +++ b/drivers/infiniband/hw/mlx4/qp.c @@ -902,7 +902,7 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp, context->mtu_msgmax = (IB_MTU_4096 << 5) | ilog2(dev->dev->caps.max_gso_sz); else - context->mtu_msgmax = (IB_MTU_4096 << 5) | 11; + context->mtu_msgmax = (IB_MTU_4096 << 5) | 12; } else if (attr_mask & IB_QP_PATH_MTU) { if (attr->path_mtu < IB_MTU_256 || attr->path_mtu > IB_MTU_4096) { printk(KERN_ERR "path MTU (%u) is invalid\n", -- 1.5.6 From sashak at voltaire.com Wed Aug 6 01:40:57 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Wed, 6 Aug 2008 11:40:57 +0300 Subject: [ofa-general] Re: [PATCH] opensm/Makefile.am: Fix dependency for 'make -j2' In-Reply-To: <489874B5.2050607@dev.mellanox.co.il> References: <489874B5.2050607@dev.mellanox.co.il> Message-ID: <20080806084057.GA19158@sashak.voltaire.com> On 18:41 Tue 05 Aug , Yevgeny Kliteynik wrote: > Fix dependency for 'make -j2' - QoS parser generated > files weren't specified correctly. > > Signed-off-by: Yevgeny Kliteynik Applied. Thanks. 
Sasha From wangwhao at cn.ibm.com Wed Aug 6 02:22:48 2008 From: wangwhao at cn.ibm.com (Wen Hao Wang) Date: Wed, 6 Aug 2008 17:22:48 +0800 Subject: [ofa-general] opensm hang and osmtest report ERR 0130 Message-ID: Hi all: My subnet includes RHEL5.2 servers and one switch with subnet manager running. On RHEL 5.2 servers, I can run most IB diagnostics commands/scripts, such as ibnetdiscover and ibstat without errors. But opensm command hang with following output [root at gaia-07 OFED-1.3.1]# opensm ------------------------------------------------- OpenSM 3.1.11 Command Line Arguments: Log File: /var/log/opensm.log ------------------------------------------------- OpenSM 3.1.11 Using default GUID 0x2c90300013371 Entering STANDBY state (never end up ...) [root at gaia-07 ~]# ps -ef|grep opensm root 30018 10888 0 10:50 pts/2 00:00:00 opensm root 30081 30035 0 10:52 pts/1 00:00:00 grep opensm [root at gaia-07 ~]# ps -ef|grep 30018[root at gaia-07 ~]# osmtest root 30018 10888 0 10:50 pts/2 00:00:00 opensm root 30100 30035 0 11:00 pts/1 00:00:00 grep 30018 [root at gaia-07 ~]# tail /var/log/opensm.log Aug 06 10:50:12 631425 [B07D0EB0] 0x03 -> OpenSM 3.1.11 Aug 06 10:50:12 631472 [B07D0EB0] 0x80 -> OpenSM 3.1.11 Aug 06 10:50:12 640853 [B07D0EB0] 0x02 -> osm_vendor_bind: Binding to port 0x2c90300013371 Aug 06 10:50:12 662682 [B07D0EB0] 0x02 -> osm_vendor_bind: Binding to port 0x2c90300013371 Aug 06 10:50:12 667338 [486D6940] 0x80 -> Entering STANDBY state It seems opensm does not spawn other threads. While osmtest gave errors. 
[root at gaia-07 ~]# osmtest Command Line Arguments Done with args Flow = All Validations Aug 06 11:02:25 264234 [EDCFC880] 0x7f -> Setting log level to: 0x03 Aug 06 11:02:25 282259 [EDCFC880] 0x02 -> osm_vendor_bind: Binding to port 0x2c90300013371 Aug 06 11:02:25 304475 [EDCFC880] 0x02 -> osmtest_validate_sa_class_port_info: ----------------------------- SA Class Port Info: base_ver:1 class_ver:2 cap_mask:0x2601 cap_mask2:0x0 resp_time_val:0x14 ----------------------------- Aug 06 11:02:25 304526 [EDCFC880] 0x01 -> osmtest_create_db: ERR 0130: Unable to open inventory file (osmtest.dat) Aug 06 11:02:25 304555 [EDCFC880] 0x01 -> osmtest_run: ERR 0145: Database creation failed (IB_ERROR) OSMTEST: TEST "All Validations" FAIL Is there any advice how to probe the opensm/osmtest issue? Thanks in advance! Wen Hao Wang Email: wangwhao at cn.ibm.com From vlad at lists.openfabrics.org Wed Aug 6 02:47:03 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Wed, 6 Aug 2008 02:47:03 -0700 (PDT) Subject: [ofa-general] ofa_1_4_kernel 20080806-0200 daily build status Message-ID: <20080806094703.78087E6095B@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_4/linux-2.6.git git_branch: ofed_kernel Common build parameters: Passed: Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.16 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.22.5-31-default Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on ia64 with linux-2.6.16 Passed on ia64 with linux-2.6.19 Passed on ia64 with
linux-2.6.17 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.25 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.26 Passed on ppc64 with linux-2.6.16 Passed on ppc64 with linux-2.6.24 Failed: Build failed on x86_64 with linux-2.6.16.43-0.3-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.43-0.3-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.43-0.3-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.43-0.3-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.16.21-0.8-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.21-0.8-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' 
undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.21-0.8-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.21-0.8-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -fasynchronous-unwind-tables -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] 
Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.17 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -fomit-frame-pointer -g -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.17_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.17_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.17_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.17_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.17_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.17_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.17_x86_64_check/drivers/infiniband] 
Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.17_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.17' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.16.60-0.21-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.60-0.21-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.60-0.21-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.60-0.21-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-1.2798.fc6 Log: /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/core/cma.c:2810: error: 'IFF_BONDING' undeclared (first use in this function) /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/core/cma.c:2810: error: (Each undeclared identifier is reported only once 
/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/core/cma.c:2810: error: for each function it appears in.) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/core/cma.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/core] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-1.2798.fc6_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-1.2798.fc6' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-8.el5 Log: /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/core/cma.c:2810: error: 'IFF_BONDING' undeclared (first use in this function) /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/core/cma.c:2810: error: (Each undeclared identifier is reported only once /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/core/cma.c:2810: error: for each function it appears in.) 
make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/core/cma.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/core] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-8.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-8.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-53.el5 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-53.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-53.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.19 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -maccumulate-outgoing-args -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -fomit-frame-pointer -fasynchronous-unwind-tables -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.19_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.19_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.19_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.19_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.19_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.19_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.19_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.19_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.19' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.20 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -maccumulate-outgoing-args -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.20_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.20_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.20_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.20_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.20_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.20_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.20_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.20_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.20' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-93.el5 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-93.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-93.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-42.ELsmp Log: include/asm/apic.h:47: warning: value computed is not used /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1840: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.9-42.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-42.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-55.ELsmp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: include/linux/skbuff.h: In function 'skb_add_data': include/linux/skbuff.h:1041: warning: pointer targets in passing argument 2 of 
'csum_partial_copy_from_user' differ in signedness make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.9-55.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-55.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.16.21-0.8-default Log: from /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.21-0.8-default_ia64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.16.21-0.8-default_ia64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.16.21-0.8-default' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with 
linux-2.6.17 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -fomit-frame-pointer -g -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mtune=power4 -mno-altivec -funit-at-a-time -mstring -Wa,-maltivec -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.17_ppc64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.17_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.17_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.17_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.17_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.17_ppc64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.17_ppc64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.17_ppc64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.17' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.18-8.el5 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration -Os -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mtune=power4 -mno-altivec 
-funit-at-a-time -mstring -Wa,-maltivec -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18-8.el5_ppc64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.18-8.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.18 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mtune=power4 -mno-altivec -funit-at-a-time -mstring -Wa,-maltivec -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 
-DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18_ppc64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18_ppc64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18_ppc64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.18_ppc64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.18' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.19 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mtune=power4 -mno-altivec -funit-at-a-time -mstring -Wa,-maltivec -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.19_ppc64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o 
/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.19_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.19_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.19_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.19_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.19_ppc64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.19_ppc64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080806-0200_linux-2.6.19_ppc64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.19' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- From diego.guella at sircomtech.com Wed Aug 6 03:17:00 2008 From: diego.guella at sircomtech.com (Diego Guella) Date: Wed, 6 Aug 2008 12:17:00 +0200 Subject: [ofa-general] Infiniband and Opensuse 10.3 stock rpms Message-ID: <000601c8f7ad$90eb6d10$05c8a8c0@DIEGO> Hi, I've just installed Opensuse 10.3 on a Dell PowerEdge 2850 server, and updated it as of today (Aug 6, 2008). I noticed that opensuse has some rpms in its repository, and I was wondering if I could get Infiniband up without installing OFED. 
But, I get this error from opensm: ----- Aug 06 10:15:52 476103 [2340AB00] -> OpenSM Rev:openib-3.0.14 Aug 06 10:15:52 476206 [2340AB00] -> OpenSM Rev:openib-3.0.14 Aug 06 10:15:52 485719 [2340AB00] -> osm_vendor_bind: Binding to port 0x2c9020021c9f1 Aug 06 10:15:52 488535 [2340AB00] -> osm_vendor_open_port: ERR 542C: umad_open_port() failed Aug 06 10:15:52 488565 [2340AB00] -> osm_vendor_bind: ERR 5424: Unable to open port 0x2c9020021c9f1 Aug 06 10:15:52 488578 [2340AB00] -> osm_sm_mad_ctrl_bind: ERR 3118: Vendor specific bind failed Aug 06 10:15:52 488593 [2340AB00] -> osm_sm_bind: ERR 2E10: SM MAD Controller bind failed (IB_ERROR) Aug 06 10:15:52 488614 [2340AB00] -> osm_sa_mad_ctrl_unbind: ERR 1A11: No previous bind Aug 06 10:15:52 488977 [2340AB00] -> Exiting SM ----- What's wrong? Here are some more information on this system: The Infiniband-related rpms that I have installed are: infiniband-diags, kernel-default, libibmad, libibumad, libibverbs, libsdp, ofed-kmp-default, opensm ----- Server19:~ # lsmod Module Size Used by ib_umad 32792 0 iptable_filter 19840 0 ip_tables 37848 1 iptable_filter ip6table_filter 19584 0 ip6_tables 31944 1 ip6table_filter x_tables 37000 2 ip_tables,ip6_tables microcode 31256 0 firmware_class 27520 1 microcode apparmor 58544 0 loop 36356 0 dm_mod 77152 0 ib_ipoib 89160 0 ib_cm 51480 1 ib_ipoib ib_sa 57688 2 ib_ipoib,ib_cm ipv6 372600 35 ib_ipoib rtc_cmos 25016 0 rtc_core 38156 1 rtc_cmos floppy 79624 0 rtc_lib 19968 1 rtc_core sr_mod 33444 0 iTCO_wdt 28624 0 serio_raw 24068 0 cdrom 52392 1 sr_mod iTCO_vendor_support 20740 1 iTCO_wdt usbhid 58160 0 hid 43776 1 usbhid ff_memless 22536 1 usbhid ib_mthca 141540 0 ib_mad 54436 4 ib_umad,ib_cm,ib_sa,ib_mthca ib_core 76032 6 ib_umad,ib_ipoib,ib_cm,ib_sa,ib_mthca,ib_mad e1000 203200 0 shpchp 50716 0 pci_hotplug 49396 1 shpchp e752x_edac 28036 0 edac_mc 43584 1 e752x_edac button 26528 0 sg 53304 0 ehci_hcd 50956 0 uhci_hcd 42144 0 usbcore 156456 4 usbhid,ehci_hcd,uhci_hcd sd_mod 45824 3 
edd 26760 0 ext3 156688 1 mbcache 26248 1 ext3 jbd 89192 1 ext3 fan 22792 0 mptspi 36112 2 mptscsih 39680 1 mptspi mptbase 73952 2 mptspi,mptscsih scsi_transport_spi 43776 1 mptspi ata_piix 37636 0 libata 166800 1 ata_piix scsi_mod 176536 7 sr_mod,sg,sd_mod,mptspi,mptscsih,scsi_transport_spi,libata thermal 36112 0 processor 59720 1 thermal ----- ----- Server19:~ # ibstat CA 'mthca0' CA type: MT25208 Number of ports: 2 Firmware version: 5.1.400 Hardware version: a0 Node GUID: 0x0002c9020021c9f0 System image GUID: 0x0002c9020021c9f3 Port 1: State: Active Physical state: LinkUp Rate: 20 Base lid: 4 LMC: 0 SM lid: 1 Capability mask: 0x02510a68 Port GUID: 0x0002c9020021c9f1 Port 2: State: Down Physical state: Polling Rate: 10 Base lid: 0 LMC: 0 SM lid: 0 Capability mask: 0x02510a68 Port GUID: 0x0002c9020021c9f2 ----- Any advice is greatly appreciated. Thanks, Diego From kliteyn at dev.mellanox.co.il Wed Aug 6 03:54:54 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Wed, 06 Aug 2008 13:54:54 +0300 Subject: [ofa-general] opensm hang and osmtest report ERR 0130 In-Reply-To: References: Message-ID: <489982FE.5030008@dev.mellanox.co.il> Hi, Wen Hao Wang wrote: > Hi all: > > My subnet includes RHEL5.2 servers and one switch with subnet manager > running. On RHEL 5.2 servers, I can run most IB diagnostics > commands/scripts, such as ibnetdiscover and ibstat without errors. But > opensm command hang with following output > > [root at gaia-07 OFED-1.3.1]# opensm > ------------------------------------------------- > OpenSM 3.1.11 > Command Line Arguments: > Log File: /var/log/opensm.log > ------------------------------------------------- > OpenSM 3.1.11 > > Using default GUID 0x2c90300013371 > Entering STANDBY state > > (never end up ...) This part is OK - opensm enters the stand-by state and waits in this state indefinitely. 
This happened because opensm detects other opensm in the subnet. If you kill that other opensm, the stand-by opensm will enter MASTER state after a short period. You can see who's the master opensm in your subnet by running 'sminfo' tool. > [root at gaia-07 ~]# ps -ef|grep opensm > root 30018 10888 0 10:50 pts/2 00:00:00 opensm > root 30081 30035 0 10:52 pts/1 00:00:00 grep opensm > [root at gaia-07 ~]# ps -ef|grep 30018[root at gaia-07 ~]# osmtest > root 30018 10888 0 10:50 pts/2 00:00:00 opensm > root 30100 30035 0 11:00 pts/1 00:00:00 grep 30018 > [root at gaia-07 ~]# tail /var/log/opensm.log > Aug 06 10:50:12 631425 [B07D0EB0] 0x03 -> OpenSM 3.1.11 > Aug 06 10:50:12 631472 [B07D0EB0] 0x80 -> OpenSM 3.1.11 > Aug 06 10:50:12 640853 [B07D0EB0] 0x02 -> osm_vendor_bind: Binding to > port 0x2c90300013371 > Aug 06 10:50:12 662682 [B07D0EB0] 0x02 -> osm_vendor_bind: Binding to > port 0x2c90300013371 > Aug 06 10:50:12 667338 [486D6940] 0x80 -> Entering STANDBY state > > > It seems opensm does not spawn other threads. While osmtest gave errors. If there is another opensm in the subnet, osmtest shouldn't fail. See below. > [root at gaia-07 ~]# osmtest > > Command Line Arguments > Done with args > Flow = All Validations > Aug 06 11:02:25 264234 [EDCFC880] 0x7f -> Setting log level to: 0x03 > Aug 06 11:02:25 282259 [EDCFC880] 0x02 -> osm_vendor_bind: Binding to > port 0x2c90300013371 > Aug 06 11:02:25 304475 [EDCFC880] 0x02 -> > osmtest_validate_sa_class_port_info: > ----------------------------- > SA Class Port Info: > base_ver:1 > class_ver:2 > cap_mask:0x2601 > cap_mask2:0x0 > resp_time_val:0x14 > ----------------------------- > Aug 06 11:02:25 304526 [EDCFC880] 0x01 -> osmtest_create_db: ERR 0130: > Unable to open inventory file (osmtest.dat) > Aug 06 11:02:25 304555 [EDCFC880] 0x01 -> osmtest_run: ERR 0145: > Database creation failed (IB_ERROR) > OSMTEST: TEST "All Validations" FAIL By default, osmtest runs all validation tests, which is similar to 'osmtest -f a'. 
This flow expects to get an input inventory file. You should first run 'osmtest -f c' to create such file, and then 'osmtest' or 'osmtest -f a' to run the tests. See 'man osmtest' for more details. -- Yevgeny > Is there any advice how to probe the opensm/osmtest issue? Thanks in > advance! > > Wen Hao Wang > Email: wangwhao at cn.ibm.com > > > ------------------------------------------------------------------------ > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From kliteyn at dev.mellanox.co.il Wed Aug 6 04:01:31 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Wed, 06 Aug 2008 14:01:31 +0300 Subject: [ofa-general] Infiniband and Opensuse 10.3 stock rpms In-Reply-To: <000601c8f7ad$90eb6d10$05c8a8c0@DIEGO> References: <000601c8f7ad$90eb6d10$05c8a8c0@DIEGO> Message-ID: <4899848B.4080705@dev.mellanox.co.il> Hi Diego, Diego Guella wrote: > Hi, > > I've just installed Opensuse 10.3 on a Dell PowerEdge 2850 server, and > updated it as of today (Aug 6, 2008). > I noticed that opensuse has some rpms in its repository, and I was > wondering if I could get Infiniband up without installing OFED. 
> But, I get this error from opensm: > ----- > Aug 06 10:15:52 476103 [2340AB00] -> OpenSM Rev:openib-3.0.14 > Aug 06 10:15:52 476206 [2340AB00] -> OpenSM Rev:openib-3.0.14 > Aug 06 10:15:52 485719 [2340AB00] -> osm_vendor_bind: Binding to port > 0x2c9020021c9f1 > Aug 06 10:15:52 488535 [2340AB00] -> osm_vendor_open_port: ERR 542C: > umad_open_port() failed > Aug 06 10:15:52 488565 [2340AB00] -> osm_vendor_bind: ERR 5424: Unable > to open port 0x2c9020021c9f1 > Aug 06 10:15:52 488578 [2340AB00] -> osm_sm_mad_ctrl_bind: ERR 3118: > Vendor specific bind failed > Aug 06 10:15:52 488593 [2340AB00] -> osm_sm_bind: ERR 2E10: SM MAD > Controller bind failed (IB_ERROR) > Aug 06 10:15:52 488614 [2340AB00] -> osm_sa_mad_ctrl_unbind: ERR 1A11: > No previous bind > Aug 06 10:15:52 488977 [2340AB00] -> Exiting SM > ----- > > > What's wrong? Perhaps you already have opensm on this machine? ibstat shows that port state is 'Active'. Port cannot reach this state without opensm - it should be in 'Init' state w/o opensm. You can just grep for 'opensm' process, or check where opensm is running with 'sminfo' tool. 
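The process check suggested above can be sketched in shell. This is only an illustration: the helper name opensm_pids is made up here, and it assumes a ps that accepts "-e -o pid,comm". Matching on the command-name column, rather than grepping the whole line, keeps the grep process itself out of the result.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenSM or infiniband-diags).
# Reads "PID COMMAND" lines on stdin and prints the PID of every
# process whose command name is exactly "opensm".
opensm_pids() {
    awk '$2 == "opensm" { print $1 }'
}

# Typical use on a live system:
#   ps -e -o pid,comm | opensm_pids
# An empty result means no opensm process is running locally.
```

If nothing is printed, the sminfo tool mentioned above is the next thing to try, since it reports where the master SM is running.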
-- Yevgeny > Here are some more information on this system: > > The Infiniband-related rpms that I have installed are: > infiniband-diags, kernel-default, libibmad, libibumad, libibverbs, > libsdp, ofed-kmp-default, opensm > ----- > Server19:~ # lsmod > Module Size Used by > ib_umad 32792 0 > iptable_filter 19840 0 > ip_tables 37848 1 iptable_filter > ip6table_filter 19584 0 > ip6_tables 31944 1 ip6table_filter > x_tables 37000 2 ip_tables,ip6_tables > microcode 31256 0 > firmware_class 27520 1 microcode > apparmor 58544 0 > loop 36356 0 > dm_mod 77152 0 > ib_ipoib 89160 0 > ib_cm 51480 1 ib_ipoib > ib_sa 57688 2 ib_ipoib,ib_cm > ipv6 372600 35 ib_ipoib > rtc_cmos 25016 0 > rtc_core 38156 1 rtc_cmos > floppy 79624 0 > rtc_lib 19968 1 rtc_core > sr_mod 33444 0 > iTCO_wdt 28624 0 > serio_raw 24068 0 > cdrom 52392 1 sr_mod > iTCO_vendor_support 20740 1 iTCO_wdt > usbhid 58160 0 > hid 43776 1 usbhid > ff_memless 22536 1 usbhid > ib_mthca 141540 0 > ib_mad 54436 4 ib_umad,ib_cm,ib_sa,ib_mthca > ib_core 76032 6 ib_umad,ib_ipoib,ib_cm,ib_sa,ib_mthca,ib_mad > e1000 203200 0 > shpchp 50716 0 > pci_hotplug 49396 1 shpchp > e752x_edac 28036 0 > edac_mc 43584 1 e752x_edac > button 26528 0 > sg 53304 0 > ehci_hcd 50956 0 > uhci_hcd 42144 0 > usbcore 156456 4 usbhid,ehci_hcd,uhci_hcd > sd_mod 45824 3 > edd 26760 0 > ext3 156688 1 > mbcache 26248 1 ext3 > jbd 89192 1 ext3 > fan 22792 0 > mptspi 36112 2 > mptscsih 39680 1 mptspi > mptbase 73952 2 mptspi,mptscsih > scsi_transport_spi 43776 1 mptspi > ata_piix 37636 0 > libata 166800 1 ata_piix > scsi_mod 176536 7 > sr_mod,sg,sd_mod,mptspi,mptscsih,scsi_transport_spi,libata > thermal 36112 0 > processor 59720 1 thermal > ----- > > ----- > Server19:~ # ibstat > CA 'mthca0' > CA type: MT25208 > Number of ports: 2 > Firmware version: 5.1.400 > Hardware version: a0 > Node GUID: 0x0002c9020021c9f0 > System image GUID: 0x0002c9020021c9f3 > Port 1: > State: Active > Physical state: LinkUp > Rate: 20 > Base lid: 4 > LMC: 0 > SM lid: 1 > 
Capability mask: 0x02510a68 > Port GUID: 0x0002c9020021c9f1 > Port 2: > State: Down > Physical state: Polling > Rate: 10 > Base lid: 0 > LMC: 0 > SM lid: 0 > Capability mask: 0x02510a68 > Port GUID: 0x0002c9020021c9f2 > ----- > > > Any advice is greatly appreciated. > Thanks, > Diego > > > > ------------------------------------------------------------------------ > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From hal.rosenstock at gmail.com Wed Aug 6 04:36:27 2008 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Wed, 6 Aug 2008 07:36:27 -0400 Subject: [ofa-general] opensm hang and osmtest report ERR 0130 In-Reply-To: References: Message-ID: On Wed, Aug 6, 2008 at 5:22 AM, Wen Hao Wang wrote: > Hi all: > > My subnet includes RHEL5.2 servers and one switch with subnet manager > running. Is this OpenSM or some embedded SM ? > On RHEL 5.2 servers, I can run most IB diagnostics > commands/scripts, such as ibnetdiscover and ibstat without errors. But > opensm command hang with following output > > [root at gaia-07 OFED-1.3.1]# opensm > ------------------------------------------------- > OpenSM 3.1.11 > Command Line Arguments: > Log File: /var/log/opensm.log > ------------------------------------------------- > OpenSM 3.1.11 > > Using default GUID 0x2c90300013371 > Entering STANDBY state > > (never end up ...) 
> > [root at gaia-07 ~]# ps -ef|grep opensm > root 30018 10888 0 10:50 pts/2 00:00:00 opensm > root 30081 30035 0 10:52 pts/1 00:00:00 grep opensm > [root at gaia-07 ~]# ps -ef|grep 30018[root at gaia-07 ~]# osmtest > root 30018 10888 0 10:50 pts/2 00:00:00 opensm > root 30100 30035 0 11:00 pts/1 00:00:00 grep 30018 > [root at gaia-07 ~]# tail /var/log/opensm.log > Aug 06 10:50:12 631425 [B07D0EB0] 0x03 -> OpenSM 3.1.11 > Aug 06 10:50:12 631472 [B07D0EB0] 0x80 -> OpenSM 3.1.11 > Aug 06 10:50:12 640853 [B07D0EB0] 0x02 -> osm_vendor_bind: Binding to port > 0x2c90300013371 > Aug 06 10:50:12 662682 [B07D0EB0] 0x02 -> osm_vendor_bind: Binding to port > 0x2c90300013371 > Aug 06 10:50:12 667338 [486D6940] 0x80 -> Entering STANDBY state > > > It seems opensm does not spawn other threads. While osmtest gave errors. > > [root at gaia-07 ~]# osmtest > > Command Line Arguments > Done with args > Flow = All Validations > Aug 06 11:02:25 264234 [EDCFC880] 0x7f -> Setting log level to: 0x03 > Aug 06 11:02:25 282259 [EDCFC880] 0x02 -> osm_vendor_bind: Binding to port > 0x2c90300013371 > Aug 06 11:02:25 304475 [EDCFC880] 0x02 -> > osmtest_validate_sa_class_port_info: > ----------------------------- > SA Class Port Info: > base_ver:1 > class_ver:2 > cap_mask:0x2601 > cap_mask2:0x0 > resp_time_val:0x14 > ----------------------------- This makes it look like the master SM is not OpenSM. Is that the case ? -- Hal > Aug 06 11:02:25 304526 [EDCFC880] 0x01 -> osmtest_create_db: ERR 0130: > Unable to open inventory file (osmtest.dat) > Aug 06 11:02:25 304555 [EDCFC880] 0x01 -> osmtest_run: ERR 0145: Database > creation failed (IB_ERROR) > OSMTEST: TEST "All Validations" FAIL > > Is there any advice how to probe the opensm/osmtest issue? Thanks in > advance! 
> Wen Hao Wang > Email: wangwhao at cn.ibm.com > > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit > http://openib.org/mailman/listinfo/openib-general > From diego.guella at deviltechnologies.net Wed Aug 6 04:48:44 2008 From: diego.guella at deviltechnologies.net (Diego Guella) Date: Wed, 6 Aug 2008 13:48:44 +0200 Subject: [ofa-general] Infiniband and Opensuse 10.3 stock rpms References: <000601c8f7ad$90eb6d10$05c8a8c0@DIEGO> <4899848B.4080705@dev.mellanox.co.il> Message-ID: <006501c8f7ba$6161db80$05c8a8c0@DIEGO> Hi Yevgeny, Thanks for your answer. From: "Yevgeny Kliteynik" > > Perhaps you already have opensm on this machine? > ibstat shows that port state is 'Active'. > Port cannot reach this state without opensm - it > should be in 'Init' state w/o opensm. Ok, you were partially right: I forgot I have another opensm running on another wxp64 machine. So, I created /etc/sysconfig/network/ifcfg-ib0, ifup ib0 and then I successfully ping'ed the two machines each other. Good. I remember the days when I had to download OFED sources, fight to compile them, and spend hours to get IPoIB running, and now all of this is ready out of the box, in rpms from the distribution, ready to install with the package manager. All of you have done a great work. Thanks!! That said, I really need opensm on this machine, so I will shut down opensm on the wxp machine now. > You can just grep for 'opensm' process, or check > where opensm is running with 'sminfo' tool. No opensm process is running on this machine. sminfo: (while opensm is running on wxp64 machine) ----- Server19:~ # sminfo ibpanic: [4969] madrpc_init: can't open UMAD port ((null):0): (No such file or directory) Server19:~ # ----- /etc/init.d/opensmd start have the same behavior and same errors as when the opensm on wxp64 was running. 
Do you have other ideas on what can I do to solve this? Thanks, Diego From makc at sgi.com Wed Aug 6 05:27:23 2008 From: makc at sgi.com (Max Matveev) Date: 06 Aug 2008 22:27:23 +1000 Subject: [ofa-general] OFED 1.3 hang in cm_destroy_id() In-Reply-To: <4899527D.5090904@mellanox.co.il> References: <20080805203106.GP27415@sgi.com> <4899527D.5090904@mellanox.co.il> Message-ID: On Wed, 06 Aug 2008 10:27:57 +0300, Tziporet Koren wrote: TK> akepner at sgi.com wrote: >> Eli, Or, >> >> I've gotten a report of a hang very similar to one reported in: >> http://lists.openfabrics.org/pipermail/general/2008-June/052275.html >> >> Here's the backtrace of the hung ipoib task: >> >> STACK TRACE FOR TASK: 0xe00003600b070000 (ipoib) >> >> 0 schedule+0x26ec [0xa0000001005a12ac] >> 1 wait_for_completion+0x14c [0xa0000001005a198c] >> 2 cm_destroy_id+0x66c [0xa00000021531e72c] >> 3 ib_destroy_cm_id+0x2c [0xa0000002153210cc] >> 4 ipoib_cm_tx_reap+0x17c [0xa000000215719abc] >> 5 run_workqueue+0x1dc [0xa0000001000c7f1c] >> 6 worker_thread+0x1bc [0xa0000001000c963c] >> 7 kthread+0x23c [0xa0000001000d39dc] >> 8 kernel_thread_helper+0xcc [0xa0000001000133ec] >> 9 start_kernel_thread+0x1c [0xa0000001000094bc] >> >> >> The cm_id->state is IB_CM_TIMEWAIT, and the refcount is 1. >> >> This is an ia64 system with an MT23108, running OFED 1.3. >> >> Haven't yet been able to reproduce this, however. >> >> TK> Can you try 1.3.1? We have fixed several bugs there This box is actually running 1.3.1. max From hal.rosenstock at gmail.com Wed Aug 6 05:41:29 2008 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Wed, 6 Aug 2008 08:41:29 -0400 Subject: [ofa-general] Infiniband and Opensuse 10.3 stock rpms In-Reply-To: <006501c8f7ba$6161db80$05c8a8c0@DIEGO> References: <000601c8f7ad$90eb6d10$05c8a8c0@DIEGO> <4899848B.4080705@dev.mellanox.co.il> <006501c8f7ba$6161db80$05c8a8c0@DIEGO> Message-ID: Hi Diego, On Wed, Aug 6, 2008 at 7:48 AM, Diego Guella wrote: > Hi Yevgeny, > Thanks for your answer. 
> > > From: "Yevgeny Kliteynik" >> >> Perhaps you already have opensm on this machine? >> ibstat shows that port state is 'Active'. >> Port cannot reach this state without opensm - it >> should be in 'Init' state w/o opensm. > > Ok, you were partially right: I forgot I have another opensm running on > another wxp64 machine. > So, I created /etc/sysconfig/network/ifcfg-ib0, ifup ib0 and then I > successfully ping'ed the two machines each other. Good. > I remember the days when I had to download OFED sources, fight to compile > them, and spend hours to get IPoIB running, and now all of this is ready out > of the box, in rpms from the distribution, ready to install with the package > manager. > All of you have done a great work. Thanks!! > > > That said, I really need opensm on this machine, so I will shut down opensm > on the wxp machine now. > >> You can just grep for 'opensm' process, or check >> where opensm is running with 'sminfo' tool. > > No opensm process is running on this machine. > > sminfo: (while opensm is running on wxp64 machine) > ----- > Server19:~ # sminfo > ibpanic: [4969] madrpc_init: can't open UMAD port ((null):0): (No such file > or directory) > Server19:~ # > ----- > > /etc/init.d/opensmd start have the same behavior and same errors as when the > opensm on wxp64 was running. > > > > Do you have other ideas on what can I do to solve this? Are you running the ib_umad kernel module on that machine ? Does /sys/class/infiniband_mad exist ? If so, are there any umad* directories in it ? 
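The questions above can be checked with a short script. This is a minimal sketch, assuming the usual layout in which every umadN entry under /sys/class/infiniband_mad has a matching device node under /dev/infiniband (an assumption about common libibumad defaults, not something stated in this thread). The function name check_umad_nodes is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: for every umadN entry in sysfs, report whether a
# device node of the same name exists. Both directories are parameters;
# the defaults are assumptions about the usual layout.
check_umad_nodes() {
    sysfs_dir="${1:-/sys/class/infiniband_mad}"
    dev_dir="${2:-/dev/infiniband}"

    if [ ! -d "$sysfs_dir" ]; then
        echo "no sysfs directory $sysfs_dir (is ib_umad loaded?)"
        return 1
    fi

    missing=0
    for entry in "$sysfs_dir"/umad*; do
        [ -d "$entry" ] || continue      # glob matched nothing
        name=$(basename "$entry")
        if [ -e "$dev_dir/$name" ]; then
            echo "$name: ok"
        else
            echo "$name: missing $dev_dir/$name"
            missing=1
        fi
    done
    return "$missing"                    # 0 if every node was found
}
```

Calling check_umad_nodes with no arguments uses the default paths. A "missing" line would suggest that the device nodes were never created, which is one possible reason for a tool failing to open a UMAD port.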
-- Hal

> Thanks,
> Diego
>
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> To unsubscribe, please visit
> http://openib.org/mailman/listinfo/openib-general
>

From kliteyn at dev.mellanox.co.il Wed Aug 6 06:15:28 2008
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Wed, 06 Aug 2008 16:15:28 +0300
Subject: [ofa-general] [PATCH] opensm/Makefile.am: add dependency rule
Message-ID: <4899A3F0.30300@dev.mellanox.co.il>

Adding one more dependency rule for a generated header file.

Signed-off-by: Yevgeny Kliteynik
---
 opensm/opensm/Makefile.am |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/opensm/opensm/Makefile.am b/opensm/opensm/Makefile.am
index f748024..d14b913 100644
--- a/opensm/opensm/Makefile.am
+++ b/opensm/opensm/Makefile.am
@@ -69,6 +69,8 @@ $(srcdir)/osm_qos_parser_y.c: $(srcdir)/osm_qos_parser.y $(srcdir)/../include/op
 $(srcdir)/osm_qos_parser_l.c: $(srcdir)/osm_qos_parser.l $(srcdir)/../include/opensm/osm_qos_policy.h osm_qos_parser_y.c
 	$(LEX) -P__qos_parser_ -o$(srcdir)/osm_qos_parser_l.c $(srcdir)/osm_qos_parser.l
 
+$(srcdir)/../include/opensm/osm_qos_parser_y.h: $(srcdir)/osm_qos_parser_y.c
+
 if OSMV_OPENIB
 opensm_CFLAGS = -Wall $(OSMV_CFLAGS) -fno-strict-aliasing -DVENDOR_RMPP_SUPPORT -DDUAL_SIDED_RMPP $(DBGFLAGS) -D_XOPEN_SOURCE=600 -D_BSD_SOURCE=1
 else
-- 
1.5.1.4

From diego.guella at deviltechnologies.net Wed Aug 6 06:27:49 2008
From: diego.guella at deviltechnologies.net (Diego Guella)
Date: Wed, 6 Aug 2008 15:27:49 +0200
Subject: [ofa-general] Infiniband and Opensuse 10.3 stock rpms
References: <000601c8f7ad$90eb6d10$05c8a8c0@DIEGO> <4899848B.4080705@dev.mellanox.co.il> <006501c8f7ba$6161db80$05c8a8c0@DIEGO>
Message-ID: <004701c8f7c8$3955bfe0$05c8a8c0@DIEGO>

From: "Hal Rosenstock"
> Hi Diego,

Hi Hal, thanks for your answer

> Are you running the ib_umad kernel module on that machine ?
Does > /sys/class/infiniband_mad exist ? If so, are there any umad* > directories in it ? lsmod shows ib_umad module loaded, with 0 users. The directories exist, here you are their content: ----- Server19:/sys/class # ll | grep infini drwxr-xr-x 3 root root 0 6 ago 15:08 infiniband drwxr-xr-x 6 root root 0 6 ago 13:17 infiniband_mad Server19:/sys/class # cd infiniband_mad Server19:/sys/class/infiniband_mad # ll totale 0 -r--r--r-- 1 root root 4096 6 ago 13:17 abi_version drwxr-xr-x 2 root root 0 6 ago 13:17 issm0 drwxr-xr-x 2 root root 0 6 ago 13:17 issm1 drwxr-xr-x 2 root root 0 6 ago 13:17 umad0 drwxr-xr-x 2 root root 0 6 ago 13:17 umad1 Server19:/sys/class/infiniband_mad # cat abi_version 5 Server19:/sys/class/infiniband_mad # cd umad0/ Server19:/sys/class/infiniband_mad/umad0 # ll totale 0 -r--r--r-- 1 root root 4096 6 ago 15:16 dev lrwxrwxrwx 1 root root 0 6 ago 13:17 device -> ../../../devices/pci0000:00/0000:00:06.0/0000:08:00.0 -r--r--r-- 1 root root 4096 6 ago 13:17 ibdev -r--r--r-- 1 root root 4096 6 ago 13:17 port lrwxrwxrwx 1 root root 0 6 ago 13:17 subsystem -> ../../../class/infiniband_mad --w------- 1 root root 4096 6 ago 15:16 uevent Server19:/sys/class/infiniband_mad/umad0 # cat dev 231:0 Server19:/sys/class/infiniband_mad/umad0 # cat ibdev mthca0 Server19:/sys/class/infiniband_mad/umad0 # cat port 1 Server19:/sys/class/infiniband_mad/umad0 # cd .. Server19:/sys/class/infiniband_mad # cd umad1/ Server19:/sys/class/infiniband_mad/umad1 # ll totale 0 -r--r--r-- 1 root root 4096 6 ago 15:18 dev lrwxrwxrwx 1 root root 0 6 ago 13:17 device -> ../../../devices/pci0000:00/0000:00:06.0/0000:08:00.0 -r--r--r-- 1 root root 4096 6 ago 15:18 ibdev -r--r--r-- 1 root root 4096 6 ago 15:18 port lrwxrwxrwx 1 root root 0 6 ago 13:17 subsystem -> ../../../class/infiniband_mad --w------- 1 root root 4096 6 ago 15:18 uevent Server19:/sys/class/infiniband_mad/umad1 # cat port 2 Server19:/sys/class/infiniband_mad/umad1 # cd .. 
Server19:/sys/class/infiniband_mad # cd issm0/ Server19:/sys/class/infiniband_mad/issm0 # ll totale 0 -r--r--r-- 1 root root 4096 6 ago 15:18 dev lrwxrwxrwx 1 root root 0 6 ago 13:17 device -> ../../../devices/pci0000:00/0000:00:06.0/0000:08:00.0 -r--r--r-- 1 root root 4096 6 ago 15:18 ibdev -r--r--r-- 1 root root 4096 6 ago 15:18 port lrwxrwxrwx 1 root root 0 6 ago 13:17 subsystem -> ../../../class/infiniband_mad --w------- 1 root root 4096 6 ago 15:18 uevent Server19:/sys/class/infiniband_mad/issm0 # cat port 1 ----- Diego From hal.rosenstock at gmail.com Wed Aug 6 06:45:22 2008 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Wed, 6 Aug 2008 09:45:22 -0400 Subject: [ofa-general] Infiniband and Opensuse 10.3 stock rpms In-Reply-To: <004701c8f7c8$3955bfe0$05c8a8c0@DIEGO> References: <000601c8f7ad$90eb6d10$05c8a8c0@DIEGO> <4899848B.4080705@dev.mellanox.co.il> <006501c8f7ba$6161db80$05c8a8c0@DIEGO> <004701c8f7c8$3955bfe0$05c8a8c0@DIEGO> Message-ID: On Wed, Aug 6, 2008 at 9:27 AM, Diego Guella wrote: > > From: "Hal Rosenstock" >> >> Hi Diego, > > Hi Hal, thanks for your answer > >> Are you running the ib_umad kernel module on that machine ? Does >> /sys/class/infiniband_mad exist ? If so, are there any umad* >> directories in it ? > > lsmod shows ib_umad module loaded, with 0 users. 
> > The directories exist, here you are their content: > ----- > Server19:/sys/class # ll | grep infini > drwxr-xr-x 3 root root 0 6 ago 15:08 infiniband > drwxr-xr-x 6 root root 0 6 ago 13:17 infiniband_mad > Server19:/sys/class # cd infiniband_mad > Server19:/sys/class/infiniband_mad # ll > totale 0 > -r--r--r-- 1 root root 4096 6 ago 13:17 abi_version > drwxr-xr-x 2 root root 0 6 ago 13:17 issm0 > drwxr-xr-x 2 root root 0 6 ago 13:17 issm1 > drwxr-xr-x 2 root root 0 6 ago 13:17 umad0 > drwxr-xr-x 2 root root 0 6 ago 13:17 umad1 > Server19:/sys/class/infiniband_mad # cat abi_version > 5 > Server19:/sys/class/infiniband_mad # cd umad0/ > Server19:/sys/class/infiniband_mad/umad0 # ll > totale 0 > -r--r--r-- 1 root root 4096 6 ago 15:16 dev > lrwxrwxrwx 1 root root 0 6 ago 13:17 device -> > ../../../devices/pci0000:00/0000:00:06.0/0000:08:00.0 > -r--r--r-- 1 root root 4096 6 ago 13:17 ibdev > -r--r--r-- 1 root root 4096 6 ago 13:17 port > lrwxrwxrwx 1 root root 0 6 ago 13:17 subsystem -> > ../../../class/infiniband_mad > --w------- 1 root root 4096 6 ago 15:16 uevent > Server19:/sys/class/infiniband_mad/umad0 # cat dev > 231:0 > Server19:/sys/class/infiniband_mad/umad0 # cat ibdev > mthca0 > Server19:/sys/class/infiniband_mad/umad0 # cat port > 1 > Server19:/sys/class/infiniband_mad/umad0 # cd .. > Server19:/sys/class/infiniband_mad # cd umad1/ > Server19:/sys/class/infiniband_mad/umad1 # ll > totale 0 > -r--r--r-- 1 root root 4096 6 ago 15:18 dev > lrwxrwxrwx 1 root root 0 6 ago 13:17 device -> > ../../../devices/pci0000:00/0000:00:06.0/0000:08:00.0 > -r--r--r-- 1 root root 4096 6 ago 15:18 ibdev > -r--r--r-- 1 root root 4096 6 ago 15:18 port > lrwxrwxrwx 1 root root 0 6 ago 13:17 subsystem -> > ../../../class/infiniband_mad > --w------- 1 root root 4096 6 ago 15:18 uevent > Server19:/sys/class/infiniband_mad/umad1 # cat port > 2 > Server19:/sys/class/infiniband_mad/umad1 # cd .. 
> Server19:/sys/class/infiniband_mad # cd issm0/ > Server19:/sys/class/infiniband_mad/issm0 # ll > totale 0 > -r--r--r-- 1 root root 4096 6 ago 15:18 dev > lrwxrwxrwx 1 root root 0 6 ago 13:17 device -> > ../../../devices/pci0000:00/0000:00:06.0/0000:08:00.0 > -r--r--r-- 1 root root 4096 6 ago 15:18 ibdev > -r--r--r-- 1 root root 4096 6 ago 15:18 port > lrwxrwxrwx 1 root root 0 6 ago 13:17 subsystem -> > ../../../class/infiniband_mad > --w------- 1 root root 4096 6 ago 15:18 uevent > Server19:/sys/class/infiniband_mad/issm0 # cat port > 1 > ----- Are you running the diag commands (sminfo) as root ? Not sure if that gives this error. -- Hal > > Diego > > From hal.rosenstock at gmail.com Wed Aug 6 06:46:09 2008 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Wed, 6 Aug 2008 09:46:09 -0400 Subject: [ofa-general] Infiniband and Opensuse 10.3 stock rpms In-Reply-To: References: <000601c8f7ad$90eb6d10$05c8a8c0@DIEGO> <4899848B.4080705@dev.mellanox.co.il> <006501c8f7ba$6161db80$05c8a8c0@DIEGO> <004701c8f7c8$3955bfe0$05c8a8c0@DIEGO> Message-ID: On Wed, Aug 6, 2008 at 9:45 AM, Hal Rosenstock wrote: > On Wed, Aug 6, 2008 at 9:27 AM, Diego Guella > wrote: >> >> From: "Hal Rosenstock" >>> >>> Hi Diego, >> >> Hi Hal, thanks for your answer >> >>> Are you running the ib_umad kernel module on that machine ? Does >>> /sys/class/infiniband_mad exist ? If so, are there any umad* >>> directories in it ? >> >> lsmod shows ib_umad module loaded, with 0 users. 
>> >> The directories exist, here you are their content: >> ----- >> Server19:/sys/class # ll | grep infini >> drwxr-xr-x 3 root root 0 6 ago 15:08 infiniband >> drwxr-xr-x 6 root root 0 6 ago 13:17 infiniband_mad >> Server19:/sys/class # cd infiniband_mad >> Server19:/sys/class/infiniband_mad # ll >> totale 0 >> -r--r--r-- 1 root root 4096 6 ago 13:17 abi_version >> drwxr-xr-x 2 root root 0 6 ago 13:17 issm0 >> drwxr-xr-x 2 root root 0 6 ago 13:17 issm1 >> drwxr-xr-x 2 root root 0 6 ago 13:17 umad0 >> drwxr-xr-x 2 root root 0 6 ago 13:17 umad1 >> Server19:/sys/class/infiniband_mad # cat abi_version >> 5 >> Server19:/sys/class/infiniband_mad # cd umad0/ >> Server19:/sys/class/infiniband_mad/umad0 # ll >> totale 0 >> -r--r--r-- 1 root root 4096 6 ago 15:16 dev >> lrwxrwxrwx 1 root root 0 6 ago 13:17 device -> >> ../../../devices/pci0000:00/0000:00:06.0/0000:08:00.0 >> -r--r--r-- 1 root root 4096 6 ago 13:17 ibdev >> -r--r--r-- 1 root root 4096 6 ago 13:17 port >> lrwxrwxrwx 1 root root 0 6 ago 13:17 subsystem -> >> ../../../class/infiniband_mad >> --w------- 1 root root 4096 6 ago 15:16 uevent >> Server19:/sys/class/infiniband_mad/umad0 # cat dev >> 231:0 >> Server19:/sys/class/infiniband_mad/umad0 # cat ibdev >> mthca0 >> Server19:/sys/class/infiniband_mad/umad0 # cat port >> 1 >> Server19:/sys/class/infiniband_mad/umad0 # cd .. >> Server19:/sys/class/infiniband_mad # cd umad1/ >> Server19:/sys/class/infiniband_mad/umad1 # ll >> totale 0 >> -r--r--r-- 1 root root 4096 6 ago 15:18 dev >> lrwxrwxrwx 1 root root 0 6 ago 13:17 device -> >> ../../../devices/pci0000:00/0000:00:06.0/0000:08:00.0 >> -r--r--r-- 1 root root 4096 6 ago 15:18 ibdev >> -r--r--r-- 1 root root 4096 6 ago 15:18 port >> lrwxrwxrwx 1 root root 0 6 ago 13:17 subsystem -> >> ../../../class/infiniband_mad >> --w------- 1 root root 4096 6 ago 15:18 uevent >> Server19:/sys/class/infiniband_mad/umad1 # cat port >> 2 >> Server19:/sys/class/infiniband_mad/umad1 # cd .. 
>> Server19:/sys/class/infiniband_mad # cd issm0/ >> Server19:/sys/class/infiniband_mad/issm0 # ll >> totale 0 >> -r--r--r-- 1 root root 4096 6 ago 15:18 dev >> lrwxrwxrwx 1 root root 0 6 ago 13:17 device -> >> ../../../devices/pci0000:00/0000:00:06.0/0000:08:00.0 >> -r--r--r-- 1 root root 4096 6 ago 15:18 ibdev >> -r--r--r-- 1 root root 4096 6 ago 15:18 port >> lrwxrwxrwx 1 root root 0 6 ago 13:17 subsystem -> >> ../../../class/infiniband_mad >> --w------- 1 root root 4096 6 ago 15:18 uevent >> Server19:/sys/class/infiniband_mad/issm0 # cat port >> 1 >> ----- > > Are you running the diag commands (sminfo) as root ? Not sure if that > gives this error. One other things: Do you have a mix of IB and iWARP adapters in that machine or only IB ? -- Hal > > -- Hal > >> >> Diego >> >> > From ogerlitz at voltaire.com Wed Aug 6 07:01:46 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Wed, 06 Aug 2008 17:01:46 +0300 Subject: [ofa-general] Re: OFED 1.3 hang in cm_destroy_id() In-Reply-To: <20080805203106.GP27415@sgi.com> References: <20080805203106.GP27415@sgi.com> Message-ID: <4899AECA.5030000@voltaire.com> akepner at sgi.com wrote: > I've gotten a report of a hang very similar to one reported in: > http://lists.openfabrics.org/pipermail/general/2008-June/052275.html > > Here's the backtrace of the hung ipoib task: > > STACK TRACE FOR TASK: 0xe00003600b070000 (ipoib) > > 0 schedule+0x26ec [0xa0000001005a12ac] > 1 wait_for_completion+0x14c [0xa0000001005a198c] > 2 cm_destroy_id+0x66c [0xa00000021531e72c] > 3 ib_destroy_cm_id+0x2c [0xa0000002153210cc] > 4 ipoib_cm_tx_reap+0x17c [0xa000000215719abc] > 5 run_workqueue+0x1dc [0xa0000001000c7f1c] > 6 worker_thread+0x1bc [0xa0000001000c963c] > 7 kthread+0x23c [0xa0000001000d39dc] > 8 kernel_thread_helper+0xcc [0xa0000001000133ec] > 9 start_kernel_thread+0x1c [0xa0000001000094bc] > > > The cm_id->state is IB_CM_TIMEWAIT, and the refcount is 1. > > This is an ia64 system with an MT23108, running OFED 1.3. 
> > Haven't yet been able to reproduce this, however. > Hi Arthur, Given the large set of patches that found their way into OFED 1.3 without passing through kernel acceptance, and the lack of support for OFED from the Linux IB maintainer, I truly believe that the most constructive approach to debugging ipoib issues is to try to reproduce the bug on the mainline kernel code and work with the mainline kernel IB maintainer. Or. From PHF at zurich.ibm.com Wed Aug 6 07:17:25 2008 From: PHF at zurich.ibm.com (Philip Frey1) Date: Wed, 6 Aug 2008 16:17:25 +0200 Subject: [ofa-general] RDMA Write Error Message-ID: An HTML attachment was scrubbed... URL: From sashak at voltaire.com Wed Aug 6 07:35:05 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Wed, 6 Aug 2008 17:35:05 +0300 Subject: [ofa-general] Re: [PATCH] opensm/Makefile.am: add dependency rule In-Reply-To: <4899A3F0.30300@dev.mellanox.co.il> References: <4899A3F0.30300@dev.mellanox.co.il> Message-ID: <20080806143505.GB19158@sashak.voltaire.com> On 16:15 Wed 06 Aug , Yevgeny Kliteynik wrote: > Adding one more dependency rule for a generated header file. > > Signed-off-by: Yevgeny Kliteynik Applied. Thanks. Sasha From dotanba at gmail.com Wed Aug 6 07:41:55 2008 From: dotanba at gmail.com (Dotan Barak) Date: Wed, 06 Aug 2008 16:41:55 +0200 Subject: [ofa-general] RDMA Write Error In-Reply-To: References: Message-ID: <4899B833.2070606@gmail.com> Philip Frey1 wrote: > Hi, > > I am trying to figure out how efficient MR registration followed by an > RDMA write is. 
> For that purpose I am running the following loop: > > // create MR of size 64KB > > for (i = 0; i < max_writes; i++) { > > // destroy old MR > > // create MR of size 64KB > > // RDMA write from new MR to some remote buffer > > } > > > At some point (varying) I get the following error: > > iwch_ev_dispatch - CQE Err qpid 0x3d00 opcode 0 status 0x1 type 1 > wrid.hi 0xb3 wrid.lo 0x0 > post_qp_event - AE qpid 0x3d00 opcode 0 status 0x1 type 1 wrid.hi 0xb3 > wrid.lo 0x0 > > ...which basically tells me that the egress (type 1) RDMA write > (opcode 0) has failed due to an invalid STag > (status 0x1 = STAG invalid: either the STAG is offlimit, being 0 or > STAG_key mismatch). > > The error occurs at ibv_post_send(). > > Here is a trace of the WRs posted shortly before the 'crash': > > wr_id=178 > loc_addr=0x2aaaab64f010 > loc_len=65536 > lkey=4552191 > num_sge=1 > rem_addr=0x2aaaab5d0010 > rkey=1459967 > > wr_id=179 > loc_addr=0x2aaaab65f010 > loc_len=65536 > lkey=4555263 > num_sge=1 > rem_addr=0x2aaaab5e0010 > rkey=1459967 > > ASYNC_EVENT: [QP] Local access violation error > wr_id=180 > loc_addr=0x2aaaab66f010 > loc_len=65536 > lkey=4555519 > num_sge=1 > rem_addr=0x2aaaab5f0010 > rkey=1459967 > ERROR: [rdma_write] failed to post rdma write wr > ERROR: rdma write (180/1000) failed > > > Do you have any idea what could be happening here? I noticed that if I > do signaled writes and wait for each > individual completion, this does not happen. It is also not an issue > when posting RDMA writes of size 32KB. > When using 64KB or larger this happens... but why? I assume that as > soon as ibv_reg_mr() returns I am free > to use the MR, right? Yes. Do you post the RDMA Write and wait for its completion BEFORE deregistering the MR that it references? 
Dotan > > Many thanks for your advice, > Phil From ruimario at gmail.com Wed Aug 6 08:47:14 2008 From: ruimario at gmail.com (Rui Machado) Date: Wed, 6 Aug 2008 17:47:14 +0200 Subject: [ofa-general] limit on memory registration Message-ID: <6978b4af0808060847r7bd16de5p22f186401c3a56e2@mail.gmail.com> Hi all, is there any limitation on the size that can be registered (ibv_reg_mr) for communication? I seem to be limited to 16GB (on 32GB 64bit x86 machine). Is this normal? Can someone tell me why and/or if there is a workaround? Thank you very much for the help. Cheers, Rui From dotanba at gmail.com Wed Aug 6 09:12:56 2008 From: dotanba at gmail.com (Dotan Barak) Date: Wed, 06 Aug 2008 18:12:56 +0200 Subject: [ofa-general] limit on memory registration In-Reply-To: <6978b4af0808060847r7bd16de5p22f186401c3a56e2@mail.gmail.com> References: <6978b4af0808060847r7bd16de5p22f186401c3a56e2@mail.gmail.com> Message-ID: <4899CD88.4030805@gmail.com> Rui Machado wrote: > Hi all, > > is there any limitation on the size that can be registered > (ibv_reg_mr) for communication? > I seem to be limited to 16GB (on 32GB 64bit x86 machine). > Is this normal? Can someone tell me why and/or if there is a workaround? > In the device attributes there is an attribute called max_mr_size, which indicates the maximum block size that can be registered for a device. There is one more limitation, which is device-specific: the size of the virtual <--> physical translation table. You can check the low-level driver of the HW that you are using in order to increase the size of this table. Which HW do you use? Dotan > Thank you very much for the help. 
> > Cheers, > Rui From yosefe at Voltaire.COM Wed Aug 6 09:19:22 2008 From: yosefe at Voltaire.COM (Yossi Etigin) Date: Wed, 06 Aug 2008 19:19:22 +0300 Subject: [ofa-general] [PATCH v2] ipiob: fix rtnl deadlock Message-ID: <4899CF0A.1060509@Voltaire.COM> This fixes bug #1114 in bugzilla, which is a deadlock between ipoib_stop and mcast_join_task. ipoib_stop is called with rtnl_lock held and flushes ipoib_workqueue. The flush operation might wait for mcast_join_task to finish, which in turn might wait for rtnl_lock. Changes from v1: Instead of loop-waiting for the lock, give up if the lock cannot be taken. The same thing is done in drivers/net/cxgb3/cxgb3_main.c. Signed-off-by: Yossi Etigin -- Index: b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c =================================================================== --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 2008-08-04 18:09:33.000000000 +0300 +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 2008-08-06 19:12:19.000000000 +0300 @@ -577,9 +577,11 @@ void ipoib_mcast_join_task(struct work_s priv->mcast_mtu = IPOIB_UD_MTU(ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu)); if (!ipoib_cm_admin_enabled(dev)) { - rtnl_lock(); - dev_set_mtu(dev, min(priv->mcast_mtu, priv->admin_mtu)); - rtnl_unlock(); + /* Avoid deadlock with ipoib_stop */ + if (rtnl_trylock()) { + dev_set_mtu(dev, min(priv->mcast_mtu, priv->admin_mtu)); + rtnl_unlock(); + } } ipoib_dbg_mcast(priv, "successfully joined all multicast groups\n"); -- --Yossi From ruimario at gmail.com Wed Aug 6 09:27:45 2008 From: ruimario at gmail.com (Rui Machado) Date: Wed, 6 Aug 2008 18:27:45 +0200 Subject: [ofa-general] limit on memory registration In-Reply-To: <4899CD88.4030805@gmail.com> References: 
<6978b4af0808060847r7bd16de5p22f186401c3a56e2@mail.gmail.com> <4899CD88.4030805@gmail.com> Message-ID: <6978b4af0808060927ga973867ge65b896d419fcd53@mail.gmail.com> Hi Dotan >> is there any limitation on the size that can be registered >> (ibv_reg_mr) for communication? >> I seem to be limited to 16GB (on 32GB 64bit x86 machine). >> Is this normal? Can someone tell me why and/or if there is a workaround? >> > > In the device attributes there is an attribute called max_mr_size which > indicate the maximum block size > that can be registered for a device. I have a value of 131056 (printed as an int). How can I decode this? > > There is one more limitation which is a device specific for the translation > table virtual <-->physical > You can check the low level driver of the HW that you are using in order to > increase the size > of this table. > > which HW do you use? > Mellanox ca type:25218 (vendor_part_id) fw_version : 5.1.400 (fw_ver) hw_version : a0 (hw_ver) Thank you From dotanba at gmail.com Wed Aug 6 09:43:08 2008 From: dotanba at gmail.com (Dotan Barak) Date: Wed, 06 Aug 2008 18:43:08 +0200 Subject: [ofa-general] limit on memory registration In-Reply-To: <6978b4af0808060927ga973867ge65b896d419fcd53@mail.gmail.com> References: <6978b4af0808060847r7bd16de5p22f186401c3a56e2@mail.gmail.com> <4899CD88.4030805@gmail.com> <6978b4af0808060927ga973867ge65b896d419fcd53@mail.gmail.com> Message-ID: <4899D49C.2040204@gmail.com> Rui Machado wrote: > Hi Dotan > > >>> is there any limitation on the size that can be registered >>> (ibv_reg_mr) for communication? >>> I seem to be limited to 16GB (on 32GB 64bit x86 machine). >>> Is this normal? Can someone tell me why and/or if there is a workaround? >>> >>> >> In the device attributes there is an attribute called max_mr_size which >> indicate the maximum block size >> that can be registered for a device. >> > > I have a value of 131056 (printed as an int). How can I decode this? 
> I have a feeling that you refer to the value of max_mr (am i right?) > >> There is one more limitation which is a device specific for the translation >> table virtual <-->physical >> You can check the low level driver of the HW that you are using in order to >> increase the size >> of this table. >> >> which HW do you use? >> >> > > Mellanox > ca type:25218 (vendor_part_id) > fw_version : 5.1.400 (fw_ver) > hw_version : a0 (hw_ver) > The module parameter "num_mtt" control the size of the above described table. Dotan > Thank you > From sashak at voltaire.com Wed Aug 6 09:54:05 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Wed, 6 Aug 2008 19:54:05 +0300 Subject: [ofa-general] Infiniband and Opensuse 10.3 stock rpms In-Reply-To: <004701c8f7c8$3955bfe0$05c8a8c0@DIEGO> References: <000601c8f7ad$90eb6d10$05c8a8c0@DIEGO> <4899848B.4080705@dev.mellanox.co.il> <006501c8f7ba$6161db80$05c8a8c0@DIEGO> <004701c8f7c8$3955bfe0$05c8a8c0@DIEGO> Message-ID: <20080806165405.GD19158@sashak.voltaire.com> On 15:27 Wed 06 Aug , Diego Guella wrote: > > Server19:/sys/class/infiniband_mad # cd umad0/ > Server19:/sys/class/infiniband_mad/umad0 # ll > totale 0 > -r--r--r-- 1 root root 4096 6 ago 15:16 dev > lrwxrwxrwx 1 root root 0 6 ago 13:17 device -> > ../../../devices/pci0000:00/0000:00:06.0/0000:08:00.0 > -r--r--r-- 1 root root 4096 6 ago 13:17 ibdev > -r--r--r-- 1 root root 4096 6 ago 13:17 port > lrwxrwxrwx 1 root root 0 6 ago 13:17 subsystem -> > ../../../class/infiniband_mad > --w------- 1 root root 4096 6 ago 15:16 uevent Also you will need /dev/infiniband/umad0 entry. Sasha From yosefe at Voltaire.COM Wed Aug 6 10:12:25 2008 From: yosefe at Voltaire.COM (Yossi Etigin) Date: Wed, 06 Aug 2008 20:12:25 +0300 Subject: [ofa-general] [PATCH v2] ib/core: fix for send multicast group send leave retry Message-ID: <4899DB79.2030204@Voltaire.COM> Until now, only if joinning a multicast group failed there was a retry mechanism. 
This patch will add a mechanism that will retry to leave a multicast group before giving up. Changes from v1: - Save the leave state because it's overridden - use 'else' Signed-off-by: Ron Livne Signed-off-by: Yossi Etigin Index: b/drivers/infiniband/core/multicast.c =================================================================== --- a/drivers/infiniband/core/multicast.c 2008-07-07 20:09:15.000000000 +0300 +++ b/drivers/infiniband/core/multicast.c 2008-08-06 20:08:18.000000000 +0300 @@ -106,6 +106,8 @@ struct mcast_group { struct ib_sa_query *query; int query_id; u16 pkey_index; + u8 leave_state; + int retries; }; struct mcast_member { @@ -350,6 +352,7 @@ static int send_leave(struct mcast_group rec = group->rec; rec.join_state = leave_state; + group->leave_state = leave_state; ret = ib_sa_mcmember_rec_query(&sa_client, port->dev->device, port->port_num, IB_SA_METHOD_DELETE, &rec, @@ -542,7 +545,11 @@ static void leave_handler(int status, st { struct mcast_group *group = context; - mcast_work_handler(&group->work); + if (status && (group->retries > 0)) { + send_leave(group, group->leave_state); + group->retries--; + } else + mcast_work_handler(&group->work); } static struct mcast_group *acquire_group(struct mcast_port *port, @@ -565,6 +572,7 @@ static struct mcast_group *acquire_group if (!group) return NULL; + group->retries = 3; group->port = port; group->rec.mgid = *mgid; group->pkey_index = MCAST_INVALID_PKEY_INDEX; -- --Yossi -- --Yossi From dotanba at gmail.com Wed Aug 6 10:19:30 2008 From: dotanba at gmail.com (Dotan Barak) Date: Wed, 06 Aug 2008 19:19:30 +0200 Subject: [ofa-general] limit on memory registration In-Reply-To: <4899D49C.2040204@gmail.com> References: <6978b4af0808060847r7bd16de5p22f186401c3a56e2@mail.gmail.com> <4899CD88.4030805@gmail.com> <6978b4af0808060927ga973867ge65b896d419fcd53@mail.gmail.com> <4899D49C.2040204@gmail.com> Message-ID: <4899DD22.2070003@gmail.com> >> > The module parameter "num_mtt" control the size of the above 
described > table. The default value is (1 << 20), you might try some higher value than this ... Dotan
From wangwhao at cn.ibm.com Wed Aug 6 20:20:53 2008 From: wangwhao at cn.ibm.com (Wen Hao Wang) Date: Thu, 7 Aug 2008 11:20:53 +0800 Subject: [ofa-general] opensm hang and osmtest report ERR 0130 In-Reply-To: <489982FE.5030008@dev.mellanox.co.il> Message-ID: Hi, Yevgeny: Thanks for your answer. >This part is OK - opensm enters the stand-by state and >waits in this state indefinitely. This happened because >opensm detects other opensm in the subnet. >If you kill that other opensm, the stand-by opensm will >enter MASTER state after a short period. >You can see who's the master opensm in your subnet by >running 'sminfo' tool. 
Here is the output of sminfo [root at gaia-07 nodedef]# sminfo sminfo: sm lid 2 sm guid 0x5ad0000094038, activity count 4999946 priority 10 state 3 SMINFO_MASTER [root at gaia-07 nodedef]# ibnetdiscover |grep 0x5ad0000094038 switchguid=0x5ad0000094038(5ad0000094038) [root at gaia-07 nodedef]# ibnetdiscover |grep "lid 2" Switch 24 "S-0005ad0000094038" # "Topspin Switch" enhanced port 0 lid 2 lmc 0 [1](2c903000134f5) "S-0005ad0000094038"[13] # lid 7 lmc 0 "Topspin Switch" lid 2 4xSDR [1](8f1040398b9f1) "S-0005ad0000094038"[11] # lid 8 lmc 0 "Topspin Switch" lid 2 4xSDR [1](8f104039955a5) "S-0005ad0000094038"[10] # lid 6 lmc 0 "Topspin Switch" lid 2 4xSDR [1](8f10403995879) "S-0005ad0000094038"[9] # lid 10 lmc 0 "Topspin Switch" lid 2 4xSDR [1](8f1040398ba19) "S-0005ad0000094038"[7] # lid 9 lmc 0 "Topspin Switch" lid 2 4xSDR [1](8f10403995861) "S-0005ad0000094038"[4] # lid 5 lmc 0 "Topspin Switch" lid 2 4xSDR [1](8f10403995875) "S-0005ad0000094038"[3] # lid 4 lmc 0 "Topspin Switch" lid 2 4xSDR [1](2c90300013371) "S-0005ad0000094038"[14] # lid 3 lmc 0 "Topspin Switch" lid 2 4xSDR It seems the Cisco switch has subnet manager running. >By default, osmtest runs all validation tests, which is similar >to 'osmtest -f a'. This flow expects to get an input inventory file. >You should first run 'osmtest -f c' to create such file, and then >'osmtest' or 'osmtest -f a' to run the tests. >See 'man osmtest' for more details. "osmtest -f c" failed to create the inventory file. 
[root at gaia-07 ~]# osmtest -f c Command Line Arguments Done with args Flow = Create Inventory Aug 07 04:57:04 561325 [516EF3B0] 0x7f -> Setting log level to: 0x03 Aug 07 04:57:04 579744 [516EF3B0] 0x02 -> osm_vendor_bind: Binding to port 0x2c90300013371 Aug 07 04:57:04 602919 [516EF3B0] 0x02 -> osmtest_validate_sa_class_port_info: ----------------------------- SA Class Port Info: base_ver:1 class_ver:2 cap_mask:0x2601 cap_mask2:0x0 resp_time_val:0x14 ----------------------------- Aug 07 04:57:08 604366 [4236E940] 0x01 -> umad_receiver: ERR 5409: send completed with error (method=0x12 attr=0x35 trans_id=0x2a00000004) -- dropping Aug 07 04:57:08 604396 [4236E940] 0x01 -> umad_receiver: ERR 5410: class 0x3 LID 0x2 Aug 07 04:57:08 604420 [4236E940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_TIMEOUT) Aug 07 04:57:08 604454 [516EF3B0] 0x01 -> osmtest_get_all_recs: ERR 0004: ib_query failed (IB_TIMEOUT) Aug 07 04:57:08 604476 [516EF3B0] 0x01 -> osmtest_write_all_path_recs: ERR 0025: osmtest_get_all_recs failed (IB_TIMEOUT) Aug 07 04:57:08 604500 [516EF3B0] 0x01 -> osmtest_run: ERR 0139: Inventory file create failed (IB_TIMEOUT) OSMTEST: TEST "Create Inventory" FAIL Attached is the output of "osmtest -f c -V". (See attached file: output) -- Yevgeny Wen Hao Wang Email: wangwhao at cn.ibm.com -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: output Type: application/octet-stream Size: 18773 bytes Desc: not available URL: From wangwhao at cn.ibm.com Wed Aug 6 20:26:01 2008 From: wangwhao at cn.ibm.com (Wen Hao Wang) Date: Thu, 7 Aug 2008 11:26:01 +0800 Subject: [ofa-general] opensm hang and osmtest report ERR 0130 In-Reply-To: Message-ID: Hi Hal: Thanks for your response. >Is this OpenSM or some embedded SM ? The Cisco switch has an embedded SM running. I have installed opensm on my RHEL5.2 server. 
I expect opensm to work as a slave/standby SM, and the switch to work as the master SM. Is that feasible? >This makes it look like the master SM is not OpenSM. Is that the case ? You are right. The master SM is the Cisco embedded SM on the switch. -- Hal Wen Hao Wang Email: wangwhao at cn.ibm.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdreier at cisco.com Wed Aug 6 20:14:34 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 06 Aug 2008 20:14:34 -0700 Subject: [ofa-general][PATCH] mlx4: Fixes to cqe format In-Reply-To: <488F058B.6040701@mellanox.co.il> (Yevgeny Petrilin's message of "Tue, 29 Jul 2008 14:56:59 +0300") References: <488F058B.6040701@mellanox.co.il> Message-ID: thanks, applied From sashak at voltaire.com Wed Aug 6 22:52:58 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Thu, 7 Aug 2008 08:52:58 +0300 Subject: [ofa-general] [PATCH] opensm: query remote SMs during light sweep Message-ID: <20080807055258.GA14250@sashak.voltaire.com> Remote standby SM(s) may change priority, etc. Somehow this should be detected - query remote SMs for SMInfo in a light sweep. 
Signed-off-by: Sasha Khapyorsky --- opensm/include/opensm/osm_madw.h | 1 + opensm/opensm/osm_sminfo_rcv.c | 11 +++++++++-- opensm/opensm/osm_state_mgr.c | 23 +++++++++++++++++++++++ 3 files changed, 33 insertions(+), 2 deletions(-) diff --git a/opensm/include/opensm/osm_madw.h b/opensm/include/opensm/osm_madw.h index 70649a3..fd39272 100644 --- a/opensm/include/opensm/osm_madw.h +++ b/opensm/include/opensm/osm_madw.h @@ -249,6 +249,7 @@ typedef struct osm_mft_context { typedef struct osm_smi_context { ib_net64_t port_guid; boolean_t set_method; + boolean_t light_sweep; } osm_smi_context_t; /*********/ diff --git a/opensm/opensm/osm_sminfo_rcv.c b/opensm/opensm/osm_sminfo_rcv.c index 47c346d..98c1994 100644 --- a/opensm/opensm/osm_sminfo_rcv.c +++ b/opensm/opensm/osm_sminfo_rcv.c @@ -307,7 +307,8 @@ Exit: **********************************************************************/ static osm_signal_t __osm_sminfo_rcv_process_get_sm(IN osm_sm_t * sm, - IN const osm_remote_sm_t * const p_sm) + IN const osm_remote_sm_t * const p_sm, + boolean_t light_sweep) { const ib_sm_info_t *p_smi; @@ -398,6 +399,11 @@ __osm_sminfo_rcv_process_get_sm(IN osm_sm_t * sm, is done and all SMs are recongnized. */ } break; + case IB_SMINFO_STATE_STANDBY: + if (light_sweep && + __osm_sminfo_rcv_remote_sm_is_higher(sm, p_smi)) + sm->p_subn->force_heavy_sweep = TRUE; + break; default: /* any other state - do nothing */ break; @@ -497,7 +503,8 @@ __osm_sminfo_rcv_process_get_response(IN osm_sm_t * sm, /* We already know this SM. Update the SMInfo attribute. 
*/ p_sm->smi = *p_smi; - __osm_sminfo_rcv_process_get_sm(sm, p_sm); + __osm_sminfo_rcv_process_get_sm(sm, p_sm, + osm_madw_get_smi_context_ptr(p_madw)->light_sweep); _unlock_and_exit: CL_PLOCK_RELEASE(sm->p_lock); diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c index 5c5167f..3cdb2cf 100644 --- a/opensm/opensm/osm_state_mgr.c +++ b/opensm/opensm/osm_state_mgr.c @@ -49,6 +49,7 @@ #include #include #include +#include #include #include #include @@ -495,6 +496,25 @@ Exit: return (status); } +static void query_sm_info(cl_map_item_t *item, void *cxt) +{ + osm_madw_context_t context; + osm_remote_sm_t *r_sm = cl_item_obj(item, r_sm, map_item); + osm_sm_t *sm = cxt; + ib_api_status_t ret; + + context.smi_context.port_guid = r_sm->p_port->guid; + context.smi_context.set_method = FALSE; + context.smi_context.light_sweep = TRUE; + + ret = osm_req_get(sm, osm_physp_get_dr_path_ptr(r_sm->p_port->p_physp), + IB_MAD_ATTR_SM_INFO, 0, CL_DISP_MSGID_NONE, &context); + if (ret != IB_SUCCESS) + OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 3314: " + "Failure requesting SMInfo (%s)\n", + ib_get_err_str(ret)); +} + /********************************************************************** Initiates a lightweight sweep of the subnet. Used during normal sweeps after the subnet is up. @@ -560,6 +580,9 @@ static ib_api_status_t __osm_state_mgr_light_sweep_start(IN osm_sm_t * sm) } } } + + cl_qmap_apply_func(&sm->p_subn->sm_guid_tbl, query_sm_info, sm); + CL_PLOCK_RELEASE(sm->p_lock); _exit: -- 1.5.5.1.178.g1f811 From sashak at voltaire.com Wed Aug 6 23:55:40 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Thu, 7 Aug 2008 09:55:40 +0300 Subject: [ofa-general] [PATCH] opensm: redo lex and yacc files generation Message-ID: <20080807065540.GH14250@sashak.voltaire.com> Use standard automake mechanisms for dealing with yacc and lex files - only *.l and *.y files are listed as SOURCES. 
In addition to this basic change I needed to rename the lex and yacc files to have unique *.c file names, remove the __qos_parser_ prefix (config/ylwrap doesn't deal with it very well) and move header file generation to the local directory. Signed-off-by: Sasha Khapyorsky --- opensm/opensm/Makefile.am | 22 +- opensm/opensm/osm_qos_parser.l | 394 ----- opensm/opensm/osm_qos_parser.y | 3064 -------------------------------------- opensm/opensm/osm_qos_parser_l.l | 394 +++++ opensm/opensm/osm_qos_parser_y.y | 3063 +++++++++++++++++++++++++++++++++++++ 5 files changed, 3460 insertions(+), 3477 deletions(-) delete mode 100644 opensm/opensm/osm_qos_parser.l delete mode 100644 opensm/opensm/osm_qos_parser.y create mode 100644 opensm/opensm/osm_qos_parser_l.l create mode 100644 opensm/opensm/osm_qos_parser_y.y diff --git a/opensm/opensm/Makefile.am b/opensm/opensm/Makefile.am index d14b913..06c27cc 100644 --- a/opensm/opensm/Makefile.am +++ b/opensm/opensm/Makefile.am @@ -59,17 +59,9 @@ opensm_SOURCES = main.c osm_console_io.c osm_console.c osm_db_files.c \ osm_vl15intf.c osm_vl_arb_rcv.c \ st.c osm_perfmgr.c osm_perfmgr_db.c \ osm_event_plugin.c osm_dump.c \ - $(srcdir)/osm_qos_parser_y.c $(srcdir)/osm_qos_parser_l.c \ - osm_qos_policy.c + osm_qos_parser_y.y osm_qos_parser_l.l osm_qos_policy.c -$(srcdir)/osm_qos_parser_y.c: $(srcdir)/osm_qos_parser.y $(srcdir)/../include/opensm/osm_qos_policy.h - $(YACC) -d -o $(srcdir)/osm_qos_parser_y.c -p__qos_parser_ $(srcdir)/osm_qos_parser.y - mv -f $(srcdir)/osm_qos_parser_y.h $(srcdir)/../include/opensm/osm_qos_parser_y.h - -$(srcdir)/osm_qos_parser_l.c: $(srcdir)/osm_qos_parser.l $(srcdir)/../include/opensm/osm_qos_policy.h osm_qos_parser_y.c - $(LEX) -P__qos_parser_ -o$(srcdir)/osm_qos_parser_l.c $(srcdir)/osm_qos_parser.l - -$(srcdir)/../include/opensm/osm_qos_parser_y.h: $(srcdir)/osm_qos_parser_y.c +AM_YFLAGS:= -d if OSMV_OPENIB opensm_CFLAGS = -Wall $(OSMV_CFLAGS) -fno-strict-aliasing -DVENDOR_RMPP_SUPPORT -DDUAL_SIDED_RMPP $(DBGFLAGS) 
-D_XOPEN_SOURCE=600 -D_BSD_SOURCE=1 @@ -127,7 +119,6 @@ opensminclude_HEADERS = \ $(srcdir)/../include/opensm/osm_port.h \ $(srcdir)/../include/opensm/osm_port_profile.h \ $(srcdir)/../include/opensm/osm_prefix_route.h \ - $(srcdir)/../include/opensm/osm_qos_parser_y.h \ $(srcdir)/../include/opensm/osm_qos_policy.h \ $(srcdir)/../include/opensm/osm_rand_fwd_tbl.h \ $(srcdir)/../include/opensm/osm_remote_sm.h \ @@ -159,11 +150,4 @@ osm_version: # headers are distributed as part of the include dir EXTRA_DIST = $(srcdir)/libopensm.map $(srcdir)/libopensm.ver \ - $(srcdir)/ChangeLog \ - $(srcdir)/osm_qos_parser.y $(srcdir)/osm_qos_parser.l - -# generate c and h files from the lex and yacc files -dist-hook: $(srcdir)/osm_qos_parser_y.c $(srcdir)/osm_qos_parser_l.c - -maintainer-clean-generic: - rm -f $(srcdir)/osm_qos_parser_y.c $(srcdir)/osm_qos_parser_l.c + $(srcdir)/ChangeLog diff --git a/opensm/opensm/osm_qos_parser.l b/opensm/opensm/osm_qos_parser.l deleted file mode 100644 index 40e061d..0000000 --- a/opensm/opensm/osm_qos_parser.l +++ /dev/null @@ -1,394 +0,0 @@ -%{ -/* - * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. 
- * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Lexer of OSM QoS parser. - * - * Environment: - * Linux User Mode - * - * Author: - * Yevgeny Kliteynik, Mellanox - */ - -#include -#include - -#define HANDLE_IF_IN_DESCRIPTION if (in_description) { __qos_parser_lval = strdup(__qos_parser_text); return TK_TEXT; } - -#define SAVE_POS save_pos() -static void save_pos(); - -extern int column_num; -extern int line_num; -extern FILE * __qos_parser_in; -extern YYSTYPE __qos_parser_lval; - -boolean_t in_description = FALSE; -boolean_t in_list_of_hex_num_ranges = FALSE; -boolean_t in_node_type = FALSE; -boolean_t in_list_of_numbers = FALSE; -boolean_t in_list_of_strings = FALSE; -boolean_t in_list_of_num_pairs = FALSE; -boolean_t in_asterisk_or_list_of_numbers = FALSE; -boolean_t in_list_of_num_ranges = FALSE; -boolean_t in_single_string = FALSE; -boolean_t in_single_number = FALSE; - -static void reset_new_line_flags(); -#define RESET_NEW_LINE_FLAGS reset_new_line_flags() - -#define START_USE {in_description = TRUE;} /* list of strings including whitespace (description) */ -#define START_PORT_GUID {in_list_of_hex_num_ranges = TRUE;} /* comma-separated list of hex num ranges */ -#define START_PORT_NAME {in_list_of_strings = TRUE;} /* comma-separated list of following strings: 
../../.. */ -#define START_PARTITION {in_single_string = TRUE;} /* single string w/o whitespaces (partition name) */ -#define START_NAME {in_single_string = TRUE;} /* single string w/o whitespaces (port group name) */ -#define START_QOS_LEVEL_NAME {in_single_string = TRUE;} /* single string w/o whitespaces (qos level name in match rule) */ - -#define START_NODE_TYPE {in_node_type = TRUE;} /* comma-separated list of node types (ROUTER,CA,...) */ -#define START_SL2VL_TABLE {in_list_of_numbers = TRUE;} /* comma-separated list of hex or dec numbers */ - -#define START_GROUP {in_list_of_strings = TRUE;} /* list of strings w/o whitespaces (group names) */ -#define START_ACROSS {in_list_of_strings = TRUE;} /* list of strings w/o whitespaces (group names) */ -#define START_ACROSS_TO {in_list_of_strings = TRUE;} /* list of strings w/o whitespaces (group names) */ -#define START_ACROSS_FROM {in_list_of_strings = TRUE;} /* list of strings w/o whitespaces (group names) */ -#define START_SOURCE {in_list_of_strings = TRUE;} /* list of strings w/o whitespaces (group names) */ -#define START_DESTINATION {in_list_of_strings = TRUE;} /* list of strings w/o whitespaces (group names) */ - -#define START_VLARB_HIGH {in_list_of_num_pairs = TRUE;} /* comma-separated list of hex or dec num pairs: "num1:num2" */ -#define START_VLARB_LOW {in_list_of_num_pairs = TRUE;} /* comma-separated list of hex or dec num pairs: "num1:num2" */ - -#define START_TO {in_asterisk_or_list_of_numbers = TRUE;} /* (asterisk) or (comma-separated list of hex or dec numbers) */ -#define START_FROM {in_asterisk_or_list_of_numbers = TRUE;} /* (asterisk) or (comma-separated list of hex or dec numbers) */ - -#define START_PATH_BITS {in_list_of_num_ranges = TRUE;} /* comma-separated list of hex or dec num ranges */ -#define START_QOS_CLASS {in_list_of_num_ranges = TRUE;} /* comma-separated list of hex or dec num ranges */ -#define START_SERVICE_ID {in_list_of_num_ranges = TRUE;} /* comma-separated list of hex or dec 
num ranges */ -#define START_PKEY {in_list_of_num_ranges = TRUE;} /* comma-separated list of hex or dec num ranges */ - -#define START_SL {in_single_number = TRUE;} /* single number */ -#define START_VLARB_HIGH_LIMIT {in_single_number = TRUE;} /* single number */ -#define START_MTU_LIMIT {in_single_number = TRUE;} /* single number */ -#define START_RATE_LIMIT {in_single_number = TRUE;} /* single number */ -#define START_PACKET_LIFE {in_single_number = TRUE;} /* single number */ - -#define START_ULP_DEFAULT {in_single_number = TRUE;} /* single number */ -#define START_ULP_ANY {in_list_of_num_ranges = TRUE;} /* comma-separated list of hex or dec num ranges */ -#define START_ULP_SDP_DEFAULT {in_single_number = TRUE;} /* single number */ -#define START_ULP_SDP_PORT {in_list_of_num_ranges = TRUE;} /* comma-separated list of hex or dec num ranges */ -#define START_ULP_RDS_DEFAULT {in_single_number = TRUE;} /* single number */ -#define START_ULP_RDS_PORT {in_list_of_num_ranges = TRUE;} /* comma-separated list of hex or dec num ranges */ -#define START_ULP_ISER_DEFAULT {in_single_number = TRUE;} /* single number */ -#define START_ULP_ISER_PORT {in_list_of_num_ranges = TRUE;} /* comma-separated list of hex or dec num ranges */ -#define START_ULP_SRP_GUID {in_list_of_num_ranges = TRUE;} /* comma-separated list of hex or dec num ranges */ -#define START_ULP_IPOIB_DEFAULT {in_single_number = TRUE;} /* single number */ -#define START_ULP_IPOIB_PKEY {in_list_of_num_ranges = TRUE;} /* comma-separated list of hex or dec num ranges */ - - -%} - -%option nounput noinput - -QOS_ULPS_START qos\-ulps -QOS_ULPS_END end\-qos\-ulps -PORT_GROUPS_START port\-groups -PORT_GROUPS_END end\-port\-groups -PORT_GROUP_START port\-group -PORT_GROUP_END end\-port\-group -PORT_NUM port\-num -NAME name -USE use -PORT_GUID port\-guid -TARGET_PORT_GUID target\-port\-guid -PORT_NAME port\-name -PARTITION partition -NODE_TYPE node\-type -QOS_SETUP_START qos\-setup -QOS_SETUP_END end\-qos\-setup 
-VLARB_TABLES_START vlarb\-tables -VLARB_TABLES_END end\-vlarb\-tables -VLARB_SCOPE_START vlarb\-scope -VLARB_SCOPE_END end\-vlarb\-scope -GROUP group -ACROSS across -VLARB_HIGH vlarb\-high -VLARB_LOW vlarb\-low -VLARB_HIGH_LIMIT vl\-high\-limit -SL2VL_TABLES_START sl2vl\-tables -SL2VL_TABLES_END end\-sl2vl\-tables -SL2VL_SCOPE_START sl2vl\-scope -SL2VL_SCOPE_END end\-sl2vl\-scope -TO to -FROM from -ACROSS_TO across\-to -ACROSS_FROM across\-from -SL2VL_TABLE sl2vl\-table -QOS_LEVELS_START qos\-levels -QOS_LEVELS_END end\-qos\-levels -QOS_LEVEL_START qos\-level -QOS_LEVEL_END end\-qos\-level -SL sl -MTU_LIMIT mtu\-limit -RATE_LIMIT rate\-limit -PACKET_LIFE packet\-life -PATH_BITS path\-bits -QOS_MATCH_RULES_START qos\-match\-rules -QOS_MATCH_RULES_END end\-qos\-match\-rules -QOS_MATCH_RULE_START qos\-match\-rule -QOS_MATCH_RULE_END end\-qos\-match\-rule -QOS_CLASS qos\-class -SOURCE source -DESTINATION destination -SERVICE_ID service\-id -PKEY pkey -QOS_LEVEL_NAME qos\-level\-name - -ROUTER [Rr][Oo][Uu][Tt][Ee][Rr] -CA [Cc][Aa] -SWITCH [Ss][Ww][Ii][Tt][Cc][Hh] -SELF [Ss][Ee][Ll][Ff] -ALL [Aa][Ll][Ll] - -ULP_SDP [Ss][Dd][Pp] -ULP_SRP [Ss][Rr][Pp] -ULP_RDS [Rr][Dd][Ss] -ULP_IPOIB [Ii][Pp][Oo][Ii][Bb] -ULP_ISER [Ii][Ss][Ee][Rr] -ULP_ANY [Aa][Nn][Yy] -ULP_DEFAULT [Dd][Ee][Ff][Aa][Uu][Ll][Tt] - -WHITE [ \t]+ -NEW_LINE \n -COMMENT \#.*\n -WHITE_DOTDOT_WHITE [ \t]*:[ \t]* -WHITE_COMMA_WHITE [ \t]*,[ \t]* -QUOTED_TEXT \"[^\"]*\" - -%% - - -{COMMENT} { SAVE_POS; RESET_NEW_LINE_FLAGS; } /* swallow comment */ -{WHITE}{NEW_LINE} { SAVE_POS; RESET_NEW_LINE_FLAGS; } /* trailing blanks with new line */ -{WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; } -{NEW_LINE} { SAVE_POS; RESET_NEW_LINE_FLAGS; } - -{QOS_ULPS_START} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_QOS_ULPS_START; } -{QOS_ULPS_END} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_QOS_ULPS_END; } - -{PORT_GROUPS_START} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_PORT_GROUPS_START; } -{PORT_GROUPS_END} { SAVE_POS; 
HANDLE_IF_IN_DESCRIPTION; return TK_PORT_GROUPS_END; } -{PORT_GROUP_START} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_PORT_GROUP_START; } -{PORT_GROUP_END} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_PORT_GROUP_END; } - -{QOS_SETUP_START} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_QOS_SETUP_START; } -{QOS_SETUP_END} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_QOS_SETUP_END; } -{VLARB_TABLES_START} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_VLARB_TABLES_START; } -{VLARB_TABLES_END} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_VLARB_TABLES_END; } -{VLARB_SCOPE_START} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_VLARB_SCOPE_START; } -{VLARB_SCOPE_END} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_VLARB_SCOPE_END; } - -{SL2VL_TABLES_START} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_SL2VL_TABLES_START; } -{SL2VL_TABLES_END} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_SL2VL_TABLES_END; } -{SL2VL_SCOPE_START} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_SL2VL_SCOPE_START; } -{SL2VL_SCOPE_END} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_SL2VL_SCOPE_END; } - -{QOS_LEVELS_START} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_QOS_LEVELS_START; } -{QOS_LEVELS_END} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_QOS_LEVELS_END; } -{QOS_LEVEL_START} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_QOS_LEVEL_START; } -{QOS_LEVEL_END} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_QOS_LEVEL_END; } - -{QOS_MATCH_RULES_START} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_QOS_MATCH_RULES_START; } -{QOS_MATCH_RULES_END} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_QOS_MATCH_RULES_END; } -{QOS_MATCH_RULE_START} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_QOS_MATCH_RULE_START; } -{QOS_MATCH_RULE_END} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_QOS_MATCH_RULE_END; } - -{PORT_GUID}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_PORT_GUID; return TK_PORT_GUID; } -{PORT_NAME}{WHITE_DOTDOT_WHITE} { SAVE_POS; 
HANDLE_IF_IN_DESCRIPTION; START_PORT_NAME; return TK_PORT_NAME; } -{PARTITION}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_PARTITION; return TK_PARTITION; } -{NODE_TYPE}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_NODE_TYPE; return TK_NODE_TYPE; } -{NAME}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_NAME; return TK_NAME; } -{USE}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_USE; return TK_USE; } -{GROUP}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_GROUP; return TK_GROUP; } -{VLARB_HIGH}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_VLARB_HIGH; return TK_VLARB_HIGH; } -{VLARB_LOW}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_VLARB_LOW; return TK_VLARB_LOW; } -{VLARB_HIGH_LIMIT}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_VLARB_HIGH_LIMIT; return TK_VLARB_HIGH_LIMIT;} -{TO}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_TO; return TK_TO; } -{FROM}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_FROM; return TK_FROM; } -{ACROSS_TO}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ACROSS_TO; return TK_ACROSS_TO; } -{ACROSS_FROM}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ACROSS_FROM; return TK_ACROSS_FROM;} -{ACROSS}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ACROSS; return TK_ACROSS; } -{SL2VL_TABLE}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_SL2VL_TABLE; return TK_SL2VL_TABLE;} -{SL}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_SL; return TK_SL; } -{MTU_LIMIT}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_MTU_LIMIT; return TK_MTU_LIMIT; } -{RATE_LIMIT}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_RATE_LIMIT; return TK_RATE_LIMIT; } -{PACKET_LIFE}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_PACKET_LIFE; return TK_PACKET_LIFE;} 
-{PATH_BITS}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_PATH_BITS; return TK_PATH_BITS; } -{QOS_CLASS}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_QOS_CLASS; return TK_QOS_CLASS; } -{SOURCE}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_SOURCE; return TK_SOURCE; } -{DESTINATION}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_DESTINATION; return TK_DESTINATION;} -{SERVICE_ID}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_SERVICE_ID; return TK_SERVICE_ID; } -{PKEY}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_PKEY; return TK_PKEY; } -{QOS_LEVEL_NAME}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_QOS_LEVEL_NAME; return TK_QOS_LEVEL_NAME;} - -{ROUTER} { SAVE_POS; if (in_node_type) return TK_NODE_TYPE_ROUTER; __qos_parser_lval = strdup(__qos_parser_text); return TK_TEXT; } -{CA} { SAVE_POS; if (in_node_type) return TK_NODE_TYPE_CA; __qos_parser_lval = strdup(__qos_parser_text); return TK_TEXT; } -{SWITCH} { SAVE_POS; if (in_node_type) return TK_NODE_TYPE_SWITCH; __qos_parser_lval = strdup(__qos_parser_text); return TK_TEXT; } -{SELF} { SAVE_POS; if (in_node_type) return TK_NODE_TYPE_SELF; __qos_parser_lval = strdup(__qos_parser_text); return TK_TEXT; } -{ALL} { SAVE_POS; if (in_node_type) return TK_NODE_TYPE_ALL; __qos_parser_lval = strdup(__qos_parser_text); return TK_TEXT; } - -{ULP_DEFAULT}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_DEFAULT; return TK_ULP_DEFAULT; } -{ULP_ANY}{WHITE_COMMA_WHITE}{SERVICE_ID} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_ANY; return TK_ULP_ANY_SERVICE_ID; } -{ULP_ANY}{WHITE_COMMA_WHITE}{PKEY} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_ANY; return TK_ULP_ANY_PKEY; } -{ULP_ANY}{WHITE_COMMA_WHITE}{TARGET_PORT_GUID} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_ANY; return TK_ULP_ANY_TARGET_PORT_GUID; } - -{ULP_SDP}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; 
START_ULP_SDP_DEFAULT; return TK_ULP_SDP_DEFAULT; } -{ULP_SDP}{WHITE_COMMA_WHITE}{PORT_NUM} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_SDP_PORT; return TK_ULP_SDP_PORT; } - -{ULP_RDS}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_RDS_DEFAULT; return TK_ULP_RDS_DEFAULT; } -{ULP_RDS}{WHITE_COMMA_WHITE}{PORT_NUM} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_RDS_PORT; return TK_ULP_RDS_PORT; } - -{ULP_ISER}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_SDP_DEFAULT; return TK_ULP_ISER_DEFAULT; } -{ULP_ISER}{WHITE_COMMA_WHITE}{PORT_NUM} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_SDP_PORT; return TK_ULP_ISER_PORT; } - -{ULP_SRP}{WHITE_COMMA_WHITE}{TARGET_PORT_GUID} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_SRP_GUID; return TK_ULP_SRP_GUID; } - -{ULP_IPOIB}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_IPOIB_DEFAULT; return TK_ULP_IPOIB_DEFAULT; } -{ULP_IPOIB}{WHITE_COMMA_WHITE}{PKEY} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_IPOIB_PKEY; return TK_ULP_IPOIB_PKEY; } - -0[xX][0-9a-fA-F]+ { - SAVE_POS; - __qos_parser_lval = strdup(__qos_parser_text); - if (in_description || in_list_of_strings || in_single_string) - return TK_TEXT; - return TK_NUMBER; - } - -[0-9]+ { - SAVE_POS; - __qos_parser_lval = strdup(__qos_parser_text); - if (in_description || in_list_of_strings || in_single_string) - return TK_TEXT; - return TK_NUMBER; - } - - -- { - SAVE_POS; - if (in_description || in_list_of_strings || in_single_string) - { - __qos_parser_lval = strdup(__qos_parser_text); - return TK_TEXT; - } - return TK_DASH; - } - -: { - SAVE_POS; - if (in_description || in_list_of_strings || in_single_string) - { - __qos_parser_lval = strdup(__qos_parser_text); - return TK_TEXT; - } - return TK_DOTDOT; - } - -, { - SAVE_POS; - if (in_description) - { - __qos_parser_lval = strdup(__qos_parser_text); - return TK_TEXT; - } - return TK_COMMA; - } - -\* { - SAVE_POS; - if (in_description || in_list_of_strings 
|| in_single_string) - { - __qos_parser_lval = strdup(__qos_parser_text); - return TK_TEXT; - } - return TK_ASTERISK; - } - -{QUOTED_TEXT} { - SAVE_POS; - __qos_parser_lval = strdup(&__qos_parser_text[1]); - __qos_parser_lval[strlen(__qos_parser_lval)-1] = '\0'; - return TK_TEXT; - } - -. { SAVE_POS; __qos_parser_lval = strdup(__qos_parser_text); return TK_TEXT;} - -%% - - -/********************************************* - *********************************************/ - -static void save_pos() -{ - int i; - for (i = 0; i < __qos_parser_leng; i++) - { - if (__qos_parser_text[i] == '\n') - { - line_num ++; - column_num = 1; - } - else - column_num ++; - } -} - -/********************************************* - *********************************************/ - -static void reset_new_line_flags() -{ - in_description = FALSE; - in_list_of_hex_num_ranges = FALSE; - in_node_type = FALSE; - in_list_of_numbers = FALSE; - in_list_of_strings = FALSE; - in_list_of_num_pairs = FALSE; - in_asterisk_or_list_of_numbers = FALSE; - in_list_of_num_ranges = FALSE; - in_single_string = FALSE; - in_single_number = FALSE; -} diff --git a/opensm/opensm/osm_qos_parser.y b/opensm/opensm/osm_qos_parser.y deleted file mode 100644 index 6fa024c..0000000 --- a/opensm/opensm/osm_qos_parser.y +++ /dev/null @@ -1,3064 +0,0 @@ -%{ -/* - * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * Copyright (c) 2008 HNR Consulting. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. 
You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Grammar of OSM QoS parser. 
- * - * Environment: - * Linux User Mode - * - * Author: - * Yevgeny Kliteynik, Mellanox - */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#define OSM_QOS_POLICY_MAX_LINE_LEN 1024*10 -#define OSM_QOS_POLICY_SL2VL_TABLE_LEN IB_MAX_NUM_VLS -#define OSM_QOS_POLICY_MAX_VL_NUM IB_MAX_NUM_VLS - -typedef struct tmp_parser_struct_t_ { - char str[OSM_QOS_POLICY_MAX_LINE_LEN]; - uint64_t num_pair[2]; - cl_list_t str_list; - cl_list_t num_list; - cl_list_t num_pair_list; -} tmp_parser_struct_t; - -static void __parser_tmp_struct_init(); -static void __parser_tmp_struct_reset(); -static void __parser_tmp_struct_destroy(); - -static char * __parser_strip_white(char * str); - -static void __parser_str2uint64(uint64_t * p_val, char * str); - -static void __parser_port_group_start(); -static int __parser_port_group_end(); - -static void __parser_sl2vl_scope_start(); -static int __parser_sl2vl_scope_end(); - -static void __parser_vlarb_scope_start(); -static int __parser_vlarb_scope_end(); - -static void __parser_qos_level_start(); -static int __parser_qos_level_end(); - -static void __parser_match_rule_start(); -static int __parser_match_rule_end(); - -static void __parser_ulp_match_rule_start(); -static int __parser_ulp_match_rule_end(); - -static void __pkey_rangelist2rangearr( - cl_list_t * p_list, - uint64_t ** * p_arr, - unsigned * p_arr_len); - -static void __rangelist2rangearr( - cl_list_t * p_list, - uint64_t ** * p_arr, - unsigned * p_arr_len); - -static void __merge_rangearr( - uint64_t ** range_arr_1, - unsigned range_len_1, - uint64_t ** range_arr_2, - unsigned range_len_2, - uint64_t ** * p_arr, - unsigned * p_arr_len ); - -static void __parser_add_port_to_port_map( - cl_qmap_t * p_map, - osm_physp_t * p_physp); - -static void __parser_add_guid_range_to_port_map( - cl_qmap_t * p_map, - uint64_t ** range_arr, - unsigned range_len); - -static void __parser_add_pkey_range_to_port_map( - cl_qmap_t * p_map, - 
uint64_t ** range_arr, - unsigned range_len); - -static void __parser_add_partition_list_to_port_map( - cl_qmap_t * p_map, - cl_list_t * p_list); - -static void __parser_add_map_to_port_map( - cl_qmap_t * p_dmap, - cl_map_t * p_smap); - -static int __validate_pkeys( - uint64_t ** range_arr, - unsigned range_len, - boolean_t is_ipoib); - -static void __setup_simple_qos_levels(); -static void __clear_simple_qos_levels(); -static void __setup_ulp_match_rules(); -static void __process_ulp_match_rules(); -static void __qos_parser_error(const char *format, ...); - -extern char * __qos_parser_text; -extern int __qos_parser_lex (void); -extern FILE * __qos_parser_in; -extern int errno; -int __qos_parser_parse(); - -#define RESET_BUFFER __parser_tmp_struct_reset() - -tmp_parser_struct_t tmp_parser_struct; - -int column_num; -int line_num; - -osm_qos_policy_t * p_qos_policy = NULL; -osm_qos_port_group_t * p_current_port_group = NULL; -osm_qos_sl2vl_scope_t * p_current_sl2vl_scope = NULL; -osm_qos_vlarb_scope_t * p_current_vlarb_scope = NULL; -osm_qos_level_t * p_current_qos_level = NULL; -osm_qos_match_rule_t * p_current_qos_match_rule = NULL; -osm_log_t * p_qos_parser_osm_log; - -/* 16 Simple QoS Levels - one for each SL */ -static osm_qos_level_t osm_qos_policy_simple_qos_levels[16]; - -/* Default Simple QoS Level */ -osm_qos_level_t __default_simple_qos_level; - -/* - * List of match rules that will be generated by the - * qos-ulp section. These rules are concatenated to - * the end of the usual matching rules list at the - * end of parsing. 
- */ -static cl_list_t __ulp_match_rules; - -/***************************************************/ - -%} - -%token TK_NUMBER -%token TK_DASH -%token TK_DOTDOT -%token TK_COMMA -%token TK_ASTERISK -%token TK_TEXT - -%token TK_QOS_ULPS_START -%token TK_QOS_ULPS_END - -%token TK_PORT_GROUPS_START -%token TK_PORT_GROUPS_END -%token TK_PORT_GROUP_START -%token TK_PORT_GROUP_END - -%token TK_QOS_SETUP_START -%token TK_QOS_SETUP_END -%token TK_VLARB_TABLES_START -%token TK_VLARB_TABLES_END -%token TK_VLARB_SCOPE_START -%token TK_VLARB_SCOPE_END - -%token TK_SL2VL_TABLES_START -%token TK_SL2VL_TABLES_END -%token TK_SL2VL_SCOPE_START -%token TK_SL2VL_SCOPE_END - -%token TK_QOS_LEVELS_START -%token TK_QOS_LEVELS_END -%token TK_QOS_LEVEL_START -%token TK_QOS_LEVEL_END - -%token TK_QOS_MATCH_RULES_START -%token TK_QOS_MATCH_RULES_END -%token TK_QOS_MATCH_RULE_START -%token TK_QOS_MATCH_RULE_END - -%token TK_NAME -%token TK_USE -%token TK_PORT_GUID -%token TK_PORT_NAME -%token TK_PARTITION -%token TK_NODE_TYPE -%token TK_GROUP -%token TK_ACROSS -%token TK_VLARB_HIGH -%token TK_VLARB_LOW -%token TK_VLARB_HIGH_LIMIT -%token TK_TO -%token TK_FROM -%token TK_ACROSS_TO -%token TK_ACROSS_FROM -%token TK_SL2VL_TABLE -%token TK_SL -%token TK_MTU_LIMIT -%token TK_RATE_LIMIT -%token TK_PACKET_LIFE -%token TK_PATH_BITS -%token TK_QOS_CLASS -%token TK_SOURCE -%token TK_DESTINATION -%token TK_SERVICE_ID -%token TK_QOS_LEVEL_NAME -%token TK_PKEY - -%token TK_NODE_TYPE_ROUTER -%token TK_NODE_TYPE_CA -%token TK_NODE_TYPE_SWITCH -%token TK_NODE_TYPE_SELF -%token TK_NODE_TYPE_ALL - -%token TK_ULP_DEFAULT -%token TK_ULP_ANY_SERVICE_ID -%token TK_ULP_ANY_PKEY -%token TK_ULP_ANY_TARGET_PORT_GUID -%token TK_ULP_SDP_DEFAULT -%token TK_ULP_SDP_PORT -%token TK_ULP_RDS_DEFAULT -%token TK_ULP_RDS_PORT -%token TK_ULP_ISER_DEFAULT -%token TK_ULP_ISER_PORT -%token TK_ULP_SRP_GUID -%token TK_ULP_IPOIB_DEFAULT -%token TK_ULP_IPOIB_PKEY - -%start head - -%% - -head: qos_policy_entries - ; - 
-qos_policy_entries: /* empty */ - | qos_policy_entries qos_policy_entry - ; - -qos_policy_entry: qos_ulps_section - | port_groups_section - | qos_setup_section - | qos_levels_section - | qos_match_rules_section - ; - - /* - * Parsing qos-ulps: - * ------------------- - * qos-ulps - * default : 0 #default SL - * sdp, port-num 30000 : 1 #SL for SDP when destination port is 30000 - * sdp, port-num 10000-20000 : 2 - * sdp : 0 #default SL for SDP - * srp, target-port-guid 0x1234 : 2 - * rds, port-num 25000 : 2 #SL for RDS when destination port is 25000 - * rds, : 0 #default SL for RDS - * iser, port-num 900 : 5 #SL for iSER where target port is 900 - * iser : 4 #default SL for iSER - * ipoib, pkey 0x0001 : 5 #SL for IPoIB on partition with pkey 0x0001 - * ipoib : 6 #default IPoIB partition - pkey=0x7FFF - * any, service-id 0x6234 : 2 - * any, pkey 0x0ABC : 3 - * any, target-port-guid 0x0ABC-0xFFFFF : 6 - * end-qos-ulps - */ - -qos_ulps_section: TK_QOS_ULPS_START qos_ulps TK_QOS_ULPS_END - ; - -qos_ulps: qos_ulp - | qos_ulps qos_ulp - ; - - /* - * Parsing port groups: - * ------------------- - * port-groups - * port-group - * name: Storage - * use: our SRP storage targets - * port-guid: 0x1000000000000001,0x1000000000000002 - * ... - * port-name: vs1 HCA-1/P1 - * port-name: node_description/P2 - * ... - * pkey: 0x00FF-0x0FFF - * ... - * partition: Part1 - * ... - * node-type: ROUTER,CA,SWITCH,SELF,ALL - * ... - * end-port-group - * port-group - * ... 
- * end-port-group - * end-port-groups - */ - - -port_groups_section: TK_PORT_GROUPS_START port_groups TK_PORT_GROUPS_END - ; - -port_groups: port_group - | port_groups port_group - ; - -port_group: port_group_start port_group_entries port_group_end - ; - -port_group_start: TK_PORT_GROUP_START { - __parser_port_group_start(); - } - ; - -port_group_end: TK_PORT_GROUP_END { - if ( __parser_port_group_end() ) - return 1; - } - ; - -port_group_entries: /* empty */ - | port_group_entries port_group_entry - ; - -port_group_entry: port_group_name - | port_group_use - | port_group_port_guid - | port_group_port_name - | port_group_pkey - | port_group_partition - | port_group_node_type - ; - - - /* - * Parsing qos setup: - * ----------------- - * qos-setup - * vlarb-tables - * vlarb-scope - * ... - * end-vlarb-scope - * vlarb-scope - * ... - * end-vlarb-scope - * end-vlarb-tables - * sl2vl-tables - * sl2vl-scope - * ... - * end-sl2vl-scope - * sl2vl-scope - * ... - * end-sl2vl-scope - * end-sl2vl-tables - * end-qos-setup - */ - -qos_setup_section: TK_QOS_SETUP_START qos_setup_items TK_QOS_SETUP_END - ; - -qos_setup_items: /* empty */ - | qos_setup_items vlarb_tables - | qos_setup_items sl2vl_tables - ; - - /* Parsing vlarb-tables */ - -vlarb_tables: TK_VLARB_TABLES_START vlarb_scope_items TK_VLARB_TABLES_END - ; - -vlarb_scope_items: /* empty */ - | vlarb_scope_items vlarb_scope - ; - -vlarb_scope: vlarb_scope_start vlarb_scope_entries vlarb_scope_end - ; - -vlarb_scope_start: TK_VLARB_SCOPE_START { - __parser_vlarb_scope_start(); - } - ; - -vlarb_scope_end: TK_VLARB_SCOPE_END { - if ( __parser_vlarb_scope_end() ) - return 1; - } - ; - -vlarb_scope_entries:/* empty */ - | vlarb_scope_entries vlarb_scope_entry - ; - - /* - * vlarb-scope - * group: Storage - * ... - * across: Storage - * ... 
- * vlarb-high: 0:255,1:127,2:63,3:31,4:15,5:7,6:3,7:1 - * vlarb-low: 8:255,9:127,10:63,11:31,12:15,13:7,14:3 - * vl-high-limit: 10 - * end-vlarb-scope - */ - -vlarb_scope_entry: vlarb_scope_group - | vlarb_scope_across - | vlarb_scope_vlarb_high - | vlarb_scope_vlarb_low - | vlarb_scope_vlarb_high_limit - ; - - /* Parsing sl2vl-tables */ - -sl2vl_tables: TK_SL2VL_TABLES_START sl2vl_scope_items TK_SL2VL_TABLES_END - ; - -sl2vl_scope_items: /* empty */ - | sl2vl_scope_items sl2vl_scope - ; - -sl2vl_scope: sl2vl_scope_start sl2vl_scope_entries sl2vl_scope_end - ; - -sl2vl_scope_start: TK_SL2VL_SCOPE_START { - __parser_sl2vl_scope_start(); - } - ; - -sl2vl_scope_end: TK_SL2VL_SCOPE_END { - if ( __parser_sl2vl_scope_end() ) - return 1; - } - ; - -sl2vl_scope_entries:/* empty */ - | sl2vl_scope_entries sl2vl_scope_entry - ; - - /* - * sl2vl-scope - * group: Part1 - * ... - * from: * - * ... - * to: * - * ... - * across-to: Storage2 - * ... - * across-from: Storage1 - * ... - * sl2vl-table: 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,7 - * end-sl2vl-scope - */ - -sl2vl_scope_entry: sl2vl_scope_group - | sl2vl_scope_across - | sl2vl_scope_across_from - | sl2vl_scope_across_to - | sl2vl_scope_from - | sl2vl_scope_to - | sl2vl_scope_sl2vl_table - ; - - /* - * Parsing qos-levels: - * ------------------ - * qos-levels - * qos-level - * name: qos_level_1 - * use: for the lowest priority communication - * sl: 15 - * mtu-limit: 1 - * rate-limit: 1 - * packet-life: 12 - * path-bits: 2,4,8-32 - * pkey: 0x00FF-0x0FFF - * end-qos-level - * ... 
-     *     qos-level
-     *     end-qos-level
-     * end-qos-levels
-     */
-
-
-qos_levels_section: TK_QOS_LEVELS_START qos_levels TK_QOS_LEVELS_END
-    ;
-
-qos_levels: /* empty */
-    | qos_levels qos_level
-    ;
-
-qos_level: qos_level_start qos_level_entries qos_level_end
-    ;
-
-qos_level_start: TK_QOS_LEVEL_START {
-        __parser_qos_level_start();
-    }
-    ;
-
-qos_level_end: TK_QOS_LEVEL_END {
-        if ( __parser_qos_level_end() )
-            return 1;
-    }
-    ;
-
-qos_level_entries: /* empty */
-    | qos_level_entries qos_level_entry
-    ;
-
-qos_level_entry: qos_level_name
-    | qos_level_use
-    | qos_level_sl
-    | qos_level_mtu_limit
-    | qos_level_rate_limit
-    | qos_level_packet_life
-    | qos_level_path_bits
-    | qos_level_pkey
-    ;
-
-    /*
-     * Parsing qos-match-rules:
-     * -----------------------
-     * qos-match-rules
-     *     qos-match-rule
-     *         use: low latency by class 7-9 or 11 and bla bla
-     *         qos-class: 7-9,11
-     *         qos-level-name: default
-     *         source: Storage
-     *         destination: Storage
-     *         service-id: 22,4719-5000
-     *         pkey: 0x00FF-0x0FFF
-     *     end-qos-match-rule
-     *     qos-match-rule
-     *         ...
-     *     end-qos-match-rule
-     * end-qos-match-rules
-     */
-
-qos_match_rules_section: TK_QOS_MATCH_RULES_START qos_match_rules TK_QOS_MATCH_RULES_END
-    ;
-
-qos_match_rules: /* empty */
-    | qos_match_rules qos_match_rule
-    ;
-
-qos_match_rule: qos_match_rule_start qos_match_rule_entries qos_match_rule_end
-    ;
-
-qos_match_rule_start: TK_QOS_MATCH_RULE_START {
-        __parser_match_rule_start();
-    }
-    ;
-
-qos_match_rule_end: TK_QOS_MATCH_RULE_END {
-        if ( __parser_match_rule_end() )
-            return 1;
-    }
-    ;
-
-qos_match_rule_entries: /* empty */
-    | qos_match_rule_entries qos_match_rule_entry
-    ;
-
-qos_match_rule_entry: qos_match_rule_use
-    | qos_match_rule_qos_class
-    | qos_match_rule_qos_level_name
-    | qos_match_rule_source
-    | qos_match_rule_destination
-    | qos_match_rule_service_id
-    | qos_match_rule_pkey
-    ;
-
-
-    /*
-     * Parsing qos-ulps:
-     * -----------------
-     * default
-     * sdp
-     * sdp with port-num
-     * rds
-     * rds with port-num
-     * srp with port-guid
-     * iser
-     * iser with port-num
-     * ipoib
-     * ipoib with pkey
-     * any with service-id
-     * any with pkey
-     * any with target-port-guid
-     */
-
-qos_ulp: TK_ULP_DEFAULT single_number {
-        /* parsing default ulp rule: "default: num" */
-        cl_list_iterator_t list_iterator;
-        uint64_t * p_tmp_num;
-
-        list_iterator = cl_list_head(&tmp_parser_struct.num_list);
-        p_tmp_num = (uint64_t*)cl_list_obj(list_iterator);
-        if (*p_tmp_num > 15)
-        {
-            __qos_parser_error("illegal SL value");
-            return 1;
-        }
-        __default_simple_qos_level.sl = (uint8_t)(*p_tmp_num);
-        __default_simple_qos_level.sl_set = TRUE;
-        free(p_tmp_num);
-        cl_list_remove_all(&tmp_parser_struct.num_list);
-    }
-
-    | qos_ulp_type_any_service list_of_ranges TK_DOTDOT {
-        /* "any, service-id ...
: sl" - one instance of list of ranges */ - uint64_t ** range_arr; - unsigned range_len; - - if (!cl_list_count(&tmp_parser_struct.num_pair_list)) - { - __qos_parser_error("ULP rule doesn't have service ids"); - return 1; - } - - /* get all the service id ranges */ - __rangelist2rangearr( &tmp_parser_struct.num_pair_list, - &range_arr, - &range_len ); - - p_current_qos_match_rule->service_id_range_arr = range_arr; - p_current_qos_match_rule->service_id_range_len = range_len; - - } qos_ulp_sl - - | qos_ulp_type_any_pkey list_of_ranges TK_DOTDOT { - /* "any, pkey ... : sl" - one instance of list of ranges */ - uint64_t ** range_arr; - unsigned range_len; - - if (!cl_list_count(&tmp_parser_struct.num_pair_list)) - { - __qos_parser_error("ULP rule doesn't have pkeys"); - return 1; - } - - /* get all the pkey ranges */ - __pkey_rangelist2rangearr( &tmp_parser_struct.num_pair_list, - &range_arr, - &range_len ); - - p_current_qos_match_rule->pkey_range_arr = range_arr; - p_current_qos_match_rule->pkey_range_len = range_len; - - } qos_ulp_sl - - | qos_ulp_type_any_target_port_guid list_of_ranges TK_DOTDOT { - /* any, target-port-guid ... 
: sl */ - uint64_t ** range_arr; - unsigned range_len; - - if (!cl_list_count(&tmp_parser_struct.num_pair_list)) - { - __qos_parser_error("ULP rule doesn't have port guids"); - return 1; - } - - /* create a new port group with these ports */ - __parser_port_group_start(); - - p_current_port_group->name = strdup("_ULP_Targets_"); - p_current_port_group->use = strdup("Generated from ULP rules"); - - __rangelist2rangearr( &tmp_parser_struct.num_pair_list, - &range_arr, - &range_len ); - - __parser_add_guid_range_to_port_map( - &p_current_port_group->port_map, - range_arr, - range_len); - - /* add this port group to the destination - groups of the current match rule */ - cl_list_insert_tail(&p_current_qos_match_rule->destination_group_list, - p_current_port_group); - - __parser_port_group_end(); - - } qos_ulp_sl - - | qos_ulp_type_sdp_default { - /* "sdp : sl" - default SL for SDP */ - uint64_t ** range_arr = - (uint64_t **)malloc(sizeof(uint64_t *)); - range_arr[0] = (uint64_t *)malloc(2*sizeof(uint64_t)); - range_arr[0][0] = OSM_QOS_POLICY_ULP_SDP_SERVICE_ID; - range_arr[0][1] = OSM_QOS_POLICY_ULP_SDP_SERVICE_ID + 0xFFFF; - - p_current_qos_match_rule->service_id_range_arr = range_arr; - p_current_qos_match_rule->service_id_range_len = 1; - - } qos_ulp_sl - - | qos_ulp_type_sdp_port list_of_ranges TK_DOTDOT { - /* sdp with port numbers */ - uint64_t ** range_arr; - unsigned range_len; - unsigned i; - - if (!cl_list_count(&tmp_parser_struct.num_pair_list)) - { - __qos_parser_error("SDP ULP rule doesn't have port numbers"); - return 1; - } - - /* get all the port ranges */ - __rangelist2rangearr( &tmp_parser_struct.num_pair_list, - &range_arr, - &range_len ); - /* now translate these port numbers into service ids */ - for (i = 0; i < range_len; i++) - { - if (range_arr[i][0] > 0xFFFF || range_arr[i][1] > 0xFFFF) - { - __qos_parser_error("SDP port number out of range"); - return 1; - } - range_arr[i][0] += OSM_QOS_POLICY_ULP_SDP_SERVICE_ID; - range_arr[i][1] += 
OSM_QOS_POLICY_ULP_SDP_SERVICE_ID; - } - - p_current_qos_match_rule->service_id_range_arr = range_arr; - p_current_qos_match_rule->service_id_range_len = range_len; - - } qos_ulp_sl - - | qos_ulp_type_rds_default { - /* "rds : sl" - default SL for RDS */ - uint64_t ** range_arr = - (uint64_t **)malloc(sizeof(uint64_t *)); - range_arr[0] = (uint64_t *)malloc(2*sizeof(uint64_t)); - range_arr[0][0] = range_arr[0][1] = - OSM_QOS_POLICY_ULP_RDS_SERVICE_ID + OSM_QOS_POLICY_ULP_RDS_PORT; - - p_current_qos_match_rule->service_id_range_arr = range_arr; - p_current_qos_match_rule->service_id_range_len = 1; - - } qos_ulp_sl - - | qos_ulp_type_rds_port list_of_ranges TK_DOTDOT { - /* rds with port numbers */ - uint64_t ** range_arr; - unsigned range_len; - unsigned i; - - if (!cl_list_count(&tmp_parser_struct.num_pair_list)) - { - __qos_parser_error("RDS ULP rule doesn't have port numbers"); - return 1; - } - - /* get all the port ranges */ - __rangelist2rangearr( &tmp_parser_struct.num_pair_list, - &range_arr, - &range_len ); - /* now translate these port numbers into service ids */ - for (i = 0; i < range_len; i++) - { - if (range_arr[i][0] > 0xFFFF || range_arr[i][1] > 0xFFFF) - { - __qos_parser_error("SDP port number out of range"); - return 1; - } - range_arr[i][0] += OSM_QOS_POLICY_ULP_RDS_SERVICE_ID; - range_arr[i][1] += OSM_QOS_POLICY_ULP_RDS_SERVICE_ID; - } - - p_current_qos_match_rule->service_id_range_arr = range_arr; - p_current_qos_match_rule->service_id_range_len = range_len; - - } qos_ulp_sl - - | qos_ulp_type_iser_default { - /* "iSER : sl" - default SL for iSER */ - uint64_t ** range_arr = - (uint64_t **)malloc(sizeof(uint64_t *)); - range_arr[0] = (uint64_t *)malloc(2*sizeof(uint64_t)); - range_arr[0][0] = range_arr[0][1] = - OSM_QOS_POLICY_ULP_ISER_SERVICE_ID + OSM_QOS_POLICY_ULP_ISER_PORT; - - p_current_qos_match_rule->service_id_range_arr = range_arr; - p_current_qos_match_rule->service_id_range_len = 1; - - } qos_ulp_sl - - | qos_ulp_type_iser_port 
list_of_ranges TK_DOTDOT { - /* iser with port numbers */ - uint64_t ** range_arr; - unsigned range_len; - unsigned i; - - if (!cl_list_count(&tmp_parser_struct.num_pair_list)) - { - __qos_parser_error("iSER ULP rule doesn't have port numbers"); - return 1; - } - - /* get all the port ranges */ - __rangelist2rangearr( &tmp_parser_struct.num_pair_list, - &range_arr, - &range_len ); - /* now translate these port numbers into service ids */ - for (i = 0; i < range_len; i++) - { - if (range_arr[i][0] > 0xFFFF || range_arr[i][1] > 0xFFFF) - { - __qos_parser_error("SDP port number out of range"); - return 1; - } - range_arr[i][0] += OSM_QOS_POLICY_ULP_ISER_SERVICE_ID; - range_arr[i][1] += OSM_QOS_POLICY_ULP_ISER_SERVICE_ID; - } - - p_current_qos_match_rule->service_id_range_arr = range_arr; - p_current_qos_match_rule->service_id_range_len = range_len; - - } qos_ulp_sl - - | qos_ulp_type_srp_guid list_of_ranges TK_DOTDOT { - /* srp with target guids - this rule is similar - to writing 'any' ulp with target port guids */ - uint64_t ** range_arr; - unsigned range_len; - - if (!cl_list_count(&tmp_parser_struct.num_pair_list)) - { - __qos_parser_error("SRP ULP rule doesn't have port guids"); - return 1; - } - - /* create a new port group with these ports */ - __parser_port_group_start(); - - p_current_port_group->name = strdup("_SRP_Targets_"); - p_current_port_group->use = strdup("Generated from ULP rules"); - - __rangelist2rangearr( &tmp_parser_struct.num_pair_list, - &range_arr, - &range_len ); - - __parser_add_guid_range_to_port_map( - &p_current_port_group->port_map, - range_arr, - range_len); - - /* add this port group to the destination - groups of the current match rule */ - cl_list_insert_tail(&p_current_qos_match_rule->destination_group_list, - p_current_port_group); - - __parser_port_group_end(); - - } qos_ulp_sl - - | qos_ulp_type_ipoib_default { - /* ipoib w/o any pkeys (default pkey) */ - uint64_t ** range_arr = - (uint64_t **)malloc(sizeof(uint64_t *)); - 
range_arr[0] = (uint64_t *)malloc(2*sizeof(uint64_t)); - range_arr[0][0] = range_arr[0][1] = 0x7fff; - - /* - * Although we know that the default partition exists, - * we still need to validate it by checking that it has - * at least two full members. Otherwise IPoIB won't work. - */ - if (__validate_pkeys(range_arr, 1, TRUE)) - return 1; - - p_current_qos_match_rule->pkey_range_arr = range_arr; - p_current_qos_match_rule->pkey_range_len = 1; - - } qos_ulp_sl - - | qos_ulp_type_ipoib_pkey list_of_ranges TK_DOTDOT { - /* ipoib with pkeys */ - uint64_t ** range_arr; - unsigned range_len; - - if (!cl_list_count(&tmp_parser_struct.num_pair_list)) - { - __qos_parser_error("IPoIB ULP rule doesn't have pkeys"); - return 1; - } - - /* get all the pkey ranges */ - __pkey_rangelist2rangearr( &tmp_parser_struct.num_pair_list, - &range_arr, - &range_len ); - - /* - * Validate pkeys. - * For IPoIB pkeys the validation is strict. - * If some problem would be found, parsing will - * be aborted with a proper error messages. 
- */ - if (__validate_pkeys(range_arr, range_len, TRUE)) - return 1; - - p_current_qos_match_rule->pkey_range_arr = range_arr; - p_current_qos_match_rule->pkey_range_len = range_len; - - } qos_ulp_sl - ; - -qos_ulp_type_any_service: TK_ULP_ANY_SERVICE_ID - { __parser_ulp_match_rule_start(); }; - -qos_ulp_type_any_pkey: TK_ULP_ANY_PKEY - { __parser_ulp_match_rule_start(); }; - -qos_ulp_type_any_target_port_guid: TK_ULP_ANY_TARGET_PORT_GUID - { __parser_ulp_match_rule_start(); }; - -qos_ulp_type_sdp_default: TK_ULP_SDP_DEFAULT - { __parser_ulp_match_rule_start(); }; - -qos_ulp_type_sdp_port: TK_ULP_SDP_PORT - { __parser_ulp_match_rule_start(); }; - -qos_ulp_type_rds_default: TK_ULP_RDS_DEFAULT - { __parser_ulp_match_rule_start(); }; - -qos_ulp_type_rds_port: TK_ULP_RDS_PORT - { __parser_ulp_match_rule_start(); }; - -qos_ulp_type_iser_default: TK_ULP_ISER_DEFAULT - { __parser_ulp_match_rule_start(); }; - -qos_ulp_type_iser_port: TK_ULP_ISER_PORT - { __parser_ulp_match_rule_start(); }; - -qos_ulp_type_srp_guid: TK_ULP_SRP_GUID - { __parser_ulp_match_rule_start(); }; - -qos_ulp_type_ipoib_default: TK_ULP_IPOIB_DEFAULT - { __parser_ulp_match_rule_start(); }; - -qos_ulp_type_ipoib_pkey: TK_ULP_IPOIB_PKEY - { __parser_ulp_match_rule_start(); }; - - -qos_ulp_sl: single_number { - /* get the SL for ULP rules */ - cl_list_iterator_t list_iterator; - uint64_t * p_tmp_num; - uint8_t sl; - - list_iterator = cl_list_head(&tmp_parser_struct.num_list); - p_tmp_num = (uint64_t*)cl_list_obj(list_iterator); - if (*p_tmp_num > 15) - { - __qos_parser_error("illegal SL value"); - return 1; - } - - sl = (uint8_t)(*p_tmp_num); - free(p_tmp_num); - cl_list_remove_all(&tmp_parser_struct.num_list); - - p_current_qos_match_rule->p_qos_level = - &osm_qos_policy_simple_qos_levels[sl]; - p_current_qos_match_rule->qos_level_name = - strdup(osm_qos_policy_simple_qos_levels[sl].name); - - if (__parser_ulp_match_rule_end()) - return 1; - } - ; - - /* - * port_group_entry values: - * port_group_name - 
* port_group_use - * port_group_port_guid - * port_group_port_name - * port_group_pkey - * port_group_partition - * port_group_node_type - */ - -port_group_name: port_group_name_start single_string { - /* 'name' of 'port-group' - one instance */ - cl_list_iterator_t list_iterator; - char * tmp_str; - - if (p_current_port_group->name) - { - __qos_parser_error("port-group has multiple 'name' tags"); - cl_list_remove_all(&tmp_parser_struct.str_list); - return 1; - } - - list_iterator = cl_list_head(&tmp_parser_struct.str_list); - if ( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) - { - tmp_str = (char*)cl_list_obj(list_iterator); - if (tmp_str) - p_current_port_group->name = tmp_str; - } - cl_list_remove_all(&tmp_parser_struct.str_list); - } - ; - -port_group_name_start: TK_NAME { - RESET_BUFFER; - } - ; - -port_group_use: port_group_use_start single_string { - /* 'use' of 'port-group' - one instance */ - cl_list_iterator_t list_iterator; - char * tmp_str; - - if (p_current_port_group->use) - { - __qos_parser_error("port-group has multiple 'use' tags"); - cl_list_remove_all(&tmp_parser_struct.str_list); - return 1; - } - - list_iterator = cl_list_head(&tmp_parser_struct.str_list); - if ( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) - { - tmp_str = (char*)cl_list_obj(list_iterator); - if (tmp_str) - p_current_port_group->use = tmp_str; - } - cl_list_remove_all(&tmp_parser_struct.str_list); - } - ; - -port_group_use_start: TK_USE { - RESET_BUFFER; - } - ; - -port_group_port_name: port_group_port_name_start string_list { - /* 'port-name' in 'port-group' - any num of instances */ - cl_list_iterator_t list_iterator; - osm_node_t * p_node; - osm_physp_t * p_physp; - unsigned port_num; - char * tmp_str; - char * port_str; - - /* parsing port name strings */ - for (list_iterator = cl_list_head(&tmp_parser_struct.str_list); - list_iterator != cl_list_end(&tmp_parser_struct.str_list); - list_iterator = cl_list_next(list_iterator)) - { - tmp_str = 
(char*)cl_list_obj(list_iterator); - if (tmp_str) - { - /* last slash in port name string is a separator - between node name and port number */ - port_str = strrchr(tmp_str, '/'); - if (!port_str || (strlen(port_str) < 3) || - (port_str[1] != 'p' && port_str[1] != 'P')) { - __qos_parser_error("'%s' - illegal port name", - tmp_str); - free(tmp_str); - cl_list_remove_all(&tmp_parser_struct.str_list); - return 1; - } - - if (!(port_num = strtoul(&port_str[2],NULL,0))) { - __qos_parser_error( - "'%s' - illegal port number in port name", - tmp_str); - free(tmp_str); - cl_list_remove_all(&tmp_parser_struct.str_list); - return 1; - } - - /* separate node name from port number */ - port_str[0] = '\0'; - - if (st_lookup(p_qos_policy->p_node_hash, - (st_data_t)tmp_str, - (st_data_t*)&p_node)) - { - /* we found the node, now get the right port */ - p_physp = osm_node_get_physp_ptr(p_node, port_num); - if (!p_physp) { - __qos_parser_error( - "'%s' - port number out of range in port name", - tmp_str); - free(tmp_str); - cl_list_remove_all(&tmp_parser_struct.str_list); - return 1; - } - /* we found the port, now add it to guid table */ - __parser_add_port_to_port_map(&p_current_port_group->port_map, - p_physp); - } - free(tmp_str); - } - } - cl_list_remove_all(&tmp_parser_struct.str_list); - } - ; - -port_group_port_name_start: TK_PORT_NAME { - RESET_BUFFER; - } - ; - -port_group_port_guid: port_group_port_guid_start list_of_ranges { - /* 'port-guid' in 'port-group' - any num of instances */ - /* list of guid ranges */ - if (cl_list_count(&tmp_parser_struct.num_pair_list)) - { - uint64_t ** range_arr; - unsigned range_len; - - __rangelist2rangearr( &tmp_parser_struct.num_pair_list, - &range_arr, - &range_len ); - - __parser_add_guid_range_to_port_map( - &p_current_port_group->port_map, - range_arr, - range_len); - } - } - ; - -port_group_port_guid_start: TK_PORT_GUID { - RESET_BUFFER; - } - ; - -port_group_pkey: port_group_pkey_start list_of_ranges { - /* 'pkey' in 'port-group' 
- any num of instances */ - /* list of pkey ranges */ - if (cl_list_count(&tmp_parser_struct.num_pair_list)) - { - uint64_t ** range_arr; - unsigned range_len; - - __pkey_rangelist2rangearr( &tmp_parser_struct.num_pair_list, - &range_arr, - &range_len ); - - __parser_add_pkey_range_to_port_map( - &p_current_port_group->port_map, - range_arr, - range_len); - } - } - ; - -port_group_pkey_start: TK_PKEY { - RESET_BUFFER; - } - ; - -port_group_partition: port_group_partition_start string_list { - /* 'partition' in 'port-group' - any num of instances */ - __parser_add_partition_list_to_port_map( - &p_current_port_group->port_map, - &tmp_parser_struct.str_list); - } - ; - -port_group_partition_start: TK_PARTITION { - RESET_BUFFER; - } - ; - -port_group_node_type: port_group_node_type_start port_group_node_type_list { - /* 'node-type' in 'port-group' - any num of instances */ - } - ; - -port_group_node_type_start: TK_NODE_TYPE { - RESET_BUFFER; - } - ; - -port_group_node_type_list: node_type_item - | port_group_node_type_list TK_COMMA node_type_item - ; - -node_type_item: node_type_ca - | node_type_switch - | node_type_router - | node_type_all - | node_type_self - ; - -node_type_ca: TK_NODE_TYPE_CA { - p_current_port_group->node_types |= - OSM_QOS_POLICY_NODE_TYPE_CA; - } - ; - -node_type_switch: TK_NODE_TYPE_SWITCH { - p_current_port_group->node_types |= - OSM_QOS_POLICY_NODE_TYPE_SWITCH; - } - ; - -node_type_router: TK_NODE_TYPE_ROUTER { - p_current_port_group->node_types |= - OSM_QOS_POLICY_NODE_TYPE_ROUTER; - } - ; - -node_type_all: TK_NODE_TYPE_ALL { - p_current_port_group->node_types |= - (OSM_QOS_POLICY_NODE_TYPE_CA | - OSM_QOS_POLICY_NODE_TYPE_SWITCH | - OSM_QOS_POLICY_NODE_TYPE_ROUTER); - } - ; - -node_type_self: TK_NODE_TYPE_SELF { - osm_port_t * p_osm_port = - osm_get_port_by_guid(p_qos_policy->p_subn, - p_qos_policy->p_subn->sm_port_guid); - if (p_osm_port) - __parser_add_port_to_port_map( - &p_current_port_group->port_map, - p_osm_port->p_physp); - } - ; - - 
/* - * vlarb_scope_entry values: - * vlarb_scope_group - * vlarb_scope_across - * vlarb_scope_vlarb_high - * vlarb_scope_vlarb_low - * vlarb_scope_vlarb_high_limit - */ - - - -vlarb_scope_group: vlarb_scope_group_start string_list { - /* 'group' in 'vlarb-scope' - any num of instances */ - cl_list_iterator_t list_iterator; - char * tmp_str; - - list_iterator = cl_list_head(&tmp_parser_struct.str_list); - while( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) - { - tmp_str = (char*)cl_list_obj(list_iterator); - if (tmp_str) - cl_list_insert_tail(&p_current_vlarb_scope->group_list,tmp_str); - list_iterator = cl_list_next(list_iterator); - } - cl_list_remove_all(&tmp_parser_struct.str_list); - } - ; - -vlarb_scope_group_start: TK_GROUP { - RESET_BUFFER; - } - ; - -vlarb_scope_across: vlarb_scope_across_start string_list { - /* 'across' in 'vlarb-scope' - any num of instances */ - cl_list_iterator_t list_iterator; - char * tmp_str; - - list_iterator = cl_list_head(&tmp_parser_struct.str_list); - while( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) - { - tmp_str = (char*)cl_list_obj(list_iterator); - if (tmp_str) - cl_list_insert_tail(&p_current_vlarb_scope->across_list,tmp_str); - list_iterator = cl_list_next(list_iterator); - } - cl_list_remove_all(&tmp_parser_struct.str_list); - } - ; - -vlarb_scope_across_start: TK_ACROSS { - RESET_BUFFER; - } - ; - -vlarb_scope_vlarb_high_limit: vlarb_scope_vlarb_high_limit_start single_number { - /* 'vl-high-limit' in 'vlarb-scope' - one instance of one number */ - cl_list_iterator_t list_iterator; - uint64_t * p_tmp_num; - - list_iterator = cl_list_head(&tmp_parser_struct.num_list); - p_tmp_num = (uint64_t*)cl_list_obj(list_iterator); - if (p_tmp_num) - { - p_current_vlarb_scope->vl_high_limit = (uint32_t)(*p_tmp_num); - p_current_vlarb_scope->vl_high_limit_set = TRUE; - free(p_tmp_num); - } - - cl_list_remove_all(&tmp_parser_struct.num_list); - } - ; - -vlarb_scope_vlarb_high_limit_start: 
TK_VLARB_HIGH_LIMIT { - RESET_BUFFER; - } - ; - -vlarb_scope_vlarb_high: vlarb_scope_vlarb_high_start num_list_with_dotdot { - /* 'vlarb-high' in 'vlarb-scope' - list of pairs of numbers with ':' and ',' */ - cl_list_iterator_t list_iterator; - uint64_t * num_pair; - - list_iterator = cl_list_head(&tmp_parser_struct.num_pair_list); - while( list_iterator != cl_list_end(&tmp_parser_struct.num_pair_list) ) - { - num_pair = (uint64_t*)cl_list_obj(list_iterator); - if (num_pair) - cl_list_insert_tail(&p_current_vlarb_scope->vlarb_high_list,num_pair); - list_iterator = cl_list_next(list_iterator); - } - cl_list_remove_all(&tmp_parser_struct.num_pair_list); - } - ; - -vlarb_scope_vlarb_high_start: TK_VLARB_HIGH { - RESET_BUFFER; - } - ; - -vlarb_scope_vlarb_low: vlarb_scope_vlarb_low_start num_list_with_dotdot { - /* 'vlarb-low' in 'vlarb-scope' - list of pairs of numbers with ':' and ',' */ - cl_list_iterator_t list_iterator; - uint64_t * num_pair; - - list_iterator = cl_list_head(&tmp_parser_struct.num_pair_list); - while( list_iterator != cl_list_end(&tmp_parser_struct.num_pair_list) ) - { - num_pair = (uint64_t*)cl_list_obj(list_iterator); - if (num_pair) - cl_list_insert_tail(&p_current_vlarb_scope->vlarb_low_list,num_pair); - list_iterator = cl_list_next(list_iterator); - } - cl_list_remove_all(&tmp_parser_struct.num_pair_list); - } - ; - -vlarb_scope_vlarb_low_start: TK_VLARB_LOW { - RESET_BUFFER; - } - ; - - /* - * sl2vl_scope_entry values: - * sl2vl_scope_group - * sl2vl_scope_across - * sl2vl_scope_across_from - * sl2vl_scope_across_to - * sl2vl_scope_from - * sl2vl_scope_to - * sl2vl_scope_sl2vl_table - */ - -sl2vl_scope_group: sl2vl_scope_group_start string_list { - /* 'group' in 'sl2vl-scope' - any num of instances */ - cl_list_iterator_t list_iterator; - char * tmp_str; - - list_iterator = cl_list_head(&tmp_parser_struct.str_list); - while( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) - { - tmp_str = (char*)cl_list_obj(list_iterator); - if 
(tmp_str) - cl_list_insert_tail(&p_current_sl2vl_scope->group_list,tmp_str); - list_iterator = cl_list_next(list_iterator); - } - cl_list_remove_all(&tmp_parser_struct.str_list); - } - ; - -sl2vl_scope_group_start: TK_GROUP { - RESET_BUFFER; - } - ; - -sl2vl_scope_across: sl2vl_scope_across_start string_list { - /* 'across' in 'sl2vl-scope' - any num of instances */ - cl_list_iterator_t list_iterator; - char * tmp_str; - - list_iterator = cl_list_head(&tmp_parser_struct.str_list); - while( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) - { - tmp_str = (char*)cl_list_obj(list_iterator); - if (tmp_str) { - cl_list_insert_tail(&p_current_sl2vl_scope->across_from_list,tmp_str); - cl_list_insert_tail(&p_current_sl2vl_scope->across_to_list,strdup(tmp_str)); - } - list_iterator = cl_list_next(list_iterator); - } - cl_list_remove_all(&tmp_parser_struct.str_list); - } - ; - -sl2vl_scope_across_start: TK_ACROSS { - RESET_BUFFER; - } - ; - -sl2vl_scope_across_from: sl2vl_scope_across_from_start string_list { - /* 'across-from' in 'sl2vl-scope' - any num of instances */ - cl_list_iterator_t list_iterator; - char * tmp_str; - - list_iterator = cl_list_head(&tmp_parser_struct.str_list); - while( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) - { - tmp_str = (char*)cl_list_obj(list_iterator); - if (tmp_str) - cl_list_insert_tail(&p_current_sl2vl_scope->across_from_list,tmp_str); - list_iterator = cl_list_next(list_iterator); - } - cl_list_remove_all(&tmp_parser_struct.str_list); - } - ; - -sl2vl_scope_across_from_start: TK_ACROSS_FROM { - RESET_BUFFER; - } - ; - -sl2vl_scope_across_to: sl2vl_scope_across_to_start string_list { - /* 'across-to' in 'sl2vl-scope' - any num of instances */ - cl_list_iterator_t list_iterator; - char * tmp_str; - - list_iterator = cl_list_head(&tmp_parser_struct.str_list); - while( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) - { - tmp_str = (char*)cl_list_obj(list_iterator); - if (tmp_str) { - 
cl_list_insert_tail(&p_current_sl2vl_scope->across_to_list,tmp_str); - } - list_iterator = cl_list_next(list_iterator); - } - cl_list_remove_all(&tmp_parser_struct.str_list); - } - ; - -sl2vl_scope_across_to_start: TK_ACROSS_TO { - RESET_BUFFER; - } - ; - -sl2vl_scope_from: sl2vl_scope_from_start sl2vl_scope_from_list_or_asterisk { - /* 'from' in 'sl2vl-scope' - any num of instances */ - } - ; - -sl2vl_scope_from_start: TK_FROM { - RESET_BUFFER; - } - ; - -sl2vl_scope_to: sl2vl_scope_to_start sl2vl_scope_to_list_or_asterisk { - /* 'to' in 'sl2vl-scope' - any num of instances */ - } - ; - -sl2vl_scope_to_start: TK_TO { - RESET_BUFFER; - } - ; - -sl2vl_scope_from_list_or_asterisk: sl2vl_scope_from_asterisk - | sl2vl_scope_from_list_of_ranges - ; - -sl2vl_scope_from_asterisk: TK_ASTERISK { - int i; - for (i = 0; i < OSM_QOS_POLICY_MAX_PORTS_ON_SWITCH; i++) - p_current_sl2vl_scope->from[i] = TRUE; - } - ; - -sl2vl_scope_to_list_or_asterisk: sl2vl_scope_to_asterisk - | sl2vl_scope_to_list_of_ranges - ; - -sl2vl_scope_to_asterisk: TK_ASTERISK { - int i; - for (i = 0; i < OSM_QOS_POLICY_MAX_PORTS_ON_SWITCH; i++) - p_current_sl2vl_scope->to[i] = TRUE; - } - ; - -sl2vl_scope_from_list_of_ranges: list_of_ranges { - int i; - cl_list_iterator_t list_iterator; - uint64_t * num_pair; - uint8_t num1, num2; - - list_iterator = cl_list_head(&tmp_parser_struct.num_pair_list); - while( list_iterator != cl_list_end(&tmp_parser_struct.num_pair_list) ) - { - num_pair = (uint64_t*)cl_list_obj(list_iterator); - if (num_pair) - { - if ( num_pair[0] < 0 || - num_pair[1] >= OSM_QOS_POLICY_MAX_PORTS_ON_SWITCH ) - { - __qos_parser_error("port number out of range 'from' list"); - free(num_pair); - cl_list_remove_all(&tmp_parser_struct.num_pair_list); - return 1; - } - num1 = (uint8_t)num_pair[0]; - num2 = (uint8_t)num_pair[1]; - free(num_pair); - for (i = num1; i <= num2; i++) - p_current_sl2vl_scope->from[i] = TRUE; - } - list_iterator = cl_list_next(list_iterator); - } - 
cl_list_remove_all(&tmp_parser_struct.num_pair_list); - } - ; - -sl2vl_scope_to_list_of_ranges: list_of_ranges { - int i; - cl_list_iterator_t list_iterator; - uint64_t * num_pair; - uint8_t num1, num2; - - list_iterator = cl_list_head(&tmp_parser_struct.num_pair_list); - while( list_iterator != cl_list_end(&tmp_parser_struct.num_pair_list) ) - { - num_pair = (uint64_t*)cl_list_obj(list_iterator); - if (num_pair) - { - if ( num_pair[0] < 0 || - num_pair[1] >= OSM_QOS_POLICY_MAX_PORTS_ON_SWITCH ) - { - __qos_parser_error("port number out of range 'to' list"); - free(num_pair); - cl_list_remove_all(&tmp_parser_struct.num_pair_list); - return 1; - } - num1 = (uint8_t)num_pair[0]; - num2 = (uint8_t)num_pair[1]; - free(num_pair); - for (i = num1; i <= num2; i++) - p_current_sl2vl_scope->to[i] = TRUE; - } - list_iterator = cl_list_next(list_iterator); - } - cl_list_remove_all(&tmp_parser_struct.num_pair_list); - } - ; - - -sl2vl_scope_sl2vl_table: sl2vl_scope_sl2vl_table_start num_list { - /* 'sl2vl-table' - one instance of exactly - OSM_QOS_POLICY_SL2VL_TABLE_LEN numbers */ - cl_list_iterator_t list_iterator; - uint64_t num; - uint64_t * p_num; - int i = 0; - - if (p_current_sl2vl_scope->sl2vl_table_set) - { - __qos_parser_error("sl2vl-scope has more than one sl2vl-table"); - cl_list_remove_all(&tmp_parser_struct.num_list); - return 1; - } - - if (cl_list_count(&tmp_parser_struct.num_list) != OSM_QOS_POLICY_SL2VL_TABLE_LEN) - { - __qos_parser_error("wrong number of values in 'sl2vl-table' (should be 16)"); - cl_list_remove_all(&tmp_parser_struct.num_list); - return 1; - } - - list_iterator = cl_list_head(&tmp_parser_struct.num_list); - while( list_iterator != cl_list_end(&tmp_parser_struct.num_list) ) - { - p_num = (uint64_t*)cl_list_obj(list_iterator); - num = *p_num; - free(p_num); - if (num >= OSM_QOS_POLICY_MAX_VL_NUM) - { - __qos_parser_error("wrong VL value in 'sl2vl-table' (should be 0 to 15)"); - cl_list_remove_all(&tmp_parser_struct.num_list); - return 1; - } - 
- p_current_sl2vl_scope->sl2vl_table[i++] = (uint8_t)num; - list_iterator = cl_list_next(list_iterator); - } - p_current_sl2vl_scope->sl2vl_table_set = TRUE; - cl_list_remove_all(&tmp_parser_struct.num_list); - } - ; - -sl2vl_scope_sl2vl_table_start: TK_SL2VL_TABLE { - RESET_BUFFER; - } - ; - - /* - * qos_level_entry values: - * qos_level_name - * qos_level_use - * qos_level_sl - * qos_level_mtu_limit - * qos_level_rate_limit - * qos_level_packet_life - * qos_level_path_bits - * qos_level_pkey - */ - -qos_level_name: qos_level_name_start single_string { - /* 'name' of 'qos-level' - one instance */ - cl_list_iterator_t list_iterator; - char * tmp_str; - - if (p_current_qos_level->name) - { - __qos_parser_error("qos-level has multiple 'name' tags"); - cl_list_remove_all(&tmp_parser_struct.str_list); - return 1; - } - - list_iterator = cl_list_head(&tmp_parser_struct.str_list); - if ( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) - { - tmp_str = (char*)cl_list_obj(list_iterator); - if (tmp_str) - p_current_qos_level->name = tmp_str; - } - cl_list_remove_all(&tmp_parser_struct.str_list); - } - ; - -qos_level_name_start: TK_NAME { - RESET_BUFFER; - } - ; - -qos_level_use: qos_level_use_start single_string { - /* 'use' of 'qos-level' - one instance */ - cl_list_iterator_t list_iterator; - char * tmp_str; - - if (p_current_qos_level->use) - { - __qos_parser_error("qos-level has multiple 'use' tags"); - cl_list_remove_all(&tmp_parser_struct.str_list); - return 1; - } - - list_iterator = cl_list_head(&tmp_parser_struct.str_list); - if ( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) - { - tmp_str = (char*)cl_list_obj(list_iterator); - if (tmp_str) - p_current_qos_level->use = tmp_str; - } - cl_list_remove_all(&tmp_parser_struct.str_list); - } - ; - -qos_level_use_start: TK_USE { - RESET_BUFFER; - } - ; - -qos_level_sl: qos_level_sl_start single_number { - /* 'sl' in 'qos-level' - one instance */ - cl_list_iterator_t list_iterator; - uint64_t * 
p_num; - - if (p_current_qos_level->sl_set) - { - __qos_parser_error("'qos-level' has multiple 'sl' tags"); - cl_list_remove_all(&tmp_parser_struct.num_list); - return 1; - } - list_iterator = cl_list_head(&tmp_parser_struct.num_list); - p_num = (uint64_t*)cl_list_obj(list_iterator); - p_current_qos_level->sl = (uint8_t)(*p_num); - free(p_num); - p_current_qos_level->sl_set = TRUE; - cl_list_remove_all(&tmp_parser_struct.num_list); - } - ; - -qos_level_sl_start: TK_SL { - RESET_BUFFER; - } - ; - -qos_level_mtu_limit: qos_level_mtu_limit_start single_number { - /* 'mtu-limit' in 'qos-level' - one instance */ - cl_list_iterator_t list_iterator; - uint64_t * p_num; - - if (p_current_qos_level->mtu_limit_set) - { - __qos_parser_error("'qos-level' has multiple 'mtu-limit' tags"); - cl_list_remove_all(&tmp_parser_struct.num_list); - return 1; - } - list_iterator = cl_list_head(&tmp_parser_struct.num_list); - p_num = (uint64_t*)cl_list_obj(list_iterator); - p_current_qos_level->mtu_limit = (uint8_t)(*p_num); - free(p_num); - p_current_qos_level->mtu_limit_set = TRUE; - cl_list_remove_all(&tmp_parser_struct.num_list); - } - ; - -qos_level_mtu_limit_start: TK_MTU_LIMIT { - /* 'mtu-limit' in 'qos-level' - one instance */ - RESET_BUFFER; - } - ; - -qos_level_rate_limit: qos_level_rate_limit_start single_number { - /* 'rate-limit' in 'qos-level' - one instance */ - cl_list_iterator_t list_iterator; - uint64_t * p_num; - - if (p_current_qos_level->rate_limit_set) - { - __qos_parser_error("'qos-level' has multiple 'rate-limit' tags"); - cl_list_remove_all(&tmp_parser_struct.num_list); - return 1; - } - list_iterator = cl_list_head(&tmp_parser_struct.num_list); - p_num = (uint64_t*)cl_list_obj(list_iterator); - p_current_qos_level->rate_limit = (uint8_t)(*p_num); - free(p_num); - p_current_qos_level->rate_limit_set = TRUE; - cl_list_remove_all(&tmp_parser_struct.num_list); - } - ; - -qos_level_rate_limit_start: TK_RATE_LIMIT { - /* 'rate-limit' in 'qos-level' - one instance */ - 
RESET_BUFFER; - } - ; - -qos_level_packet_life: qos_level_packet_life_start single_number { - /* 'packet-life' in 'qos-level' - one instance */ - cl_list_iterator_t list_iterator; - uint64_t * p_num; - - if (p_current_qos_level->pkt_life_set) - { - __qos_parser_error("'qos-level' has multiple 'packet-life' tags"); - cl_list_remove_all(&tmp_parser_struct.num_list); - return 1; - } - list_iterator = cl_list_head(&tmp_parser_struct.num_list); - p_num = (uint64_t*)cl_list_obj(list_iterator); - p_current_qos_level->pkt_life = (uint8_t)(*p_num); - free(p_num); - p_current_qos_level->pkt_life_set= TRUE; - cl_list_remove_all(&tmp_parser_struct.num_list); - } - ; - -qos_level_packet_life_start: TK_PACKET_LIFE { - /* 'packet-life' in 'qos-level' - one instance */ - RESET_BUFFER; - } - ; - -qos_level_path_bits: qos_level_path_bits_start list_of_ranges { - /* 'path-bits' in 'qos-level' - any num of instances */ - /* list of path bit ranges */ - - if (cl_list_count(&tmp_parser_struct.num_pair_list)) - { - uint64_t ** range_arr; - unsigned range_len; - - __rangelist2rangearr( &tmp_parser_struct.num_pair_list, - &range_arr, - &range_len ); - - if ( !p_current_qos_level->path_bits_range_len ) - { - p_current_qos_level->path_bits_range_arr = range_arr; - p_current_qos_level->path_bits_range_len = range_len; - } - else - { - uint64_t ** new_range_arr; - unsigned new_range_len; - __merge_rangearr( p_current_qos_level->path_bits_range_arr, - p_current_qos_level->path_bits_range_len, - range_arr, - range_len, - &new_range_arr, - &new_range_len ); - p_current_qos_level->path_bits_range_arr = new_range_arr; - p_current_qos_level->path_bits_range_len = new_range_len; - } - } - } - ; - -qos_level_path_bits_start: TK_PATH_BITS { - RESET_BUFFER; - } - ; - -qos_level_pkey: qos_level_pkey_start list_of_ranges { - /* 'pkey' in 'qos-level' - num of instances of list of ranges */ - if (cl_list_count(&tmp_parser_struct.num_pair_list)) - { - uint64_t ** range_arr; - unsigned range_len; - - 
__pkey_rangelist2rangearr( &tmp_parser_struct.num_pair_list, - &range_arr, - &range_len ); - - if ( !p_current_qos_level->pkey_range_len ) - { - p_current_qos_level->pkey_range_arr = range_arr; - p_current_qos_level->pkey_range_len = range_len; - } - else - { - uint64_t ** new_range_arr; - unsigned new_range_len; - __merge_rangearr( p_current_qos_level->pkey_range_arr, - p_current_qos_level->pkey_range_len, - range_arr, - range_len, - &new_range_arr, - &new_range_len ); - p_current_qos_level->pkey_range_arr = new_range_arr; - p_current_qos_level->pkey_range_len = new_range_len; - } - } - } - ; - -qos_level_pkey_start: TK_PKEY { - RESET_BUFFER; - } - ; - - /* - * qos_match_rule_entry values: - * qos_match_rule_use - * qos_match_rule_qos_class - * qos_match_rule_qos_level_name - * qos_match_rule_source - * qos_match_rule_destination - * qos_match_rule_service_id - * qos_match_rule_pkey - */ - - -qos_match_rule_use: qos_match_rule_use_start single_string { - /* 'use' of 'qos-match-rule' - one instance */ - cl_list_iterator_t list_iterator; - char * tmp_str; - - if (p_current_qos_match_rule->use) - { - __qos_parser_error("'qos-match-rule' has multiple 'use' tags"); - cl_list_remove_all(&tmp_parser_struct.str_list); - return 1; - } - - list_iterator = cl_list_head(&tmp_parser_struct.str_list); - if ( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) - { - tmp_str = (char*)cl_list_obj(list_iterator); - if (tmp_str) - p_current_qos_match_rule->use = tmp_str; - } - cl_list_remove_all(&tmp_parser_struct.str_list); - } - ; - -qos_match_rule_use_start: TK_USE { - RESET_BUFFER; - } - ; - -qos_match_rule_qos_class: qos_match_rule_qos_class_start list_of_ranges { - /* 'qos-class' in 'qos-match-rule' - num of instances of list of ranges */ - /* list of class ranges (QoS Class is 12-bit value) */ - if (cl_list_count(&tmp_parser_struct.num_pair_list)) - { - uint64_t ** range_arr; - unsigned range_len; - - __rangelist2rangearr( &tmp_parser_struct.num_pair_list, - 
&range_arr, - &range_len ); - - if ( !p_current_qos_match_rule->qos_class_range_len ) - { - p_current_qos_match_rule->qos_class_range_arr = range_arr; - p_current_qos_match_rule->qos_class_range_len = range_len; - } - else - { - uint64_t ** new_range_arr; - unsigned new_range_len; - __merge_rangearr( p_current_qos_match_rule->qos_class_range_arr, - p_current_qos_match_rule->qos_class_range_len, - range_arr, - range_len, - &new_range_arr, - &new_range_len ); - p_current_qos_match_rule->qos_class_range_arr = new_range_arr; - p_current_qos_match_rule->qos_class_range_len = new_range_len; - } - } - } - ; - -qos_match_rule_qos_class_start: TK_QOS_CLASS { - RESET_BUFFER; - } - ; - -qos_match_rule_source: qos_match_rule_source_start string_list { - /* 'source' in 'qos-match-rule' - text */ - cl_list_iterator_t list_iterator; - char * tmp_str; - - list_iterator = cl_list_head(&tmp_parser_struct.str_list); - while( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) - { - tmp_str = (char*)cl_list_obj(list_iterator); - if (tmp_str) - cl_list_insert_tail(&p_current_qos_match_rule->source_list,tmp_str); - list_iterator = cl_list_next(list_iterator); - } - cl_list_remove_all(&tmp_parser_struct.str_list); - } - ; - -qos_match_rule_source_start: TK_SOURCE { - RESET_BUFFER; - } - ; - -qos_match_rule_destination: qos_match_rule_destination_start string_list { - /* 'destination' in 'qos-match-rule' - text */ - cl_list_iterator_t list_iterator; - char * tmp_str; - - list_iterator = cl_list_head(&tmp_parser_struct.str_list); - while( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) - { - tmp_str = (char*)cl_list_obj(list_iterator); - if (tmp_str) - cl_list_insert_tail(&p_current_qos_match_rule->destination_list,tmp_str); - list_iterator = cl_list_next(list_iterator); - } - cl_list_remove_all(&tmp_parser_struct.str_list); - } - ; - -qos_match_rule_destination_start: TK_DESTINATION { - RESET_BUFFER; - } - ; - -qos_match_rule_qos_level_name: 
qos_match_rule_qos_level_name_start single_string { - /* 'qos-level-name' in 'qos-match-rule' - single string */ - cl_list_iterator_t list_iterator; - char * tmp_str; - - if (p_current_qos_match_rule->qos_level_name) - { - __qos_parser_error("qos-match-rule has multiple 'qos-level-name' tags"); - cl_list_remove_all(&tmp_parser_struct.num_list); - return 1; - } - - list_iterator = cl_list_head(&tmp_parser_struct.str_list); - if ( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) - { - tmp_str = (char*)cl_list_obj(list_iterator); - if (tmp_str) - p_current_qos_match_rule->qos_level_name = tmp_str; - } - cl_list_remove_all(&tmp_parser_struct.str_list); - } - ; - -qos_match_rule_qos_level_name_start: TK_QOS_LEVEL_NAME { - RESET_BUFFER; - } - ; - -qos_match_rule_service_id: qos_match_rule_service_id_start list_of_ranges { - /* 'service-id' in 'qos-match-rule' - num of instances of list of ranges */ - if (cl_list_count(&tmp_parser_struct.num_pair_list)) - { - uint64_t ** range_arr; - unsigned range_len; - - __rangelist2rangearr( &tmp_parser_struct.num_pair_list, - &range_arr, - &range_len ); - - if ( !p_current_qos_match_rule->service_id_range_len ) - { - p_current_qos_match_rule->service_id_range_arr = range_arr; - p_current_qos_match_rule->service_id_range_len = range_len; - } - else - { - uint64_t ** new_range_arr; - unsigned new_range_len; - __merge_rangearr( p_current_qos_match_rule->service_id_range_arr, - p_current_qos_match_rule->service_id_range_len, - range_arr, - range_len, - &new_range_arr, - &new_range_len ); - p_current_qos_match_rule->service_id_range_arr = new_range_arr; - p_current_qos_match_rule->service_id_range_len = new_range_len; - } - } - } - ; - -qos_match_rule_service_id_start: TK_SERVICE_ID { - RESET_BUFFER; - } - ; - -qos_match_rule_pkey: qos_match_rule_pkey_start list_of_ranges { - /* 'pkey' in 'qos-match-rule' - num of instances of list of ranges */ - if (cl_list_count(&tmp_parser_struct.num_pair_list)) - { - uint64_t ** range_arr; 
- unsigned range_len; - - __pkey_rangelist2rangearr( &tmp_parser_struct.num_pair_list, - &range_arr, - &range_len ); - - if ( !p_current_qos_match_rule->pkey_range_len ) - { - p_current_qos_match_rule->pkey_range_arr = range_arr; - p_current_qos_match_rule->pkey_range_len = range_len; - } - else - { - uint64_t ** new_range_arr; - unsigned new_range_len; - __merge_rangearr( p_current_qos_match_rule->pkey_range_arr, - p_current_qos_match_rule->pkey_range_len, - range_arr, - range_len, - &new_range_arr, - &new_range_len ); - p_current_qos_match_rule->pkey_range_arr = new_range_arr; - p_current_qos_match_rule->pkey_range_len = new_range_len; - } - } - } - ; - -qos_match_rule_pkey_start: TK_PKEY { - RESET_BUFFER; - } - ; - - - /* - * Common part - */ - - -single_string: single_string_elems { - cl_list_insert_tail(&tmp_parser_struct.str_list, - strdup(__parser_strip_white(tmp_parser_struct.str))); - tmp_parser_struct.str[0] = '\0'; - } - ; - -single_string_elems: single_string_element - | single_string_elems single_string_element - ; - -single_string_element: TK_TEXT { - strcat(tmp_parser_struct.str,$1); - free($1); - } - ; - - -string_list: single_string - | string_list TK_COMMA single_string - ; - - - -single_number: number - ; - -num_list: number - | num_list TK_COMMA number - ; - -number: TK_NUMBER { - uint64_t * p_num = (uint64_t*)malloc(sizeof(uint64_t)); - __parser_str2uint64(p_num,$1); - free($1); - cl_list_insert_tail(&tmp_parser_struct.num_list, p_num); - } - ; - -num_list_with_dotdot: number_from_pair_1 TK_DOTDOT number_from_pair_2 { - uint64_t * num_pair = (uint64_t*)malloc(sizeof(uint64_t)*2); - num_pair[0] = tmp_parser_struct.num_pair[0]; - num_pair[1] = tmp_parser_struct.num_pair[1]; - cl_list_insert_tail(&tmp_parser_struct.num_pair_list, num_pair); - } - | num_list_with_dotdot TK_COMMA number_from_pair_1 TK_DOTDOT number_from_pair_2 { - uint64_t * num_pair = (uint64_t*)malloc(sizeof(uint64_t)*2); - num_pair[0] = tmp_parser_struct.num_pair[0]; - 
num_pair[1] = tmp_parser_struct.num_pair[1]; - cl_list_insert_tail(&tmp_parser_struct.num_pair_list, num_pair); - } - ; - -number_from_pair_1: TK_NUMBER { - __parser_str2uint64(&tmp_parser_struct.num_pair[0],$1); - free($1); - } - ; - -number_from_pair_2: TK_NUMBER { - __parser_str2uint64(&tmp_parser_struct.num_pair[1],$1); - free($1); - } - ; - -list_of_ranges: num_list_with_dash - ; - -num_list_with_dash: single_number_from_range { - uint64_t * num_pair = (uint64_t*)malloc(sizeof(uint64_t)*2); - num_pair[0] = tmp_parser_struct.num_pair[0]; - num_pair[1] = tmp_parser_struct.num_pair[1]; - cl_list_insert_tail(&tmp_parser_struct.num_pair_list, num_pair); - } - | number_from_range_1 TK_DASH number_from_range_2 { - uint64_t * num_pair = (uint64_t*)malloc(sizeof(uint64_t)*2); - if (tmp_parser_struct.num_pair[0] <= tmp_parser_struct.num_pair[1]) { - num_pair[0] = tmp_parser_struct.num_pair[0]; - num_pair[1] = tmp_parser_struct.num_pair[1]; - } - else { - num_pair[1] = tmp_parser_struct.num_pair[0]; - num_pair[0] = tmp_parser_struct.num_pair[1]; - } - cl_list_insert_tail(&tmp_parser_struct.num_pair_list, num_pair); - } - | num_list_with_dash TK_COMMA number_from_range_1 TK_DASH number_from_range_2 { - uint64_t * num_pair = (uint64_t*)malloc(sizeof(uint64_t)*2); - if (tmp_parser_struct.num_pair[0] <= tmp_parser_struct.num_pair[1]) { - num_pair[0] = tmp_parser_struct.num_pair[0]; - num_pair[1] = tmp_parser_struct.num_pair[1]; - } - else { - num_pair[1] = tmp_parser_struct.num_pair[0]; - num_pair[0] = tmp_parser_struct.num_pair[1]; - } - cl_list_insert_tail(&tmp_parser_struct.num_pair_list, num_pair); - } - | num_list_with_dash TK_COMMA single_number_from_range { - uint64_t * num_pair = (uint64_t*)malloc(sizeof(uint64_t)*2); - num_pair[0] = tmp_parser_struct.num_pair[0]; - num_pair[1] = tmp_parser_struct.num_pair[1]; - cl_list_insert_tail(&tmp_parser_struct.num_pair_list, num_pair); - } - ; - -single_number_from_range: TK_NUMBER { - 
__parser_str2uint64(&tmp_parser_struct.num_pair[0],$1); - __parser_str2uint64(&tmp_parser_struct.num_pair[1],$1); - free($1); - } - ; - -number_from_range_1: TK_NUMBER { - __parser_str2uint64(&tmp_parser_struct.num_pair[0],$1); - free($1); - } - ; - -number_from_range_2: TK_NUMBER { - __parser_str2uint64(&tmp_parser_struct.num_pair[1],$1); - free($1); - } - ; - -%% - -/*************************************************** - ***************************************************/ - -int osm_qos_parse_policy_file(IN osm_subn_t * const p_subn) -{ - int res = 0; - static boolean_t first_time = TRUE; - p_qos_parser_osm_log = &p_subn->p_osm->log; - - OSM_LOG_ENTER(p_qos_parser_osm_log); - - osm_qos_policy_destroy(p_subn->p_qos_policy); - p_subn->p_qos_policy = NULL; - - __qos_parser_in = fopen (p_subn->opt.qos_policy_file, "r"); - if (!__qos_parser_in) - { - if (strcmp(p_subn->opt.qos_policy_file,OSM_DEFAULT_QOS_POLICY_FILE)) { - OSM_LOG(p_qos_parser_osm_log, OSM_LOG_ERROR, "ERR AC01: " - "Failed opening QoS policy file %s - %s\n", - p_subn->opt.qos_policy_file, strerror(errno)); - res = 1; - } - else - OSM_LOG(p_qos_parser_osm_log, OSM_LOG_VERBOSE, - "QoS policy file not found (%s)\n", - p_subn->opt.qos_policy_file); - - goto Exit; - } - - if (first_time) - { - first_time = FALSE; - __setup_simple_qos_levels(); - __setup_ulp_match_rules(); - OSM_LOG(p_qos_parser_osm_log, OSM_LOG_INFO, - "Loading QoS policy file (%s)\n", - p_subn->opt.qos_policy_file); - } - else - /* - * ULP match rules list was emptied at the end of - * previous parsing iteration. - * What's left is to clear simple QoS levels. 
- */ - __clear_simple_qos_levels(); - - column_num = 1; - line_num = 1; - - p_subn->p_qos_policy = osm_qos_policy_create(p_subn); - - __parser_tmp_struct_init(); - p_qos_policy = p_subn->p_qos_policy; - - res = __qos_parser_parse(); - - __parser_tmp_struct_destroy(); - - if (res != 0) - { - OSM_LOG(p_qos_parser_osm_log, OSM_LOG_ERROR, "ERR AC03: " - "Failed parsing QoS policy file (%s)\n", - p_subn->opt.qos_policy_file); - osm_qos_policy_destroy(p_subn->p_qos_policy); - p_subn->p_qos_policy = NULL; - res = 1; - goto Exit; - } - - /* add generated ULP match rules to the usual match rules */ - __process_ulp_match_rules(); - - if (osm_qos_policy_validate(p_subn->p_qos_policy,p_qos_parser_osm_log)) - { - OSM_LOG(p_qos_parser_osm_log, OSM_LOG_ERROR, "ERR AC04: " - "Error(s) in QoS policy file (%s)\n", - p_subn->opt.qos_policy_file); - fprintf(stderr, "Error(s) in QoS policy file (%s)\n", - p_subn->opt.qos_policy_file); - osm_qos_policy_destroy(p_subn->p_qos_policy); - p_subn->p_qos_policy = NULL; - res = 1; - goto Exit; - } - - Exit: - if (__qos_parser_in) - fclose(__qos_parser_in); - OSM_LOG_EXIT(p_qos_parser_osm_log); - return res; -} - -/*************************************************** - ***************************************************/ - -int __qos_parser_wrap() -{ - return(1); -} - -/*************************************************** - ***************************************************/ - -static void __qos_parser_error(const char *format, ...) 
-{ - char s[256]; - va_list pvar; - - OSM_LOG_ENTER(p_qos_parser_osm_log); - - va_start(pvar, format); - vsnprintf(s, sizeof(s), format, pvar); - va_end(pvar); - - OSM_LOG(p_qos_parser_osm_log, OSM_LOG_ERROR, "ERR AC05: " - "Syntax error (line %d:%d): %s\n", - line_num, column_num, s); - fprintf(stderr, "Error in QoS Policy File (line %d:%d): %s.\n", - line_num, column_num, s); - OSM_LOG_EXIT(p_qos_parser_osm_log); -} - -/*************************************************** - ***************************************************/ - -static char * __parser_strip_white(char * str) -{ - int i; - for (i = (strlen(str)-1); i >= 0; i--) - { - if (isspace(str[i])) - str[i] = '\0'; - else - break; - } - for (i = 0; i < strlen(str); i++) - { - if (!isspace(str[i])) - break; - } - return &(str[i]); -} - -/*************************************************** - ***************************************************/ - -static void __parser_str2uint64(uint64_t * p_val, char * str) -{ - *p_val = strtoull(str, NULL, 0); -} - -/*************************************************** - ***************************************************/ - -static void __parser_port_group_start() -{ - p_current_port_group = osm_qos_policy_port_group_create(); -} - -/*************************************************** - ***************************************************/ - -static int __parser_port_group_end() -{ - if(!p_current_port_group->name) - { - __qos_parser_error("port-group validation failed - no port group name specified"); - return -1; - } - - cl_list_insert_tail(&p_qos_policy->port_groups, - p_current_port_group); - p_current_port_group = NULL; - return 0; -} - -/*************************************************** - ***************************************************/ - -static void __parser_vlarb_scope_start() -{ - p_current_vlarb_scope = osm_qos_policy_vlarb_scope_create(); -} - -/*************************************************** - ***************************************************/ - -static 
int __parser_vlarb_scope_end() -{ - if ( !cl_list_count(&p_current_vlarb_scope->group_list) && - !cl_list_count(&p_current_vlarb_scope->across_list) ) - { - __qos_parser_error("vlarb-scope validation failed - no port groups specified by 'group' or by 'across'"); - return -1; - } - - cl_list_insert_tail(&p_qos_policy->vlarb_tables, - p_current_vlarb_scope); - p_current_vlarb_scope = NULL; - return 0; -} - -/*************************************************** - ***************************************************/ - -static void __parser_sl2vl_scope_start() -{ - p_current_sl2vl_scope = osm_qos_policy_sl2vl_scope_create(); -} - -/*************************************************** - ***************************************************/ - -static int __parser_sl2vl_scope_end() -{ - if (!p_current_sl2vl_scope->sl2vl_table_set) - { - __qos_parser_error("sl2vl-scope validation failed - no sl2vl table specified"); - return -1; - } - if ( !cl_list_count(&p_current_sl2vl_scope->group_list) && - !cl_list_count(&p_current_sl2vl_scope->across_to_list) && - !cl_list_count(&p_current_sl2vl_scope->across_from_list) ) - { - __qos_parser_error("sl2vl-scope validation failed - no port groups specified by 'group', 'across-to' or 'across-from'"); - return -1; - } - - cl_list_insert_tail(&p_qos_policy->sl2vl_tables, - p_current_sl2vl_scope); - p_current_sl2vl_scope = NULL; - return 0; -} - -/*************************************************** - ***************************************************/ - -static void __parser_qos_level_start() -{ - p_current_qos_level = osm_qos_policy_qos_level_create(); -} - -/*************************************************** - ***************************************************/ - -static int __parser_qos_level_end() -{ - if (!p_current_qos_level->sl_set) - { - __qos_parser_error("qos-level validation failed - no 'sl' specified"); - return -1; - } - if (!p_current_qos_level->name) - { - __qos_parser_error("qos-level validation failed - no 'name' 
specified"); - return -1; - } - - cl_list_insert_tail(&p_qos_policy->qos_levels, - p_current_qos_level); - p_current_qos_level = NULL; - return 0; -} - -/*************************************************** - ***************************************************/ - -static void __parser_match_rule_start() -{ - p_current_qos_match_rule = osm_qos_policy_match_rule_create(); -} - -/*************************************************** - ***************************************************/ - -static int __parser_match_rule_end() -{ - if (!p_current_qos_match_rule->qos_level_name) - { - __qos_parser_error("match-rule validation failed - no 'qos-level-name' specified"); - return -1; - } - - cl_list_insert_tail(&p_qos_policy->qos_match_rules, - p_current_qos_match_rule); - p_current_qos_match_rule = NULL; - return 0; -} - -/*************************************************** - ***************************************************/ - -static void __parser_ulp_match_rule_start() -{ - p_current_qos_match_rule = osm_qos_policy_match_rule_create(); -} - -/*************************************************** - ***************************************************/ - -static int __parser_ulp_match_rule_end() -{ - CL_ASSERT(p_current_qos_match_rule->p_qos_level); - cl_list_insert_tail(&__ulp_match_rules, - p_current_qos_match_rule); - p_current_qos_match_rule = NULL; - return 0; -} - -/*************************************************** - ***************************************************/ - -static void __parser_tmp_struct_init() -{ - tmp_parser_struct.str[0] = '\0'; - cl_list_construct(&tmp_parser_struct.str_list); - cl_list_init(&tmp_parser_struct.str_list, 10); - cl_list_construct(&tmp_parser_struct.num_list); - cl_list_init(&tmp_parser_struct.num_list, 10); - cl_list_construct(&tmp_parser_struct.num_pair_list); - cl_list_init(&tmp_parser_struct.num_pair_list, 10); -} - -/*************************************************** - ***************************************************/ - -/* - 
* Do NOT free objects from the temp struct. - * Either they are inserted into the parse tree data - * structure, or they are already freed when copying - * their values to the parse tree data structure. - */ -static void __parser_tmp_struct_reset() -{ - tmp_parser_struct.str[0] = '\0'; - cl_list_remove_all(&tmp_parser_struct.str_list); - cl_list_remove_all(&tmp_parser_struct.num_list); - cl_list_remove_all(&tmp_parser_struct.num_pair_list); -} - -/*************************************************** - ***************************************************/ - -static void __parser_tmp_struct_destroy() -{ - __parser_tmp_struct_reset(); - cl_list_destroy(&tmp_parser_struct.str_list); - cl_list_destroy(&tmp_parser_struct.num_list); - cl_list_destroy(&tmp_parser_struct.num_pair_list); -} - -/*************************************************** - ***************************************************/ - -#define __SIMPLE_QOS_LEVEL_NAME "SimpleQoSLevel_SL" -#define __SIMPLE_QOS_LEVEL_DEFAULT_NAME "SimpleQoSLevel_DEFAULT" - -static void __setup_simple_qos_levels() -{ - uint8_t i; - char tmp_buf[30]; - memset(osm_qos_policy_simple_qos_levels, 0, - sizeof(osm_qos_policy_simple_qos_levels)); - for (i = 0; i < 16; i++) - { - osm_qos_policy_simple_qos_levels[i].sl = i; - osm_qos_policy_simple_qos_levels[i].sl_set = TRUE; - sprintf(tmp_buf, "%s%u", __SIMPLE_QOS_LEVEL_NAME, i); - osm_qos_policy_simple_qos_levels[i].name = strdup(tmp_buf); - } - - memset(&__default_simple_qos_level, 0, - sizeof(__default_simple_qos_level)); - __default_simple_qos_level.name = - strdup(__SIMPLE_QOS_LEVEL_DEFAULT_NAME); -} - -/*************************************************** - ***************************************************/ - -static void __clear_simple_qos_levels() -{ - /* - * Simple QoS levels are static. - * What's left is to invalidate default simple QoS level. 
- */ - __default_simple_qos_level.sl_set = FALSE; -} - -/*************************************************** - ***************************************************/ - -static void __setup_ulp_match_rules() -{ - cl_list_construct(&__ulp_match_rules); - cl_list_init(&__ulp_match_rules, 10); -} - -/*************************************************** - ***************************************************/ - -static void __process_ulp_match_rules() -{ - cl_list_iterator_t list_iterator; - osm_qos_match_rule_t *p_qos_match_rule = NULL; - - list_iterator = cl_list_head(&__ulp_match_rules); - while (list_iterator != cl_list_end(&__ulp_match_rules)) - { - p_qos_match_rule = (osm_qos_match_rule_t *) cl_list_obj(list_iterator); - if (p_qos_match_rule) - cl_list_insert_tail(&p_qos_policy->qos_match_rules, - p_qos_match_rule); - list_iterator = cl_list_next(list_iterator); - } - cl_list_remove_all(&__ulp_match_rules); -} - -/*************************************************** - ***************************************************/ - -static int OSM_CDECL -__cmp_num_range( - const void * p1, - const void * p2) -{ - uint64_t * pair1 = *((uint64_t **)p1); - uint64_t * pair2 = *((uint64_t **)p2); - - if (pair1[0] < pair2[0]) - return -1; - if (pair1[0] > pair2[0]) - return 1; - - if (pair1[1] < pair2[1]) - return -1; - if (pair1[1] > pair2[1]) - return 1; - - return 0; -} - -/*************************************************** - ***************************************************/ - -static void __sort_reduce_rangearr( - uint64_t ** arr, - unsigned arr_len, - uint64_t ** * p_res_arr, - unsigned * p_res_arr_len ) -{ - unsigned i = 0; - unsigned j = 0; - unsigned last_valid_ind = 0; - unsigned valid_cnt = 0; - uint64_t ** res_arr; - boolean_t * is_valid_arr; - - *p_res_arr = NULL; - *p_res_arr_len = 0; - - qsort(arr, arr_len, sizeof(uint64_t*), __cmp_num_range); - - is_valid_arr = (boolean_t *)malloc(arr_len * sizeof(boolean_t)); - is_valid_arr[last_valid_ind] = TRUE; - valid_cnt++; - 
for (i = 1; i < arr_len; i++) - { - if (arr[i][0] <= arr[last_valid_ind][1]) - { - if (arr[i][1] > arr[last_valid_ind][1]) - arr[last_valid_ind][1] = arr[i][1]; - free(arr[i]); - arr[i] = NULL; - is_valid_arr[i] = FALSE; - } - else if ((arr[i][0] - 1) == arr[last_valid_ind][1]) - { - arr[last_valid_ind][1] = arr[i][1]; - free(arr[i]); - arr[i] = NULL; - is_valid_arr[i] = FALSE; - } - else - { - is_valid_arr[i] = TRUE; - last_valid_ind = i; - valid_cnt++; - } - } - - res_arr = (uint64_t **)malloc(valid_cnt * sizeof(uint64_t *)); - for (i = 0; i < arr_len; i++) - { - if (is_valid_arr[i]) - res_arr[j++] = arr[i]; - } - free(is_valid_arr); - free(arr); - - *p_res_arr = res_arr; - *p_res_arr_len = valid_cnt; -} - -/*************************************************** - ***************************************************/ - -static void __pkey_rangelist2rangearr( - cl_list_t * p_list, - uint64_t ** * p_arr, - unsigned * p_arr_len) -{ - uint64_t tmp_pkey; - uint64_t * p_pkeys; - cl_list_iterator_t list_iterator; - - list_iterator= cl_list_head(p_list); - while( list_iterator != cl_list_end(p_list) ) - { - p_pkeys = (uint64_t *)cl_list_obj(list_iterator); - p_pkeys[0] &= 0x7fff; - p_pkeys[1] &= 0x7fff; - if (p_pkeys[0] > p_pkeys[1]) - { - tmp_pkey = p_pkeys[1]; - p_pkeys[1] = p_pkeys[0]; - p_pkeys[0] = tmp_pkey; - } - list_iterator = cl_list_next(list_iterator); - } - - __rangelist2rangearr(p_list, p_arr, p_arr_len); -} - -/*************************************************** - ***************************************************/ - -static void __rangelist2rangearr( - cl_list_t * p_list, - uint64_t ** * p_arr, - unsigned * p_arr_len) -{ - cl_list_iterator_t list_iterator; - unsigned len = cl_list_count(p_list); - unsigned i = 0; - uint64_t ** tmp_arr; - uint64_t ** res_arr = NULL; - unsigned res_arr_len = 0; - - tmp_arr = (uint64_t **)malloc(len * sizeof(uint64_t *)); - - list_iterator = cl_list_head(p_list); - while( list_iterator != cl_list_end(p_list) ) - { - tmp_arr[i++] 
= (uint64_t *)cl_list_obj(list_iterator); - list_iterator = cl_list_next(list_iterator); - } - cl_list_remove_all(p_list); - - __sort_reduce_rangearr( tmp_arr, - len, - &res_arr, - &res_arr_len ); - *p_arr = res_arr; - *p_arr_len = res_arr_len; -} - -/*************************************************** - ***************************************************/ - -static void __merge_rangearr( - uint64_t ** range_arr_1, - unsigned range_len_1, - uint64_t ** range_arr_2, - unsigned range_len_2, - uint64_t ** * p_arr, - unsigned * p_arr_len ) -{ - unsigned i = 0; - unsigned j = 0; - unsigned len = range_len_1 + range_len_2; - uint64_t ** tmp_arr; - uint64_t ** res_arr = NULL; - unsigned res_arr_len = 0; - - *p_arr = NULL; - *p_arr_len = 0; - - tmp_arr = (uint64_t **)malloc(len * sizeof(uint64_t *)); - - for (i = 0; i < range_len_1; i++) - tmp_arr[j++] = range_arr_1[i]; - for (i = 0; i < range_len_2; i++) - tmp_arr[j++] = range_arr_2[i]; - free(range_arr_1); - free(range_arr_2); - - __sort_reduce_rangearr( tmp_arr, - len, - &res_arr, - &res_arr_len ); - *p_arr = res_arr; - *p_arr_len = res_arr_len; -} - -/*************************************************** - ***************************************************/ - -static void __parser_add_port_to_port_map( - cl_qmap_t * p_map, - osm_physp_t * p_physp) -{ - if (cl_qmap_get(p_map, cl_ntoh64(osm_physp_get_port_guid(p_physp))) == - cl_qmap_end(p_map)) - { - osm_qos_port_t * p_port = osm_qos_policy_port_create(p_physp); - if (p_port) - cl_qmap_insert(p_map, - cl_ntoh64(osm_physp_get_port_guid(p_physp)), - &p_port->map_item); - } -} - -/*************************************************** - ***************************************************/ - -static void __parser_add_guid_range_to_port_map( - cl_qmap_t * p_map, - uint64_t ** range_arr, - unsigned range_len) -{ - unsigned i; - uint64_t guid_ho; - osm_port_t * p_osm_port; - - if (!range_arr || !range_len) - return; - - for (i = 0; i < range_len; i++) { - for (guid_ho = 
range_arr[i][0]; guid_ho <= range_arr[i][1]; guid_ho++) { - p_osm_port = - osm_get_port_by_guid(p_qos_policy->p_subn, cl_hton64(guid_ho)); - if (p_osm_port) - __parser_add_port_to_port_map(p_map, p_osm_port->p_physp); - } - free(range_arr[i]); - } - free(range_arr); -} - -/*************************************************** - ***************************************************/ - -static void __parser_add_pkey_range_to_port_map( - cl_qmap_t * p_map, - uint64_t ** range_arr, - unsigned range_len) -{ - unsigned i; - uint64_t pkey_64; - ib_net16_t pkey; - osm_prtn_t * p_prtn; - - if (!range_arr || !range_len) - return; - - for (i = 0; i < range_len; i++) { - for (pkey_64 = range_arr[i][0]; pkey_64 <= range_arr[i][1]; pkey_64++) { - pkey = cl_hton16((uint16_t)(pkey_64 & 0x7fff)); - p_prtn = (osm_prtn_t *) - cl_qmap_get(&p_qos_policy->p_subn->prtn_pkey_tbl, pkey); - if (p_prtn != (osm_prtn_t *)cl_qmap_end( - &p_qos_policy->p_subn->prtn_pkey_tbl)) { - __parser_add_map_to_port_map(p_map, &p_prtn->part_guid_tbl); - __parser_add_map_to_port_map(p_map, &p_prtn->full_guid_tbl); - } - } - free(range_arr[i]); - } - free(range_arr); -} - -/*************************************************** - ***************************************************/ - -static void __parser_add_partition_list_to_port_map( - cl_qmap_t * p_map, - cl_list_t * p_list) -{ - cl_list_iterator_t list_iterator; - char * tmp_str; - osm_prtn_t * p_prtn; - - /* extract all the ports from the partition - to the port map of this port group */ - list_iterator = cl_list_head(p_list); - while(list_iterator != cl_list_end(p_list)) { - tmp_str = (char*)cl_list_obj(list_iterator); - if (tmp_str) { - p_prtn = osm_prtn_find_by_name(p_qos_policy->p_subn, tmp_str); - if (p_prtn) { - __parser_add_map_to_port_map(p_map, &p_prtn->part_guid_tbl); - __parser_add_map_to_port_map(p_map, &p_prtn->full_guid_tbl); - } - free(tmp_str); - } - list_iterator = cl_list_next(list_iterator); - } - cl_list_remove_all(p_list); -} - 
-/*************************************************** - ***************************************************/ - -static void __parser_add_map_to_port_map( - cl_qmap_t * p_dmap, - cl_map_t * p_smap) -{ - cl_map_iterator_t map_iterator; - osm_physp_t * p_physp; - - if (!p_dmap || !p_smap) - return; - - map_iterator = cl_map_head(p_smap); - while (map_iterator != cl_map_end(p_smap)) { - p_physp = (osm_physp_t*)cl_map_obj(map_iterator); - __parser_add_port_to_port_map(p_dmap, p_physp); - map_iterator = cl_map_next(map_iterator); - } -} - -/*************************************************** - ***************************************************/ - -static int __validate_pkeys( uint64_t ** range_arr, - unsigned range_len, - boolean_t is_ipoib) -{ - unsigned i; - uint64_t pkey_64; - ib_net16_t pkey; - osm_prtn_t * p_prtn; - - if (!range_arr || !range_len) - return 0; - - for (i = 0; i < range_len; i++) { - for (pkey_64 = range_arr[i][0]; pkey_64 <= range_arr[i][1]; pkey_64++) { - pkey = cl_hton16((uint16_t)(pkey_64 & 0x7fff)); - p_prtn = (osm_prtn_t *) - cl_qmap_get(&p_qos_policy->p_subn->prtn_pkey_tbl, pkey); - - if (p_prtn == (osm_prtn_t *)cl_qmap_end( - &p_qos_policy->p_subn->prtn_pkey_tbl)) - p_prtn = NULL; - - if (is_ipoib) { - /* - * Be very strict for IPoIB partition: - * - the partition for the pkey have to exist - * - it has to have at least 2 full members - */ - if (!p_prtn) { - __qos_parser_error("IPoIB partition, pkey 0x%04X - " - "partition doesn't exist", - cl_ntoh16(pkey)); - return 1; - } - else if (cl_map_count(&p_prtn->full_guid_tbl) < 2) { - __qos_parser_error("IPoIB partition, pkey 0x%04X - " - "partition has less than two full members", - cl_ntoh16(pkey)); - return 1; - } - } - else if (!p_prtn) { - /* - * For non-IPoIB pkey we just want to check that - * the relevant partition exists. - * And even if it doesn't, don't exit - just print - * error message and continue. 
- */ - OSM_LOG(p_qos_parser_osm_log, OSM_LOG_ERROR, "ERR AC02: " - "pkey 0x%04X - partition doesn't exist", - cl_ntoh16(pkey)); - } - } - } - return 0; -} - -/*************************************************** - ***************************************************/ diff --git a/opensm/opensm/osm_qos_parser_l.l b/opensm/opensm/osm_qos_parser_l.l new file mode 100644 index 0000000..ecdee8a --- /dev/null +++ b/opensm/opensm/osm_qos_parser_l.l @@ -0,0 +1,394 @@ +%{ +/* + * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved. + * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. + * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + */ + +/* + * Abstract: + * Lexer of OSM QoS parser. + * + * Environment: + * Linux User Mode + * + * Author: + * Yevgeny Kliteynik, Mellanox + */ + +#include +#include "osm_qos_parser_y.h" + +#define HANDLE_IF_IN_DESCRIPTION if (in_description) { yylval = strdup(yytext); return TK_TEXT; } + +#define SAVE_POS save_pos() +static void save_pos(); + +extern int column_num; +extern int line_num; +extern FILE * yyin; +extern YYSTYPE yylval; + +boolean_t in_description = FALSE; +boolean_t in_list_of_hex_num_ranges = FALSE; +boolean_t in_node_type = FALSE; +boolean_t in_list_of_numbers = FALSE; +boolean_t in_list_of_strings = FALSE; +boolean_t in_list_of_num_pairs = FALSE; +boolean_t in_asterisk_or_list_of_numbers = FALSE; +boolean_t in_list_of_num_ranges = FALSE; +boolean_t in_single_string = FALSE; +boolean_t in_single_number = FALSE; + +static void reset_new_line_flags(); +#define RESET_NEW_LINE_FLAGS reset_new_line_flags() + +#define START_USE {in_description = TRUE;} /* list of strings including whitespace (description) */ +#define START_PORT_GUID {in_list_of_hex_num_ranges = TRUE;} /* comma-separated list of hex num ranges */ +#define START_PORT_NAME {in_list_of_strings = TRUE;} /* comma-separated list of following strings: ../../.. */ +#define START_PARTITION {in_single_string = TRUE;} /* single string w/o whitespaces (partition name) */ +#define START_NAME {in_single_string = TRUE;} /* single string w/o whitespaces (port group name) */ +#define START_QOS_LEVEL_NAME {in_single_string = TRUE;} /* single string w/o whitespaces (qos level name in match rule) */ + +#define START_NODE_TYPE {in_node_type = TRUE;} /* comma-separated list of node types (ROUTER,CA,...) 
*/ +#define START_SL2VL_TABLE {in_list_of_numbers = TRUE;} /* comma-separated list of hex or dec numbers */ + +#define START_GROUP {in_list_of_strings = TRUE;} /* list of strings w/o whitespaces (group names) */ +#define START_ACROSS {in_list_of_strings = TRUE;} /* list of strings w/o whitespaces (group names) */ +#define START_ACROSS_TO {in_list_of_strings = TRUE;} /* list of strings w/o whitespaces (group names) */ +#define START_ACROSS_FROM {in_list_of_strings = TRUE;} /* list of strings w/o whitespaces (group names) */ +#define START_SOURCE {in_list_of_strings = TRUE;} /* list of strings w/o whitespaces (group names) */ +#define START_DESTINATION {in_list_of_strings = TRUE;} /* list of strings w/o whitespaces (group names) */ + +#define START_VLARB_HIGH {in_list_of_num_pairs = TRUE;} /* comma-separated list of hex or dec num pairs: "num1:num2" */ +#define START_VLARB_LOW {in_list_of_num_pairs = TRUE;} /* comma-separated list of hex or dec num pairs: "num1:num2" */ + +#define START_TO {in_asterisk_or_list_of_numbers = TRUE;} /* (asterisk) or (comma-separated list of hex or dec numbers) */ +#define START_FROM {in_asterisk_or_list_of_numbers = TRUE;} /* (asterisk) or (comma-separated list of hex or dec numbers) */ + +#define START_PATH_BITS {in_list_of_num_ranges = TRUE;} /* comma-separated list of hex or dec num ranges */ +#define START_QOS_CLASS {in_list_of_num_ranges = TRUE;} /* comma-separated list of hex or dec num ranges */ +#define START_SERVICE_ID {in_list_of_num_ranges = TRUE;} /* comma-separated list of hex or dec num ranges */ +#define START_PKEY {in_list_of_num_ranges = TRUE;} /* comma-separated list of hex or dec num ranges */ + +#define START_SL {in_single_number = TRUE;} /* single number */ +#define START_VLARB_HIGH_LIMIT {in_single_number = TRUE;} /* single number */ +#define START_MTU_LIMIT {in_single_number = TRUE;} /* single number */ +#define START_RATE_LIMIT {in_single_number = TRUE;} /* single number */ +#define START_PACKET_LIFE 
{in_single_number = TRUE;} /* single number */ + +#define START_ULP_DEFAULT {in_single_number = TRUE;} /* single number */ +#define START_ULP_ANY {in_list_of_num_ranges = TRUE;} /* comma-separated list of hex or dec num ranges */ +#define START_ULP_SDP_DEFAULT {in_single_number = TRUE;} /* single number */ +#define START_ULP_SDP_PORT {in_list_of_num_ranges = TRUE;} /* comma-separated list of hex or dec num ranges */ +#define START_ULP_RDS_DEFAULT {in_single_number = TRUE;} /* single number */ +#define START_ULP_RDS_PORT {in_list_of_num_ranges = TRUE;} /* comma-separated list of hex or dec num ranges */ +#define START_ULP_ISER_DEFAULT {in_single_number = TRUE;} /* single number */ +#define START_ULP_ISER_PORT {in_list_of_num_ranges = TRUE;} /* comma-separated list of hex or dec num ranges */ +#define START_ULP_SRP_GUID {in_list_of_num_ranges = TRUE;} /* comma-separated list of hex or dec num ranges */ +#define START_ULP_IPOIB_DEFAULT {in_single_number = TRUE;} /* single number */ +#define START_ULP_IPOIB_PKEY {in_list_of_num_ranges = TRUE;} /* comma-separated list of hex or dec num ranges */ + + +%} + +%option nounput noinput + +QOS_ULPS_START qos\-ulps +QOS_ULPS_END end\-qos\-ulps +PORT_GROUPS_START port\-groups +PORT_GROUPS_END end\-port\-groups +PORT_GROUP_START port\-group +PORT_GROUP_END end\-port\-group +PORT_NUM port\-num +NAME name +USE use +PORT_GUID port\-guid +TARGET_PORT_GUID target\-port\-guid +PORT_NAME port\-name +PARTITION partition +NODE_TYPE node\-type +QOS_SETUP_START qos\-setup +QOS_SETUP_END end\-qos\-setup +VLARB_TABLES_START vlarb\-tables +VLARB_TABLES_END end\-vlarb\-tables +VLARB_SCOPE_START vlarb\-scope +VLARB_SCOPE_END end\-vlarb\-scope +GROUP group +ACROSS across +VLARB_HIGH vlarb\-high +VLARB_LOW vlarb\-low +VLARB_HIGH_LIMIT vl\-high\-limit +SL2VL_TABLES_START sl2vl\-tables +SL2VL_TABLES_END end\-sl2vl\-tables +SL2VL_SCOPE_START sl2vl\-scope +SL2VL_SCOPE_END end\-sl2vl\-scope +TO to +FROM from +ACROSS_TO across\-to +ACROSS_FROM 
across\-from +SL2VL_TABLE sl2vl\-table +QOS_LEVELS_START qos\-levels +QOS_LEVELS_END end\-qos\-levels +QOS_LEVEL_START qos\-level +QOS_LEVEL_END end\-qos\-level +SL sl +MTU_LIMIT mtu\-limit +RATE_LIMIT rate\-limit +PACKET_LIFE packet\-life +PATH_BITS path\-bits +QOS_MATCH_RULES_START qos\-match\-rules +QOS_MATCH_RULES_END end\-qos\-match\-rules +QOS_MATCH_RULE_START qos\-match\-rule +QOS_MATCH_RULE_END end\-qos\-match\-rule +QOS_CLASS qos\-class +SOURCE source +DESTINATION destination +SERVICE_ID service\-id +PKEY pkey +QOS_LEVEL_NAME qos\-level\-name + +ROUTER [Rr][Oo][Uu][Tt][Ee][Rr] +CA [Cc][Aa] +SWITCH [Ss][Ww][Ii][Tt][Cc][Hh] +SELF [Ss][Ee][Ll][Ff] +ALL [Aa][Ll][Ll] + +ULP_SDP [Ss][Dd][Pp] +ULP_SRP [Ss][Rr][Pp] +ULP_RDS [Rr][Dd][Ss] +ULP_IPOIB [Ii][Pp][Oo][Ii][Bb] +ULP_ISER [Ii][Ss][Ee][Rr] +ULP_ANY [Aa][Nn][Yy] +ULP_DEFAULT [Dd][Ee][Ff][Aa][Uu][Ll][Tt] + +WHITE [ \t]+ +NEW_LINE \n +COMMENT \#.*\n +WHITE_DOTDOT_WHITE [ \t]*:[ \t]* +WHITE_COMMA_WHITE [ \t]*,[ \t]* +QUOTED_TEXT \"[^\"]*\" + +%% + + +{COMMENT} { SAVE_POS; RESET_NEW_LINE_FLAGS; } /* swallow comment */ +{WHITE}{NEW_LINE} { SAVE_POS; RESET_NEW_LINE_FLAGS; } /* trailing blanks with new line */ +{WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; } +{NEW_LINE} { SAVE_POS; RESET_NEW_LINE_FLAGS; } + +{QOS_ULPS_START} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_QOS_ULPS_START; } +{QOS_ULPS_END} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_QOS_ULPS_END; } + +{PORT_GROUPS_START} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_PORT_GROUPS_START; } +{PORT_GROUPS_END} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_PORT_GROUPS_END; } +{PORT_GROUP_START} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_PORT_GROUP_START; } +{PORT_GROUP_END} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_PORT_GROUP_END; } + +{QOS_SETUP_START} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_QOS_SETUP_START; } +{QOS_SETUP_END} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_QOS_SETUP_END; } +{VLARB_TABLES_START} { SAVE_POS; 
HANDLE_IF_IN_DESCRIPTION; return TK_VLARB_TABLES_START; } +{VLARB_TABLES_END} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_VLARB_TABLES_END; } +{VLARB_SCOPE_START} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_VLARB_SCOPE_START; } +{VLARB_SCOPE_END} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_VLARB_SCOPE_END; } + +{SL2VL_TABLES_START} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_SL2VL_TABLES_START; } +{SL2VL_TABLES_END} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_SL2VL_TABLES_END; } +{SL2VL_SCOPE_START} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_SL2VL_SCOPE_START; } +{SL2VL_SCOPE_END} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_SL2VL_SCOPE_END; } + +{QOS_LEVELS_START} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_QOS_LEVELS_START; } +{QOS_LEVELS_END} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_QOS_LEVELS_END; } +{QOS_LEVEL_START} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_QOS_LEVEL_START; } +{QOS_LEVEL_END} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_QOS_LEVEL_END; } + +{QOS_MATCH_RULES_START} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_QOS_MATCH_RULES_START; } +{QOS_MATCH_RULES_END} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_QOS_MATCH_RULES_END; } +{QOS_MATCH_RULE_START} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_QOS_MATCH_RULE_START; } +{QOS_MATCH_RULE_END} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; return TK_QOS_MATCH_RULE_END; } + +{PORT_GUID}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_PORT_GUID; return TK_PORT_GUID; } +{PORT_NAME}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_PORT_NAME; return TK_PORT_NAME; } +{PARTITION}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_PARTITION; return TK_PARTITION; } +{NODE_TYPE}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_NODE_TYPE; return TK_NODE_TYPE; } +{NAME}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_NAME; return TK_NAME; } +{USE}{WHITE_DOTDOT_WHITE} { SAVE_POS; 
HANDLE_IF_IN_DESCRIPTION; START_USE; return TK_USE; } +{GROUP}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_GROUP; return TK_GROUP; } +{VLARB_HIGH}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_VLARB_HIGH; return TK_VLARB_HIGH; } +{VLARB_LOW}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_VLARB_LOW; return TK_VLARB_LOW; } +{VLARB_HIGH_LIMIT}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_VLARB_HIGH_LIMIT; return TK_VLARB_HIGH_LIMIT;} +{TO}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_TO; return TK_TO; } +{FROM}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_FROM; return TK_FROM; } +{ACROSS_TO}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ACROSS_TO; return TK_ACROSS_TO; } +{ACROSS_FROM}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ACROSS_FROM; return TK_ACROSS_FROM;} +{ACROSS}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ACROSS; return TK_ACROSS; } +{SL2VL_TABLE}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_SL2VL_TABLE; return TK_SL2VL_TABLE;} +{SL}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_SL; return TK_SL; } +{MTU_LIMIT}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_MTU_LIMIT; return TK_MTU_LIMIT; } +{RATE_LIMIT}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_RATE_LIMIT; return TK_RATE_LIMIT; } +{PACKET_LIFE}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_PACKET_LIFE; return TK_PACKET_LIFE;} +{PATH_BITS}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_PATH_BITS; return TK_PATH_BITS; } +{QOS_CLASS}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_QOS_CLASS; return TK_QOS_CLASS; } +{SOURCE}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_SOURCE; return TK_SOURCE; } +{DESTINATION}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_DESTINATION; return 
TK_DESTINATION;} +{SERVICE_ID}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_SERVICE_ID; return TK_SERVICE_ID; } +{PKEY}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_PKEY; return TK_PKEY; } +{QOS_LEVEL_NAME}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_QOS_LEVEL_NAME; return TK_QOS_LEVEL_NAME;} + +{ROUTER} { SAVE_POS; if (in_node_type) return TK_NODE_TYPE_ROUTER; yylval = strdup(yytext); return TK_TEXT; } +{CA} { SAVE_POS; if (in_node_type) return TK_NODE_TYPE_CA; yylval = strdup(yytext); return TK_TEXT; } +{SWITCH} { SAVE_POS; if (in_node_type) return TK_NODE_TYPE_SWITCH; yylval = strdup(yytext); return TK_TEXT; } +{SELF} { SAVE_POS; if (in_node_type) return TK_NODE_TYPE_SELF; yylval = strdup(yytext); return TK_TEXT; } +{ALL} { SAVE_POS; if (in_node_type) return TK_NODE_TYPE_ALL; yylval = strdup(yytext); return TK_TEXT; } + +{ULP_DEFAULT}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_DEFAULT; return TK_ULP_DEFAULT; } +{ULP_ANY}{WHITE_COMMA_WHITE}{SERVICE_ID} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_ANY; return TK_ULP_ANY_SERVICE_ID; } +{ULP_ANY}{WHITE_COMMA_WHITE}{PKEY} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_ANY; return TK_ULP_ANY_PKEY; } +{ULP_ANY}{WHITE_COMMA_WHITE}{TARGET_PORT_GUID} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_ANY; return TK_ULP_ANY_TARGET_PORT_GUID; } + +{ULP_SDP}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_SDP_DEFAULT; return TK_ULP_SDP_DEFAULT; } +{ULP_SDP}{WHITE_COMMA_WHITE}{PORT_NUM} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_SDP_PORT; return TK_ULP_SDP_PORT; } + +{ULP_RDS}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_RDS_DEFAULT; return TK_ULP_RDS_DEFAULT; } +{ULP_RDS}{WHITE_COMMA_WHITE}{PORT_NUM} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_RDS_PORT; return TK_ULP_RDS_PORT; } + +{ULP_ISER}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_SDP_DEFAULT; return 
TK_ULP_ISER_DEFAULT; } +{ULP_ISER}{WHITE_COMMA_WHITE}{PORT_NUM} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_SDP_PORT; return TK_ULP_ISER_PORT; } + +{ULP_SRP}{WHITE_COMMA_WHITE}{TARGET_PORT_GUID} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_SRP_GUID; return TK_ULP_SRP_GUID; } + +{ULP_IPOIB}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_IPOIB_DEFAULT; return TK_ULP_IPOIB_DEFAULT; } +{ULP_IPOIB}{WHITE_COMMA_WHITE}{PKEY} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_IPOIB_PKEY; return TK_ULP_IPOIB_PKEY; } + +0[xX][0-9a-fA-F]+ { + SAVE_POS; + yylval = strdup(yytext); + if (in_description || in_list_of_strings || in_single_string) + return TK_TEXT; + return TK_NUMBER; + } + +[0-9]+ { + SAVE_POS; + yylval = strdup(yytext); + if (in_description || in_list_of_strings || in_single_string) + return TK_TEXT; + return TK_NUMBER; + } + + +- { + SAVE_POS; + if (in_description || in_list_of_strings || in_single_string) + { + yylval = strdup(yytext); + return TK_TEXT; + } + return TK_DASH; + } + +: { + SAVE_POS; + if (in_description || in_list_of_strings || in_single_string) + { + yylval = strdup(yytext); + return TK_TEXT; + } + return TK_DOTDOT; + } + +, { + SAVE_POS; + if (in_description) + { + yylval = strdup(yytext); + return TK_TEXT; + } + return TK_COMMA; + } + +\* { + SAVE_POS; + if (in_description || in_list_of_strings || in_single_string) + { + yylval = strdup(yytext); + return TK_TEXT; + } + return TK_ASTERISK; + } + +{QUOTED_TEXT} { + SAVE_POS; + yylval = strdup(&yytext[1]); + yylval[strlen(yylval)-1] = '\0'; + return TK_TEXT; + } + +. 
{ SAVE_POS; yylval = strdup(yytext); return TK_TEXT;} + +%% + + +/********************************************* + *********************************************/ + +static void save_pos() +{ + int i; + for (i = 0; i < yyleng; i++) + { + if (yytext[i] == '\n') + { + line_num ++; + column_num = 1; + } + else + column_num ++; + } +} + +/********************************************* + *********************************************/ + +static void reset_new_line_flags() +{ + in_description = FALSE; + in_list_of_hex_num_ranges = FALSE; + in_node_type = FALSE; + in_list_of_numbers = FALSE; + in_list_of_strings = FALSE; + in_list_of_num_pairs = FALSE; + in_asterisk_or_list_of_numbers = FALSE; + in_list_of_num_ranges = FALSE; + in_single_string = FALSE; + in_single_number = FALSE; +} diff --git a/opensm/opensm/osm_qos_parser_y.y b/opensm/opensm/osm_qos_parser_y.y new file mode 100644 index 0000000..de60193 --- /dev/null +++ b/opensm/opensm/osm_qos_parser_y.y @@ -0,0 +1,3063 @@ +%{ +/* + * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved. + * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. + * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. + * Copyright (c) 2008 HNR Consulting. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. 
+ * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + */ + +/* + * Abstract: + * Grammar of OSM QoS parser. + * + * Environment: + * Linux User Mode + * + * Author: + * Yevgeny Kliteynik, Mellanox + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define OSM_QOS_POLICY_MAX_LINE_LEN 1024*10 +#define OSM_QOS_POLICY_SL2VL_TABLE_LEN IB_MAX_NUM_VLS +#define OSM_QOS_POLICY_MAX_VL_NUM IB_MAX_NUM_VLS + +typedef struct tmp_parser_struct_t_ { + char str[OSM_QOS_POLICY_MAX_LINE_LEN]; + uint64_t num_pair[2]; + cl_list_t str_list; + cl_list_t num_list; + cl_list_t num_pair_list; +} tmp_parser_struct_t; + +static void __parser_tmp_struct_init(); +static void __parser_tmp_struct_reset(); +static void __parser_tmp_struct_destroy(); + +static char * __parser_strip_white(char * str); + +static void __parser_str2uint64(uint64_t * p_val, char * str); + +static void __parser_port_group_start(); +static int __parser_port_group_end(); + +static void __parser_sl2vl_scope_start(); +static int __parser_sl2vl_scope_end(); + +static void __parser_vlarb_scope_start(); +static int __parser_vlarb_scope_end(); + +static void __parser_qos_level_start(); +static int __parser_qos_level_end(); + +static void __parser_match_rule_start(); +static int __parser_match_rule_end(); + +static void 
__parser_ulp_match_rule_start(); +static int __parser_ulp_match_rule_end(); + +static void __pkey_rangelist2rangearr( + cl_list_t * p_list, + uint64_t ** * p_arr, + unsigned * p_arr_len); + +static void __rangelist2rangearr( + cl_list_t * p_list, + uint64_t ** * p_arr, + unsigned * p_arr_len); + +static void __merge_rangearr( + uint64_t ** range_arr_1, + unsigned range_len_1, + uint64_t ** range_arr_2, + unsigned range_len_2, + uint64_t ** * p_arr, + unsigned * p_arr_len ); + +static void __parser_add_port_to_port_map( + cl_qmap_t * p_map, + osm_physp_t * p_physp); + +static void __parser_add_guid_range_to_port_map( + cl_qmap_t * p_map, + uint64_t ** range_arr, + unsigned range_len); + +static void __parser_add_pkey_range_to_port_map( + cl_qmap_t * p_map, + uint64_t ** range_arr, + unsigned range_len); + +static void __parser_add_partition_list_to_port_map( + cl_qmap_t * p_map, + cl_list_t * p_list); + +static void __parser_add_map_to_port_map( + cl_qmap_t * p_dmap, + cl_map_t * p_smap); + +static int __validate_pkeys( + uint64_t ** range_arr, + unsigned range_len, + boolean_t is_ipoib); + +static void __setup_simple_qos_levels(); +static void __clear_simple_qos_levels(); +static void __setup_ulp_match_rules(); +static void __process_ulp_match_rules(); +static void yyerror(const char *format, ...); + +extern char * yytext; +extern int yylex (void); +extern FILE * yyin; +extern int errno; +int yyparse(); + +#define RESET_BUFFER __parser_tmp_struct_reset() + +tmp_parser_struct_t tmp_parser_struct; + +int column_num; +int line_num; + +osm_qos_policy_t * p_qos_policy = NULL; +osm_qos_port_group_t * p_current_port_group = NULL; +osm_qos_sl2vl_scope_t * p_current_sl2vl_scope = NULL; +osm_qos_vlarb_scope_t * p_current_vlarb_scope = NULL; +osm_qos_level_t * p_current_qos_level = NULL; +osm_qos_match_rule_t * p_current_qos_match_rule = NULL; +osm_log_t * p_qos_parser_osm_log; + +/* 16 Simple QoS Levels - one for each SL */ +static osm_qos_level_t 
osm_qos_policy_simple_qos_levels[16]; + +/* Default Simple QoS Level */ +osm_qos_level_t __default_simple_qos_level; + +/* + * List of match rules that will be generated by the + * qos-ulp section. These rules are concatenated to + * the end of the usual matching rules list at the + * end of parsing. + */ +static cl_list_t __ulp_match_rules; + +/***************************************************/ + +%} + +%token TK_NUMBER +%token TK_DASH +%token TK_DOTDOT +%token TK_COMMA +%token TK_ASTERISK +%token TK_TEXT + +%token TK_QOS_ULPS_START +%token TK_QOS_ULPS_END + +%token TK_PORT_GROUPS_START +%token TK_PORT_GROUPS_END +%token TK_PORT_GROUP_START +%token TK_PORT_GROUP_END + +%token TK_QOS_SETUP_START +%token TK_QOS_SETUP_END +%token TK_VLARB_TABLES_START +%token TK_VLARB_TABLES_END +%token TK_VLARB_SCOPE_START +%token TK_VLARB_SCOPE_END + +%token TK_SL2VL_TABLES_START +%token TK_SL2VL_TABLES_END +%token TK_SL2VL_SCOPE_START +%token TK_SL2VL_SCOPE_END + +%token TK_QOS_LEVELS_START +%token TK_QOS_LEVELS_END +%token TK_QOS_LEVEL_START +%token TK_QOS_LEVEL_END + +%token TK_QOS_MATCH_RULES_START +%token TK_QOS_MATCH_RULES_END +%token TK_QOS_MATCH_RULE_START +%token TK_QOS_MATCH_RULE_END + +%token TK_NAME +%token TK_USE +%token TK_PORT_GUID +%token TK_PORT_NAME +%token TK_PARTITION +%token TK_NODE_TYPE +%token TK_GROUP +%token TK_ACROSS +%token TK_VLARB_HIGH +%token TK_VLARB_LOW +%token TK_VLARB_HIGH_LIMIT +%token TK_TO +%token TK_FROM +%token TK_ACROSS_TO +%token TK_ACROSS_FROM +%token TK_SL2VL_TABLE +%token TK_SL +%token TK_MTU_LIMIT +%token TK_RATE_LIMIT +%token TK_PACKET_LIFE +%token TK_PATH_BITS +%token TK_QOS_CLASS +%token TK_SOURCE +%token TK_DESTINATION +%token TK_SERVICE_ID +%token TK_QOS_LEVEL_NAME +%token TK_PKEY + +%token TK_NODE_TYPE_ROUTER +%token TK_NODE_TYPE_CA +%token TK_NODE_TYPE_SWITCH +%token TK_NODE_TYPE_SELF +%token TK_NODE_TYPE_ALL + +%token TK_ULP_DEFAULT +%token TK_ULP_ANY_SERVICE_ID +%token TK_ULP_ANY_PKEY +%token TK_ULP_ANY_TARGET_PORT_GUID 
+%token TK_ULP_SDP_DEFAULT +%token TK_ULP_SDP_PORT +%token TK_ULP_RDS_DEFAULT +%token TK_ULP_RDS_PORT +%token TK_ULP_ISER_DEFAULT +%token TK_ULP_ISER_PORT +%token TK_ULP_SRP_GUID +%token TK_ULP_IPOIB_DEFAULT +%token TK_ULP_IPOIB_PKEY + +%start head + +%% + +head: qos_policy_entries + ; + +qos_policy_entries: /* empty */ + | qos_policy_entries qos_policy_entry + ; + +qos_policy_entry: qos_ulps_section + | port_groups_section + | qos_setup_section + | qos_levels_section + | qos_match_rules_section + ; + + /* + * Parsing qos-ulps: + * ------------------- + * qos-ulps + * default : 0 #default SL + * sdp, port-num 30000 : 1 #SL for SDP when destination port is 30000 + * sdp, port-num 10000-20000 : 2 + * sdp : 0 #default SL for SDP + * srp, target-port-guid 0x1234 : 2 + * rds, port-num 25000 : 2 #SL for RDS when destination port is 25000 + * rds, : 0 #default SL for RDS + * iser, port-num 900 : 5 #SL for iSER where target port is 900 + * iser : 4 #default SL for iSER + * ipoib, pkey 0x0001 : 5 #SL for IPoIB on partition with pkey 0x0001 + * ipoib : 6 #default IPoIB partition - pkey=0x7FFF + * any, service-id 0x6234 : 2 + * any, pkey 0x0ABC : 3 + * any, target-port-guid 0x0ABC-0xFFFFF : 6 + * end-qos-ulps + */ + +qos_ulps_section: TK_QOS_ULPS_START qos_ulps TK_QOS_ULPS_END + ; + +qos_ulps: qos_ulp + | qos_ulps qos_ulp + ; + + /* + * Parsing port groups: + * ------------------- + * port-groups + * port-group + * name: Storage + * use: our SRP storage targets + * port-guid: 0x1000000000000001,0x1000000000000002 + * ... + * port-name: vs1 HCA-1/P1 + * port-name: node_description/P2 + * ... + * pkey: 0x00FF-0x0FFF + * ... + * partition: Part1 + * ... + * node-type: ROUTER,CA,SWITCH,SELF,ALL + * ... + * end-port-group + * port-group + * ... 
+ * end-port-group + * end-port-groups + */ + + +port_groups_section: TK_PORT_GROUPS_START port_groups TK_PORT_GROUPS_END + ; + +port_groups: port_group + | port_groups port_group + ; + +port_group: port_group_start port_group_entries port_group_end + ; + +port_group_start: TK_PORT_GROUP_START { + __parser_port_group_start(); + } + ; + +port_group_end: TK_PORT_GROUP_END { + if ( __parser_port_group_end() ) + return 1; + } + ; + +port_group_entries: /* empty */ + | port_group_entries port_group_entry + ; + +port_group_entry: port_group_name + | port_group_use + | port_group_port_guid + | port_group_port_name + | port_group_pkey + | port_group_partition + | port_group_node_type + ; + + + /* + * Parsing qos setup: + * ----------------- + * qos-setup + * vlarb-tables + * vlarb-scope + * ... + * end-vlarb-scope + * vlarb-scope + * ... + * end-vlarb-scope + * end-vlarb-tables + * sl2vl-tables + * sl2vl-scope + * ... + * end-sl2vl-scope + * sl2vl-scope + * ... + * end-sl2vl-scope + * end-sl2vl-tables + * end-qos-setup + */ + +qos_setup_section: TK_QOS_SETUP_START qos_setup_items TK_QOS_SETUP_END + ; + +qos_setup_items: /* empty */ + | qos_setup_items vlarb_tables + | qos_setup_items sl2vl_tables + ; + + /* Parsing vlarb-tables */ + +vlarb_tables: TK_VLARB_TABLES_START vlarb_scope_items TK_VLARB_TABLES_END + ; + +vlarb_scope_items: /* empty */ + | vlarb_scope_items vlarb_scope + ; + +vlarb_scope: vlarb_scope_start vlarb_scope_entries vlarb_scope_end + ; + +vlarb_scope_start: TK_VLARB_SCOPE_START { + __parser_vlarb_scope_start(); + } + ; + +vlarb_scope_end: TK_VLARB_SCOPE_END { + if ( __parser_vlarb_scope_end() ) + return 1; + } + ; + +vlarb_scope_entries:/* empty */ + | vlarb_scope_entries vlarb_scope_entry + ; + + /* + * vlarb-scope + * group: Storage + * ... + * across: Storage + * ... 
+ * vlarb-high: 0:255,1:127,2:63,3:31,4:15,5:7,6:3,7:1 + * vlarb-low: 8:255,9:127,10:63,11:31,12:15,13:7,14:3 + * vl-high-limit: 10 + * end-vlarb-scope + */ + +vlarb_scope_entry: vlarb_scope_group + | vlarb_scope_across + | vlarb_scope_vlarb_high + | vlarb_scope_vlarb_low + | vlarb_scope_vlarb_high_limit + ; + + /* Parsing sl2vl-tables */ + +sl2vl_tables: TK_SL2VL_TABLES_START sl2vl_scope_items TK_SL2VL_TABLES_END + ; + +sl2vl_scope_items: /* empty */ + | sl2vl_scope_items sl2vl_scope + ; + +sl2vl_scope: sl2vl_scope_start sl2vl_scope_entries sl2vl_scope_end + ; + +sl2vl_scope_start: TK_SL2VL_SCOPE_START { + __parser_sl2vl_scope_start(); + } + ; + +sl2vl_scope_end: TK_SL2VL_SCOPE_END { + if ( __parser_sl2vl_scope_end() ) + return 1; + } + ; + +sl2vl_scope_entries:/* empty */ + | sl2vl_scope_entries sl2vl_scope_entry + ; + + /* + * sl2vl-scope + * group: Part1 + * ... + * from: * + * ... + * to: * + * ... + * across-to: Storage2 + * ... + * across-from: Storage1 + * ... + * sl2vl-table: 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,7 + * end-sl2vl-scope + */ + +sl2vl_scope_entry: sl2vl_scope_group + | sl2vl_scope_across + | sl2vl_scope_across_from + | sl2vl_scope_across_to + | sl2vl_scope_from + | sl2vl_scope_to + | sl2vl_scope_sl2vl_table + ; + + /* + * Parsing qos-levels: + * ------------------ + * qos-levels + * qos-level + * name: qos_level_1 + * use: for the lowest priority communication + * sl: 15 + * mtu-limit: 1 + * rate-limit: 1 + * packet-life: 12 + * path-bits: 2,4,8-32 + * pkey: 0x00FF-0x0FFF + * end-qos-level + * ... 
+ * qos-level + * end-qos-level + * end-qos-levels + */ + + +qos_levels_section: TK_QOS_LEVELS_START qos_levels TK_QOS_LEVELS_END + ; + +qos_levels: /* empty */ + | qos_levels qos_level + ; + +qos_level: qos_level_start qos_level_entries qos_level_end + ; + +qos_level_start: TK_QOS_LEVEL_START { + __parser_qos_level_start(); + } + ; + +qos_level_end: TK_QOS_LEVEL_END { + if ( __parser_qos_level_end() ) + return 1; + } + ; + +qos_level_entries: /* empty */ + | qos_level_entries qos_level_entry + ; + +qos_level_entry: qos_level_name + | qos_level_use + | qos_level_sl + | qos_level_mtu_limit + | qos_level_rate_limit + | qos_level_packet_life + | qos_level_path_bits + | qos_level_pkey + ; + + /* + * Parsing qos-match-rules: + * ----------------------- + * qos-match-rules + * qos-match-rule + * use: low latency by class 7-9 or 11 and bla bla + * qos-class: 7-9,11 + * qos-level-name: default + * source: Storage + * destination: Storage + * service-id: 22,4719-5000 + * pkey: 0x00FF-0x0FFF + * end-qos-match-rule + * qos-match-rule + * ... 
+ * end-qos-match-rule + * end-qos-match-rules + */ + +qos_match_rules_section: TK_QOS_MATCH_RULES_START qos_match_rules TK_QOS_MATCH_RULES_END + ; + +qos_match_rules: /* empty */ + | qos_match_rules qos_match_rule + ; + +qos_match_rule: qos_match_rule_start qos_match_rule_entries qos_match_rule_end + ; + +qos_match_rule_start: TK_QOS_MATCH_RULE_START { + __parser_match_rule_start(); + } + ; + +qos_match_rule_end: TK_QOS_MATCH_RULE_END { + if ( __parser_match_rule_end() ) + return 1; + } + ; + +qos_match_rule_entries: /* empty */ + | qos_match_rule_entries qos_match_rule_entry + ; + +qos_match_rule_entry: qos_match_rule_use + | qos_match_rule_qos_class + | qos_match_rule_qos_level_name + | qos_match_rule_source + | qos_match_rule_destination + | qos_match_rule_service_id + | qos_match_rule_pkey + ; + + + /* + * Parsing qos-ulps: + * ----------------- + * default + * sdp + * sdp with port-num + * rds + * rds with port-num + * srp with port-guid + * iser + * iser with port-num + * ipoib + * ipoib with pkey + * any with service-id + * any with pkey + * any with target-port-guid + */ + +qos_ulp: TK_ULP_DEFAULT single_number { + /* parsing default ulp rule: "default: num" */ + cl_list_iterator_t list_iterator; + uint64_t * p_tmp_num; + + list_iterator = cl_list_head(&tmp_parser_struct.num_list); + p_tmp_num = (uint64_t*)cl_list_obj(list_iterator); + if (*p_tmp_num > 15) + { + yyerror("illegal SL value"); + return 1; + } + __default_simple_qos_level.sl = (uint8_t)(*p_tmp_num); + __default_simple_qos_level.sl_set = TRUE; + free(p_tmp_num); + cl_list_remove_all(&tmp_parser_struct.num_list); + } + + | qos_ulp_type_any_service list_of_ranges TK_DOTDOT { + /* "any, service-id ... 
: sl" - one instance of list of ranges */ + uint64_t ** range_arr; + unsigned range_len; + + if (!cl_list_count(&tmp_parser_struct.num_pair_list)) + { + yyerror("ULP rule doesn't have service ids"); + return 1; + } + + /* get all the service id ranges */ + __rangelist2rangearr( &tmp_parser_struct.num_pair_list, + &range_arr, + &range_len ); + + p_current_qos_match_rule->service_id_range_arr = range_arr; + p_current_qos_match_rule->service_id_range_len = range_len; + + } qos_ulp_sl + + | qos_ulp_type_any_pkey list_of_ranges TK_DOTDOT { + /* "any, pkey ... : sl" - one instance of list of ranges */ + uint64_t ** range_arr; + unsigned range_len; + + if (!cl_list_count(&tmp_parser_struct.num_pair_list)) + { + yyerror("ULP rule doesn't have pkeys"); + return 1; + } + + /* get all the pkey ranges */ + __pkey_rangelist2rangearr( &tmp_parser_struct.num_pair_list, + &range_arr, + &range_len ); + + p_current_qos_match_rule->pkey_range_arr = range_arr; + p_current_qos_match_rule->pkey_range_len = range_len; + + } qos_ulp_sl + + | qos_ulp_type_any_target_port_guid list_of_ranges TK_DOTDOT { + /* any, target-port-guid ... 
: sl */ + uint64_t ** range_arr; + unsigned range_len; + + if (!cl_list_count(&tmp_parser_struct.num_pair_list)) + { + yyerror("ULP rule doesn't have port guids"); + return 1; + } + + /* create a new port group with these ports */ + __parser_port_group_start(); + + p_current_port_group->name = strdup("_ULP_Targets_"); + p_current_port_group->use = strdup("Generated from ULP rules"); + + __rangelist2rangearr( &tmp_parser_struct.num_pair_list, + &range_arr, + &range_len ); + + __parser_add_guid_range_to_port_map( + &p_current_port_group->port_map, + range_arr, + range_len); + + /* add this port group to the destination + groups of the current match rule */ + cl_list_insert_tail(&p_current_qos_match_rule->destination_group_list, + p_current_port_group); + + __parser_port_group_end(); + + } qos_ulp_sl + + | qos_ulp_type_sdp_default { + /* "sdp : sl" - default SL for SDP */ + uint64_t ** range_arr = + (uint64_t **)malloc(sizeof(uint64_t *)); + range_arr[0] = (uint64_t *)malloc(2*sizeof(uint64_t)); + range_arr[0][0] = OSM_QOS_POLICY_ULP_SDP_SERVICE_ID; + range_arr[0][1] = OSM_QOS_POLICY_ULP_SDP_SERVICE_ID + 0xFFFF; + + p_current_qos_match_rule->service_id_range_arr = range_arr; + p_current_qos_match_rule->service_id_range_len = 1; + + } qos_ulp_sl + + | qos_ulp_type_sdp_port list_of_ranges TK_DOTDOT { + /* sdp with port numbers */ + uint64_t ** range_arr; + unsigned range_len; + unsigned i; + + if (!cl_list_count(&tmp_parser_struct.num_pair_list)) + { + yyerror("SDP ULP rule doesn't have port numbers"); + return 1; + } + + /* get all the port ranges */ + __rangelist2rangearr( &tmp_parser_struct.num_pair_list, + &range_arr, + &range_len ); + /* now translate these port numbers into service ids */ + for (i = 0; i < range_len; i++) + { + if (range_arr[i][0] > 0xFFFF || range_arr[i][1] > 0xFFFF) + { + yyerror("SDP port number out of range"); + return 1; + } + range_arr[i][0] += OSM_QOS_POLICY_ULP_SDP_SERVICE_ID; + range_arr[i][1] += OSM_QOS_POLICY_ULP_SDP_SERVICE_ID; + } + + 
p_current_qos_match_rule->service_id_range_arr = range_arr; + p_current_qos_match_rule->service_id_range_len = range_len; + + } qos_ulp_sl + + | qos_ulp_type_rds_default { + /* "rds : sl" - default SL for RDS */ + uint64_t ** range_arr = + (uint64_t **)malloc(sizeof(uint64_t *)); + range_arr[0] = (uint64_t *)malloc(2*sizeof(uint64_t)); + range_arr[0][0] = range_arr[0][1] = + OSM_QOS_POLICY_ULP_RDS_SERVICE_ID + OSM_QOS_POLICY_ULP_RDS_PORT; + + p_current_qos_match_rule->service_id_range_arr = range_arr; + p_current_qos_match_rule->service_id_range_len = 1; + + } qos_ulp_sl + + | qos_ulp_type_rds_port list_of_ranges TK_DOTDOT { + /* rds with port numbers */ + uint64_t ** range_arr; + unsigned range_len; + unsigned i; + + if (!cl_list_count(&tmp_parser_struct.num_pair_list)) + { + yyerror("RDS ULP rule doesn't have port numbers"); + return 1; + } + + /* get all the port ranges */ + __rangelist2rangearr( &tmp_parser_struct.num_pair_list, + &range_arr, + &range_len ); + /* now translate these port numbers into service ids */ + for (i = 0; i < range_len; i++) + { + if (range_arr[i][0] > 0xFFFF || range_arr[i][1] > 0xFFFF) + { + yyerror("RDS port number out of range"); + return 1; + } + range_arr[i][0] += OSM_QOS_POLICY_ULP_RDS_SERVICE_ID; + range_arr[i][1] += OSM_QOS_POLICY_ULP_RDS_SERVICE_ID; + } + + p_current_qos_match_rule->service_id_range_arr = range_arr; + p_current_qos_match_rule->service_id_range_len = range_len; + + } qos_ulp_sl + + | qos_ulp_type_iser_default { + /* "iSER : sl" - default SL for iSER */ + uint64_t ** range_arr = + (uint64_t **)malloc(sizeof(uint64_t *)); + range_arr[0] = (uint64_t *)malloc(2*sizeof(uint64_t)); + range_arr[0][0] = range_arr[0][1] = + OSM_QOS_POLICY_ULP_ISER_SERVICE_ID + OSM_QOS_POLICY_ULP_ISER_PORT; + + p_current_qos_match_rule->service_id_range_arr = range_arr; + p_current_qos_match_rule->service_id_range_len = 1; + + } qos_ulp_sl + + | qos_ulp_type_iser_port list_of_ranges TK_DOTDOT { + /* iser with port numbers */ + uint64_t ** 
range_arr; + unsigned range_len; + unsigned i; + + if (!cl_list_count(&tmp_parser_struct.num_pair_list)) + { + yyerror("iSER ULP rule doesn't have port numbers"); + return 1; + } + + /* get all the port ranges */ + __rangelist2rangearr( &tmp_parser_struct.num_pair_list, + &range_arr, + &range_len ); + /* now translate these port numbers into service ids */ + for (i = 0; i < range_len; i++) + { + if (range_arr[i][0] > 0xFFFF || range_arr[i][1] > 0xFFFF) + { + yyerror("iSER port number out of range"); + return 1; + } + range_arr[i][0] += OSM_QOS_POLICY_ULP_ISER_SERVICE_ID; + range_arr[i][1] += OSM_QOS_POLICY_ULP_ISER_SERVICE_ID; + } + + p_current_qos_match_rule->service_id_range_arr = range_arr; + p_current_qos_match_rule->service_id_range_len = range_len; + + } qos_ulp_sl + + | qos_ulp_type_srp_guid list_of_ranges TK_DOTDOT { + /* srp with target guids - this rule is similar + to writing 'any' ulp with target port guids */ + uint64_t ** range_arr; + unsigned range_len; + + if (!cl_list_count(&tmp_parser_struct.num_pair_list)) + { + yyerror("SRP ULP rule doesn't have port guids"); + return 1; + } + + /* create a new port group with these ports */ + __parser_port_group_start(); + + p_current_port_group->name = strdup("_SRP_Targets_"); + p_current_port_group->use = strdup("Generated from ULP rules"); + + __rangelist2rangearr( &tmp_parser_struct.num_pair_list, + &range_arr, + &range_len ); + + __parser_add_guid_range_to_port_map( + &p_current_port_group->port_map, + range_arr, + range_len); + + /* add this port group to the destination + groups of the current match rule */ + cl_list_insert_tail(&p_current_qos_match_rule->destination_group_list, + p_current_port_group); + + __parser_port_group_end(); + + } qos_ulp_sl + + | qos_ulp_type_ipoib_default { + /* ipoib w/o any pkeys (default pkey) */ + uint64_t ** range_arr = + (uint64_t **)malloc(sizeof(uint64_t *)); + range_arr[0] = (uint64_t *)malloc(2*sizeof(uint64_t)); + range_arr[0][0] = range_arr[0][1] = 0x7fff; + + /* + 
* Although we know that the default partition exists, + * we still need to validate it by checking that it has + * at least two full members. Otherwise IPoIB won't work. + */ + if (__validate_pkeys(range_arr, 1, TRUE)) + return 1; + + p_current_qos_match_rule->pkey_range_arr = range_arr; + p_current_qos_match_rule->pkey_range_len = 1; + + } qos_ulp_sl + + | qos_ulp_type_ipoib_pkey list_of_ranges TK_DOTDOT { + /* ipoib with pkeys */ + uint64_t ** range_arr; + unsigned range_len; + + if (!cl_list_count(&tmp_parser_struct.num_pair_list)) + { + yyerror("IPoIB ULP rule doesn't have pkeys"); + return 1; + } + + /* get all the pkey ranges */ + __pkey_rangelist2rangearr( &tmp_parser_struct.num_pair_list, + &range_arr, + &range_len ); + + /* + * Validate pkeys. + * For IPoIB pkeys the validation is strict. + * If a problem is found, parsing is + * aborted with a proper error message. + */ + if (__validate_pkeys(range_arr, range_len, TRUE)) + return 1; + + p_current_qos_match_rule->pkey_range_arr = range_arr; + p_current_qos_match_rule->pkey_range_len = range_len; + + } qos_ulp_sl + ; + +qos_ulp_type_any_service: TK_ULP_ANY_SERVICE_ID + { __parser_ulp_match_rule_start(); }; + +qos_ulp_type_any_pkey: TK_ULP_ANY_PKEY + { __parser_ulp_match_rule_start(); }; + +qos_ulp_type_any_target_port_guid: TK_ULP_ANY_TARGET_PORT_GUID + { __parser_ulp_match_rule_start(); }; + +qos_ulp_type_sdp_default: TK_ULP_SDP_DEFAULT + { __parser_ulp_match_rule_start(); }; + +qos_ulp_type_sdp_port: TK_ULP_SDP_PORT + { __parser_ulp_match_rule_start(); }; + +qos_ulp_type_rds_default: TK_ULP_RDS_DEFAULT + { __parser_ulp_match_rule_start(); }; + +qos_ulp_type_rds_port: TK_ULP_RDS_PORT + { __parser_ulp_match_rule_start(); }; + +qos_ulp_type_iser_default: TK_ULP_ISER_DEFAULT + { __parser_ulp_match_rule_start(); }; + +qos_ulp_type_iser_port: TK_ULP_ISER_PORT + { __parser_ulp_match_rule_start(); }; + +qos_ulp_type_srp_guid: TK_ULP_SRP_GUID + { __parser_ulp_match_rule_start(); }; + 
+qos_ulp_type_ipoib_default: TK_ULP_IPOIB_DEFAULT + { __parser_ulp_match_rule_start(); }; + +qos_ulp_type_ipoib_pkey: TK_ULP_IPOIB_PKEY + { __parser_ulp_match_rule_start(); }; + + +qos_ulp_sl: single_number { + /* get the SL for ULP rules */ + cl_list_iterator_t list_iterator; + uint64_t * p_tmp_num; + uint8_t sl; + + list_iterator = cl_list_head(&tmp_parser_struct.num_list); + p_tmp_num = (uint64_t*)cl_list_obj(list_iterator); + if (*p_tmp_num > 15) + { + yyerror("illegal SL value"); + return 1; + } + + sl = (uint8_t)(*p_tmp_num); + free(p_tmp_num); + cl_list_remove_all(&tmp_parser_struct.num_list); + + p_current_qos_match_rule->p_qos_level = + &osm_qos_policy_simple_qos_levels[sl]; + p_current_qos_match_rule->qos_level_name = + strdup(osm_qos_policy_simple_qos_levels[sl].name); + + if (__parser_ulp_match_rule_end()) + return 1; + } + ; + + /* + * port_group_entry values: + * port_group_name + * port_group_use + * port_group_port_guid + * port_group_port_name + * port_group_pkey + * port_group_partition + * port_group_node_type + */ + +port_group_name: port_group_name_start single_string { + /* 'name' of 'port-group' - one instance */ + cl_list_iterator_t list_iterator; + char * tmp_str; + + if (p_current_port_group->name) + { + yyerror("port-group has multiple 'name' tags"); + cl_list_remove_all(&tmp_parser_struct.str_list); + return 1; + } + + list_iterator = cl_list_head(&tmp_parser_struct.str_list); + if ( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) + { + tmp_str = (char*)cl_list_obj(list_iterator); + if (tmp_str) + p_current_port_group->name = tmp_str; + } + cl_list_remove_all(&tmp_parser_struct.str_list); + } + ; + +port_group_name_start: TK_NAME { + RESET_BUFFER; + } + ; + +port_group_use: port_group_use_start single_string { + /* 'use' of 'port-group' - one instance */ + cl_list_iterator_t list_iterator; + char * tmp_str; + + if (p_current_port_group->use) + { + yyerror("port-group has multiple 'use' tags"); + 
cl_list_remove_all(&tmp_parser_struct.str_list); + return 1; + } + + list_iterator = cl_list_head(&tmp_parser_struct.str_list); + if ( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) + { + tmp_str = (char*)cl_list_obj(list_iterator); + if (tmp_str) + p_current_port_group->use = tmp_str; + } + cl_list_remove_all(&tmp_parser_struct.str_list); + } + ; + +port_group_use_start: TK_USE { + RESET_BUFFER; + } + ; + +port_group_port_name: port_group_port_name_start string_list { + /* 'port-name' in 'port-group' - any num of instances */ + cl_list_iterator_t list_iterator; + osm_node_t * p_node; + osm_physp_t * p_physp; + unsigned port_num; + char * tmp_str; + char * port_str; + + /* parsing port name strings */ + for (list_iterator = cl_list_head(&tmp_parser_struct.str_list); + list_iterator != cl_list_end(&tmp_parser_struct.str_list); + list_iterator = cl_list_next(list_iterator)) + { + tmp_str = (char*)cl_list_obj(list_iterator); + if (tmp_str) + { + /* last slash in port name string is a separator + between node name and port number */ + port_str = strrchr(tmp_str, '/'); + if (!port_str || (strlen(port_str) < 3) || + (port_str[1] != 'p' && port_str[1] != 'P')) { + yyerror("'%s' - illegal port name", + tmp_str); + free(tmp_str); + cl_list_remove_all(&tmp_parser_struct.str_list); + return 1; + } + + if (!(port_num = strtoul(&port_str[2],NULL,0))) { + yyerror( + "'%s' - illegal port number in port name", + tmp_str); + free(tmp_str); + cl_list_remove_all(&tmp_parser_struct.str_list); + return 1; + } + + /* separate node name from port number */ + port_str[0] = '\0'; + + if (st_lookup(p_qos_policy->p_node_hash, + (st_data_t)tmp_str, + (st_data_t*)&p_node)) + { + /* we found the node, now get the right port */ + p_physp = osm_node_get_physp_ptr(p_node, port_num); + if (!p_physp) { + yyerror( + "'%s' - port number out of range in port name", + tmp_str); + free(tmp_str); + cl_list_remove_all(&tmp_parser_struct.str_list); + return 1; + } + /* we found the port, now 
add it to guid table */ + __parser_add_port_to_port_map(&p_current_port_group->port_map, + p_physp); + } + free(tmp_str); + } + } + cl_list_remove_all(&tmp_parser_struct.str_list); + } + ; + +port_group_port_name_start: TK_PORT_NAME { + RESET_BUFFER; + } + ; + +port_group_port_guid: port_group_port_guid_start list_of_ranges { + /* 'port-guid' in 'port-group' - any num of instances */ + /* list of guid ranges */ + if (cl_list_count(&tmp_parser_struct.num_pair_list)) + { + uint64_t ** range_arr; + unsigned range_len; + + __rangelist2rangearr( &tmp_parser_struct.num_pair_list, + &range_arr, + &range_len ); + + __parser_add_guid_range_to_port_map( + &p_current_port_group->port_map, + range_arr, + range_len); + } + } + ; + +port_group_port_guid_start: TK_PORT_GUID { + RESET_BUFFER; + } + ; + +port_group_pkey: port_group_pkey_start list_of_ranges { + /* 'pkey' in 'port-group' - any num of instances */ + /* list of pkey ranges */ + if (cl_list_count(&tmp_parser_struct.num_pair_list)) + { + uint64_t ** range_arr; + unsigned range_len; + + __pkey_rangelist2rangearr( &tmp_parser_struct.num_pair_list, + &range_arr, + &range_len ); + + __parser_add_pkey_range_to_port_map( + &p_current_port_group->port_map, + range_arr, + range_len); + } + } + ; + +port_group_pkey_start: TK_PKEY { + RESET_BUFFER; + } + ; + +port_group_partition: port_group_partition_start string_list { + /* 'partition' in 'port-group' - any num of instances */ + __parser_add_partition_list_to_port_map( + &p_current_port_group->port_map, + &tmp_parser_struct.str_list); + } + ; + +port_group_partition_start: TK_PARTITION { + RESET_BUFFER; + } + ; + +port_group_node_type: port_group_node_type_start port_group_node_type_list { + /* 'node-type' in 'port-group' - any num of instances */ + } + ; + +port_group_node_type_start: TK_NODE_TYPE { + RESET_BUFFER; + } + ; + +port_group_node_type_list: node_type_item + | port_group_node_type_list TK_COMMA node_type_item + ; + +node_type_item: node_type_ca + | node_type_switch 
+ | node_type_router + | node_type_all + | node_type_self + ; + +node_type_ca: TK_NODE_TYPE_CA { + p_current_port_group->node_types |= + OSM_QOS_POLICY_NODE_TYPE_CA; + } + ; + +node_type_switch: TK_NODE_TYPE_SWITCH { + p_current_port_group->node_types |= + OSM_QOS_POLICY_NODE_TYPE_SWITCH; + } + ; + +node_type_router: TK_NODE_TYPE_ROUTER { + p_current_port_group->node_types |= + OSM_QOS_POLICY_NODE_TYPE_ROUTER; + } + ; + +node_type_all: TK_NODE_TYPE_ALL { + p_current_port_group->node_types |= + (OSM_QOS_POLICY_NODE_TYPE_CA | + OSM_QOS_POLICY_NODE_TYPE_SWITCH | + OSM_QOS_POLICY_NODE_TYPE_ROUTER); + } + ; + +node_type_self: TK_NODE_TYPE_SELF { + osm_port_t * p_osm_port = + osm_get_port_by_guid(p_qos_policy->p_subn, + p_qos_policy->p_subn->sm_port_guid); + if (p_osm_port) + __parser_add_port_to_port_map( + &p_current_port_group->port_map, + p_osm_port->p_physp); + } + ; + + /* + * vlarb_scope_entry values: + * vlarb_scope_group + * vlarb_scope_across + * vlarb_scope_vlarb_high + * vlarb_scope_vlarb_low + * vlarb_scope_vlarb_high_limit + */ + + + +vlarb_scope_group: vlarb_scope_group_start string_list { + /* 'group' in 'vlarb-scope' - any num of instances */ + cl_list_iterator_t list_iterator; + char * tmp_str; + + list_iterator = cl_list_head(&tmp_parser_struct.str_list); + while( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) + { + tmp_str = (char*)cl_list_obj(list_iterator); + if (tmp_str) + cl_list_insert_tail(&p_current_vlarb_scope->group_list,tmp_str); + list_iterator = cl_list_next(list_iterator); + } + cl_list_remove_all(&tmp_parser_struct.str_list); + } + ; + +vlarb_scope_group_start: TK_GROUP { + RESET_BUFFER; + } + ; + +vlarb_scope_across: vlarb_scope_across_start string_list { + /* 'across' in 'vlarb-scope' - any num of instances */ + cl_list_iterator_t list_iterator; + char * tmp_str; + + list_iterator = cl_list_head(&tmp_parser_struct.str_list); + while( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) + { + tmp_str = 
(char*)cl_list_obj(list_iterator); + if (tmp_str) + cl_list_insert_tail(&p_current_vlarb_scope->across_list,tmp_str); + list_iterator = cl_list_next(list_iterator); + } + cl_list_remove_all(&tmp_parser_struct.str_list); + } + ; + +vlarb_scope_across_start: TK_ACROSS { + RESET_BUFFER; + } + ; + +vlarb_scope_vlarb_high_limit: vlarb_scope_vlarb_high_limit_start single_number { + /* 'vl-high-limit' in 'vlarb-scope' - one instance of one number */ + cl_list_iterator_t list_iterator; + uint64_t * p_tmp_num; + + list_iterator = cl_list_head(&tmp_parser_struct.num_list); + p_tmp_num = (uint64_t*)cl_list_obj(list_iterator); + if (p_tmp_num) + { + p_current_vlarb_scope->vl_high_limit = (uint32_t)(*p_tmp_num); + p_current_vlarb_scope->vl_high_limit_set = TRUE; + free(p_tmp_num); + } + + cl_list_remove_all(&tmp_parser_struct.num_list); + } + ; + +vlarb_scope_vlarb_high_limit_start: TK_VLARB_HIGH_LIMIT { + RESET_BUFFER; + } + ; + +vlarb_scope_vlarb_high: vlarb_scope_vlarb_high_start num_list_with_dotdot { + /* 'vlarb-high' in 'vlarb-scope' - list of pairs of numbers with ':' and ',' */ + cl_list_iterator_t list_iterator; + uint64_t * num_pair; + + list_iterator = cl_list_head(&tmp_parser_struct.num_pair_list); + while( list_iterator != cl_list_end(&tmp_parser_struct.num_pair_list) ) + { + num_pair = (uint64_t*)cl_list_obj(list_iterator); + if (num_pair) + cl_list_insert_tail(&p_current_vlarb_scope->vlarb_high_list,num_pair); + list_iterator = cl_list_next(list_iterator); + } + cl_list_remove_all(&tmp_parser_struct.num_pair_list); + } + ; + +vlarb_scope_vlarb_high_start: TK_VLARB_HIGH { + RESET_BUFFER; + } + ; + +vlarb_scope_vlarb_low: vlarb_scope_vlarb_low_start num_list_with_dotdot { + /* 'vlarb-low' in 'vlarb-scope' - list of pairs of numbers with ':' and ',' */ + cl_list_iterator_t list_iterator; + uint64_t * num_pair; + + list_iterator = cl_list_head(&tmp_parser_struct.num_pair_list); + while( list_iterator != cl_list_end(&tmp_parser_struct.num_pair_list) ) + { + num_pair = 
(uint64_t*)cl_list_obj(list_iterator); + if (num_pair) + cl_list_insert_tail(&p_current_vlarb_scope->vlarb_low_list,num_pair); + list_iterator = cl_list_next(list_iterator); + } + cl_list_remove_all(&tmp_parser_struct.num_pair_list); + } + ; + +vlarb_scope_vlarb_low_start: TK_VLARB_LOW { + RESET_BUFFER; + } + ; + + /* + * sl2vl_scope_entry values: + * sl2vl_scope_group + * sl2vl_scope_across + * sl2vl_scope_across_from + * sl2vl_scope_across_to + * sl2vl_scope_from + * sl2vl_scope_to + * sl2vl_scope_sl2vl_table + */ + +sl2vl_scope_group: sl2vl_scope_group_start string_list { + /* 'group' in 'sl2vl-scope' - any num of instances */ + cl_list_iterator_t list_iterator; + char * tmp_str; + + list_iterator = cl_list_head(&tmp_parser_struct.str_list); + while( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) + { + tmp_str = (char*)cl_list_obj(list_iterator); + if (tmp_str) + cl_list_insert_tail(&p_current_sl2vl_scope->group_list,tmp_str); + list_iterator = cl_list_next(list_iterator); + } + cl_list_remove_all(&tmp_parser_struct.str_list); + } + ; + +sl2vl_scope_group_start: TK_GROUP { + RESET_BUFFER; + } + ; + +sl2vl_scope_across: sl2vl_scope_across_start string_list { + /* 'across' in 'sl2vl-scope' - any num of instances */ + cl_list_iterator_t list_iterator; + char * tmp_str; + + list_iterator = cl_list_head(&tmp_parser_struct.str_list); + while( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) + { + tmp_str = (char*)cl_list_obj(list_iterator); + if (tmp_str) { + cl_list_insert_tail(&p_current_sl2vl_scope->across_from_list,tmp_str); + cl_list_insert_tail(&p_current_sl2vl_scope->across_to_list,strdup(tmp_str)); + } + list_iterator = cl_list_next(list_iterator); + } + cl_list_remove_all(&tmp_parser_struct.str_list); + } + ; + +sl2vl_scope_across_start: TK_ACROSS { + RESET_BUFFER; + } + ; + +sl2vl_scope_across_from: sl2vl_scope_across_from_start string_list { + /* 'across-from' in 'sl2vl-scope' - any num of instances */ + cl_list_iterator_t 
list_iterator; + char * tmp_str; + + list_iterator = cl_list_head(&tmp_parser_struct.str_list); + while( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) + { + tmp_str = (char*)cl_list_obj(list_iterator); + if (tmp_str) + cl_list_insert_tail(&p_current_sl2vl_scope->across_from_list,tmp_str); + list_iterator = cl_list_next(list_iterator); + } + cl_list_remove_all(&tmp_parser_struct.str_list); + } + ; + +sl2vl_scope_across_from_start: TK_ACROSS_FROM { + RESET_BUFFER; + } + ; + +sl2vl_scope_across_to: sl2vl_scope_across_to_start string_list { + /* 'across-to' in 'sl2vl-scope' - any num of instances */ + cl_list_iterator_t list_iterator; + char * tmp_str; + + list_iterator = cl_list_head(&tmp_parser_struct.str_list); + while( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) + { + tmp_str = (char*)cl_list_obj(list_iterator); + if (tmp_str) { + cl_list_insert_tail(&p_current_sl2vl_scope->across_to_list,tmp_str); + } + list_iterator = cl_list_next(list_iterator); + } + cl_list_remove_all(&tmp_parser_struct.str_list); + } + ; + +sl2vl_scope_across_to_start: TK_ACROSS_TO { + RESET_BUFFER; + } + ; + +sl2vl_scope_from: sl2vl_scope_from_start sl2vl_scope_from_list_or_asterisk { + /* 'from' in 'sl2vl-scope' - any num of instances */ + } + ; + +sl2vl_scope_from_start: TK_FROM { + RESET_BUFFER; + } + ; + +sl2vl_scope_to: sl2vl_scope_to_start sl2vl_scope_to_list_or_asterisk { + /* 'to' in 'sl2vl-scope' - any num of instances */ + } + ; + +sl2vl_scope_to_start: TK_TO { + RESET_BUFFER; + } + ; + +sl2vl_scope_from_list_or_asterisk: sl2vl_scope_from_asterisk + | sl2vl_scope_from_list_of_ranges + ; + +sl2vl_scope_from_asterisk: TK_ASTERISK { + int i; + for (i = 0; i < OSM_QOS_POLICY_MAX_PORTS_ON_SWITCH; i++) + p_current_sl2vl_scope->from[i] = TRUE; + } + ; + +sl2vl_scope_to_list_or_asterisk: sl2vl_scope_to_asterisk + | sl2vl_scope_to_list_of_ranges + ; + +sl2vl_scope_to_asterisk: TK_ASTERISK { + int i; + for (i = 0; i < OSM_QOS_POLICY_MAX_PORTS_ON_SWITCH; i++) + 
p_current_sl2vl_scope->to[i] = TRUE; + } + ; + +sl2vl_scope_from_list_of_ranges: list_of_ranges { + int i; + cl_list_iterator_t list_iterator; + uint64_t * num_pair; + uint8_t num1, num2; + + list_iterator = cl_list_head(&tmp_parser_struct.num_pair_list); + while( list_iterator != cl_list_end(&tmp_parser_struct.num_pair_list) ) + { + num_pair = (uint64_t*)cl_list_obj(list_iterator); + if (num_pair) + { + if ( num_pair[0] < 0 || + num_pair[1] >= OSM_QOS_POLICY_MAX_PORTS_ON_SWITCH ) + { + yyerror("port number out of range 'from' list"); + free(num_pair); + cl_list_remove_all(&tmp_parser_struct.num_pair_list); + return 1; + } + num1 = (uint8_t)num_pair[0]; + num2 = (uint8_t)num_pair[1]; + free(num_pair); + for (i = num1; i <= num2; i++) + p_current_sl2vl_scope->from[i] = TRUE; + } + list_iterator = cl_list_next(list_iterator); + } + cl_list_remove_all(&tmp_parser_struct.num_pair_list); + } + ; + +sl2vl_scope_to_list_of_ranges: list_of_ranges { + int i; + cl_list_iterator_t list_iterator; + uint64_t * num_pair; + uint8_t num1, num2; + + list_iterator = cl_list_head(&tmp_parser_struct.num_pair_list); + while( list_iterator != cl_list_end(&tmp_parser_struct.num_pair_list) ) + { + num_pair = (uint64_t*)cl_list_obj(list_iterator); + if (num_pair) + { + if ( num_pair[0] < 0 || + num_pair[1] >= OSM_QOS_POLICY_MAX_PORTS_ON_SWITCH ) + { + yyerror("port number out of range 'to' list"); + free(num_pair); + cl_list_remove_all(&tmp_parser_struct.num_pair_list); + return 1; + } + num1 = (uint8_t)num_pair[0]; + num2 = (uint8_t)num_pair[1]; + free(num_pair); + for (i = num1; i <= num2; i++) + p_current_sl2vl_scope->to[i] = TRUE; + } + list_iterator = cl_list_next(list_iterator); + } + cl_list_remove_all(&tmp_parser_struct.num_pair_list); + } + ; + + +sl2vl_scope_sl2vl_table: sl2vl_scope_sl2vl_table_start num_list { + /* 'sl2vl-table' - one instance of exactly + OSM_QOS_POLICY_SL2VL_TABLE_LEN numbers */ + cl_list_iterator_t list_iterator; + uint64_t num; + uint64_t * p_num; + int i = 
0; + + if (p_current_sl2vl_scope->sl2vl_table_set) + { + yyerror("sl2vl-scope has more than one sl2vl-table"); + cl_list_remove_all(&tmp_parser_struct.num_list); + return 1; + } + + if (cl_list_count(&tmp_parser_struct.num_list) != OSM_QOS_POLICY_SL2VL_TABLE_LEN) + { + yyerror("wrong number of values in 'sl2vl-table' (should be 16)"); + cl_list_remove_all(&tmp_parser_struct.num_list); + return 1; + } + + list_iterator = cl_list_head(&tmp_parser_struct.num_list); + while( list_iterator != cl_list_end(&tmp_parser_struct.num_list) ) + { + p_num = (uint64_t*)cl_list_obj(list_iterator); + num = *p_num; + free(p_num); + if (num >= OSM_QOS_POLICY_MAX_VL_NUM) + { + yyerror("wrong VL value in 'sl2vl-table' (should be 0 to 15)"); + cl_list_remove_all(&tmp_parser_struct.num_list); + return 1; + } + + p_current_sl2vl_scope->sl2vl_table[i++] = (uint8_t)num; + list_iterator = cl_list_next(list_iterator); + } + p_current_sl2vl_scope->sl2vl_table_set = TRUE; + cl_list_remove_all(&tmp_parser_struct.num_list); + } + ; + +sl2vl_scope_sl2vl_table_start: TK_SL2VL_TABLE { + RESET_BUFFER; + } + ; + + /* + * qos_level_entry values: + * qos_level_name + * qos_level_use + * qos_level_sl + * qos_level_mtu_limit + * qos_level_rate_limit + * qos_level_packet_life + * qos_level_path_bits + * qos_level_pkey + */ + +qos_level_name: qos_level_name_start single_string { + /* 'name' of 'qos-level' - one instance */ + cl_list_iterator_t list_iterator; + char * tmp_str; + + if (p_current_qos_level->name) + { + yyerror("qos-level has multiple 'name' tags"); + cl_list_remove_all(&tmp_parser_struct.str_list); + return 1; + } + + list_iterator = cl_list_head(&tmp_parser_struct.str_list); + if ( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) + { + tmp_str = (char*)cl_list_obj(list_iterator); + if (tmp_str) + p_current_qos_level->name = tmp_str; + } + cl_list_remove_all(&tmp_parser_struct.str_list); + } + ; + +qos_level_name_start: TK_NAME { + RESET_BUFFER; + } + ; + +qos_level_use: 
qos_level_use_start single_string { + /* 'use' of 'qos-level' - one instance */ + cl_list_iterator_t list_iterator; + char * tmp_str; + + if (p_current_qos_level->use) + { + yyerror("qos-level has multiple 'use' tags"); + cl_list_remove_all(&tmp_parser_struct.str_list); + return 1; + } + + list_iterator = cl_list_head(&tmp_parser_struct.str_list); + if ( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) + { + tmp_str = (char*)cl_list_obj(list_iterator); + if (tmp_str) + p_current_qos_level->use = tmp_str; + } + cl_list_remove_all(&tmp_parser_struct.str_list); + } + ; + +qos_level_use_start: TK_USE { + RESET_BUFFER; + } + ; + +qos_level_sl: qos_level_sl_start single_number { + /* 'sl' in 'qos-level' - one instance */ + cl_list_iterator_t list_iterator; + uint64_t * p_num; + + if (p_current_qos_level->sl_set) + { + yyerror("'qos-level' has multiple 'sl' tags"); + cl_list_remove_all(&tmp_parser_struct.num_list); + return 1; + } + list_iterator = cl_list_head(&tmp_parser_struct.num_list); + p_num = (uint64_t*)cl_list_obj(list_iterator); + p_current_qos_level->sl = (uint8_t)(*p_num); + free(p_num); + p_current_qos_level->sl_set = TRUE; + cl_list_remove_all(&tmp_parser_struct.num_list); + } + ; + +qos_level_sl_start: TK_SL { + RESET_BUFFER; + } + ; + +qos_level_mtu_limit: qos_level_mtu_limit_start single_number { + /* 'mtu-limit' in 'qos-level' - one instance */ + cl_list_iterator_t list_iterator; + uint64_t * p_num; + + if (p_current_qos_level->mtu_limit_set) + { + yyerror("'qos-level' has multiple 'mtu-limit' tags"); + cl_list_remove_all(&tmp_parser_struct.num_list); + return 1; + } + list_iterator = cl_list_head(&tmp_parser_struct.num_list); + p_num = (uint64_t*)cl_list_obj(list_iterator); + p_current_qos_level->mtu_limit = (uint8_t)(*p_num); + free(p_num); + p_current_qos_level->mtu_limit_set = TRUE; + cl_list_remove_all(&tmp_parser_struct.num_list); + } + ; + +qos_level_mtu_limit_start: TK_MTU_LIMIT { + /* 'mtu-limit' in 'qos-level' - one instance */ + 
RESET_BUFFER; + } + ; + +qos_level_rate_limit: qos_level_rate_limit_start single_number { + /* 'rate-limit' in 'qos-level' - one instance */ + cl_list_iterator_t list_iterator; + uint64_t * p_num; + + if (p_current_qos_level->rate_limit_set) + { + yyerror("'qos-level' has multiple 'rate-limit' tags"); + cl_list_remove_all(&tmp_parser_struct.num_list); + return 1; + } + list_iterator = cl_list_head(&tmp_parser_struct.num_list); + p_num = (uint64_t*)cl_list_obj(list_iterator); + p_current_qos_level->rate_limit = (uint8_t)(*p_num); + free(p_num); + p_current_qos_level->rate_limit_set = TRUE; + cl_list_remove_all(&tmp_parser_struct.num_list); + } + ; + +qos_level_rate_limit_start: TK_RATE_LIMIT { + /* 'rate-limit' in 'qos-level' - one instance */ + RESET_BUFFER; + } + ; + +qos_level_packet_life: qos_level_packet_life_start single_number { + /* 'packet-life' in 'qos-level' - one instance */ + cl_list_iterator_t list_iterator; + uint64_t * p_num; + + if (p_current_qos_level->pkt_life_set) + { + yyerror("'qos-level' has multiple 'packet-life' tags"); + cl_list_remove_all(&tmp_parser_struct.num_list); + return 1; + } + list_iterator = cl_list_head(&tmp_parser_struct.num_list); + p_num = (uint64_t*)cl_list_obj(list_iterator); + p_current_qos_level->pkt_life = (uint8_t)(*p_num); + free(p_num); + p_current_qos_level->pkt_life_set= TRUE; + cl_list_remove_all(&tmp_parser_struct.num_list); + } + ; + +qos_level_packet_life_start: TK_PACKET_LIFE { + /* 'packet-life' in 'qos-level' - one instance */ + RESET_BUFFER; + } + ; + +qos_level_path_bits: qos_level_path_bits_start list_of_ranges { + /* 'path-bits' in 'qos-level' - any num of instances */ + /* list of path bit ranges */ + + if (cl_list_count(&tmp_parser_struct.num_pair_list)) + { + uint64_t ** range_arr; + unsigned range_len; + + __rangelist2rangearr( &tmp_parser_struct.num_pair_list, + &range_arr, + &range_len ); + + if ( !p_current_qos_level->path_bits_range_len ) + { + p_current_qos_level->path_bits_range_arr = range_arr; 
+ p_current_qos_level->path_bits_range_len = range_len; + } + else + { + uint64_t ** new_range_arr; + unsigned new_range_len; + __merge_rangearr( p_current_qos_level->path_bits_range_arr, + p_current_qos_level->path_bits_range_len, + range_arr, + range_len, + &new_range_arr, + &new_range_len ); + p_current_qos_level->path_bits_range_arr = new_range_arr; + p_current_qos_level->path_bits_range_len = new_range_len; + } + } + } + ; + +qos_level_path_bits_start: TK_PATH_BITS { + RESET_BUFFER; + } + ; + +qos_level_pkey: qos_level_pkey_start list_of_ranges { + /* 'pkey' in 'qos-level' - num of instances of list of ranges */ + if (cl_list_count(&tmp_parser_struct.num_pair_list)) + { + uint64_t ** range_arr; + unsigned range_len; + + __pkey_rangelist2rangearr( &tmp_parser_struct.num_pair_list, + &range_arr, + &range_len ); + + if ( !p_current_qos_level->pkey_range_len ) + { + p_current_qos_level->pkey_range_arr = range_arr; + p_current_qos_level->pkey_range_len = range_len; + } + else + { + uint64_t ** new_range_arr; + unsigned new_range_len; + __merge_rangearr( p_current_qos_level->pkey_range_arr, + p_current_qos_level->pkey_range_len, + range_arr, + range_len, + &new_range_arr, + &new_range_len ); + p_current_qos_level->pkey_range_arr = new_range_arr; + p_current_qos_level->pkey_range_len = new_range_len; + } + } + } + ; + +qos_level_pkey_start: TK_PKEY { + RESET_BUFFER; + } + ; + + /* + * qos_match_rule_entry values: + * qos_match_rule_use + * qos_match_rule_qos_class + * qos_match_rule_qos_level_name + * qos_match_rule_source + * qos_match_rule_destination + * qos_match_rule_service_id + * qos_match_rule_pkey + */ + + +qos_match_rule_use: qos_match_rule_use_start single_string { + /* 'use' of 'qos-match-rule' - one instance */ + cl_list_iterator_t list_iterator; + char * tmp_str; + + if (p_current_qos_match_rule->use) + { + yyerror("'qos-match-rule' has multiple 'use' tags"); + cl_list_remove_all(&tmp_parser_struct.str_list); + return 1; + } + + list_iterator = 
cl_list_head(&tmp_parser_struct.str_list); + if ( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) + { + tmp_str = (char*)cl_list_obj(list_iterator); + if (tmp_str) + p_current_qos_match_rule->use = tmp_str; + } + cl_list_remove_all(&tmp_parser_struct.str_list); + } + ; + +qos_match_rule_use_start: TK_USE { + RESET_BUFFER; + } + ; + +qos_match_rule_qos_class: qos_match_rule_qos_class_start list_of_ranges { + /* 'qos-class' in 'qos-match-rule' - num of instances of list of ranges */ + /* list of class ranges (QoS Class is 12-bit value) */ + if (cl_list_count(&tmp_parser_struct.num_pair_list)) + { + uint64_t ** range_arr; + unsigned range_len; + + __rangelist2rangearr( &tmp_parser_struct.num_pair_list, + &range_arr, + &range_len ); + + if ( !p_current_qos_match_rule->qos_class_range_len ) + { + p_current_qos_match_rule->qos_class_range_arr = range_arr; + p_current_qos_match_rule->qos_class_range_len = range_len; + } + else + { + uint64_t ** new_range_arr; + unsigned new_range_len; + __merge_rangearr( p_current_qos_match_rule->qos_class_range_arr, + p_current_qos_match_rule->qos_class_range_len, + range_arr, + range_len, + &new_range_arr, + &new_range_len ); + p_current_qos_match_rule->qos_class_range_arr = new_range_arr; + p_current_qos_match_rule->qos_class_range_len = new_range_len; + } + } + } + ; + +qos_match_rule_qos_class_start: TK_QOS_CLASS { + RESET_BUFFER; + } + ; + +qos_match_rule_source: qos_match_rule_source_start string_list { + /* 'source' in 'qos-match-rule' - text */ + cl_list_iterator_t list_iterator; + char * tmp_str; + + list_iterator = cl_list_head(&tmp_parser_struct.str_list); + while( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) + { + tmp_str = (char*)cl_list_obj(list_iterator); + if (tmp_str) + cl_list_insert_tail(&p_current_qos_match_rule->source_list,tmp_str); + list_iterator = cl_list_next(list_iterator); + } + cl_list_remove_all(&tmp_parser_struct.str_list); + } + ; + +qos_match_rule_source_start: TK_SOURCE { + 
RESET_BUFFER; + } + ; + +qos_match_rule_destination: qos_match_rule_destination_start string_list { + /* 'destination' in 'qos-match-rule' - text */ + cl_list_iterator_t list_iterator; + char * tmp_str; + + list_iterator = cl_list_head(&tmp_parser_struct.str_list); + while( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) + { + tmp_str = (char*)cl_list_obj(list_iterator); + if (tmp_str) + cl_list_insert_tail(&p_current_qos_match_rule->destination_list,tmp_str); + list_iterator = cl_list_next(list_iterator); + } + cl_list_remove_all(&tmp_parser_struct.str_list); + } + ; + +qos_match_rule_destination_start: TK_DESTINATION { + RESET_BUFFER; + } + ; + +qos_match_rule_qos_level_name: qos_match_rule_qos_level_name_start single_string { + /* 'qos-level-name' in 'qos-match-rule' - single string */ + cl_list_iterator_t list_iterator; + char * tmp_str; + + if (p_current_qos_match_rule->qos_level_name) + { + yyerror("qos-match-rule has multiple 'qos-level-name' tags"); + /* single_string fills str_list, so clear that list here */ + cl_list_remove_all(&tmp_parser_struct.str_list); + return 1; + } + + list_iterator = cl_list_head(&tmp_parser_struct.str_list); + if ( list_iterator != cl_list_end(&tmp_parser_struct.str_list) ) + { + tmp_str = (char*)cl_list_obj(list_iterator); + if (tmp_str) + p_current_qos_match_rule->qos_level_name = tmp_str; + } + cl_list_remove_all(&tmp_parser_struct.str_list); + } + ; + +qos_match_rule_qos_level_name_start: TK_QOS_LEVEL_NAME { + RESET_BUFFER; + } + ; + +qos_match_rule_service_id: qos_match_rule_service_id_start list_of_ranges { + /* 'service-id' in 'qos-match-rule' - num of instances of list of ranges */ + if (cl_list_count(&tmp_parser_struct.num_pair_list)) + { + uint64_t ** range_arr; + unsigned range_len; + + __rangelist2rangearr( &tmp_parser_struct.num_pair_list, + &range_arr, + &range_len ); + + if ( !p_current_qos_match_rule->service_id_range_len ) + { + p_current_qos_match_rule->service_id_range_arr = range_arr; + p_current_qos_match_rule->service_id_range_len = range_len; + } + else 
+ { + uint64_t ** new_range_arr; + unsigned new_range_len; + __merge_rangearr( p_current_qos_match_rule->service_id_range_arr, + p_current_qos_match_rule->service_id_range_len, + range_arr, + range_len, + &new_range_arr, + &new_range_len ); + p_current_qos_match_rule->service_id_range_arr = new_range_arr; + p_current_qos_match_rule->service_id_range_len = new_range_len; + } + } + } + ; + +qos_match_rule_service_id_start: TK_SERVICE_ID { + RESET_BUFFER; + } + ; + +qos_match_rule_pkey: qos_match_rule_pkey_start list_of_ranges { + /* 'pkey' in 'qos-match-rule' - num of instances of list of ranges */ + if (cl_list_count(&tmp_parser_struct.num_pair_list)) + { + uint64_t ** range_arr; + unsigned range_len; + + __pkey_rangelist2rangearr( &tmp_parser_struct.num_pair_list, + &range_arr, + &range_len ); + + if ( !p_current_qos_match_rule->pkey_range_len ) + { + p_current_qos_match_rule->pkey_range_arr = range_arr; + p_current_qos_match_rule->pkey_range_len = range_len; + } + else + { + uint64_t ** new_range_arr; + unsigned new_range_len; + __merge_rangearr( p_current_qos_match_rule->pkey_range_arr, + p_current_qos_match_rule->pkey_range_len, + range_arr, + range_len, + &new_range_arr, + &new_range_len ); + p_current_qos_match_rule->pkey_range_arr = new_range_arr; + p_current_qos_match_rule->pkey_range_len = new_range_len; + } + } + } + ; + +qos_match_rule_pkey_start: TK_PKEY { + RESET_BUFFER; + } + ; + + + /* + * Common part + */ + + +single_string: single_string_elems { + cl_list_insert_tail(&tmp_parser_struct.str_list, + strdup(__parser_strip_white(tmp_parser_struct.str))); + tmp_parser_struct.str[0] = '\0'; + } + ; + +single_string_elems: single_string_element + | single_string_elems single_string_element + ; + +single_string_element: TK_TEXT { + strcat(tmp_parser_struct.str,$1); + free($1); + } + ; + + +string_list: single_string + | string_list TK_COMMA single_string + ; + + + +single_number: number + ; + +num_list: number + | num_list TK_COMMA number + ; + +number: 
TK_NUMBER { + uint64_t * p_num = (uint64_t*)malloc(sizeof(uint64_t)); + __parser_str2uint64(p_num,$1); + free($1); + cl_list_insert_tail(&tmp_parser_struct.num_list, p_num); + } + ; + +num_list_with_dotdot: number_from_pair_1 TK_DOTDOT number_from_pair_2 { + uint64_t * num_pair = (uint64_t*)malloc(sizeof(uint64_t)*2); + num_pair[0] = tmp_parser_struct.num_pair[0]; + num_pair[1] = tmp_parser_struct.num_pair[1]; + cl_list_insert_tail(&tmp_parser_struct.num_pair_list, num_pair); + } + | num_list_with_dotdot TK_COMMA number_from_pair_1 TK_DOTDOT number_from_pair_2 { + uint64_t * num_pair = (uint64_t*)malloc(sizeof(uint64_t)*2); + num_pair[0] = tmp_parser_struct.num_pair[0]; + num_pair[1] = tmp_parser_struct.num_pair[1]; + cl_list_insert_tail(&tmp_parser_struct.num_pair_list, num_pair); + } + ; + +number_from_pair_1: TK_NUMBER { + __parser_str2uint64(&tmp_parser_struct.num_pair[0],$1); + free($1); + } + ; + +number_from_pair_2: TK_NUMBER { + __parser_str2uint64(&tmp_parser_struct.num_pair[1],$1); + free($1); + } + ; + +list_of_ranges: num_list_with_dash + ; + +num_list_with_dash: single_number_from_range { + uint64_t * num_pair = (uint64_t*)malloc(sizeof(uint64_t)*2); + num_pair[0] = tmp_parser_struct.num_pair[0]; + num_pair[1] = tmp_parser_struct.num_pair[1]; + cl_list_insert_tail(&tmp_parser_struct.num_pair_list, num_pair); + } + | number_from_range_1 TK_DASH number_from_range_2 { + uint64_t * num_pair = (uint64_t*)malloc(sizeof(uint64_t)*2); + if (tmp_parser_struct.num_pair[0] <= tmp_parser_struct.num_pair[1]) { + num_pair[0] = tmp_parser_struct.num_pair[0]; + num_pair[1] = tmp_parser_struct.num_pair[1]; + } + else { + num_pair[1] = tmp_parser_struct.num_pair[0]; + num_pair[0] = tmp_parser_struct.num_pair[1]; + } + cl_list_insert_tail(&tmp_parser_struct.num_pair_list, num_pair); + } + | num_list_with_dash TK_COMMA number_from_range_1 TK_DASH number_from_range_2 { + uint64_t * num_pair = (uint64_t*)malloc(sizeof(uint64_t)*2); + if (tmp_parser_struct.num_pair[0] <= 
tmp_parser_struct.num_pair[1]) { + num_pair[0] = tmp_parser_struct.num_pair[0]; + num_pair[1] = tmp_parser_struct.num_pair[1]; + } + else { + num_pair[1] = tmp_parser_struct.num_pair[0]; + num_pair[0] = tmp_parser_struct.num_pair[1]; + } + cl_list_insert_tail(&tmp_parser_struct.num_pair_list, num_pair); + } + | num_list_with_dash TK_COMMA single_number_from_range { + uint64_t * num_pair = (uint64_t*)malloc(sizeof(uint64_t)*2); + num_pair[0] = tmp_parser_struct.num_pair[0]; + num_pair[1] = tmp_parser_struct.num_pair[1]; + cl_list_insert_tail(&tmp_parser_struct.num_pair_list, num_pair); + } + ; + +single_number_from_range: TK_NUMBER { + __parser_str2uint64(&tmp_parser_struct.num_pair[0],$1); + __parser_str2uint64(&tmp_parser_struct.num_pair[1],$1); + free($1); + } + ; + +number_from_range_1: TK_NUMBER { + __parser_str2uint64(&tmp_parser_struct.num_pair[0],$1); + free($1); + } + ; + +number_from_range_2: TK_NUMBER { + __parser_str2uint64(&tmp_parser_struct.num_pair[1],$1); + free($1); + } + ; + +%% + +/*************************************************** + ***************************************************/ + +int osm_qos_parse_policy_file(IN osm_subn_t * const p_subn) +{ + int res = 0; + static boolean_t first_time = TRUE; + p_qos_parser_osm_log = &p_subn->p_osm->log; + + OSM_LOG_ENTER(p_qos_parser_osm_log); + + osm_qos_policy_destroy(p_subn->p_qos_policy); + p_subn->p_qos_policy = NULL; + + yyin = fopen (p_subn->opt.qos_policy_file, "r"); + if (!yyin) + { + if (strcmp(p_subn->opt.qos_policy_file,OSM_DEFAULT_QOS_POLICY_FILE)) { + OSM_LOG(p_qos_parser_osm_log, OSM_LOG_ERROR, "ERR AC01: " + "Failed opening QoS policy file %s - %s\n", + p_subn->opt.qos_policy_file, strerror(errno)); + res = 1; + } + else + OSM_LOG(p_qos_parser_osm_log, OSM_LOG_VERBOSE, + "QoS policy file not found (%s)\n", + p_subn->opt.qos_policy_file); + + goto Exit; + } + + if (first_time) + { + first_time = FALSE; + __setup_simple_qos_levels(); + __setup_ulp_match_rules(); + 
OSM_LOG(p_qos_parser_osm_log, OSM_LOG_INFO, + "Loading QoS policy file (%s)\n", + p_subn->opt.qos_policy_file); + } + else + /* + * ULP match rules list was emptied at the end of + * previous parsing iteration. + * What's left is to clear simple QoS levels. + */ + __clear_simple_qos_levels(); + + column_num = 1; + line_num = 1; + + p_subn->p_qos_policy = osm_qos_policy_create(p_subn); + + __parser_tmp_struct_init(); + p_qos_policy = p_subn->p_qos_policy; + + res = yyparse(); + + __parser_tmp_struct_destroy(); + + if (res != 0) + { + OSM_LOG(p_qos_parser_osm_log, OSM_LOG_ERROR, "ERR AC03: " + "Failed parsing QoS policy file (%s)\n", + p_subn->opt.qos_policy_file); + osm_qos_policy_destroy(p_subn->p_qos_policy); + p_subn->p_qos_policy = NULL; + res = 1; + goto Exit; + } + + /* add generated ULP match rules to the usual match rules */ + __process_ulp_match_rules(); + + if (osm_qos_policy_validate(p_subn->p_qos_policy,p_qos_parser_osm_log)) + { + OSM_LOG(p_qos_parser_osm_log, OSM_LOG_ERROR, "ERR AC04: " + "Error(s) in QoS policy file (%s)\n", + p_subn->opt.qos_policy_file); + fprintf(stderr, "Error(s) in QoS policy file (%s)\n", + p_subn->opt.qos_policy_file); + osm_qos_policy_destroy(p_subn->p_qos_policy); + p_subn->p_qos_policy = NULL; + res = 1; + goto Exit; + } + + Exit: + if (yyin) + fclose(yyin); + OSM_LOG_EXIT(p_qos_parser_osm_log); + return res; +} + +/*************************************************** + ***************************************************/ + +int yywrap() +{ + return(1); +} + +/*************************************************** + ***************************************************/ + +static void yyerror(const char *format, ...) 
+{ + char s[256]; + va_list pvar; + + OSM_LOG_ENTER(p_qos_parser_osm_log); + + va_start(pvar, format); + vsnprintf(s, sizeof(s), format, pvar); + va_end(pvar); + + OSM_LOG(p_qos_parser_osm_log, OSM_LOG_ERROR, "ERR AC05: " + "Syntax error (line %d:%d): %s\n", + line_num, column_num, s); + fprintf(stderr, "Error in QoS Policy File (line %d:%d): %s.\n", + line_num, column_num, s); + OSM_LOG_EXIT(p_qos_parser_osm_log); +} + +/*************************************************** + ***************************************************/ + +static char * __parser_strip_white(char * str) +{ + int i; + for (i = (strlen(str)-1); i >= 0; i--) + { + if (isspace(str[i])) + str[i] = '\0'; + else + break; + } + for (i = 0; i < strlen(str); i++) + { + if (!isspace(str[i])) + break; + } + return &(str[i]); +} + +/*************************************************** + ***************************************************/ + +static void __parser_str2uint64(uint64_t * p_val, char * str) +{ + *p_val = strtoull(str, NULL, 0); +} + +/*************************************************** + ***************************************************/ + +static void __parser_port_group_start() +{ + p_current_port_group = osm_qos_policy_port_group_create(); +} + +/*************************************************** + ***************************************************/ + +static int __parser_port_group_end() +{ + if(!p_current_port_group->name) + { + yyerror("port-group validation failed - no port group name specified"); + return -1; + } + + cl_list_insert_tail(&p_qos_policy->port_groups, + p_current_port_group); + p_current_port_group = NULL; + return 0; +} + +/*************************************************** + ***************************************************/ + +static void __parser_vlarb_scope_start() +{ + p_current_vlarb_scope = osm_qos_policy_vlarb_scope_create(); +} + +/*************************************************** + ***************************************************/ + +static int 
__parser_vlarb_scope_end() +{ + if ( !cl_list_count(&p_current_vlarb_scope->group_list) && + !cl_list_count(&p_current_vlarb_scope->across_list) ) + { + yyerror("vlarb-scope validation failed - no port groups specified by 'group' or by 'across'"); + return -1; + } + + cl_list_insert_tail(&p_qos_policy->vlarb_tables, + p_current_vlarb_scope); + p_current_vlarb_scope = NULL; + return 0; +} + +/*************************************************** + ***************************************************/ + +static void __parser_sl2vl_scope_start() +{ + p_current_sl2vl_scope = osm_qos_policy_sl2vl_scope_create(); +} + +/*************************************************** + ***************************************************/ + +static int __parser_sl2vl_scope_end() +{ + if (!p_current_sl2vl_scope->sl2vl_table_set) + { + yyerror("sl2vl-scope validation failed - no sl2vl table specified"); + return -1; + } + if ( !cl_list_count(&p_current_sl2vl_scope->group_list) && + !cl_list_count(&p_current_sl2vl_scope->across_to_list) && + !cl_list_count(&p_current_sl2vl_scope->across_from_list) ) + { + yyerror("sl2vl-scope validation failed - no port groups specified by 'group', 'across-to' or 'across-from'"); + return -1; + } + + cl_list_insert_tail(&p_qos_policy->sl2vl_tables, + p_current_sl2vl_scope); + p_current_sl2vl_scope = NULL; + return 0; +} + +/*************************************************** + ***************************************************/ + +static void __parser_qos_level_start() +{ + p_current_qos_level = osm_qos_policy_qos_level_create(); +} + +/*************************************************** + ***************************************************/ + +static int __parser_qos_level_end() +{ + if (!p_current_qos_level->sl_set) + { + yyerror("qos-level validation failed - no 'sl' specified"); + return -1; + } + if (!p_current_qos_level->name) + { + yyerror("qos-level validation failed - no 'name' specified"); + return -1; + } + + 
cl_list_insert_tail(&p_qos_policy->qos_levels, + p_current_qos_level); + p_current_qos_level = NULL; + return 0; +} + +/*************************************************** + ***************************************************/ + +static void __parser_match_rule_start() +{ + p_current_qos_match_rule = osm_qos_policy_match_rule_create(); +} + +/*************************************************** + ***************************************************/ + +static int __parser_match_rule_end() +{ + if (!p_current_qos_match_rule->qos_level_name) + { + yyerror("match-rule validation failed - no 'qos-level-name' specified"); + return -1; + } + + cl_list_insert_tail(&p_qos_policy->qos_match_rules, + p_current_qos_match_rule); + p_current_qos_match_rule = NULL; + return 0; +} + +/*************************************************** + ***************************************************/ + +static void __parser_ulp_match_rule_start() +{ + p_current_qos_match_rule = osm_qos_policy_match_rule_create(); +} + +/*************************************************** + ***************************************************/ + +static int __parser_ulp_match_rule_end() +{ + CL_ASSERT(p_current_qos_match_rule->p_qos_level); + cl_list_insert_tail(&__ulp_match_rules, + p_current_qos_match_rule); + p_current_qos_match_rule = NULL; + return 0; +} + +/*************************************************** + ***************************************************/ + +static void __parser_tmp_struct_init() +{ + tmp_parser_struct.str[0] = '\0'; + cl_list_construct(&tmp_parser_struct.str_list); + cl_list_init(&tmp_parser_struct.str_list, 10); + cl_list_construct(&tmp_parser_struct.num_list); + cl_list_init(&tmp_parser_struct.num_list, 10); + cl_list_construct(&tmp_parser_struct.num_pair_list); + cl_list_init(&tmp_parser_struct.num_pair_list, 10); +} + +/*************************************************** + ***************************************************/ + +/* + * Do NOT free objects from the temp struct. 
+ * Either they are inserted into the parse tree data + * structure, or they are already freed when copying + * their values to the parse tree data structure. + */ +static void __parser_tmp_struct_reset() +{ + tmp_parser_struct.str[0] = '\0'; + cl_list_remove_all(&tmp_parser_struct.str_list); + cl_list_remove_all(&tmp_parser_struct.num_list); + cl_list_remove_all(&tmp_parser_struct.num_pair_list); +} + +/*************************************************** + ***************************************************/ + +static void __parser_tmp_struct_destroy() +{ + __parser_tmp_struct_reset(); + cl_list_destroy(&tmp_parser_struct.str_list); + cl_list_destroy(&tmp_parser_struct.num_list); + cl_list_destroy(&tmp_parser_struct.num_pair_list); +} + +/*************************************************** + ***************************************************/ + +#define __SIMPLE_QOS_LEVEL_NAME "SimpleQoSLevel_SL" +#define __SIMPLE_QOS_LEVEL_DEFAULT_NAME "SimpleQoSLevel_DEFAULT" + +static void __setup_simple_qos_levels() +{ + uint8_t i; + char tmp_buf[30]; + memset(osm_qos_policy_simple_qos_levels, 0, + sizeof(osm_qos_policy_simple_qos_levels)); + for (i = 0; i < 16; i++) + { + osm_qos_policy_simple_qos_levels[i].sl = i; + osm_qos_policy_simple_qos_levels[i].sl_set = TRUE; + sprintf(tmp_buf, "%s%u", __SIMPLE_QOS_LEVEL_NAME, i); + osm_qos_policy_simple_qos_levels[i].name = strdup(tmp_buf); + } + + memset(&__default_simple_qos_level, 0, + sizeof(__default_simple_qos_level)); + __default_simple_qos_level.name = + strdup(__SIMPLE_QOS_LEVEL_DEFAULT_NAME); +} + +/*************************************************** + ***************************************************/ + +static void __clear_simple_qos_levels() +{ + /* + * Simple QoS levels are static. + * What's left is to invalidate default simple QoS level. 
+ */ + __default_simple_qos_level.sl_set = FALSE; +} + +/*************************************************** + ***************************************************/ + +static void __setup_ulp_match_rules() +{ + cl_list_construct(&__ulp_match_rules); + cl_list_init(&__ulp_match_rules, 10); +} + +/*************************************************** + ***************************************************/ + +static void __process_ulp_match_rules() +{ + cl_list_iterator_t list_iterator; + osm_qos_match_rule_t *p_qos_match_rule = NULL; + + list_iterator = cl_list_head(&__ulp_match_rules); + while (list_iterator != cl_list_end(&__ulp_match_rules)) + { + p_qos_match_rule = (osm_qos_match_rule_t *) cl_list_obj(list_iterator); + if (p_qos_match_rule) + cl_list_insert_tail(&p_qos_policy->qos_match_rules, + p_qos_match_rule); + list_iterator = cl_list_next(list_iterator); + } + cl_list_remove_all(&__ulp_match_rules); +} + +/*************************************************** + ***************************************************/ + +static int OSM_CDECL +__cmp_num_range( + const void * p1, + const void * p2) +{ + uint64_t * pair1 = *((uint64_t **)p1); + uint64_t * pair2 = *((uint64_t **)p2); + + if (pair1[0] < pair2[0]) + return -1; + if (pair1[0] > pair2[0]) + return 1; + + if (pair1[1] < pair2[1]) + return -1; + if (pair1[1] > pair2[1]) + return 1; + + return 0; +} + +/*************************************************** + ***************************************************/ + +static void __sort_reduce_rangearr( + uint64_t ** arr, + unsigned arr_len, + uint64_t ** * p_res_arr, + unsigned * p_res_arr_len ) +{ + unsigned i = 0; + unsigned j = 0; + unsigned last_valid_ind = 0; + unsigned valid_cnt = 0; + uint64_t ** res_arr; + boolean_t * is_valid_arr; + + *p_res_arr = NULL; + *p_res_arr_len = 0; + + qsort(arr, arr_len, sizeof(uint64_t*), __cmp_num_range); + + is_valid_arr = (boolean_t *)malloc(arr_len * sizeof(boolean_t)); + is_valid_arr[last_valid_ind] = TRUE; + valid_cnt++; + 
for (i = 1; i < arr_len; i++) + { + if (arr[i][0] <= arr[last_valid_ind][1]) + { + if (arr[i][1] > arr[last_valid_ind][1]) + arr[last_valid_ind][1] = arr[i][1]; + free(arr[i]); + arr[i] = NULL; + is_valid_arr[i] = FALSE; + } + else if ((arr[i][0] - 1) == arr[last_valid_ind][1]) + { + arr[last_valid_ind][1] = arr[i][1]; + free(arr[i]); + arr[i] = NULL; + is_valid_arr[i] = FALSE; + } + else + { + is_valid_arr[i] = TRUE; + last_valid_ind = i; + valid_cnt++; + } + } + + res_arr = (uint64_t **)malloc(valid_cnt * sizeof(uint64_t *)); + for (i = 0; i < arr_len; i++) + { + if (is_valid_arr[i]) + res_arr[j++] = arr[i]; + } + free(is_valid_arr); + free(arr); + + *p_res_arr = res_arr; + *p_res_arr_len = valid_cnt; +} + +/*************************************************** + ***************************************************/ + +static void __pkey_rangelist2rangearr( + cl_list_t * p_list, + uint64_t ** * p_arr, + unsigned * p_arr_len) +{ + uint64_t tmp_pkey; + uint64_t * p_pkeys; + cl_list_iterator_t list_iterator; + + list_iterator= cl_list_head(p_list); + while( list_iterator != cl_list_end(p_list) ) + { + p_pkeys = (uint64_t *)cl_list_obj(list_iterator); + p_pkeys[0] &= 0x7fff; + p_pkeys[1] &= 0x7fff; + if (p_pkeys[0] > p_pkeys[1]) + { + tmp_pkey = p_pkeys[1]; + p_pkeys[1] = p_pkeys[0]; + p_pkeys[0] = tmp_pkey; + } + list_iterator = cl_list_next(list_iterator); + } + + __rangelist2rangearr(p_list, p_arr, p_arr_len); +} + +/*************************************************** + ***************************************************/ + +static void __rangelist2rangearr( + cl_list_t * p_list, + uint64_t ** * p_arr, + unsigned * p_arr_len) +{ + cl_list_iterator_t list_iterator; + unsigned len = cl_list_count(p_list); + unsigned i = 0; + uint64_t ** tmp_arr; + uint64_t ** res_arr = NULL; + unsigned res_arr_len = 0; + + tmp_arr = (uint64_t **)malloc(len * sizeof(uint64_t *)); + + list_iterator = cl_list_head(p_list); + while( list_iterator != cl_list_end(p_list) ) + { + tmp_arr[i++] 
= (uint64_t *)cl_list_obj(list_iterator); + list_iterator = cl_list_next(list_iterator); + } + cl_list_remove_all(p_list); + + __sort_reduce_rangearr( tmp_arr, + len, + &res_arr, + &res_arr_len ); + *p_arr = res_arr; + *p_arr_len = res_arr_len; +} + +/*************************************************** + ***************************************************/ + +static void __merge_rangearr( + uint64_t ** range_arr_1, + unsigned range_len_1, + uint64_t ** range_arr_2, + unsigned range_len_2, + uint64_t ** * p_arr, + unsigned * p_arr_len ) +{ + unsigned i = 0; + unsigned j = 0; + unsigned len = range_len_1 + range_len_2; + uint64_t ** tmp_arr; + uint64_t ** res_arr = NULL; + unsigned res_arr_len = 0; + + *p_arr = NULL; + *p_arr_len = 0; + + tmp_arr = (uint64_t **)malloc(len * sizeof(uint64_t *)); + + for (i = 0; i < range_len_1; i++) + tmp_arr[j++] = range_arr_1[i]; + for (i = 0; i < range_len_2; i++) + tmp_arr[j++] = range_arr_2[i]; + free(range_arr_1); + free(range_arr_2); + + __sort_reduce_rangearr( tmp_arr, + len, + &res_arr, + &res_arr_len ); + *p_arr = res_arr; + *p_arr_len = res_arr_len; +} + +/*************************************************** + ***************************************************/ + +static void __parser_add_port_to_port_map( + cl_qmap_t * p_map, + osm_physp_t * p_physp) +{ + if (cl_qmap_get(p_map, cl_ntoh64(osm_physp_get_port_guid(p_physp))) == + cl_qmap_end(p_map)) + { + osm_qos_port_t * p_port = osm_qos_policy_port_create(p_physp); + if (p_port) + cl_qmap_insert(p_map, + cl_ntoh64(osm_physp_get_port_guid(p_physp)), + &p_port->map_item); + } +} + +/*************************************************** + ***************************************************/ + +static void __parser_add_guid_range_to_port_map( + cl_qmap_t * p_map, + uint64_t ** range_arr, + unsigned range_len) +{ + unsigned i; + uint64_t guid_ho; + osm_port_t * p_osm_port; + + if (!range_arr || !range_len) + return; + + for (i = 0; i < range_len; i++) { + for (guid_ho = 
range_arr[i][0]; guid_ho <= range_arr[i][1]; guid_ho++) { + p_osm_port = + osm_get_port_by_guid(p_qos_policy->p_subn, cl_hton64(guid_ho)); + if (p_osm_port) + __parser_add_port_to_port_map(p_map, p_osm_port->p_physp); + } + free(range_arr[i]); + } + free(range_arr); +} + +/*************************************************** + ***************************************************/ + +static void __parser_add_pkey_range_to_port_map( + cl_qmap_t * p_map, + uint64_t ** range_arr, + unsigned range_len) +{ + unsigned i; + uint64_t pkey_64; + ib_net16_t pkey; + osm_prtn_t * p_prtn; + + if (!range_arr || !range_len) + return; + + for (i = 0; i < range_len; i++) { + for (pkey_64 = range_arr[i][0]; pkey_64 <= range_arr[i][1]; pkey_64++) { + pkey = cl_hton16((uint16_t)(pkey_64 & 0x7fff)); + p_prtn = (osm_prtn_t *) + cl_qmap_get(&p_qos_policy->p_subn->prtn_pkey_tbl, pkey); + if (p_prtn != (osm_prtn_t *)cl_qmap_end( + &p_qos_policy->p_subn->prtn_pkey_tbl)) { + __parser_add_map_to_port_map(p_map, &p_prtn->part_guid_tbl); + __parser_add_map_to_port_map(p_map, &p_prtn->full_guid_tbl); + } + } + free(range_arr[i]); + } + free(range_arr); +} + +/*************************************************** + ***************************************************/ + +static void __parser_add_partition_list_to_port_map( + cl_qmap_t * p_map, + cl_list_t * p_list) +{ + cl_list_iterator_t list_iterator; + char * tmp_str; + osm_prtn_t * p_prtn; + + /* extract all the ports from the partition + to the port map of this port group */ + list_iterator = cl_list_head(p_list); + while(list_iterator != cl_list_end(p_list)) { + tmp_str = (char*)cl_list_obj(list_iterator); + if (tmp_str) { + p_prtn = osm_prtn_find_by_name(p_qos_policy->p_subn, tmp_str); + if (p_prtn) { + __parser_add_map_to_port_map(p_map, &p_prtn->part_guid_tbl); + __parser_add_map_to_port_map(p_map, &p_prtn->full_guid_tbl); + } + free(tmp_str); + } + list_iterator = cl_list_next(list_iterator); + } + cl_list_remove_all(p_list); +} + 
+/*************************************************** + ***************************************************/ + +static void __parser_add_map_to_port_map( + cl_qmap_t * p_dmap, + cl_map_t * p_smap) +{ + cl_map_iterator_t map_iterator; + osm_physp_t * p_physp; + + if (!p_dmap || !p_smap) + return; + + map_iterator = cl_map_head(p_smap); + while (map_iterator != cl_map_end(p_smap)) { + p_physp = (osm_physp_t*)cl_map_obj(map_iterator); + __parser_add_port_to_port_map(p_dmap, p_physp); + map_iterator = cl_map_next(map_iterator); + } +} + +/*************************************************** + ***************************************************/ + +static int __validate_pkeys( uint64_t ** range_arr, + unsigned range_len, + boolean_t is_ipoib) +{ + unsigned i; + uint64_t pkey_64; + ib_net16_t pkey; + osm_prtn_t * p_prtn; + + if (!range_arr || !range_len) + return 0; + + for (i = 0; i < range_len; i++) { + for (pkey_64 = range_arr[i][0]; pkey_64 <= range_arr[i][1]; pkey_64++) { + pkey = cl_hton16((uint16_t)(pkey_64 & 0x7fff)); + p_prtn = (osm_prtn_t *) + cl_qmap_get(&p_qos_policy->p_subn->prtn_pkey_tbl, pkey); + + if (p_prtn == (osm_prtn_t *)cl_qmap_end( + &p_qos_policy->p_subn->prtn_pkey_tbl)) + p_prtn = NULL; + + if (is_ipoib) { + /* + * Be very strict for IPoIB partition: + * - the partition for the pkey have to exist + * - it has to have at least 2 full members + */ + if (!p_prtn) { + yyerror("IPoIB partition, pkey 0x%04X - " + "partition doesn't exist", + cl_ntoh16(pkey)); + return 1; + } + else if (cl_map_count(&p_prtn->full_guid_tbl) < 2) { + yyerror("IPoIB partition, pkey 0x%04X - " + "partition has less than two full members", + cl_ntoh16(pkey)); + return 1; + } + } + else if (!p_prtn) { + /* + * For non-IPoIB pkey we just want to check that + * the relevant partition exists. + * And even if it doesn't, don't exit - just print + * error message and continue. 
+ */ + OSM_LOG(p_qos_parser_osm_log, OSM_LOG_ERROR, "ERR AC02: " + "pkey 0x%04X - partition doesn't exist", + cl_ntoh16(pkey)); + } + } + } + return 0; +} + +/*************************************************** + ***************************************************/ -- 1.5.5.1.178.g1f811 From sashak at voltaire.com Thu Aug 7 01:09:18 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Thu, 7 Aug 2008 11:09:18 +0300 Subject: [ofa-general] Re: [PATCH V2] ibsim: Add a Node Description query drop error. In-Reply-To: <20080804140537.3d39a837.weiny2@llnl.gov> References: <20080730174011.5d80e036.weiny2@llnl.gov> <20080731180323.GV14872@sashak.voltaire.com> <20080731132053.17cd8915.weiny2@llnl.gov> <20080803170946.GF15644@sashak.voltaire.com> <20080804140537.3d39a837.weiny2@llnl.gov> Message-ID: <20080807080918.GJ14250@sashak.voltaire.com> On 14:05 Mon 04 Aug , Ira Weiny wrote: > > This did not quite work (had to add an exception in pc_updated). Right, I found this later too. > Here is a > revised version with some additions to the help message for ease of use. > > Ira > > > From 4d1fb5b5ba24584e27d09e51e29745f986f84a32 Mon Sep 17 00:00:00 2001 > From: Sasha Khapyorsky > Date: Sun, 3 Aug 2008 20:09:46 +0300 > Subject: [PATCH] ibsim: Add a Node Description query drop error. > > Hi Ira, > > On 13:20 Thu 31 Jul , Ira Weiny wrote: > > > > Like this? > > Not exactly. I meant possibility to specify any attribute to drop. Like > this. > > Sasha > > Signed-off-by: Ira Weiny Applied. Thanks. 
Sasha From diego.guella at deviltechnologies.net Thu Aug 7 01:22:13 2008 From: diego.guella at deviltechnologies.net (Diego Guella) Date: Thu, 7 Aug 2008 10:22:13 +0200 Subject: [ofa-general] Infiniband and Opensuse 10.3 stock rpms References: <000601c8f7ad$90eb6d10$05c8a8c0@DIEGO><4899848B.4080705@dev.mellanox.co.il><006501c8f7ba$6161db80$05c8a8c0@DIEGO><004701c8f7c8$3955bfe0$05c8a8c0@DIEGO> <20080806165405.GD19158@sashak.voltaire.com> Message-ID: <00e401c8f866$b22faa40$05c8a8c0@DIEGO> All of this was solved, thanks to the help of Hal Rosenstock, but the messages went off-list. The solution was this: I created the file /etc/udev/rules.d/99-udev-umad.rules: ----- KERNEL=="umad*", NAME="infiniband/%k", MODE="0666" ----- From: "Sasha Khapyorsky" > Also you will need /dev/infiniband/umad0 entry. Prior to this change, umad0 and umad1 were in /dev; now they are in /dev/infiniband. Shouldn't this rule be created by the package libibumad? Thanks, Diego From vlad at lists.openfabrics.org Thu Aug 7 02:44:47 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Thu, 7 Aug 2008 02:44:47 -0700 (PDT) Subject: [ofa-general] ofa_1_4_kernel 20080807-0200 daily build status Message-ID: <20080807094447.4027DE60E03@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_4/linux-2.6.git git_branch: ofed_kernel Common build parameters: Passed: Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.22 Passed on x86_64 with linux-2.6.16 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.22.5-31-default Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on ia64 with linux-2.6.16 Passed on ia64 with linux-2.6.18 
Passed on ia64 with linux-2.6.17 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.25 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.26 Passed on ppc64 with linux-2.6.16 Passed on ppc64 with linux-2.6.24 Failed: Build failed on x86_64 with linux-2.6.16.43-0.3-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.43-0.3-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.43-0.3-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.43-0.3-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.16.21-0.8-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.21-0.8-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 
'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.21-0.8-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.21-0.8-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -fasynchronous-unwind-tables -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.17 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -fomit-frame-pointer -g -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.17_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.17_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.17_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.17_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.17_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.17_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.17_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.17_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.17' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.16.60-0.21-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.60-0.21-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.60-0.21-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.60-0.21-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-1.2798.fc6 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks 
-Wno-sign-compare -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -fasynchronous-unwind-tables -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-1.2798.fc6_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-1.2798.fc6' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-53.el5 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare 
-fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-53.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-53.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-8.el5 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx 
-mno-sse2 -mno-3dnow -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-8.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-8.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.20 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -maccumulate-outgoing-args -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -fomit-frame-pointer -g -fno-stack-protector 
-Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.20_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.20_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.20_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.20_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.20_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.20_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.20_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.20_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.20' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.19 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -maccumulate-outgoing-args -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -fomit-frame-pointer -fasynchronous-unwind-tables -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" 
-D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.19_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.19_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.19_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.19_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.19_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.19_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.19_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.19_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.19' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-93.el5 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o 
/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-93.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-93.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-42.ELsmp Log: include/asm/apic.h:47: warning: value computed is not used /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1840: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/hw/ipath] Error 
2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.9-42.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-42.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-55.ELsmp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: include/linux/skbuff.h: In function 'skb_add_data': include/linux/skbuff.h:1041: warning: pointer targets in passing argument 2 of 'csum_partial_copy_from_user' differ in signedness make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.9-55.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-55.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.16.21-0.8-default Log: from /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.21-0.8-default_ia64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.16.21-0.8-default_ia64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.16.21-0.8-default' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.18 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mtune=power4 -mno-altivec -funit-at-a-time -mstring -Wa,-maltivec -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18_ppc64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18_ppc64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18_ppc64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18_ppc64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.18' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.18-8.el5 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration -Os -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mtune=power4 -mno-altivec -funit-at-a-time -mstring -Wa,-maltivec -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.18-8.el5_ppc64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.18-8.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.17 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -fomit-frame-pointer -g -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mtune=power4 -mno-altivec -funit-at-a-time -mstring -Wa,-maltivec -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.17_ppc64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.17_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.17_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.17_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.17_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.17_ppc64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.17_ppc64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.17_ppc64_check] Error 2 make[1]: Leaving directory 
`/home/vlad/kernel.org/ppc64/linux-2.6.17' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.19 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mtune=power4 -mno-altivec -funit-at-a-time -mstring -Wa,-maltivec -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.19_ppc64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.19_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.19_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.19_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.19_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.19_ppc64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.19_ppc64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080807-0200_linux-2.6.19_ppc64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.19' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- From kliteyn at dev.mellanox.co.il Thu Aug 7 04:54:13 2008 From: kliteyn at dev.mellanox.co.il 
(Yevgeny Kliteynik)
Date: Thu, 07 Aug 2008 14:54:13 +0300
Subject: [ofa-general] opensm hang and osmtest report ERR 0130
In-Reply-To:
References:
Message-ID: <489AE265.2050109@dev.mellanox.co.il>

Wen Hao Wang wrote:
> Hi, Yevgeny:
>
> It seems the Cisco switch has subnet manager running.
>
> ...
>
> >By default, osmtest runs all validation tests, which is similar
> >to 'osmtest -f a'. This flow expects to get an input inventory file.
> >You should first run 'osmtest -f c' to create such file, and then
> >'osmtest' or 'osmtest -f a' to run the tests.
> >See 'man osmtest' for more details.
> >
> "osmtest -f c" failed to create the inventory file.

Here's what I see in the osmtest log:

Aug 07 04:56:53 400669 [42909940] 0x01 -> umad_receiver: ERR 5409: send completed with error (method=0x12 attr=0x35 trans_id=0x2900000004) -- dropping
Aug 07 04:56:53 400674 [42909940] 0x01 -> umad_receiver: ERR 5410: class 0x3 LID 0x2
Aug 07 04:56:53 400687 [42909940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_TIMEOUT)

Attribute ID 0x35 is PathRecord.
I don't know why the embedded SM didn't answer the PathRecord query
from osmtest. I've never tried to run osmtest against non-opensm
subnet managers, and I don't know if someone did it before. Sorry.

Perhaps someone could comment on that...
-- Yevgeny

>
> Wen Hao Wang
> Email: wangwhao at cn.ibm.com
>

From ruimario at gmail.com Thu Aug 7 05:47:27 2008
From: ruimario at gmail.com (Rui Machado)
Date: Thu, 7 Aug 2008 14:47:27 +0200
Subject: [ofa-general] limit on memory registration
In-Reply-To: <6978b4af0808061031w116cf699oea91ba299a695866@mail.gmail.com>
References: <6978b4af0808060847r7bd16de5p22f186401c3a56e2@mail.gmail.com> <4899CD88.4030805@gmail.com> <6978b4af0808060927ga973867ge65b896d419fcd53@mail.gmail.com> <4899D49C.2040204@gmail.com> <6978b4af0808061031w116cf699oea91ba299a695866@mail.gmail.com>
Message-ID: <6978b4af0808070547m7a90abbcsbbd24cbc42e202d8@mail.gmail.com>

>> I have a feeling that you refer to the value of max_mr (am i right?)
>
> :) yep sorry.
> The value for max_mr_size is 18446744073709551615 (can this one be ? )
> Again, how do I decode this?

>>> Mellanox
>>> ca type:25218 (vendor_part_id)
>>> fw_version : 5.1.400 (fw_ver)
>>> hw_version : a0 (hw_ver)
>>>
>>
>> The module parameter "num_mtt" control the size of the above described
>> table.
>
> Ok. And is there a limit?
> And out of curiosity, how does this calculation gets done? I mean, can
> I take the values and say: Ok, with this num_mtt we can go up to X?

Tried with the module parameter but not successfully. I get errors like

ib_mthca 0000:23:00.0: Failed to initialize memory region table, aborting.
or
ib_mthca: Invalid value 1048580 for num_mtt in module parameter.
ib_mthca: Corrected num_mtt to 2097152.

I guess I'm just shooting in the dark :)
And is there a relation to the max_mr_size above?

Thanks for the patience ;)

From brunel at diku.dk Thu Aug 7 08:15:53 2008
From: brunel at diku.dk (Julien Brunel)
Date: Thu, 7 Aug 2008 17:15:53 +0200
Subject: [ofa-general] [PATCH] drivers/infiniband/core: Use an IS_ERR test rather than a NULL test
Message-ID: <200808071715.54004.brunel@diku.dk>

From: Julien Brunel

In case of error, the function ib_create_send_mad returns an ERR pointer, but never returns a NULL pointer.
So after a call to this function, a NULL test should be replaced by an
IS_ERR test.

A simplified version of the semantic patch that makes this change is
as follows:
(http://www.emn.fr/x-info/coccinelle/)

//
@correct_null_test@
expression x,E;
statement S1, S2;
@@

x = ib_create_send_mad(...)
<... when != x = E
if (
(
- x@p2 != NULL
+ ! IS_ERR ( x )
|
- x@p2 == NULL
+ IS_ERR( x )
)
 )
S1
else S2
...>
? x = E;
//

Signed-off-by: Julien Brunel
Signed-off-by: Julia Lawall

---
 drivers/infiniband/core/mad_rmpp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -u -p a/drivers/infiniband/core/mad_rmpp.c b/drivers/infiniband/core/mad_rmpp.c
--- a/drivers/infiniband/core/mad_rmpp.c
+++ b/drivers/infiniband/core/mad_rmpp.c
@@ -133,7 +133,7 @@ static void ack_recv(struct mad_rmpp_rec
 	msg = ib_create_send_mad(&rmpp_recv->agent->agent, recv_wc->wc->src_qp,
 				 recv_wc->wc->pkey_index, 1, hdr_len,
 				 0, GFP_KERNEL);
-	if (!msg)
+	if (IS_ERR(msg))
 		return;
 
 	format_ack(msg, (struct ib_rmpp_mad *) recv_wc->recv_buf.mad, rmpp_recv);

From rdreier at cisco.com Thu Aug 7 08:23:09 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Thu, 07 Aug 2008 08:23:09 -0700
Subject: [ofa-general] Re: [PATCH] drivers/infiniband/core: Use an IS_ERR test rather than a NULL test
In-Reply-To: <200808071715.54004.brunel@diku.dk> (Julien Brunel's message of "Thu, 7 Aug 2008 17:15:53 +0200")
References: <200808071715.54004.brunel@diku.dk>
Message-ID:

good stuff, applied. thanks.

From rdreier at cisco.com Thu Aug 7 08:26:11 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Thu, 07 Aug 2008 08:26:11 -0700
Subject: [ofa-general] [PATCH v2] ipiob: fix rtnl deadlock
In-Reply-To: <4899CF0A.1060509@Voltaire.COM> (Yossi Etigin's message of "Wed, 06 Aug 2008 19:19:22 +0300")
References: <4899CF0A.1060509@Voltaire.COM>
Message-ID:

> Instead of loop-waiting for the lock, give it up if can't lock.
> Same thing is done in drivers/net/cxgb3/cxgb3_main.c.

I think this is worse ...
now if there's anything (*anything* at all -- even stuff related to different devices) holding the rtnl lock at the wrong time, we lose an mtu update. I haven't had a chance to look at this in detail yet, but I would really like to investigate whether we can just avoid the potential deadlock in some more elegant way. From rdreier at cisco.com Thu Aug 7 08:27:46 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 07 Aug 2008 08:27:46 -0700 Subject: [ofa-general] [PATCH] mlx4_ib: Allow 4K messages for UD QPs In-Reply-To: <20080806083203.GA7768@mtls03> (Eli Cohen's message of "Wed, 6 Aug 2008 11:32:03 +0300") References: <20080806083203.GA7768@mtls03> Message-ID: > Current code limits UD QPs message size to 2K while MTU is set to 4K. > This patch sets message size to 4K. What is the impact of the current situation? Is this an enhancement for 2.6.28, or does it fix a real-world problem and should go to 2.6.27? > Signed-off-by: Alex Naslednikov > Signed-off-by: Eli Cohen Does this mean Alex was the original author? If so there should be a line like From: Alex Naslednikov before the changelog entry too. - R. From eli at dev.mellanox.co.il Thu Aug 7 08:46:44 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Thu, 7 Aug 2008 18:46:44 +0300 Subject: [ewg] Re: [ofa-general] [PATCH] mlx4_ib: Allow 4K messages for UD QPs In-Reply-To: References: <20080806083203.GA7768@mtls03> Message-ID: <20080807154644.GC2137@mtls03> On Thu, Aug 07, 2008 at 08:27:46AM -0700, Roland Dreier wrote: > > What is the impact of the current situation? Is this an enhancement for > 2.6.28, or does it fix a real-world problem and should go to 2.6.27? The impact is that if you have the HCA FW that supports 4K MTU, you won't be able to send 4K UD messages since the HCA will limit you due to this configuration. So I think we want to push this to 2.6.27 too. > > > Signed-off-by: Alex Naslednikov > > Signed-off-by: Eli Cohen > > Does this mean Alex was the original author? 
If so there should be a
> line like
>
> From: Alex Naslednikov
>
> before the changelog entry too.
>
Sure, the patch is Alex's. Next time...

From amar.mudrankit at gmail.com Thu Aug 7 09:31:00 2008
From: amar.mudrankit at gmail.com (Amar Mudrankit)
Date: Thu, 7 Aug 2008 22:01:00 +0530
Subject: [ofa-general] OpenSM QoS Query
Message-ID:

Considering the following QoS configuration for OpenSM running over OFED-1.3.1

/var/cache/opensm/opensm.opts

# QoS default options
qos_max_vls 8
qos_high_limit 8
qos_vlarb_high 1:1,2:1
qos_vlarb_low 1:1
qos_sl2vl 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15

/etc/opensm/qos-policy.conf

qos-ulps

    default : 0
    ipoib : 1
    srp, target-port-guid : 2

end-qos-ulps

IPoIB running in connected mode with MTU of 65520 bytes.

In this scenario, do we expect SRP to starve and will not get an
opportunity to send data?

This is because, if the packet size of IPoIB data is 64K, the Limit of
High Priority(8 * 4KB) will be eaten up by IPoIB itself so that next
active table would be low priority VL arbitration table. In the low
priority VL arbitration table also, we have configured IPoIB, which
will send IPoIB data. Eventually when high priority table will become
active again, 64K ipoib data will be sent. This will not ever schedule
service level 2 (of that of SRP). Is this understanding correct?

or

When the second time High priority VL arbitration table becomes
active, the current pointer will be pointing to service level 2 in
high priority table and SRP data can flow? I am not sure if this
second possibility is valid.

Thanks and Regards,
Amar

From rdreier at cisco.com Thu Aug 7 09:41:26 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Thu, 07 Aug 2008 09:41:26 -0700
Subject: [ofa-general] [PATCH] Use vmalloc to alloc the rx_ring
In-Reply-To: <1217519029.9162.3.camel@dumpserver> (David J.
Wilder's message of "Thu, 31 Jul 2008 08:43:49 -0700")
References: <1217483438.29436.11.camel@dumpserver> <1217519029.9162.3.camel@dumpserver>
Message-ID:

What is the severity of this issue? Is this a patch for 2.6.27?

> -	rx->rx_ring = kcalloc(ipoib_recvq_size, sizeof *rx->rx_ring, GFP_KERNEL);
> -	if (!rx->rx_ring)
> +	rx->rx_ring = vmalloc( ipoib_recvq_size * sizeof *rx->rx_ring);

no space after '(' here.

> +
> +	if (!rx->rx_ring){
> +		printk(KERN_WARNING "ipoib_cm:Allocation of rx_ring failed, %s",
> +			"try using a lower value of recv_queue_size.\n");
> 		return -ENOMEM;
> +	}
>
> 	t = kmalloc(sizeof *t, GFP_KERNEL);
> 	if (!t) {

Seems you are replacing kcalloc with vmalloc, but I don't see anything
that clears the memory you allocate.

> +	priv->cm.srq_ring = vmalloc(ipoib_recvq_size * sizeof *priv->cm.srq_ring);
> +
> 	if (!priv->cm.srq_ring) {
> 		printk(KERN_WARNING "%s: failed to allocate CM SRQ ring (%d entries)\n",
> 		       priv->ca->name, ipoib_recvq_size);
> 		ib_destroy_srq(priv->cm.srq);
> 		priv->cm.srq = NULL;
> 	}
> +	memset(priv->cm.srq_ring, 0,
> +		ipoib_recvq_size * sizeof *priv->cm.srq_ring);

And here it seems if the allocation fail, you go on to zero out the
ring anyway. (Not to mention trailing whitespace on the memset line :)

 - R.

From aj.guillon at gmail.com Thu Aug 7 10:09:16 2008
From: aj.guillon at gmail.com (Adrien Guillon)
Date: Thu, 7 Aug 2008 13:09:16 -0400
Subject: [ofa-general] RDMA n00b: Remote Memory Access and Connection Setup Help
Message-ID: <9870a2060808071009y7e7ed1cdxf2234509ae7ee90a@mail.gmail.com>

Hello List,

I've been reading the Infiniband Architecture specification, and I'm
ready to write my first RDMA-enabled C++ library. I have looked at
some sample code provided by librdmacm, and I'm having a problem going
from the big picture of what I want to do to the finer details.

I want to remotely access memory on various compute nodes, and the
Infiniband spec does everything I need.
However upon looking at librdmacm, and the Infiniband verbs I don't
see how I actually write to remote memory locations. Also I don't
completely understand why I am responsible for creating QPs in
userspace, if everything is supposed to be handled by hardware...
although I suppose I still have to allocate memory for the hardware
to actually use.

I'm looking for some help from this list with a high-level overview of
what I have to do to establish a connection between two nodes, and how
I would write an object to remote memory, or read an object from
remote memory. My current understanding is that I would use librdmacm
to create a connection between the nodes that I want to communicate,
and then use the inifinband verbs themselves to read/write memory, and
to establish a memory window.

I'm developing on Linux.

Thanks a lot!

AJ
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sashak at voltaire.com Thu Aug 7 10:44:06 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 7 Aug 2008 20:44:06 +0300
Subject: [ofa-general] [PATCH] osmtest: fix qpn encoding in osmtest_informinfo_request()
Message-ID: <20080807174406.GH14872@sashak.voltaire.com>

In osmtest_informinfo_request() function qpn was wrongly encoded when
used as part of InformInfo. For this reason osmtest didn't work on a
big endian machines.
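
To see why the position of the shift matters, here is a small
host-independent sketch. It assumes that the 32-bit qpn_resp_time_val
field carries the 24-bit QPN in its first three bytes on the wire. The
helpers swap32, sim_hton32, store_host32, encode_old and encode_new are
made up for illustration and are not part of osmtest or complib; they
only simulate what cl_hton32() followed by a store would do on a little
endian and on a big endian host.

```c
#include <stdint.h>

/* Byte-swap a 32-bit value (what hton32 does on a little endian host). */
uint32_t swap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* hton32 as it would behave on a host of the given byte order
 * (hypothetical stand-in for cl_hton32). */
uint32_t sim_hton32(uint32_t v, int big_endian)
{
	return big_endian ? v : swap32(v);
}

/* Lay out a host integer in memory the way the simulated host would:
 * most significant byte first on big endian, last on little endian. */
void store_host32(uint32_t v, int big_endian, uint8_t out[4])
{
	for (int i = 0; i < 4; i++) {
		int shift = big_endian ? 24 - 8 * i : 8 * i;
		out[i] = (uint8_t) (v >> shift);
	}
}

/* Old expression: cl_hton32(qpn) >> 8 */
void encode_old(uint32_t qpn, int big_endian, uint8_t wire[4])
{
	store_host32(sim_hton32(qpn, big_endian) >> 8, big_endian, wire);
}

/* New expression: cl_hton32(qpn << 8) */
void encode_new(uint32_t qpn, int big_endian, uint8_t wire[4])
{
	store_host32(sim_hton32(qpn << 8, big_endian), big_endian, wire);
}
```

With qpn = 0x123456, encode_new gives the wire bytes 12 34 56 00 for
both byte orders, while encode_old gives the same bytes on little
endian but 00 00 12 34 on big endian.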
Signed-off-by: Sasha Khapyorsky
---
 opensm/osmtest/osmtest.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/opensm/osmtest/osmtest.c b/opensm/osmtest/osmtest.c
index b173e49..5545c8f 100644
--- a/opensm/osmtest/osmtest.c
+++ b/opensm/osmtest/osmtest.c
@@ -4733,7 +4733,7 @@ osmtest_informinfo_request(IN osmtest_t * const p_osmt,
 	rec.subscribe = (uint8_t) p_inform_info_opt->subscribe;
 	if (p_inform_info_opt->qpn) {
 		rec.g_or_v.generic.qpn_resp_time_val =
-		    cl_hton32(p_inform_info_opt->qpn) >> 8;
+		    cl_hton32(p_inform_info_opt->qpn << 8);
 		user.comp_mask |= IB_IIR_COMPMASK_QPN;
 	}
 	if (p_inform_info_opt->trap) {
-- 
1.5.4.rc2.60.gb2e62

From hal.rosenstock at gmail.com Thu Aug 7 11:02:53 2008
From: hal.rosenstock at gmail.com (Hal Rosenstock)
Date: Thu, 7 Aug 2008 14:02:53 -0400
Subject: [ofa-general] opensm hang and osmtest report ERR 0130
In-Reply-To: <489AE265.2050109@dev.mellanox.co.il>
References: <489AE265.2050109@dev.mellanox.co.il>
Message-ID:

On Thu, Aug 7, 2008 at 7:54 AM, Yevgeny Kliteynik wrote:
> Wen Hao Wang wrote:
>>
>> Hi, Yevgeny:
>>
>> It seems the Cisco switch has subnet manager running.
>>
>> ...
>>
>> >By default, osmtest runs all validation tests, which is similar
>> >to 'osmtest -f a'. This flow expects to get an input inventory file.
>> >You should first run 'osmtest -f c' to create such file, and then
>> >'osmtest' or 'osmtest -f a' to run the tests.
>> >See 'man osmtest' for more details.
>> >>
>> "osmtest -f c" failed to create the inventory file.
>
> Here's what I see in the osmtest log:
>
> Aug 07 04:56:53 400669 [42909940] 0x01 -> umad_receiver: ERR 5409: send
> completed with error (method=0x12 attr=0x35 trans_id=0x2900000004) --
> dropping
> Aug 07 04:56:53 400674 [42909940] 0x01 -> umad_receiver: ERR 5410: class 0x3
> LID 0x2
> Aug 07 04:56:53 400687 [42909940] 0x01 -> osmtest_query_res_cb: ERR 0003:
> Error on query (IB_TIMEOUT)
>
> Attribute ID 0x35 is PathRecord.
> I don't know why didn't the embedded SM answer the PathRecord query > from the osmtest. I've never try to run osmtest against non-opensm > subnet managers, and I don't know if someone did it before. Sorry. > > Perhaps someone could comment on that... osmtest uses a non compliant query to get all the paths and likely only OpenSM supports this extension. -- Hal > > -- Yevgeny > >> >> Wen Hao Wang >> Email: wangwhao at cn.ibm.com >> > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit > http://openib.org/mailman/listinfo/openib-general > From dotanba at gmail.com Thu Aug 7 12:27:08 2008 From: dotanba at gmail.com (Dotan Barak) Date: Thu, 07 Aug 2008 21:27:08 +0200 Subject: [ofa-general] RDMA n00b: Remote Memory Access and Connection Setup Help In-Reply-To: <9870a2060808071009y7e7ed1cdxf2234509ae7ee90a@mail.gmail.com> References: <9870a2060808071009y7e7ed1cdxf2234509ae7ee90a@mail.gmail.com> Message-ID: <489B4C8C.40606@gmail.com> Adrien Guillon wrote: > Hello List, > > I've been reading the Infiniband Architecture specification, and I'm > ready to write my first RDMA-enabled C++ library. I have looked at > some sample code provided by librdmacm, and I'm having a problem going > from the big picture of what I want to do to the finer details. > > I want to remotely access memory on various compute nodes, and the > Infiniband spec does everything I need. However upon looking at > librdmacm, and the Infiniband verbs I don't see how I actually write > to remote memory locations. Also I don't completely understand why I > am responsible for creating QPs in userspace, if everything is > supposed to be handled by hardware... although I suppose I still have > to allocate memory for the hardware to actually use. 
If one wishes to write to remote address: 1) the remote QP access right should support incoming RDMA Write 2) the remote side MR should support Remote Write 3) locally, there is a need to post Send Request with RDMA Write operation (and then poll the completion, if such was created) > > I'm looking for some help from this list with a high-level overview of > what I have to do to establish a connection between two nodes, and how > I would write an object to remote memory, or read an object from > remote memory. My current understanding is that I would use librdmacm > to create a connection between the nodes that I want to communicate, > and then use the inifinband verbs themselves to read/write memory, and > to establish a memory window. You are absolutely right. Please notice that RDMA Write doesn't consume any Receive Request in the Receiver side. I believe that basin the ucma examples with minor changes (that i wrote above) can be written easily. > > I'm developing on Linux. Which is a great OS :) > > Thanks a lot! 
You are welcome Dotan > > AJ > ------------------------------------------------------------------------ > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From dotanba at gmail.com Thu Aug 7 12:31:34 2008 From: dotanba at gmail.com (Dotan Barak) Date: Thu, 07 Aug 2008 21:31:34 +0200 Subject: [ofa-general] limit on memory registration In-Reply-To: <6978b4af0808070547m7a90abbcsbbd24cbc42e202d8@mail.gmail.com> References: <6978b4af0808060847r7bd16de5p22f186401c3a56e2@mail.gmail.com> <4899CD88.4030805@gmail.com> <6978b4af0808060927ga973867ge65b896d419fcd53@mail.gmail.com> <4899D49C.2040204@gmail.com> <6978b4af0808061031w116cf699oea91ba299a695866@mail.gmail.com> <6978b4af0808070547m7a90abbcsbbd24cbc42e202d8@mail.gmail.com> Message-ID: <489B4D96.8090607@gmail.com> Rui Machado wrote: >>> I have a feeling that you refer to the value of max_mr (am i right?) >>> > > >> :) yep sorry. >> The value for max_mr_size is 18446744073709551615 (can this one be ? ) >> Again, how do I decode this? >> > > > >>>> Mellanox >>>> ca type:25218 (vendor_part_id) >>>> fw_version : 5.1.400 (fw_ver) >>>> hw_version : a0 (hw_ver) >>>> >>>> >>> The module parameter "num_mtt" control the size of the above described >>> table. >>> >> Ok. And is there a limit? >> And out of curiosity, how does this calculation gets done? I mean, can >> I take the values and say: Ok, with this num_mtt we can go up to X? >> I guess there is an upper limit (device specific) but since i don't have an available data sheet for this device i can't tell you what it is. > > Tried with the module parameter but not sucessfully. I get errors like > > ib_mthca 0000:23:00.0: Failed to initialize memory region table, aborting. > or > ib_mthca: Invalid value 1048580 for num_mtt in module parameter. 
> ib_mthca: Corrected num_mtt to 2097152. > > I guess I'm just shooting in the dark :) > I remember that i saw that in the past (i think that the problem was kmalloc of more than 128K..) Can you try to decrease the number of the resources that you don't need (MRs/CQs/QPs/SRQs) using the module parameters, i think that this MAY help you. > And is there a relation to the max_mr_size above? > No, this limit is the maximum block size that the device can handle. > Thanks for the patience > ;) > You are welcome :) Dotan From aj.guillon at gmail.com Thu Aug 7 11:44:41 2008 From: aj.guillon at gmail.com (Adrien Guillon) Date: Thu, 7 Aug 2008 14:44:41 -0400 Subject: ***SPAM*** Re: [ofa-general] RDMA n00b: Remote Memory Access and Connection Setup Help In-Reply-To: <489B4C8C.40606@gmail.com> References: <9870a2060808071009y7e7ed1cdxf2234509ae7ee90a@mail.gmail.com> <489B4C8C.40606@gmail.com> Message-ID: <9870a2060808071144h7f27e073w1960f0bb5ecb1b00@mail.gmail.com> Could you give me the verb that is used to write to remote memory? I just don't seem to be able to find it, but each time I skim through the spec I learn a bit more :-) "I believe that basin the ucma examples with minor changes (that i wrote above) can be written easily." Could you clarify what you meant? Thanks for the response! AJ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ruimario at gmail.com Thu Aug 7 11:51:56 2008 From: ruimario at gmail.com (Rui Machado) Date: Thu, 7 Aug 2008 20:51:56 +0200 Subject: [ofa-general] limit on memory registration In-Reply-To: <489B4D96.8090607@gmail.com> References: <6978b4af0808060847r7bd16de5p22f186401c3a56e2@mail.gmail.com> <4899CD88.4030805@gmail.com> <6978b4af0808060927ga973867ge65b896d419fcd53@mail.gmail.com> <4899D49C.2040204@gmail.com> <6978b4af0808061031w116cf699oea91ba299a695866@mail.gmail.com> <6978b4af0808070547m7a90abbcsbbd24cbc42e202d8@mail.gmail.com> <489B4D96.8090607@gmail.com> Message-ID: <6978b4af0808071151v60f96226n97b3bcec5b028739@mail.gmail.com> Withouth wanting to abuse on your generosity :) : >>> >>> :) yep sorry. >>> The value for max_mr_size is 18446744073709551615 (can this one be ? ) >>> Again, how do I decode this? >> What about the huge number above? It is much bigger than 16GB (my current limit). Does this mean my only take should be on the num_mtt parameter and try to do as you suggested? (reducing on the rest) >> >>> >>> Ok. And is there a limit? >>> And out of curiosity, how does this calculation gets done? I mean, can >>> I take the values and say: Ok, with this num_mtt we can go up to X? >>> > > I guess there is an upper limit (device specific) but since i don't have an > available data sheet for this device > i can't tell you what it is. >> I guess you can't find that online or? >> Tried with the module parameter but not sucessfully. I get errors like >> >> ib_mthca 0000:23:00.0: Failed to initialize memory region table, aborting. >> or >> ib_mthca: Invalid value 1048580 for num_mtt in module parameter. >> ib_mthca: Corrected num_mtt to 2097152. >> >> I guess I'm just shooting in the dark :) >> > > I remember that i saw that in the past (i think that the problem was kmalloc > of more than 128K..) > Can you try to decrease the number of the resources that you don't need > (MRs/CQs/QPs/SRQs) using the > module parameters, i think that this MAY help you. 
I will try! Thanks for the generosity :D From hal.rosenstock at gmail.com Thu Aug 7 11:54:43 2008 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Thu, 7 Aug 2008 14:54:43 -0400 Subject: [ofa-general] OpenSM QoS Query In-Reply-To: References: Message-ID: On Thu, Aug 7, 2008 at 12:31 PM, Amar Mudrankit wrote: > Considering the following QoS configuration for OpenSM running over OFED-1.3.1 > > > /var/cache/opensm/opensm.opts > > # QoS default options > qos_max_vls 8 > qos_high_limit 8 > qos_vlarb_high 1:1,2:1 > qos_vlarb_low 1:1 > qos_sl2vl 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15 > > > /etc/opensm/qos-policy.conf > > qos-ulps > > default : 0 > ipoib : 1 > srp, target-port-guid : 2 > > end-qos-ulps > > IPoIB running in connected mode with MTU of 65520 bytes. > > In this scenario, do we expect SRP to starve and will not get an > opportunity to send data? > > This is because, if the packet size of IPoIB data is 64K, the Limit of > High Priority(8 * 4KB) will be eaten up by IPoIB itself so that next > active table would be low priority VL arbitration table. In the low > priority VL arbitration table also, we have configured IPoIB, which > will send IPoIB data. Eventually when high priority table will become > active again, 64K ipoib data will be sent. This will not ever schedule > service level 2 (of that of SRP). Is this understanding correct? > > or > > When the second time High priority VL arbitration table becomes > active, the current pointer will be pointing to service level 2 in ^^^^^^^^^^^^^ VL > high priority table and SRP data can flow? I am not sure if this > second possibility is valid. The arbitration is supposed to be "fair weighted". See 7.6.9.2.4 ARBITRATION RULES WITHIN THE HIGH AND LOW COMPONENTS IBA 1.2.1 vol 1 p, 194-5 for more details. 
It states: "Within each High or Low Priority table, weighted fair arbitration is used, with the order of entries in each table specifying the order of VL scheduling, and the weighting value specifying the amount of bandwidth allocated to that entry. Each entry in the table is processed in order. A separate pointer and available weight count is maintained for each of the two tables. The pointers identify the current entry in the table, while the available weight count indicates the amount of weight that the current entry has available for data packet transmission" and: "Further, implementations are not required to implement the pointers, available weight counter and HighPriCounter. They must, however, behave in a manner equivalent to that described in this section." So FWIW I think the right answer is 2. BTW, this is an arbiter question. The SM merely sets up the relevant tables. -- Hal > > Thanks and Regards, > Amar > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > From rdreier at cisco.com Thu Aug 7 12:12:51 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 07 Aug 2008 12:12:51 -0700 Subject: ***SPAM*** Re: [ofa-general] RDMA n00b: Remote Memory Access and Connection Setup Help In-Reply-To: <9870a2060808071144h7f27e073w1960f0bb5ecb1b00@mail.gmail.com> (Adrien Guillon's message of "Thu, 7 Aug 2008 14:44:41 -0400") References: <9870a2060808071009y7e7ed1cdxf2234509ae7ee90a@mail.gmail.com> <489B4C8C.40606@gmail.com> <9870a2060808071144h7f27e073w1960f0bb5ecb1b00@mail.gmail.com> Message-ID: > Could you give me the verb that is used to write to remote memory? I just > don't seem to be able to find it, but each time I skim through the spec I > learn a bit more :-) To write to remote memory, you post a work request to a send queue with opcode "RDMA write." 
This requires supplying the remote destination memory key
and virtual address in the work request structure, and the local data
to be copied in the gather/scatter list.

 - R.

From dotanba at gmail.com Thu Aug 7 13:15:01 2008
From: dotanba at gmail.com (Dotan Barak)
Date: Thu, 07 Aug 2008 22:15:01 +0200
Subject: [ofa-general] RDMA n00b: Remote Memory Access and Connection Setup Help
In-Reply-To: <9870a2060808071144h7f27e073w1960f0bb5ecb1b00@mail.gmail.com>
References: <9870a2060808071009y7e7ed1cdxf2234509ae7ee90a@mail.gmail.com> <489B4C8C.40606@gmail.com> <9870a2060808071144h7f27e073w1960f0bb5ecb1b00@mail.gmail.com>
Message-ID: <489B57C5.50100@gmail.com>

Adrien Guillon wrote:
> Could you give me the verb that is used to write to remote memory? I
> just don't seem to be able to find it, but each time I skim through
> the spec I learn a bit more :-)
Yes, the verb is : ibv_post_send, you should use the RDMA Write opcode.
(it add a job to the work queue)
>
> "I believe that basin the ucma examples with minor changes (that i
> wrote above) can be written easily."
In the rdcmacm + libibverbs + performance which come as part of OFED
there are code examples that connect the QPs using the rdmacm and post
RDMA Write, maybe there isn't a single example that does all of it, but
combination of two must do it....

What i meant was that taking an example from the rdmacm and changing it
to post RDMA Write instead of SENDs should not be that difficult ...
>
> Could you clarify what you meant?
>
> Thanks for the response!
You are welcome.
Dotan > > AJ From dotanba at gmail.com Thu Aug 7 13:17:05 2008 From: dotanba at gmail.com (Dotan Barak) Date: Thu, 07 Aug 2008 22:17:05 +0200 Subject: [ofa-general] limit on memory registration In-Reply-To: <6978b4af0808071151v60f96226n97b3bcec5b028739@mail.gmail.com> References: <6978b4af0808060847r7bd16de5p22f186401c3a56e2@mail.gmail.com> <4899CD88.4030805@gmail.com> <6978b4af0808060927ga973867ge65b896d419fcd53@mail.gmail.com> <4899D49C.2040204@gmail.com> <6978b4af0808061031w116cf699oea91ba299a695866@mail.gmail.com> <6978b4af0808070547m7a90abbcsbbd24cbc42e202d8@mail.gmail.com> <489B4D96.8090607@gmail.com> <6978b4af0808071151v60f96226n97b3bcec5b028739@mail.gmail.com> Message-ID: <489B5841.9000601@gmail.com> Rui Machado wrote: > Withouth wanting to abuse on your generosity :) : > > >>>> :) yep sorry. >>>> The value for max_mr_size is 18446744073709551615 (can this one be ? ) >>>> Again, how do I decode this? >>>> > What about the huge number above? It is much bigger than 16GB (my > current limit). > Does this mean my only take should be on the num_mtt parameter and try > to do as you suggested? (reducing on the rest) > This number is the size of the maximum continigues block which can be registered, it is a number of bytes (you can decode it in hex, and it will look like 0xffffffffffffffff) > >>>> Ok. And is there a limit? >>>> And out of curiosity, how does this calculation gets done? I mean, can >>>> I take the values and say: Ok, with this num_mtt we can go up to X? >>>> >>>> >> I guess there is an upper limit (device specific) but since i don't have an >> available data sheet for this device >> i can't tell you what it is. >> > > I guess you can't find that online or? > not that i know of (sorry) > >>> Tried with the module parameter but not sucessfully. I get errors like >>> >>> ib_mthca 0000:23:00.0: Failed to initialize memory region table, aborting. >>> or >>> ib_mthca: Invalid value 1048580 for num_mtt in module parameter. 
>>> ib_mthca: Corrected num_mtt to 2097152. >>> >>> I guess I'm just shooting in the dark :) >>> >>> >> I remember that i saw that in the past (i think that the problem was kmalloc >> of more than 128K..) >> Can you try to decrease the number of the resources that you don't need >> (MRs/CQs/QPs/SRQs) using the >> module parameters, i think that this MAY help you. >> > > I will try! > > Thanks for the generosity :D > Don't mention it... Dotan From jeff at splitrockpr.com Thu Aug 7 12:40:01 2008 From: jeff at splitrockpr.com (Jeffrey Scott) Date: Thu, 07 Aug 2008 12:40:01 -0700 Subject: [ofa-general] IBTA Technical Forum '08 agenda now available Message-ID: Hello OFA Members. We are rapidly approaching this year's IBTA Technical Forum; it's just six weeks away! The theme this year is "InfiniBand and the Enterprise Data Center" and the IBTA has put together a compelling agenda with end-user presentations from General Motors, France Telecom and others, as well as an analyst presentation from Gartner and an interactive panel discussion on the future of InfiniBand. Please see below for more information; register now to receive the early bird discount. Date: Monday, September 15, 2008 Time: 8am - 5pm with networking reception immediately following Location: Harrah's Las Vegas Register: www.regonline.com/IBTATechForum08 Rate: Early bird rate is $249; after September 1 the rate increases to $299 Agenda: http://www.infinibandta.org/events/IBTATechForum08_ The IBTA needs your help spreading the word! The OFA is one of the sponsors for the networking reception taking place immediately following the technical forum. We would like to see the OFA well represented. The IBTA's Marketing Working Group has created a formal invitation (please see the attached) for you to forward to colleagues/vendors/partners/customers. Please assist the IBTA in spreading the word to the entire InfiniBand community, and plan on joining us in Las Vegas! 
If you have any questions, please contact Samantha Spears at 206-322-1167 x115 or samanthas at owenmedia.com. ----------------------------------- Jeffrey Scott Split Rock Communications 408-884-4017 408-348-3651 Mobile 408-884-3900 Fax www.SplitRockPR.com -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: IBTA TechForum 08 Invite.pdf Type: application/pdf Size: 2029867 bytes Desc: not available URL: From pmorreale at novell.com Thu Aug 7 12:44:13 2008 From: pmorreale at novell.com (Peter W. Morreale) Date: Thu, 07 Aug 2008 19:44:13 +0000 Subject: [ofa-general] Converting the Linux.au client/server example to a single post_send... Message-ID: <1218138253.8764.34.camel@hermosa.site> Roland, I am attempting to convert your client/server RDMA example code to use a single ibv_post_send() instead of the two calls the client code currently uses. I'm using the example code from your blog: http://digitalvampire.org/blog/ What do I have to change? Right now the client is hanging waiting for wr_id of 0 to complete, and the server is hanging waiting in ibv_get_cq_event(). I have verified that wr_id == 1 has completed. I seem to have a mismatch between events. I don't understand why the server apparently is waiting for another completion. Separate, but related, I do not understand why the server code does a ibv_post_recv() with the sge.addr set to (buf + sizeof(uint32_t). This is apparently setting the address to the second word of the buffer. Shouldn't this be the first word? Perhaps I misunderstand, but it appears that you are telling the server to start the receive into the second word. (which implies the data transfer would be out of bounds of buf, clearly not the case as the original example works.. ???) 
Thanks much, -PWM
From aj.guillon at gmail.com Thu Aug 7 12:57:35 2008 From: aj.guillon at gmail.com (AJ Guillon) Date: Thu, 7 Aug 2008 15:57:35 -0400 Subject: [ofa-general] RDMA n00b: Remote Memory Access and Connection Setup Help In-Reply-To: <489B57C5.50100@gmail.com> References: <9870a2060808071009y7e7ed1cdxf2234509ae7ee90a@mail.gmail.com> <489B4C8C.40606@gmail.com> <9870a2060808071144h7f27e073w1960f0bb5ecb1b00@mail.gmail.com> <489B57C5.50100@gmail.com> Message-ID: <4D83AFFB-0CDF-464D-A4D5-9F91396671FC@gmail.com> > > Yes, the verb is : ibv_post_send, you should use the RDMA Write > opcode. > (it add a job to the work queue Thanks. When I use RDMA read/write on a remote system, does that remote system have to do anything (like poll for events) or are the RDMA operations truly transparent to the remote host? Thanks a lot AJ >> From hoot at ptpnow.com Thu Aug 7 12:57:11 2008 From: hoot at ptpnow.com (Hoot Thompson) Date: Thu, 7 Aug 2008 15:57:11 -0400 Subject: [ofa-general] ib_ipoib: Unknown symbol icmpv6_send Message-ID: <002301c8f8c7$c86790f0$640fa8c0@ptpdesk> I've seen other postings noting a similar error but have not seen a resolution. When trying to load the ib_ipoib module I get the following error..... ib_ipoib: Unknown symbol icmpv6_send. How do I clear this error? It's a SuSE 10 SP1 system. 
Thanks From dotanba at gmail.com Thu Aug 7 14:04:20 2008 From: dotanba at gmail.com (Dotan Barak) Date: Thu, 07 Aug 2008 23:04:20 +0200 Subject: [ofa-general] RDMA n00b: Remote Memory Access and Connection Setup Help In-Reply-To: <4D83AFFB-0CDF-464D-A4D5-9F91396671FC@gmail.com> References: <9870a2060808071009y7e7ed1cdxf2234509ae7ee90a@mail.gmail.com> <489B4C8C.40606@gmail.com> <9870a2060808071144h7f27e073w1960f0bb5ecb1b00@mail.gmail.com> <489B57C5.50100@gmail.com> <4D83AFFB-0CDF-464D-A4D5-9F91396671FC@gmail.com> Message-ID: <489B6354.5020009@gmail.com> AJ Guillon wrote: >> >> Yes, the verb is : ibv_post_send, you should use the RDMA Write opcode. >> (it add a job to the work queue > > Thanks. When I use RDMA read/write on a remote system, does that > remote system have to do anything (like poll for events) or are the > RDMA operations truly transparent to the remote host? No. Before that post, the receiver should have created a QP that supports this operation and an MR that supports this operation, and it should keep those resources alive until the sender has finished all of the reads/writes (this is usually done by syncing between the two sides before starting the resources with the SEND opcode, or by the rdmacm connection tear down functions). Dotan From aj.guillon at gmail.com Thu Aug 7 13:14:48 2008 From: aj.guillon at gmail.com (Adrien Guillon) Date: Thu, 7 Aug 2008 16:14:48 -0400 Subject: ***SPAM*** Re: [ofa-general] RDMA n00b: Remote Memory Access and Connection Setup Help In-Reply-To: <489B6354.5020009@gmail.com> References: <9870a2060808071009y7e7ed1cdxf2234509ae7ee90a@mail.gmail.com> <489B4C8C.40606@gmail.com> <9870a2060808071144h7f27e073w1960f0bb5ecb1b00@mail.gmail.com> <489B57C5.50100@gmail.com> <4D83AFFB-0CDF-464D-A4D5-9F91396671FC@gmail.com> <489B6354.5020009@gmail.com> Message-ID: <9870a2060808071314n75153307k6421866b09aecd9d@mail.gmail.com> Okay thanks for all your help, I have enough to start my programming! 
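Dotan's reply in this thread says the passive side still has to set up a QP and an MR and synchronize with the initiator. One common way to do that synchronization is for the passive side to advertise the address, key and length of its registered region in a small message before any RDMA read or write is issued. The following is a minimal sketch of such a message in plain C, not something taken from the thread: the struct `remote_buf_info`, its 16-byte big-endian wire layout and every function name are hypothetical, and no libibverbs calls appear. In a real program the key would come from the registered memory region.

```c
#include <stdint.h>

/* Hypothetical descriptor the passive side advertises to the initiator. */
struct remote_buf_info {
    uint64_t addr;    /* virtual address of the registered region */
    uint32_t rkey;    /* remote key of the registered region */
    uint32_t length;  /* size of the region in bytes */
};

/* Assumed wire size: 8 bytes of address, 4 of key, 4 of length. */
#define REMOTE_BUF_INFO_WIRE_SIZE 16

/* Store a 32-bit value in big-endian (network) byte order. */
static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t) (v >> 24);
    p[1] = (uint8_t) (v >> 16);
    p[2] = (uint8_t) (v >> 8);
    p[3] = (uint8_t) v;
}

/* Load a 32-bit value stored in big-endian byte order. */
static uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
           ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

/* Serialize the descriptor into REMOTE_BUF_INFO_WIRE_SIZE bytes. */
static void pack_remote_buf_info(const struct remote_buf_info *info,
                                 uint8_t *out)
{
    put_be32(out, (uint32_t) (info->addr >> 32));
    put_be32(out + 4, (uint32_t) info->addr);
    put_be32(out + 8, info->rkey);
    put_be32(out + 12, info->length);
}

/* Parse a descriptor previously written by pack_remote_buf_info(). */
static void unpack_remote_buf_info(const uint8_t *in,
                                   struct remote_buf_info *info)
{
    info->addr = ((uint64_t) get_be32(in) << 32) | get_be32(in + 4);
    info->rkey = get_be32(in + 8);
    info->length = get_be32(in + 12);
}

/* Returns 1 when a descriptor survives a pack/unpack round trip. */
static int remote_buf_info_roundtrip_ok(uint64_t addr, uint32_t rkey,
                                        uint32_t length)
{
    struct remote_buf_info in = { addr, rkey, length };
    struct remote_buf_info out;
    uint8_t wire[REMOTE_BUF_INFO_WIRE_SIZE];

    pack_remote_buf_info(&in, wire);
    unpack_remote_buf_info(wire, &out);
    return out.addr == addr && out.rkey == rkey && out.length == length;
}
```

Fixing the byte order on the wire keeps the exchange independent of the hosts' endianness. The passive side must still keep the region registered until the initiator signals that it is done, as the reply notes.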
AJ
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From rdreier at cisco.com Thu Aug 7 14:15:26 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Thu, 07 Aug 2008 14:15:26 -0700
Subject: [ofa-general] [GIT PULL] please pull infiniband.git
Message-ID:

Linus, please pull from

    master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git for-linus

This tree is also available from kernel.org mirrors at:

    git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git for-linus

This will get the following fixes:

Alex Naslednikov (1):
      IB/mlx4: Allow 4K messages for UD QPs

Alexander Beregalov (1):
      IB/ipath: Fix printk format warnings

Julien Brunel (1):
      IB/mad: Test ib_create_send_mad() return with IS_ERR(), not == NULL

Roland Dreier (3):
      IPoIB/cm: Set correct SG list in ipoib_cm_init_rx_wr()
      RDMA/cma: Remove padding arrays by using struct sockaddr_storage
      Merge branches 'cma', 'cxgb3', 'ipath', 'ipoib', 'mad' and 'mlx4' into for-linus

Steve Wise (3):
      RDMA/cxgb3: Fix QP capabilities
      RDMA/cxgb3: Fix up MW access rights
      RDMA/cxgb3: Fix deadlock initializing iw_cxgb3 device

Vegard Nossum (1):
      IB/ipath: Use unsigned long for irq flags

Yevgeny Petrilin (1):
      mlx4_core: Add ethernet fields to CQE struct

 drivers/infiniband/core/cma.c | 37 +++++++++++++--------------
 drivers/infiniband/core/mad_rmpp.c | 2 +-
 drivers/infiniband/core/ucma.c | 14 ++++------
 drivers/infiniband/hw/cxgb3/cxio_hal.c | 6 ++--
 drivers/infiniband/hw/cxgb3/iwch_provider.c | 28 ++------------------
 drivers/infiniband/hw/cxgb3/iwch_provider.h | 7 +++++
 drivers/infiniband/hw/cxgb3/iwch_qp.c | 25 ++++++------------
 drivers/infiniband/hw/ipath/ipath_driver.c | 5 ++-
 drivers/infiniband/hw/ipath/ipath_iba7220.c | 7 +++--
 drivers/infiniband/hw/ipath/ipath_intr.c | 12 ++++++---
 drivers/infiniband/hw/ipath/ipath_verbs.c | 6 ++--
 drivers/infiniband/hw/mlx4/cq.c | 33 +++++++++++------------
 drivers/infiniband/hw/mlx4/qp.c | 2 +-
 drivers/infiniband/ulp/ipoib/ipoib_cm.c | 2 +-
include/linux/mlx4/cq.h | 36 +++++++++++++++++-------- include/rdma/rdma_cm.h | 8 +---- 16 files changed, 108 insertions(+), 122 deletions(-) diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index e980ff3..d951896 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -155,9 +155,7 @@ struct cma_multicast { } multicast; struct list_head list; void *context; - struct sockaddr addr; - u8 pad[sizeof(struct sockaddr_in6) - - sizeof(struct sockaddr)]; + struct sockaddr_storage addr; }; struct cma_work { @@ -786,8 +784,8 @@ static void cma_cancel_operation(struct rdma_id_private *id_priv, cma_cancel_route(id_priv); break; case CMA_LISTEN: - if (cma_any_addr(&id_priv->id.route.addr.src_addr) && - !id_priv->cma_dev) + if (cma_any_addr((struct sockaddr *) &id_priv->id.route.addr.src_addr) + && !id_priv->cma_dev) cma_cancel_listens(id_priv); break; default: @@ -1026,7 +1024,7 @@ static struct rdma_id_private *cma_new_conn_id(struct rdma_cm_id *listen_id, rt->path_rec[1] = *ib_event->param.req_rcvd.alternate_path; ib_addr_set_dgid(&rt->addr.dev_addr, &rt->path_rec[0].dgid); - ret = rdma_translate_ip(&id->route.addr.src_addr, + ret = rdma_translate_ip((struct sockaddr *) &id->route.addr.src_addr, &id->route.addr.dev_addr); if (ret) goto destroy_id; @@ -1064,7 +1062,7 @@ static struct rdma_id_private *cma_new_udp_id(struct rdma_cm_id *listen_id, cma_save_net_info(&id->route.addr, &listen_id->route.addr, ip_ver, port, src, dst); - ret = rdma_translate_ip(&id->route.addr.src_addr, + ret = rdma_translate_ip((struct sockaddr *) &id->route.addr.src_addr, &id->route.addr.dev_addr); if (ret) goto err; @@ -1377,7 +1375,7 @@ static int cma_ib_listen(struct rdma_id_private *id_priv) if (IS_ERR(id_priv->cm_id.ib)) return PTR_ERR(id_priv->cm_id.ib); - addr = &id_priv->id.route.addr.src_addr; + addr = (struct sockaddr *) &id_priv->id.route.addr.src_addr; svc_id = cma_get_service_id(id_priv->id.ps, addr); if (cma_any_addr(addr)) ret = 
ib_cm_listen(id_priv->cm_id.ib, svc_id, 0, NULL); @@ -1443,7 +1441,7 @@ static void cma_listen_on_dev(struct rdma_id_private *id_priv, dev_id_priv->state = CMA_ADDR_BOUND; memcpy(&id->route.addr.src_addr, &id_priv->id.route.addr.src_addr, - ip_addr_size(&id_priv->id.route.addr.src_addr)); + ip_addr_size((struct sockaddr *) &id_priv->id.route.addr.src_addr)); cma_attach_to_dev(dev_id_priv, cma_dev); list_add_tail(&dev_id_priv->listen_list, &id_priv->listen_list); @@ -1563,13 +1561,14 @@ static int cma_query_ib_route(struct rdma_id_private *id_priv, int timeout_ms, path_rec.pkey = cpu_to_be16(ib_addr_get_pkey(&addr->dev_addr)); path_rec.numb_path = 1; path_rec.reversible = 1; - path_rec.service_id = cma_get_service_id(id_priv->id.ps, &addr->dst_addr); + path_rec.service_id = cma_get_service_id(id_priv->id.ps, + (struct sockaddr *) &addr->dst_addr); comp_mask = IB_SA_PATH_REC_DGID | IB_SA_PATH_REC_SGID | IB_SA_PATH_REC_PKEY | IB_SA_PATH_REC_NUMB_PATH | IB_SA_PATH_REC_REVERSIBLE | IB_SA_PATH_REC_SERVICE_ID; - if (addr->src_addr.sa_family == AF_INET) { + if (addr->src_addr.ss_family == AF_INET) { path_rec.qos_class = cpu_to_be16((u16) id_priv->tos); comp_mask |= IB_SA_PATH_REC_QOS_CLASS; } else { @@ -1848,7 +1847,7 @@ static int cma_resolve_loopback(struct rdma_id_private *id_priv) ib_addr_get_sgid(&id_priv->id.route.addr.dev_addr, &gid); ib_addr_set_dgid(&id_priv->id.route.addr.dev_addr, &gid); - if (cma_zero_addr(&id_priv->id.route.addr.src_addr)) { + if (cma_zero_addr((struct sockaddr *) &id_priv->id.route.addr.src_addr)) { src_in = (struct sockaddr_in *)&id_priv->id.route.addr.src_addr; dst_in = (struct sockaddr_in *)&id_priv->id.route.addr.dst_addr; src_in->sin_family = dst_in->sin_family; @@ -1897,7 +1896,7 @@ int rdma_resolve_addr(struct rdma_cm_id *id, struct sockaddr *src_addr, if (cma_any_addr(dst_addr)) ret = cma_resolve_loopback(id_priv); else - ret = rdma_resolve_ip(&addr_client, &id->route.addr.src_addr, + ret = rdma_resolve_ip(&addr_client, (struct 
sockaddr *) &id->route.addr.src_addr, dst_addr, &id->route.addr.dev_addr, timeout_ms, addr_handler, id_priv); if (ret) @@ -2021,11 +2020,11 @@ static int cma_use_port(struct idr *ps, struct rdma_id_private *id_priv) * We don't support binding to any address if anyone is bound to * a specific address on the same port. */ - if (cma_any_addr(&id_priv->id.route.addr.src_addr)) + if (cma_any_addr((struct sockaddr *) &id_priv->id.route.addr.src_addr)) return -EADDRNOTAVAIL; hlist_for_each_entry(cur_id, node, &bind_list->owners, node) { - if (cma_any_addr(&cur_id->id.route.addr.src_addr)) + if (cma_any_addr((struct sockaddr *) &cur_id->id.route.addr.src_addr)) return -EADDRNOTAVAIL; cur_sin = (struct sockaddr_in *) &cur_id->id.route.addr.src_addr; @@ -2060,7 +2059,7 @@ static int cma_get_port(struct rdma_id_private *id_priv) } mutex_lock(&lock); - if (cma_any_port(&id_priv->id.route.addr.src_addr)) + if (cma_any_port((struct sockaddr *) &id_priv->id.route.addr.src_addr)) ret = cma_alloc_any_port(ps, id_priv); else ret = cma_use_port(ps, id_priv); @@ -2232,7 +2231,7 @@ static int cma_resolve_ib_udp(struct rdma_id_private *id_priv, req.path = route->path_rec; req.service_id = cma_get_service_id(id_priv->id.ps, - &route->addr.dst_addr); + (struct sockaddr *) &route->addr.dst_addr); req.timeout_ms = 1 << (CMA_CM_RESPONSE_TIMEOUT - 8); req.max_cm_retries = CMA_MAX_CM_RETRIES; @@ -2283,7 +2282,7 @@ static int cma_connect_ib(struct rdma_id_private *id_priv, req.alternate_path = &route->path_rec[1]; req.service_id = cma_get_service_id(id_priv->id.ps, - &route->addr.dst_addr); + (struct sockaddr *) &route->addr.dst_addr); req.qp_num = id_priv->qp_num; req.qp_type = IB_QPT_RC; req.starting_psn = id_priv->seq_num; @@ -2667,7 +2666,7 @@ static int cma_join_ib_multicast(struct rdma_id_private *id_priv, if (ret) return ret; - cma_set_mgid(id_priv, &mc->addr, &rec.mgid); + cma_set_mgid(id_priv, (struct sockaddr *) &mc->addr, &rec.mgid); if (id_priv->id.ps == RDMA_PS_UDP) rec.qkey = 
cpu_to_be32(RDMA_UDP_QKEY); ib_addr_get_sgid(dev_addr, &rec.port_gid); diff --git a/drivers/infiniband/core/mad_rmpp.c b/drivers/infiniband/core/mad_rmpp.c index d0ef7d6..3af2b84 100644 --- a/drivers/infiniband/core/mad_rmpp.c +++ b/drivers/infiniband/core/mad_rmpp.c @@ -133,7 +133,7 @@ static void ack_recv(struct mad_rmpp_recv *rmpp_recv, msg = ib_create_send_mad(&rmpp_recv->agent->agent, recv_wc->wc->src_qp, recv_wc->wc->pkey_index, 1, hdr_len, 0, GFP_KERNEL); - if (!msg) + if (IS_ERR(msg)) return; format_ack(msg, (struct ib_rmpp_mad *) recv_wc->recv_buf.mad, rmpp_recv); diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c index b41dd26..3ddacf3 100644 --- a/drivers/infiniband/core/ucma.c +++ b/drivers/infiniband/core/ucma.c @@ -81,9 +81,7 @@ struct ucma_multicast { u64 uid; struct list_head list; - struct sockaddr addr; - u8 pad[sizeof(struct sockaddr_in6) - - sizeof(struct sockaddr)]; + struct sockaddr_storage addr; }; struct ucma_event { @@ -603,11 +601,11 @@ static ssize_t ucma_query_route(struct ucma_file *file, return PTR_ERR(ctx); memset(&resp, 0, sizeof resp); - addr = &ctx->cm_id->route.addr.src_addr; + addr = (struct sockaddr *) &ctx->cm_id->route.addr.src_addr; memcpy(&resp.src_addr, addr, addr->sa_family == AF_INET ? sizeof(struct sockaddr_in) : sizeof(struct sockaddr_in6)); - addr = &ctx->cm_id->route.addr.dst_addr; + addr = (struct sockaddr *) &ctx->cm_id->route.addr.dst_addr; memcpy(&resp.dst_addr, addr, addr->sa_family == AF_INET ? 
sizeof(struct sockaddr_in) : sizeof(struct sockaddr_in6)); @@ -913,7 +911,7 @@ static ssize_t ucma_join_multicast(struct ucma_file *file, mc->uid = cmd.uid; memcpy(&mc->addr, &cmd.addr, sizeof cmd.addr); - ret = rdma_join_multicast(ctx->cm_id, &mc->addr, mc); + ret = rdma_join_multicast(ctx->cm_id, (struct sockaddr *) &mc->addr, mc); if (ret) goto err2; @@ -929,7 +927,7 @@ static ssize_t ucma_join_multicast(struct ucma_file *file, return 0; err3: - rdma_leave_multicast(ctx->cm_id, &mc->addr); + rdma_leave_multicast(ctx->cm_id, (struct sockaddr *) &mc->addr); ucma_cleanup_mc_events(mc); err2: mutex_lock(&mut); @@ -975,7 +973,7 @@ static ssize_t ucma_leave_multicast(struct ucma_file *file, goto out; } - rdma_leave_multicast(mc->ctx->cm_id, &mc->addr); + rdma_leave_multicast(mc->ctx->cm_id, (struct sockaddr *) &mc->addr); mutex_lock(&mc->ctx->file->mut); ucma_cleanup_mc_events(mc); list_del(&mc->list); diff --git a/drivers/infiniband/hw/cxgb3/cxio_hal.c b/drivers/infiniband/hw/cxgb3/cxio_hal.c index f6d5747..4dcf08b 100644 --- a/drivers/infiniband/hw/cxgb3/cxio_hal.c +++ b/drivers/infiniband/hw/cxgb3/cxio_hal.c @@ -725,9 +725,9 @@ static int __cxio_tpt_op(struct cxio_rdev *rdev_p, u32 reset_tpt_entry, V_TPT_STAG_TYPE(type) | V_TPT_PDID(pdid)); BUG_ON(page_size >= 28); tpt.flags_pagesize_qpid = cpu_to_be32(V_TPT_PERM(perm) | - F_TPT_MW_BIND_ENABLE | - V_TPT_ADDR_TYPE((zbva ? TPT_ZBTO : TPT_VATO)) | - V_TPT_PAGE_SIZE(page_size)); + ((perm & TPT_MW_BIND) ? F_TPT_MW_BIND_ENABLE : 0) | + V_TPT_ADDR_TYPE((zbva ? TPT_ZBTO : TPT_VATO)) | + V_TPT_PAGE_SIZE(page_size)); tpt.rsvd_pbl_addr = reset_tpt_entry ? 
0 : cpu_to_be32(V_TPT_PBL_ADDR(PBL_OFF(rdev_p, pbl_addr)>>3)); tpt.len = cpu_to_be32(len); diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c b/drivers/infiniband/hw/cxgb3/iwch_provider.c index b89640a..eb778bf 100644 --- a/drivers/infiniband/hw/cxgb3/iwch_provider.c +++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c @@ -1187,28 +1187,6 @@ static ssize_t show_rev(struct device *dev, struct device_attribute *attr, return sprintf(buf, "%d\n", iwch_dev->rdev.t3cdev_p->type); } -static int fw_supports_fastreg(struct iwch_dev *iwch_dev) -{ - struct ethtool_drvinfo info; - struct net_device *lldev = iwch_dev->rdev.t3cdev_p->lldev; - char *cp, *next; - unsigned fw_maj, fw_min; - - rtnl_lock(); - lldev->ethtool_ops->get_drvinfo(lldev, &info); - rtnl_unlock(); - - next = info.fw_version+1; - cp = strsep(&next, "."); - sscanf(cp, "%i", &fw_maj); - cp = strsep(&next, "."); - sscanf(cp, "%i", &fw_min); - - PDBG("%s maj %u min %u\n", __func__, fw_maj, fw_min); - - return fw_maj > 6 || (fw_maj == 6 && fw_min > 0); -} - static ssize_t show_fw_ver(struct device *dev, struct device_attribute *attr, char *buf) { struct iwch_dev *iwch_dev = container_of(dev, struct iwch_dev, @@ -1325,12 +1303,12 @@ int iwch_register_device(struct iwch_dev *dev) memset(&dev->ibdev.node_guid, 0, sizeof(dev->ibdev.node_guid)); memcpy(&dev->ibdev.node_guid, dev->rdev.t3cdev_p->lldev->dev_addr, 6); dev->ibdev.owner = THIS_MODULE; - dev->device_cap_flags = IB_DEVICE_LOCAL_DMA_LKEY | IB_DEVICE_MEM_WINDOW; + dev->device_cap_flags = IB_DEVICE_LOCAL_DMA_LKEY | + IB_DEVICE_MEM_WINDOW | + IB_DEVICE_MEM_MGT_EXTENSIONS; /* cxgb3 supports STag 0. 
*/ dev->ibdev.local_dma_lkey = 0; - if (fw_supports_fastreg(dev)) - dev->device_cap_flags |= IB_DEVICE_MEM_MGT_EXTENSIONS; dev->ibdev.uverbs_cmd_mask = (1ull << IB_USER_VERBS_CMD_GET_CONTEXT) | diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.h b/drivers/infiniband/hw/cxgb3/iwch_provider.h index f5ceca0..a237d49 100644 --- a/drivers/infiniband/hw/cxgb3/iwch_provider.h +++ b/drivers/infiniband/hw/cxgb3/iwch_provider.h @@ -293,9 +293,16 @@ static inline u32 iwch_ib_to_tpt_access(int acc) return (acc & IB_ACCESS_REMOTE_WRITE ? TPT_REMOTE_WRITE : 0) | (acc & IB_ACCESS_REMOTE_READ ? TPT_REMOTE_READ : 0) | (acc & IB_ACCESS_LOCAL_WRITE ? TPT_LOCAL_WRITE : 0) | + (acc & IB_ACCESS_MW_BIND ? TPT_MW_BIND : 0) | TPT_LOCAL_READ; } +static inline u32 iwch_ib_to_tpt_bind_access(int acc) +{ + return (acc & IB_ACCESS_REMOTE_WRITE ? TPT_REMOTE_WRITE : 0) | + (acc & IB_ACCESS_REMOTE_READ ? TPT_REMOTE_READ : 0); +} + enum iwch_mmid_state { IWCH_STAG_STATE_VALID, IWCH_STAG_STATE_INVALID diff --git a/drivers/infiniband/hw/cxgb3/iwch_qp.c b/drivers/infiniband/hw/cxgb3/iwch_qp.c index 9a3be3a..3e4585c 100644 --- a/drivers/infiniband/hw/cxgb3/iwch_qp.c +++ b/drivers/infiniband/hw/cxgb3/iwch_qp.c @@ -565,7 +565,7 @@ int iwch_bind_mw(struct ib_qp *qp, wqe->bind.type = TPT_VATO; /* TBD: check perms */ - wqe->bind.perms = iwch_ib_to_tpt_access(mw_bind->mw_access_flags); + wqe->bind.perms = iwch_ib_to_tpt_bind_access(mw_bind->mw_access_flags); wqe->bind.mr_stag = cpu_to_be32(mw_bind->mr->lkey); wqe->bind.mw_stag = cpu_to_be32(mw->rkey); wqe->bind.mw_len = cpu_to_be32(mw_bind->length); @@ -879,20 +879,13 @@ static int rdma_init(struct iwch_dev *rhp, struct iwch_qp *qhp, (qhp->attr.mpa_attr.xmit_marker_enabled << 1) | (qhp->attr.mpa_attr.crc_enabled << 2); - /* - * XXX - The IWCM doesn't quite handle getting these - * attrs set before going into RTS. For now, just turn - * them on always... 
- */ -#if 0 - init_attr.qpcaps = qhp->attr.enableRdmaRead | - (qhp->attr.enableRdmaWrite << 1) | - (qhp->attr.enableBind << 2) | - (qhp->attr.enable_stag0_fastreg << 3) | - (qhp->attr.enable_stag0_fastreg << 4); -#else - init_attr.qpcaps = 0x1f; -#endif + init_attr.qpcaps = uP_RI_QP_RDMA_READ_ENABLE | + uP_RI_QP_RDMA_WRITE_ENABLE | + uP_RI_QP_BIND_ENABLE; + if (!qhp->ibqp.uobject) + init_attr.qpcaps |= uP_RI_QP_STAG0_ENABLE | + uP_RI_QP_FAST_REGISTER_ENABLE; + init_attr.tcp_emss = qhp->ep->emss; init_attr.ord = qhp->attr.max_ord; init_attr.ird = qhp->attr.max_ird; @@ -900,8 +893,6 @@ static int rdma_init(struct iwch_dev *rhp, struct iwch_qp *qhp, init_attr.qp_dma_size = (1UL << qhp->wq.size_log2); init_attr.rqe_count = iwch_rqes_posted(qhp); init_attr.flags = qhp->attr.mpa_attr.initiator ? MPA_INITIATOR : 0; - if (!qhp->ibqp.uobject) - init_attr.flags |= PRIV_QP; if (peer2peer) { init_attr.rtr_type = RTR_READ; if (init_attr.ord == 0 && qhp->attr.mpa_attr.initiator) diff --git a/drivers/infiniband/hw/ipath/ipath_driver.c b/drivers/infiniband/hw/ipath/ipath_driver.c index daad09a..ad0aab6 100644 --- a/drivers/infiniband/hw/ipath/ipath_driver.c +++ b/drivers/infiniband/hw/ipath/ipath_driver.c @@ -1259,7 +1259,7 @@ reloop: */ ipath_cdbg(ERRPKT, "Error Pkt, but no eflags! 
egrbuf" " %x, len %x hdrq+%x rhf: %Lx\n", - etail, tlen, l, + etail, tlen, l, (unsigned long long) le64_to_cpu(*(__le64 *) rhf_addr)); if (ipath_debug & __IPATH_ERRPKTDBG) { u32 j, *d, dw = rsize-2; @@ -1457,7 +1457,8 @@ static void ipath_reset_availshadow(struct ipath_devdata *dd) 0xaaaaaaaaaaaaaaaaULL); /* All BUSY bits in qword */ if (oldval != dd->ipath_pioavailshadow[i]) ipath_dbg("shadow[%d] was %Lx, now %lx\n", - i, oldval, dd->ipath_pioavailshadow[i]); + i, (unsigned long long) oldval, + dd->ipath_pioavailshadow[i]); } spin_unlock_irqrestore(&ipath_pioavail_lock, flags); } diff --git a/drivers/infiniband/hw/ipath/ipath_iba7220.c b/drivers/infiniband/hw/ipath/ipath_iba7220.c index fb70712..85b2cd0 100644 --- a/drivers/infiniband/hw/ipath/ipath_iba7220.c +++ b/drivers/infiniband/hw/ipath/ipath_iba7220.c @@ -1032,7 +1032,7 @@ static int ipath_7220_bringup_serdes(struct ipath_devdata *dd) ipath_cdbg(VERBOSE, "done: xgxs=%llx from %llx\n", (unsigned long long) ipath_read_kreg64(dd, dd->ipath_kregs->kr_xgxsconfig), - prev_val); + (unsigned long long) prev_val); guid = be64_to_cpu(dd->ipath_guid); @@ -1042,7 +1042,8 @@ static int ipath_7220_bringup_serdes(struct ipath_devdata *dd) ipath_dbg("No GUID for heartbeat, faking %llx\n", (unsigned long long)guid); } else - ipath_cdbg(VERBOSE, "Wrote %llX to HRTBT_GUID\n", guid); + ipath_cdbg(VERBOSE, "Wrote %llX to HRTBT_GUID\n", + (unsigned long long) guid); ipath_write_kreg(dd, dd->ipath_kregs->kr_hrtbt_guid, guid); return ret; } @@ -2505,7 +2506,7 @@ done: if (dd->ipath_flags & IPATH_IB_AUTONEG_INPROG) { ipath_dbg("Did not get to DDR INIT (%x) after %Lu msecs\n", ipath_ib_state(dd, dd->ipath_lastibcstat), - jiffies_to_msecs(jiffies)-startms); + (unsigned long long) jiffies_to_msecs(jiffies)-startms); dd->ipath_flags &= ~IPATH_IB_AUTONEG_INPROG; if (dd->ipath_autoneg_tries == IPATH_AUTONEG_TRIES) { dd->ipath_flags |= IPATH_IB_AUTONEG_FAILED; diff --git a/drivers/infiniband/hw/ipath/ipath_intr.c 
b/drivers/infiniband/hw/ipath/ipath_intr.c index 26900b3..6c21b4b 100644 --- a/drivers/infiniband/hw/ipath/ipath_intr.c +++ b/drivers/infiniband/hw/ipath/ipath_intr.c @@ -356,9 +356,10 @@ static void handle_e_ibstatuschanged(struct ipath_devdata *dd, dd->ipath_cregs->cr_iblinkerrrecovcnt); if (linkrecov != dd->ipath_lastlinkrecov) { ipath_dbg("IB linkrecov up %Lx (%s %s) recov %Lu\n", - ibcs, ib_linkstate(dd, ibcs), + (unsigned long long) ibcs, + ib_linkstate(dd, ibcs), ipath_ibcstatus_str[ltstate], - linkrecov); + (unsigned long long) linkrecov); /* and no more until active again */ dd->ipath_lastlinkrecov = 0; ipath_set_linkstate(dd, IPATH_IB_LINKDOWN); @@ -1118,9 +1119,11 @@ irqreturn_t ipath_intr(int irq, void *data) if (unlikely(istat & ~dd->ipath_i_bitsextant)) ipath_dev_err(dd, "interrupt with unknown interrupts %Lx set\n", + (unsigned long long) istat & ~dd->ipath_i_bitsextant); else if (istat & ~INFINIPATH_I_ERROR) /* errors do own printing */ - ipath_cdbg(VERBOSE, "intr stat=0x%Lx\n", istat); + ipath_cdbg(VERBOSE, "intr stat=0x%Lx\n", + (unsigned long long) istat); if (istat & INFINIPATH_I_ERROR) { ipath_stats.sps_errints++; @@ -1128,7 +1131,8 @@ irqreturn_t ipath_intr(int irq, void *data) dd->ipath_kregs->kr_errorstatus); if (!estat) dev_info(&dd->pcidev->dev, "error interrupt (%Lx), " - "but no error bits set!\n", istat); + "but no error bits set!\n", + (unsigned long long) istat); else if (estat == -1LL) /* * should we try clearing all, or hope next read diff --git a/drivers/infiniband/hw/ipath/ipath_verbs.c b/drivers/infiniband/hw/ipath/ipath_verbs.c index 55c7188..b766e40 100644 --- a/drivers/infiniband/hw/ipath/ipath_verbs.c +++ b/drivers/infiniband/hw/ipath/ipath_verbs.c @@ -1021,7 +1021,7 @@ static void sdma_complete(void *cookie, int status) struct ipath_verbs_txreq *tx = cookie; struct ipath_qp *qp = tx->qp; struct ipath_ibdev *dev = to_idev(qp->ibqp.device); - unsigned int flags; + unsigned long flags; enum ib_wc_status ibs = status == 
IPATH_SDMA_TXREQ_S_OK ? IB_WC_SUCCESS : IB_WC_WR_FLUSH_ERR; @@ -1051,7 +1051,7 @@ static void sdma_complete(void *cookie, int status) static void decrement_dma_busy(struct ipath_qp *qp) { - unsigned int flags; + unsigned long flags; if (atomic_dec_and_test(&qp->s_dma_busy)) { spin_lock_irqsave(&qp->s_lock, flags); @@ -1221,7 +1221,7 @@ static int ipath_verbs_send_pio(struct ipath_qp *qp, unsigned flush_wc; u32 control; int ret; - unsigned int flags; + unsigned long flags; piobuf = ipath_getpiobuf(dd, plen, NULL); if (unlikely(piobuf == NULL)) { diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c index a146457..d0866a3 100644 --- a/drivers/infiniband/hw/mlx4/cq.c +++ b/drivers/infiniband/hw/mlx4/cq.c @@ -515,17 +515,17 @@ static void mlx4_ib_handle_error_cqe(struct mlx4_err_cqe *cqe, wc->vendor_err = cqe->vendor_err_syndrome; } -static int mlx4_ib_ipoib_csum_ok(__be32 status, __be16 checksum) +static int mlx4_ib_ipoib_csum_ok(__be16 status, __be16 checksum) { - return ((status & cpu_to_be32(MLX4_CQE_IPOIB_STATUS_IPV4 | - MLX4_CQE_IPOIB_STATUS_IPV4F | - MLX4_CQE_IPOIB_STATUS_IPV4OPT | - MLX4_CQE_IPOIB_STATUS_IPV6 | - MLX4_CQE_IPOIB_STATUS_IPOK)) == - cpu_to_be32(MLX4_CQE_IPOIB_STATUS_IPV4 | - MLX4_CQE_IPOIB_STATUS_IPOK)) && - (status & cpu_to_be32(MLX4_CQE_IPOIB_STATUS_UDP | - MLX4_CQE_IPOIB_STATUS_TCP)) && + return ((status & cpu_to_be16(MLX4_CQE_STATUS_IPV4 | + MLX4_CQE_STATUS_IPV4F | + MLX4_CQE_STATUS_IPV4OPT | + MLX4_CQE_STATUS_IPV6 | + MLX4_CQE_STATUS_IPOK)) == + cpu_to_be16(MLX4_CQE_STATUS_IPV4 | + MLX4_CQE_STATUS_IPOK)) && + (status & cpu_to_be16(MLX4_CQE_STATUS_UDP | + MLX4_CQE_STATUS_TCP)) && checksum == cpu_to_be16(0xffff); } @@ -582,17 +582,17 @@ repoll: } if (!*cur_qp || - (be32_to_cpu(cqe->my_qpn) & 0xffffff) != (*cur_qp)->mqp.qpn) { + (be32_to_cpu(cqe->vlan_my_qpn) & MLX4_CQE_QPN_MASK) != (*cur_qp)->mqp.qpn) { /* * We do not have to take the QP table lock here, * because CQs will be locked while QPs are removed * from the 
table. */ mqp = __mlx4_qp_lookup(to_mdev(cq->ibcq.device)->dev, - be32_to_cpu(cqe->my_qpn)); + be32_to_cpu(cqe->vlan_my_qpn)); if (unlikely(!mqp)) { printk(KERN_WARNING "CQ %06x with entry for unknown QPN %06x\n", - cq->mcq.cqn, be32_to_cpu(cqe->my_qpn) & 0xffffff); + cq->mcq.cqn, be32_to_cpu(cqe->vlan_my_qpn) & MLX4_CQE_QPN_MASK); return -EINVAL; } @@ -692,14 +692,13 @@ repoll: } wc->slid = be16_to_cpu(cqe->rlid); - wc->sl = cqe->sl >> 4; + wc->sl = be16_to_cpu(cqe->sl_vid >> 12); g_mlpath_rqpn = be32_to_cpu(cqe->g_mlpath_rqpn); wc->src_qp = g_mlpath_rqpn & 0xffffff; wc->dlid_path_bits = (g_mlpath_rqpn >> 24) & 0x7f; wc->wc_flags |= g_mlpath_rqpn & 0x80000000 ? IB_WC_GRH : 0; wc->pkey_index = be32_to_cpu(cqe->immed_rss_invalid) & 0x7f; - wc->csum_ok = mlx4_ib_ipoib_csum_ok(cqe->ipoib_status, - cqe->checksum); + wc->csum_ok = mlx4_ib_ipoib_csum_ok(cqe->status, cqe->checksum); } return 0; @@ -767,7 +766,7 @@ void __mlx4_ib_cq_clean(struct mlx4_ib_cq *cq, u32 qpn, struct mlx4_ib_srq *srq) */ while ((int) --prod_index - (int) cq->mcq.cons_index >= 0) { cqe = get_cqe(cq, prod_index & cq->ibcq.cqe); - if ((be32_to_cpu(cqe->my_qpn) & 0xffffff) == qpn) { + if ((be32_to_cpu(cqe->vlan_my_qpn) & MLX4_CQE_QPN_MASK) == qpn) { if (srq && !(cqe->owner_sr_opcode & MLX4_CQE_IS_SEND_MASK)) mlx4_ib_free_srq_wqe(srq, be16_to_cpu(cqe->wqe_index)); ++nfreed; diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c index f7bc7dd..f29dbb7 100644 --- a/drivers/infiniband/hw/mlx4/qp.c +++ b/drivers/infiniband/hw/mlx4/qp.c @@ -902,7 +902,7 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp, context->mtu_msgmax = (IB_MTU_4096 << 5) | ilog2(dev->dev->caps.max_gso_sz); else - context->mtu_msgmax = (IB_MTU_4096 << 5) | 11; + context->mtu_msgmax = (IB_MTU_4096 << 5) | 12; } else if (attr_mask & IB_QP_PATH_MTU) { if (attr->path_mtu < IB_MTU_256 || attr->path_mtu > IB_MTU_4096) { printk(KERN_ERR "path MTU (%u) is invalid\n", diff --git 
a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c index 0f2d304..7ebc400 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c @@ -337,7 +337,7 @@ static void ipoib_cm_init_rx_wr(struct net_device *dev, sge[i].length = PAGE_SIZE; wr->next = NULL; - wr->sg_list = priv->cm.rx_sge; + wr->sg_list = sge; wr->num_sge = priv->cm.num_frags; } diff --git a/include/linux/mlx4/cq.h b/include/linux/mlx4/cq.h index 071cf96..6f65b2c 100644 --- a/include/linux/mlx4/cq.h +++ b/include/linux/mlx4/cq.h @@ -39,17 +39,18 @@ #include struct mlx4_cqe { - __be32 my_qpn; + __be32 vlan_my_qpn; __be32 immed_rss_invalid; __be32 g_mlpath_rqpn; - u8 sl; - u8 reserved1; + __be16 sl_vid; __be16 rlid; - __be32 ipoib_status; + __be16 status; + u8 ipv6_ext_mask; + u8 badfcs_enc; __be32 byte_cnt; __be16 wqe_index; __be16 checksum; - u8 reserved2[3]; + u8 reserved[3]; u8 owner_sr_opcode; }; @@ -64,6 +65,11 @@ struct mlx4_err_cqe { }; enum { + MLX4_CQE_VLAN_PRESENT_MASK = 1 << 29, + MLX4_CQE_QPN_MASK = 0xffffff, +}; + +enum { MLX4_CQE_OWNER_MASK = 0x80, MLX4_CQE_IS_SEND_MASK = 0x40, MLX4_CQE_OPCODE_MASK = 0x1f @@ -86,13 +92,19 @@ enum { }; enum { - MLX4_CQE_IPOIB_STATUS_IPV4 = 1 << 22, - MLX4_CQE_IPOIB_STATUS_IPV4F = 1 << 23, - MLX4_CQE_IPOIB_STATUS_IPV6 = 1 << 24, - MLX4_CQE_IPOIB_STATUS_IPV4OPT = 1 << 25, - MLX4_CQE_IPOIB_STATUS_TCP = 1 << 26, - MLX4_CQE_IPOIB_STATUS_UDP = 1 << 27, - MLX4_CQE_IPOIB_STATUS_IPOK = 1 << 28, + MLX4_CQE_STATUS_IPV4 = 1 << 6, + MLX4_CQE_STATUS_IPV4F = 1 << 7, + MLX4_CQE_STATUS_IPV6 = 1 << 8, + MLX4_CQE_STATUS_IPV4OPT = 1 << 9, + MLX4_CQE_STATUS_TCP = 1 << 10, + MLX4_CQE_STATUS_UDP = 1 << 11, + MLX4_CQE_STATUS_IPOK = 1 << 12, +}; + +enum { + MLX4_CQE_LLC = 1, + MLX4_CQE_SNAP = 1 << 1, + MLX4_CQE_BAD_FCS = 1 << 4, }; static inline void mlx4_cq_arm(struct mlx4_cq *cq, u32 cmd, diff --git a/include/rdma/rdma_cm.h b/include/rdma/rdma_cm.h index df7faf0..c6b2962 100644 --- a/include/rdma/rdma_cm.h 
+++ b/include/rdma/rdma_cm.h @@ -71,12 +71,8 @@ enum rdma_port_space { }; struct rdma_addr { - struct sockaddr src_addr; - u8 src_pad[sizeof(struct sockaddr_in6) - - sizeof(struct sockaddr)]; - struct sockaddr dst_addr; - u8 dst_pad[sizeof(struct sockaddr_in6) - - sizeof(struct sockaddr)]; + struct sockaddr_storage src_addr; + struct sockaddr_storage dst_addr; struct rdma_dev_addr dev_addr; }; From weiny2 at llnl.gov Thu Aug 7 14:53:26 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Thu, 7 Aug 2008 14:53:26 -0700 Subject: [ofa-general] [PATCH v2] Add a Node Description check on light sweep to ensure that the ND has been found for each node. Message-ID: <20080807145326.1d91604c.weiny2@llnl.gov> >From 123a950a8bf0fc43331a1e715f0cdd756529437c Mon Sep 17 00:00:00 2001 From: Ira K. Weiny Date: Wed, 30 Jul 2008 17:28:30 -0700 Subject: [PATCH] Add a Node Description check on light sweep to ensure that the ND has been found for each node. This case covers the condition where an ND message is dropped/lost for some reason and OpenSM is left with a valid, configured node which is not named correctly. This is not the same as a node which has changed its Node Description. In that case the node needs to send a trap. 
Signed-off-by: Ira Weiny --- opensm/include/opensm/osm_base.h | 11 ++++++++ opensm/opensm/osm_node.c | 2 +- opensm/opensm/osm_state_mgr.c | 53 ++++++++++++++++++++++++++++++++++++++ 3 files changed, 65 insertions(+), 1 deletions(-) diff --git a/opensm/include/opensm/osm_base.h b/opensm/include/opensm/osm_base.h index 3793804..2e8def7 100644 --- a/opensm/include/opensm/osm_base.h +++ b/opensm/include/opensm/osm_base.h @@ -640,6 +640,17 @@ BEGIN_C_DECLS */ #define OSM_NO_PATH 0xFF /**********/ +/****d* OpenSM: Base/OSM_NODE_DESC_UNKNOWN +* NAME +* OSM_NODE_DESC_UNKNOWN +* +* DESCRIPTION +* Value indicating the Node Description is not set and is "unknown" +* +* SYNOPSIS +*/ +#define OSM_NODE_DESC_UNKNOWN "" +/**********/ /****d* OpenSM: Base/osm_thread_state_t * NAME * osm_thread_state_t diff --git a/opensm/opensm/osm_node.c b/opensm/opensm/osm_node.c index d99c656..123feb8 100644 --- a/opensm/opensm/osm_node.c +++ b/opensm/opensm/osm_node.c @@ -136,7 +136,7 @@ osm_node_t *osm_node_new(IN const osm_madw_t * const p_madw) osm_node_init_physp(p_node, p_madw); if (p_ni->node_type == IB_NODE_TYPE_SWITCH) node_init_physp0(p_node, p_madw); - p_node->print_desc = strdup(""); + p_node->print_desc = strdup(OSM_NODE_DESC_UNKNOWN); return (p_node); } diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c index 3cdb2cf..ef4bddd 100644 --- a/opensm/opensm/osm_state_mgr.c +++ b/opensm/opensm/osm_state_mgr.c @@ -516,6 +516,53 @@ static void query_sm_info(cl_map_item_t *item, void *cxt) } /********************************************************************** + During a light sweep check each node to see if the node descriptor is valid + if not issue a ND query. 
+**********************************************************************/ +static void __osm_state_mgr_get_node_desc(IN cl_map_item_t * const p_object, + IN void *context) +{ + osm_physp_t *p_physp = NULL; + osm_node_t *const p_node = (osm_node_t *) p_object; + ib_api_status_t status = IB_SUCCESS; + osm_madw_context_t mad_context; + osm_sm_t *sm = (osm_sm_t *)context; + + OSM_LOG_ENTER(sm->p_log); + + CL_ASSERT(p_node); + + if (p_node->print_desc && strcmp(p_node->print_desc, OSM_NODE_DESC_UNKNOWN)) + /* if ND is valid, do nothing */ + goto exit; + + OSM_LOG(sm->p_log, OSM_LOG_ERROR, + "ERR 3314: Unknown node description \"%s\" for node " + "0x%016" PRIx64 ". Reissuing ND query\n", + p_node->print_desc ? p_node->print_desc : OSM_NODE_DESC_UNKNOWN, + cl_ntoh64(osm_node_get_node_guid (p_node))); + + /* get a physp to request from. */ + p_physp = osm_node_get_any_physp_ptr(p_node); + + mad_context.nd_context.node_guid = osm_node_get_node_guid(p_node); + + status = osm_req_get(sm, + osm_physp_get_dr_path_ptr(p_physp), + IB_MAD_ATTR_NODE_DESC, + 0, CL_DISP_MSGID_NONE, &mad_context); + if (status != IB_SUCCESS) + OSM_LOG(sm->p_log, OSM_LOG_ERROR, + "__osm_state_mgr_get_node_desc: ERR 3315: " + "Failure initiating NodeDescription request (%s)\n", + ib_get_err_str(status)); + +exit: + OSM_LOG_EXIT(sm->p_log); +} + + +/********************************************************************** Initiates a lightweight sweep of the subnet. Used during normal sweeps after the subnet is up. 
**********************************************************************/ @@ -524,6 +571,7 @@ static ib_api_status_t __osm_state_mgr_light_sweep_start(IN osm_sm_t * sm) ib_api_status_t status = IB_SUCCESS; osm_bind_handle_t h_bind; cl_qmap_t *p_sw_tbl; + cl_qmap_t *p_node_tbl; cl_map_item_t *p_next; osm_node_t *p_node; osm_physp_t *p_physp; @@ -532,6 +580,7 @@ static ib_api_status_t __osm_state_mgr_light_sweep_start(IN osm_sm_t * sm) OSM_LOG_ENTER(sm->p_log); p_sw_tbl = &sm->p_subn->sw_guid_tbl; + p_node_tbl = &sm->p_subn->node_guid_tbl; /* * First, get the bind handle. @@ -550,6 +599,10 @@ static ib_api_status_t __osm_state_mgr_light_sweep_start(IN osm_sm_t * sm) cl_qmap_apply_func(p_sw_tbl, __osm_state_mgr_get_sw_info, sm); CL_PLOCK_RELEASE(sm->p_lock); + CL_PLOCK_ACQUIRE(sm->p_lock); + cl_qmap_apply_func(p_node_tbl, __osm_state_mgr_get_node_desc, sm); + CL_PLOCK_RELEASE(sm->p_lock); + /* now scan the list of physical ports that were not down but have no remote port */ CL_PLOCK_ACQUIRE(sm->p_lock); p_next = cl_qmap_head(&sm->p_subn->node_guid_tbl); -- 1.5.4.5

From wangwhao at cn.ibm.com Fri Aug 8 00:29:09 2008
From: wangwhao at cn.ibm.com (Wen Hao Wang)
Date: Fri, 8 Aug 2008 15:29:09 +0800
Subject: [ofa-general] opensm hang and osmtest report ERR 0130
In-Reply-To:
Message-ID:

>osmtest uses a non-compliant query to get all the paths and likely
>only OpenSM supports this extension.
>
>-- Hal

OK. I need to get opensm set up. The Cisco switch has TopspinOS-2.6.0/build195 installed. Maybe first I need to find out how to disable the embedded SM on the switch.

By the way, how can I know which standby/slave SMs exist in my cluster?

Wen Hao Wang
Email: wangwhao at cn.ibm.com

From wangwhao at cn.ibm.com Fri Aug 8 00:31:06 2008
From: wangwhao at cn.ibm.com (Wen Hao Wang)
Date: Fri, 8 Aug 2008 15:31:06 +0800
Subject: [ofa-general] opensm hang and osmtest report ERR 0130
In-Reply-To: <489AE265.2050109@dev.mellanox.co.il>
Message-ID:

>Here's what I see in the osmtest log:
>
>Aug 07 04:56:53 400669 [42909940] 0x01 -> umad_receiver: ERR 5409: send completed with error (method=0x12 attr=0x35 trans_id=0x2900000004) -- dropping
>Aug 07 04:56:53 400674 [42909940] 0x01 -> umad_receiver: ERR 5410: class 0x3 LID 0x2
>Aug 07 04:56:53 400687 [42909940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_TIMEOUT)
>
>Attribute ID 0x35 is PathRecord.
>I don't know why the embedded SM didn't answer the PathRecord query
>from the osmtest. I've never tried to run osmtest against non-opensm
>subnet managers, and I don't know if someone did it before. Sorry.
>
>Perhaps someone could comment on that...
>
>-- Yevgeny

OK. I need to get opensm set up. The Cisco switch has TopspinOS-2.6.0/build195 installed. Maybe first I need to find out how to disable the embedded SM on the switch.

By the way, how can I know which standby/slave SMs exist in my cluster, if all of the SMs are OpenSM instances?

Wen Hao Wang
Email: wangwhao at cn.ibm.com

From rajouri.jammu at gmail.com Fri Aug 8 00:24:16 2008
From: rajouri.jammu at gmail.com (Rajouri Jammu)
Date: Fri, 8 Aug 2008 00:24:16 -0700
Subject: [ofa-general] Re: OSU mpi latency > 100usec and > 1msec
Message-ID: <3307cdf90808080024g571bb74fta5c8f331591253e7@mail.gmail.com>

Correction: I meant to say > 100usec instead of 100msec in the subject line.

On Fri, Aug 8, 2008 at 12:14 AM, Rajouri Jammu wrote:
> Hi,
> I modified/instrumented the OSU MPI latency benchmark to measure time taken
> by each transaction in order to get min and max latencies, in addition to
> the average that's reported currently.
> I noticed that some of the transactions, albeit few, took > 100usec and > 1msec.
>
> Does anybody have any ideas about what could be causing such large round
> trip times (>1msec) for a few transactions while the average looks pretty
> good (in the 10usec range)?
> Is it a network or a system issue?
>
> Here is a snapshot of the output and attached is the modified code.
>
> I'm using OFED 1.3, CentOS 5 and openmpi-1.2.5.
>
> Any insights or ideas would be very helpful. Thanks in advance.
>
> Below shows # of transactions over 100usec and 1msec.
> Iteration count was set to 60000 for each test.
> Latency is round trip time.
>
> --------------------------------------------------------------------------
> # OSU MPI Latency Test v3.0
> # Size    Latency (us)
> 0         13.30    over_100usec: 13   over_1msec: 2    i 601000
> 1         13.51    over_100usec: 12   over_1msec: 0    i 601000
> 2         13.51    over_100usec: 13   over_1msec: 1    i 601000
> 4         13.81    over_100usec: 42   over_1msec: 33   i 601000
> 8         13.92    over_100usec: 36   over_1msec: 25   i 601000
> 16        13.90    over_100usec: 10   over_1msec: 0    i 601000
> 32        14.14    over_100usec: 54   over_1msec: 44   i 601000
> 64        14.32    over_100usec: 11   over_1msec: 1    i 601000
> 128       15.38    over_100usec: 10   over_1msec: 0    i 601000
> 256       15.94    over_100usec: 14   over_1msec: 2    i 601000
> 512       16.74    *over_100usec: 77  over_1msec: 65*  i 601000
> 1024      21.07    over_100usec: 17   over_1msec: 0    i 601000
> 2048      24.05    over_100usec: 17   over_1msec: 1    i 601000
> 4096      29.99    over_100usec: 37   over_1msec: 5    i 601000
> 8192      41.71    over_100usec: 39   over_1msec: 0    i 601000

From vlad at lists.openfabrics.org Fri Aug 8 02:45:20 2008
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Fri, 8 Aug 2008 02:45:20 -0700 (PDT)
Subject: [ofa-general] ofa_1_4_kernel 20080808-0200 daily build status
Message-ID: <20080808094520.CC74EE608BE@openfabrics.org>

This email was generated automatically, please do not reply

git_url: git://git.openfabrics.org/ofed_1_4/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:

Passed:
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.22
Passed on x86_64 with linux-2.6.16
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on x86_64 with linux-2.6.25
Passed on x86_64 with linux-2.6.26
Passed on x86_64 with linux-2.6.24
Passed on x86_64 with linux-2.6.9-67.ELsmp
Passed on ia64 with linux-2.6.16
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.17
Passed on ia64 with linux-2.6.18
Passed on ia64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.22
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.26
Passed on ia64 with linux-2.6.24
Passed on ia64 with linux-2.6.25
Passed on ppc64 with linux-2.6.16
Passed on ppc64 with linux-2.6.24

Failed:
Build failed on x86_64 with
linux-2.6.16.43-0.3-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.43-0.3-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.43-0.3-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.43-0.3-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.16.21-0.8-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.21-0.8-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.21-0.8-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.21-0.8-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -fasynchronous-unwind-tables -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18_x86_64_check/drivers/infiniband] Error 2 make[1]: *** 
[_module_/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.17 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -fomit-frame-pointer -g -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.17_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.17_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.17_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.17_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.17_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.17_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.17_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.17_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.17' make: *** [kernel] Error 2 
---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.16.60-0.21-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.60-0.21-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.60-0.21-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.60-0.21-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-1.2798.fc6 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -fasynchronous-unwind-tables -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o 
/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-1.2798.fc6_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-1.2798.fc6' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-8.el5 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o 
/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-8.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-8.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-53.el5 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o 
/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-53.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-53.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.20 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -maccumulate-outgoing-args -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o 
/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.20_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.20_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.20_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.20_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.20_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.20_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.20_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.20_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.20' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.19 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -maccumulate-outgoing-args -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -fomit-frame-pointer -fasynchronous-unwind-tables -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.19_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o 
/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.19_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.19_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.19_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.19_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.19_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.19_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.19_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.19' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-93.el5 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c 
/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-93.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-93.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-42.ELsmp Log: include/asm/apic.h:47: warning: value computed is not used /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1840: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.9-42.ELsmp_x86_64_check] Error 2 make[1]: Leaving 
directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-42.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-55.ELsmp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: include/linux/skbuff.h: In function 'skb_add_data': include/linux/skbuff.h:1041: warning: pointer targets in passing argument 2 of 'csum_partial_copy_from_user' differ in signedness make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.9-55.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-55.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.16.21-0.8-default Log: from /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.21-0.8-default_ia64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.16.21-0.8-default_ia64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.16.21-0.8-default' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.18 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mtune=power4 -mno-altivec -funit-at-a-time -mstring -Wa,-maltivec -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18_ppc64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18_ppc64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18_ppc64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18_ppc64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.18' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.18-8.el5 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration -Os -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mtune=power4 -mno-altivec -funit-at-a-time -mstring -Wa,-maltivec -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband] Error 2 make[1]: *** 
[_module_/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.18-8.el5_ppc64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.18-8.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.17 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -fomit-frame-pointer -g -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mtune=power4 -mno-altivec -funit-at-a-time -mstring -Wa,-maltivec -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.17_ppc64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.17_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.17_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.17_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.17_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.17_ppc64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.17_ppc64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.17_ppc64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.17' make: *** [kernel] Error 2 
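Note on the repeated `cycle_kernel_lock` failures in this report: the error indicates that `ipath_open` calls a helper that the older kernel headers in this build matrix do not declare. Below is a minimal sketch of the kind of compatibility shim a backport could carry, assuming the helper is nothing more than a `lock_kernel()`/`unlock_kernel()` pair. So that it compiles outside a kernel tree, `lock_kernel()`, `unlock_kernel()`, `mock_open()` and the `HAVE_CYCLE_KERNEL_LOCK` guard are hypothetical user-space stand-ins, not real kernel symbols or an existing OFED macro.

```c
/*
 * Sketch only: user-space mock of a backport shim for cycle_kernel_lock().
 * In a real kernel tree lock_kernel()/unlock_kernel() come from the kernel
 * headers; the versions below are stand-ins that only count calls.
 */
#include <assert.h>

static int bkl_depth;        /* current nesting depth of the mock lock */
static int bkl_acquisitions; /* total number of lock_kernel() calls */

static void lock_kernel(void)
{
	bkl_depth++;
	bkl_acquisitions++;
}

static void unlock_kernel(void)
{
	assert(bkl_depth > 0);
	bkl_depth--;
}

/*
 * HAVE_CYCLE_KERNEL_LOCK is a hypothetical guard that a compat header could
 * define when the kernel being built against already provides the helper.
 */
#ifndef HAVE_CYCLE_KERNEL_LOCK
static inline void cycle_kernel_lock(void)
{
	/* Take and immediately drop the lock: the caller only needs to know
	 * that the lock became available, not to hold it. */
	lock_kernel();
	unlock_kernel();
}
#endif

/* Stand-in for an open() path that calls the helper. */
static int mock_open(void)
{
	cycle_kernel_lock();
	return 0;
}
```

With a shim of this shape in a backport header, the driver source could in principle stay unchanged across the kernel versions listed above; whether that is the right fix for this tree is for the maintainers to decide.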
---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.19 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mtune=power4 -mno-altivec -funit-at-a-time -mstring -Wa,-maltivec -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.19_ppc64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.19_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.19_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.19_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.19_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.19_ppc64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.19_ppc64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080808-0200_linux-2.6.19_ppc64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.19' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- From hal.rosenstock at gmail.com Fri Aug 8 05:57:57 2008 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Fri, 8 Aug 2008 08:57:57 -0400 Subject: [ofa-general] 
opensm hang and osmtest report ERR 0130
In-Reply-To:
References:
Message-ID:

On Fri, Aug 8, 2008 at 3:29 AM, Wen Hao Wang wrote:
>>osmtest uses a non compliant query to get all the paths and likely
>>only OpenSM supports this extension.
>>
>>-- Hal
>
> OK. I need opensm to be set up. The Cisco switch has TopspinOS-2.6.0/build195
> installed. Maybe first I need to find out how to disable the embedded SM on the
> switch.

Yes, you should not run a mix of different flavor SMs in a subnet, so
if you want to run OpenSM, you need to disable the Cisco/Topspin SM.

> By the way, how can I know which standby/slave SMs exist in my
> cluster?

saquery -s will show all SMs (ports with the isSM or isSMDisabled capability).
For more detail, you can run sminfo on all of these.

-- Hal

> Wen Hao Wang
> Email: wangwhao at cn.ibm.com

From hal.rosenstock at gmail.com Fri Aug 8 05:58:15 2008
From: hal.rosenstock at gmail.com (Hal Rosenstock)
Date: Fri, 8 Aug 2008 08:58:15 -0400
Subject: [ofa-general] [PATCH v2] Add a Node Description check on light sweep to ensure that the ND has been found for each node.
In-Reply-To: <20080807145326.1d91604c.weiny2@llnl.gov>
References: <20080807145326.1d91604c.weiny2@llnl.gov>
Message-ID:

On Thu, Aug 7, 2008 at 5:53 PM, Ira Weiny wrote:
> >From 123a950a8bf0fc43331a1e715f0cdd756529437c Mon Sep 17 00:00:00 2001
> From: Ira K. Weiny
> Date: Wed, 30 Jul 2008 17:28:30 -0700
> Subject: [PATCH] Add a Node Description check on light sweep to ensure that the ND has been
> found for each node. This case covers the condition where a ND message is
> dropped/lost for some reason and OpenSM is left with a valid configured node
> which is not named correctly.
>
> This is not the same as a node which has changed its Node Description. In
> this case the node needs to send a trap.

Nit below.
> > Signed-off-by: Ira Weiny > --- > opensm/include/opensm/osm_base.h | 11 ++++++++ > opensm/opensm/osm_node.c | 2 +- > opensm/opensm/osm_state_mgr.c | 53 ++++++++++++++++++++++++++++++++++++++ > 3 files changed, 65 insertions(+), 1 deletions(-) > > diff --git a/opensm/include/opensm/osm_base.h b/opensm/include/opensm/osm_base.h > index 3793804..2e8def7 100644 > --- a/opensm/include/opensm/osm_base.h > +++ b/opensm/include/opensm/osm_base.h > @@ -640,6 +640,17 @@ BEGIN_C_DECLS > */ > #define OSM_NO_PATH 0xFF > /**********/ > +/****d* OpenSM: Base/OSM_NODE_DESC_UNKNOWN > +* NAME > +* OSM_NODE_DESC_UNKNOWN > +* > +* DESCRIPTION > +* Value indicating the Node Description is not set and is "unknown" > +* > +* SYNOPSIS > +*/ > +#define OSM_NODE_DESC_UNKNOWN "" > +/**********/ > /****d* OpenSM: Base/osm_thread_state_t > * NAME > * osm_thread_state_t > diff --git a/opensm/opensm/osm_node.c b/opensm/opensm/osm_node.c > index d99c656..123feb8 100644 > --- a/opensm/opensm/osm_node.c > +++ b/opensm/opensm/osm_node.c > @@ -136,7 +136,7 @@ osm_node_t *osm_node_new(IN const osm_madw_t * const p_madw) > osm_node_init_physp(p_node, p_madw); > if (p_ni->node_type == IB_NODE_TYPE_SWITCH) > node_init_physp0(p_node, p_madw); > - p_node->print_desc = strdup(""); > + p_node->print_desc = strdup(OSM_NODE_DESC_UNKNOWN); > > return (p_node); > } > diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c > index 3cdb2cf..ef4bddd 100644 > --- a/opensm/opensm/osm_state_mgr.c > +++ b/opensm/opensm/osm_state_mgr.c > @@ -516,6 +516,53 @@ static void query_sm_info(cl_map_item_t *item, void *cxt) > } > > /********************************************************************** > + During a light sweep check each node to see if the node descriptor is valid > + if not issue a ND query. 
> +**********************************************************************/ > +static void __osm_state_mgr_get_node_desc(IN cl_map_item_t * const p_object, > + IN void *context) > +{ > + osm_physp_t *p_physp = NULL; > + osm_node_t *const p_node = (osm_node_t *) p_object; > + ib_api_status_t status = IB_SUCCESS; > + osm_madw_context_t mad_context; > + osm_sm_t *sm = (osm_sm_t *)context; > + > + OSM_LOG_ENTER(sm->p_log); > + > + CL_ASSERT(p_node); > + > + if (p_node->print_desc && strcmp(p_node->print_desc, OSM_NODE_DESC_UNKNOWN)) > + /* if ND is valid, do nothing */ > + goto exit; > + > + OSM_LOG(sm->p_log, OSM_LOG_ERROR, > + "ERR 3314: Unknown node description \"%s\" for node " > + "0x%016" PRIx64 ". Reissuing ND query\n", > + p_node->print_desc ? p_node->print_desc : OSM_NODE_DESC_UNKNOWN, > + cl_ntoh64(osm_node_get_node_guid (p_node))); > + > + /* get a physp to request from. */ > + p_physp = osm_node_get_any_physp_ptr(p_node); > + > + mad_context.nd_context.node_guid = osm_node_get_node_guid(p_node); > + > + status = osm_req_get(sm, > + osm_physp_get_dr_path_ptr(p_physp), > + IB_MAD_ATTR_NODE_DESC, > + 0, CL_DISP_MSGID_NONE, &mad_context); > + if (status != IB_SUCCESS) > + OSM_LOG(sm->p_log, OSM_LOG_ERROR, > + "__osm_state_mgr_get_node_desc: ERR 3315: " > + "Failure initiating NodeDescription request (%s)\n", > + ib_get_err_str(status)); Aren't error codes 3314 and 3315 already taken ? -- Hal > + > +exit: > + OSM_LOG_EXIT(sm->p_log); > +} > + > + > +/********************************************************************** > Initiates a lightweight sweep of the subnet. > Used during normal sweeps after the subnet is up. 
> **********************************************************************/ > @@ -524,6 +571,7 @@ static ib_api_status_t __osm_state_mgr_light_sweep_start(IN osm_sm_t * sm) > ib_api_status_t status = IB_SUCCESS; > osm_bind_handle_t h_bind; > cl_qmap_t *p_sw_tbl; > + cl_qmap_t *p_node_tbl; > cl_map_item_t *p_next; > osm_node_t *p_node; > osm_physp_t *p_physp; > @@ -532,6 +580,7 @@ static ib_api_status_t __osm_state_mgr_light_sweep_start(IN osm_sm_t * sm) > OSM_LOG_ENTER(sm->p_log); > > p_sw_tbl = &sm->p_subn->sw_guid_tbl; > + p_node_tbl = &sm->p_subn->node_guid_tbl; > > /* > * First, get the bind handle. > @@ -550,6 +599,10 @@ static ib_api_status_t __osm_state_mgr_light_sweep_start(IN osm_sm_t * sm) > cl_qmap_apply_func(p_sw_tbl, __osm_state_mgr_get_sw_info, sm); > CL_PLOCK_RELEASE(sm->p_lock); > > + CL_PLOCK_ACQUIRE(sm->p_lock); > + cl_qmap_apply_func(p_node_tbl, __osm_state_mgr_get_node_desc, sm); > + CL_PLOCK_RELEASE(sm->p_lock); > + > /* now scan the list of physical ports that were not down but have no remote port */ > CL_PLOCK_ACQUIRE(sm->p_lock); > p_next = cl_qmap_head(&sm->p_subn->node_guid_tbl); > -- > 1.5.4.5 > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > From rajouri.jammu at gmail.com Fri Aug 8 00:14:56 2008 From: rajouri.jammu at gmail.com (Rajouri Jammu) Date: Fri, 8 Aug 2008 00:14:56 -0700 Subject: [ofa-general] ***SPAM*** OSU mpi latency > 100msec and > 1msec In-Reply-To: <3307cdf90808080010q488486ffqeea6f24eb278f0f3@mail.gmail.com> References: <3307cdf90808080010q488486ffqeea6f24eb278f0f3@mail.gmail.com> Message-ID: <3307cdf90808080014t573130f4h1534fd2905a52cd3@mail.gmail.com> Hi, I modified/instrumented the OSU MP latency benchmark to measure time taken by each transaction in order to get min and max latencies, in addition to the average 
that's reported currently.
I noticed that some of the transactions, albeit few, took > 100usec and > 1msec.

Does anybody have any ideas about what could be causing such large round trip times (>1msec) for a few transactions while the average looks pretty good (in the 10usec range)?
Is it a network or a system issue?

Here is a snapshot of the output; the modified code is attached.

I'm using OFED 1.3, CentOS 5 and openmpi-1.2.5.

Any insights or ideas would be very helpful. Thanks in advance.

The output below shows the number of transactions over 100usec and over 1msec.
Iteration count was set to 60000 for each test.
Latency is round trip time.

--------------------------------------------------------------------------
# OSU MPI Latency Test v3.0
# Size    Latency (us)
0         13.30    over_100usec: 13    over_1msec: 2     i 601000
1         13.51    over_100usec: 12    over_1msec: 0     i 601000
2         13.51    over_100usec: 13    over_1msec: 1     i 601000
4         13.81    over_100usec: 42    over_1msec: 33    i 601000
8         13.92    over_100usec: 36    over_1msec: 25    i 601000
16        13.90    over_100usec: 10    over_1msec: 0     i 601000
32        14.14    over_100usec: 54    over_1msec: 44    i 601000
64        14.32    over_100usec: 11    over_1msec: 1     i 601000
128       15.38    over_100usec: 10    over_1msec: 0     i 601000
256       15.94    over_100usec: 14    over_1msec: 2     i 601000
512       16.74   *over_100usec: 77    over_1msec: 65   *i 601000
1024      21.07    over_100usec: 17    over_1msec: 0     i 601000
2048      24.05    over_100usec: 17    over_1msec: 1     i 601000
4096      29.99    over_100usec: 37    over_1msec: 5     i 601000
8192      41.71    over_100usec: 39    over_1msec: 0     i 601000

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: osu_latency_profile.c
URL:

From john.russo at qlogic.com Fri Aug 8 07:12:43 2008
From: john.russo at qlogic.com (John Russo)
Date: Fri, 8 Aug 2008 09:12:43 -0500
Subject: [ofa-general] FW: An alternative solution to the node name issue in OFED 1.3.1
Message-ID: <99863D2ED484D449811D97A4C44C9CBD8902B9@EPEXCH2.qlogic.org>

Issue: We have found that causes openibd to be started before networking and therefore the NodeDescription, when returned from the SM, does not always contain the hostname of the system when ibhosts is run.

A solution was proposed; however, I wanted to offer an alternative that we worked out, in case you like it and want to use it instead.

John Russo

__________________________
John F. Russo
Manager, Engineering
QLogic Corporation
780 Fifth Avenue, Suite 140
King of Prussia, PA 19406
Direct: 610-233-4866
Main: 610-233-4800
Fax: 610-233-4777
Cell: 610-246-9903
Email: John.Russo at qlogic.com
www.qlogic.com

True success is the undeniable truth that we have proved ourselves.
-Joe Luppino-Esposito

-------------- next part --------------
A non-text attachment was scrubbed...
Name: zzz_qlogic_0040_node_desc.patch
Type: application/octet-stream
Size: 5477 bytes
Desc: zzz_qlogic_0040_node_desc.patch
URL:

From Robert at saq.co.uk Fri Aug 8 07:21:49 2008
From: Robert at saq.co.uk (Robert Dunkley)
Date: Fri, 8 Aug 2008 15:21:49 +0100
Subject: [ofa-general] Centos 5.2 dmesg Error: ib_ipath: Unknown symbol ipath_init_iba7220_funcs
Message-ID:

Does anyone know what this error means? I'm using IPoIB OK with a Mellanox card. I installed OFED 1.3.1 on Centos 5.2 and was wondering if this is anything to worry about?
Thanks, Rob I have installed the following modules from OFED 1.3.1 on Centos 5.2: Select Option [1-4]:4 Install kernel-ib? [y/N]:y Install core module? [y/N]:y Install mthca module? [y/N]:y Install mlx4 module? [y/N]:y Install cxgb3 module? [y/N]:y Install nes module? [y/N]:n Install ipath module? [y/N]:y Install ipoib module? [y/N]:y Install sdp module? [y/N]:n Install srp module? [y/N]:n Install srpt module? [y/N]:n Install rds module? [y/N]:n Install qlgc_vnic module? [y/N]:y Install iser module? (open-iscsi will also be installed) [y/N]:n Install kernel-ib-devel? [y/N]:y Install ib-bonding? [y/N]:n Install ib-bonding-debuginfo? [y/N]:n Install libibverbs? [y/N]:y Install libibverbs-devel? [y/N]:n Install libibverbs-devel-static? [y/N]:n Install libibverbs-utils? [y/N]:n Install libibverbs-debuginfo? [y/N]:n Install libmthca? [y/N]:y Install libmthca-devel-static? [y/N]:n Install libmthca-debuginfo? [y/N]:n Install libmlx4? [y/N]:y Install libmlx4-devel-static? [y/N]:n Install libmlx4-debuginfo? [y/N]:n Install libcxgb3? [y/N]:y Install libcxgb3-devel? [y/N]:n Install libcxgb3-debuginfo? [y/N]:n Install libnes? [y/N]:y Install libnes-devel-static? [y/N]:n Install libnes-debuginfo? [y/N]:n Install libipathverbs? [y/N]:y Install libipathverbs-devel? [y/N]:n Install libipathverbs-debuginfo? [y/N]:n Install libibcm? [y/N]:y Install libibcm-devel? [y/N]:n Install libibcm-debuginfo? [y/N]:n Install libibcommon? [y/N]:y Install libibcommon-devel? [y/N]:n Install libibcommon-static? [y/N]:n Install libibcommon-debuginfo? [y/N]:n Install libibumad? [y/N]:y Install libibumad-devel? [y/N]:n Install libibumad-static? [y/N]:n Install libibumad-debuginfo? [y/N]:n Install libibmad? [y/N]:y Install libibmad-devel? [y/N]:n Install libibmad-static? [y/N]:n Install libibmad-debuginfo? [y/N]:n Install ibsim? [y/N]:y Install ibsim-debuginfo? [y/N]:n Install librdmacm? [y/N]:y Install librdmacm-utils? [y/N]:y Install librdmacm-devel? [y/N]:n Install librdmacm-debuginfo? 
[y/N]:n Install libsdp? [y/N]:n Install libsdp-devel? [y/N]:n Install libsdp-debuginfo? [y/N]:n Install opensm? [y/N]:y Install opensm-libs? [y/N]:y Install opensm-devel? [y/N]:n Install opensm-debuginfo? [y/N]:n Install opensm-static? [y/N]:n Install dapl-v1? [y/N]:n Install dapl-v1-devel? [y/N]:n Install dapl-v2? [y/N]:n Install dapl-devel? [y/N]:n Install dapl-devel-static? [y/N]:n Install dapl-utils? [y/N]:n Install dapl-debuginfo? [y/N]:n Install perftest? [y/N]:y Install mstflint? [y/N]:y Install tvflash? [y/N]:n Install qlvnictools? [y/N]:y Install sdpnetstat? [y/N]:n Install srptools? [y/N]:n Install rds-tools? [y/N]:n Install ibutils? [y/N]:y Install infiniband-diags? [y/N]:y Install qperf? [y/N]:y Install qperf-debuginfo? [y/N]:n Install ofed-docs? [y/N]:n Install ofed-scripts? [y/N]:n Install mpi-selector? [y/N]:n Install openmpi_gcc? [y/N]:n Install mpitests_openmpi_gcc? [y/N]:n Install 32-bit packages? [y/N]:n The SAQ Group Registered Office: 18 Chapel Street, Petersfield, Hampshire. GU32 3DZ SEMTEC Limited trading as SAQ is Registered in England & Wales Company Number: 06481952 http://www.saqnet.co.uk AS29219 SAQ Group Delivers high quality, honestly priced communication and I.T. services to UK Business. DSL : Domains : Email : Hosting : CoLo : Servers : Racks : Transit : Backups : Managed Networks : Remote Support. Find us in http://www.thebestof.co.uk/petersfield -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From hal.rosenstock at gmail.com Fri Aug 8 07:27:57 2008
From: hal.rosenstock at gmail.com (Hal Rosenstock)
Date: Fri, 8 Aug 2008 10:27:57 -0400
Subject: [ofa-general] FW: An alternative solution to the node name issue in OFED 1.3.1
In-Reply-To: <99863D2ED484D449811D97A4C44C9CBD8902B9@EPEXCH2.qlogic.org>
References: <99863D2ED484D449811D97A4C44C9CBD8902B9@EPEXCH2.qlogic.org>
Message-ID:

On Fri, Aug 8, 2008 at 10:12 AM, John Russo wrote:
> Issue: We have found that causes openibd to be started before networking
> and therefore the NodeDescription, when returned from the SM, does not
> always contain the hostname of the system when ibhosts is run.
>
> A solution was proposed however I wanted to give an alternative that we
> worked out in case you liked it and wanted to use it instead.

I would think setting the NodeDescription in this manner would need
to be done optionally, via a module parameter, with the default being
off. Quite some time ago we had a discussion about it being a system
admin policy/possible security issue to reveal or not reveal the
hostname via similar mechanisms. For a similar reason, this capability
was removed from ICMP.

-- Hal

> John Russo

From jsquyres at cisco.com Fri Aug 8 07:28:34 2008
From: jsquyres at cisco.com (Jeff Squyres)
Date: Fri, 8 Aug 2008 10:28:34 -0400
Subject: [ofa-general] ***SPAM*** OSU mpi latency > 100msec and > 1msec
In-Reply-To: <3307cdf90808080014t573130f4h1534fd2905a52cd3@mail.gmail.com>
References: <3307cdf90808080010q488486ffqeea6f24eb278f0f3@mail.gmail.com> <3307cdf90808080014t573130f4h1534fd2905a52cd3@mail.gmail.com>
Message-ID: <738924D7-6F04-42A9-9F71-AF387CA55410@cisco.com>

These types of effects can be caused by congestion in the network, the
location of the processes in the network, and/or other sources of
jitter on your hosts (e.g., other processes interrupting and running,
etc.).

Even the 13us looks pretty high for iWARP or IB; perhaps that's caused
by the outliers in your data set.

FWIW: we normally get latency in the 1-2us range for IB, assuming you
have top-of-the-line servers, HCAs, and exactly one switch hop between
the two servers.

On Aug 8, 2008, at 3:14 AM, Rajouri Jammu wrote:

> Hi,
> I modified/instrumented the OSU MPI latency benchmark to measure time
> taken by each transaction in order to get min and max latencies, in
> addition to the average that's reported currently.
> I noticed that some of the transactions, albeit few, took > 100usec
> and > 1msec.
>
> Does anybody have any ideas about what could be causing such large
> round trip times (>1msec) for a few transactions while the average
> looks pretty good ( 10usec ranges) ?
> Is it network or system issues?
>
> Here is a snapshot of the output and attached is the modified code.
>
> i'm using OFED 1.3, CentOS 5 and openmpi-1.2.5.
>
> Any insights or ideas would be very helpful. Thanks in advance.
> > Below shows # of transactions over 100usec and 1msec. > Iteration count was set to 60000 for each test. > Latency is round trip time. > > -------------------------------------------------------------------------- > # OSU MPI Latency Test v3.0 > # Size Latency (us) > 0 13.30 over_100usec: 13 over_1msec: 2 i > 601000 > 1 13.51 over_100usec: 12 over_1msec: 0 i > 601000 > 2 13.51 over_100usec: 13 over_1msec: 1 i > 601000 > 4 13.81 over_100usec: 42 over_1msec: 33 i > 601000 > 8 13.92 over_100usec: 36 over_1msec: 25 i > 601000 > 16 13.90 over_100usec: 10 over_1msec: 0 i > 601000 > 32 14.14 over_100usec: 54 over_1msec: 44 i > 601000 > 64 14.32 over_100usec: 11 over_1msec: 1 i > 601000 > 128 15.38 over_100usec: 10 over_1msec: 0 i > 601000 > 256 15.94 over_100usec: 14 over_1msec: 2 i > 601000 > 512 16.74 over_100usec: 77 over_1msec: 65 i > 601000 > 1024 21.07 over_100usec: 17 over_1msec: 0 i > 601000 > 2048 24.05 over_100usec: 17 over_1msec: 1 i > 601000 > 4096 29.99 over_100usec: 37 over_1msec: 5 i > 601000 > 8192 41.71 over_100usec: 39 over_1msec: 0 i > 601000 > > > > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general -- Jeff Squyres Cisco Systems From dwilder at us.ibm.com Fri Aug 8 08:42:14 2008 From: dwilder at us.ibm.com (David J. Wilder) Date: Fri, 08 Aug 2008 08:42:14 -0700 Subject: [ofa-general] [PATCH] Use vmalloc to alloc the rx_ring In-Reply-To: References: <1217483438.29436.11.camel@dumpserver> <1217519029.9162.3.camel@dumpserver> Message-ID: <1218210134.11020.7.camel@wilder.ibm.com> On Thu, 2008-08-07 at 09:41 -0700, Roland Dreier wrote: > What is the severity of this issue? Is this a patch for 2.6.27? Roland, Thanks for the review and comments. I will fix up the patch and resend. 
Regarding the severity, we have customers running UDP applications that
require large receive buffer sizes in order to get the IB performance
they need. I would like this issue treated as high priority and
pushed to 2.6.27. The patch was built against your git tree.

Regards
Dave

From weiny2 at llnl.gov Fri Aug 8 09:08:15 2008
From: weiny2 at llnl.gov (Ira Weiny)
Date: Fri, 8 Aug 2008 09:08:15 -0700
Subject: [ofa-general] [PATCH v3] Add a Node Description check on light sweep to ensure that the ND has been found for each node.
In-Reply-To:
References: <20080807145326.1d91604c.weiny2@llnl.gov>
Message-ID: <20080808090815.0a9b923d.weiny2@llnl.gov>

On Fri, 8 Aug 2008 08:58:15 -0400 "Hal Rosenstock" wrote:

> On Thu, Aug 7, 2008 at 5:53 PM, Ira Weiny wrote:
> > >From 123a950a8bf0fc43331a1e715f0cdd756529437c Mon Sep 17 00:00:00 2001
> > From: Ira K. Weiny
> > + if (status != IB_SUCCESS)
> > + OSM_LOG(sm->p_log, OSM_LOG_ERROR,
> > + "__osm_state_mgr_get_node_desc: ERR 3315: "
> > + "Failure initiating NodeDescription request (%s)\n",
> > + ib_get_err_str(status));
> >
> Aren't error codes 3314 and 3315 already taken ?
>
> -- Hal
>

Yes, I forgot you mentioned that. v3 attached.

Ira

>From 6470536504e0bb6c6ff86619f3801235e022a99d Mon Sep 17 00:00:00 2001
From: Ira K. Weiny
Date: Wed, 30 Jul 2008 17:28:30 -0700
Subject: [PATCH] Add a Node Description check on light sweep to ensure that the ND has been
found for each node. This case covers the condition where a ND message is
dropped/lost for some reason and OpenSM is left with a valid configured node
which is not named correctly.

This is not the same as a node which has changed its Node Description. In
this case the node needs to send a trap.
Signed-off-by: Ira Weiny
---
 opensm/include/opensm/osm_base.h |   11 ++++++++
 opensm/opensm/osm_node.c         |    2 +-
 opensm/opensm/osm_state_mgr.c    |   53 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 65 insertions(+), 1 deletions(-)

diff --git a/opensm/include/opensm/osm_base.h b/opensm/include/opensm/osm_base.h
index 3793804..2e8def7 100644
--- a/opensm/include/opensm/osm_base.h
+++ b/opensm/include/opensm/osm_base.h
@@ -640,6 +640,17 @@ BEGIN_C_DECLS
 */
 #define OSM_NO_PATH 0xFF
 /**********/
+/****d* OpenSM: Base/OSM_NODE_DESC_UNKNOWN
+* NAME
+*	OSM_NODE_DESC_UNKNOWN
+*
+* DESCRIPTION
+*	Value indicating the Node Description is not set and is "unknown"
+*
+* SYNOPSIS
+*/
+#define OSM_NODE_DESC_UNKNOWN ""
+/**********/
 /****d* OpenSM: Base/osm_thread_state_t
 * NAME
 *	osm_thread_state_t
diff --git a/opensm/opensm/osm_node.c b/opensm/opensm/osm_node.c
index d99c656..123feb8 100644
--- a/opensm/opensm/osm_node.c
+++ b/opensm/opensm/osm_node.c
@@ -136,7 +136,7 @@ osm_node_t *osm_node_new(IN const osm_madw_t * const p_madw)
 	osm_node_init_physp(p_node, p_madw);
 	if (p_ni->node_type == IB_NODE_TYPE_SWITCH)
 		node_init_physp0(p_node, p_madw);
-	p_node->print_desc = strdup("");
+	p_node->print_desc = strdup(OSM_NODE_DESC_UNKNOWN);
 
 	return (p_node);
 }
diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
index 3cdb2cf..7502287 100644
--- a/opensm/opensm/osm_state_mgr.c
+++ b/opensm/opensm/osm_state_mgr.c
@@ -516,6 +516,53 @@ static void query_sm_info(cl_map_item_t *item, void *cxt)
 }
 
 /**********************************************************************
+ During a light sweep check each node to see if the node descriptor is valid
+ if not issue a ND query.
+**********************************************************************/
+static void __osm_state_mgr_get_node_desc(IN cl_map_item_t * const p_object,
+					  IN void *context)
+{
+	osm_physp_t *p_physp = NULL;
+	osm_node_t *const p_node = (osm_node_t *) p_object;
+	ib_api_status_t status = IB_SUCCESS;
+	osm_madw_context_t mad_context;
+	osm_sm_t *sm = (osm_sm_t *)context;
+
+	OSM_LOG_ENTER(sm->p_log);
+
+	CL_ASSERT(p_node);
+
+	if (p_node->print_desc && strcmp(p_node->print_desc, OSM_NODE_DESC_UNKNOWN))
+		/* if ND is valid, do nothing */
+		goto exit;
+
+	OSM_LOG(sm->p_log, OSM_LOG_ERROR,
+		"ERR 3319: Unknown node description \"%s\" for node "
+		"0x%016" PRIx64 ". Reissuing ND query\n",
+		p_node->print_desc ? p_node->print_desc : OSM_NODE_DESC_UNKNOWN,
+		cl_ntoh64(osm_node_get_node_guid (p_node)));
+
+	/* get a physp to request from. */
+	p_physp = osm_node_get_any_physp_ptr(p_node);
+
+	mad_context.nd_context.node_guid = osm_node_get_node_guid(p_node);
+
+	status = osm_req_get(sm,
+			     osm_physp_get_dr_path_ptr(p_physp),
+			     IB_MAD_ATTR_NODE_DESC,
+			     0, CL_DISP_MSGID_NONE, &mad_context);
+	if (status != IB_SUCCESS)
+		OSM_LOG(sm->p_log, OSM_LOG_ERROR,
+			"__osm_state_mgr_get_node_desc: ERR 331B: "
+			"Failure initiating NodeDescription request (%s)\n",
+			ib_get_err_str(status));
+
+exit:
+	OSM_LOG_EXIT(sm->p_log);
+}
+
+
+/**********************************************************************
  Initiates a lightweight sweep of the subnet.
  Used during normal sweeps after the subnet is up.
 **********************************************************************/
@@ -524,6 +571,7 @@ static ib_api_status_t __osm_state_mgr_light_sweep_start(IN osm_sm_t * sm)
 	ib_api_status_t status = IB_SUCCESS;
 	osm_bind_handle_t h_bind;
 	cl_qmap_t *p_sw_tbl;
+	cl_qmap_t *p_node_tbl;
 	cl_map_item_t *p_next;
 	osm_node_t *p_node;
 	osm_physp_t *p_physp;
@@ -532,6 +580,7 @@ static ib_api_status_t __osm_state_mgr_light_sweep_start(IN osm_sm_t * sm)
 	OSM_LOG_ENTER(sm->p_log);
 
 	p_sw_tbl = &sm->p_subn->sw_guid_tbl;
+	p_node_tbl = &sm->p_subn->node_guid_tbl;
 
 	/*
 	 * First, get the bind handle.
@@ -550,6 +599,10 @@ static ib_api_status_t __osm_state_mgr_light_sweep_start(IN osm_sm_t * sm)
 	cl_qmap_apply_func(p_sw_tbl, __osm_state_mgr_get_sw_info, sm);
 	CL_PLOCK_RELEASE(sm->p_lock);
 
+	CL_PLOCK_ACQUIRE(sm->p_lock);
+	cl_qmap_apply_func(p_node_tbl, __osm_state_mgr_get_node_desc, sm);
+	CL_PLOCK_RELEASE(sm->p_lock);
+
 	/* now scan the list of physical ports that were not down but have no remote port */
 	CL_PLOCK_ACQUIRE(sm->p_lock);
 	p_next = cl_qmap_head(&sm->p_subn->node_guid_tbl);
-- 
1.5.4.5

From YJia at tmriusa.com Fri Aug 8 09:18:55 2008 From: YJia at tmriusa.com (Yicheng Jia) Date: Fri, 8 Aug 2008 11:18:55 -0500 Subject: [ofa-general] opensm hang and osmtest report ERR 0130 In-Reply-To: <20080808141114.1AA6DE60A13@openfabrics.org> Message-ID: Hi Hal, I have a question regarding to this issue. If I have both opensm and Cisco managed switch running in the same subnet, can opensm automatically detect the Master SM and set itself as the slave SM? Or I have to manually disable the SM in the Cisco switch in order for the opensm acts properly? Thanks! Yicheng general-request at lists.openfabrics.org Sent by: general-bounces at lists.openfabrics.org 08/08/2008 09:06 AM Please respond to general at lists.openfabrics.org To general at lists.openfabrics.org cc Subject general Digest, Vol 19, Issue 31 Today's Topics: 1.
Re: opensm hang and osmtest report ERR 0130 (Hal Rosenstock) ---------------------------------------------------------------------- Message: 1 Date: Fri, 8 Aug 2008 08:57:57 -0400 From: "Hal Rosenstock" Subject: Re: [ofa-general] opensm hang and osmtest report ERR 0130 To: "Wen Hao Wang" Cc: general at lists.openfabrics.org Message-ID: Content-Type: text/plain; charset=ISO-8859-1 On Fri, Aug 8, 2008 at 3:29 AM, Wen Hao Wang wrote: >>osmtest uses a non compliant query to get all the paths and likely >>only OpenSM supports this extension. >> >>-- Hal > > OK. I need opensm is set up. The Cisco switch has TopspinOS-2.6.0/build195 > installed. Maybe first I need to find out how to disable embedded SM on the > switch. Yes, you should not run a mix of different flavor SMs in a subnet so if you want to run OpenSM, you need to disable Cisco/Topspin SM. > By the way, how can I know which standby/slave SMs exist in my > cluster? saquery -s will show all SMs (ports with isSM or isSMDisabled capability). For more detail, you can run sminfo on all these. -- Hal > Wen Hao Wang > Email: wangwhao at cn.ibm.com _______________________________________________ general mailing list general at lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general End of general Digest, Vol 19, Issue 31 *************************************** _____________________________________________________________________________ Scanned by IBM Email Security Management Services powered by MessageLabs. For more information please visit http://www.ers.ibm.com _____________________________________________________________________________ _____________________________________________________________________________ Scanned by IBM Email Security Management Services powered by MessageLabs.
For more information please visit http://www.ers.ibm.com _____________________________________________________________________________ From dave.olson at qlogic.com Fri Aug 8 11:06:00 2008 From: dave.olson at qlogic.com (Dave Olson) Date: Fri, 8 Aug 2008 11:06:00 -0700 (PDT) Subject: [ofa-general] Centos 5.2 dmesg Error: ib_ipath: Unknown symbol ipath_init_iba7220_funcs In-Reply-To: References: Message-ID: On Fri, 8 Aug 2008, Robert Dunkley wrote: | Does anyone know what this error means? It means that ipath_iba7220.c wasn't built for some reason. Probably due to CONFIG_PCI_MSI not being defined. That should result in the call not being made from the init code, but apparently that's not working right for some reason. Since you have a mellanox card, you can ignore it. Dave Olson dave.olson at qlogic.com From hal.rosenstock at gmail.com Fri Aug 8 11:07:42 2008 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Fri, 8 Aug 2008 14:07:42 -0400 Subject: [ofa-general] ***SPAM*** Re: opensm hang and osmtest report ERR 0130 In-Reply-To: References: <20080808141114.1AA6DE60A13@openfabrics.org> Message-ID: Hi Yicheng, On Fri, Aug 8, 2008 at 12:18 PM, Yicheng Jia wrote: > > Hi Hal, > > I have a question regarding to this issue. > > If I have both opensm and Cisco managed switch running in the same subnet, This is inadvisable.
There is a set of issues when running SMs of a different flavor in the same subnet. > can opensm automatically detect the Master SM and set itself as the slave > SM? Yes, the SM election process will work properly. It relies on high priority and low GUID. So if OpenSM has either lower priority, or the same priority and a higher GUID, it will become standby. > Or I have to manually disable the SM in the Cisco switch That would be my recommendation: either run all OpenSMs or all Cisco/Topspin SMs in your subnet but not a mix of the two. > in order for the opensm acts properly? Not sure what you mean by OpenSM acting properly. Both SMs have different policies for a number of things beyond the spec. There is an IBTA supplied white paper on management interoperability (http://www.infinibandta.org/newsroom/whitepapers/mgtinterop_final_1.pdf) which I think is available to non members (by registering). -- Hal > Thanks! > Yicheng > > > > general-request at lists.openfabrics.org > Sent by: general-bounces at lists.openfabrics.org > > 08/08/2008 09:06 AM > Please respond to > general at lists.openfabrics.org > > To > general at lists.openfabrics.org > cc > Subject > general Digest, Vol 19, Issue 31 > > > > > > Today's Topics: > > 1. Re: opensm hang and osmtest report ERR 0130 (Hal Rosenstock) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Fri, 8 Aug 2008 08:57:57 -0400 > From: "Hal Rosenstock" > Subject: Re: [ofa-general] opensm hang and osmtest report ERR 0130 > To: "Wen Hao Wang" > Cc: general at lists.openfabrics.org > Message-ID: > > > Content-Type: text/plain; charset=ISO-8859-1 > > On Fri, Aug 8, 2008 at 3:29 AM, Wen Hao Wang wrote: >>>osmtest uses a non compliant query to get all the paths and likely >>>only OpenSM supports this extension. >>> >>>-- Hal >> >> OK. I need opensm is set up. The Cisco switch has TopspinOS-2.6.0/build195 >> installed. Maybe first I need to find out how to disable embedded SM on the >> switch.
From dave.olson at qlogic.com Fri Aug 8 11:12:26 2008 From: dave.olson at qlogic.com (Dave Olson) Date: Fri, 8 Aug 2008 11:12:26 -0700 (PDT) Subject: [ofa-general] FW: An alternative solution to the node name issue in OFED 1.3.1 In-Reply-To: References: <99863D2ED484D449811D97A4C44C9CBD8902B9@EPEXCH2.qlogic.org> Message-ID: On Fri, 8 Aug 2008, Hal Rosenstock wrote: | On Fri, Aug 8, 2008 at 10:12 AM, John Russo wrote: | > Issue: We have found that causes openibd to be started before networking | > and therefore the NodeDescription, when returned from the SM, does not | > always contain the hostname of the system when ibhosts is run.
| > A solution was proposed however I wanted to give an alternative that we | > worked out in case you liked it and wanted to use it instead. | | I would think setting of the NodeDescription in this manner would need | to be done optionally, via a module parameter, with the default being | off. Quite some time ago we had the discussion about it being a system | admin policy/possible security issue to reveal or not reveal the | hostname via similar mechanisms. For a similar reason, this capability | was removed from ICMP. That's addressed by the same mechanism that currently exists in the openibd script. Simply set the node_desc to something other than the hostname. The new behavior occurs only if the node_desc hasn't been explicitly set. If there is strong concern that this leaves a small window in which the hostname is exposed, it could be modified to occur only if the node_desc is set to some well-defined string, such as __HOST__ or something of the sort. I think a module parameter is more than is needed; if added, it should probably default to enabled, since relatively few sites are likely to have security concerns within an IB fabric (as far as exposing hostnames). Dave Olson dave.olson at qlogic.com From hal.rosenstock at gmail.com Fri Aug 8 11:28:24 2008 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Fri, 8 Aug 2008 14:28:24 -0400 Subject: [ofa-general] FW: An alternative solution to the node name issue in OFED 1.3.1 In-Reply-To: References: <99863D2ED484D449811D97A4C44C9CBD8902B9@EPEXCH2.qlogic.org> Message-ID: On Fri, Aug 8, 2008 at 2:12 PM, Dave Olson wrote: > On Fri, 8 Aug 2008, Hal Rosenstock wrote: > > | On Fri, Aug 8, 2008 at 10:12 AM, John Russo wrote: > | > Issue: We have found that causes openibd to be started before networking > | > and therefore the NodeDescription, when returned from the SM, does not > | > always contain the hostname of the system when ibhosts is run.
> | > A solution was proposed however I wanted to give an alternative that we > | > worked out in case you liked it and wanted to use it instead. > | > | I would think setting of the NodeDescription in this manner would need > | to be done optionally, via a module parameter, with the default being > | off. Quite some time ago we had the discussion about it being a system > | admin policy/possible security issue to reveal or not reveal the > | hostname via similar mechanisms. For a similar reason, this capability > | was removed from ICMP. > > That's addressed by the same mechanism that currently exists in the > openibd script. Isn't that user space? Also, this is OFED rather than upstream kernel code. > Simply set the node_desc to something other than > the hostname. The new behavior occurs only if the node_desc > hasn't been explicitly set. > > If there is strong concern that this leaves a small window in which > the hostname is exposed, it could be modified to occur only if > the node_desc is set to some well-defined string, such as __HOST__ > or something of the sort. > I think a module parameter is more than is needed; if added, it > should probably default to enabled, since relatively few sites are likely to > have security concerns within an IB fabric (as far as exposing > hostnames). There was some kernel code which used the system name for an IB agent, and it was rejected by the community for that reason. -- Hal > Dave Olson > dave.olson at qlogic.com > From yosefe at voltaire.com Fri Aug 8 11:55:30 2008 From: yosefe at voltaire.com (Yosef Etigin) Date: Fri, 8 Aug 2008 21:55:30 +0300 Subject: [ofa-general] [PATCH v2] ipiob: fix rtnl deadlock In-Reply-To: References: <4899CF0A.1060509@Voltaire.COM> Message-ID: <32cb786f0808081155o19f8fb9dm217cd6996dffa3e5@mail.gmail.com> How about putting the mtu stuff in another task, that is scheduled on another workqueue, such as the global workqueue? We can use the data field of the work to synchronize with device cleanup.
On Thu, Aug 7, 2008 at 6:26 PM, Roland Dreier wrote: > > > Instead of loop-waiting for the lock, give it up if can't lock. > > Same thing is done in drivers/net/cxgb3/cxgb3_main.c. > > I think this is worse ... now if there's anything (*anything* at all -- > even stuff related to different devices) holding the rtnl lock at the > wrong time, we lose an mtu update. > > I haven't had a chance to look at this in detail yet, but I would really > like to investigate whether we can just avoid the potential deadlock in > some more elegant way. > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From YJia at tmriusa.com Fri Aug 8 12:48:01 2008 From: YJia at tmriusa.com (Yicheng Jia) Date: Fri, 8 Aug 2008 14:48:01 -0500 Subject: [ofa-general] Re: opensm hang and osmtest report ERR 0130 In-Reply-To: Message-ID: > Not sure what you mean by OpenSM acting properly. I need to use OpenSM to work with both managed and unmanaged switches. I have only a single switch in a single subnet. If it detects that there's already a Master SM in the subnet, the OpenSM will elect to be the slave one and wouldn't affect the work of the managed switch. Otherwise it will work as the Master. Does current version of OpenSM have this functionality? > That would be my recommendation: either run all OpenSMs or all > Cisco/Topspin SMs in your subnet but not a mix of the two. I think this is for the condition of multiple subnets. It shouldn't be the case if there's only a single switch in a single subnet. Correct? Thanks! Yicheng "Hal Rosenstock" 08/08/2008 01:06 PM To "Yicheng Jia" cc general at lists.openfabrics.org Subject Re: opensm hang and osmtest report ERR 0130 Hi Yicheng, On Fri, Aug 8, 2008 at 12:18 PM, Yicheng Jia wrote: > > Hi Hal, > > I have a question regarding to this issue. 
From rdreier at cisco.com Fri Aug 8 14:13:36 2008 From: rdreier at cisco.com (Roland Dreier) Date: Fri, 08 Aug 2008 14:13:36 -0700 Subject: [ofa-general] [PATCH v2] ipiob: fix rtnl deadlock In-Reply-To: <32cb786f0808081155o19f8fb9dm217cd6996dffa3e5@mail.gmail.com> (Yosef Etigin's message of "Fri, 8 Aug 2008 21:55:30 +0300") References: <4899CF0A.1060509@Voltaire.COM> <32cb786f0808081155o19f8fb9dm217cd6996dffa3e5@mail.gmail.com> Message-ID: > How about putting the mtu stuff in another task, that is scheduled on > another workqueue, > such as the global workqueue? We can use the data field of the work to > syncronize with device cleanup. I don't think moving to a different workqueue helps, does it? Because we just have to flush *that* workqueue somewhere too. - R. From rdreier at cisco.com Fri Aug 8 14:17:35 2008 From: rdreier at cisco.com (Roland Dreier) Date: Fri, 08 Aug 2008 14:17:35 -0700 Subject: [ofa-general] Re: Converting the Linux.au client/server example to a single post_send... In-Reply-To: <1218138253.8764.34.camel@hermosa.site> (Peter W. Morreale's message of "Thu, 07 Aug 2008 19:44:13 +0000") References: <1218138253.8764.34.camel@hermosa.site> Message-ID: > I am attempting to convert your client/server RDMA example code to use a > single ibv_post_send() instead of the two calls the client code > currently uses. I'm using the example code from your blog: > > http://digitalvampire.org/blog/ > > What do I have to change? I guess you need to create two send work request structs, link the second one into the first one's next member, and call ibv_post_send() with the first struct. > Separate, but related, I do not understand why the server code does a > ibv_post_recv() with the sge.addr set to (buf + sizeof(uint32_t).
This > is apparently setting the address to the second word of the buffer. > Shouldn't this be the first word?

No, the client does an RDMA write to write the first value into the first word, and a send to pass the second value. So the server will get the first value in the beginning of the buffer, and it wants to receive the other value in the second word.

> Perhaps I misunderstand, but it appears that you are telling the server
> to start the receive into the second word. (which implies the data
> transfer would be out of bounds of buf, clearly not the case as the
> original example works.. ???)

No, you are right -- the server wants to receive the data into the second word. This works fine because the server does

	buf = calloc(2, sizeof (uint32_t));

	mr = ibv_reg_mr(pd, buf, 2 * sizeof (uint32_t),

so the buffer and memory region are both two words long.

 - R.

From rdreier at cisco.com Fri Aug 8 14:18:13 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Fri, 08 Aug 2008 14:18:13 -0700
Subject: [ofa-general] Re: [PATCH v2] ib/core: fix for send multicast group send leave retry
In-Reply-To: <4899DB79.2030204@Voltaire.COM> (Yossi Etigin's message of "Wed, 06 Aug 2008 20:12:25 +0300")
References: <4899DB79.2030204@Voltaire.COM>
Message-ID:

Sean, what do you think of this?

> Until now, only if joining a multicast group failed there was a retry
> mechanism.
> This patch will add a mechanism that will retry to leave a multicast
> group before giving up.
> Changes from v1:
>
> - Save the leave state because it's overridden
> - use 'else'
>
> Signed-off-by: Ron Livne
> Signed-off-by: Yossi Etigin
>
>
> Index: b/drivers/infiniband/core/multicast.c
> ===================================================================
> --- a/drivers/infiniband/core/multicast.c	2008-07-07 20:09:15.000000000 +0300
> +++ b/drivers/infiniband/core/multicast.c	2008-08-06 20:08:18.000000000 +0300
> @@ -106,6 +106,8 @@ struct mcast_group {
>  	struct ib_sa_query	*query;
>  	int			query_id;
>  	u16			pkey_index;
> +	u8			leave_state;
> +	int			retries;
>  };
>
>  struct mcast_member {
> @@ -350,6 +352,7 @@ static int send_leave(struct mcast_group
>
>  	rec = group->rec;
>  	rec.join_state = leave_state;
> +	group->leave_state = leave_state;
>
>  	ret = ib_sa_mcmember_rec_query(&sa_client, port->dev->device,
>  				       port->port_num, IB_SA_METHOD_DELETE, &rec,
> @@ -542,7 +545,11 @@ static void leave_handler(int status, st
>  {
>  	struct mcast_group *group = context;
>
> -	mcast_work_handler(&group->work);
> +	if (status && (group->retries > 0)) {
> +		send_leave(group, group->leave_state);
> +		group->retries--;
> +	} else
> +		mcast_work_handler(&group->work);
>  }
>
>  static struct mcast_group *acquire_group(struct mcast_port *port,
> @@ -565,6 +572,7 @@ static struct mcast_group *acquire_group
>  	if (!group)
>  		return NULL;
>
> +	group->retries = 3;
>  	group->port = port;
>  	group->rec.mgid = *mgid;
>  	group->pkey_index = MCAST_INVALID_PKEY_INDEX;
>
> --
> --Yossi
>

From dwilder at us.ibm.com Fri Aug 8 14:46:00 2008
From: dwilder at us.ibm.com (David J. Wilder)
Date: Fri, 08 Aug 2008 14:46:00 -0700
Subject: [ofa-general] [PATCH] Updated - Use vmalloc to alloc the rx_ring
Message-ID: <1218231961.11020.22.camel@wilder.ibm.com>

Roland-

I have incorporated your review comments, thanks again for your input. We have customers that are running UDP applications that require a large receive queue size in order to get the required IB performance.
Please consider this a high severity problem. I would like to target the fix for 2.6.27.

Thank you
Dave.
----------------------------------------------------
To prevent allocation failures for the rx_ring when using non-srq and large recv_queue_size (1K or larger) use vmalloc instead of kcalloc to allocate the rx_ring.

Signed-off-by: David Wilder
---
 drivers/infiniband/ulp/ipoib/ipoib_cm.c |   18 +++++++++++++-----
 1 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index 0f2d304..e464780 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -202,7 +202,7 @@ static void ipoib_cm_free_rx_ring(struct net_device *dev,
 			dev_kfree_skb_any(rx_ring[i].skb);
 		}
 
-	kfree(rx_ring);
+	vfree(rx_ring);
 }
 
 static void ipoib_cm_start_rx_drain(struct ipoib_dev_priv *priv)
@@ -352,9 +352,13 @@ static int ipoib_cm_nonsrq_init_rx(struct net_device *dev, struct ib_cm_id *cm_i
 	int ret;
 	int i;
 
-	rx->rx_ring = kcalloc(ipoib_recvq_size, sizeof *rx->rx_ring, GFP_KERNEL);
-	if (!rx->rx_ring)
+	rx->rx_ring = vmalloc(ipoib_recvq_size * sizeof *rx->rx_ring);
+
+	if (!rx->rx_ring) {
+		printk(KERN_WARNING "ipoib_cm:Allocation of rx_ring failed, %s",
+			"try using a lower value of recv_queue_size.\n");
 		return -ENOMEM;
+	}
 
 	t = kmalloc(sizeof *t, GFP_KERNEL);
 	if (!t) {
@@ -1494,14 +1498,18 @@ static void ipoib_cm_create_srq(struct net_device *dev, int max_sge)
 		return;
 	}
 
-	priv->cm.srq_ring = kzalloc(ipoib_recvq_size * sizeof *priv->cm.srq_ring,
-				    GFP_KERNEL);
+	priv->cm.srq_ring =
+		vmalloc(ipoib_recvq_size * sizeof *priv->cm.srq_ring);
+
 	if (!priv->cm.srq_ring) {
 		printk(KERN_WARNING "%s: failed to allocate CM SRQ ring (%d entries)\n",
 		       priv->ca->name, ipoib_recvq_size);
 		ib_destroy_srq(priv->cm.srq);
 		priv->cm.srq = NULL;
+		return;
 	}
+	memset(priv->cm.srq_ring, 0,
+		ipoib_recvq_size * sizeof *priv->cm.srq_ring);
 }
 
 int ipoib_cm_dev_init(struct net_device *dev)

From
rdreier at cisco.com Fri Aug 8 15:56:40 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Fri, 08 Aug 2008 15:56:40 -0700
Subject: [ofa-general] Re: [PATCH] Updated - Use vmalloc to alloc the rx_ring
In-Reply-To: <1218231961.11020.22.camel@wilder.ibm.com> (David J. Wilder's message of "Fri, 08 Aug 2008 14:46:00 -0700")
References: <1218231961.11020.22.camel@wilder.ibm.com>
Message-ID:

So I applied this patch (slightly munged as below), but a few questions and comments:

> To prevent allocation failures for the rx_ring when using
> non-srq and large recv_queue_size (1K or larger) use
> vmalloc instead of kcalloc to allocate the rx_ring.

Do you really have people running connected mode on non-SRQ-capable HCAs with receive queue lengths >= 1K? It seems that each connection is going to consume ~64MB of memory in such a case, so with even a small number of remote peers, we get into GBs of memory tied up in receive rings. Is this really the best way to get performance?

Also you specifically mention non-SRQ in the changelog but then change the SRQ ring allocation too. I think that change makes sense but I wonder why it is there.

> +		printk(KERN_WARNING "ipoib_cm:Allocation of rx_ring failed, %s",
> +			"try using a lower value of recv_queue_size.\n");

the %s continuation line idiom is interesting... I think it is more idiomatic just to let the compiler concatenate strings, and it generates better code too (it saves having to set up a second string constant and pass the address of that to printk... plus it saves the (trivial) runtime cost of handling the %s format).

commit b1404069f64457c94de241738fdca142c2e5698f
Author: David J. Wilder
Date:   Fri Aug 8 15:51:29 2008 -0700

    IPoIB/cm: Use vmalloc() to allocate rx_rings

    There are users that are running UDP applications that require a large receive queue size in order to get good performance. To prevent allocation failures for rx_rings when using non-SRQ mode and large recv_queue_size (1K or larger), use vmalloc() instead of kcalloc() to allocate rx_rings.

    Signed-off-by: David Wilder
    Signed-off-by: Roland Dreier

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index 7ebc400..341ffed 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -202,7 +202,7 @@ static void ipoib_cm_free_rx_ring(struct net_device *dev,
 			dev_kfree_skb_any(rx_ring[i].skb);
 		}
 
-	kfree(rx_ring);
+	vfree(rx_ring);
 }
 
 static void ipoib_cm_start_rx_drain(struct ipoib_dev_priv *priv)
@@ -352,9 +352,14 @@ static int ipoib_cm_nonsrq_init_rx(struct net_device *dev, struct ib_cm_id *cm_i
 	int ret;
 	int i;
 
-	rx->rx_ring = kcalloc(ipoib_recvq_size, sizeof *rx->rx_ring, GFP_KERNEL);
-	if (!rx->rx_ring)
+	rx->rx_ring = vmalloc(ipoib_recvq_size * sizeof *rx->rx_ring);
+	if (!rx->rx_ring) {
+		printk(KERN_WARNING "%s: failed to allocate CM non-SRQ ring (%d entries)\n",
+		       priv->ca->name, ipoib_recvq_size);
 		return -ENOMEM;
+	}
+
+	memset(rx->rx_ring, 0, ipoib_recvq_size * sizeof *rx->rx_ring);
 
 	t = kmalloc(sizeof *t, GFP_KERNEL);
 	if (!t) {
@@ -1494,14 +1499,16 @@ static void ipoib_cm_create_srq(struct net_device *dev, int max_sge)
 		return;
 	}
 
-	priv->cm.srq_ring = kzalloc(ipoib_recvq_size * sizeof *priv->cm.srq_ring,
-				    GFP_KERNEL);
+	priv->cm.srq_ring = vmalloc(ipoib_recvq_size * sizeof *priv->cm.srq_ring);
 	if (!priv->cm.srq_ring) {
 		printk(KERN_WARNING "%s: failed to allocate CM SRQ ring (%d entries)\n",
 		       priv->ca->name, ipoib_recvq_size);
 		ib_destroy_srq(priv->cm.srq);
 		priv->cm.srq = NULL;
+		return;
 	}
+
+	memset(priv->cm.srq_ring, 0, ipoib_recvq_size * sizeof *priv->cm.srq_ring);
 }
 
 int ipoib_cm_dev_init(struct net_device *dev)

From hal.rosenstock at gmail.com Fri Aug 8 15:39:51 2008
From: hal.rosenstock at gmail.com (Hal Rosenstock)
Date: Fri, 8 Aug 2008 18:39:51 -0400
Subject: [ofa-general] ***SPAM*** Re: opensm hang
and osmtest report ERR 0130 In-Reply-To: References: Message-ID: On Fri, Aug 8, 2008 at 3:48 PM, Yicheng Jia wrote: > >> Not sure what you mean by OpenSM acting properly. > > I need to use OpenSM to work with both managed and unmanaged switches. I > have only a single switch in a single subnet. If it detects that there's > already a Master SM in the subnet, the OpenSM will elect to be the slave one > and wouldn't affect the work of the managed switch. Otherwise it will work > as the Master. Does current version of OpenSM have this functionality? As I wrote before, the election part works (and the master is the high priority or if equal priority low GUID SM) but there is more to it than just this. To go beyond the IBTA recommendation, one needs to go through all the possible pitfalls of a mix of SM flavors on the same subnet. It is a little simpler given your subnet configuration but needs analysis nonetheless. >> That would be my recommendation: and it's not just my recommendation; it's the IBTA's which is what is explained in the document I referenced. >>either run all OpenSMs or all >> Cisco/Topspin SMs in your subnet but not a mix of the two. > > I think this is for the condition of multiple subnets. It shouldn't be the > case if there's only a single switch in a single subnet. Correct? No; I was talking about a single subnet. -- Hal > > Thanks! > Yicheng > > > > "Hal Rosenstock" > > 08/08/2008 01:06 PM > > To > "Yicheng Jia" > cc > general at lists.openfabrics.org > Subject > Re: opensm hang and osmtest report ERR 0130 > > > > > Hi Yicheng, > > On Fri, Aug 8, 2008 at 12:18 PM, Yicheng Jia wrote: >> >> Hi Hal, >> >> I have a question regarding to this issue. >> >> If I have both opensm and Cisco managed switch running in the same subnet, > > This is inadvisable. There are a set of issues when running SMs of a > different flavor in the same subnet. > >> can opensm automatically detect the Master SM and set itself as the slave >> SM? 
>
> Yes, the SM election process will work properly. It relies on high
> priority and low GUID. So if OpenSM either has low priority or same
> priority and low GUID, it will become standby.
>
>> Or I have to manually disable the SM in the Cisco switch
>
> That would be my recommendation: either run all OpenSMs or all
> Cisco/Topspin SMs in your subnet but not a mix of the two.
>
>> in order for the opensm acts properly?
>
> Not sure what you mean by OpenSM acting properly. Both SMs have
> different policies for a number of things beyond the spec. There is an
> IBTA supplied white paper on management interoperability
> (http://www.infinibandta.org/newsroom/whitepapers/mgtinterop_final_1.pdf)
> which I think is available to non members (by registering).
>
> -- Hal
>
>> Thanks!
>> Yicheng
>>
>>
>>
>> general-request at lists.openfabrics.org
>> Sent by: general-bounces at lists.openfabrics.org
>>
>> 08/08/2008 09:06 AM
>> Please respond to
>> general at lists.openfabrics.org
>>
>> To
>> general at lists.openfabrics.org
>> cc
>> Subject
>> general Digest, Vol 19, Issue 31
>>
>>
>>
>>
>>
>> Today's Topics:
>>
>> 1. Re: opensm hang and osmtest report ERR 0130 (Hal Rosenstock)
>>
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Fri, 8 Aug 2008 08:57:57 -0400
>> From: "Hal Rosenstock"
>> Subject: Re: [ofa-general] opensm hang and osmtest report ERR 0130
>> To: "Wen Hao Wang"
>> Cc: general at lists.openfabrics.org
>> Message-ID:
>>
>>
>> Content-Type: text/plain; charset=ISO-8859-1
>>
>> On Fri, Aug 8, 2008 at 3:29 AM, Wen Hao Wang wrote:
>>>> osmtest uses a non compliant query to get all the paths and likely
>>>> only OpenSM supports this extension.
>>>>
>>>> -- Hal
>>>
>>> OK. I need opensm is set up. The Cisco switch has TopspinOS-2.6.0/build195
>>> installed. Maybe first I need to find out how to disable embedded SM on the
>>> switch.
>>
>> Yes, you should not run a mix of different flavor SMs in a subnet so
>> if you want to run OpenSM, you need to disable Cisco/Topspin SM.
>>
>>> By the way, how can I know which standby/slave SMs exist in my
>>> cluster?
>>
>> saquery -s
>> will show all SMs (ports with isSM or isSMDisabled capability).
>> For more detail, you can run sminfo on all these.
>>
>> -- Hal
>>
>>> Wen Hao Wang
>>> Email: wangwhao at cn.ibm.com
>>
>> _____________________________________________________________________________
>> Scanned by IBM Email Security Management Services powered by MessageLabs.
>> For more information please visit http://www.ers.ibm.com
>> _____________________________________________________________________________
>>
>
> _____________________________________________________________________________
> Scanned by IBM Email Security Management Services powered by MessageLabs.
> For more information please visit http://www.ers.ibm.com
> _____________________________________________________________________________
>

From dwilder at us.ibm.com Fri Aug 8 16:47:53 2008
From: dwilder at us.ibm.com (David J. Wilder)
Date: Fri, 08 Aug 2008 16:47:53 -0700
Subject: [ofa-general] Re: [PATCH] Updated - Use vmalloc to alloc the rx_ring
In-Reply-To:
References: <1218231961.11020.22.camel@wilder.ibm.com>
Message-ID: <1218239273.11020.34.camel@wilder.ibm.com>

On Fri, 2008-08-08 at 15:56 -0700, Roland Dreier wrote:
> So I applied this patch (slightly munged as below), but a few questions
> and comments:
>
> > To prevent allocation failures for the rx_ring when using
> > non-srq and large recv_queue_size (1K or larger) use
> > vmalloc instead of kcalloc to allocate the rx_ring.
>
> Do you really have people running connected mode on non-SRQ-capable HCAs
> with receive queue lengths >= 1K? It seems that each connection is
> going to consume ~64MB of memory in such a case, so with even a small
> number of remote peers, we get into GBs of memory tied up in receive
> rings. Is this really the best way to get performance?

Yep, big systems with lots of memory; these monsters can be configured
with up to 128GB. The 1K size is just where we happen to see the
problem; the allocation can fail on smaller kmallocs if system memory
is small and/or fragmented.

>
> Also you specifically mention non-SRQ in the changelog but then change
> the SRQ ring allocation too. I think that change makes sense but I
> wonder why it is there.

I changed both because the free (now vfree) in ipoib_cm_free_rx_ring()
is used to free both types of rings.

>
> > + printk(KERN_WARNING "ipoib_cm:Allocation of rx_ring failed, %s",
> > + "try using a lower value of recv_queue_size.\n");
>
> the %s continuation line idiom is interesting... I think it is more
> idiomatic just to let the compiler concatenate strings, and it generates
> better code too (it saves having to set up a second string constant and
> pass the address of that to printk... plus it saves the (trivial)
> runtime cost of handling the %s format).

I thought I already changed that, thanks for fixing it.

>
>
> commit b1404069f64457c94de241738fdca142c2e5698f
> Author: David J. Wilder
> Date: Fri Aug 8 15:51:29 2008 -0700
>
>     IPoIB/cm: Use vmalloc() to allocate rx_rings
>
>     There are users that are running UDP applications that require a large
>     receive queue size in order to get good performance. To prevent
>     allocation failures for rx_rings when using non-SRQ mode and large
>     recv_queue_size (1K or larger), use vmalloc() instead of kcalloc() to
>     allocate rx_rings.
>
>     Signed-off-by: David Wilder
>     Signed-off-by: Roland Dreier
>
> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
> index 7ebc400..341ffed 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
> @@ -202,7 +202,7 @@ static void ipoib_cm_free_rx_ring(struct net_device *dev,
>  			dev_kfree_skb_any(rx_ring[i].skb);
>  		}
>
> -	kfree(rx_ring);
> +	vfree(rx_ring);
>  }
>
>  static void ipoib_cm_start_rx_drain(struct ipoib_dev_priv *priv)
> @@ -352,9 +352,14 @@ static int ipoib_cm_nonsrq_init_rx(struct net_device *dev, struct ib_cm_id *cm_i
>  	int ret;
>  	int i;
>
> -	rx->rx_ring = kcalloc(ipoib_recvq_size, sizeof *rx->rx_ring, GFP_KERNEL);
> -	if (!rx->rx_ring)
> +	rx->rx_ring = vmalloc(ipoib_recvq_size * sizeof *rx->rx_ring);
> +	if (!rx->rx_ring) {
> +		printk(KERN_WARNING "%s: failed to allocate CM non-SRQ ring (%d entries)\n",
> +		       priv->ca->name, ipoib_recvq_size);
>  		return -ENOMEM;
> +	}
> +
> +	memset(rx->rx_ring, 0, ipoib_recvq_size * sizeof *rx->rx_ring);
>
>  	t = kmalloc(sizeof *t, GFP_KERNEL);
>  	if (!t) {
> @@ -1494,14 +1499,16 @@ static void ipoib_cm_create_srq(struct net_device *dev, int max_sge)
>  		return;
>  	}
>
> -	priv->cm.srq_ring = kzalloc(ipoib_recvq_size * sizeof *priv->cm.srq_ring,
> -				    GFP_KERNEL);
> +	priv->cm.srq_ring = vmalloc(ipoib_recvq_size * sizeof *priv->cm.srq_ring);
>  	if (!priv->cm.srq_ring) {
>  		printk(KERN_WARNING "%s: failed to allocate CM SRQ ring (%d entries)\n",
>  		       priv->ca->name, ipoib_recvq_size);
>  		ib_destroy_srq(priv->cm.srq);
>  		priv->cm.srq = NULL;
> +		return;
>  	}
> +
> +	memset(priv->cm.srq_ring, 0, ipoib_recvq_size * sizeof *priv->cm.srq_ring);
>  }
>
>  int ipoib_cm_dev_init(struct net_device *dev)

From vlad at lists.openfabrics.org Sat Aug 9 02:44:46 2008
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Sat, 9 Aug 2008 02:44:46 -0700 (PDT)
Subject: [ofa-general] ofa_1_4_kernel 20080809-0200 daily build status
Message-ID: <20080809094446.148F7E6033B@openfabrics.org>

This email was generated automatically, please do not reply

git_url: git://git.openfabrics.org/ofed_1_4/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:

Passed:
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.22
Passed on x86_64 with linux-2.6.16
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on x86_64 with linux-2.6.25
Passed on x86_64 with linux-2.6.26
Passed on x86_64 with linux-2.6.24
Passed on x86_64 with linux-2.6.9-67.ELsmp
Passed on ia64 with linux-2.6.16
Passed on ia64 with linux-2.6.18
Passed on ia64 with linux-2.6.17
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.22
Passed on ia64 with linux-2.6.26
Passed on ia64 with linux-2.6.24
Passed on ia64 with linux-2.6.25
Passed on ppc64 with linux-2.6.16
Passed on ppc64 with linux-2.6.24

Failed:
Build failed on x86_64 with linux-2.6.16.21-0.8-smp
Log:
from
/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.21-0.8-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.21-0.8-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.21-0.8-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.16.43-0.3-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.43-0.3-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.43-0.3-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.43-0.3-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -fasynchronous-unwind-tables -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18_x86_64_check/drivers/infiniband] Error 2 make[1]: *** 
[_module_/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.17 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -fomit-frame-pointer -g -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.17_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.17_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.17_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.17_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.17_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.17_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.17_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.17_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.17' make: *** [kernel] Error 2 
---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.16.60-0.21-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.60-0.21-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.60-0.21-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.60-0.21-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-8.el5 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o 
/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-8.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-8.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-1.2798.fc6 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -fasynchronous-unwind-tables -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o 
/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-1.2798.fc6_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-1.2798.fc6' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-53.el5 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o 
/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-53.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-53.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.19 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -maccumulate-outgoing-args -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -fomit-frame-pointer -fasynchronous-unwind-tables -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o 
/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.19_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.19_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.19_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.19_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.19_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.19_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.19_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.19_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.19' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.20 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -maccumulate-outgoing-args -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.20_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o 
/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.20_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.20_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.20_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.20_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.20_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.20_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.20_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.20' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-93.el5 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c 
/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-93.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-93.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-42.ELsmp Log: include/asm/apic.h:47: warning: value computed is not used /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1840: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.9-42.ELsmp_x86_64_check] Error 2 make[1]: Leaving 
directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-42.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-55.ELsmp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: include/linux/skbuff.h: In function 'skb_add_data': include/linux/skbuff.h:1041: warning: pointer targets in passing argument 2 of 'csum_partial_copy_from_user' differ in signedness make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.9-55.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-55.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.16.21-0.8-default Log: from /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.21-0.8-default_ia64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.16.21-0.8-default_ia64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.16.21-0.8-default' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.18-8.el5 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration -Os -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mtune=power4 -mno-altivec -funit-at-a-time -mstring -Wa,-maltivec -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18-8.el5_ppc64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.18-8.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.18 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mtune=power4 -mno-altivec -funit-at-a-time -mstring -Wa,-maltivec -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18_ppc64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18_ppc64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18_ppc64_check/drivers/infiniband] Error 2 make[1]: *** 
[_module_/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.18_ppc64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.18' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.17 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -fomit-frame-pointer -g -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mtune=power4 -mno-altivec -funit-at-a-time -mstring -Wa,-maltivec -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.17_ppc64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.17_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.17_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.17_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.17_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.17_ppc64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.17_ppc64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.17_ppc64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.17' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build 
failed on ppc64 with linux-2.6.19 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mtune=power4 -mno-altivec -funit-at-a-time -mstring -Wa,-maltivec -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_file_ops)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.19_ppc64_check/drivers/infiniband/hw/ipath/.tmp_ipath_file_ops.o /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.19_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.19_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c: In function 'ipath_open': /home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.19_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.c:1827: error: implicit declaration of function 'cycle_kernel_lock' make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.19_ppc64_check/drivers/infiniband/hw/ipath/ipath_file_ops.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.19_ppc64_check/drivers/infiniband/hw/ipath] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.19_ppc64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080809-0200_linux-2.6.19_ppc64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.19' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- From yosefe at voltaire.com Sat Aug 9 05:38:52 2008 From: yosefe at voltaire.com (Yosef Etigin) Date: Sat, 9 Aug 2008 15:38:52 +0300 Subject: [ofa-general] [PATCH v2] ipiob: fix rtnl deadlock In-Reply-To: References: <4899CF0A.1060509@Voltaire.COM> 
<32cb786f0808081155o19f8fb9dm217cd6996dffa3e5@mail.gmail.com> Message-ID: <32cb786f0808090538j272842b1r5117547cccde0d06@mail.gmail.com> On Sat, Aug 9, 2008 at 12:13 AM, Roland Dreier wrote: > > How about putting the mtu stuff in another task, that is scheduled on > > another workqueue, > > such as the global workqueue? We can use the data field of the work to > > syncronize with device cleanup. > > I don't think moving to a different workqueue helps, does it? Because > we just have to flush *that* workqueue somewhere too. > > - R. Yes, but it won't have to be from ipoib_stop, it can be from a place where rtnl_lock is not held.
From ogerlitz at voltaire.com Sun Aug 10 00:22:53 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Sun, 10 Aug 2008 10:22:53 +0300 Subject: [ofa-general] ib_ipoib: Unknown symbol icmpv6_send In-Reply-To: <002301c8f8c7$c86790f0$640fa8c0@ptpdesk> References: <002301c8f8c7$c86790f0$640fa8c0@ptpdesk> Message-ID: <489E974D.2060703@voltaire.com> Hoot Thompson wrote: > I've seen other postings noting a similar error but have not seen a > resolution. When trying to load the ib_ipoib module I get the following > error..... ib_ipoib: Unknown symbol icmpv6_send. How do I clear this error? > It's a SuSE 10 SP1 system. you need to have ipv6 support at your kernel Or. 
From vlad at lists.openfabrics.org Sun Aug 10 02:53:11 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Sun, 10 Aug 2008 02:53:11 -0700 (PDT) Subject: [ofa-general] ofa_1_4_kernel 20080810-0200 daily build status Message-ID: <20080810095311.B7A51E608F0@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_4/linux-2.6.git git_branch: ofed_kernel Common build parameters: Passed: Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.22 Passed on x86_64 with linux-2.6.16 Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.17 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.22.5-31-default Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on ia64 with linux-2.6.16 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.17 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.25 Passed on ia64 with linux-2.6.26 Passed on ia64 with linux-2.6.24 Passed on ppc64 with linux-2.6.16 Passed on ppc64 with linux-2.6.17 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.24 Passed on ppc64 with linux-2.6.19 Failed: Build failed on x86_64 with linux-2.6.16.43-0.3-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: 
/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.43-0.3-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.43-0.3-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.43-0.3-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.16.21-0.8-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.21-0.8-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.21-0.8-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.21-0.8-smp' make: *** 
[kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.16.60-0.21-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.60-0.21-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.60-0.21-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.60-0.21-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-8.el5 Log: from /home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-8.el5_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-8.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-8.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-53.el5 Log: from /home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-53.el5_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-53.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-53.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-93.el5 Log: from /home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from 
/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-93.el5_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-93.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-93.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-55.ELsmp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: include/linux/skbuff.h: In function 'skb_add_data': include/linux/skbuff.h:1041: warning: pointer targets in passing argument 2 of 'csum_partial_copy_from_user' differ in signedness make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.9-55.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-55.ELsmp' make: *** [kernel] Error 2 
---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-42.ELsmp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: include/linux/skbuff.h: In function 'skb_add_data': include/linux/skbuff.h:1013: warning: pointer targets in passing argument 2 of 'csum_partial_copy_from_user' differ in signedness make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.9-42.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-42.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.16.21-0.8-default Log: from /home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.21-0.8-default_ia64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.16.21-0.8-default_ia64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.16.21-0.8-default' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.18-8.el5 Log: from /home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-8.el5_ppc64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080810-0200_linux-2.6.18-8.el5_ppc64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.18-8.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- From ronli at voltaire.com Sun Aug 10 08:39:46 2008 From: ronli at voltaire.com (Ron Livne) Date: Sun, 10 Aug 2008 15:39:46 +0000 (UTC) Subject: [ofa-general] ofed kernel config problem fix Message-ID: Hi Vlad, I pulled today the latest ofed git. When I tried to run "make oldconfig" I got the following message: scripts/kconfig/conf -o arch/x86/Kconfig file drivers/infiniband/hw/nes/Kconfig already scanned? 
make[1]: *** [oldconfig] Error 1
make: *** [oldconfig] Error 2

It seems there was a duplicate line in drivers/infiniband/Kconfig.
Here's how I fixed it:

diff --git a/drivers/infiniband/Kconfig b/drivers/infiniband/Kconfig
index 066ccd4..d7be463 100644
--- a/drivers/infiniband/Kconfig
+++ b/drivers/infiniband/Kconfig
@@ -46,7 +46,6 @@ source "drivers/infiniband/hw/amso1100/Kconfig"
 source "drivers/infiniband/hw/cxgb3/Kconfig"
 source "drivers/infiniband/hw/nes/Kconfig"
 source "drivers/infiniband/hw/mlx4/Kconfig"
-source "drivers/infiniband/hw/nes/Kconfig"
 source "drivers/infiniband/ulp/ipoib/Kconfig"

From jackm at dev.mellanox.co.il Sun Aug 10 06:12:21 2008
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Sun, 10 Aug 2008 16:12:21 +0300
Subject: [ofa-general] CM: possible memory leak introduced by commit 110cf374a809817d5c080c0ac82d65d029820a66 (Roland's tree)
Message-ID: <200808101612.22094.jackm@dev.mellanox.co.il>

I think the following patch introduced a memory leak into the cm.
(commit 110cf374a809817d5c080c0ac82d65d029820a66, committed on July 27):
http://git.kernel.org/?p=linux/kernel/git/roland/infiniband.git;a=commitdiff;h=110cf374a809817d5c080c0ac82d65d029820a66;hp=d4c4196f24ade5f336882587480652efde2c739c

Now, no one seems responsible for freeing the memory allocated by kzalloc in
procedure cm_add_one (file drivers/infiniband/core/cm.c):

	/**** jpm: Who deallocates this kzalloc when the cm device is removed??? */
	cm_dev = kzalloc(sizeof(*cm_dev) + sizeof(*port) *
			 ib_device->phys_port_cnt, GFP_KERNEL);
	if (!cm_dev)
		return;

	cm_dev->ib_device = ib_device;
	cm_get_ack_delay(cm_dev);

	cm_dev->device = device_create_drvdata(&cm_class, &ib_device->dev,
					       MKDEV(0, 0), NULL,
					       "%s", ib_device->name);

Am I correct?
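[Editorial sketch] The ownership question raised above can be illustrated with a small user-space model. This is not the kernel code and not a proposed patch: `model_cm_dev`, `model_port`, `model_add_one`, `model_remove_one` and `model_live_allocs` are hypothetical names, and `calloc`/`free` stand in for `kzalloc`/`kfree`. It only shows the invariant being asked about, namely that a single allocation sized for the device plus its ports needs exactly one owner that releases it on the remove path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-port data that follows the device. */
struct model_port {
    int port_num;
};

/* Hypothetical stand-in for the device: header plus trailing port array,
 * mirroring the sizeof(*cm_dev) + sizeof(*port) * phys_port_cnt sizing. */
struct model_cm_dev {
    size_t port_cnt;
    struct model_port port[];
};

/* Counts allocations that have not been released yet. */
static int live_allocs;

/* Models the add path: one zeroed allocation for device and ports. */
struct model_cm_dev *model_add_one(size_t phys_port_cnt)
{
    struct model_cm_dev *dev;
    size_t i;

    dev = calloc(1, sizeof(*dev) + sizeof(dev->port[0]) * phys_port_cnt);
    if (!dev)
        return NULL;

    live_allocs++;
    dev->port_cnt = phys_port_cnt;
    for (i = 0; i < phys_port_cnt; i++)
        dev->port[i].port_num = (int)i + 1;   /* ports numbered from 1 */
    return dev;
}

/* Models the remove path: the same code that tears the device down also
 * releases the allocation, so ownership is explicit. */
void model_remove_one(struct model_cm_dev *dev)
{
    if (!dev)
        return;
    assert(live_allocs > 0);
    free(dev);
    live_allocs--;
}

/* Reports how many model devices are still allocated. */
int model_live_allocs(void)
{
    return live_allocs;
}
```

In this model, calling `model_add_one(2)` followed by `model_remove_one()` brings `model_live_allocs()` back to zero. If the remove path skipped the `free`, the counter would stay at one, which is the situation the message above is asking about.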
- Jack

From wangwhao at cn.ibm.com Sun Aug 10 19:12:30 2008
From: wangwhao at cn.ibm.com (Wen Hao Wang)
Date: Mon, 11 Aug 2008 10:12:30 +0800
Subject: [ofa-general] opensm hang and osmtest report ERR 0130
In-Reply-To:
Message-ID:

Hal and Yevgeny:

Thanks for your comments!

I have disabled the embedded SM on Cisco switch, and run opensm without argument on one of my servers. The command gave "subnet up" message and hung there. Command output of osmtest also contained errors, while the latest line was OSMTEST: TEST "All Validations" PASS.

Would you please have a look at this, I am not sure whether all have succeeded, or there is still something wrong.

[root at gaia-07 ~]# opensm
-------------------------------------------------
OpenSM 3.1.11
Command Line Arguments:
Log File: /var/log/opensm.log
-------------------------------------------------
OpenSM 3.1.11
Using default GUID 0x2c90300013371
Entering MASTER state
SUBNET UP
(Never end up ...)

[root at gaia-07 ~]# osmtest -f c
Command Line Arguments
Done with args
Flow = Create Inventory
Aug 11 03:43:50 294507 [1B8CC3B0] 0x7f -> Setting log level to: 0x03
Aug 11 03:43:50 313268 [1B8CC3B0] 0x02 -> osm_vendor_bind: Binding to port 0x2c90300013371
Aug 11 03:43:50 336271 [1B8CC3B0] 0x02 -> osmtest_validate_sa_class_port_info:
-----------------------------
SA Class Port Info:
base_ver:1
class_ver:2
cap_mask:0x2602
cap_mask2:0x0
resp_time_val:0x10
-----------------------------
OSMTEST: TEST "Create Inventory" PASS

[root at gaia-07 ~]# osmtest -f a
Command Line Arguments
Done with args
Flow = All Validations
Aug 11 03:33:21 711113 [79D093B0] 0x7f -> Setting log level to: 0x03
Aug 11 03:33:21 729683 [79D093B0] 0x02 -> osm_vendor_bind: Binding to port 0x2c90300013371
Aug 11 03:33:21 752518 [79D093B0] 0x02 -> osmtest_validate_sa_class_port_info:
-----------------------------
SA Class Port Info:
base_ver:1
class_ver:2
cap_mask:0x2602
cap_mask2:0x0
resp_time_val:0x10
-----------------------------
Aug 11 03:33:21 755242 [79D093B0]
0x01 -> osmtest_get_multipath_rec: [[ ===== Expecting Errors - START ===== Aug 11 03:33:21 755337 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR 5501: Remote error:0x0002 Aug 11 03:33:21 755344 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_REMOTE_ERROR) Aug 11 03:33:21 755356 [79D093B0] 0x01 -> osmtest_get_multipath_rec: ERR 0069: ib_query failed (IB_REMOTE_ERROR) Aug 11 03:33:21 755364 [79D093B0] 0x01 -> osmtest_get_multipath_rec: Remote error = IB_SA_MAD_STATUS_REQ_INVALID Aug 11 03:33:21 755370 [79D093B0] 0x01 -> osmtest_get_multipath_rec: Got error IB_REMOTE_ERROR Aug 11 03:33:21 755374 [79D093B0] 0x01 -> osmtest_get_multipath_rec: ===== Expecting Errors - END ===== ]] Aug 11 03:33:21 755378 [79D093B0] 0x01 -> osmtest_get_multipath_rec: [[ ===== Expecting Errors - START ===== Aug 11 03:33:21 755469 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR 5501: Remote error:0x0002 Aug 11 03:33:21 755475 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_REMOTE_ERROR) Aug 11 03:33:21 755486 [79D093B0] 0x01 -> osmtest_get_multipath_rec: ERR 0069: ib_query failed (IB_REMOTE_ERROR) Aug 11 03:33:21 755491 [79D093B0] 0x01 -> osmtest_get_multipath_rec: Remote error = IB_SA_MAD_STATUS_REQ_INVALID Aug 11 03:33:21 755496 [79D093B0] 0x01 -> osmtest_get_multipath_rec: Got error IB_REMOTE_ERROR Aug 11 03:33:21 755500 [79D093B0] 0x01 -> osmtest_get_multipath_rec: ===== Expecting Errors - END ===== ]] Aug 11 03:33:21 755507 [79D093B0] 0x01 -> osmtest_get_multipath_rec: [[ ===== Expecting Errors - START ===== Aug 11 03:33:21 755583 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR 5501: Remote error:0x0005 Aug 11 03:33:21 755589 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_REMOTE_ERROR) Aug 11 03:33:21 755599 [79D093B0] 0x01 -> osmtest_get_multipath_rec: ERR 0069: ib_query failed (IB_REMOTE_ERROR) Aug 11 03:33:21 755604 [79D093B0] 0x01 -> osmtest_get_multipath_rec: Remote error = IB_SA_MAD_STATUS_INVALID_GID Aug 11 03:33:21 
755609 [79D093B0] 0x01 -> osmtest_get_multipath_rec: Got error IB_REMOTE_ERROR Aug 11 03:33:21 755613 [79D093B0] 0x01 -> osmtest_get_multipath_rec: ===== Expecting Errors - END ===== ]] Aug 11 03:33:21 755617 [79D093B0] 0x01 -> osmtest_get_multipath_rec: [[ ===== Expecting Errors - START ===== Aug 11 03:33:21 755692 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR 5501: Remote error:0x0005 Aug 11 03:33:21 755699 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_REMOTE_ERROR) Aug 11 03:33:21 755708 [79D093B0] 0x01 -> osmtest_get_multipath_rec: ERR 0069: ib_query failed (IB_REMOTE_ERROR) Aug 11 03:33:21 755713 [79D093B0] 0x01 -> osmtest_get_multipath_rec: Remote error = IB_SA_MAD_STATUS_INVALID_GID Aug 11 03:33:21 755718 [79D093B0] 0x01 -> osmtest_get_multipath_rec: Got error IB_REMOTE_ERROR Aug 11 03:33:21 755722 [79D093B0] 0x01 -> osmtest_get_multipath_rec: ===== Expecting Errors - END ===== ]] Aug 11 03:33:21 756481 [79D093B0] 0x01 -> osmtest_get_multipath_rec: [[ ===== Expecting Errors - START ===== Aug 11 03:33:21 756557 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR 5501: Remote error:0x0002 Aug 11 03:33:21 756564 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_REMOTE_ERROR) Aug 11 03:33:21 756574 [79D093B0] 0x01 -> osmtest_get_pkeytbl_rec_by_lid: ERR 007F: ib_query failed (IB_REMOTE_ERROR) Aug 11 03:33:21 756579 [79D093B0] 0x01 -> osmtest_get_pkeytbl_rec_by_lid: Remote error = IB_SA_MAD_STATUS_REQ_INVALID Aug 11 03:33:21 756584 [79D093B0] 0x01 -> osmtest_get_multipath_rec: Got error IB_INSUFFICIENT_MEMORY Aug 11 03:33:21 756588 [79D093B0] 0x01 -> osmtest_get_multipath_rec: ===== Expecting Errors - END ===== ]] Aug 11 03:33:21 757513 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR 5501: Remote error:0x0C00 Aug 11 03:33:21 757520 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_REMOTE_ERROR) Aug 11 03:33:21 757530 [79D093B0] 0x01 -> osmtest_sminfo_record_request: ERR 008D: ib_query failed (IB_REMOTE_ERROR) 
Aug 11 03:33:21 757536 [79D093B0] 0x01 -> osmtest_sminfo_record_request: Remote error = IB_MAD_STATUS_UNSUP_METHOD_ATTR Aug 11 03:33:21 757541 [79D093B0] 0x01 -> osmtest_sminfo_request: IS EXPECTED ERROR ^^^^ Aug 11 03:33:21 759085 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR 5501: Remote error:0x0C00 Aug 11 03:33:21 759092 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_REMOTE_ERROR) Aug 11 03:33:21 759102 [79D093B0] 0x01 -> osmtest_informinfo_request: ERR 008F: ib_query failed (IB_REMOTE_ERROR) Aug 11 03:33:21 759107 [79D093B0] 0x01 -> osmtest_informinfo_request: Remote error = IB_MAD_STATUS_UNSUP_METHOD_ATTR Aug 11 03:33:21 759112 [79D093B0] 0x01 -> osmtest_informinfo_request: InformInfoRecord IS EXPECTED ERROR ^^^^ Aug 11 03:33:21 759248 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR 5501: Remote error:0x0C00 Aug 11 03:33:21 759254 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_REMOTE_ERROR) Aug 11 03:33:21 759264 [79D093B0] 0x01 -> osmtest_informinfo_request: ERR 008F: ib_query failed (IB_REMOTE_ERROR) Aug 11 03:33:21 759269 [79D093B0] 0x01 -> osmtest_informinfo_request: Remote error = IB_MAD_STATUS_UNSUP_METHOD_ATTR Aug 11 03:33:21 759274 [79D093B0] 0x01 -> osmtest_informinfo_request: InformInfo IS EXPECTED ERROR ^^^^ Aug 11 03:33:21 759353 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR 5501: Remote error:0x0002 Aug 11 03:33:21 759360 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_REMOTE_ERROR) Aug 11 03:33:21 759370 [79D093B0] 0x01 -> osmtest_informinfo_request: ERR 008F: ib_query failed (IB_REMOTE_ERROR) Aug 11 03:33:21 759375 [79D093B0] 0x01 -> osmtest_informinfo_request: Remote error = IB_SA_MAD_STATUS_REQ_INVALID Aug 11 03:33:21 759379 [79D093B0] 0x01 -> osmtest_informinfo_request: InformInfo UnSubscribe IS EXPECTED ERROR ^^^^ Aug 11 03:33:21 796288 [79D093B0] 0x02 -> osmtest_wrong_sm_key_ignored: Trying PortInfoRecord for port with LID 0x3 Num:0x1 Aug 11 03:33:21 796292 [79D093B0] 0x01 -> 
osmtest_wrong_sm_key_ignored: [[ ===== Expecting Errors - START ===== Aug 11 03:33:25 793475 [41BC7940] 0x01 -> umad_receiver: ERR 5409: send completed with error (method=0x1 attr=0x12 trans_id=0xca00000206) -- dropping Aug 11 03:33:25 793488 [41BC7940] 0x01 -> umad_receiver: ERR 5410: class 0x3 LID 0x3 Aug 11 03:33:25 793494 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_TIMEOUT) Aug 11 03:33:25 793507 [79D093B0] 0x01 -> osmtest_wrong_sm_key_ignored: ===== Expecting Errors - END ===== ]] Aug 11 03:33:25 793525 [79D093B0] 0x02 -> osmt_register_service: Registering service: name: osmt.srvc.1804289383.10242 id: 0x6b8b1d65 Aug 11 03:33:25 793615 [79D093B0] 0x02 -> osmt_register_service: Registering service: name: osmt.srvc.846930885.10242 id: 0x327afbc3 Aug 11 03:33:25 793706 [79D093B0] 0x02 -> osmt_register_service: Registering service: name: osmt.srvc.1681692775.10242 id: 0x643c7065 Aug 11 03:33:25 793783 [79D093B0] 0x02 -> osmt_register_service_with_data: Registering service: name: osmt.srvc.1714636912.10242 id: 0x6633206e Aug 11 03:33:25 793869 [79D093B0] 0x02 -> osmt_register_service_with_data: Registering service: name: osmt.srvc.1714636912.10242 id: 0x6633206e Aug 11 03:33:25 793947 [79D093B0] 0x02 -> osmt_register_service_with_full_key: Registering service: name: osmt.srvc.424238330.10242 id: 0x194934f8 Aug 11 03:33:25 794032 [79D093B0] 0x02 -> osmt_register_service_with_full_key: Registering service: name: osmt.srvc.719885380.10242 id: 0x194934f8 Aug 11 03:33:35 795534 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR 5501: Remote error:0x0003 Aug 11 03:33:35 795548 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_REMOTE_ERROR) Aug 11 03:33:35 795580 [79D093B0] 0x01 -> osmt_get_service_by_name: IS EXPECTED ERROR ^^^^ Aug 11 03:33:35 795729 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR 5501: Remote error:0x0003 Aug 11 03:33:35 795736 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_REMOTE_ERROR) Aug 11 
03:33:35 795748 [79D093B0] 0x01 -> osmt_get_service_by_id: IS EXPECTED ERROR ^^^^ Aug 11 03:33:35 795967 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR 5501: Remote error:0x0003 Aug 11 03:33:35 795974 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_REMOTE_ERROR) Aug 11 03:33:35 795987 [79D093B0] 0x01 -> osmt_get_service_by_id_and_name: IS EXPECTED ERROR ^^^^ Aug 11 03:33:35 796054 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR 5501: Remote error:0x0003 Aug 11 03:33:35 796060 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_REMOTE_ERROR) Aug 11 03:33:35 796073 [79D093B0] 0x01 -> osmt_get_service_by_id_and_name: IS EXPECTED ERROR ^^^^ Aug 11 03:33:35 796141 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR 5501: Remote error:0x0003 Aug 11 03:33:35 796147 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_REMOTE_ERROR) Aug 11 03:33:35 796160 [79D093B0] 0x01 -> osmt_get_service_by_name: IS EXPECTED ERROR ^^^^ Aug 11 03:33:35 796227 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR 5501: Remote error:0x0003 Aug 11 03:33:35 796233 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_REMOTE_ERROR) Aug 11 03:33:35 796246 [79D093B0] 0x01 -> osmt_get_service_by_name: IS EXPECTED ERROR ^^^^ Aug 11 03:33:35 796389 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR 5501: Remote error:0x0003 Aug 11 03:33:35 796395 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_REMOTE_ERROR) Aug 11 03:33:35 796407 [79D093B0] 0x01 -> osmt_get_service_by_name_and_key: IS EXPECTED ERROR ^^^^ Aug 11 03:33:35 796476 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR 5501: Remote error:0x0003 Aug 11 03:33:35 796483 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_REMOTE_ERROR) Aug 11 03:33:35 796495 [79D093B0] 0x01 -> osmt_get_service_by_name_and_key: IS EXPECTED ERROR ^^^^ Aug 11 03:33:35 796679 [79D093B0] 0x02 -> osmt_delete_service_by_name: Trying to Delete service name: osmt.srvc.1804289383.10242 Aug 11 
03:33:35 796884 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR 5501: Remote error:0x0003 Aug 11 03:33:35 796891 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_REMOTE_ERROR) Aug 11 03:33:35 796903 [79D093B0] 0x01 -> osmt_get_service_by_name: IS EXPECTED ERROR ^^^^ Aug 11 03:33:35 796973 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR 5501: Remote error:0x0003 Aug 11 03:33:35 796979 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_REMOTE_ERROR) Aug 11 03:33:35 796991 [79D093B0] 0x01 -> osmt_get_service_by_name: IS EXPECTED ERROR ^^^^ Aug 11 03:33:35 797073 [79D093B0] 0x02 -> osmt_delete_service_by_name: Trying to Delete service name: osmt.srvc.424238330.10242 Aug 11 03:33:35 797134 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR 5501: Remote error:0x0003 Aug 11 03:33:35 797140 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_REMOTE_ERROR) Aug 11 03:33:35 797153 [79D093B0] 0x01 -> osmt_get_service_by_name: IS EXPECTED ERROR ^^^^ Aug 11 03:33:35 797220 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR 5501: Remote error:0x0003 Aug 11 03:33:35 797227 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_REMOTE_ERROR) Aug 11 03:33:35 797239 [79D093B0] 0x01 -> osmt_run_service_records_flow: IS EXPECTED ERROR ^^^^ Aug 11 03:33:35 797244 [79D093B0] 0x02 -> osmt_delete_service_by_name: Failed to delete service_name: osmt.srvc.424238330.10242 Aug 11 03:33:35 797249 [79D093B0] 0x02 -> osmt_delete_service_by_name: Trying to Delete service name: osmt.srvc.1681692775.10242 Aug 11 03:33:35 797399 [79D093B0] 0x02 -> osmt_delete_service_by_name: Trying to Delete service name: osmt.srvc.719885380.10242 Aug 11 03:33:35 797547 [79D093B0] 0x02 -> osmtest_run: The event forwarding flow is not implemented yet! 
OSMTEST: TEST "All Validations" PASS [root at gaia-07 ~]# tail /var/log/opensm.log Aug 11 03:33:21 755421 [45F72940] 0x01 -> osm_mpr_rcv_process_cb: ERR 4512: __osm_mpr_rcv_get_end_points failed, not enough GIDs (nsrc 1 ndest 0) Aug 11 03:33:21 756141 [4196B940] 0x01 -> osm_pir_rcv_process: ERR 2109: No port found with LID 0xffff Aug 11 03:33:21 756525 [44B70940] 0x01 -> osm_pkey_rec_rcv_process ERR 4608: Request from non-trusted requester: Given SM_Key:0x0000000000000000 Aug 11 03:33:21 756652 [45571940] 0x01 -> osm_pkey_rec_rcv_process: ERR 460B: No port found with LID 0xffff Aug 11 03:33:21 757479 [4196B940] 0x01 -> osm_smir_rcv_process: ERR 2804: Unsupported Method (SubnAdmSet) Aug 11 03:33:21 759319 [4236C940] 0x01 -> osm_infr_rcv_process_set_method: ERR 4307: Failed to UnSubscribe to non existing inform object Aug 11 03:33:21 796326 [47D75940] 0x01 -> __osm_sa_mad_ctrl_rcv_callback: ERR 1A04: Non-Zero SA MAD SM_Key: 0xf27000000000000 != SM_Key: 0x100000000000000; MAD ignored Aug 11 03:33:22 795985 [47D75940] 0x01 -> __osm_sa_mad_ctrl_rcv_callback: ERR 1A04: Non-Zero SA MAD SM_Key: 0xf27000000000000 != SM_Key: 0x100000000000000; MAD ignored Aug 11 03:33:23 795486 [47D75940] 0x01 -> __osm_sa_mad_ctrl_rcv_callback: ERR 1A04: Non-Zero SA MAD SM_Key: 0xf27000000000000 != SM_Key: 0x100000000000000; MAD ignored Aug 11 03:33:24 794985 [47D75940] 0x01 -> __osm_sa_mad_ctrl_rcv_callback: ERR 1A04: Non-Zero SA MAD SM_Key: 0xf27000000000000 != SM_Key: 0x100000000000000; MAD ignored Wen Hao Wang Email: wangwhao at cn.ibm.com
From sean.hefty at intel.com Sun Aug 10 22:31:36 2008 From: sean.hefty at intel.com (Hefty, Sean) Date: Sun, 10 Aug 2008 22:31:36 -0700 Subject: [ofa-general] RE: possible memory leak introduced by commit 110cf374a809817d5c080c0ac82d65d029820a66 (Roland's tree) In-Reply-To: <200808101612.22094.jackm@dev.mellanox.co.il> References: <200808101612.22094.jackm@dev.mellanox.co.il> Message-ID: >Now, no one seems responsible for freeing the memory allocated by >kzalloc in procedure cm_add_one (file drivers/infiniband/core/cm.c): > > /**** jpm: Who deallocates this kzalloc when the cm device is removed??? */ > cm_dev = kzalloc(sizeof(*cm_dev) + sizeof(*port) * > ib_device->phys_port_cnt, GFP_KERNEL); > if (!cm_dev) > return; > > cm_dev->ib_device = ib_device; > cm_get_ack_delay(cm_dev); > > cm_dev->device = device_create_drvdata(&cm_class, &ib_device->dev, > MKDEV(0, 0), NULL, > "%s", ib_device->name); > >Am I correct? It looks like it. I will check into this more on Monday. Thanks. - Sean (Sorry if this breaks the thread.) From sean.hefty at intel.com Sun Aug 10 22:32:41 2008 From: sean.hefty at intel.com (Hefty, Sean) Date: Sun, 10 Aug 2008 22:32:41 -0700 Subject: [ofa-general] RE: [PATCH v2] ib/core: fix for send multicast group send leave retry In-Reply-To: References: <4899DB79.2030204@Voltaire.COM> Message-ID: > > @@ -542,7 +545,11 @@ static void leave_handler(int status, st > > { > > struct mcast_group *group = context; > > > > - mcast_work_handler(&group->work); > > + if (status && (group->retries > 0)) { > > + send_leave(group, group->leave_state); I didn't catch this in my earlier response, but we should call mcast_work_handler() if send_leave() fails to make sure that we reset the group state back to idle and process any queued joins. The rest of the changes look okay to me. - Sean (Sorry if this breaks threading.) 
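To illustrate the review comment on the leave-retry patch above, here is a minimal userspace sketch in C. It is a model for discussion, not the kernel code in drivers/infiniband/core/multicast.c. It assumes that leave_handler() decrements group->retries before resending (the quoted hunk does not show this), and it passes the group directly to mcast_work_handler() rather than &group->work. The group_state enum, the send_should_fail knob and the sends counter are hypothetical names added only for the example. The point it shows is the fallback: if the resend itself fails, the work handler still runs so the group goes back to idle.

```c
#include <stdio.h>

/* Hypothetical userspace model of the multicast group state machine. */
enum group_state { GROUP_IDLE, GROUP_BUSY };

struct mcast_group {
    enum group_state state;
    int retries;           /* remaining leave retries (assumed field use) */
    int leave_state;
    int send_should_fail;  /* test knob, not part of the real structure */
    int sends;             /* counts send_leave() attempts, for inspection */
};

/* Stub: pretend to post a leave request; returns 0 on success. */
int send_leave(struct mcast_group *group, int leave_state)
{
    (void)leave_state;
    group->sends++;
    return group->send_should_fail ? -1 : 0;
}

/* Stub: the real handler also processes queued joins; here it only
 * resets the group to idle. */
void mcast_work_handler(struct mcast_group *group)
{
    group->state = GROUP_IDLE;
}

/* Completion handler for a leave request. */
void leave_handler(int status, struct mcast_group *group)
{
    if (status && group->retries > 0) {
        group->retries--;  /* assumption: one retry consumed per resend */
        if (!send_leave(group, group->leave_state))
            return;        /* resend posted; its completion calls us again */
        /* resend failed: fall through so the group is not left busy */
    }
    mcast_work_handler(group);
}
```

With send_should_fail set, a failed completion leads to one resend attempt and then the group returns to GROUP_IDLE. With it cleared, the group stays busy until the resent leave completes.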
From alekseys at voltaire.com Sun Aug 10 23:13:28 2008 From: alekseys at voltaire.com (Aleksey Senin) Date: Mon, 11 Aug 2008 09:13:28 +0300 Subject: [ofa-general] IPv6 RDMA CM PATCHv2 Message-ID: <1218435208.8137.13.camel@linux-zn6t.site> This series of patches implements IPv6 support for RDMA CM and removes the limitations of the first version of the patch. It can handle link-local addresses, recognizes the zero address (bind any), and supports running the server and client on the same machine. All tests were performed with a modified version of the rping command; I'll send the rping patches later for review. Every patch introduces a small logical change to the kernel, but if necessary I can merge them into a single patch. I'll be very glad to get any comments on the patch. This patch is necessary because the US government requires IPv6 support in all products. From alekseys at voltaire.com Sun Aug 10 23:15:11 2008 From: alekseys at voltaire.com (Aleksey Senin) Date: Mon, 11 Aug 2008 09:15:11 +0300 Subject: [ofa-general] IPv6 RDMA CM PATCHv2 In-Reply-To: <1218435208.8137.13.camel@linux-zn6t.site> References: <1218435208.8137.13.camel@linux-zn6t.site> Message-ID: <1218435311.10251.0.camel@linux-zn6t.site> >From bef62874ab2205affd690685b358edcafdd5cf9f Mon Sep 17 00:00:00 2001 From: Aleksey Senin Date: Mon, 4 Aug 2008 17:47:04 +0300 Subject: [IPv6 RDMA CM PATCHv2 1/8] Using sockaddr_storage instead of padding arrays in addr_req structure Signed-off-by: Aleksey Senin --- drivers/infiniband/core/addr.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c index 09a2bec..c5b623b 100644 --- a/drivers/infiniband/core/addr.c +++ b/drivers/infiniband/core/addr.c @@ -49,8 +49,8 @@ MODULE_LICENSE("Dual BSD/GPL"); struct addr_req { struct list_head list; - struct sockaddr src_addr; - struct sockaddr dst_addr; + struct sockaddr_storage src_addr; + struct sockaddr_storage dst_addr; struct rdma_dev_addr 
*addr; struct rdma_addr_client *client; void *context; -- 1.5.6.dirty From alekseys at voltaire.com Sun Aug 10 23:15:45 2008 From: alekseys at voltaire.com (Aleksey Senin) Date: Mon, 11 Aug 2008 09:15:45 +0300 Subject: [ofa-general] IPv6 RDMA CM PATCHv2 In-Reply-To: <1218435208.8137.13.camel@linux-zn6t.site> References: <1218435208.8137.13.camel@linux-zn6t.site> Message-ID: <1218435345.10251.2.camel@linux-zn6t.site> >From dc81233eb41501948d0086a5f5c2d1ea9d54c831 Mon Sep 17 00:00:00 2001 From: Aleksey Senin Date: Tue, 5 Aug 2008 13:45:23 +0300 Subject: [IPv6 RDMA CM PATCHv2 2/8] In order to prepare RDMA CM work with IPv6 these functions changed to obtain as argument struct sockaddr* pointer and not sockaddr_in* addr_resolve_remote addr_resolve_local rdma_resolve_ip Changes in process_req function are side effect of modifications in functions above. Signed-off-by: Aleksey Senin --- drivers/infiniband/core/addr.c | 46 ++++++++++++++++++++-------------------- 1 files changed, 23 insertions(+), 23 deletions(-) diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c index c5b623b..b59ad53 100644 --- a/drivers/infiniband/core/addr.c +++ b/drivers/infiniband/core/addr.c @@ -171,12 +171,12 @@ static void addr_send_arp(struct sockaddr_in *dst_in) ip_rt_put(rt); } -static int addr_resolve_remote(struct sockaddr_in *src_in, - struct sockaddr_in *dst_in, +static int addr_resolve_remote(struct sockaddr *src_in, + struct sockaddr *dst_in, struct rdma_dev_addr *addr) { - __be32 src_ip = src_in->sin_addr.s_addr; - __be32 dst_ip = dst_in->sin_addr.s_addr; + __be32 src_ip = ((struct sockaddr_in *)src_in)->sin_addr.s_addr; + __be32 dst_ip = ((struct sockaddr_in *)dst_in)->sin_addr.s_addr; struct flowi fl; struct rtable *rt; struct neighbour *neigh; @@ -207,8 +207,8 @@ static int addr_resolve_remote(struct sockaddr_in *src_in, } if (!src_ip) { - src_in->sin_family = dst_in->sin_family; - src_in->sin_addr.s_addr = rt->rt_src; + src_in->sa_family = dst_in->sa_family; 
+ ((struct sockaddr_in *)src_in)->sin_addr.s_addr = rt->rt_src; } ret = rdma_copy_addr(addr, neigh->dev, neigh->ha); @@ -223,7 +223,7 @@ out: static void process_req(struct work_struct *work) { struct addr_req *req, *temp_req; - struct sockaddr_in *src_in, *dst_in; + struct sockaddr *src_in, *dst_in; struct list_head done_list; INIT_LIST_HEAD(&done_list); @@ -231,8 +231,8 @@ static void process_req(struct work_struct *work) mutex_lock(&lock); list_for_each_entry_safe(req, temp_req, &req_list, list) { if (req->status == -ENODATA) { - src_in = (struct sockaddr_in *) &req->src_addr; - dst_in = (struct sockaddr_in *) &req->dst_addr; + src_in = (struct sockaddr *) &req->src_addr; + dst_in = (struct sockaddr *) &req->dst_addr; req->status = addr_resolve_remote(src_in, dst_in, req->addr); if (req->status && time_after_eq(jiffies, req->timeout)) @@ -251,20 +251,20 @@ static void process_req(struct work_struct *work) list_for_each_entry_safe(req, temp_req, &done_list, list) { list_del(&req->list); - req->callback(req->status, &req->src_addr, req->addr, - req->context); + req->callback(req->status, (struct sockaddr *) &req->src_addr, \ + req->addr, req->context); put_client(req->client); kfree(req); } } -static int addr_resolve_local(struct sockaddr_in *src_in, - struct sockaddr_in *dst_in, +static int addr_resolve_local(struct sockaddr *src_in, + struct sockaddr *dst_in, struct rdma_dev_addr *addr) { struct net_device *dev; - __be32 src_ip = src_in->sin_addr.s_addr; - __be32 dst_ip = dst_in->sin_addr.s_addr; + __be32 src_ip = ((struct sockaddr_in *)src_in)->sin_addr.s_addr; + __be32 dst_ip = ((struct sockaddr_in *)dst_in)->sin_addr.s_addr; int ret; dev = ip_dev_find(&init_net, dst_ip); @@ -272,15 +272,15 @@ static int addr_resolve_local(struct sockaddr_in *src_in, return -EADDRNOTAVAIL; if (ipv4_is_zeronet(src_ip)) { - src_in->sin_family = dst_in->sin_family; - src_in->sin_addr.s_addr = dst_ip; + src_in->sa_family = dst_in->sa_family; + ((struct sockaddr_in 
*)src_in)->sin_addr.s_addr = dst_ip; ret = rdma_copy_addr(addr, dev, dev->dev_addr); } else if (ipv4_is_loopback(src_ip)) { - ret = rdma_translate_ip((struct sockaddr *)dst_in, addr); + ret = rdma_translate_ip(dst_in, addr); if (!ret) memcpy(addr->dst_dev_addr, dev->dev_addr, MAX_ADDR_LEN); } else { - ret = rdma_translate_ip((struct sockaddr *)src_in, addr); + ret = rdma_translate_ip(src_in, addr); if (!ret) memcpy(addr->dst_dev_addr, dev->dev_addr, MAX_ADDR_LEN); } @@ -296,7 +296,7 @@ int rdma_resolve_ip(struct rdma_addr_client *client, struct rdma_dev_addr *addr, void *context), void *context) { - struct sockaddr_in *src_in, *dst_in; + struct sockaddr *src_in, *dst_in; struct addr_req *req; int ret = 0; @@ -313,8 +313,8 @@ int rdma_resolve_ip(struct rdma_addr_client *client, req->client = client; atomic_inc(&client->refcount); - src_in = (struct sockaddr_in *) &req->src_addr; - dst_in = (struct sockaddr_in *) &req->dst_addr; + src_in = (struct sockaddr *) &req->src_addr; + dst_in = (struct sockaddr *) &req->dst_addr; req->status = addr_resolve_local(src_in, dst_in, addr); if (req->status == -EADDRNOTAVAIL) @@ -328,7 +328,7 @@ int rdma_resolve_ip(struct rdma_addr_client *client, case -ENODATA: req->timeout = msecs_to_jiffies(timeout_ms) + jiffies; queue_req(req); - addr_send_arp(dst_in); + addr_send_arp((struct sockaddr_in *)dst_in); break; default: ret = req->status; -- 1.5.6.dirty From alekseys at voltaire.com Sun Aug 10 23:16:01 2008 From: alekseys at voltaire.com (Aleksey Senin) Date: Mon, 11 Aug 2008 09:16:01 +0300 Subject: [ofa-general] IPv6 RDMA CM PATCHv2 In-Reply-To: <1218435208.8137.13.camel@linux-zn6t.site> References: <1218435208.8137.13.camel@linux-zn6t.site> Message-ID: <1218435361.10251.4.camel@linux-zn6t.site> >From 3fc41a2249f29429dd84d95ea34c4ab96ca275c1 Mon Sep 17 00:00:00 2001 From: Aleksey Senin Date: Tue, 5 Aug 2008 18:39:41 +0300 Subject: [IPv6 RDMA CM PATCHv2 3/8] Added IPv6 support in rdma_translate_ip function Signed-off-by: Aleksey Senin 
--- drivers/infiniband/core/addr.c | 30 +++++++++++++++++++++++------- 1 files changed, 23 insertions(+), 7 deletions(-) diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c index b59ad53..f95d21f 100644 --- a/drivers/infiniband/core/addr.c +++ b/drivers/infiniband/core/addr.c @@ -41,6 +41,7 @@ #include #include #include +#include #include MODULE_AUTHOR("Sean Hefty"); @@ -113,15 +114,30 @@ EXPORT_SYMBOL(rdma_copy_addr); int rdma_translate_ip(struct sockaddr *addr, struct rdma_dev_addr *dev_addr) { struct net_device *dev; - __be32 ip = ((struct sockaddr_in *) addr)->sin_addr.s_addr; - int ret; + int ret = -EADDRNOTAVAIL; - dev = ip_dev_find(&init_net, ip); - if (!dev) - return -EADDRNOTAVAIL; + switch (addr->sa_family) { + case AF_INET: + dev = ip_dev_find(&init_net, + ((struct sockaddr_in *) addr)->sin_addr.s_addr); - ret = rdma_copy_addr(dev_addr, dev, NULL); - dev_put(dev); + if (!dev) + return -EADDRNOTAVAIL; + + ret = rdma_copy_addr(dev_addr, dev, NULL); + dev_put(dev); + break; + case AF_INET6: + for_each_netdev(&init_net, dev) { + if (ipv6_chk_addr(&init_net, &((struct sockaddr_in6 *) addr)->sin6_addr, dev, 1)) { + ret = rdma_copy_addr(dev_addr, dev, NULL); + break; + } + } + break; + default: + break; + } return ret; } EXPORT_SYMBOL(rdma_translate_ip); -- 1.5.6.dirty From alekseys at voltaire.com Sun Aug 10 23:17:18 2008 From: alekseys at voltaire.com (Aleksey Senin) Date: Mon, 11 Aug 2008 09:17:18 +0300 Subject: [ofa-general] IPv6 RDMA CM PATCHv2 4/8 In-Reply-To: <1218435361.10251.4.camel@linux-zn6t.site> References: <1218435208.8137.13.camel@linux-zn6t.site> <1218435361.10251.4.camel@linux-zn6t.site> Message-ID: <1218435438.10251.6.camel@linux-zn6t.site> >From 5a9a0d1d769f0296bf0fca3ad488cd971cf396ab Mon Sep 17 00:00:00 2001 From: Aleksey Senin Date: Wed, 6 Aug 2008 15:44:33 +0300 Subject: [IPv6 RDMA CM PATCHv2 4/8] Added AF_INET6 family case to rdma_bind_addr Signed-off-by: Aleksey Senin --- drivers/infiniband/core/cma.c | 2 +- 1 
files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index d951896..4728265 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -2073,7 +2073,7 @@ int rdma_bind_addr(struct rdma_cm_id *id, struct sockaddr *addr) struct rdma_id_private *id_priv; int ret; - if (addr->sa_family != AF_INET) + if (addr->sa_family != AF_INET && addr->sa_family != AF_INET6) return -EAFNOSUPPORT; id_priv = container_of(id, struct rdma_id_private, id); -- 1.5.6.dirty From alekseys at voltaire.com Sun Aug 10 23:17:54 2008 From: alekseys at voltaire.com (Aleksey Senin) Date: Mon, 11 Aug 2008 09:17:54 +0300 Subject: [ofa-general] IPv6 RDMA CM PATCHv2 5/8 In-Reply-To: <1218435438.10251.6.camel@linux-zn6t.site> References: <1218435208.8137.13.camel@linux-zn6t.site> <1218435361.10251.4.camel@linux-zn6t.site> <1218435438.10251.6.camel@linux-zn6t.site> Message-ID: <1218435474.10251.8.camel@linux-zn6t.site> >From 73cc25d25b53fde8da3c1c33d5cfcab120074f4d Mon Sep 17 00:00:00 2001 From: Aleksey Senin Date: Wed, 6 Aug 2008 16:34:01 +0300 Subject: [IPv6 RDMA CM PATCHv2 5/8] Added AF_INET6 case to cma_format_hdr function Signed-off-by: Aleksey Senin --- drivers/infiniband/core/cma.c | 75 +++++++++++++++++++++++++++++------------ 1 files changed, 53 insertions(+), 22 deletions(-) diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index 4728265..ec0855f 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -2113,32 +2113,63 @@ EXPORT_SYMBOL(rdma_bind_addr); static int cma_format_hdr(void *hdr, enum rdma_port_space ps, struct rdma_route *route) { - struct sockaddr_in *src4, *dst4; struct cma_hdr *cma_hdr; struct sdp_hh *sdp_hdr; - src4 = (struct sockaddr_in *) &route->addr.src_addr; - dst4 = (struct sockaddr_in *) &route->addr.dst_addr; - - switch (ps) { - case RDMA_PS_SDP: - sdp_hdr = hdr; - if (sdp_get_majv(sdp_hdr->sdp_version) != SDP_MAJ_VERSION) - return 
-EINVAL; - sdp_set_ip_ver(sdp_hdr, 4); - sdp_hdr->src_addr.ip4.addr = src4->sin_addr.s_addr; - sdp_hdr->dst_addr.ip4.addr = dst4->sin_addr.s_addr; - sdp_hdr->port = src4->sin_port; - break; - default: - cma_hdr = hdr; - cma_hdr->cma_version = CMA_VERSION; - cma_set_ip_ver(cma_hdr, 4); - cma_hdr->src_addr.ip4.addr = src4->sin_addr.s_addr; - cma_hdr->dst_addr.ip4.addr = dst4->sin_addr.s_addr; - cma_hdr->port = src4->sin_port; - break; + if (route->addr.src_addr.ss_family == AF_INET) { + struct sockaddr_in *src4, *dst4; + + src4 = (struct sockaddr_in *) &route->addr.src_addr; + dst4 = (struct sockaddr_in *) &route->addr.dst_addr; + + switch (ps) { + case RDMA_PS_SDP: + sdp_hdr = hdr; + if (sdp_get_majv(sdp_hdr->sdp_version) != SDP_MAJ_VERSION) + return -EINVAL; + sdp_set_ip_ver(sdp_hdr, 4); + sdp_hdr->src_addr.ip4.addr = src4->sin_addr.s_addr; + sdp_hdr->dst_addr.ip4.addr = dst4->sin_addr.s_addr; + sdp_hdr->port = src4->sin_port; + break; + default: + cma_hdr = hdr; + cma_hdr->cma_version = CMA_VERSION; + cma_set_ip_ver(cma_hdr, 4); + cma_hdr->src_addr.ip4.addr = src4->sin_addr.s_addr; + cma_hdr->dst_addr.ip4.addr = dst4->sin_addr.s_addr; + cma_hdr->port = src4->sin_port; + break; + } + } else if (route->addr.src_addr.ss_family == AF_INET6) { + struct sockaddr_in6 *src6, *dst6; + + src6 = (struct sockaddr_in6 *) &route->addr.src_addr; + dst6 = (struct sockaddr_in6 *) &route->addr.dst_addr; + + switch (ps) { + case RDMA_PS_SDP: + sdp_hdr = hdr; + if (sdp_get_majv(sdp_hdr->sdp_version) != SDP_MAJ_VERSION) + return -EINVAL; + sdp_set_ip_ver(sdp_hdr, 6); + sdp_hdr->src_addr.ip6 = src6->sin6_addr; + sdp_hdr->dst_addr.ip6 = dst6->sin6_addr; + sdp_hdr->port = src6->sin6_port; + break; + default: + cma_hdr = hdr; + cma_hdr->cma_version = CMA_VERSION; + cma_set_ip_ver(cma_hdr, 6); + cma_hdr->src_addr.ip6 = src6->sin6_addr; + cma_hdr->dst_addr.ip6 = dst6->sin6_addr; + cma_hdr->port = src6->sin6_port; + break; } + return 0; + } else + return -EAFNOSUPPORT; + return 0; } -- 
1.5.6.dirty From alekseys at voltaire.com Sun Aug 10 23:18:34 2008 From: alekseys at voltaire.com (Aleksey Senin) Date: Mon, 11 Aug 2008 09:18:34 +0300 Subject: [ofa-general] IPv6 RDMA CM PATCHv2 6/8 In-Reply-To: <1218435474.10251.8.camel@linux-zn6t.site> References: <1218435208.8137.13.camel@linux-zn6t.site> <1218435361.10251.4.camel@linux-zn6t.site> <1218435438.10251.6.camel@linux-zn6t.site> <1218435474.10251.8.camel@linux-zn6t.site> Message-ID: <1218435514.10251.10.camel@linux-zn6t.site> >From 65d27ef169a9b1c8c6582c9267ca6183b3fe11ab Mon Sep 17 00:00:00 2001 From: Aleksey Senin Date: Thu, 7 Aug 2008 09:13:18 +0300 Subject: [IPv6 RDMA CM PATCHv2 6/8] Using sockaddr_storage structure instead of sockaddr_in for catching IPv6 protocol in cma_bind_any function Signed-off-by: Aleksey Senin --- drivers/infiniband/core/cma.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index ec0855f..6d0daa5 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -1467,10 +1467,10 @@ static void cma_listen_on_all(struct rdma_id_private *id_priv) static int cma_bind_any(struct rdma_cm_id *id, sa_family_t af) { - struct sockaddr_in addr_in; + struct sockaddr_storage addr_in; memset(&addr_in, 0, sizeof addr_in); - addr_in.sin_family = af; + addr_in.ss_family = af; return rdma_bind_addr(id, (struct sockaddr *) &addr_in); } -- 1.5.6.dirty From alekseys at voltaire.com Sun Aug 10 23:19:12 2008 From: alekseys at voltaire.com (Aleksey Senin) Date: Mon, 11 Aug 2008 09:19:12 +0300 Subject: [ofa-general] IPv6 RDMA CM PATCHv2 7/8 In-Reply-To: <1218435514.10251.10.camel@linux-zn6t.site> References: <1218435208.8137.13.camel@linux-zn6t.site> <1218435361.10251.4.camel@linux-zn6t.site> <1218435438.10251.6.camel@linux-zn6t.site> <1218435474.10251.8.camel@linux-zn6t.site> <1218435514.10251.10.camel@linux-zn6t.site> Message-ID: <1218435552.10251.12.camel@linux-zn6t.site> >From 
f31fc16d06a5f02a7d362ae6ee688b1dc039b0a6 Mon Sep 17 00:00:00 2001 From: Aleksey Senin Date: Thu, 7 Aug 2008 10:04:17 +0300 Subject: [IPv6 RDMA CM PATCHv2 7/8] IPv6 support in addr_resolve_local function New cma_ipv6_dev_find function for searching device with specified network address ( like ip_dev_find in IPv4 ) I'm in doubt if it really should be realized as function, but I used it more then once so... Signed-off-by: Aleksey Senin --- drivers/infiniband/core/addr.c | 94 ++++++++++++++++++++++++++++----------- 1 files changed, 67 insertions(+), 27 deletions(-) diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c index f95d21f..00dcb22 100644 --- a/drivers/infiniband/core/addr.c +++ b/drivers/infiniband/core/addr.c @@ -111,6 +111,15 @@ int rdma_copy_addr(struct rdma_dev_addr *dev_addr, struct net_device *dev, } EXPORT_SYMBOL(rdma_copy_addr); +static inline struct net_device *cma_ipv6_dev_find(struct in6_addr *addr) +{ + struct net_device *dev = 0; + for_each_netdev(&init_net, dev) + if (ipv6_chk_addr(&init_net, addr, dev, 1)) + return dev; + return NULL; +} + int rdma_translate_ip(struct sockaddr *addr, struct rdma_dev_addr *dev_addr) { struct net_device *dev; @@ -128,12 +137,13 @@ int rdma_translate_ip(struct sockaddr *addr, struct rdma_dev_addr *dev_addr) dev_put(dev); break; case AF_INET6: - for_each_netdev(&init_net, dev) { - if (ipv6_chk_addr(&init_net, &((struct sockaddr_in6 *) addr)->sin6_addr, dev, 1)) { - ret = rdma_copy_addr(dev_addr, dev, NULL); - break; - } - } + dev = cma_ipv6_dev_find( + &((struct sockaddr_in6 *)addr)->sin6_addr); + + if (!dev) + return -EADDRNOTAVAIL; + + ret = rdma_copy_addr(dev_addr, dev, NULL); break; default: break; @@ -279,30 +289,60 @@ static int addr_resolve_local(struct sockaddr *src_in, struct rdma_dev_addr *addr) { struct net_device *dev; - __be32 src_ip = ((struct sockaddr_in *)src_in)->sin_addr.s_addr; - __be32 dst_ip = ((struct sockaddr_in *)dst_in)->sin_addr.s_addr; - int ret; + int ret = 
-EADDRNOTAVAIL; - dev = ip_dev_find(&init_net, dst_ip); - if (!dev) - return -EADDRNOTAVAIL; + if (dst_in->sa_family == AF_INET) { + __be32 src_ip = ((struct sockaddr_in *)src_in)->sin_addr.s_addr; + __be32 dst_ip = ((struct sockaddr_in *)dst_in)->sin_addr.s_addr; - if (ipv4_is_zeronet(src_ip)) { - src_in->sa_family = dst_in->sa_family; - ((struct sockaddr_in *)src_in)->sin_addr.s_addr = dst_ip; - ret = rdma_copy_addr(addr, dev, dev->dev_addr); - } else if (ipv4_is_loopback(src_ip)) { - ret = rdma_translate_ip(dst_in, addr); - if (!ret) - memcpy(addr->dst_dev_addr, dev->dev_addr, MAX_ADDR_LEN); - } else { - ret = rdma_translate_ip(src_in, addr); - if (!ret) - memcpy(addr->dst_dev_addr, dev->dev_addr, MAX_ADDR_LEN); - } + dev = ip_dev_find(&init_net, dst_ip); + if (!dev) + return -EADDRNOTAVAIL; - dev_put(dev); - return ret; + if (ipv4_is_zeronet(src_ip)) { + src_in->sa_family = dst_in->sa_family; + ((struct sockaddr_in *)src_in)->sin_addr.s_addr = dst_ip; + ret = rdma_copy_addr(addr, dev, dev->dev_addr); + } else if (ipv4_is_loopback(src_ip)) { + ret = rdma_translate_ip(dst_in, addr); + if (!ret) + memcpy(addr->dst_dev_addr, dev->dev_addr, MAX_ADDR_LEN); + } else { + ret = rdma_translate_ip(src_in, addr); + if (!ret) + memcpy(addr->dst_dev_addr, dev->dev_addr, MAX_ADDR_LEN); + } + + dev_put(dev); + return ret; + } else if (dst_in->sa_family == AF_INET6) { + struct in6_addr *a = &((struct sockaddr_in6 *)dst_in)->sin6_addr; + + dev = cma_ipv6_dev_find(a); + + if (!dev) + return -EADDRNOTAVAIL; + + a = &((struct sockaddr_in6 *)src_in)->sin6_addr; + + if (ipv6_addr_any(a)) { + src_in->sa_family = dst_in->sa_family; + ((struct sockaddr_in6 *)src_in)->sin6_addr = + ((struct sockaddr_in6 *)dst_in)->sin6_addr; + ret = rdma_copy_addr(addr, dev, dev->dev_addr); + } else if (ipv6_addr_loopback(a)) { + ret = rdma_translate_ip(dst_in, addr); + if (!ret) + memcpy(addr->dst_dev_addr, dev->dev_addr, MAX_ADDR_LEN); + } else { + ret = rdma_translate_ip(src_in, addr); + if (!ret) + 
memcpy(addr->dst_dev_addr, dev->dev_addr, MAX_ADDR_LEN); + } + + return ret; + } + return -EADDRNOTAVAIL; } int rdma_resolve_ip(struct rdma_addr_client *client, -- 1.5.6.dirty From kliteyn at dev.mellanox.co.il Sun Aug 10 23:19:27 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Mon, 11 Aug 2008 09:19:27 +0300 Subject: [ofa-general] opensm hang and osmtest report ERR 0130 In-Reply-To: References: Message-ID: <489FD9EF.1000207@dev.mellanox.co.il> Wen Hao Wang wrote: > Hal and Yevgeny: > > Thanks for your comments! > > I have disabled the embedded SM on Cisco switch, and run opensm without > argument on one of my servers. The command gave "subnet up" message and > hung there. Command output of osmtest also contained errors, while the > latest line was /OSMTEST: TEST "All Validations" PASS/. Would you please > have a look at this, I am not sure whether all have succeeded, or there > is still something wrong. It's OK, everything worked fine. Osmtest does some wrong stuff intentionally, just to see that opensm can handle it correctly - note all the "expecting errors" messages in the osmtest log. Same goes for errors in the opensm log, such as invalid sm_key - osmtest intentionally sends these "wrong" packets. -- Yevgeny > [root at gaia-07 ~]# opensm > ------------------------------------------------- > OpenSM 3.1.11 > Command Line Arguments: > Log File: /var/log/opensm.log > ------------------------------------------------- > OpenSM 3.1.11 > > Using default GUID 0x2c90300013371 > Entering MASTER state > > SUBNET UP > > (Never end up ...) 
> [root at gaia-07 ~]# osmtest -f c > > Command Line Arguments > Done with args > Flow = Create Inventory > Aug 11 03:43:50 294507 [1B8CC3B0] 0x7f -> Setting log level to: 0x03 > Aug 11 03:43:50 313268 [1B8CC3B0] 0x02 -> osm_vendor_bind: Binding to > port 0x2c90300013371 > Aug 11 03:43:50 336271 [1B8CC3B0] 0x02 -> > osmtest_validate_sa_class_port_info: > ----------------------------- > SA Class Port Info: > base_ver:1 > class_ver:2 > cap_mask:0x2602 > cap_mask2:0x0 > resp_time_val:0x10 > ----------------------------- > OSMTEST: TEST "Create Inventory" PASS > [root at gaia-07 ~]# osmtest -f a > > Command Line Arguments > Done with args > Flow = All Validations > Aug 11 03:33:21 711113 [79D093B0] 0x7f -> Setting log level to: 0x03 > Aug 11 03:33:21 729683 [79D093B0] 0x02 -> osm_vendor_bind: Binding to > port 0x2c90300013371 > Aug 11 03:33:21 752518 [79D093B0] 0x02 -> > osmtest_validate_sa_class_port_info: > ----------------------------- > SA Class Port Info: > base_ver:1 > class_ver:2 > cap_mask:0x2602 > cap_mask2:0x0 > resp_time_val:0x10 > ----------------------------- > Aug 11 03:33:21 755242 [79D093B0] 0x01 -> osmtest_get_multipath_rec: [[ > ===== Expecting Errors - START ===== > Aug 11 03:33:21 755337 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR > 5501: Remote error:0x0002 > Aug 11 03:33:21 755344 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR > 0003: Error on query (IB_REMOTE_ERROR) > Aug 11 03:33:21 755356 [79D093B0] 0x01 -> osmtest_get_multipath_rec: ERR > 0069: ib_query failed (IB_REMOTE_ERROR) > Aug 11 03:33:21 755364 [79D093B0] 0x01 -> osmtest_get_multipath_rec: > Remote error = IB_SA_MAD_STATUS_REQ_INVALID > Aug 11 03:33:21 755370 [79D093B0] 0x01 -> osmtest_get_multipath_rec: Got > error IB_REMOTE_ERROR > Aug 11 03:33:21 755374 [79D093B0] 0x01 -> osmtest_get_multipath_rec: > ===== Expecting Errors - END ===== ]] > Aug 11 03:33:21 755378 [79D093B0] 0x01 -> osmtest_get_multipath_rec: [[ > ===== Expecting Errors - START ===== > Aug 11 03:33:21 755469 [41BC7940] 
0x01 -> __osmv_sa_mad_rcv_cb: ERR > 5501: Remote error:0x0002 > Aug 11 03:33:21 755475 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR > 0003: Error on query (IB_REMOTE_ERROR) > Aug 11 03:33:21 755486 [79D093B0] 0x01 -> osmtest_get_multipath_rec: ERR > 0069: ib_query failed (IB_REMOTE_ERROR) > Aug 11 03:33:21 755491 [79D093B0] 0x01 -> osmtest_get_multipath_rec: > Remote error = IB_SA_MAD_STATUS_REQ_INVALID > Aug 11 03:33:21 755496 [79D093B0] 0x01 -> osmtest_get_multipath_rec: Got > error IB_REMOTE_ERROR > Aug 11 03:33:21 755500 [79D093B0] 0x01 -> osmtest_get_multipath_rec: > ===== Expecting Errors - END ===== ]] > Aug 11 03:33:21 755507 [79D093B0] 0x01 -> osmtest_get_multipath_rec: [[ > ===== Expecting Errors - START ===== > Aug 11 03:33:21 755583 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR > 5501: Remote error:0x0005 > Aug 11 03:33:21 755589 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR > 0003: Error on query (IB_REMOTE_ERROR) > Aug 11 03:33:21 755599 [79D093B0] 0x01 -> osmtest_get_multipath_rec: ERR > 0069: ib_query failed (IB_REMOTE_ERROR) > Aug 11 03:33:21 755604 [79D093B0] 0x01 -> osmtest_get_multipath_rec: > Remote error = IB_SA_MAD_STATUS_INVALID_GID > Aug 11 03:33:21 755609 [79D093B0] 0x01 -> osmtest_get_multipath_rec: Got > error IB_REMOTE_ERROR > Aug 11 03:33:21 755613 [79D093B0] 0x01 -> osmtest_get_multipath_rec: > ===== Expecting Errors - END ===== ]] > Aug 11 03:33:21 755617 [79D093B0] 0x01 -> osmtest_get_multipath_rec: [[ > ===== Expecting Errors - START ===== > Aug 11 03:33:21 755692 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR > 5501: Remote error:0x0005 > Aug 11 03:33:21 755699 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR > 0003: Error on query (IB_REMOTE_ERROR) > Aug 11 03:33:21 755708 [79D093B0] 0x01 -> osmtest_get_multipath_rec: ERR > 0069: ib_query failed (IB_REMOTE_ERROR) > Aug 11 03:33:21 755713 [79D093B0] 0x01 -> osmtest_get_multipath_rec: > Remote error = IB_SA_MAD_STATUS_INVALID_GID > Aug 11 03:33:21 755718 [79D093B0] 0x01 -> 
osmtest_get_multipath_rec: Got > error IB_REMOTE_ERROR > Aug 11 03:33:21 755722 [79D093B0] 0x01 -> osmtest_get_multipath_rec: > ===== Expecting Errors - END ===== ]] > Aug 11 03:33:21 756481 [79D093B0] 0x01 -> osmtest_get_multipath_rec: [[ > ===== Expecting Errors - START ===== > Aug 11 03:33:21 756557 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR > 5501: Remote error:0x0002 > Aug 11 03:33:21 756564 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR > 0003: Error on query (IB_REMOTE_ERROR) > Aug 11 03:33:21 756574 [79D093B0] 0x01 -> > osmtest_get_pkeytbl_rec_by_lid: ERR 007F: ib_query failed (IB_REMOTE_ERROR) > Aug 11 03:33:21 756579 [79D093B0] 0x01 -> > osmtest_get_pkeytbl_rec_by_lid: Remote error = IB_SA_MAD_STATUS_REQ_INVALID > Aug 11 03:33:21 756584 [79D093B0] 0x01 -> osmtest_get_multipath_rec: Got > error IB_INSUFFICIENT_MEMORY > Aug 11 03:33:21 756588 [79D093B0] 0x01 -> osmtest_get_multipath_rec: > ===== Expecting Errors - END ===== ]] > Aug 11 03:33:21 757513 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR > 5501: Remote error:0x0C00 > Aug 11 03:33:21 757520 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR > 0003: Error on query (IB_REMOTE_ERROR) > Aug 11 03:33:21 757530 [79D093B0] 0x01 -> osmtest_sminfo_record_request: > ERR 008D: ib_query failed (IB_REMOTE_ERROR) > Aug 11 03:33:21 757536 [79D093B0] 0x01 -> osmtest_sminfo_record_request: > Remote error = IB_MAD_STATUS_UNSUP_METHOD_ATTR > Aug 11 03:33:21 757541 [79D093B0] 0x01 -> osmtest_sminfo_request: IS > EXPECTED ERROR ^^^^ > Aug 11 03:33:21 759085 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR > 5501: Remote error:0x0C00 > Aug 11 03:33:21 759092 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR > 0003: Error on query (IB_REMOTE_ERROR) > Aug 11 03:33:21 759102 [79D093B0] 0x01 -> osmtest_informinfo_request: > ERR 008F: ib_query failed (IB_REMOTE_ERROR) > Aug 11 03:33:21 759107 [79D093B0] 0x01 -> osmtest_informinfo_request: > Remote error = IB_MAD_STATUS_UNSUP_METHOD_ATTR > Aug 11 03:33:21 759112 [79D093B0] 0x01 -> 
osmtest_informinfo_request: > InformInfoRecord IS EXPECTED ERROR ^^^^ > Aug 11 03:33:21 759248 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR > 5501: Remote error:0x0C00 > Aug 11 03:33:21 759254 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR > 0003: Error on query (IB_REMOTE_ERROR) > Aug 11 03:33:21 759264 [79D093B0] 0x01 -> osmtest_informinfo_request: > ERR 008F: ib_query failed (IB_REMOTE_ERROR) > Aug 11 03:33:21 759269 [79D093B0] 0x01 -> osmtest_informinfo_request: > Remote error = IB_MAD_STATUS_UNSUP_METHOD_ATTR > Aug 11 03:33:21 759274 [79D093B0] 0x01 -> osmtest_informinfo_request: > InformInfo IS EXPECTED ERROR ^^^^ > Aug 11 03:33:21 759353 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR > 5501: Remote error:0x0002 > Aug 11 03:33:21 759360 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR > 0003: Error on query (IB_REMOTE_ERROR) > Aug 11 03:33:21 759370 [79D093B0] 0x01 -> osmtest_informinfo_request: > ERR 008F: ib_query failed (IB_REMOTE_ERROR) > Aug 11 03:33:21 759375 [79D093B0] 0x01 -> osmtest_informinfo_request: > Remote error = IB_SA_MAD_STATUS_REQ_INVALID > Aug 11 03:33:21 759379 [79D093B0] 0x01 -> osmtest_informinfo_request: > InformInfo UnSubscribe IS EXPECTED ERROR ^^^^ > Aug 11 03:33:21 796288 [79D093B0] 0x02 -> osmtest_wrong_sm_key_ignored: > Trying PortInfoRecord for port with LID 0x3 Num:0x1 > Aug 11 03:33:21 796292 [79D093B0] 0x01 -> osmtest_wrong_sm_key_ignored: > [[ ===== Expecting Errors - START ===== > Aug 11 03:33:25 793475 [41BC7940] 0x01 -> umad_receiver: ERR 5409: send > completed with error (method=0x1 attr=0x12 trans_id=0xca00000206) -- > dropping > Aug 11 03:33:25 793488 [41BC7940] 0x01 -> umad_receiver: ERR 5410: class > 0x3 LID 0x3 > Aug 11 03:33:25 793494 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR > 0003: Error on query (IB_TIMEOUT) > Aug 11 03:33:25 793507 [79D093B0] 0x01 -> osmtest_wrong_sm_key_ignored: > ===== Expecting Errors - END ===== ]] > Aug 11 03:33:25 793525 [79D093B0] 0x02 -> osmt_register_service: > Registering service: name: 
osmt.srvc.1804289383.10242 id: 0x6b8b1d65 > Aug 11 03:33:25 793615 [79D093B0] 0x02 -> osmt_register_service: > Registering service: name: osmt.srvc.846930885.10242 id: 0x327afbc3 > Aug 11 03:33:25 793706 [79D093B0] 0x02 -> osmt_register_service: > Registering service: name: osmt.srvc.1681692775.10242 id: 0x643c7065 > Aug 11 03:33:25 793783 [79D093B0] 0x02 -> > osmt_register_service_with_data: Registering service: name: > osmt.srvc.1714636912.10242 id: 0x6633206e > Aug 11 03:33:25 793869 [79D093B0] 0x02 -> > osmt_register_service_with_data: Registering service: name: > osmt.srvc.1714636912.10242 id: 0x6633206e > Aug 11 03:33:25 793947 [79D093B0] 0x02 -> > osmt_register_service_with_full_key: Registering service: name: > osmt.srvc.424238330.10242 id: 0x194934f8 > Aug 11 03:33:25 794032 [79D093B0] 0x02 -> > osmt_register_service_with_full_key: Registering service: name: > osmt.srvc.719885380.10242 id: 0x194934f8 > Aug 11 03:33:35 795534 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR > 5501: Remote error:0x0003 > Aug 11 03:33:35 795548 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR > 0003: Error on query (IB_REMOTE_ERROR) > Aug 11 03:33:35 795580 [79D093B0] 0x01 -> osmt_get_service_by_name: IS > EXPECTED ERROR ^^^^ > Aug 11 03:33:35 795729 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR > 5501: Remote error:0x0003 > Aug 11 03:33:35 795736 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR > 0003: Error on query (IB_REMOTE_ERROR) > Aug 11 03:33:35 795748 [79D093B0] 0x01 -> osmt_get_service_by_id: IS > EXPECTED ERROR ^^^^ > Aug 11 03:33:35 795967 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR > 5501: Remote error:0x0003 > Aug 11 03:33:35 795974 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR > 0003: Error on query (IB_REMOTE_ERROR) > Aug 11 03:33:35 795987 [79D093B0] 0x01 -> > osmt_get_service_by_id_and_name: IS EXPECTED ERROR ^^^^ > Aug 11 03:33:35 796054 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR > 5501: Remote error:0x0003 > Aug 11 03:33:35 796060 [41BC7940] 0x01 -> 
osmtest_query_res_cb: ERR > 0003: Error on query (IB_REMOTE_ERROR) > Aug 11 03:33:35 796073 [79D093B0] 0x01 -> > osmt_get_service_by_id_and_name: IS EXPECTED ERROR ^^^^ > Aug 11 03:33:35 796141 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR > 5501: Remote error:0x0003 > Aug 11 03:33:35 796147 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR > 0003: Error on query (IB_REMOTE_ERROR) > Aug 11 03:33:35 796160 [79D093B0] 0x01 -> osmt_get_service_by_name: IS > EXPECTED ERROR ^^^^ > Aug 11 03:33:35 796227 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR > 5501: Remote error:0x0003 > Aug 11 03:33:35 796233 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR > 0003: Error on query (IB_REMOTE_ERROR) > Aug 11 03:33:35 796246 [79D093B0] 0x01 -> osmt_get_service_by_name: IS > EXPECTED ERROR ^^^^ > Aug 11 03:33:35 796389 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR > 5501: Remote error:0x0003 > Aug 11 03:33:35 796395 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR > 0003: Error on query (IB_REMOTE_ERROR) > Aug 11 03:33:35 796407 [79D093B0] 0x01 -> > osmt_get_service_by_name_and_key: IS EXPECTED ERROR ^^^^ > Aug 11 03:33:35 796476 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR > 5501: Remote error:0x0003 > Aug 11 03:33:35 796483 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR > 0003: Error on query (IB_REMOTE_ERROR) > Aug 11 03:33:35 796495 [79D093B0] 0x01 -> > osmt_get_service_by_name_and_key: IS EXPECTED ERROR ^^^^ > Aug 11 03:33:35 796679 [79D093B0] 0x02 -> osmt_delete_service_by_name: > Trying to Delete service name: osmt.srvc.1804289383.10242 > Aug 11 03:33:35 796884 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR > 5501: Remote error:0x0003 > Aug 11 03:33:35 796891 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR > 0003: Error on query (IB_REMOTE_ERROR) > Aug 11 03:33:35 796903 [79D093B0] 0x01 -> osmt_get_service_by_name: IS > EXPECTED ERROR ^^^^ > Aug 11 03:33:35 796973 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR > 5501: Remote error:0x0003 > Aug 11 03:33:35 796979 [41BC7940] 0x01 -> 
osmtest_query_res_cb: ERR > 0003: Error on query (IB_REMOTE_ERROR) > Aug 11 03:33:35 796991 [79D093B0] 0x01 -> osmt_get_service_by_name: IS > EXPECTED ERROR ^^^^ > Aug 11 03:33:35 797073 [79D093B0] 0x02 -> osmt_delete_service_by_name: > Trying to Delete service name: osmt.srvc.424238330.10242 > Aug 11 03:33:35 797134 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR > 5501: Remote error:0x0003 > Aug 11 03:33:35 797140 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR > 0003: Error on query (IB_REMOTE_ERROR) > Aug 11 03:33:35 797153 [79D093B0] 0x01 -> osmt_get_service_by_name: IS > EXPECTED ERROR ^^^^ > Aug 11 03:33:35 797220 [41BC7940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR > 5501: Remote error:0x0003 > Aug 11 03:33:35 797227 [41BC7940] 0x01 -> osmtest_query_res_cb: ERR > 0003: Error on query (IB_REMOTE_ERROR) > Aug 11 03:33:35 797239 [79D093B0] 0x01 -> osmt_run_service_records_flow: > IS EXPECTED ERROR ^^^^ > Aug 11 03:33:35 797244 [79D093B0] 0x02 -> osmt_delete_service_by_name: > Failed to delete service_name: osmt.srvc.424238330.10242 > Aug 11 03:33:35 797249 [79D093B0] 0x02 -> osmt_delete_service_by_name: > Trying to Delete service name: osmt.srvc.1681692775.10242 > Aug 11 03:33:35 797399 [79D093B0] 0x02 -> osmt_delete_service_by_name: > Trying to Delete service name: osmt.srvc.719885380.10242 > Aug 11 03:33:35 797547 [79D093B0] 0x02 -> osmtest_run: The event > forwarding flow is not implemented yet! 
> OSMTEST: TEST "All Validations" PASS > [root at gaia-07 ~]# tail /var/log/opensm.log > Aug 11 03:33:21 755421 [45F72940] 0x01 -> osm_mpr_rcv_process_cb: ERR > 4512: __osm_mpr_rcv_get_end_points failed, not enough GIDs (nsrc 1 ndest 0) > Aug 11 03:33:21 756141 [4196B940] 0x01 -> osm_pir_rcv_process: ERR 2109: > No port found with LID 0xffff > Aug 11 03:33:21 756525 [44B70940] 0x01 -> osm_pkey_rec_rcv_process ERR > 4608: Request from non-trusted requester: Given SM_Key:0x0000000000000000 > Aug 11 03:33:21 756652 [45571940] 0x01 -> osm_pkey_rec_rcv_process: ERR > 460B: No port found with LID 0xffff > Aug 11 03:33:21 757479 [4196B940] 0x01 -> osm_smir_rcv_process: ERR > 2804: Unsupported Method (SubnAdmSet) > Aug 11 03:33:21 759319 [4236C940] 0x01 -> > osm_infr_rcv_process_set_method: ERR 4307: Failed to UnSubscribe to non > existing inform object > Aug 11 03:33:21 796326 [47D75940] 0x01 -> > __osm_sa_mad_ctrl_rcv_callback: ERR 1A04: Non-Zero SA MAD SM_Key: > 0xf27000000000000 != SM_Key: 0x100000000000000; MAD ignored > Aug 11 03:33:22 795985 [47D75940] 0x01 -> > __osm_sa_mad_ctrl_rcv_callback: ERR 1A04: Non-Zero SA MAD SM_Key: > 0xf27000000000000 != SM_Key: 0x100000000000000; MAD ignored > Aug 11 03:33:23 795486 [47D75940] 0x01 -> > __osm_sa_mad_ctrl_rcv_callback: ERR 1A04: Non-Zero SA MAD SM_Key: > 0xf27000000000000 != SM_Key: 0x100000000000000; MAD ignored > Aug 11 03:33:24 794985 [47D75940] 0x01 -> > __osm_sa_mad_ctrl_rcv_callback: ERR 1A04: Non-Zero SA MAD SM_Key: > 0xf27000000000000 != SM_Key: 0x100000000000000; MAD ignored > > Wen Hao Wang > Email: wangwhao at cn.ibm.com > From alekseys at voltaire.com Sun Aug 10 23:20:20 2008 From: alekseys at voltaire.com (Aleksey Senin) Date: Mon, 11 Aug 2008 09:20:20 +0300 Subject: [ofa-general] IPv6 RDMA CM PATCHv2 8/8 In-Reply-To: <1218435552.10251.12.camel@linux-zn6t.site> References: <1218435208.8137.13.camel@linux-zn6t.site> <1218435361.10251.4.camel@linux-zn6t.site> <1218435438.10251.6.camel@linux-zn6t.site> 
<1218435474.10251.8.camel@linux-zn6t.site> <1218435514.10251.10.camel@linux-zn6t.site> <1218435552.10251.12.camel@linux-zn6t.site> Message-ID: <1218435620.10251.14.camel@linux-zn6t.site> >From 088d2fe2b2b3913ccca124de74a9cfe2cd362edd Mon Sep 17 00:00:00 2001 From: Aleksey Senin Date: Sun, 10 Aug 2008 13:50:58 +0300 Subject: [IPv6 RDMA CM PATCHv2 8/8] Implemented IPv6 for RDMA CM in resolving remote nodes addr_send_arp function modified to obtain generic sockaddr structure to support both IPv4 and IPv6 protocols addr6_resolve_remote added in order to deal with IPv6 addr_resolve_remote renamed to addr4_resolve_remote Function addr_resolve_remote modified to obtain sockaddr structure and call corresponding IPv4/IPv6 function Signed-off-by: Aleksey Senin --- drivers/infiniband/core/addr.c | 96 ++++++++++++++++++++++++++++++++++----- 1 files changed, 83 insertions(+), 13 deletions(-) diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c index 00dcb22..7756d80 100644 --- a/drivers/infiniband/core/addr.c +++ b/drivers/infiniband/core/addr.c @@ -43,6 +43,7 @@ #include #include #include +#include MODULE_AUTHOR("Sean Hefty"); MODULE_DESCRIPTION("IB Address Translation"); @@ -182,27 +183,42 @@ static void queue_req(struct addr_req *req) mutex_unlock(&lock); } -static void addr_send_arp(struct sockaddr_in *dst_in) +static void addr_send_arp(struct sockaddr *dst_in) { struct rtable *rt; struct flowi fl; - __be32 dst_ip = dst_in->sin_addr.s_addr; memset(&fl, 0, sizeof fl); - fl.nl_u.ip4_u.daddr = dst_ip; - if (ip_route_output_key(&init_net, &rt, &fl)) - return; - neigh_event_send(rt->u.dst.neighbour, NULL); - ip_rt_put(rt); + if (dst_in->sa_family == AF_INET) { + fl.nl_u.ip4_u.daddr = + ((struct sockaddr_in *)dst_in)->sin_addr.s_addr; + + if (ip_route_output_key(&init_net, &rt, &fl)) + return; + + neigh_event_send(rt->u.dst.neighbour, NULL); + ip_rt_put(rt); + + } else { + struct dst_entry *dst; + fl.nl_u.ip6_u.daddr = + ((struct sockaddr_in6 
*)dst_in)->sin6_addr; + dst = ip6_route_output(&init_net, NULL, &fl); + if (!dst) + return; + + neigh_event_send(dst->neighbour, NULL); + dst_release(dst); + } } -static int addr_resolve_remote(struct sockaddr *src_in, - struct sockaddr *dst_in, +static int addr4_resolve_remote(struct sockaddr_in *src_in, + struct sockaddr_in *dst_in, struct rdma_dev_addr *addr) { - __be32 src_ip = ((struct sockaddr_in *)src_in)->sin_addr.s_addr; - __be32 dst_ip = ((struct sockaddr_in *)dst_in)->sin_addr.s_addr; + __be32 src_ip = src_in->sin_addr.s_addr; + __be32 dst_ip = dst_in->sin_addr.s_addr; struct flowi fl; struct rtable *rt; struct neighbour *neigh; @@ -233,7 +249,7 @@ static int addr_resolve_remote(struct sockaddr *src_in, } if (!src_ip) { - src_in->sa_family = dst_in->sa_family; + src_in->sin_family = dst_in->sin_family; ((struct sockaddr_in *)src_in)->sin_addr.s_addr = rt->rt_src; } @@ -246,6 +262,60 @@ out: return ret; } +static int addr6_resolve_remote(struct sockaddr_in6 *src_in, + struct sockaddr_in6 *dst_in, + struct rdma_dev_addr *addr) +{ + struct flowi fl; + struct dst_entry *dst; + struct neighbour *neigh; + int ret = -ENODATA; + + memset(&fl, 0, sizeof fl); + fl.nl_u.ip6_u.daddr = dst_in->sin6_addr; + fl.nl_u.ip6_u.saddr = src_in->sin6_addr; + + dst = ip6_route_output(&init_net, NULL, &fl); + + if (!dst) + goto out; + + /* If the device does ARP internally, return 'done' */ + if (dst->dev->flags & IFF_NOARP) { + ret = rdma_copy_addr(addr, dst->dev, NULL); + goto release; + } + + neigh = dst->neighbour; + if (!neigh) { + ret = -ENODATA; + goto release; + } + + if (!(neigh->nud_state & NUD_VALID)) { + ret = -ENODATA; + goto release; + } + ret = rdma_copy_addr(addr, neigh->dev, neigh->ha); + +release: + dst_release(dst); +out: + return ret; +} + +static int addr_resolve_remote(struct sockaddr *src_in, + struct sockaddr *dst_in, + struct rdma_dev_addr *addr) +{ + if (src_in->sa_family == AF_INET) { + return addr4_resolve_remote((struct sockaddr_in *)src_in, + 
(struct sockaddr_in *)dst_in, addr); + } else + return addr6_resolve_remote((struct sockaddr_in6 *)src_in, + (struct sockaddr_in6 *)dst_in, addr); +} + static void process_req(struct work_struct *work) { struct addr_req *req, *temp_req; @@ -384,7 +454,7 @@ int rdma_resolve_ip(struct rdma_addr_client *client, case -ENODATA: req->timeout = msecs_to_jiffies(timeout_ms) + jiffies; queue_req(req); - addr_send_arp((struct sockaddr_in *)dst_in); + addr_send_arp(dst_in); break; default: ret = req->status; -- 1.5.6.dirty From alekseys at voltaire.com Mon Aug 11 01:46:02 2008 From: alekseys at voltaire.com (Aleksey Senin) Date: Mon, 11 Aug 2008 11:46:02 +0300 Subject: [ofa-general] PATCH Remove padding arrays from librdmacm Message-ID: <1218444362.21009.5.camel@linux-zn6t.site> This patch removes the padding arrays from the cma_multicast structure, as we did in the kernel Signed-off-by: Aleksey Senin --- src/cma.c | 4 +--- 1 files changed, 1 insertions(+), 3 deletions(-) diff --git a/src/cma.c b/src/cma.c index d4441ce..70dbe1c 100644 --- a/src/cma.c +++ b/src/cma.c @@ -140,9 +140,7 @@ struct cma_multicast { uint32_t handle; union ibv_gid mgid; uint16_t mlid; - struct sockaddr addr; - uint8_t pad[sizeof(struct sockaddr_in6) - - sizeof(struct sockaddr)]; + struct sockaddr_storage addr; }; struct cma_event { -- 1.5.6.dirty From vlad at mellanox.co.il Mon Aug 11 02:04:22 2008 From: vlad at mellanox.co.il (Vladimir Sokolovsky) Date: Mon, 11 Aug 2008 12:04:22 +0300 Subject: [ofa-general] ofed kernel config problem fix In-Reply-To: References: Message-ID: <48A00096.20609@mellanox.co.il> Ron Livne wrote: > Hi Vlad, > I pulled today the latest ofed git. > When I tried to run "make oldconfig" I got the following message: > > scripts/kconfig/conf -o arch/x86/Kconfig > file drivers/infiniband/hw/nes/Kconfig already scanned? > make[1]: *** [oldconfig] Error 1 > make: *** [oldconfig] Error 2 > > > It seems there was a duplicate line in drivers/infiniband/Kconfig. 
> Here's how I fixed it: > > diff --git a/drivers/infiniband/Kconfig b/drivers/infiniband/Kconfig > index 066ccd4..d7be463 100644 > --- a/drivers/infiniband/Kconfig > +++ b/drivers/infiniband/Kconfig > @@ -46,7 +46,6 @@ source "drivers/infiniband/hw/amso1100/Kconfig" > source "drivers/infiniband/hw/cxgb3/Kconfig" > source "drivers/infiniband/hw/nes/Kconfig" > source "drivers/infiniband/hw/mlx4/Kconfig" > -source "drivers/infiniband/hw/nes/Kconfig" > > source "drivers/infiniband/ulp/ipoib/Kconfig" > Applied, Regards, Vladimir From vlad at lists.openfabrics.org Mon Aug 11 02:52:36 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Mon, 11 Aug 2008 02:52:36 -0700 (PDT) Subject: [ofa-general] ofa_1_4_kernel 20080811-0200 daily build status Message-ID: <20080811095236.5CFA7E60B4C@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_4/linux-2.6.git git_branch: ofed_kernel Common build parameters: Passed: Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.22 Passed on x86_64 with linux-2.6.16 Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.17 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.22.5-31-default Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on ia64 with linux-2.6.16 Passed on ia64 with linux-2.6.17 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.26 Passed on ia64 with linux-2.6.24 Passed 
on ia64 with linux-2.6.25 Passed on ppc64 with linux-2.6.16 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.17 Passed on ppc64 with linux-2.6.24 Passed on ppc64 with linux-2.6.19 Failed: Build failed on x86_64 with linux-2.6.16.21-0.8-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.21-0.8-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.21-0.8-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.21-0.8-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.16.43-0.3-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.43-0.3-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.43-0.3-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.43-0.3-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.16.60-0.21-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.60-0.21-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.60-0.21-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.60-0.21-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-53.el5 Log: from 
/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-53.el5_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-53.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-53.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-8.el5 Log: from /home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-8.el5_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** 
[_module_/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-8.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-8.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-93.el5 Log: from /home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-93.el5_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-93.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-93.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-55.ELsmp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: include/linux/skbuff.h: In function 'skb_add_data': include/linux/skbuff.h:1041: warning: pointer targets in passing argument 2 of 'csum_partial_copy_from_user' differ in signedness make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.9-55.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-55.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-42.ELsmp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: include/linux/skbuff.h: In function 'skb_add_data': include/linux/skbuff.h:1013: warning: pointer targets in passing argument 2 of 'csum_partial_copy_from_user' differ in signedness make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.9-42.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-42.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.16.21-0.8-default Log: from /home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: 
/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.21-0.8-default_ia64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.16.21-0.8-default_ia64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.16.21-0.8-default' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.18-8.el5 Log: from /home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-8.el5_ppc64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080811-0200_linux-2.6.18-8.el5_ppc64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.18-8.el5' make: *** [kernel] Error 2 
---------------------------------------------------------------------------------- From keshetti.mahesh at gmail.com Mon Aug 11 04:32:49 2008 From: keshetti.mahesh at gmail.com (Keshetti Mahesh) Date: Mon, 11 Aug 2008 17:02:49 +0530 Subject: [ofa-general] OpenSM ran out of LIDs Message-ID: <829ded920808110432x151d594cs12f7834aeb66a6c6@mail.gmail.com> Hello all, I am getting the below errors in OpenSM while simulating a large Infiniband network using 'ibsim'. Aug 11 14:28:29 002298 [46E0A960] 0x01 -> __osm_lid_mgr_find_free_lid_range: ERR 0307: OPENSM RAN OUT OF LIDS!!! Aug 11 14:28:29 002546 [46E0A960] 0x01 -> __osm_lid_mgr_find_free_lid_range: ERR 0307: OPENSM RAN OUT OF LIDS!!! Aug 11 14:28:29 002602 [46E0A960] 0x01 -> __osm_lid_mgr_find_free_lid_range: ERR 0307: OPENSM RAN OUT OF LIDS!!! Aug 11 14:28:29 002659 [46E0A960] 0x01 -> __osm_lid_mgr_find_free_lid_range: ERR 0307: OPENSM RAN OUT OF LIDS!!! Aug 11 14:28:29 002708 [46E0A960] 0x01 -> __osm_lid_mgr_find_free_lid_range: ERR 0307: OPENSM RAN OUT OF LIDS!!! Aug 11 14:28:29 002764 [46E0A960] 0x01 -> __osm_lid_mgr_find_free_lid_range: ERR 0307: OPENSM RAN OUT OF LIDS!!! Aug 11 14:28:29 002826 [46E0A960] 0x01 -> __osm_lid_mgr_find_free_lid_range: ERR 0307: OPENSM RAN OUT OF LIDS!!! Aug 11 14:28:29 002921 [46E0A960] 0x01 -> __osm_lid_mgr_find_free_lid_range: ERR 0307: OPENSM RAN OUT OF LIDS!!! Aug 11 14:28:29 002998 [46E0A960] 0x01 -> __osm_lid_mgr_find_free_lid_range: ERR 0307: OPENSM RAN OUT OF LIDS!!! As per my calculation the no. of unicast lids required for the network (34000) is very much within the limits (max=48000). What could be the reason for the above failure ? Can anyone enlighten me about the LID database concept used in OpenSM ? Also, opensm is not considering the guid2lid provided (with -x) at the time of starting. Is there any other way to provide persistent guid2lid mapping to OpenSM ? 
-Mahesh From ogerlitz at voltaire.com Mon Aug 11 05:46:37 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Mon, 11 Aug 2008 15:46:37 +0300 (IDT) Subject: [ofa-general] low bi bw with qperf for hca loopback test Message-ID: Hi Johann, Using qperf-0.4.0-1.ofed1.3.1 on a system with DDR HCA (connectx) and PCIe gen2, I see a low BW result for bidirectional BW when doing HCA loopback test. Do you have any insight what is the reason for that (and how it can be fixed...) ? When I use a different benchmark tool, I get a result of 4.9 GB/s which is close to the 5.3 GB/s limit of my DRAM (see http://www.crucial.com/support/memory_speeds.aspx second table, third row) Or. # qperf -m 65500 -t 5 172.25.5.77 rc_bw rc_bw: bw = 1.96 GB/sec <--- limit is DDR BW, OK # qperf -m 65500 -t 5 172.25.5.77 rc_bi_bw rc_bi_bw: bw = 3.89 GB/sec <--- DDR BI BW, OK # qperf -m 65500 -t 5 127.0.0.1 rc_bw rc_bw: bw = 2.52 GB/sec <--- not perfect but nice # qperf -m 65500 -t 5 127.0.0.1 rc_bi_bw rc_bi_bw: bw = 2.52 GB/sec <--- why ??? 
# ib_rdma_bw -b 172.25.5.77 28911: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | iters=1000 | duplex=1 | cma=0 | 28911: Bandwidth peak (#0 to #961): 3737.36 MB/sec 28911: Bandwidth average: 3737.35 MB/sec # ib_rdma_bw -b 127.0.0.1 28913: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | iters=1000 | duplex=1 | cma=0 | 28913: Bandwidth peak (#0 to #999): 4910.35 MB/sec 28913: Bandwidth average: 4910.32 MB/sec <--- OK, close to the DRAM limit 28913: Service Demand peak (#0 to #999): 530 cycles/KB From jean-vincent.ficet at bull.net Mon Aug 11 06:16:18 2008 From: jean-vincent.ficet at bull.net (Vincent Ficet) Date: Mon, 11 Aug 2008 15:16:18 +0200 Subject: [ofa-general] Resetting a port (C code) Message-ID: <48A03BA2.8090408@bull.net> Hello, Would anyone have some sample code in C that illustrates how to act upon a given port (e.g. disable / reset / enable) ? Thanks for your help, Vincent From wangwhao at cn.ibm.com Mon Aug 11 06:48:57 2008 From: wangwhao at cn.ibm.com (Wen Hao Wang) Date: Mon, 11 Aug 2008 21:48:57 +0800 Subject: [ofa-general] opensm hang and osmtest report ERR 0130 In-Reply-To: <489FD9EF.1000207@dev.mellanox.co.il> Message-ID: >Wen Hao Wang wrote: >> Hal and Yevgeny: >> >> Thanks for your comments! >> >> I have disabled the embedded SM on Cisco switch, and run opensm without >> argument on one of my servers. The command gave "subnet up" message and >> hung there. Command output of osmtest also contained errors, while the >> latest line was /OSMTEST: TEST "All Validations" PASS/. Would you please >> have a look at this, I am not sure whether all have succeeded, or there >> is still something wrong. > > It's OK, everything worked fine. > Osmtest does some wrong stuff intentionally, just to see that > opensm can handle it correctly - note all the "expecting errors" > messages in the osmtest log. Same goes for errors in the opensm > log, such as invalid sm_key - osmtest intentionally sends these > "wrong" packets. 
> > -- Yevgeny Thanks for your confirmation! I am using opensm 3.1.11 shipped in OFED 1.3.1. The man page of osmtest says log file /var/log/osm.log should be created by default. But I can not see such one file. Is it one program bug or out-of-date document? Wen Hao Wang Email: wangwhao at cn.ibm.com From hal.rosenstock at gmail.com Mon Aug 11 06:43:11 2008 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Mon, 11 Aug 2008 09:43:11 -0400 Subject: [ofa-general] opensm hang and osmtest report ERR 0130 In-Reply-To: References: <489FD9EF.1000207@dev.mellanox.co.il> Message-ID: On Mon, Aug 11, 2008 at 9:48 AM, Wen Hao Wang wrote: >>Wen Hao Wang wrote: >>> Hal and Yevgeny: >>> >>> Thanks for your comments! >>> >>> I have disabled the embedded SM on Cisco switch, and run opensm without >>> argument on one of my servers. The command gave "subnet up" message and >>> hung there. Command output of osmtest also contained errors, while the >>> latest line was /OSMTEST: TEST "All Validations" PASS/. Would you please >>> have a look at this, I am not sure whether all have succeeded, or there >>> is still something wrong. >> >> It's OK, everything worked fine. >> Osmtest does some wrong stuff intentionally, just to see that >> opensm can handle it correctly - note all the "expecting errors" >> messages in the osmtest log. Same goes for errors in the opensm >> log, such as invalid sm_key - osmtest intentionally sends these >> "wrong" packets. >> >> -- Yevgeny > > Thanks for your confirmation! I am using opensm 3.1.11 shipped in OFED > 1.3.1. The man page of osmtest says log file /var/log/osm.log should be > created by default. But I can not see such one file. Is it one program bug > or out-of-date document? I think it is /var/log/opensm.log by default ? man page may be out of date in terms of this. 
-- Hal > > Wen Hao Wang > Email: wangwhao at cn.ibm.com > From hal.rosenstock at gmail.com Mon Aug 11 06:44:50 2008 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Mon, 11 Aug 2008 09:44:50 -0400 Subject: [ofa-general] Resetting a port (C code) In-Reply-To: <48A03BA2.8090408@bull.net> References: <48A03BA2.8090408@bull.net> Message-ID: On Mon, Aug 11, 2008 at 9:16 AM, Vincent Ficet wrote: > Hello, > > Would anyone have some sample code in C that illustrates how to act upon a > given port (e.g. disable / reset / enable) ? Not sure exactly what you mean by reset a port but there is ibportstate diag command which does do this (enable, disable, reset) at the IB level. -- Hal > > Thanks for your help, > > Vincent > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit > http://openib.org/mailman/listinfo/openib-general > From hal.rosenstock at gmail.com Mon Aug 11 06:48:58 2008 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Mon, 11 Aug 2008 09:48:58 -0400 Subject: [ofa-general] OpenSM ran out of LIDs In-Reply-To: <829ded920808110432x151d594cs12f7834aeb66a6c6@mail.gmail.com> References: <829ded920808110432x151d594cs12f7834aeb66a6c6@mail.gmail.com> Message-ID: On Mon, Aug 11, 2008 at 7:32 AM, Keshetti Mahesh wrote: > Hello all, > > I am getting the below errors in OpenSM while simulating a large > Infiniband network using 'ibsim'. What OpenSM version ? > Aug 11 14:28:29 002298 [46E0A960] 0x01 -> > __osm_lid_mgr_find_free_lid_range: ERR 0307: OPENSM RAN OUT OF LIDS!!! > Aug 11 14:28:29 002546 [46E0A960] 0x01 -> > __osm_lid_mgr_find_free_lid_range: ERR 0307: OPENSM RAN OUT OF LIDS!!! > Aug 11 14:28:29 002602 [46E0A960] 0x01 -> > __osm_lid_mgr_find_free_lid_range: ERR 0307: OPENSM RAN OUT OF LIDS!!! 
> Aug 11 14:28:29 002659 [46E0A960] 0x01 -> > __osm_lid_mgr_find_free_lid_range: ERR 0307: OPENSM RAN OUT OF LIDS!!! > Aug 11 14:28:29 002708 [46E0A960] 0x01 -> > __osm_lid_mgr_find_free_lid_range: ERR 0307: OPENSM RAN OUT OF LIDS!!! > Aug 11 14:28:29 002764 [46E0A960] 0x01 -> > __osm_lid_mgr_find_free_lid_range: ERR 0307: OPENSM RAN OUT OF LIDS!!! > Aug 11 14:28:29 002826 [46E0A960] 0x01 -> > __osm_lid_mgr_find_free_lid_range: ERR 0307: OPENSM RAN OUT OF LIDS!!! > Aug 11 14:28:29 002921 [46E0A960] 0x01 -> > __osm_lid_mgr_find_free_lid_range: ERR 0307: OPENSM RAN OUT OF LIDS!!! > Aug 11 14:28:29 002998 [46E0A960] 0x01 -> > __osm_lid_mgr_find_free_lid_range: ERR 0307: OPENSM RAN OUT OF LIDS!!! > > As per my calculation the no. of unicast lids required for the network > (34000) is very much within the > limits (max=48000). How are you calculating this ? Are you including switch LIDs in this ? How many switches in this subnet ? Are you using a non 0 LMC ? >What could be the reason for the above failure ? > Can anyone enlighten me about the LID database concept used in OpenSM ? > > Also, opensm is not considering the guid2lid provided (with -x) at the time of > starting. Is the guid2lid file in OSM_CACHE_DIR ? > Is there any other way to provide persistent guid2lid > mapping to OpenSM ? Not AFAIK. -- Hal > > -Mahesh > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > From jean-vincent.ficet at bull.net Mon Aug 11 06:52:47 2008 From: jean-vincent.ficet at bull.net (Vincent Ficet) Date: Mon, 11 Aug 2008 15:52:47 +0200 Subject: [ofa-general] Resetting a port (C code) In-Reply-To: References: <48A03BA2.8090408@bull.net> Message-ID: <48A0442F.2010902@bull.net> Hal, >> >> Would anyone have some sample code in C that illustrates how to act upon a >> given port (e.g. 
disable / reset / enable) ? >> > > Not sure exactly what you mean by reset a port but there is > ibportstate diag command which does do this (enable, disable, reset) > at the IB level. > Thanks for the tip, that's exactly what I was looking for ;-) Vincent From wangwhao at cn.ibm.com Mon Aug 11 07:13:17 2008 From: wangwhao at cn.ibm.com (Wen Hao Wang) Date: Mon, 11 Aug 2008 22:13:17 +0800 Subject: [ofa-general] opensm hang and osmtest report ERR 0130 In-Reply-To: Message-ID: > I think it is /var/log/opensm.log by default ? man page may be out of > date in terms of this. > > -- Hal I also checked /var/log/opensm.log. It did *not* exist. I am thinking of opening one bug in https://bugs.openfabrics.org/. If you have any concern, please let me know. Thanks. Wen Hao Wang Email: wangwhao at cn.ibm.com From hal.rosenstock at gmail.com Mon Aug 11 07:05:11 2008 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Mon, 11 Aug 2008 10:05:11 -0400 Subject: [ofa-general] opensm hang and osmtest report ERR 0130 In-Reply-To: References: Message-ID: On Mon, Aug 11, 2008 at 10:13 AM, Wen Hao Wang wrote: >> I think it is /var/log/opensm.log by default ? man page may be out of >> date in terms of this. >> >> -- Hal > > I also checked /var/log/opensm.log. It did *not* exist. I am thinking of > opening one bug in https://bugs.openfabrics.org/. If you have any concern, > please let me know. Thanks. By default the log goes to stdout which is what you've seen and what osmtest -h says. It really should say osmtest.log rather than osm.log for this IMO. -- Hal > Wen Hao Wang > Email: wangwhao at cn.ibm.com > From hal.rosenstock at gmail.com Mon Aug 11 07:16:54 2008 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Mon, 11 Aug 2008 10:16:54 -0400 Subject: [ofa-general] [PATCH] osmtest/osmtest.8: Fix log_file option description in man page Message-ID: Sasha, Please see attached file. 
-- Hal -------------- next part -------------- A non-text attachment was scrubbed... Name: patch-osmtest-m1 Type: application/octet-stream Size: 871 bytes Desc: not available URL: From johann.george at qlogic.com Mon Aug 11 07:54:29 2008 From: johann.george at qlogic.com (Johann George) Date: Mon, 11 Aug 2008 07:54:29 -0700 Subject: [ofa-general] Re: low bi bw with qperf for hca loopback test In-Reply-To: References: Message-ID: <20080811145429.GA12481@cuprite.pathscale.com> Hello Or. Interesting results. I assume in the loopback case, you are running both instances of qperf on the same machine? I'm wondering if you are somehow CPU limited? If you give qperf the -v option, it will print out percentage CPU utilization. Note that in its printout, a cpu is a core so 150% utilization indicates 1.5 cores. I'm also wondering (although most unlikely) if somehow both instances of qperf are somehow stuck on the same CPU. You can use the -la and -ra options to set the processor affinities of the client and server. Finally, rc_bw and rc_bi_bw determine bandwidth using Send/Receives. I believe that the ib_rdma_bw utility uses RDMA Writes. For the equivalent test, use the qperf rc_rdma_write_bw test. you can get a list of all tests by typing: qperf -h tests Let me know what you find. Thanks. Johann On Mon, Aug 11, 2008 at 03:46:37PM +0300, Or Gerlitz wrote: > > Hi Johann, > > Using qperf-0.4.0-1.ofed1.3.1 on a system with DDR HCA (connectx) and PCIe gen2, > I see a low BW result for bidirectional BW when doing HCA loopback test. > > Do you have any insight what is the reason for that (and how it can be fixed...) ? > > When I use a different benchmark tool, I get a result of 4.9 GB/s which is close to the 5.3 GB/s > limit of my DRAM (see http://www.crucial.com/support/memory_speeds.aspx second table, third row) > > Or. 
> > # qperf -m 65500 -t 5 172.25.5.77 rc_bw > rc_bw: > bw = 1.96 GB/sec <--- limit is DDR BW, OK > > # qperf -m 65500 -t 5 172.25.5.77 rc_bi_bw > rc_bi_bw: > bw = 3.89 GB/sec <--- DDR BI BW, OK > > # qperf -m 65500 -t 5 127.0.0.1 rc_bw > rc_bw: > bw = 2.52 GB/sec <--- not perfect but nice > > # qperf -m 65500 -t 5 127.0.0.1 rc_bi_bw > rc_bi_bw: > bw = 2.52 GB/sec <--- why ??? > > > # ib_rdma_bw -b 172.25.5.77 > 28911: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | iters=1000 | duplex=1 | cma=0 | > 28911: Bandwidth peak (#0 to #961): 3737.36 MB/sec > 28911: Bandwidth average: 3737.35 MB/sec > > > # ib_rdma_bw -b 127.0.0.1 > 28913: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | iters=1000 | duplex=1 | cma=0 | > > 28913: Bandwidth peak (#0 to #999): 4910.35 MB/sec > 28913: Bandwidth average: 4910.32 MB/sec <--- OK, close to the DRAM limit > 28913: Service Demand peak (#0 to #999): 530 cycles/KB From tziporet at mellanox.co.il Mon Aug 11 08:49:03 2008 From: tziporet at mellanox.co.il (Tziporet Koren) Date: Mon, 11 Aug 2008 18:49:03 +0300 Subject: [ofa-general] OFED meeting agenda for today (Aug 11, 2008) Message-ID: <5D49E7A8952DC44FB38C38FA0D758EAD4599CE@mtlexch01.mtl.com> This is the agenda for OFED meeting today (Aug 11, 2008) on OFED 1.4 beta readiness 1. OFED 1.4 status: OFED daily build is now based on kernel 2.6.27-rc1. We still miss backports for iSER thus its disabled from OFED now NFS/RDMA - no backport for distros yet SRPT - does not compile on 2.6.27-rc1 2. Beta release: - Decide what must be completed for the beta - The beta date 3. Testing description: I got descriptions from Voltaire, Intel and Mellanox All companies are requested to submit 4. 
Open discussion Tziporet From yossi.openib at gmail.com Mon Aug 11 09:18:16 2008 From: yossi.openib at gmail.com (Yossi Etigin) Date: Mon, 11 Aug 2008 19:18:16 +0300 Subject: [ofa-general] [PATCH] ipoib: garbage-collect stale multicast entries In-Reply-To: References: <48849F59.7060502@Voltaire.COM> Message-ID: <48A06648.30107@gmail.com> A multicast sender joins the MGID as a full member, but does not leave (as long as the interface is up). This causes an MGID leak in the SM. Here, a garbage-collection task is scheduled once in a while (every minute) and leaves stale send-only multicast groups (unused for more than 2 minutes). Signed-off-by: Yossi Etigin -- drivers/infiniband/ulp/ipoib/ipoib.h | 6 ++- drivers/infiniband/ulp/ipoib/ipoib_main.c | 3 + drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 47 +++++++++++++++++++++---- 3 files changed, 47 insertions(+), 9 deletions(-) Index: b/drivers/infiniband/ulp/ipoib/ipoib.h =================================================================== --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -92,6 +92,7 @@ enum { IPOIB_FLAG_ADMIN_CM = 9, IPOIB_FLAG_UMCAST = 10, IPOIB_FLAG_CSUM = 11, + IPOIB_MCAST_RUN_GC = 12, IPOIB_MAX_BACKOFF_SECONDS = 16, @@ -135,6 +136,7 @@ struct ipoib_mcast { struct list_head list; unsigned long created; + unsigned long used; unsigned long backoff; unsigned long flags; @@ -292,7 +294,8 @@ struct ipoib_dev_priv { struct rb_root multicast_tree; struct delayed_work pkey_poll_task; - struct delayed_work mcast_task; + struct delayed_work mcast_join_task; + struct delayed_work mcast_leave_task; struct work_struct flush_light; struct work_struct flush_normal; struct work_struct flush_heavy; @@ -464,6 +467,7 @@ int ipoib_dev_init(struct net_device *de void ipoib_dev_cleanup(struct net_device *dev); void ipoib_mcast_join_task(struct work_struct *work); +void ipoib_mcast_leave_task(struct work_struct *work); void ipoib_mcast_send(struct net_device *dev, void *mgid, struct sk_buff *skb); 
void ipoib_mcast_restart_task(struct work_struct *work); Index: b/drivers/infiniband/ulp/ipoib/ipoib_main.c =================================================================== --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -1080,7 +1080,8 @@ static void ipoib_setup(struct net_devic INIT_LIST_HEAD(&priv->multicast_list); INIT_DELAYED_WORK(&priv->pkey_poll_task, ipoib_pkey_poll); - INIT_DELAYED_WORK(&priv->mcast_task, ipoib_mcast_join_task); + INIT_DELAYED_WORK(&priv->mcast_join_task, ipoib_mcast_join_task); + INIT_DELAYED_WORK(&priv->mcast_leave_task, ipoib_mcast_leave_task); INIT_WORK(&priv->flush_light, ipoib_ib_dev_flush_light); INIT_WORK(&priv->flush_normal, ipoib_ib_dev_flush_normal); INIT_WORK(&priv->flush_heavy, ipoib_ib_dev_flush_heavy); Index: b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c =================================================================== --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c @@ -118,6 +118,7 @@ static struct ipoib_mcast *ipoib_mcast_a mcast->dev = dev; mcast->created = jiffies; + mcast->used = jiffies; mcast->backoff = 1; INIT_LIST_HEAD(&mcast->list); @@ -389,7 +390,7 @@ static int ipoib_mcast_join_complete(int mutex_lock(&mcast_mutex); if (test_bit(IPOIB_MCAST_RUN, &priv->flags)) queue_delayed_work(ipoib_workqueue, - &priv->mcast_task, 0); + &priv->mcast_join_task, 0); mutex_unlock(&mcast_mutex); if (mcast == priv->broadcast) @@ -422,7 +423,7 @@ static int ipoib_mcast_join_complete(int mutex_lock(&mcast_mutex); spin_lock_irq(&priv->lock); if (test_bit(IPOIB_MCAST_RUN, &priv->flags)) - queue_delayed_work(ipoib_workqueue, &priv->mcast_task, + queue_delayed_work(ipoib_workqueue, &priv->mcast_join_task, mcast->backoff * HZ); spin_unlock_irq(&priv->lock); mutex_unlock(&mcast_mutex); @@ -492,7 +493,7 @@ static void ipoib_mcast_join(struct net_ mutex_lock(&mcast_mutex); if (test_bit(IPOIB_MCAST_RUN, &priv->flags)) 
queue_delayed_work(ipoib_workqueue, - &priv->mcast_task, + &priv->mcast_join_task, mcast->backoff * HZ); mutex_unlock(&mcast_mutex); } @@ -501,7 +502,7 @@ static void ipoib_mcast_join(struct net_ void ipoib_mcast_join_task(struct work_struct *work) { struct ipoib_dev_priv *priv = - container_of(work, struct ipoib_dev_priv, mcast_task.work); + container_of(work, struct ipoib_dev_priv, mcast_join_task.work); struct net_device *dev = priv->dev; if (!test_bit(IPOIB_MCAST_RUN, &priv->flags)) @@ -530,7 +531,7 @@ void ipoib_mcast_join_task(struct work_s mutex_lock(&mcast_mutex); if (test_bit(IPOIB_MCAST_RUN, &priv->flags)) queue_delayed_work(ipoib_workqueue, - &priv->mcast_task, HZ); + &priv->mcast_join_task, HZ); mutex_unlock(&mcast_mutex); return; } @@ -594,7 +595,9 @@ int ipoib_mcast_start_thread(struct net_ mutex_lock(&mcast_mutex); if (!test_and_set_bit(IPOIB_MCAST_RUN, &priv->flags)) - queue_delayed_work(ipoib_workqueue, &priv->mcast_task, 0); + queue_delayed_work(ipoib_workqueue, &priv->mcast_join_task, 0); + if (!test_and_set_bit(IPOIB_MCAST_RUN_GC, &priv->flags)) + queue_delayed_work(ipoib_workqueue, &priv->mcast_leave_task, 0); mutex_unlock(&mcast_mutex); return 0; @@ -608,7 +611,9 @@ int ipoib_mcast_stop_thread(struct net_d mutex_lock(&mcast_mutex); clear_bit(IPOIB_MCAST_RUN, &priv->flags); - cancel_delayed_work(&priv->mcast_task); + clear_bit(IPOIB_MCAST_RUN_GC, &priv->flags); + cancel_delayed_work(&priv->mcast_join_task); + cancel_delayed_work(&priv->mcast_leave_task); mutex_unlock(&mcast_mutex); if (flush) @@ -715,6 +720,7 @@ out: } } + mcast->used = jiffies; ipoib_send(dev, skb, mcast->ah, IB_MULTICAST_QPN); } @@ -859,6 +865,33 @@ void ipoib_mcast_restart_task(struct wor ipoib_mcast_start_thread(dev); } +void ipoib_mcast_leave_task(struct work_struct *work) +{ + struct ipoib_dev_priv *priv = + container_of(work, struct ipoib_dev_priv, mcast_leave_task.work); + struct net_device *dev = priv->dev; + struct ipoib_mcast *mcast, *tmcast; + 
LIST_HEAD(remove_list); + + if (!test_bit(IPOIB_MCAST_RUN_GC, &priv->flags)) + return; + + list_for_each_entry_safe(mcast, tmcast, &priv->multicast_list, list) { + if (test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags) && + time_before(mcast->used, jiffies - 120 * HZ)) { + rb_erase(&mcast->rb_node, &priv->multicast_tree); + list_move_tail(&mcast->list, &remove_list); + } + } + + list_for_each_entry_safe(mcast, tmcast, &remove_list, list) { + ipoib_mcast_leave(dev, mcast); + ipoib_mcast_free(mcast); + } + + queue_delayed_work(ipoib_workqueue, &priv->mcast_leave_task, 60 * HZ); +} + #ifdef CONFIG_INFINIBAND_IPOIB_DEBUG struct ipoib_mcast_iter *ipoib_mcast_iter_init(struct net_device *dev) From yosefe at Voltaire.COM Mon Aug 11 09:35:50 2008 From: yosefe at Voltaire.COM (Yossi Etigin) Date: Mon, 11 Aug 2008 19:35:50 +0300 Subject: [ofa-general] [PATCH v3] ib/core: fix for send multicast group send leave retry Message-ID: <48A06A66.7070605@Voltaire.COM> Until now, only if joining a multicast group failed there was a retry mechanism. This patch will add a mechanism that will retry to leave a multicast group before giving up. 
Changes from v1: - Save the leave state because it's overridden - use 'else' Changes from v2: - Call mcast_work_handler() when send_leave() fails Signed-off-by: Ron Livne Signed-off-by: Yossi Etigin Index: b/drivers/infiniband/core/multicast.c =================================================================== --- a/drivers/infiniband/core/multicast.c 2008-08-11 19:13:26.000000000 +0300 +++ b/drivers/infiniband/core/multicast.c 2008-08-11 19:34:21.000000000 +0300 @@ -106,6 +106,8 @@ struct mcast_group { struct ib_sa_query *query; int query_id; u16 pkey_index; + u8 leave_state; + int retries; }; struct mcast_member { @@ -350,6 +352,7 @@ static int send_leave(struct mcast_group rec = group->rec; rec.join_state = leave_state; + group->leave_state = leave_state; ret = ib_sa_mcmember_rec_query(&sa_client, port->dev->device, port->port_num, IB_SA_METHOD_DELETE, &rec, @@ -542,7 +545,11 @@ static void leave_handler(int status, st { struct mcast_group *group = context; - mcast_work_handler(&group->work); + if (status && (group->retries > 0) && + !send_leave(group, group->leave_state)) + group->retries--; + else + mcast_work_handler(&group->work); } static struct mcast_group *acquire_group(struct mcast_port *port, @@ -565,6 +572,7 @@ static struct mcast_group *acquire_group if (!group) return NULL; + group->retries = 3; group->port = port; group->rec.mgid = *mgid; group->pkey_index = MCAST_INVALID_PKEY_INDEX; -- --Yossi From sean.hefty at intel.com Mon Aug 11 10:42:25 2008 From: sean.hefty at intel.com (Hefty, Sean) Date: Mon, 11 Aug 2008 10:42:25 -0700 Subject: [ofa-general] RE: [PATCH v3] ib/core: fix for send multicast group send leave retry In-Reply-To: <48A06A66.7070605@Voltaire.COM> References: <48A06A66.7070605@Voltaire.COM> Message-ID: These latest changes look okay by me. Acked-by: Sean Hefty (My e-mail is having IT related difficulties...) 
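The bounded retry that the v3 patch above adds to leave_handler() can be modelled in user space as follows. This is a sketch under stated assumptions, not the kernel code: group_sketch, send_leave_sketch, leave_handler_sketch and run_failing_leave are hypothetical stand-ins, the SA query is replaced by a counter, and handed_off stands in for the call to mcast_work_handler().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the fields the patch adds. */
struct group_sketch {
	int retries;      /* leave retries still allowed */
	int leave_state;  /* saved so a retry can resend the same state */
	int sends;        /* number of leave requests actually issued */
	bool handed_off;  /* set once the normal work handler takes over */
	bool send_fails;  /* knob: make send_leave_sketch() fail */
};

/* Returns 0 on success, like the kernel helper it imitates. */
int send_leave_sketch(struct group_sketch *g, int leave_state)
{
	if (g->send_fails)
		return -1;
	g->leave_state = leave_state;
	g->sends++;
	return 0;
}

/*
 * Same shape as the patched leave_handler(): retry only if the leave
 * failed, retries remain, and the resend itself could be issued;
 * otherwise fall through to the normal work handler.
 */
void leave_handler_sketch(int status, struct group_sketch *g)
{
	if (status && g->retries > 0 &&
	    !send_leave_sketch(g, g->leave_state))
		g->retries--;
	else
		g->handed_off = true;
}

/* Issue one leave, then report failure on every completion. */
int run_failing_leave(struct group_sketch *g)
{
	assert(g->retries >= 0);
	send_leave_sketch(g, 1);
	while (!g->handed_off)
		leave_handler_sketch(-1, g);
	return g->sends;
}
```

With retries initialised to 3, as in acquire_group() in the patch, a leave that always fails is sent four times in total (the original request plus three retries) before the work handler runs. If the resend itself cannot be issued, the handler falls through immediately, which corresponds to the v2-to-v3 change noted in the patch description.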
From sean.hefty at intel.com Mon Aug 11 12:18:23 2008 From: sean.hefty at intel.com (Hefty, Sean) Date: Mon, 11 Aug 2008 12:18:23 -0700 Subject: [ofa-general] IPv6 RDMA CM PATCHv2 In-Reply-To: <1218435361.10251.4.camel@linux-zn6t.site> References: <1218435208.8137.13.camel@linux-zn6t.site> <1218435361.10251.4.camel@linux-zn6t.site> Message-ID: Mostly minor nits on a few of these patches. >@@ -113,15 +114,30 @@ EXPORT_SYMBOL(rdma_copy_addr); > int rdma_translate_ip(struct sockaddr *addr, struct rdma_dev_addr *dev_addr) > { > struct net_device *dev; >- __be32 ip = ((struct sockaddr_in *) addr)->sin_addr.s_addr; >- int ret; >+ int ret = -EADDRNOTAVAIL; > >- dev = ip_dev_find(&init_net, ip); >- if (!dev) >- return -EADDRNOTAVAIL; >+ switch (addr->sa_family) { >+ case AF_INET: >+ dev = ip_dev_find(&init_net, >+ ((struct sockaddr_in *) addr)->sin_addr.s_addr); > >- ret = rdma_copy_addr(dev_addr, dev, NULL); >- dev_put(dev); >+ if (!dev) >+ return -EADDRNOTAVAIL; We can just break here. Also, my personal preferences is to have the assignment of dev and the if check together, without an extra blank line. >+ >+ ret = rdma_copy_addr(dev_addr, dev, NULL); >+ dev_put(dev); >+ break; >+ case AF_INET6: >+ for_each_netdev(&init_net, dev) { >+ if (ipv6_chk_addr(&init_net, &((struct sockaddr_in6 *) >addr)->sin6_addr, dev, 1)) { Line is a little long, plus it gets removed by a later patch. 
- Sean From sean.hefty at intel.com Mon Aug 11 12:22:53 2008 From: sean.hefty at intel.com (Hefty, Sean) Date: Mon, 11 Aug 2008 12:22:53 -0700 Subject: [ofa-general] IPv6 RDMA CM PATCHv2 In-Reply-To: <1218435345.10251.2.camel@linux-zn6t.site> References: <1218435208.8137.13.camel@linux-zn6t.site> <1218435345.10251.2.camel@linux-zn6t.site> Message-ID: >@@ -251,20 +251,20 @@ static void process_req(struct work_struct *work) > > list_for_each_entry_safe(req, temp_req, &done_list, list) { > list_del(&req->list); >- req->callback(req->status, &req->src_addr, req->addr, >- req->context); >+ req->callback(req->status, (struct sockaddr *) &req->src_addr, >\ Looks like '\' ended up in the patch. - Sean From sean.hefty at intel.com Mon Aug 11 12:27:17 2008 From: sean.hefty at intel.com (Hefty, Sean) Date: Mon, 11 Aug 2008 12:27:17 -0700 Subject: [ofa-general] IPv6 RDMA CM PATCHv2 5/8 In-Reply-To: <1218435474.10251.8.camel@linux-zn6t.site> References: <1218435208.8137.13.camel@linux-zn6t.site> <1218435361.10251.4.camel@linux-zn6t.site> <1218435438.10251.6.camel@linux-zn6t.site> <1218435474.10251.8.camel@linux-zn6t.site> Message-ID: > drivers/infiniband/core/cma.c | 75 +++++++++++++++++++++++++++++------------ > 1 files changed, 53 insertions(+), 22 deletions(-) > >diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c >index 4728265..ec0855f 100644 >--- a/drivers/infiniband/core/cma.c >+++ b/drivers/infiniband/core/cma.c >@@ -2113,32 +2113,63 @@ EXPORT_SYMBOL(rdma_bind_addr); > static int cma_format_hdr(void *hdr, enum rdma_port_space ps, > struct rdma_route *route) > { >- struct sockaddr_in *src4, *dst4; > struct cma_hdr *cma_hdr; > struct sdp_hh *sdp_hdr; > >- src4 = (struct sockaddr_in *) &route->addr.src_addr; >- dst4 = (struct sockaddr_in *) &route->addr.dst_addr; >- >- switch (ps) { >- case RDMA_PS_SDP: >- sdp_hdr = hdr; >- if (sdp_get_majv(sdp_hdr->sdp_version) != SDP_MAJ_VERSION) >- return -EINVAL; >- sdp_set_ip_ver(sdp_hdr, 4); >- 
sdp_hdr->src_addr.ip4.addr = src4->sin_addr.s_addr; >- sdp_hdr->dst_addr.ip4.addr = dst4->sin_addr.s_addr; >- sdp_hdr->port = src4->sin_port; >- break; >- default: >- cma_hdr = hdr; >- cma_hdr->cma_version = CMA_VERSION; >- cma_set_ip_ver(cma_hdr, 4); >- cma_hdr->src_addr.ip4.addr = src4->sin_addr.s_addr; >- cma_hdr->dst_addr.ip4.addr = dst4->sin_addr.s_addr; >- cma_hdr->port = src4->sin_port; >- break; >+ if (route->addr.src_addr.ss_family == AF_INET) { >+ struct sockaddr_in *src4, *dst4; >+ >+ src4 = (struct sockaddr_in *) &route->addr.src_addr; >+ dst4 = (struct sockaddr_in *) &route->addr.dst_addr; >+ >+ switch (ps) { >+ case RDMA_PS_SDP: >+ sdp_hdr = hdr; >+ if (sdp_get_majv(sdp_hdr->sdp_version) != >SDP_MAJ_VERSION) >+ return -EINVAL; >+ sdp_set_ip_ver(sdp_hdr, 4); >+ sdp_hdr->src_addr.ip4.addr = src4->sin_addr.s_addr; >+ sdp_hdr->dst_addr.ip4.addr = dst4->sin_addr.s_addr; >+ sdp_hdr->port = src4->sin_port; >+ break; >+ default: >+ cma_hdr = hdr; >+ cma_hdr->cma_version = CMA_VERSION; >+ cma_set_ip_ver(cma_hdr, 4); >+ cma_hdr->src_addr.ip4.addr = src4->sin_addr.s_addr; >+ cma_hdr->dst_addr.ip4.addr = dst4->sin_addr.s_addr; >+ cma_hdr->port = src4->sin_port; >+ break; >+ } >+ } else if (route->addr.src_addr.ss_family == AF_INET6) { I think this can just be 'else', since we've checked the address family elsewhere. 
>+ struct sockaddr_in6 *src6, *dst6; >+ >+ src6 = (struct sockaddr_in6 *) &route->addr.src_addr; >+ dst6 = (struct sockaddr_in6 *) &route->addr.dst_addr; >+ >+ switch (ps) { >+ case RDMA_PS_SDP: >+ sdp_hdr = hdr; >+ if (sdp_get_majv(sdp_hdr->sdp_version) != >SDP_MAJ_VERSION) >+ return -EINVAL; >+ sdp_set_ip_ver(sdp_hdr, 6); >+ sdp_hdr->src_addr.ip6 = src6->sin6_addr; >+ sdp_hdr->dst_addr.ip6 = dst6->sin6_addr; >+ sdp_hdr->port = src6->sin6_port; >+ break; >+ default: >+ cma_hdr = hdr; >+ cma_hdr->cma_version = CMA_VERSION; >+ cma_set_ip_ver(cma_hdr, 6); >+ cma_hdr->src_addr.ip6 = src6->sin6_addr; >+ cma_hdr->dst_addr.ip6 = dst6->sin6_addr; >+ cma_hdr->port = src6->sin6_port; >+ break; > } >+ return 0; >+ } else >+ return -EAFNOSUPPORT; >+ > return 0; Converting the first else will fix this, but the spacing is off here, and the first 'return 0' isn't needed. - Sean From sean.hefty at intel.com Mon Aug 11 12:34:20 2008 From: sean.hefty at intel.com (Hefty, Sean) Date: Mon, 11 Aug 2008 12:34:20 -0700 Subject: [ofa-general] IPv6 RDMA CM PATCHv2 7/8 In-Reply-To: <1218435552.10251.12.camel@linux-zn6t.site> References: <1218435208.8137.13.camel@linux-zn6t.site> <1218435361.10251.4.camel@linux-zn6t.site> <1218435438.10251.6.camel@linux-zn6t.site> <1218435474.10251.8.camel@linux-zn6t.site> <1218435514.10251.10.camel@linux-zn6t.site> <1218435552.10251.12.camel@linux-zn6t.site> Message-ID: >+static inline struct net_device *cma_ipv6_dev_find(struct in6_addr *addr) inline isn't needed >+{ >+ struct net_device *dev = 0; We don't need to initialize dev here. (If we did, it should be set to NULL.) 
>+ for_each_netdev(&init_net, dev) >+ if (ipv6_chk_addr(&init_net, addr, dev, 1)) >+ return dev; >+ return NULL; >+} >+ > int rdma_translate_ip(struct sockaddr *addr, struct rdma_dev_addr *dev_addr) > { > struct net_device *dev; >@@ -128,12 +137,13 @@ int rdma_translate_ip(struct sockaddr *addr, struct >rdma_dev_addr *dev_addr) > dev_put(dev); > break; > case AF_INET6: >- for_each_netdev(&init_net, dev) { >- if (ipv6_chk_addr(&init_net, &((struct sockaddr_in6 *) >addr)->sin6_addr, dev, 1)) { >- ret = rdma_copy_addr(dev_addr, dev, NULL); >- break; >- } >- } >+ dev = cma_ipv6_dev_find( >+ &((struct sockaddr_in6 *)addr)->sin6_addr); >+ extra blank line >+ if (!dev) >+ return -EADDRNOTAVAIL; >+ >+ ret = rdma_copy_addr(dev_addr, dev, NULL); > break; > default: > break; >@@ -279,30 +289,60 @@ static int addr_resolve_local(struct sockaddr *src_in, > struct rdma_dev_addr *addr) > { > struct net_device *dev; >- __be32 src_ip = ((struct sockaddr_in *)src_in)->sin_addr.s_addr; >- __be32 dst_ip = ((struct sockaddr_in *)dst_in)->sin_addr.s_addr; >- int ret; >+ int ret = -EADDRNOTAVAIL; initialization isn't needed > >- dev = ip_dev_find(&init_net, dst_ip); >- if (!dev) >- return -EADDRNOTAVAIL; >+ if (dst_in->sa_family == AF_INET) { >+ __be32 src_ip = ((struct sockaddr_in *)src_in)- >>sin_addr.s_addr; >+ __be32 dst_ip = ((struct sockaddr_in *)dst_in)- >>sin_addr.s_addr; > >- if (ipv4_is_zeronet(src_ip)) { >- src_in->sa_family = dst_in->sa_family; >- ((struct sockaddr_in *)src_in)->sin_addr.s_addr = dst_ip; >- ret = rdma_copy_addr(addr, dev, dev->dev_addr); >- } else if (ipv4_is_loopback(src_ip)) { >- ret = rdma_translate_ip(dst_in, addr); >- if (!ret) >- memcpy(addr->dst_dev_addr, dev->dev_addr, >MAX_ADDR_LEN); >- } else { >- ret = rdma_translate_ip(src_in, addr); >- if (!ret) >- memcpy(addr->dst_dev_addr, dev->dev_addr, >MAX_ADDR_LEN); >- } >+ dev = ip_dev_find(&init_net, dst_ip); >+ if (!dev) >+ return -EADDRNOTAVAIL; > >- dev_put(dev); >- return ret; >+ if 
(ipv4_is_zeronet(src_ip)) { >+ src_in->sa_family = dst_in->sa_family; >+ ((struct sockaddr_in *)src_in)->sin_addr.s_addr = >dst_ip; >+ ret = rdma_copy_addr(addr, dev, dev->dev_addr); >+ } else if (ipv4_is_loopback(src_ip)) { >+ ret = rdma_translate_ip(dst_in, addr); >+ if (!ret) >+ memcpy(addr->dst_dev_addr, dev->dev_addr, >MAX_ADDR_LEN); >+ } else { >+ ret = rdma_translate_ip(src_in, addr); >+ if (!ret) >+ memcpy(addr->dst_dev_addr, dev->dev_addr, >MAX_ADDR_LEN); >+ } >+ >+ dev_put(dev); >+ return ret; use single return statement at end of function >+ } else if (dst_in->sa_family == AF_INET6) { should be safe to just make into an else >+ struct in6_addr *a = &((struct sockaddr_in6 *)dst_in)- >>sin6_addr; >+ >+ dev = cma_ipv6_dev_find(a); >+ extra blank line >+ if (!dev) >+ return -EADDRNOTAVAIL; >+ >+ a = &((struct sockaddr_in6 *)src_in)->sin6_addr; >+ >+ if (ipv6_addr_any(a)) { >+ src_in->sa_family = dst_in->sa_family; >+ ((struct sockaddr_in6 *)src_in)->sin6_addr = >+ ((struct sockaddr_in6 *)dst_in)->sin6_addr; >+ ret = rdma_copy_addr(addr, dev, dev->dev_addr); >+ } else if (ipv6_addr_loopback(a)) { >+ ret = rdma_translate_ip(dst_in, addr); >+ if (!ret) >+ memcpy(addr->dst_dev_addr, dev->dev_addr, >MAX_ADDR_LEN); >+ } else { >+ ret = rdma_translate_ip(src_in, addr); >+ if (!ret) >+ memcpy(addr->dst_dev_addr, dev->dev_addr, >MAX_ADDR_LEN); >+ } >+ >+ return ret; use single return statement at end of function >+ } >+ return -EADDRNOTAVAIL; use return ret; > } From sean.hefty at intel.com Mon Aug 11 12:43:08 2008 From: sean.hefty at intel.com (Hefty, Sean) Date: Mon, 11 Aug 2008 12:43:08 -0700 Subject: [ofa-general] IPv6 RDMA CM PATCHv2 8/8 In-Reply-To: <1218435620.10251.14.camel@linux-zn6t.site> References: <1218435208.8137.13.camel@linux-zn6t.site> <1218435361.10251.4.camel@linux-zn6t.site> <1218435438.10251.6.camel@linux-zn6t.site> <1218435474.10251.8.camel@linux-zn6t.site> <1218435514.10251.10.camel@linux-zn6t.site> <1218435552.10251.12.camel@linux-zn6t.site> 
<1218435620.10251.14.camel@linux-zn6t.site> Message-ID: >-static int addr_resolve_remote(struct sockaddr *src_in, >- struct sockaddr *dst_in, >+static int addr4_resolve_remote(struct sockaddr_in *src_in, >+ struct sockaddr_in *dst_in, A previous patch converted this from sockaddr_in to sockaddr, and now it's converted back. > struct rdma_dev_addr *addr) > { >- __be32 src_ip = ((struct sockaddr_in *)src_in)->sin_addr.s_addr; >- __be32 dst_ip = ((struct sockaddr_in *)dst_in)->sin_addr.s_addr; >+ __be32 src_ip = src_in->sin_addr.s_addr; >+ __be32 dst_ip = dst_in->sin_addr.s_addr; > struct flowi fl; > struct rtable *rt; > struct neighbour *neigh; >@@ -233,7 +249,7 @@ static int addr_resolve_remote(struct sockaddr *src_in, > } > > if (!src_ip) { >- src_in->sa_family = dst_in->sa_family; >+ src_in->sin_family = dst_in->sin_family; > ((struct sockaddr_in *)src_in)->sin_addr.s_addr = rt->rt_src; > } > >@@ -246,6 +262,60 @@ out: > return ret; > } > >+static int addr6_resolve_remote(struct sockaddr_in6 *src_in, >+ struct sockaddr_in6 *dst_in, >+ struct rdma_dev_addr *addr) >+{ >+ struct flowi fl; >+ struct dst_entry *dst; >+ struct neighbour *neigh; >+ int ret = -ENODATA; >+ >+ memset(&fl, 0, sizeof fl); >+ fl.nl_u.ip6_u.daddr = dst_in->sin6_addr; >+ fl.nl_u.ip6_u.saddr = src_in->sin6_addr; >+ >+ dst = ip6_route_output(&init_net, NULL, &fl); >+ extra blank line >+ if (!dst) >+ goto out; >+ >+ /* If the device does ARP internally, return 'done' */ >+ if (dst->dev->flags & IFF_NOARP) { >+ ret = rdma_copy_addr(addr, dst->dev, NULL); >+ goto release; >+ } >+ >+ neigh = dst->neighbour; >+ if (!neigh) { >+ ret = -ENODATA; ret was initialized to ENODATA above >+ goto release; >+ } >+ >+ if (!(neigh->nud_state & NUD_VALID)) { >+ ret = -ENODATA; ret was initialized to ENODATA above >+ goto release; >+ } >+ ret = rdma_copy_addr(addr, neigh->dev, neigh->ha); >+ >+release: >+ dst_release(dst); >+out: >+ return ret; >+} >+ >+static int addr_resolve_remote(struct sockaddr *src_in, >+ 
struct sockaddr *dst_in, >+ struct rdma_dev_addr *addr) >+{ >+ if (src_in->sa_family == AF_INET) { >+ return addr4_resolve_remote((struct sockaddr_in *)src_in, >+ (struct sockaddr_in *)dst_in, addr); >+ } else curly brace use is inconsistent here - not needed on the if portion - Sean From rdreier at cisco.com Mon Aug 11 13:55:04 2008 From: rdreier at cisco.com (Roland Dreier) Date: Mon, 11 Aug 2008 13:55:04 -0700 Subject: [ofa-general] [PATCH v2] ipiob: fix rtnl deadlock In-Reply-To: <32cb786f0808090538j272842b1r5117547cccde0d06@mail.gmail.com> (Yosef Etigin's message of "Sat, 9 Aug 2008 15:38:52 +0300") References: <4899CF0A.1060509@Voltaire.COM> <32cb786f0808081155o19f8fb9dm217cd6996dffa3e5@mail.gmail.com> <32cb786f0808090538j272842b1r5117547cccde0d06@mail.gmail.com> Message-ID: > > I don't think moving to a different workqueue helps, does it? Because > > we just have to flush *that* workqueue somewhere too. > Yes, but it won't have to be from ipoib_stop, it can be from a place > where rtnl_lock is not held. That's kind of the direction I've been looking, except I don't think we need to invent a new workqueue to do this. It seems that ipoib_stop is the wrong place to flush our workqueue in general. - R. 
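A rough userspace sketch of the deadlock discussed in the thread above; this is not ipoib code. The caller holds a lock (standing in for rtnl_lock) while it waits for a work item that needs the same lock. All names here (fake_rtnl, fake_join_task, run_task) are made up for illustration, and the work item uses pthread_mutex_trylock so that the demo reports the conflict instead of hanging.

```c
#include <pthread.h>

/* Hypothetical stand-in for the rtnl lock. */
static pthread_mutex_t fake_rtnl = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for the queued work item: it needs the lock. */
static void *fake_join_task(void *arg)
{
	int *got_lock = arg;

	/* trylock rather than lock, so a conflict is reported, not slept on */
	if (pthread_mutex_trylock(&fake_rtnl) == 0) {
		*got_lock = 1;
		pthread_mutex_unlock(&fake_rtnl);
	} else {
		*got_lock = 0;
	}
	return NULL;
}

/*
 * Run the work item and wait for it to finish (the "flush").
 * Returns 1 if the work item could take the lock, 0 if it could not,
 * and -1 if the worker thread could not be created.
 */
int run_task(int caller_holds_lock)
{
	pthread_t worker;
	int got_lock = -1;

	if (caller_holds_lock)
		pthread_mutex_lock(&fake_rtnl);

	if (pthread_create(&worker, NULL, fake_join_task, &got_lock) == 0)
		pthread_join(worker, NULL);

	if (caller_holds_lock)
		pthread_mutex_unlock(&fake_rtnl);

	return got_lock;
}
```

With the lock held by the caller, run_task(1) reports that the work item could not take it; with a blocking lock call in the work item, that wait would never finish. Waiting for the work from a path that does not hold the lock, as in run_task(0), avoids the conflict.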
From sean.hefty at intel.com Mon Aug 11 14:43:50 2008 From: sean.hefty at intel.com (Hefty, Sean) Date: Mon, 11 Aug 2008 14:43:50 -0700 Subject: [ofa-general] RE: PATCH Remove padding arrays from librdmacm In-Reply-To: <1218444362.21009.5.camel@linux-zn6t.site> References: <1218444362.21009.5.camel@linux-zn6t.site> Message-ID: thanks - applied From eli at dev.mellanox.co.il Mon Aug 11 15:35:24 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Tue, 12 Aug 2008 01:35:24 +0300 Subject: [ofa-general] [PATCH v2] ipiob: fix rtnl deadlock In-Reply-To: References: <4899CF0A.1060509@Voltaire.COM> <32cb786f0808081155o19f8fb9dm217cd6996dffa3e5@mail.gmail.com> <32cb786f0808090538j272842b1r5117547cccde0d06@mail.gmail.com> Message-ID: <20080811223524.GA24278@mtls03> How about the following approach? We mark with a flag the fact that we're being called from ipoib_stop and in that case we do not attempt to take the lock. drivers/infiniband/ulp/ipoib/ipoib.h | 1 + drivers/infiniband/ulp/ipoib/ipoib_main.c | 2 ++ drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 9 +++++++-- 3 files changed, 10 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index b0ffc9a..a2b5d8c 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -92,6 +92,7 @@ enum { IPOIB_FLAG_ADMIN_CM = 9, IPOIB_FLAG_UMCAST = 10, IPOIB_FLAG_CSUM = 11, + IPOIB_FLAG_STOPPING = 12, IPOIB_MAX_BACKOFF_SECONDS = 16, diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index f51201b..008b674 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -151,6 +151,7 @@ static int ipoib_stop(struct net_device *dev) ipoib_dbg(priv, "stopping interface\n"); + set_bit(IPOIB_FLAG_STOPPING, &priv->flags); clear_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags); napi_disable(&priv->napi); @@ -182,6 +183,7 @@ static int ipoib_stop(struct net_device 
*dev) mutex_unlock(&priv->vlan_mutex); } + clear_bit(IPOIB_FLAG_STOPPING, &priv->flags); return 0; } diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c index 8950e95..9b35188 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c @@ -576,9 +576,14 @@ void ipoib_mcast_join_task(struct work_struct *work) priv->mcast_mtu = IPOIB_UD_MTU(ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu)); if (!ipoib_cm_admin_enabled(dev)) { - rtnl_lock(); + int took_lock = 0; + if (!test_bit(IPOIB_FLAG_STOPPING, &priv->flags)) { + rtnl_lock(); + took_lock = 1; + } dev_set_mtu(dev, min(priv->mcast_mtu, priv->admin_mtu)); - rtnl_unlock(); + if (took_lock) + rtnl_unlock(); } ipoib_dbg_mcast(priv, "successfully joined all multicast groups\n"); -- 1.5.6.5 From keshetti.mahesh at gmail.com Mon Aug 11 22:13:50 2008 From: keshetti.mahesh at gmail.com (Keshetti Mahesh) Date: Tue, 12 Aug 2008 10:43:50 +0530 Subject: [ofa-general] OpenSM ran out of LIDs In-Reply-To: References: <829ded920808110432x151d594cs12f7834aeb66a6c6@mail.gmail.com> Message-ID: <829ded920808112213k429180bay9dce2db0add44112@mail.gmail.com> >> I am getting the below errors in OpenSM while simulating a large >> Infiniband network using 'ibsim'. > > What OpenSM version ? I am using "opensm-3.1.10", the same one which comes along with OFED-1.3. > > How are you calculating this ? Are you including switch LIDs in this ? One LID for each HCA port and one LID for each switch present in the network. > How many switches in this subnet ? Are you using a non 0 LMC ? No. I am using LMC=0 only. >> Also, opensm is not considering the guid2lid provided (with -x) at the time of >> starting. > > Is the guid2lid file in OSM_CACHE_DIR ? Yes. The guid2lid file is present in OSM_CACHE_DIR. Instead of reading from that file, OpenSM writes directly to the guid2lid file I have provided. 
-Mahesh From yosefe at voltaire.com Mon Aug 11 23:25:44 2008 From: yosefe at voltaire.com (Yosef Etigin) Date: Tue, 12 Aug 2008 09:25:44 +0300 Subject: [ofa-general] [PATCH v2] ipiob: fix rtnl deadlock In-Reply-To: <20080811223524.GA24278@mtls03> References: <4899CF0A.1060509@Voltaire.COM> <32cb786f0808081155o19f8fb9dm217cd6996dffa3e5@mail.gmail.com> <32cb786f0808090538j272842b1r5117547cccde0d06@mail.gmail.com> <20080811223524.GA24278@mtls03> Message-ID: <32cb786f0808112325r6f82a827g35be4fc516a10be0@mail.gmail.com> I don't think it will work, because the lock is taken way before your flag is set. If the lock is taken (by rtnl that calls ipoib_stop) and the flag is not set yet, ipoib_mcast_join_task() will still try to take the lock, and we have a deadlock. --Yossi > if (!ipoib_cm_admin_enabled(dev)) { > - rtnl_lock(); > + int took_lock = 0; > + if (!test_bit(IPOIB_FLAG_STOPPING, &priv->flags)) { > + rtnl_lock(); > + took_lock = 1; > + } > dev_set_mtu(dev, min(priv->mcast_mtu, priv->admin_mtu)); > - rtnl_unlock(); > + if (took_lock) > + rtnl_unlock(); > } From ogerlitz at voltaire.com Mon Aug 11 23:29:21 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Tue, 12 Aug 2008 09:29:21 +0300 Subject: [ofa-general] Re: low bi bw with qperf for hca loopback test In-Reply-To: <20080811145429.GA12481@cuprite.pathscale.com> References: <20080811145429.GA12481@cuprite.pathscale.com> Message-ID: <48A12DC1.9090502@voltaire.com> Johann George wrote: > Interesting results. I assume in the loopback case, you are running both instances of qperf on the same machine? I'm wondering if you are somehow CPU limited? If you give qperf the -v option, it will print out percentage CPU utilization. Note that in its printout, a cpu is a core so 150% utilization indicates 1.5 cores. Yes, I am running both instances on the same machine. > > I'm also wondering (although most unlikely) if somehow both instances of qperf are somehow stuck on the same CPU. 
You can use the -la and -ra options to set the processor affinities of the client and server. I used the -la and -ra directives to place each instance on a different CPU. It's a dual-CPU, quad-core machine, so I placed the client on cpu 0 and the server on cpu 4. It didn't help. I also tried polling (-P 1), which did increase the CPU utilization dramatically, but it didn't help either. I have a feeling that for some reason the CPU is the bottleneck in this test. Just to make sure, I also made two more runs, one with -ar 0 and one with -ar 1. The latter reduced the bw from 2.5 GB/s to 1.5 GB/s, so I concluded that in all the earlier runs, where I didn't specify any "access-receive" directive, the data was not touched. The 2.5 GB/s figure is suspiciously close to the CPU frequency of this system, which is 2.66 GHz. Also, if I run one instance of the stream RAM benchmark I get 2.6 GB/s BW, so that's further evidence that qperf rc_bi_bw, when run in a loopback config, has some bottleneck which is not related to the HCA or the PCI bus / bridge... > Finally, rc_bw and rc_bi_bw determine bandwidth using Send/Receives. I believe that the ib_rdma_bw utility uses RDMA Writes. For the equivalent test, use the qperf rc_rdma_write_bw test. mmm, it's not really possible .... since qperf doesn't have a rdma_write_bi_bw test and ib_send_bw doesn't really work with the -b directive. Or. 
From PHF at zurich.ibm.com Tue Aug 12 00:25:09 2008 From: PHF at zurich.ibm.com (Philip Frey1) Date: Tue, 12 Aug 2008 09:25:09 +0200 Subject: [ofa-general] cxgb3: MW Support Message-ID: Are memory windows supported by the Chelsio T3 adapter? If so, what is the verb to bind/unbind them? Many thanks, Philip From jean-vincent.ficet at bull.net Tue Aug 12 00:54:17 2008 From: jean-vincent.ficet at bull.net (Vincent Ficet) Date: Tue, 12 Aug 2008 09:54:17 +0200 Subject: [ofa-general] opensm error when running ibsim Message-ID: <48A141A9.4090707@bull.net> Hello, When running the latest opensm with ibsim, I get the following error: [user at host:~ ] ibsim -s ibsim/net-examples/net.1 [ ... ] [user at host:~ ] export LD_PRELOAD=/path/to/libumad2sim.so [user at host:~ ] opensm -f - ... OpenSM 3.2.2 ibwarn: [32336] sim_connect: attached as client 2 at node "Switch1" Entering DISCOVERING state Error from osm_opensm_bind (0x2A) Perhaps another instance of OpenSM is already running Exiting SM However, no other instance of opensm was running at that time. Digging further, I realised that commenting out the 'guid' entry in opensm.conf fixed this issue. 
==> Shouldn't opensm have some mechanism that excludes this parameter when a connection to the sim:ctl socket is made ? Cheers, Vincent From vlad at lists.openfabrics.org Tue Aug 12 02:52:53 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Tue, 12 Aug 2008 02:52:53 -0700 (PDT) Subject: [ofa-general] ofa_1_4_kernel 20080812-0200 daily build status Message-ID: <20080812095253.E7800E60CD2@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_4/linux-2.6.git git_branch: ofed_kernel Common build parameters: Passed: Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.16 Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.17 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.22.5-31-default Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on ia64 with linux-2.6.16 Passed on ia64 with linux-2.6.17 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.26 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.25 Passed on ppc64 with linux-2.6.16 Passed on ppc64 with linux-2.6.17 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.24 Passed on ppc64 with linux-2.6.19 Failed: Build failed on x86_64 with linux-2.6.16.43-0.3-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, 
from /home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.43-0.3-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.43-0.3-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.43-0.3-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.16.21-0.8-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.21-0.8-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** 
[_module_/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.21-0.8-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.21-0.8-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.16.60-0.21-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.60-0.21-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.60-0.21-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.60-0.21-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-53.el5 Log: from /home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-53.el5_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) 
make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-53.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-53.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-8.el5 Log: from /home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-8.el5_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-8.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-8.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-93.el5 Log: from 
/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-93.el5_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-93.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-93.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-55.ELsmp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: include/linux/skbuff.h: In function 'skb_add_data': include/linux/skbuff.h:1041: warning: pointer targets in passing argument 2 of 'csum_partial_copy_from_user' differ in signedness make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.9-55.ELsmp_x86_64_check] Error 2 make[1]: Leaving 
directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-55.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-42.ELsmp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: include/linux/skbuff.h: In function 'skb_add_data': include/linux/skbuff.h:1013: warning: pointer targets in passing argument 2 of 'csum_partial_copy_from_user' differ in signedness make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.9-42.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-42.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.16.21-0.8-default Log: from /home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.21-0.8-default_ia64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.16.21-0.8-default_ia64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.16.21-0.8-default' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.18-8.el5 Log: from /home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-8.el5_ppc64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080812-0200_linux-2.6.18-8.el5_ppc64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.18-8.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- From keshetti.mahesh at gmail.com Tue Aug 12 04:55:07 2008 From: keshetti.mahesh at gmail.com (Keshetti Mahesh) Date: Tue, 12 Aug 2008 17:25:07 +0530 Subject: [ofa-general] Re: opensm error when running ibsim Message-ID: <829ded920808120455n40a8434fje0813501e996870b@mail.gmail.com> > [user at 
host:~ ] opensm -f - How does "opensm.conf" come into the picture when you are running 'opensm' from the command line ? FYI, there is a good script "run_opensm.sh" in "/usr/share/doc/ibsim-0.4/scripts/run_opensm.sh" to start OpenSM with 'ibsim'. -Mahesh From jean-vincent.ficet at bull.net Tue Aug 12 05:03:58 2008 From: jean-vincent.ficet at bull.net (Vincent Ficet) Date: Tue, 12 Aug 2008 14:03:58 +0200 Subject: [ofa-general] Re: opensm error when running ibsim In-Reply-To: <829ded920808120455n40a8434fje0813501e996870b@mail.gmail.com> References: <829ded920808120455n40a8434fje0813501e996870b@mail.gmail.com> Message-ID: <48A17C2E.7000504@bull.net> Hello, > How does "opensm.conf" come into the picture when you are running > 'opensm' from the command line ? > The defines for the default config files come from management/opensm/include/config.h, which is generated by autoconf. In fact, it's easy to see which files are being used by running strace as follows: [user at host] git > strace -e trace=open opensm -f - 2>&1 | grep opensm.conf open("/home_nfs/vficet/work/infiniband/etc/opensm/opensm.conf", O_RDONLY) = 3 Reading Cached Option File: /home_nfs/vficet/work/infiniband/etc/opensm/opensm.conf > FYI, there is a good script "run_opensm.sh" in > "/usr/share/doc/ibsim-0.4/scripts/run_opensm.sh" > to start OpenSM with 'ibsim'. > Sounds good. I'll try it ! 
Cheers, Vincent From tziporet at dev.mellanox.co.il Tue Aug 12 05:38:08 2008 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Tue, 12 Aug 2008 15:38:08 +0300 Subject: [ofa-general] [PATCH v2] ipiob: fix rtnl deadlock In-Reply-To: <32cb786f0808112325r6f82a827g35be4fc516a10be0@mail.gmail.com> References: <4899CF0A.1060509@Voltaire.COM> <32cb786f0808081155o19f8fb9dm217cd6996dffa3e5@mail.gmail.com> <32cb786f0808090538j272842b1r5117547cccde0d06@mail.gmail.com> <20080811223524.GA24278@mtls03> <32cb786f0808112325r6f82a827g35be4fc516a10be0@mail.gmail.com> Message-ID: <48A18430.2060802@mellanox.co.il> Yosef Etigin wrote: > I don't think it will work, because the lock is taken way before your > flag is set. > If the lock is taken (by rtnl that calls ipoib_stop) and the flag is > not set yet, > ipoib_mcast_join_task() will still try to take the lock, and we have a deadlock. > Can you test it? Tziporet > > From alexs at linux.vnet.ibm.com Tue Aug 12 05:49:36 2008 From: alexs at linux.vnet.ibm.com (Alexander Schmidt) Date: Tue, 12 Aug 2008 14:49:36 +0200 Subject: [ofa-general] [PATCH 2/5] ib/ehca: rename goto label In-Reply-To: <20080812143542.0cbde755@BL3D1974.boeblingen.de.ibm.com> References: <20080812143542.0cbde755@BL3D1974.boeblingen.de.ibm.com> Message-ID: <20080812144936.0ed20333@BL3D1974.boeblingen.de.ibm.com> Rename the "poll_cq_one_read_cqe" goto label to what it actually does, "repoll". 
Signed-off-by: Alexander Schmidt --- drivers/infiniband/hw/ehca/ehca_reqs.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) --- infiniband.git.orig/drivers/infiniband/hw/ehca/ehca_reqs.c +++ infiniband.git/drivers/infiniband/hw/ehca/ehca_reqs.c @@ -589,7 +589,7 @@ static inline int ehca_poll_cq_one(struc struct ehca_qp *my_qp; int cqe_count = 0, is_error; -poll_cq_one_read_cqe: +repoll: cqe = (struct ehca_cqe *) ipz_qeit_get_inc_valid(&my_cq->ipz_queue); if (!cqe) { @@ -617,7 +617,7 @@ poll_cq_one_read_cqe: ehca_dmp(cqe, 64, "cq_num=%x qp_num=%x", my_cq->cq_number, cqe->local_qp_number); /* ignore this purged cqe */ - goto poll_cq_one_read_cqe; + goto repoll; } spin_lock_irqsave(&qp->spinlock_s, flags); purgeflag = qp->sqerr_purgeflag; @@ -636,7 +636,7 @@ poll_cq_one_read_cqe: * that caused sqe and turn off purge flag */ qp->sqerr_purgeflag = 0; - goto poll_cq_one_read_cqe; + goto repoll; } } From alexs at linux.vnet.ibm.com Tue Aug 12 05:49:41 2008 From: alexs at linux.vnet.ibm.com (Alexander Schmidt) Date: Tue, 12 Aug 2008 14:49:41 +0200 Subject: [ofa-general] [PATCH 3/5] ib/ehca: repoll on invalid opcode In-Reply-To: <20080812143542.0cbde755@BL3D1974.boeblingen.de.ibm.com> References: <20080812143542.0cbde755@BL3D1974.boeblingen.de.ibm.com> Message-ID: <20080812144941.7c89c18d@BL3D1974.boeblingen.de.ibm.com> When the ehca driver detects an invalid opcode in a CQE, it currently passes the CQE to the application and returns with success. This patch changes the CQE handling to discard CQEs with invalid opcodes and to continue reading the next CQE from the CQ. Signed-off-by: Alexander Schmidt --- drivers/infiniband/hw/ehca/ehca_reqs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- infiniband.git.orig/drivers/infiniband/hw/ehca/ehca_reqs.c +++ infiniband.git/drivers/infiniband/hw/ehca/ehca_reqs.c @@ -667,7 +667,7 @@ repoll: ehca_dmp(cqe, 64, "ehca_cq=%p cq_num=%x", my_cq, my_cq->cq_number); /* update also queue adder to throw away this entry!!! 
*/ - goto poll_cq_one_exit0; + goto repoll; } /* eval ib_wc_status */ From alexs at linux.vnet.ibm.com Tue Aug 12 05:49:31 2008 From: alexs at linux.vnet.ibm.com (Alexander Schmidt) Date: Tue, 12 Aug 2008 14:49:31 +0200 Subject: [ofa-general] [PATCH 1/5] ib/ehca: update qp_state on cached modify_qp() In-Reply-To: <20080812143542.0cbde755@BL3D1974.boeblingen.de.ibm.com> References: <20080812143542.0cbde755@BL3D1974.boeblingen.de.ibm.com> Message-ID: <20080812144931.00365782@BL3D1974.boeblingen.de.ibm.com> Since the introduction of the port auto-detect mode for ehca, calls to modify_qp() may be cached in the device driver when the ports are not activated yet. When a modify_qp() call is cached, the qp state remains untouched until the port is activated, which will leave the qp in the reset state. In the reset state, however, it is not allowed to post SQ WQEs, which confuses applications like ib_mad. The solution for this problem is to immediately set the qp state as requested by modify_qp(), even when the call is cached. 
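The behaviour described above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: the type and function names (model_qp, model_hw_modify, model_modify_qp, ATTR_QP_STATE) are hypothetical stand-ins and not the ehca driver's identifiers, and the hardware call is stubbed out to always succeed. What it shows is the control flow the description asks for: the cached path and the normal path both reach a common exit where the visible QP state is updated on success.

```c
#include <stdbool.h>

/* Hypothetical userspace model; these names are not the ehca driver's. */
enum qp_state { QP_RESET, QP_INIT, QP_RTR, QP_RTS };

#define ATTR_QP_STATE 0x1 /* stands in for IB_QP_STATE */

struct model_qp {
	enum qp_state state;        /* state visible to callers */
	bool have_cached_request;   /* a modify is waiting for port activation */
	enum qp_state cached_state; /* state requested by the cached modify */
};

/* Stands in for internal_modify_qp(); assumed to always succeed here. */
static int model_hw_modify(struct model_qp *qp, enum qp_state new_state)
{
	(void)qp;
	(void)new_state;
	return 0;
}

static int model_modify_qp(struct model_qp *qp, enum qp_state new_state,
			   int attr_mask, bool port_active)
{
	int ret = 0;

	if (!port_active) {
		/* Port not activated yet: remember the request for later. */
		qp->have_cached_request = true;
		qp->cached_state = new_state;
		goto out;
	}

	ret = model_hw_modify(qp, new_state);

out:
	/* Update the visible state on success, cached or not. */
	if (ret == 0 && (attr_mask & ATTR_QP_STATE))
		qp->state = new_state;

	return ret;
}
```

With this shape, a caller that moves a QP out of reset while the port is still inactive sees the requested state immediately, which is what a consumer such as ib_mad needs before it posts send work requests.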
Signed-off-by: Alexander Schmidt --- drivers/infiniband/hw/ehca/ehca_qp.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-) --- infiniband.git.orig/drivers/infiniband/hw/ehca/ehca_qp.c +++ infiniband.git/drivers/infiniband/hw/ehca/ehca_qp.c @@ -1534,8 +1534,6 @@ static int internal_modify_qp(struct ib_ if (attr_mask & IB_QP_QKEY) my_qp->qkey = attr->qkey; - my_qp->state = qp_new_state; - modify_qp_exit2: if (squeue_locked) { /* this means: sqe -> rts */ spin_unlock_irqrestore(&my_qp->spinlock_s, flags); @@ -1551,6 +1549,8 @@ modify_qp_exit1: int ehca_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, int attr_mask, struct ib_udata *udata) { + int ret = 0; + struct ehca_shca *shca = container_of(ibqp->device, struct ehca_shca, ib_device); struct ehca_qp *my_qp = container_of(ibqp, struct ehca_qp, ib_qp); @@ -1597,12 +1597,18 @@ int ehca_modify_qp(struct ib_qp *ibqp, s attr->qp_state, my_qp->init_attr.port_num, ibqp->qp_type); spin_unlock_irqrestore(&sport->mod_sqp_lock, flags); - return 0; + goto out; } spin_unlock_irqrestore(&sport->mod_sqp_lock, flags); } - return internal_modify_qp(ibqp, attr, attr_mask, 0); + ret = internal_modify_qp(ibqp, attr, attr_mask, 0); + +out: + if ((ret == 0) && (attr_mask & IB_QP_STATE)) + my_qp->state = attr->qp_state; + + return ret; } void ehca_recover_sqp(struct ib_qp *sqp) From alexs at linux.vnet.ibm.com Tue Aug 12 05:49:46 2008 From: alexs at linux.vnet.ibm.com (Alexander Schmidt) Date: Tue, 12 Aug 2008 14:49:46 +0200 Subject: [ofa-general] [PATCH 4/5] ib/ehca: check idr_find() return value In-Reply-To: <20080812143542.0cbde755@BL3D1974.boeblingen.de.ibm.com> References: <20080812143542.0cbde755@BL3D1974.boeblingen.de.ibm.com> Message-ID: <20080812144946.5e2499f8@BL3D1974.boeblingen.de.ibm.com> The idr_find() function may fail when trying to get the QP that is associated with a CQE, e.g. when a QP has been destroyed between the generation of a CQE and the poll request for it. 
In consequence, the return value of idr_find() must be checked and the CQE must be discarded when the QP cannot be found. Signed-off-by: Alexander Schmidt --- drivers/infiniband/hw/ehca/ehca_reqs.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) --- infiniband.git.orig/drivers/infiniband/hw/ehca/ehca_reqs.c +++ infiniband.git/drivers/infiniband/hw/ehca/ehca_reqs.c @@ -680,8 +680,10 @@ repoll: read_lock(&ehca_qp_idr_lock); my_qp = idr_find(&ehca_qp_idr, cqe->qp_token); - wc->qp = &my_qp->ib_qp; read_unlock(&ehca_qp_idr_lock); + if (!my_qp) + goto repoll; + wc->qp = &my_qp->ib_qp; wc->byte_len = cqe->nr_bytes_transferred; wc->pkey_index = cqe->pkey_index; From alexs at linux.vnet.ibm.com Tue Aug 12 05:49:52 2008 From: alexs at linux.vnet.ibm.com (Alexander Schmidt) Date: Tue, 12 Aug 2008 14:49:52 +0200 Subject: [ofa-general] [PATCH 5/5] ib/ehca: discard double CQE for one WR In-Reply-To: <20080812143542.0cbde755@BL3D1974.boeblingen.de.ibm.com> References: <20080812143542.0cbde755@BL3D1974.boeblingen.de.ibm.com> Message-ID: <20080812144952.36426c55@BL3D1974.boeblingen.de.ibm.com> Under rare circumstances, the ehca hardware might erroneously generate two CQEs for the same WQE, which does not comply with the IB spec and will cause unpredictable errors like memory being freed twice. To avoid this problem, the driver needs to detect the second CQE and discard it. For this purpose, introduce an array holding as many elements as the SQ of the QP, called sq_map. Each sq_map entry stores a "reported" flag for one WQE in the SQ. When a work request is posted to the SQ, the respective "reported" flag is set to zero. After the arrival of a CQE, the flag is set to 1, which makes it possible to detect the occurrence of a second CQE. The mapping between WQE / CQE and the corresponding sq_map element is implemented by replacing the lowest 16 Bits of the wr_id with the index in the queue map. 
The original 16 Bits are stored in the sq_map entry and are restored when the CQE is passed to the application. Signed-off-by: Alexander Schmidt --- drivers/infiniband/hw/ehca/ehca_classes.h | 9 +++++ drivers/infiniband/hw/ehca/ehca_qes.h | 1 drivers/infiniband/hw/ehca/ehca_qp.c | 34 +++++++++++++----- drivers/infiniband/hw/ehca/ehca_reqs.c | 54 +++++++++++++++++++++++------- 4 files changed, 78 insertions(+), 20 deletions(-) --- infiniband.git.orig/drivers/infiniband/hw/ehca/ehca_reqs.c +++ infiniband.git/drivers/infiniband/hw/ehca/ehca_reqs.c @@ -139,6 +139,7 @@ static void trace_send_wr_ud(const struc static inline int ehca_write_swqe(struct ehca_qp *qp, struct ehca_wqe *wqe_p, const struct ib_send_wr *send_wr, + u32 sq_map_idx, int hidden) { u32 idx; @@ -157,7 +158,11 @@ static inline int ehca_write_swqe(struct /* clear wqe header until sglist */ memset(wqe_p, 0, offsetof(struct ehca_wqe, u.ud_av.sg_list)); - wqe_p->work_request_id = send_wr->wr_id; + wqe_p->work_request_id = send_wr->wr_id & ~QMAP_IDX_MASK; + wqe_p->work_request_id |= sq_map_idx & QMAP_IDX_MASK; + + qp->sq_map[sq_map_idx].app_wr_id = send_wr->wr_id & QMAP_IDX_MASK; + qp->sq_map[sq_map_idx].reported = 0; switch (send_wr->opcode) { case IB_WR_SEND: @@ -381,6 +386,7 @@ static inline int post_one_send(struct e { struct ehca_wqe *wqe_p; int ret; + u32 sq_map_idx; u64 start_offset = my_qp->ipz_squeue.current_q_offset; /* get pointer next to free WQE */ @@ -393,8 +399,15 @@ static inline int post_one_send(struct e "qp_num=%x", my_qp->ib_qp.qp_num); return -ENOMEM; } + + /* + * Get the index of the WQE in the send queue. The same index is used + * for writing into the sq_map. 
+ */ + sq_map_idx = start_offset / my_qp->ipz_squeue.qe_size; + /* write a SEND WQE into the QUEUE */ - ret = ehca_write_swqe(my_qp, wqe_p, cur_send_wr, hidden); + ret = ehca_write_swqe(my_qp, wqe_p, cur_send_wr, sq_map_idx, hidden); /* * if something failed, * reset the free entry pointer to the start value @@ -654,8 +667,34 @@ repoll: my_cq, my_cq->cq_number); } - /* we got a completion! */ - wc->wr_id = cqe->work_request_id; + read_lock(&ehca_qp_idr_lock); + my_qp = idr_find(&ehca_qp_idr, cqe->qp_token); + read_unlock(&ehca_qp_idr_lock); + if (!my_qp) + goto repoll; + wc->qp = &my_qp->ib_qp; + + if (!(cqe->w_completion_flags & WC_SEND_RECEIVE_BIT)) { + struct ehca_qmap_entry *qmap_entry; + /* + * We got a send completion and need to restore the original + * wr_id. + */ + qmap_entry = &my_qp->sq_map[cqe->work_request_id & + QMAP_IDX_MASK]; + + if (qmap_entry->reported) { + ehca_warn(cq->device, "Double cqe on qp_num=%#x", + my_qp->real_qp_num); + /* found a double cqe, discard it and read next one */ + goto repoll; + } + wc->wr_id = cqe->work_request_id & ~QMAP_IDX_MASK; + wc->wr_id |= qmap_entry->app_wr_id; + qmap_entry->reported = 1; + } else + /* We got a receive completion. 
*/ + wc->wr_id = cqe->work_request_id; /* eval ib_wc_opcode */ wc->opcode = ib_wc_opcode[cqe->optype]-1; @@ -678,13 +717,6 @@ repoll: } else wc->status = IB_WC_SUCCESS; - read_lock(&ehca_qp_idr_lock); - my_qp = idr_find(&ehca_qp_idr, cqe->qp_token); - read_unlock(&ehca_qp_idr_lock); - if (!my_qp) - goto repoll; - wc->qp = &my_qp->ib_qp; - wc->byte_len = cqe->nr_bytes_transferred; wc->pkey_index = cqe->pkey_index; wc->slid = cqe->rlid; --- infiniband.git.orig/drivers/infiniband/hw/ehca/ehca_classes.h +++ infiniband.git/drivers/infiniband/hw/ehca/ehca_classes.h @@ -156,6 +156,14 @@ struct ehca_mod_qp_parm { #define EHCA_MOD_QP_PARM_MAX 4 +#define QMAP_IDX_MASK 0xFFFFULL + +/* struct for tracking if cqes have been reported to the application */ +struct ehca_qmap_entry { + u16 app_wr_id; + u16 reported; +}; + struct ehca_qp { union { struct ib_qp ib_qp; @@ -165,6 +173,7 @@ struct ehca_qp { enum ehca_ext_qp_type ext_type; enum ib_qp_state state; struct ipz_queue ipz_squeue; + struct ehca_qmap_entry *sq_map; struct ipz_queue ipz_rqueue; struct h_galpas galpas; u32 qkey; --- infiniband.git.orig/drivers/infiniband/hw/ehca/ehca_qp.c +++ infiniband.git/drivers/infiniband/hw/ehca/ehca_qp.c @@ -412,6 +412,7 @@ static struct ehca_qp *internal_create_q struct ehca_shca *shca = container_of(pd->device, struct ehca_shca, ib_device); struct ib_ucontext *context = NULL; + u32 nr_qes; u64 h_ret; int is_llqp = 0, has_srq = 0; int qp_type, max_send_sge, max_recv_sge, ret; @@ -715,6 +716,15 @@ static struct ehca_qp *internal_create_q "and pages ret=%i", ret); goto create_qp_exit2; } + nr_qes = my_qp->ipz_squeue.queue_length / + my_qp->ipz_squeue.qe_size; + my_qp->sq_map = vmalloc(nr_qes * + sizeof(struct ehca_qmap_entry)); + if (!my_qp->sq_map) { + ehca_err(pd->device, "Couldn't allocate squeue " + "map ret=%i", ret); + goto create_qp_exit3; + } } if (HAS_RQ(my_qp)) { @@ -724,7 +734,7 @@ static struct ehca_qp *internal_create_q if (ret) { ehca_err(pd->device, "Couldn't initialize rqueue 
" "and pages ret=%i", ret); - goto create_qp_exit3; + goto create_qp_exit4; } } @@ -770,7 +780,7 @@ static struct ehca_qp *internal_create_q if (!my_qp->mod_qp_parm) { ehca_err(pd->device, "Could not alloc mod_qp_parm"); - goto create_qp_exit4; + goto create_qp_exit5; } } } @@ -780,7 +790,7 @@ static struct ehca_qp *internal_create_q h_ret = ehca_define_sqp(shca, my_qp, init_attr); if (h_ret != H_SUCCESS) { ret = ehca2ib_return_code(h_ret); - goto create_qp_exit5; + goto create_qp_exit6; } } @@ -789,7 +799,7 @@ static struct ehca_qp *internal_create_q if (ret) { ehca_err(pd->device, "Couldn't assign qp to send_cq ret=%i", ret); - goto create_qp_exit5; + goto create_qp_exit6; } } @@ -815,22 +825,26 @@ static struct ehca_qp *internal_create_q if (ib_copy_to_udata(udata, &resp, sizeof resp)) { ehca_err(pd->device, "Copy to udata failed"); ret = -EINVAL; - goto create_qp_exit6; + goto create_qp_exit7; } } return my_qp; -create_qp_exit6: +create_qp_exit7: ehca_cq_unassign_qp(my_qp->send_cq, my_qp->real_qp_num); -create_qp_exit5: +create_qp_exit6: kfree(my_qp->mod_qp_parm); -create_qp_exit4: +create_qp_exit5: if (HAS_RQ(my_qp)) ipz_queue_dtor(my_pd, &my_qp->ipz_rqueue); +create_qp_exit4: + if (HAS_SQ(my_qp)) + vfree(my_qp->sq_map); + create_qp_exit3: if (HAS_SQ(my_qp)) ipz_queue_dtor(my_pd, &my_qp->ipz_squeue); @@ -1979,8 +1993,10 @@ static int internal_destroy_qp(struct ib if (HAS_RQ(my_qp)) ipz_queue_dtor(my_pd, &my_qp->ipz_rqueue); - if (HAS_SQ(my_qp)) + if (HAS_SQ(my_qp)) { ipz_queue_dtor(my_pd, &my_qp->ipz_squeue); + vfree(my_qp->sq_map); + } kmem_cache_free(qp_cache, my_qp); atomic_dec(&shca->num_qps); return 0; --- infiniband.git.orig/drivers/infiniband/hw/ehca/ehca_qes.h +++ infiniband.git/drivers/infiniband/hw/ehca/ehca_qes.h @@ -213,6 +213,7 @@ struct ehca_wqe { #define WC_STATUS_ERROR_BIT 0x80000000 #define WC_STATUS_REMOTE_ERROR_FLAGS 0x0000F800 #define WC_STATUS_PURGE_BIT 0x10 +#define WC_SEND_RECEIVE_BIT 0x80 struct ehca_cqe { u64 work_request_id; From 
alexs at linux.vnet.ibm.com Tue Aug 12 05:49:24 2008 From: alexs at linux.vnet.ibm.com (Alexander Schmidt) Date: Tue, 12 Aug 2008 14:49:24 +0200 Subject: [ofa-general] [PATCH 0/5] ib/ehca: Fix stability issues Message-ID: <20080812144924.4447933e@BL3D1974.boeblingen.de.ibm.com> Hi Roland, the following patchset contains four small fixes and one bigger patch (5/5) for addressing some ehca issues we found during cluster test. [1/5] update qp_state on cached modify_qp() [2/5] rename goto label in ehca_poll_cq_one() [3/5] repoll on invalid opcode instead of returning success [4/5] check idr_find() return value [5/5] discard double CQE for one WR They all apply on top of 2.6.27-rc1. If possible, we would like to get them into 2.6.27. Regards, Alexander Schmidt From eli at mellanox.co.il Tue Aug 12 06:07:38 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Tue, 12 Aug 2008 16:07:38 +0300 Subject: [ofa-general] [PATCH v2] ipiob: fix rtnl deadlock In-Reply-To: <48A18430.2060802@mellanox.co.il> References: <4899CF0A.1060509@Voltaire.COM> <32cb786f0808081155o19f8fb9dm217cd6996dffa3e5@mail.gmail.com> <32cb786f0808090538j272842b1r5117547cccde0d06@mail.gmail.com> <20080811223524.GA24278@mtls03> <32cb786f0808112325r6f82a827g35be4fc516a10be0@mail.gmail.com> <48A18430.2060802@mellanox.co.il> Message-ID: <1218546458.7135.66.camel@mtls03> On Tue, 2008-08-12 at 15:38 +0300, Tziporet Koren wrote: > Yosef Etigin wrote: > > I don't think it will work, because the lock is taken way before your > > flag is set. > > If the lock is taken (by rtnl that calls ipoib_stop) and the flag is > > not set yet, > > ipoib_mcast_join_task() will still try to take the lock, and we have a deadlock. > > > Can you test it? > Tziporet > > > I think the patch I sent is problematic so I don't see any point in testing it. Looks like this http://lists.openfabrics.org/pipermail/ewg/2008-August/007495.html will work. 
From alexs at linux.vnet.ibm.com Tue Aug 12 06:46:07 2008 From: alexs at linux.vnet.ibm.com (Alexander Schmidt) Date: Tue, 12 Aug 2008 15:46:07 +0200 Subject: [ofa-general] [PATCH 1/5 try2] ib/ehca: update qp_state on cached modify_qp() Message-ID: <200808121546.07212.alexs@linux.vnet.ibm.com> Since the introduction of the port auto-detect mode for ehca, calls to modify_qp() may be cached in the device driver when the ports are not activated yet. When a modify_qp() call is cached, the qp state remains untouched until the port is activated, which will leave the qp in the reset state. In the reset state, however, it is not allowed to post SQ WQEs, which confuses applications like ib_mad. The solution for this problem is to immediately set the qp state as requested by modify_qp(), even when the call is cached. Signed-off-by: Alexander Schmidt --- drivers/infiniband/hw/ehca/ehca_qp.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-) --- infiniband.git.orig/drivers/infiniband/hw/ehca/ehca_qp.c +++ infiniband.git/drivers/infiniband/hw/ehca/ehca_qp.c @@ -1534,8 +1534,6 @@ static int internal_modify_qp(struct ib_ if (attr_mask & IB_QP_QKEY) my_qp->qkey = attr->qkey; - my_qp->state = qp_new_state; - modify_qp_exit2: if (squeue_locked) { /* this means: sqe -> rts */ spin_unlock_irqrestore(&my_qp->spinlock_s, flags); @@ -1551,6 +1549,8 @@ modify_qp_exit1: int ehca_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, int attr_mask, struct ib_udata *udata) { + int ret = 0; + struct ehca_shca *shca = container_of(ibqp->device, struct ehca_shca, ib_device); struct ehca_qp *my_qp = container_of(ibqp, struct ehca_qp, ib_qp); @@ -1597,12 +1597,18 @@ int ehca_modify_qp(struct ib_qp *ibqp, s attr->qp_state, my_qp->init_attr.port_num, ibqp->qp_type); spin_unlock_irqrestore(&sport->mod_sqp_lock, flags); - return 0; + goto out; } spin_unlock_irqrestore(&sport->mod_sqp_lock, flags); } - return internal_modify_qp(ibqp, attr, attr_mask, 0); + ret = 
internal_modify_qp(ibqp, attr, attr_mask, 0); + +out: + if ((ret == 0) && (attr_mask & IB_QP_STATE)) + my_qp->state = attr->qp_state; + + return ret; } void ehca_recover_sqp(struct ib_qp *sqp) From alexs at linux.vnet.ibm.com Tue Aug 12 06:46:20 2008 From: alexs at linux.vnet.ibm.com (Alexander Schmidt) Date: Tue, 12 Aug 2008 15:46:20 +0200 Subject: [ofa-general] [PATCH 3/5 try2] ib/ehca: repoll on invalid opcode Message-ID: <200808121546.20787.alexs@linux.vnet.ibm.com> When the ehca driver detects an invalid opcode in a CQE, it currently passes the CQE to the application and returns with success. This patch changes the CQE handling to discard CQEs with invalid opcodes and to continue reading the next CQE from the CQ. Signed-off-by: Alexander Schmidt --- drivers/infiniband/hw/ehca/ehca_reqs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- infiniband.git.orig/drivers/infiniband/hw/ehca/ehca_reqs.c +++ infiniband.git/drivers/infiniband/hw/ehca/ehca_reqs.c @@ -667,7 +667,7 @@ repoll: ehca_dmp(cqe, 64, "ehca_cq=%p cq_num=%x", my_cq, my_cq->cq_number); /* update also queue adder to throw away this entry!!! */ - goto poll_cq_one_exit0; + goto repoll; } /* eval ib_wc_status */ From alexs at linux.vnet.ibm.com Tue Aug 12 06:46:13 2008 From: alexs at linux.vnet.ibm.com (Alexander Schmidt) Date: Tue, 12 Aug 2008 15:46:13 +0200 Subject: [ofa-general] [PATCH 2/5 try2] ib/ehca: rename goto label Message-ID: <200808121546.13860.alexs@linux.vnet.ibm.com> Rename the "poll_cq_one_read_cqe" goto label to what it actually does, "repoll". 
Signed-off-by: Alexander Schmidt --- drivers/infiniband/hw/ehca/ehca_reqs.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) --- infiniband.git.orig/drivers/infiniband/hw/ehca/ehca_reqs.c +++ infiniband.git/drivers/infiniband/hw/ehca/ehca_reqs.c @@ -589,7 +589,7 @@ static inline int ehca_poll_cq_one(struc struct ehca_qp *my_qp; int cqe_count = 0, is_error; -poll_cq_one_read_cqe: +repoll: cqe = (struct ehca_cqe *) ipz_qeit_get_inc_valid(&my_cq->ipz_queue); if (!cqe) { @@ -617,7 +617,7 @@ poll_cq_one_read_cqe: ehca_dmp(cqe, 64, "cq_num=%x qp_num=%x", my_cq->cq_number, cqe->local_qp_number); /* ignore this purged cqe */ - goto poll_cq_one_read_cqe; + goto repoll; } spin_lock_irqsave(&qp->spinlock_s, flags); purgeflag = qp->sqerr_purgeflag; @@ -636,7 +636,7 @@ poll_cq_one_read_cqe: * that caused sqe and turn off purge flag */ qp->sqerr_purgeflag = 0; - goto poll_cq_one_read_cqe; + goto repoll; } } From alexs at linux.vnet.ibm.com Tue Aug 12 06:46:30 2008 From: alexs at linux.vnet.ibm.com (Alexander Schmidt) Date: Tue, 12 Aug 2008 15:46:30 +0200 Subject: [ofa-general] [PATCH 5/5 try2] ib/ehca: discard double CQE for one WR Message-ID: <200808121546.31057.alexs@linux.vnet.ibm.com> Under rare circumstances, the ehca hardware might erroneously generate two CQEs for the same WQE, which does not comply with the IB spec and will cause unpredictable errors like memory being freed twice. To avoid this problem, the driver needs to detect the second CQE and discard it. For this purpose, introduce an array holding as many elements as the SQ of the QP, called sq_map. Each sq_map entry stores a "reported" flag for one WQE in the SQ. When a work request is posted to the SQ, the respective "reported" flag is set to zero. After the arrival of a CQE, the flag is set to 1, which makes it possible to detect the occurrence of a second CQE. 
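The "reported" flag scheme can be sketched as a small userspace model. The entry layout and QMAP_IDX_MASK follow the patch; the function names model_post and model_complete are hypothetical and exist only for this illustration. The sketch assumes, as the patch does, that the low 16 bits of the wr_id written to the hardware carry the send queue slot index, and that the caller's original low 16 bits are kept in the map entry until the completion is delivered.

```c
#include <stdint.h>

#define QMAP_IDX_MASK 0xFFFFULL

/* Same fields as struct ehca_qmap_entry in the patch, userspace types. */
struct qmap_entry {
	uint16_t app_wr_id; /* low 16 bits of the caller's wr_id */
	uint16_t reported;  /* 1 once a completion has been delivered */
};

/*
 * Post side (hypothetical helper): record the caller's low 16 bits,
 * clear the flag, and return the wr_id as it would be written to the
 * WQE, with the low 16 bits replaced by the slot index.
 */
static uint64_t model_post(struct qmap_entry *sq_map, uint32_t idx,
			   uint64_t wr_id)
{
	sq_map[idx].app_wr_id = (uint16_t)(wr_id & QMAP_IDX_MASK);
	sq_map[idx].reported = 0;
	return (wr_id & ~QMAP_IDX_MASK) | (idx & QMAP_IDX_MASK);
}

/*
 * Poll side (hypothetical helper): returns 1 and restores the caller's
 * wr_id for a first completion, or 0 if the slot was already reported,
 * which is the double CQE case that has to be discarded.
 */
static int model_complete(struct qmap_entry *sq_map, uint64_t hw_wr_id,
			  uint64_t *wr_id_out)
{
	struct qmap_entry *e = &sq_map[hw_wr_id & QMAP_IDX_MASK];

	if (e->reported)
		return 0;

	*wr_id_out = (hw_wr_id & ~QMAP_IDX_MASK) | e->app_wr_id;
	e->reported = 1;
	return 1;
}
```

A second call to model_complete() with the same hardware wr_id returns 0, which corresponds to the "goto repoll" path in the driver. The model leaves out locking and the send/receive distinction that the driver makes with WC_SEND_RECEIVE_BIT.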
The mapping between WQE / CQE and the corresponding sq_map element is implemented by replacing the lowest 16 Bits of the wr_id with the index in the queue map. The original 16 Bits are stored in the sq_map entry and are restored when the CQE is passed to the application. Signed-off-by: Alexander Schmidt --- drivers/infiniband/hw/ehca/ehca_classes.h | 9 +++++ drivers/infiniband/hw/ehca/ehca_qes.h | 1 drivers/infiniband/hw/ehca/ehca_qp.c | 34 +++++++++++++----- drivers/infiniband/hw/ehca/ehca_reqs.c | 54 +++++++++++++++++++++++------- 4 files changed, 78 insertions(+), 20 deletions(-) --- infiniband.git.orig/drivers/infiniband/hw/ehca/ehca_reqs.c +++ infiniband.git/drivers/infiniband/hw/ehca/ehca_reqs.c @@ -139,6 +139,7 @@ static void trace_send_wr_ud(const struc static inline int ehca_write_swqe(struct ehca_qp *qp, struct ehca_wqe *wqe_p, const struct ib_send_wr *send_wr, + u32 sq_map_idx, int hidden) { u32 idx; @@ -157,7 +158,11 @@ static inline int ehca_write_swqe(struct /* clear wqe header until sglist */ memset(wqe_p, 0, offsetof(struct ehca_wqe, u.ud_av.sg_list)); - wqe_p->work_request_id = send_wr->wr_id; + wqe_p->work_request_id = send_wr->wr_id & ~QMAP_IDX_MASK; + wqe_p->work_request_id |= sq_map_idx & QMAP_IDX_MASK; + + qp->sq_map[sq_map_idx].app_wr_id = send_wr->wr_id & QMAP_IDX_MASK; + qp->sq_map[sq_map_idx].reported = 0; switch (send_wr->opcode) { case IB_WR_SEND: @@ -381,6 +386,7 @@ static inline int post_one_send(struct e { struct ehca_wqe *wqe_p; int ret; + u32 sq_map_idx; u64 start_offset = my_qp->ipz_squeue.current_q_offset; /* get pointer next to free WQE */ @@ -393,8 +399,15 @@ static inline int post_one_send(struct e "qp_num=%x", my_qp->ib_qp.qp_num); return -ENOMEM; } + + /* + * Get the index of the WQE in the send queue. The same index is used + * for writing into the sq_map. 
+ */ + sq_map_idx = start_offset / my_qp->ipz_squeue.qe_size; + /* write a SEND WQE into the QUEUE */ - ret = ehca_write_swqe(my_qp, wqe_p, cur_send_wr, hidden); + ret = ehca_write_swqe(my_qp, wqe_p, cur_send_wr, sq_map_idx, hidden); /* * if something failed, * reset the free entry pointer to the start value @@ -654,8 +667,34 @@ repoll: my_cq, my_cq->cq_number); } - /* we got a completion! */ - wc->wr_id = cqe->work_request_id; + read_lock(&ehca_qp_idr_lock); + my_qp = idr_find(&ehca_qp_idr, cqe->qp_token); + read_unlock(&ehca_qp_idr_lock); + if (!my_qp) + goto repoll; + wc->qp = &my_qp->ib_qp; + + if (!(cqe->w_completion_flags & WC_SEND_RECEIVE_BIT)) { + struct ehca_qmap_entry *qmap_entry; + /* + * We got a send completion and need to restore the original + * wr_id. + */ + qmap_entry = &my_qp->sq_map[cqe->work_request_id & + QMAP_IDX_MASK]; + + if (qmap_entry->reported) { + ehca_warn(cq->device, "Double cqe on qp_num=%#x", + my_qp->real_qp_num); + /* found a double cqe, discard it and read next one */ + goto repoll; + } + wc->wr_id = cqe->work_request_id & ~QMAP_IDX_MASK; + wc->wr_id |= qmap_entry->app_wr_id; + qmap_entry->reported = 1; + } else + /* We got a receive completion. 
*/ + wc->wr_id = cqe->work_request_id; /* eval ib_wc_opcode */ wc->opcode = ib_wc_opcode[cqe->optype]-1; @@ -678,13 +717,6 @@ repoll: } else wc->status = IB_WC_SUCCESS; - read_lock(&ehca_qp_idr_lock); - my_qp = idr_find(&ehca_qp_idr, cqe->qp_token); - read_unlock(&ehca_qp_idr_lock); - if (!my_qp) - goto repoll; - wc->qp = &my_qp->ib_qp; - wc->byte_len = cqe->nr_bytes_transferred; wc->pkey_index = cqe->pkey_index; wc->slid = cqe->rlid; --- infiniband.git.orig/drivers/infiniband/hw/ehca/ehca_classes.h +++ infiniband.git/drivers/infiniband/hw/ehca/ehca_classes.h @@ -156,6 +156,14 @@ struct ehca_mod_qp_parm { #define EHCA_MOD_QP_PARM_MAX 4 +#define QMAP_IDX_MASK 0xFFFFULL + +/* struct for tracking if cqes have been reported to the application */ +struct ehca_qmap_entry { + u16 app_wr_id; + u16 reported; +}; + struct ehca_qp { union { struct ib_qp ib_qp; @@ -165,6 +173,7 @@ struct ehca_qp { enum ehca_ext_qp_type ext_type; enum ib_qp_state state; struct ipz_queue ipz_squeue; + struct ehca_qmap_entry *sq_map; struct ipz_queue ipz_rqueue; struct h_galpas galpas; u32 qkey; --- infiniband.git.orig/drivers/infiniband/hw/ehca/ehca_qp.c +++ infiniband.git/drivers/infiniband/hw/ehca/ehca_qp.c @@ -412,6 +412,7 @@ static struct ehca_qp *internal_create_q struct ehca_shca *shca = container_of(pd->device, struct ehca_shca, ib_device); struct ib_ucontext *context = NULL; + u32 nr_qes; u64 h_ret; int is_llqp = 0, has_srq = 0; int qp_type, max_send_sge, max_recv_sge, ret; @@ -715,6 +716,15 @@ static struct ehca_qp *internal_create_q "and pages ret=%i", ret); goto create_qp_exit2; } + nr_qes = my_qp->ipz_squeue.queue_length / + my_qp->ipz_squeue.qe_size; + my_qp->sq_map = vmalloc(nr_qes * + sizeof(struct ehca_qmap_entry)); + if (!my_qp->sq_map) { + ehca_err(pd->device, "Couldn't allocate squeue " + "map ret=%i", ret); + goto create_qp_exit3; + } } if (HAS_RQ(my_qp)) { @@ -724,7 +734,7 @@ static struct ehca_qp *internal_create_q if (ret) { ehca_err(pd->device, "Couldn't initialize rqueue 
" "and pages ret=%i", ret); - goto create_qp_exit3; + goto create_qp_exit4; } } @@ -770,7 +780,7 @@ static struct ehca_qp *internal_create_q if (!my_qp->mod_qp_parm) { ehca_err(pd->device, "Could not alloc mod_qp_parm"); - goto create_qp_exit4; + goto create_qp_exit5; } } } @@ -780,7 +790,7 @@ static struct ehca_qp *internal_create_q h_ret = ehca_define_sqp(shca, my_qp, init_attr); if (h_ret != H_SUCCESS) { ret = ehca2ib_return_code(h_ret); - goto create_qp_exit5; + goto create_qp_exit6; } } @@ -789,7 +799,7 @@ static struct ehca_qp *internal_create_q if (ret) { ehca_err(pd->device, "Couldn't assign qp to send_cq ret=%i", ret); - goto create_qp_exit5; + goto create_qp_exit6; } } @@ -815,22 +825,26 @@ static struct ehca_qp *internal_create_q if (ib_copy_to_udata(udata, &resp, sizeof resp)) { ehca_err(pd->device, "Copy to udata failed"); ret = -EINVAL; - goto create_qp_exit6; + goto create_qp_exit7; } } return my_qp; -create_qp_exit6: +create_qp_exit7: ehca_cq_unassign_qp(my_qp->send_cq, my_qp->real_qp_num); -create_qp_exit5: +create_qp_exit6: kfree(my_qp->mod_qp_parm); -create_qp_exit4: +create_qp_exit5: if (HAS_RQ(my_qp)) ipz_queue_dtor(my_pd, &my_qp->ipz_rqueue); +create_qp_exit4: + if (HAS_SQ(my_qp)) + vfree(my_qp->sq_map); + create_qp_exit3: if (HAS_SQ(my_qp)) ipz_queue_dtor(my_pd, &my_qp->ipz_squeue); @@ -1979,8 +1993,10 @@ static int internal_destroy_qp(struct ib if (HAS_RQ(my_qp)) ipz_queue_dtor(my_pd, &my_qp->ipz_rqueue); - if (HAS_SQ(my_qp)) + if (HAS_SQ(my_qp)) { ipz_queue_dtor(my_pd, &my_qp->ipz_squeue); + vfree(my_qp->sq_map); + } kmem_cache_free(qp_cache, my_qp); atomic_dec(&shca->num_qps); return 0; --- infiniband.git.orig/drivers/infiniband/hw/ehca/ehca_qes.h +++ infiniband.git/drivers/infiniband/hw/ehca/ehca_qes.h @@ -213,6 +213,7 @@ struct ehca_wqe { #define WC_STATUS_ERROR_BIT 0x80000000 #define WC_STATUS_REMOTE_ERROR_FLAGS 0x0000F800 #define WC_STATUS_PURGE_BIT 0x10 +#define WC_SEND_RECEIVE_BIT 0x80 struct ehca_cqe { u64 work_request_id; From 
alexs at linux.vnet.ibm.com Tue Aug 12 06:46:27 2008 From: alexs at linux.vnet.ibm.com (Alexander Schmidt) Date: Tue, 12 Aug 2008 15:46:27 +0200 Subject: [ofa-general] [PATCH 4/5 try2] ib/ehca: check idr_find() return value Message-ID: <200808121546.27705.alexs@linux.vnet.ibm.com> The idr_find() function may fail when trying to get the QP that is associated with a CQE, e.g. when a QP has been destroyed between the generation of a CQE and the poll request for it. In consequence, the return value of idr_find() must be checked and the CQE must be discarded when the QP cannot be found. Signed-off-by: Alexander Schmidt --- drivers/infiniband/hw/ehca/ehca_reqs.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) --- infiniband.git.orig/drivers/infiniband/hw/ehca/ehca_reqs.c +++ infiniband.git/drivers/infiniband/hw/ehca/ehca_reqs.c @@ -680,8 +680,10 @@ repoll: read_lock(&ehca_qp_idr_lock); my_qp = idr_find(&ehca_qp_idr, cqe->qp_token); - wc->qp = &my_qp->ib_qp; read_unlock(&ehca_qp_idr_lock); + if (!my_qp) + goto repoll; + wc->qp = &my_qp->ib_qp; wc->byte_len = cqe->nr_bytes_transferred; wc->pkey_index = cqe->pkey_index; From alexs at linux.vnet.ibm.com Tue Aug 12 06:46:00 2008 From: alexs at linux.vnet.ibm.com (Alexander Schmidt) Date: Tue, 12 Aug 2008 15:46:00 +0200 Subject: [ofa-general] [PATCH 0/5 try2] ib/ehca: Fix stability issues Message-ID: <200808121546.00211.alexs@linux.vnet.ibm.com> Hi Roland, Sorry, the first set was mangled because of a broken mailer, so here it is again, double checked... the following patchset contains four small fixes and one bigger patch (5/5) for addressing some ehca issues we found during cluster test. [1/5] update qp_state on cached modify_qp() [2/5] rename goto label in ehca_poll_cq_one() [3/5] repoll on invalid opcode instead of returning success [4/5] check idr_find() return value [5/5] discard double CQE for one WR They all apply on top of 2.6.27-rc1. If possible, we would like to get them into 2.6.27. 
Regards, Alexander Schmidt From ruimario at gmail.com Tue Aug 12 06:49:38 2008 From: ruimario at gmail.com (Rui Machado) Date: Tue, 12 Aug 2008 15:49:38 +0200 Subject: Fwd: [ofa-general] limit on memory registration In-Reply-To: <6978b4af0808110709r3532da12yf16d71a8bc530445@mail.gmail.com> References: <6978b4af0808060847r7bd16de5p22f186401c3a56e2@mail.gmail.com> <4899CD88.4030805@gmail.com> <6978b4af0808110709r3532da12yf16d71a8bc530445@mail.gmail.com> Message-ID: <6978b4af0808120649l5bdf3a49i9e5800c51b3418ad@mail.gmail.com> forgot list ... ---------- Forwarded message ---------- From: Rui Machado Date: 2008/8/11 Subject: Re: [ofa-general] limit on memory registration To: Dotan Barak 2008/8/6 Dotan Barak : > Rui Machado wrote: >> >> Hi all, >> >> is there any limitation on the size that can be registered >> (ibv_reg_mr) for communication? >> I seem to be limited to 16GB (on 32GB 64bit x86 machine). >> Is this normal? Can someone tell me why and/or if there is a workaround? >> > > In the device attributes there is an attribute called max_mr_size which > indicates the maximum block size > that can be registered for a device. > > There is one more limitation, which is device specific: the size of the > virtual <--> physical translation table. > You can check the low level driver of the HW that you are using in order to > increase the size > of this table. > > which HW do you use? > I am now working with different hardware, a Mellanox ConnectX HCA, but I still have the limitation on the amount of memory I can register. I would really like to understand what's going on here and why it fails. I also need to fix it :-/ (if that's possible). Any help is much appreciated. 
Cheers From jackm at dev.mellanox.co.il Tue Aug 12 07:20:08 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Tue, 12 Aug 2008 17:20:08 +0300 Subject: [ofa-general] [PATCH 0 of 2 for 2.6.28] Raw Ethertype QP support Message-ID: <200808121720.08900.jackm@dev.mellanox.co.il> This patch set fixes the Raw Ethertype QP implementation in the infiniband core, and implements Raw Ethertype QP support in the mlx4 ib driver. The Raw Ethertype packet is described in the IB Spec Rev 1.2.1, Section 5.3. Raw Datagrams are described in section 9.8.4. Note that the Raw QP types are included under the special QP types (described in section 10.2.4.5). The fields required for sending a raw datagram are given in section 11.4.1.1 (POST SEND REQUEST), and in Section 10.2.10 (INFINIBAND HEADER DATA AND SOURCES), Table 64. - Jack From jackm at dev.mellanox.co.il Tue Aug 12 07:20:11 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Tue, 12 Aug 2008 17:20:11 +0300 Subject: [ofa-general] [PATCH 1 of 2 for 2.6.28] core: Fix Raw Ethertype QP support Message-ID: <200808121720.11878.jackm@dev.mellanox.co.il> From: Igor Yarovinsky core: Fix Raw Ethertype QP support. Fix Raw Ethertype qp support: a. Need a new struct in the wr union in struct ib_send_wr b. Need new helper pack and unpack functions in ud_header.c Signed-off-by: Igor Yarovinsky Signed-off-by: Jack Morgenstein --- Raw Ethertype packets on the wire contain only LRH, Raw Header (Ethernet Type), payload, and VCRC. 
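As an illustration of the on-wire layout mentioned above, here is a minimal userspace sketch that packs an LRH followed by a raw header into a byte buffer. It rests on assumptions that are not stated in this patch: an 8-byte LRH with the usual field widths, a 4-byte raw header made of 16 reserved bits followed by a 16-bit EtherType, and an LNH value of 0 to mark a raw packet. All names (model_lrh, model_pack_raw_ety_hdr, the MODEL_ constants) are hypothetical and not part of the kernel API; the payload and VCRC are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed LRH fields and widths; a model, not struct ib_unpacked_lrh. */
struct model_lrh {
	uint8_t  vl;      /* virtual lane, 4 bits */
	uint8_t  lver;    /* link version, 4 bits */
	uint8_t  sl;      /* service level, 4 bits */
	uint8_t  lnh;     /* link next header, 2 bits */
	uint16_t dlid;    /* destination LID */
	uint16_t pkt_len; /* packet length, 11 bits */
	uint16_t slid;    /* source LID */
};

#define MODEL_LNH_RAW_ETY 0x0      /* assumed: raw packet, raw header follows */
#define MODEL_RAW_ETY_HDR_BYTES 12 /* 8 bytes LRH + 4 bytes raw header */

/* Packs LRH + raw header in network byte order; returns bytes written. */
static size_t model_pack_raw_ety_hdr(const struct model_lrh *lrh,
				     uint16_t eth_type, uint8_t *buf)
{
	/* LRH, 8 bytes */
	buf[0] = (uint8_t)(((lrh->vl & 0xF) << 4) | (lrh->lver & 0xF));
	buf[1] = (uint8_t)(((lrh->sl & 0xF) << 4) | (lrh->lnh & 0x3));
	buf[2] = (uint8_t)(lrh->dlid >> 8);
	buf[3] = (uint8_t)(lrh->dlid & 0xFF);
	buf[4] = (uint8_t)((lrh->pkt_len >> 8) & 0x7);
	buf[5] = (uint8_t)(lrh->pkt_len & 0xFF);
	buf[6] = (uint8_t)(lrh->slid >> 8);
	buf[7] = (uint8_t)(lrh->slid & 0xFF);

	/* Raw header, 4 bytes: 16 reserved bits, then the EtherType */
	buf[8] = 0;
	buf[9] = 0;
	buf[10] = (uint8_t)(eth_type >> 8);
	buf[11] = (uint8_t)(eth_type & 0xFF);

	return MODEL_RAW_ETY_HDR_BYTES;
}
```

Under these assumptions the two headers take 12 bytes in total. In the kernel, the LRH part would be produced by a table-driven packer such as the ib_lrh_header_pack() helper this patch adds, rather than by hand-written shifts.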
Index: infiniband/include/rdma/ib_verbs.h =================================================================== --- infiniband.orig/include/rdma/ib_verbs.h 2008-08-11 10:37:01.000000000 +0300 +++ infiniband/include/rdma/ib_verbs.h 2008-08-12 16:29:49.000000000 +0300 @@ -752,6 +752,11 @@ struct ib_send_wr { int access_flags; u32 rkey; } fast_reg; + struct { + struct ib_unpacked_lrh *lrh; + u32 eth_type; + u8 static_rate; + } raw_ety; } wr; }; Index: infiniband/drivers/infiniband/core/ud_header.c =================================================================== --- infiniband.orig/drivers/infiniband/core/ud_header.c 2008-07-28 18:20:11.000000000 +0300 +++ infiniband/drivers/infiniband/core/ud_header.c 2008-08-12 10:55:10.000000000 +0300 @@ -241,6 +241,36 @@ void ib_ud_header_init(int pay EXPORT_SYMBOL(ib_ud_header_init); /** + * ib_lrh_header_pack - Pack LRH header struct into wire format + * @lrh:unpacked LRH header struct + * @buf:Buffer to pack into + * + * ib_lrh_header_pack() packs the LRH header structure @lrh into + * wire format in the buffer @buf. + */ +int ib_lrh_header_pack(struct ib_unpacked_lrh *lrh, void *buf) +{ + ib_pack(lrh_table, ARRAY_SIZE(lrh_table), lrh, buf); + return 0; +} +EXPORT_SYMBOL(ib_lrh_header_pack); + +/** + * ib_lrh_header_unpack - Unpack LRH structure from wire format + * @lrh:unpacked LRH header struct + * @buf:Buffer to pack into + * + * ib_lrh_header_unpack() unpacks the LRH header structure from + * wire format (in buf) into @lrh. 
+ */ +int ib_lrh_header_unpack(void *buf, struct ib_unpacked_lrh *lrh) +{ + ib_unpack(lrh_table, ARRAY_SIZE(lrh_table), buf, lrh); + return 0; +} +EXPORT_SYMBOL(ib_lrh_header_unpack); + +/** * ib_ud_header_pack - Pack UD header struct into wire format * @header:UD header struct * @buf:Buffer to pack into Index: infiniband/include/rdma/ib_pack.h =================================================================== --- infiniband.orig/include/rdma/ib_pack.h 2008-07-28 18:20:15.000000000 +0300 +++ infiniband/include/rdma/ib_pack.h 2008-08-12 10:55:10.000000000 +0300 @@ -240,4 +240,7 @@ int ib_ud_header_pack(struct ib_ud_heade int ib_ud_header_unpack(void *buf, struct ib_ud_header *header); +int ib_lrh_header_pack(struct ib_unpacked_lrh *lrh, void *buf); +int ib_lrh_header_unpack(void *buf, struct ib_unpacked_lrh *lrh); + #endif /* IB_PACK_H */ From jackm at dev.mellanox.co.il Tue Aug 12 07:20:15 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Tue, 12 Aug 2008 17:20:15 +0300 Subject: [ofa-general] [PATCH 2 of 2 for 2.6.28] mlx4: Add Raw Ethertype QP support Message-ID: <200808121720.15346.jackm@dev.mellanox.co.il> From: Igor Yarovinsky mlx4: Add Raw Ethertype QP support. This implementation supports one Raw Ethertype QP per port. Signed-off-by: Igor Yarovinsky Signed-off-by: Jack Morgenstein --- Raw Ethertype is implemented similarly to MADs. When posting sends, the LRH and RWH headers are added as a single 16-byte inline segment. Index: infiniband/drivers/infiniband/hw/mlx4/qp.c =================================================================== --- infiniband.orig/drivers/infiniband/hw/mlx4/qp.c 2008-08-12 16:28:56.000000000 +0300 +++ infiniband/drivers/infiniband/hw/mlx4/qp.c 2008-08-12 16:30:22.000000000 +0300 @@ -54,7 +54,8 @@ enum { /* * Largest possible UD header: send with GRH and immediate data. 
*/ - MLX4_IB_UD_HEADER_SIZE = 72 + MLX4_IB_UD_HEADER_SIZE = 72, + MLX4_IB_MAX_RAW_ETY_HDR_SIZE = 12 }; struct mlx4_ib_sqp { @@ -280,6 +281,12 @@ static int send_wqe_overhead(enum ib_qp_ ALIGN(4 + sizeof (struct mlx4_wqe_inline_seg), sizeof (struct mlx4_wqe_data_seg)); + case IB_QPT_RAW_ETY: + return sizeof(struct mlx4_wqe_ctrl_seg) + + ALIGN(MLX4_IB_MAX_RAW_ETY_HDR_SIZE + + sizeof(struct mlx4_wqe_inline_seg), + sizeof(struct mlx4_wqe_data_seg)); + default: return sizeof (struct mlx4_wqe_ctrl_seg); } @@ -335,6 +342,10 @@ static int set_kernel_sq_size(struct mlx cap->max_send_sge + 2 > dev->dev->caps.max_sq_sg) return -EINVAL; + if (type == IB_QPT_RAW_ETY && + cap->max_send_sge + 1 > dev->dev->caps.max_sq_sg) + return -EINVAL; + s = max(cap->max_send_sge * sizeof (struct mlx4_wqe_data_seg), cap->max_inline_data + sizeof (struct mlx4_wqe_inline_seg)) + send_wqe_overhead(type, qp->flags); @@ -375,7 +386,7 @@ static int set_kernel_sq_size(struct mlx */ if (dev->dev->caps.fw_ver >= MLX4_FW_VER_WQE_CTRL_NEC && qp->sq_signal_bits && BITS_PER_LONG == 64 && - type != IB_QPT_SMI && type != IB_QPT_GSI) + type != IB_QPT_SMI && type != IB_QPT_GSI && type != IB_QPT_RAW_ETY) qp->sq.wqe_shift = ilog2(64); else qp->sq.wqe_shift = ilog2(roundup_pow_of_two(s)); @@ -711,6 +722,9 @@ struct ib_qp *mlx4_ib_create_qp(struct i break; } + case IB_QPT_RAW_ETY: + if (!(dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_RAW_ETY)) + return ERR_PTR(-ENOSYS); case IB_QPT_SMI: case IB_QPT_GSI: { @@ -726,7 +740,8 @@ struct ib_qp *mlx4_ib_create_qp(struct i err = create_qp_common(dev, pd, init_attr, udata, dev->dev->caps.sqp_start + - (init_attr->qp_type == IB_QPT_SMI ? 0 : 2) + + (init_attr->qp_type == IB_QPT_RAW_ETY ? 4 : + (init_attr->qp_type == IB_QPT_SMI ? 
0 : 2)) + init_attr->port_num - 1, qp); if (err) { @@ -740,7 +755,6 @@ struct ib_qp *mlx4_ib_create_qp(struct i break; } default: - /* Don't support raw QPs */ return ERR_PTR(-EINVAL); } @@ -771,6 +785,7 @@ static int to_mlx4_st(enum ib_qp_type ty case IB_QPT_RC: return MLX4_QP_ST_RC; case IB_QPT_UC: return MLX4_QP_ST_UC; case IB_QPT_UD: return MLX4_QP_ST_UD; + case IB_QPT_RAW_ETY: case IB_QPT_SMI: case IB_QPT_GSI: return MLX4_QP_ST_MLX; default: return -1; @@ -895,7 +910,8 @@ static int __mlx4_ib_modify_qp(struct ib } } - if (ibqp->qp_type == IB_QPT_GSI || ibqp->qp_type == IB_QPT_SMI) + if (ibqp->qp_type == IB_QPT_GSI || ibqp->qp_type == IB_QPT_SMI || + ibqp->qp_type == IB_QPT_RAW_ETY) context->mtu_msgmax = (IB_MTU_4096 << 5) | 11; else if (ibqp->qp_type == IB_QPT_UD) { if (qp->flags & MLX4_IB_QP_LSO) @@ -1044,7 +1060,7 @@ static int __mlx4_ib_modify_qp(struct ib if (cur_state == IB_QPS_INIT && new_state == IB_QPS_RTR && (ibqp->qp_type == IB_QPT_GSI || ibqp->qp_type == IB_QPT_SMI || - ibqp->qp_type == IB_QPT_UD)) { + ibqp->qp_type == IB_QPT_UD || ibqp->qp_type == IB_QPT_RAW_ETY)) { context->pri_path.sched_queue = (qp->port - 1) << 6; if (is_qp0(dev, qp)) context->pri_path.sched_queue |= MLX4_IB_DEFAULT_QP0_SCHED_QUEUE; @@ -1186,6 +1202,49 @@ out: return err; } +static int build_raw_ety_header(struct mlx4_ib_sqp *sqp, struct ib_send_wr *wr, + void *wqe, unsigned *mlx_seg_len) +{ + int payload = 0; + int header_size, packet_length; + struct mlx4_wqe_mlx_seg *mlx = wqe; + struct mlx4_wqe_inline_seg *inl = wqe + sizeof *mlx; + u32 *lrh = wqe + sizeof *mlx + sizeof *inl; + int i; + + /* Only IB_WR_SEND is supported */ + if (wr->opcode != IB_WR_SEND) + return -EINVAL; + + for (i = 0; i < wr->num_sge; ++i) + payload += wr->sg_list[i].length; + + header_size = IB_LRH_BYTES + 4; /* LRH + RAW_HEADER (32 bits) */ + + /* headers + payload and round up */ + packet_length = (header_size + payload + 3) / 4; + + mlx->flags &= cpu_to_be32(MLX4_WQE_CTRL_CQ_UPDATE); + + mlx->flags 
|= cpu_to_be32(MLX4_WQE_MLX_ICRC | + (wr->wr.raw_ety.lrh->service_level << 8)); + + mlx->rlid = wr->wr.raw_ety.lrh->destination_lid; + + wr->wr.raw_ety.lrh->packet_length = cpu_to_be16(packet_length); + + ib_lrh_header_pack(wr->wr.raw_ety.lrh, lrh); + lrh += IB_LRH_BYTES / 4; /* LRH size is a dword multiple */ + *lrh = cpu_to_be32(wr->wr.raw_ety.eth_type); + + inl->byte_count = cpu_to_be32(1 << 31 | header_size); + + *mlx_seg_len = + ALIGN(sizeof(struct mlx4_wqe_inline_seg) + header_size, 16); + + return 0; +} + static int build_mlx_header(struct mlx4_ib_sqp *sqp, struct ib_send_wr *wr, void *wqe, unsigned *mlx_seg_len) { @@ -1601,6 +1660,17 @@ int mlx4_ib_post_send(struct ib_qp *ibqp size += seglen / 16; break; + case IB_QPT_RAW_ETY: + err = build_raw_ety_header(to_msqp(qp), wr, ctrl, + &seglen); + if (unlikely(err)) { + *bad_wr = wr; + goto out; + } + wqe += seglen; + size += seglen / 16; + break; + default: break; } Index: infiniband/drivers/net/mlx4/qp.c =================================================================== --- infiniband.orig/drivers/net/mlx4/qp.c 2008-08-12 16:28:56.000000000 +0300 +++ infiniband/drivers/net/mlx4/qp.c 2008-08-12 16:30:22.000000000 +0300 @@ -247,8 +247,9 @@ EXPORT_SYMBOL_GPL(mlx4_qp_free); static int mlx4_CONF_SPECIAL_QP(struct mlx4_dev *dev, u32 base_qpn) { - return mlx4_cmd(dev, 0, base_qpn, 0, MLX4_CMD_CONF_SPECIAL_QP, - MLX4_CMD_TIME_CLASS_B); + return mlx4_cmd(dev, 0, base_qpn, + (dev->caps.flags & MLX4_DEV_CAP_FLAG_RAW_ETY) ? 
4 : 0, + MLX4_CMD_CONF_SPECIAL_QP, MLX4_CMD_TIME_CLASS_B); } int mlx4_init_qp_table(struct mlx4_dev *dev) Index: infiniband/include/linux/mlx4/device.h =================================================================== --- infiniband.orig/include/linux/mlx4/device.h 2008-08-12 16:28:56.000000000 +0300 +++ infiniband/include/linux/mlx4/device.h 2008-08-12 16:30:22.000000000 +0300 @@ -60,6 +60,7 @@ enum { MLX4_DEV_CAP_FLAG_IPOIB_CSUM = 1 << 7, MLX4_DEV_CAP_FLAG_BAD_PKEY_CNTR = 1 << 8, MLX4_DEV_CAP_FLAG_BAD_QKEY_CNTR = 1 << 9, + MLX4_DEV_CAP_FLAG_RAW_ETY = 1 << 13, MLX4_DEV_CAP_FLAG_MEM_WINDOW = 1 << 16, MLX4_DEV_CAP_FLAG_APM = 1 << 17, MLX4_DEV_CAP_FLAG_ATOMIC = 1 << 18, Index: infiniband/include/linux/mlx4/qp.h =================================================================== --- infiniband.orig/include/linux/mlx4/qp.h 2008-08-12 16:28:56.000000000 +0300 +++ infiniband/include/linux/mlx4/qp.h 2008-08-12 16:30:22.000000000 +0300 @@ -191,7 +191,8 @@ struct mlx4_wqe_ctrl_seg { enum { MLX4_WQE_MLX_VL15 = 1 << 17, - MLX4_WQE_MLX_SLR = 1 << 16 + MLX4_WQE_MLX_SLR = 1 << 16, + MLX4_WQE_MLX_ICRC = 1 << 4 }; struct mlx4_wqe_mlx_seg { Index: infiniband/drivers/infiniband/hw/mlx4/main.c =================================================================== --- infiniband.orig/drivers/infiniband/hw/mlx4/main.c 2008-08-12 16:28:56.000000000 +0300 +++ infiniband/drivers/infiniband/hw/mlx4/main.c 2008-08-12 16:30:22.000000000 +0300 @@ -111,6 +111,8 @@ static int mlx4_ib_query_device(struct i (dev->dev->caps.bmme_flags & MLX4_BMME_FLAG_REMOTE_INV) && (dev->dev->caps.bmme_flags & MLX4_BMME_FLAG_FAST_REG_WR)) props->device_cap_flags |= IB_DEVICE_MEM_MGT_EXTENSIONS; + if (dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_RAW_ETY) + props->max_raw_ethy_qp = dev->ib_dev.phys_port_cnt; props->vendor_id = be32_to_cpup((__be32 *) (out_mad->data + 36)) & 0xffffff; From rostedt at goodmis.org Tue Aug 12 07:32:51 2008 From: rostedt at goodmis.org (Steven Rostedt) Date: Tue, 12 Aug 2008 10:32:51 -0400 
(EDT)
Subject: [ofa-general] [PATCH] infiniband: change flags from int to long
Message-ID:

It is a bug to have irq saved flags as an int and not long since
some archs may use more than 32 bits in flags.

(This patch was only compile tested)

[ Found by the -rt patch checks. These should now be in mainline, but
  it looks like they may not have been used. ]

Signed-off-by: Steven Rostedt
---
 drivers/infiniband/hw/ipath/ipath_verbs.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Index: linus.git/drivers/infiniband/hw/ipath/ipath_verbs.c
===================================================================
--- linus.git.orig/drivers/infiniband/hw/ipath/ipath_verbs.c 2008-08-12 10:23:25.000000000 -0400
+++ linus.git/drivers/infiniband/hw/ipath/ipath_verbs.c 2008-08-12 10:25:24.000000000 -0400
@@ -1021,7 +1021,7 @@ static void sdma_complete(void *cookie,
 	struct ipath_verbs_txreq *tx = cookie;
 	struct ipath_qp *qp = tx->qp;
 	struct ipath_ibdev *dev = to_idev(qp->ibqp.device);
-	unsigned int flags;
+	unsigned long flags;
 	enum ib_wc_status ibs = status == IPATH_SDMA_TXREQ_S_OK ?
 		IB_WC_SUCCESS : IB_WC_WR_FLUSH_ERR;
 
@@ -1051,7 +1051,7 @@ static void sdma_complete(void *cookie,
 
 static void decrement_dma_busy(struct ipath_qp *qp)
 {
-	unsigned int flags;
+	unsigned long flags;
 
 	if (atomic_dec_and_test(&qp->s_dma_busy)) {
 		spin_lock_irqsave(&qp->s_lock, flags);
@@ -1221,7 +1221,7 @@ static int ipath_verbs_send_pio(struct i
 	unsigned flush_wc;
 	u32 control;
 	int ret;
-	unsigned int flags;
+	unsigned long flags;
 
 	piobuf = ipath_getpiobuf(dd, plen, NULL);
 	if (unlikely(piobuf == NULL)) {

From hal.rosenstock at gmail.com Tue Aug 12 07:48:56 2008
From: hal.rosenstock at gmail.com (Hal Rosenstock)
Date: Tue, 12 Aug 2008 10:48:56 -0400
Subject: [ofa-general] OpenSM ran out of LIDs
In-Reply-To: <829ded920808112213k429180bay9dce2db0add44112@mail.gmail.com>
References: <829ded920808110432x151d594cs12f7834aeb66a6c6@mail.gmail.com> <829ded920808112213k429180bay9dce2db0add44112@mail.gmail.com>
Message-ID:

On Tue, Aug 12, 2008 at 1:13 AM, Keshetti Mahesh wrote:
>>> I am getting the below errors in OpenSM while simulating a large
>>> Infiniband network using 'ibsim'.
>>
>> What OpenSM version ?
>
> I am using "opensm-3.1.10", the same one which comes along with OFED-1.3.
>
>>
>> How are you calculating this ? Are you including switch LIDs in this ?
>
> One LID per each port of HCA and one LID per each switch present in the network
>
>> How many switches in this subnet ? Are you using a non 0 LMC ?
>
> No. I am using LMC=0 only.
>
>>> Also, opensm is not considering the guid2lid provided (with -x) at the time of
>>> starting.
>>
>> Is the guid2lid file in OSM_CACHE_DIR ?
>
> Yes. guid2lid file is present in the OSM_CACHE_DIR. OpenSM instead
> of reading from
> that file, it directly writes to the guid2lid file I have provided.

Are you sure it never gets read ? OpenSM will (also) update that file.

Are there any other errors prior to this ? What log level is being used ?
Can you turn it up to debug to see if more information pertaining to
this is logged ?
-- Hal > > -Mahesh > From jackm at dev.mellanox.co.il Tue Aug 12 08:31:03 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Tue, 12 Aug 2008 18:31:03 +0300 Subject: [ofa-general] [PATCH for 2.6.28] core: fix memory leak in XRC userspace cleanup. Message-ID: <200808121831.03964.jackm@dev.mellanox.co.il> core: fix memory leak in XRC userspace cleanup. When userspace invoked close_xrc_domain, the domain was deleted from the active-domains list for that process -- even if the process had active QPs and SRQs using that domain. The domain, however was not destroyed, and its resources were still allocated. As a result of deleting the domain from the active domains for the process, however, no attempt would be made to destroy the domain during cleanup. If it was the last process using that domain, the domain's resources remained allocated forever (memory/resource leak). The fix is to avoid deleting the domain from the active list if there are outstanding resources (QPs or SRQs) in the process which use it (same as is done for XRC_RCV qp's and xrc qp registration). Signed-off-by: Jack Morgenstein --- Roland, this is a provisional fix for the memory leak that I committed to the ofed 1.4 git tree. A better fix is to add a usecnt field to ib_xrcd_uobject, and increment/decrement it when creating/destroying XRC qp's and XRC srq's. To do this, however, I need to save the xrc domain handle in the ib_uqp_object for use when destroying the QP, to decrement the xrc domain use count (no problem -- such an object exists). However, I need to do the same thing for SRQs, which generates a "falling domino" set of changes. There is no separate srq uobject -- srq uses ib_uevent_object, and I was reluctant to add an xrc domain handle to ib_uevent_object. Furthermore, since there is no separate destroy_xrc_srq() function, I need to know when we are destroying an xrc or non-xrc srq. (I could possibly use an xrc handle = -1 to indicate that this is not an xrc srq). 
I think the correct thing to do is to define a ib_srq_uobject to replace using the ib_uevent_object for srq's, and deal with the various changes that entails. I am uncomfortable adding a "falling dominoes" change to the core to fix this bug just before the beta. However, since I know you're thinking about some major changes to XRC (migrating functionality from mlx4 to the core), I'm just alerting you to the leak problem. diff --git a/kernel_patches/fixes/core_0160_xrc_fix_memleak.patch b/kernel_patches/fixes/core_0160_xrc_fix_memleak.patch new file mode 100644 index 0000000..80a51c3 --- /dev/null +++ b/kernel_patches/fixes/core_0160_xrc_fix_memleak.patch @@ -0,0 +1,65 @@ +core: fix memory leak in XRC userspace cleanup. + +When userspace invoked close_xrc_domain, the domain was deleted from +the active-domains list for that process -- even if the process had +active QPs and SRQs using that domain. + +The domain, however was not destroyed, and its resources were still +allocated. + +As a result of deleting the domain from the active domains for the +process, however, no attempt would be made to destroy the domain +during cleanup. If it was the last process using that domain, +the domain's resources remained allocated forever (memory/resource leak). + +The fix is to avoid deleting the domain from the active list if there +are outstanding resources (QPs or SRQs) in the process which use it +(same as is done for XRC_RCV qp's and xrc qp registration). 
+
+Signed-off-by: Jack Morgenstein
+
+Index: ofed_1_4/drivers/infiniband/core/uverbs_cmd.c
+===================================================================
+--- ofed_1_4.orig/drivers/infiniband/core/uverbs_cmd.c 2008-08-07 19:24:52.000000000 +0300
++++ ofed_1_4/drivers/infiniband/core/uverbs_cmd.c 2008-08-07 19:26:34.000000000 +0300
+@@ -2567,7 +2567,7 @@ ssize_t ib_uverbs_close_xrc_domain(struc
+ 				   int out_len)
+ {
+ 	struct ib_uverbs_close_xrc_domain cmd;
+-	struct ib_uobject *uobj;
++	struct ib_uobject *uobj, *t_uobj;
+ 	struct ib_uxrcd_object *xrcd_uobj;
+ 	struct ib_xrcd *xrcd = NULL;
+ 	struct inode *inode = NULL;
+@@ -2584,6 +2584,31 @@ ssize_t ib_uverbs_close_xrc_domain(struc
+ 		goto err_unlock_mutex;
+ 	}
+ 
++	mutex_lock(&file->mutex);
++	if (!ret) {
++		list_for_each_entry(t_uobj, &file->ucontext->qp_list, list) {
++			struct ib_qp *qp = t_uobj->object;
++			if (qp->xrcd && qp->xrcd == uobj->object) {
++				ret = -EBUSY;
++				break;
++			}
++		}
++	}
++	if (!ret) {
++		list_for_each_entry(t_uobj, &file->ucontext->srq_list, list) {
++			struct ib_srq *srq = t_uobj->object;
++			if (srq->xrcd && srq->xrcd == uobj->object) {
++				ret = -EBUSY;
++				break;
++			}
++		}
++	}
++	mutex_unlock(&file->mutex);
++	if (ret) {
++		put_uobj_write(uobj);
++		goto err_unlock_mutex;
++	}
++
+ 	xrcd_uobj = container_of(uobj, struct ib_uxrcd_object, uobject);
+ 	if (!list_empty(&xrcd_uobj->xrc_reg_qp_list)) {
+ 		ret = -EBUSY;

From rdreier at cisco.com Tue Aug 12 08:39:17 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Tue, 12 Aug 2008 08:39:17 -0700
Subject: [ofa-general] Re: [PATCH] infiniband: change flags from int to long
In-Reply-To: (Steven Rostedt's message of "Tue, 12 Aug 2008 10:32:51 -0400 (EDT)")
References:
Message-ID:

> It is a bug to have irq saved flags as an int and not long since
> some archs may use more than 32 bits in flags.

Isn't this already upstream for a while as 52fd8ca6?

 - R.
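
For readers wondering why the width of the flags variable matters, here is a minimal user-space sketch of the truncation that the patch under discussion avoids. It is not kernel code: `save_flags_wide` is a hypothetical stand-in for an architecture whose saved interrupt state needs more than 32 bits, and `uint64_t` stands in for `unsigned long` on such an architecture.

```c
#include <stdint.h>

/* Hypothetical stand-in: pretend the saved IRQ state uses bit 32. */
static uint64_t save_flags_wide(void)
{
	return 0x100000001ULL;
}

/* Stores the saved state in a 32-bit variable; the high bit is lost,
 * so the value that would later be restored differs from the original. */
static int roundtrip_ok_with_int(void)
{
	unsigned int flags = (unsigned int)save_flags_wide();

	return (uint64_t)flags == save_flags_wide();
}

/* Stores the saved state in a variable wide enough to hold it. */
static int roundtrip_ok_with_wide(void)
{
	uint64_t flags = save_flags_wide();

	return flags == save_flags_wide();
}
```

In this sketch `roundtrip_ok_with_int()` returns 0 and `roundtrip_ok_with_wide()` returns 1, which is the difference between restoring a corrupted state and restoring the saved one.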
From swise at opengridcomputing.com Tue Aug 12 08:49:30 2008
From: swise at opengridcomputing.com (Steve Wise)
Date: Tue, 12 Aug 2008 10:49:30 -0500
Subject: [ofa-general] cxgb3: MW Support
In-Reply-To:
References:
Message-ID: <48A1B10A.6030605@opengridcomputing.com>

Philip Frey1 wrote:
>
> Are memory windows supported by the Chelsio T3 adapter? If so, what is
> the verb to bind/unbind them?
>

MWs are supported in ofed-1.4, which is about to enter beta. The support
in linux/ofed is only for kernel-mode MWs. Are you doing user mode or
kernel mode?

For kernel mode, you allocate a mw with ib_alloc_mw(). And you bind it
with ib_bind_mw(). You can invalidate it by posting an IB_WR_LOCAL_INV
work request to the send queue. Then you can do a subsequent bind.
Deallocate with ib_dealloc_mw().

For user mode, I have a patch series that adds support for MWs, but it
hasn't been submitted yet for group review. It's on my (very long) todo
list. :)

Steve.

From tziporet at mellanox.co.il Tue Aug 12 09:52:00 2008
From: tziporet at mellanox.co.il (Tziporet Koren)
Date: Tue, 12 Aug 2008 19:52:00 +0300
Subject: [ofa-general] OFED meeting agenda for today (Aug 11, 2008)
Message-ID: <5D49E7A8952DC44FB38C38FA0D758EAD49AABE@mtlexch01.mtl.com>

OFED meeting summary for Aug 11, 2008 on OFED 1.4 beta readiness
================================================================

1. OFED 1.4 status:
   OFED daily build is now based on kernel 2.6.27-rc1.
   iSER is disabled until backports are done (in 2-3 weeks)
   NFS/RDMA - no backport for distros yet

2. Beta release:
   - Tasks for the beta:
     - IPoIB intermediate bug fix
     - CMA fix from Steve (after approval from maintainers)
     - mlx_4 changes for RAW QP
     - Fix driver hang on RHEL4
     - New code for ex_qp - from Voltaire (mainly user space changes)
   - The beta date: will be end of this week or early next week
     depending on progress of the above tasks.
   - Most important - no new features should be added after the beta release.

3. Testing description:
   I got descriptions from Voltaire, Intel and Mellanox
   All companies are requested to submit

4. OS supported:
   - We will add RHEL 4.7. Probably to RC1.
   - RHEL 4.4 - compilation and basic testing only; to be dropped at next OFED release

5. Suggestion for OS EOL policy:
   In version X we declare the OS is moved to "partially supported" state:
   Meaning compilation and basic tests only
   In X+1 version it will not be supported at all

Tziporet

From rostedt at goodmis.org Tue Aug 12 10:14:09 2008
From: rostedt at goodmis.org (Steven Rostedt)
Date: Tue, 12 Aug 2008 13:14:09 -0400 (EDT)
Subject: [ofa-general] Re: [PATCH] infiniband: change flags from int to long
In-Reply-To:
References:
Message-ID:

On Tue, 12 Aug 2008, Roland Dreier wrote:

> > It is a bug to have irq saved flags as an int and not long since
> > some archs may use more than 32 bits in flags.
>
> Isn't this already upstream for a while as 52fd8ca6?

This is the problem with having multiple git repos lying around. You
never know which one is updated. I should have done a git pull on the
git repo I examined.

Last commit on the repo I looked at:

commit 94ad374a0751f40d25e22e036c37f7263569d24c
Author: Linus Torvalds
Date: Wed Jul 30 14:45:12 2008 -0700

And the commit you stated:

commit 52fd8ca6ad4124c15952ded35cfcf6adbd7ae8d4
Author: Vegard Nossum
Date: Wed Jul 30 09:29:06 2008 -0700

Which Linus pulled in on:

commit 273b2578392bbf6e5c47a8a3d1ee461ce6fc7182
Merge: 8e43e12... 06a91a0...
Author: Linus Torvalds
Date: Thu Aug 7 18:14:07 2008 -0700

Sorry for the noise :-/

-- Steve

From ralph.campbell at qlogic.com Tue Aug 12 10:15:37 2008
From: ralph.campbell at qlogic.com (Ralph Campbell)
Date: Tue, 12 Aug 2008 10:15:37 -0700
Subject: [ofa-general] [PATCH] IB/ipath - don't allow the QP path MTU to be set higher than port MTU
Message-ID: <20080812171537.3699.62189.stgit@eng-46.mv.qlogic.com>

OpenMPI uses a configuration file to pick the path MTU when modifying a QP to RTR.
This can cause MPI programs to hang if the currently configured port neighbor MTU is less than the requested QP path MTU since packets will be continuously dropped by the switch. This patch fixes the problem by returning an error when attempting to set the QP path MTU greater than the port MTU. Signed-off-by: Ralph Campbell --- drivers/infiniband/hw/ipath/ipath_qp.c | 14 ++++++++------ 1 files changed, 8 insertions(+), 6 deletions(-) diff --git a/drivers/infiniband/hw/ipath/ipath_qp.c b/drivers/infiniband/hw/ipath/ipath_qp.c index 4715911..8ba6267 100644 --- a/drivers/infiniband/hw/ipath/ipath_qp.c +++ b/drivers/infiniband/hw/ipath/ipath_qp.c @@ -488,13 +488,15 @@ int ipath_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, goto inval; /* - * don't allow invalid Path MTU values or greater than 2048 - * unless we are configured for a 4KB MTU + * Don't allow path MTU values greater than the currently + * configured port neighbor MTU. */ - if ((attr_mask & IB_QP_PATH_MTU) && - (ib_mtu_enum_to_int(attr->path_mtu) == -1 || - (attr->path_mtu > IB_MTU_2048 && !ipath_mtu4096))) - goto inval; + if (attr_mask & IB_QP_PATH_MTU) { + int mtu = ib_mtu_enum_to_int(attr->path_mtu); + + if (mtu == -1 || mtu > dev->dd->ipath_ibmtu) + goto inval; + } if (attr_mask & IB_QP_PATH_MIG_STATE) if (attr->path_mig_state != IB_MIG_MIGRATED && From sean.hefty at intel.com Tue Aug 12 10:34:36 2008 From: sean.hefty at intel.com (Hefty, Sean) Date: Tue, 12 Aug 2008 10:34:36 -0700 Subject: [ofa-general] 2.6.27-rc2 compile hangs on net/sunrpc/xprtrdma/svc_rdma_transport.o Message-ID: I'm seeing make hang while trying to build 2.6.27-rc2. The hang occurs on net/sunrpc/xprtrdma/svc_rdma_transport.o. Has anyone else seen this issue? (I'm just starting to look into the details.) 
- Sean

From rdreier at cisco.com Tue Aug 12 11:35:48 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Tue, 12 Aug 2008 11:35:48 -0700
Subject: [ofa-general] Re: [PATCH 5/5 try2] ib/ehca: discard double CQE for one WR
In-Reply-To: <200808121546.31057.alexs@linux.vnet.ibm.com> (Alexander Schmidt's message of "Tue, 12 Aug 2008 15:46:30 +0200")
References: <200808121546.31057.alexs@linux.vnet.ibm.com>
Message-ID:

thanks, applied all 5.

From rdreier at cisco.com Tue Aug 12 11:40:33 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Tue, 12 Aug 2008 11:40:33 -0700
Subject: [ofa-general] [PATCH] IB/ipath - don't allow the QP path MTU to be set higher than port MTU
In-Reply-To: <20080812171537.3699.62189.stgit@eng-46.mv.qlogic.com> (Ralph Campbell's message of "Tue, 12 Aug 2008 10:15:37 -0700")
References: <20080812171537.3699.62189.stgit@eng-46.mv.qlogic.com>
Message-ID:

> This patch fixes the problem by returning an error when attempting
> to set the QP path MTU greater than the port MTU.

I don't believe this is correct according to the IB spec -- modify QP is
supposed to return an error only when the requested MTU is bigger than
the MTUcap of the port. Even if we do have this patch, it seems there
could still be trouble if an intermediate hop has a smaller MTU.

Why is Open MPI trying to set a path MTU bigger than what the fabric
actually supports?

 - R.

From ralph.campbell at qlogic.com Tue Aug 12 11:55:18 2008
From: ralph.campbell at qlogic.com (Ralph Campbell)
Date: Tue, 12 Aug 2008 11:55:18 -0700
Subject: [ofa-general] [PATCH] IB/ipath - don't allow the QP path MTU to be set higher than port MTU
In-Reply-To:
References: <20080812171537.3699.62189.stgit@eng-46.mv.qlogic.com>
Message-ID: <1218567318.620.435.camel@chromite.mv.qlogic.com>

On Tue, 2008-08-12 at 11:40 -0700, Roland Dreier wrote:
> > This patch fixes the problem by returning an error when attempting
> > to set the QP path MTU greater than the port MTU.
>
> I don't believe this is correct according to the IB spec -- modify QP is
> supposed to return an error only when the requested MTU is bigger than
> the MTUcap of the port. Even if we do have this patch, it seems there
> could still be trouble if an intermediate hop has a smaller MTU.
>
> Why is Open MPI trying to set a path MTU bigger than what the fabric
> actually supports?

Because at one point it didn't check the portinfo, it only relied
on the config file. I did send email about this some time ago so it
may be fixed in newer versions.

From rdreier at cisco.com Tue Aug 12 12:38:13 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Tue, 12 Aug 2008 12:38:13 -0700
Subject: [ofa-general] Re: [PATCH 1 of 2 for 2.6.28] core: Fix Raw Ethertype QP support
In-Reply-To: <200808121720.11878.jackm@dev.mellanox.co.il> (Jack Morgenstein's message of "Tue, 12 Aug 2008 17:20:11 +0300")
References: <200808121720.11878.jackm@dev.mellanox.co.il>
Message-ID:

What is going to be the consumer of raw QPs?

 - R.

From rdreier at cisco.com Tue Aug 12 12:42:45 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Tue, 12 Aug 2008 12:42:45 -0700
Subject: [ofa-general] Re: [PATCH for 2.6.28] core: fix memory leak in XRC userspace cleanup.
In-Reply-To: <200808121831.03964.jackm@dev.mellanox.co.il> (Jack Morgenstein's message of "Tue, 12 Aug 2008 18:31:03 +0300")
References: <200808121831.03964.jackm@dev.mellanox.co.il>
Message-ID:

> ++	if (!ret) {
> ++		list_for_each_entry(t_uobj, &file->ucontext->qp_list, list) {
> ++			struct ib_qp *qp = t_uobj->object;
> ++			if (qp->xrcd && qp->xrcd == uobj->object) {
> ++				ret = -EBUSY;
> ++				break;
> ++			}
> ++		}
> ++	}
> ++	if (!ret) {
> ++		list_for_each_entry(t_uobj, &file->ucontext->srq_list, list) {
> ++			struct ib_srq *srq = t_uobj->object;
> ++			if (srq->xrcd && srq->xrcd == uobj->object) {
> ++				ret = -EBUSY;
> ++				break;
> ++			}
> ++		}
> ++	}

This is obviously pretty gross. Let me see if I can come up with a more
palatable fix in my tree.

 - R.
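
The alternative Jack mentions earlier in the thread, a use count on the XRC domain object instead of walking the QP and SRQ lists at close time, could look roughly like the user-space sketch below. Everything here is hypothetical: `struct xrcd_uobj_sketch` and the `xrcd_sketch_*` helpers are invented for illustration, and a kernel version would need an atomic counter and the appropriate locking rather than a plain int.

```c
#include <errno.h>

/* Hypothetical, simplified stand-in for an XRC domain user object. */
struct xrcd_uobj_sketch {
	int usecnt;	/* QPs and SRQs currently using this domain */
	int closed;	/* set once the domain has been closed */
};

/* Called when an XRC QP or XRC SRQ is created on the domain. */
static void xrcd_sketch_get(struct xrcd_uobj_sketch *x)
{
	x->usecnt++;
}

/* Called when that QP or SRQ is destroyed. */
static void xrcd_sketch_put(struct xrcd_uobj_sketch *x)
{
	x->usecnt--;
}

/* Refuse to close the domain while anything still uses it. */
static int xrcd_sketch_close(struct xrcd_uobj_sketch *x)
{
	if (x->usecnt > 0)
		return -EBUSY;
	x->closed = 1;
	return 0;
}
```

The close path then becomes a single counter check, at the cost of having to remember the domain on every QP and SRQ so the count can be dropped on destroy, which is the "falling dominoes" change Jack describes.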
From rdreier at cisco.com Tue Aug 12 12:43:29 2008 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 12 Aug 2008 12:43:29 -0700 Subject: [ofa-general] Re: [PATCH v3] ib/core: fix for send multicast group send leave retry In-Reply-To: <48A06A66.7070605@Voltaire.COM> (Yossi Etigin's message of "Mon, 11 Aug 2008 19:35:50 +0300") References: <48A06A66.7070605@Voltaire.COM> Message-ID: Is there any urgent need to merge this for 2.6.27? If so, what is the situation where this causes a big problem? Otherwise I'll add this for 2.6.28. From rdreier at cisco.com Tue Aug 12 13:55:33 2008 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 12 Aug 2008 13:55:33 -0700 Subject: [ofa-general] [GIT PULL] please pull infiniband.git Message-ID: Linus, please pull from master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git for-linus This tree is also available from kernel.org mirrors at: git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git for-linus This will get the following fixes: Alexander Schmidt (5): IB/ehca: Update qp_state on cached modify_qp() IB/ehca: Rename goto label in ehca_poll_cq_one() IB/ehca: Repoll CQ on invalid opcode IB/ehca: Check idr_find() return value IB/ehca: Discard double CQE for one WR David J.
Wilder (1): IPoIB/cm: Use vmalloc() to allocate rx_rings Roland Dreier (1): Merge branches 'ehca' and 'ipoib' into for-linus drivers/infiniband/hw/ehca/ehca_classes.h | 9 ++++ drivers/infiniband/hw/ehca/ehca_qes.h | 1 + drivers/infiniband/hw/ehca/ehca_qp.c | 48 +++++++++++++++++------ drivers/infiniband/hw/ehca/ehca_reqs.c | 60 ++++++++++++++++++++++------ drivers/infiniband/ulp/ipoib/ipoib_cm.c | 17 ++++++-- 5 files changed, 104 insertions(+), 31 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h index 0b0618e..1ab919f 100644 --- a/drivers/infiniband/hw/ehca/ehca_classes.h +++ b/drivers/infiniband/hw/ehca/ehca_classes.h @@ -156,6 +156,14 @@ struct ehca_mod_qp_parm { #define EHCA_MOD_QP_PARM_MAX 4 +#define QMAP_IDX_MASK 0xFFFFULL + +/* struct for tracking if cqes have been reported to the application */ +struct ehca_qmap_entry { + u16 app_wr_id; + u16 reported; +}; + struct ehca_qp { union { struct ib_qp ib_qp; @@ -165,6 +173,7 @@ struct ehca_qp { enum ehca_ext_qp_type ext_type; enum ib_qp_state state; struct ipz_queue ipz_squeue; + struct ehca_qmap_entry *sq_map; struct ipz_queue ipz_rqueue; struct h_galpas galpas; u32 qkey; diff --git a/drivers/infiniband/hw/ehca/ehca_qes.h b/drivers/infiniband/hw/ehca/ehca_qes.h index 8188030..5d28e3e 100644 --- a/drivers/infiniband/hw/ehca/ehca_qes.h +++ b/drivers/infiniband/hw/ehca/ehca_qes.h @@ -213,6 +213,7 @@ struct ehca_wqe { #define WC_STATUS_ERROR_BIT 0x80000000 #define WC_STATUS_REMOTE_ERROR_FLAGS 0x0000F800 #define WC_STATUS_PURGE_BIT 0x10 +#define WC_SEND_RECEIVE_BIT 0x80 struct ehca_cqe { u64 work_request_id; diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c index ea13efd..b6bcee0 100644 --- a/drivers/infiniband/hw/ehca/ehca_qp.c +++ b/drivers/infiniband/hw/ehca/ehca_qp.c @@ -412,6 +412,7 @@ static struct ehca_qp *internal_create_qp( struct ehca_shca *shca = container_of(pd->device, struct ehca_shca, ib_device); struct 
ib_ucontext *context = NULL; + u32 nr_qes; u64 h_ret; int is_llqp = 0, has_srq = 0; int qp_type, max_send_sge, max_recv_sge, ret; @@ -715,6 +716,15 @@ static struct ehca_qp *internal_create_qp( "and pages ret=%i", ret); goto create_qp_exit2; } + nr_qes = my_qp->ipz_squeue.queue_length / + my_qp->ipz_squeue.qe_size; + my_qp->sq_map = vmalloc(nr_qes * + sizeof(struct ehca_qmap_entry)); + if (!my_qp->sq_map) { + ehca_err(pd->device, "Couldn't allocate squeue " + "map ret=%i", ret); + goto create_qp_exit3; + } } if (HAS_RQ(my_qp)) { @@ -724,7 +734,7 @@ static struct ehca_qp *internal_create_qp( if (ret) { ehca_err(pd->device, "Couldn't initialize rqueue " "and pages ret=%i", ret); - goto create_qp_exit3; + goto create_qp_exit4; } } @@ -770,7 +780,7 @@ static struct ehca_qp *internal_create_qp( if (!my_qp->mod_qp_parm) { ehca_err(pd->device, "Could not alloc mod_qp_parm"); - goto create_qp_exit4; + goto create_qp_exit5; } } } @@ -780,7 +790,7 @@ static struct ehca_qp *internal_create_qp( h_ret = ehca_define_sqp(shca, my_qp, init_attr); if (h_ret != H_SUCCESS) { ret = ehca2ib_return_code(h_ret); - goto create_qp_exit5; + goto create_qp_exit6; } } @@ -789,7 +799,7 @@ static struct ehca_qp *internal_create_qp( if (ret) { ehca_err(pd->device, "Couldn't assign qp to send_cq ret=%i", ret); - goto create_qp_exit5; + goto create_qp_exit6; } } @@ -815,22 +825,26 @@ static struct ehca_qp *internal_create_qp( if (ib_copy_to_udata(udata, &resp, sizeof resp)) { ehca_err(pd->device, "Copy to udata failed"); ret = -EINVAL; - goto create_qp_exit6; + goto create_qp_exit7; } } return my_qp; -create_qp_exit6: +create_qp_exit7: ehca_cq_unassign_qp(my_qp->send_cq, my_qp->real_qp_num); -create_qp_exit5: +create_qp_exit6: kfree(my_qp->mod_qp_parm); -create_qp_exit4: +create_qp_exit5: if (HAS_RQ(my_qp)) ipz_queue_dtor(my_pd, &my_qp->ipz_rqueue); +create_qp_exit4: + if (HAS_SQ(my_qp)) + vfree(my_qp->sq_map); + create_qp_exit3: if (HAS_SQ(my_qp)) ipz_queue_dtor(my_pd, &my_qp->ipz_squeue); @@ 
-1534,8 +1548,6 @@ static int internal_modify_qp(struct ib_qp *ibqp, if (attr_mask & IB_QP_QKEY) my_qp->qkey = attr->qkey; - my_qp->state = qp_new_state; - modify_qp_exit2: if (squeue_locked) { /* this means: sqe -> rts */ spin_unlock_irqrestore(&my_qp->spinlock_s, flags); @@ -1551,6 +1563,8 @@ modify_qp_exit1: int ehca_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, int attr_mask, struct ib_udata *udata) { + int ret = 0; + struct ehca_shca *shca = container_of(ibqp->device, struct ehca_shca, ib_device); struct ehca_qp *my_qp = container_of(ibqp, struct ehca_qp, ib_qp); @@ -1597,12 +1611,18 @@ int ehca_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, int attr_mask, attr->qp_state, my_qp->init_attr.port_num, ibqp->qp_type); spin_unlock_irqrestore(&sport->mod_sqp_lock, flags); - return 0; + goto out; } spin_unlock_irqrestore(&sport->mod_sqp_lock, flags); } - return internal_modify_qp(ibqp, attr, attr_mask, 0); + ret = internal_modify_qp(ibqp, attr, attr_mask, 0); + +out: + if ((ret == 0) && (attr_mask & IB_QP_STATE)) + my_qp->state = attr->qp_state; + + return ret; } void ehca_recover_sqp(struct ib_qp *sqp) @@ -1973,8 +1993,10 @@ static int internal_destroy_qp(struct ib_device *dev, struct ehca_qp *my_qp, if (HAS_RQ(my_qp)) ipz_queue_dtor(my_pd, &my_qp->ipz_rqueue); - if (HAS_SQ(my_qp)) + if (HAS_SQ(my_qp)) { ipz_queue_dtor(my_pd, &my_qp->ipz_squeue); + vfree(my_qp->sq_map); + } kmem_cache_free(qp_cache, my_qp); atomic_dec(&shca->num_qps); return 0; diff --git a/drivers/infiniband/hw/ehca/ehca_reqs.c b/drivers/infiniband/hw/ehca/ehca_reqs.c index 898c8b5..4426d82 100644 --- a/drivers/infiniband/hw/ehca/ehca_reqs.c +++ b/drivers/infiniband/hw/ehca/ehca_reqs.c @@ -139,6 +139,7 @@ static void trace_send_wr_ud(const struct ib_send_wr *send_wr) static inline int ehca_write_swqe(struct ehca_qp *qp, struct ehca_wqe *wqe_p, const struct ib_send_wr *send_wr, + u32 sq_map_idx, int hidden) { u32 idx; @@ -157,7 +158,11 @@ static inline int ehca_write_swqe(struct 
ehca_qp *qp, /* clear wqe header until sglist */ memset(wqe_p, 0, offsetof(struct ehca_wqe, u.ud_av.sg_list)); - wqe_p->work_request_id = send_wr->wr_id; + wqe_p->work_request_id = send_wr->wr_id & ~QMAP_IDX_MASK; + wqe_p->work_request_id |= sq_map_idx & QMAP_IDX_MASK; + + qp->sq_map[sq_map_idx].app_wr_id = send_wr->wr_id & QMAP_IDX_MASK; + qp->sq_map[sq_map_idx].reported = 0; switch (send_wr->opcode) { case IB_WR_SEND: @@ -381,6 +386,7 @@ static inline int post_one_send(struct ehca_qp *my_qp, { struct ehca_wqe *wqe_p; int ret; + u32 sq_map_idx; u64 start_offset = my_qp->ipz_squeue.current_q_offset; /* get pointer next to free WQE */ @@ -393,8 +399,15 @@ static inline int post_one_send(struct ehca_qp *my_qp, "qp_num=%x", my_qp->ib_qp.qp_num); return -ENOMEM; } + + /* + * Get the index of the WQE in the send queue. The same index is used + * for writing into the sq_map. + */ + sq_map_idx = start_offset / my_qp->ipz_squeue.qe_size; + /* write a SEND WQE into the QUEUE */ - ret = ehca_write_swqe(my_qp, wqe_p, cur_send_wr, hidden); + ret = ehca_write_swqe(my_qp, wqe_p, cur_send_wr, sq_map_idx, hidden); /* * if something failed, * reset the free entry pointer to the start value @@ -589,7 +602,7 @@ static inline int ehca_poll_cq_one(struct ib_cq *cq, struct ib_wc *wc) struct ehca_qp *my_qp; int cqe_count = 0, is_error; -poll_cq_one_read_cqe: +repoll: cqe = (struct ehca_cqe *) ipz_qeit_get_inc_valid(&my_cq->ipz_queue); if (!cqe) { @@ -617,7 +630,7 @@ poll_cq_one_read_cqe: ehca_dmp(cqe, 64, "cq_num=%x qp_num=%x", my_cq->cq_number, cqe->local_qp_number); /* ignore this purged cqe */ - goto poll_cq_one_read_cqe; + goto repoll; } spin_lock_irqsave(&qp->spinlock_s, flags); purgeflag = qp->sqerr_purgeflag; @@ -636,7 +649,7 @@ poll_cq_one_read_cqe: * that caused sqe and turn off purge flag */ qp->sqerr_purgeflag = 0; - goto poll_cq_one_read_cqe; + goto repoll; } } @@ -654,8 +667,34 @@ poll_cq_one_read_cqe: my_cq, my_cq->cq_number); } - /* we got a completion! 
*/ - wc->wr_id = cqe->work_request_id; + read_lock(&ehca_qp_idr_lock); + my_qp = idr_find(&ehca_qp_idr, cqe->qp_token); + read_unlock(&ehca_qp_idr_lock); + if (!my_qp) + goto repoll; + wc->qp = &my_qp->ib_qp; + + if (!(cqe->w_completion_flags & WC_SEND_RECEIVE_BIT)) { + struct ehca_qmap_entry *qmap_entry; + /* + * We got a send completion and need to restore the original + * wr_id. + */ + qmap_entry = &my_qp->sq_map[cqe->work_request_id & + QMAP_IDX_MASK]; + + if (qmap_entry->reported) { + ehca_warn(cq->device, "Double cqe on qp_num=%#x", + my_qp->real_qp_num); + /* found a double cqe, discard it and read next one */ + goto repoll; + } + wc->wr_id = cqe->work_request_id & ~QMAP_IDX_MASK; + wc->wr_id |= qmap_entry->app_wr_id; + qmap_entry->reported = 1; + } else + /* We got a receive completion. */ + wc->wr_id = cqe->work_request_id; /* eval ib_wc_opcode */ wc->opcode = ib_wc_opcode[cqe->optype]-1; @@ -667,7 +706,7 @@ poll_cq_one_read_cqe: ehca_dmp(cqe, 64, "ehca_cq=%p cq_num=%x", my_cq, my_cq->cq_number); /* update also queue adder to throw away this entry!!! 
*/ - goto poll_cq_one_exit0; + goto repoll; } /* eval ib_wc_status */ @@ -678,11 +717,6 @@ poll_cq_one_read_cqe: } else wc->status = IB_WC_SUCCESS; - read_lock(&ehca_qp_idr_lock); - my_qp = idr_find(&ehca_qp_idr, cqe->qp_token); - wc->qp = &my_qp->ib_qp; - read_unlock(&ehca_qp_idr_lock); - wc->byte_len = cqe->nr_bytes_transferred; wc->pkey_index = cqe->pkey_index; wc->slid = cqe->rlid; diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c index 7ebc400..341ffed 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c @@ -202,7 +202,7 @@ static void ipoib_cm_free_rx_ring(struct net_device *dev, dev_kfree_skb_any(rx_ring[i].skb); } - kfree(rx_ring); + vfree(rx_ring); } static void ipoib_cm_start_rx_drain(struct ipoib_dev_priv *priv) @@ -352,9 +352,14 @@ static int ipoib_cm_nonsrq_init_rx(struct net_device *dev, struct ib_cm_id *cm_i int ret; int i; - rx->rx_ring = kcalloc(ipoib_recvq_size, sizeof *rx->rx_ring, GFP_KERNEL); - if (!rx->rx_ring) + rx->rx_ring = vmalloc(ipoib_recvq_size * sizeof *rx->rx_ring); + if (!rx->rx_ring) { + printk(KERN_WARNING "%s: failed to allocate CM non-SRQ ring (%d entries)\n", + priv->ca->name, ipoib_recvq_size); return -ENOMEM; + } + + memset(rx->rx_ring, 0, ipoib_recvq_size * sizeof *rx->rx_ring); t = kmalloc(sizeof *t, GFP_KERNEL); if (!t) { @@ -1494,14 +1499,16 @@ static void ipoib_cm_create_srq(struct net_device *dev, int max_sge) return; } - priv->cm.srq_ring = kzalloc(ipoib_recvq_size * sizeof *priv->cm.srq_ring, - GFP_KERNEL); + priv->cm.srq_ring = vmalloc(ipoib_recvq_size * sizeof *priv->cm.srq_ring); if (!priv->cm.srq_ring) { printk(KERN_WARNING "%s: failed to allocate CM SRQ ring (%d entries)\n", priv->ca->name, ipoib_recvq_size); ib_destroy_srq(priv->cm.srq); priv->cm.srq = NULL; + return; } + + memset(priv->cm.srq_ring, 0, ipoib_recvq_size * sizeof *priv->cm.srq_ring); } int ipoib_cm_dev_init(struct net_device *dev) From keshetti.mahesh at 
gmail.com Tue Aug 12 21:43:29 2008
From: keshetti.mahesh at gmail.com (Keshetti Mahesh)
Date: Wed, 13 Aug 2008 10:13:29 +0530
Subject: [ofa-general] OpenSM ran out of LIDs
In-Reply-To:
References: <829ded920808110432x151d594cs12f7834aeb66a6c6@mail.gmail.com> <829ded920808112213k429180bay9dce2db0add44112@mail.gmail.com>
Message-ID: <829ded920808122143lc1962f3sbfdd671c3dcdae96@mail.gmail.com>

> Are you sure it never gets read ? OpenSM will (also) update that file.

Now OpenSM is accepting the guid2lid file after some changes to its format.
(Initially I was not using a newline between entries in the guid2lid file.)
But shouldn't OpenSM be able to find free LIDs on its own?

After the LID assignment, I am getting different errors.

==============================================
Aug 13 08:42:47 479488 [41401960] 0x01 -> osm_lft_rcv_process: ERR 0402: Setting forwarding table block failed (IB_INVALID_PARAMETER) Switch 0x200000
Aug 13 08:42:47 480381 [44606960] 0x01 -> osm_lft_rcv_process: ERR 0402: Setting forwarding table block failed (IB_INVALID_PARAMETER) Switch 0x200000
Aug 13 08:42:47 487412 [41E02960] 0x01 -> osm_lft_rcv_process: ERR 0402: Setting forwarding table block failed (IB_INVALID_PARAMETER) Switch 0x200100
Aug 13 08:42:47 487424 [45A08960] 0x01 -> osm_lft_rcv_process: ERR 0402: Setting forwarding table block failed (IB_INVALID_PARAMETER) Switch 0x200100
Aug 13 08:42:47 487455 [42803960] 0x01 -> osm_lft_rcv_process: ERR 0402: Setting forwarding table block failed (IB_INVALID_PARAMETER) Switch 0x200100
. . . . (On all the switches)
==================================================================================

What could be the reason for the failure to set the LIN FWD tables when the
memory required for them is already allocated?

-Mahesh

From olgas at voltaire.com Wed Aug 13 00:33:24 2008
From: olgas at voltaire.com (Olga Shern)
Date: Wed, 13 Aug 2008 10:33:24 +0300
Subject: [ofa-general] RE: [PATCH v3] ib/core: fix for send multicast group send leave retry
In-Reply-To:
References: <48A06A66.7070605@Voltaire.COM>
Message-ID: <39C75744D164D948A170E9792AF8E7CA0167B61C@exil.voltaire.com>

I think you can add this to 2.6.28

Thanks
Olga

-----Original Message-----
From: Roland Dreier [mailto:rdreier at cisco.com]
Sent: Tuesday, August 12, 2008 10:43 PM
To: Yosef Eitgin
Cc: Roland Drier; general list; Olga Shern; Ron Livne; sean.hefty at intel.com
Subject: Re: [PATCH v3] ib/core: fix for send multicast group send leave retry

Is there any urgent need to merge this for 2.6.27? If so, what is the
situation where this causes a big problem?

Otherwise I'll add this for 2.6.28.

From alekseys at voltaire.com Wed Aug 13 01:04:10 2008
From: alekseys at voltaire.com (Aleksey Senin)
Date: Wed, 13 Aug 2008 11:04:10 +0300
Subject: [ofa-general] [RDMA CM IPv6 support. PATCHv3 1/9] sockaddr_storage in addr_req
Message-ID: <1218614650.19941.4.camel@linux-zn6t.site>

>From 6d2ac2bee7831f21049151678071dbf2a6146805 Mon Sep 17 00:00:00 2001
From: Aleksey Senin
Date: Wed, 13 Aug 2008 09:46:51 +0300
Subject: [RDMA CM IPv6 support.
PATCHv3 1/9] sockaddr_storage in addr_req

Using sockaddr_storage in addr_req instead of sockaddr

Signed-off-by: Aleksey Senin
---
 drivers/infiniband/core/addr.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
index 09a2bec..c5b623b 100644
--- a/drivers/infiniband/core/addr.c
+++ b/drivers/infiniband/core/addr.c
@@ -49,8 +49,8 @@ MODULE_LICENSE("Dual BSD/GPL");

 struct addr_req {
 	struct list_head list;
-	struct sockaddr src_addr;
-	struct sockaddr dst_addr;
+	struct sockaddr_storage src_addr;
+	struct sockaddr_storage dst_addr;
 	struct rdma_dev_addr *addr;
 	struct rdma_addr_client *client;
 	void *context;
-- 
1.5.6.dirty

From alekseys at voltaire.com Wed Aug 13 01:06:21 2008
From: alekseys at voltaire.com (Aleksey Senin)
Date: Wed, 13 Aug 2008 11:06:21 +0300
Subject: [ofa-general] [RDMA CM IPv6 support. PATCHv3 2/9] sockaddr_in substitution
In-Reply-To: <1218614650.19941.4.camel@linux-zn6t.site>
References: <1218614650.19941.4.camel@linux-zn6t.site>
Message-ID: <1218614781.19941.6.camel@linux-zn6t.site>

>From 63ef6d190862438cf439d5da2a3a670d919dc980 Mon Sep 17 00:00:00 2001
From: Aleksey Senin
Date: Wed, 13 Aug 2008 09:51:39 +0300
Subject: [RDMA CM IPv6 support.
PATCHv3 2/9] sockaddr_in substitution

To prepare RDMA CM for IPv6, the following functions now take a
struct sockaddr * argument instead of struct sockaddr_in *:
	addr_resolve_remote
	addr_resolve_local
	rdma_resolve_ip

The changes in process_req are a side effect of the modifications to the
functions above.

Signed-off-by: Aleksey Senin
---
 drivers/infiniband/core/addr.c |   46 ++++++++++++++++++++--------------------
 1 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
index c5b623b..b59ad53 100644
--- a/drivers/infiniband/core/addr.c
+++ b/drivers/infiniband/core/addr.c
@@ -171,12 +171,12 @@ static void addr_send_arp(struct sockaddr_in *dst_in)
 	ip_rt_put(rt);
 }

-static int addr_resolve_remote(struct sockaddr_in *src_in,
-			       struct sockaddr_in *dst_in,
+static int addr_resolve_remote(struct sockaddr *src_in,
+			       struct sockaddr *dst_in,
 			       struct rdma_dev_addr *addr)
 {
-	__be32 src_ip = src_in->sin_addr.s_addr;
-	__be32 dst_ip = dst_in->sin_addr.s_addr;
+	__be32 src_ip = ((struct sockaddr_in *)src_in)->sin_addr.s_addr;
+	__be32 dst_ip = ((struct sockaddr_in *)dst_in)->sin_addr.s_addr;
 	struct flowi fl;
 	struct rtable *rt;
 	struct neighbour *neigh;
@@ -207,8 +207,8 @@ static int addr_resolve_remote(struct sockaddr_in *src_in,
 	}

 	if (!src_ip) {
-		src_in->sin_family = dst_in->sin_family;
-		src_in->sin_addr.s_addr = rt->rt_src;
+		src_in->sa_family = dst_in->sa_family;
+		((struct sockaddr_in *)src_in)->sin_addr.s_addr = rt->rt_src;
 	}

 	ret = rdma_copy_addr(addr, neigh->dev, neigh->ha);
@@ -223,7 +223,7 @@ out:
 static void process_req(struct work_struct *work)
 {
 	struct addr_req *req, *temp_req;
-	struct sockaddr_in *src_in, *dst_in;
+	struct sockaddr *src_in, *dst_in;
 	struct list_head done_list;

 	INIT_LIST_HEAD(&done_list);
@@ -231,8 +231,8 @@ static void process_req(struct work_struct *work)
 	mutex_lock(&lock);
 	list_for_each_entry_safe(req, temp_req, &req_list, list) {
 		if (req->status == -ENODATA) {
-			src_in = 
(struct sockaddr_in *) &req->src_addr; - dst_in = (struct sockaddr_in *) &req->dst_addr; + src_in = (struct sockaddr *) &req->src_addr; + dst_in = (struct sockaddr *) &req->dst_addr; req->status = addr_resolve_remote(src_in, dst_in, req->addr); if (req->status && time_after_eq(jiffies, req->timeout)) @@ -251,20 +251,20 @@ static void process_req(struct work_struct *work) list_for_each_entry_safe(req, temp_req, &done_list, list) { list_del(&req->list); - req->callback(req->status, &req->src_addr, req->addr, - req->context); + req->callback(req->status, (struct sockaddr *) &req->src_addr, \ + req->addr, req->context); put_client(req->client); kfree(req); } } -static int addr_resolve_local(struct sockaddr_in *src_in, - struct sockaddr_in *dst_in, +static int addr_resolve_local(struct sockaddr *src_in, + struct sockaddr *dst_in, struct rdma_dev_addr *addr) { struct net_device *dev; - __be32 src_ip = src_in->sin_addr.s_addr; - __be32 dst_ip = dst_in->sin_addr.s_addr; + __be32 src_ip = ((struct sockaddr_in *)src_in)->sin_addr.s_addr; + __be32 dst_ip = ((struct sockaddr_in *)dst_in)->sin_addr.s_addr; int ret; dev = ip_dev_find(&init_net, dst_ip); @@ -272,15 +272,15 @@ static int addr_resolve_local(struct sockaddr_in *src_in, return -EADDRNOTAVAIL; if (ipv4_is_zeronet(src_ip)) { - src_in->sin_family = dst_in->sin_family; - src_in->sin_addr.s_addr = dst_ip; + src_in->sa_family = dst_in->sa_family; + ((struct sockaddr_in *)src_in)->sin_addr.s_addr = dst_ip; ret = rdma_copy_addr(addr, dev, dev->dev_addr); } else if (ipv4_is_loopback(src_ip)) { - ret = rdma_translate_ip((struct sockaddr *)dst_in, addr); + ret = rdma_translate_ip(dst_in, addr); if (!ret) memcpy(addr->dst_dev_addr, dev->dev_addr, MAX_ADDR_LEN); } else { - ret = rdma_translate_ip((struct sockaddr *)src_in, addr); + ret = rdma_translate_ip(src_in, addr); if (!ret) memcpy(addr->dst_dev_addr, dev->dev_addr, MAX_ADDR_LEN); } @@ -296,7 +296,7 @@ int rdma_resolve_ip(struct rdma_addr_client *client, struct rdma_dev_addr 
*addr, void *context), void *context) { - struct sockaddr_in *src_in, *dst_in; + struct sockaddr *src_in, *dst_in; struct addr_req *req; int ret = 0; @@ -313,8 +313,8 @@ int rdma_resolve_ip(struct rdma_addr_client *client, req->client = client; atomic_inc(&client->refcount); - src_in = (struct sockaddr_in *) &req->src_addr; - dst_in = (struct sockaddr_in *) &req->dst_addr; + src_in = (struct sockaddr *) &req->src_addr; + dst_in = (struct sockaddr *) &req->dst_addr; req->status = addr_resolve_local(src_in, dst_in, addr); if (req->status == -EADDRNOTAVAIL) @@ -328,7 +328,7 @@ int rdma_resolve_ip(struct rdma_addr_client *client, case -ENODATA: req->timeout = msecs_to_jiffies(timeout_ms) + jiffies; queue_req(req); - addr_send_arp(dst_in); + addr_send_arp((struct sockaddr_in *)dst_in); break; default: ret = req->status; -- 1.5.6.dirty From alekseys at voltaire.com Wed Aug 13 01:09:39 2008 From: alekseys at voltaire.com (Aleksey Senin) Date: Wed, 13 Aug 2008 11:09:39 +0300 Subject: [ofa-general] [RDMA CM IPv6 support. PATCHv3 3/9] IPv6 support in rdma_translate_ip In-Reply-To: <1218614781.19941.6.camel@linux-zn6t.site> References: <1218614650.19941.4.camel@linux-zn6t.site> <1218614781.19941.6.camel@linux-zn6t.site> Message-ID: <1218614979.19941.9.camel@linux-zn6t.site> >From ba9050a683e8c4ede8aa0d0739c6828b9dcf949f Mon Sep 17 00:00:00 2001 From: Aleksey Senin Date: Wed, 13 Aug 2008 09:53:16 +0300 Subject: [RDMA CM IPv6 support. 
PATCHv3 3/9] IPv6 support in rdma_translate_ip Added support for IPv6 family in rdma_translate_ip function New function cma_ipv6_dev_find used for searching netdevice with given IP address Signed-off-by: Aleksey Senin --- drivers/infiniband/core/addr.c | 31 ++++++++++++++++++++++++------- 1 files changed, 24 insertions(+), 7 deletions(-) diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c index b59ad53..dd5997b 100644 --- a/drivers/infiniband/core/addr.c +++ b/drivers/infiniband/core/addr.c @@ -42,6 +42,7 @@ #include #include #include +#include MODULE_AUTHOR("Sean Hefty"); MODULE_DESCRIPTION("IB Address Translation"); @@ -110,18 +111,34 @@ int rdma_copy_addr(struct rdma_dev_addr *dev_addr, struct net_device *dev, } EXPORT_SYMBOL(rdma_copy_addr); +static struct net_device *cma_ipv6_dev_find(struct in6_addr *addr) +{ + struct net_device *dev; + for_each_netdev(&init_net, dev) + if (ipv6_chk_addr(&init_net, addr, dev, 1)) + return dev; + return NULL; +} + int rdma_translate_ip(struct sockaddr *addr, struct rdma_dev_addr *dev_addr) { struct net_device *dev; - __be32 ip = ((struct sockaddr_in *) addr)->sin_addr.s_addr; - int ret; - - dev = ip_dev_find(&init_net, ip); - if (!dev) - return -EADDRNOTAVAIL; + int ret = -EADDRNOTAVAIL; + + if (addr->sa_family == AF_INET) { + dev = ip_dev_find(&init_net, + ((struct sockaddr_in *) addr)->sin_addr.s_addr); + if (dev) { + ret = rdma_copy_addr(dev_addr, dev, NULL); + dev_put(dev); + } + } else { + dev = cma_ipv6_dev_find( + &((struct sockaddr_in6 *)addr)->sin6_addr); + if (dev) + ret = rdma_copy_addr(dev_addr, dev, NULL); + } - ret = rdma_copy_addr(dev_addr, dev, NULL); - dev_put(dev); return ret; } EXPORT_SYMBOL(rdma_translate_ip); -- 1.5.6.dirty From alekseys at voltaire.com Wed Aug 13 01:11:12 2008 From: alekseys at voltaire.com (Aleksey Senin) Date: Wed, 13 Aug 2008 11:11:12 +0300 Subject: [ofa-general] [RDMA CM IPv6 support. 
PATCHv3 4/9] AF_INET6 support for rdma_bind_addr
In-Reply-To: <1218614979.19941.9.camel@linux-zn6t.site>
References: <1218614650.19941.4.camel@linux-zn6t.site> <1218614781.19941.6.camel@linux-zn6t.site> <1218614979.19941.9.camel@linux-zn6t.site>
Message-ID: <1218615072.19941.11.camel@linux-zn6t.site>

>From dc34ee826793bc42376db02b3d31f3fd47f142bf Mon Sep 17 00:00:00 2001
From: Aleksey Senin
Date: Wed, 13 Aug 2008 09:55:33 +0300
Subject: [RDMA CM IPv6 support. PATCHv3 4/9] AF_INET6 support for rdma_bind_addr

Signed-off-by: Aleksey Senin
---
 drivers/infiniband/core/cma.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index d951896..4728265 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -2073,7 +2073,7 @@ int rdma_bind_addr(struct rdma_cm_id *id, struct sockaddr *addr)
 	struct rdma_id_private *id_priv;
 	int ret;

-	if (addr->sa_family != AF_INET)
+	if (addr->sa_family != AF_INET && addr->sa_family != AF_INET6)
 		return -EAFNOSUPPORT;

 	id_priv = container_of(id, struct rdma_id_private, id);
-- 
1.5.6.dirty

From alekseys at voltaire.com Wed Aug 13 01:12:20 2008
From: alekseys at voltaire.com (Aleksey Senin)
Date: Wed, 13 Aug 2008 11:12:20 +0300
Subject: [ofa-general] [RDMA CM IPv6 support. PATCHv3 5/9] AF_INET6 case to cma_format_hdr function
In-Reply-To: <1218615072.19941.11.camel@linux-zn6t.site>
References: <1218614650.19941.4.camel@linux-zn6t.site> <1218614781.19941.6.camel@linux-zn6t.site> <1218614979.19941.9.camel@linux-zn6t.site> <1218615072.19941.11.camel@linux-zn6t.site>
Message-ID: <1218615140.19941.13.camel@linux-zn6t.site>

>From 946f2b54c4b884cac1c5d5fd17464032ddebf8f3 Mon Sep 17 00:00:00 2001
From: Aleksey Senin
Date: Wed, 13 Aug 2008 10:01:05 +0300
Subject: [RDMA CM IPv6 support.
PATCHv3 5/9] AF_INET6 case to cma_format_hdr function Signed-off-by: Aleksey Senin --- drivers/infiniband/core/cma.c | 73 ++++++++++++++++++++++++++++------------ 1 files changed, 51 insertions(+), 22 deletions(-) diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index 4728265..31f2aa2 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -2113,32 +2113,61 @@ EXPORT_SYMBOL(rdma_bind_addr); static int cma_format_hdr(void *hdr, enum rdma_port_space ps, struct rdma_route *route) { - struct sockaddr_in *src4, *dst4; struct cma_hdr *cma_hdr; struct sdp_hh *sdp_hdr; - src4 = (struct sockaddr_in *) &route->addr.src_addr; - dst4 = (struct sockaddr_in *) &route->addr.dst_addr; - - switch (ps) { - case RDMA_PS_SDP: - sdp_hdr = hdr; - if (sdp_get_majv(sdp_hdr->sdp_version) != SDP_MAJ_VERSION) - return -EINVAL; - sdp_set_ip_ver(sdp_hdr, 4); - sdp_hdr->src_addr.ip4.addr = src4->sin_addr.s_addr; - sdp_hdr->dst_addr.ip4.addr = dst4->sin_addr.s_addr; - sdp_hdr->port = src4->sin_port; - break; - default: - cma_hdr = hdr; - cma_hdr->cma_version = CMA_VERSION; - cma_set_ip_ver(cma_hdr, 4); - cma_hdr->src_addr.ip4.addr = src4->sin_addr.s_addr; - cma_hdr->dst_addr.ip4.addr = dst4->sin_addr.s_addr; - cma_hdr->port = src4->sin_port; - break; + if (route->addr.src_addr.ss_family == AF_INET) { + struct sockaddr_in *src4, *dst4; + + src4 = (struct sockaddr_in *) &route->addr.src_addr; + dst4 = (struct sockaddr_in *) &route->addr.dst_addr; + + switch (ps) { + case RDMA_PS_SDP: + sdp_hdr = hdr; + if (sdp_get_majv(sdp_hdr->sdp_version) != SDP_MAJ_VERSION) + return -EINVAL; + sdp_set_ip_ver(sdp_hdr, 4); + sdp_hdr->src_addr.ip4.addr = src4->sin_addr.s_addr; + sdp_hdr->dst_addr.ip4.addr = dst4->sin_addr.s_addr; + sdp_hdr->port = src4->sin_port; + break; + default: + cma_hdr = hdr; + cma_hdr->cma_version = CMA_VERSION; + cma_set_ip_ver(cma_hdr, 4); + cma_hdr->src_addr.ip4.addr = src4->sin_addr.s_addr; + cma_hdr->dst_addr.ip4.addr = 
dst4->sin_addr.s_addr; + cma_hdr->port = src4->sin_port; + break; + } + } else { + struct sockaddr_in6 *src6, *dst6; + + src6 = (struct sockaddr_in6 *) &route->addr.src_addr; + dst6 = (struct sockaddr_in6 *) &route->addr.dst_addr; + + switch (ps) { + case RDMA_PS_SDP: + sdp_hdr = hdr; + if (sdp_get_majv(sdp_hdr->sdp_version) != SDP_MAJ_VERSION) + return -EINVAL; + sdp_set_ip_ver(sdp_hdr, 6); + sdp_hdr->src_addr.ip6 = src6->sin6_addr; + sdp_hdr->dst_addr.ip6 = dst6->sin6_addr; + sdp_hdr->port = src6->sin6_port; + break; + default: + cma_hdr = hdr; + cma_hdr->cma_version = CMA_VERSION; + cma_set_ip_ver(cma_hdr, 6); + cma_hdr->src_addr.ip6 = src6->sin6_addr; + cma_hdr->dst_addr.ip6 = dst6->sin6_addr; + cma_hdr->port = src6->sin6_port; + break; + } } + return 0; } -- 1.5.6.dirty From alekseys at voltaire.com Wed Aug 13 01:14:06 2008 From: alekseys at voltaire.com (Aleksey Senin) Date: Wed, 13 Aug 2008 11:14:06 +0300 Subject: [ofa-general] [RDMA CM IPv6 support. PATCHv3 6/9] IPv6 support in cma_bind_any In-Reply-To: <1218615140.19941.13.camel@linux-zn6t.site> References: <1218614650.19941.4.camel@linux-zn6t.site> <1218614781.19941.6.camel@linux-zn6t.site> <1218614979.19941.9.camel@linux-zn6t.site> <1218615072.19941.11.camel@linux-zn6t.site> <1218615140.19941.13.camel@linux-zn6t.site> Message-ID: <1218615246.19941.15.camel@linux-zn6t.site> >From a68edf613672c6d9effedb6d73dfca13e071d7b8 Mon Sep 17 00:00:00 2001 From: Aleksey Senin Date: Wed, 13 Aug 2008 10:03:16 +0300 Subject: [RDMA CM IPv6 support. 
PATCHv3 6/9] IPv6 support in cma_bind_any

Use a sockaddr_storage structure instead of sockaddr_in so that IPv6
addresses can be handled.

Signed-off-by: Aleksey Senin
---
 drivers/infiniband/core/cma.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 31f2aa2..df22c5c 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1467,10 +1467,10 @@ static void cma_listen_on_all(struct rdma_id_private *id_priv)

 static int cma_bind_any(struct rdma_cm_id *id, sa_family_t af)
 {
-	struct sockaddr_in addr_in;
+	struct sockaddr_storage addr_in;

 	memset(&addr_in, 0, sizeof addr_in);
-	addr_in.sin_family = af;
+	addr_in.ss_family = af;
 	return rdma_bind_addr(id, (struct sockaddr *) &addr_in);
 }

-- 
1.5.6.dirty

From alekseys at voltaire.com Wed Aug 13 01:15:22 2008
From: alekseys at voltaire.com (Aleksey Senin)
Date: Wed, 13 Aug 2008 11:15:22 +0300
Subject: [ofa-general] [RDMA CM IPv6 support. PATCHv3 7/9] IPv6 local address resolution
In-Reply-To: <1218615246.19941.15.camel@linux-zn6t.site>
References: <1218614650.19941.4.camel@linux-zn6t.site> <1218614781.19941.6.camel@linux-zn6t.site> <1218614979.19941.9.camel@linux-zn6t.site> <1218615072.19941.11.camel@linux-zn6t.site> <1218615140.19941.13.camel@linux-zn6t.site> <1218615246.19941.15.camel@linux-zn6t.site>
Message-ID: <1218615322.19941.17.camel@linux-zn6t.site>

>From 0b137fa4506811433dadec2dac94b4fb7bec0063 Mon Sep 17 00:00:00 2001
From: Aleksey Senin
Date: Wed, 13 Aug 2008 10:10:54 +0300
Subject: [RDMA CM IPv6 support. PATCHv3 7/9] IPv6 local address resolution

RDMA CM support on the local machine.

Signed-off-by: Aleksey Senin --- drivers/infiniband/core/addr.c | 64 ++++++++++++++++++++++++++++------------ 1 files changed, 45 insertions(+), 19 deletions(-) diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c index dd5997b..077d051 100644 --- a/drivers/infiniband/core/addr.c +++ b/drivers/infiniband/core/addr.c @@ -280,29 +280,55 @@ static int addr_resolve_local(struct sockaddr *src_in, struct rdma_dev_addr *addr) { struct net_device *dev; - __be32 src_ip = ((struct sockaddr_in *)src_in)->sin_addr.s_addr; - __be32 dst_ip = ((struct sockaddr_in *)dst_in)->sin_addr.s_addr; - int ret; - - dev = ip_dev_find(&init_net, dst_ip); - if (!dev) - return -EADDRNOTAVAIL; + int ret = -EADDRNOTAVAIL; - if (ipv4_is_zeronet(src_ip)) { - src_in->sa_family = dst_in->sa_family; - ((struct sockaddr_in *)src_in)->sin_addr.s_addr = dst_ip; - ret = rdma_copy_addr(addr, dev, dev->dev_addr); - } else if (ipv4_is_loopback(src_ip)) { - ret = rdma_translate_ip(dst_in, addr); - if (!ret) - memcpy(addr->dst_dev_addr, dev->dev_addr, MAX_ADDR_LEN); + if (dst_in->sa_family == AF_INET) { + __be32 src_ip = ((struct sockaddr_in *)src_in)->sin_addr.s_addr; + __be32 dst_ip = ((struct sockaddr_in *)dst_in)->sin_addr.s_addr; + + dev = ip_dev_find(&init_net, dst_ip); + if (!dev) + return -EADDRNOTAVAIL; + + if (ipv4_is_zeronet(src_ip)) { + src_in->sa_family = dst_in->sa_family; + ((struct sockaddr_in *)src_in)->sin_addr.s_addr = dst_ip; + ret = rdma_copy_addr(addr, dev, dev->dev_addr); + } else if (ipv4_is_loopback(src_ip)) { + ret = rdma_translate_ip(dst_in, addr); + if (!ret) + memcpy(addr->dst_dev_addr, dev->dev_addr, MAX_ADDR_LEN); + } else { + ret = rdma_translate_ip(src_in, addr); + if (!ret) + memcpy(addr->dst_dev_addr, dev->dev_addr, MAX_ADDR_LEN); + } + dev_put(dev); } else { - ret = rdma_translate_ip(src_in, addr); - if (!ret) - memcpy(addr->dst_dev_addr, dev->dev_addr, MAX_ADDR_LEN); + struct in6_addr *a = &((struct sockaddr_in6 *)dst_in)->sin6_addr; + + dev = 
cma_ipv6_dev_find(a); + if (!dev) + return -EADDRNOTAVAIL; + + a = &((struct sockaddr_in6 *)src_in)->sin6_addr; + + if (ipv6_addr_any(a)) { + src_in->sa_family = dst_in->sa_family; + ((struct sockaddr_in6 *)src_in)->sin6_addr = + ((struct sockaddr_in6 *)dst_in)->sin6_addr; + ret = rdma_copy_addr(addr, dev, dev->dev_addr); + } else if (ipv6_addr_loopback(a)) { + ret = rdma_translate_ip(dst_in, addr); + if (!ret) + memcpy(addr->dst_dev_addr, dev->dev_addr, MAX_ADDR_LEN); + } else { + ret = rdma_translate_ip(src_in, addr); + if (!ret) + memcpy(addr->dst_dev_addr, dev->dev_addr, MAX_ADDR_LEN); + } } - dev_put(dev); return ret; } -- 1.5.6.dirty From alekseys at voltaire.com Wed Aug 13 01:16:12 2008 From: alekseys at voltaire.com (Aleksey Senin) Date: Wed, 13 Aug 2008 11:16:12 +0300 Subject: [ofa-general] [RDMA CM IPv6 support. PATCHv3 8/9] IPv6 support for network discovery In-Reply-To: <1218615322.19941.17.camel@linux-zn6t.site> References: <1218614650.19941.4.camel@linux-zn6t.site> <1218614781.19941.6.camel@linux-zn6t.site> <1218614979.19941.9.camel@linux-zn6t.site> <1218615072.19941.11.camel@linux-zn6t.site> <1218615140.19941.13.camel@linux-zn6t.site> <1218615246.19941.15.camel@linux-zn6t.site> <1218615322.19941.17.camel@linux-zn6t.site> Message-ID: <1218615372.19941.19.camel@linux-zn6t.site> >From b61f975bf173e7770561d0ad948e82e32a53b9c9 Mon Sep 17 00:00:00 2001 From: Aleksey Senin Date: Wed, 13 Aug 2008 10:19:13 +0300 Subject: [RDMA CM IPv6 support. 
PATCHv3 8/9] IPv6 support for network discovery Added support for network discovery in addr_send_arp function Signed-off-by: Aleksey Senin --- drivers/infiniband/core/addr.c | 32 ++++++++++++++++++++++++-------- 1 files changed, 24 insertions(+), 8 deletions(-) diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c index 077d051..c949ab0 100644 --- a/drivers/infiniband/core/addr.c +++ b/drivers/infiniband/core/addr.c @@ -43,6 +43,7 @@ #include #include #include +#include MODULE_AUTHOR("Sean Hefty"); MODULE_DESCRIPTION("IB Address Translation"); @@ -173,19 +174,34 @@ static void queue_req(struct addr_req *req) mutex_unlock(&lock); } -static void addr_send_arp(struct sockaddr_in *dst_in) +static void addr_send_arp(struct sockaddr *dst_in) { struct rtable *rt; struct flowi fl; - __be32 dst_ip = dst_in->sin_addr.s_addr; + struct dst_entry *dst; memset(&fl, 0, sizeof fl); - fl.nl_u.ip4_u.daddr = dst_ip; - if (ip_route_output_key(&init_net, &rt, &fl)) - return; + if (dst_in->sa_family == AF_INET) { + fl.nl_u.ip4_u.daddr = + ((struct sockaddr_in *)dst_in)->sin_addr.s_addr; - neigh_event_send(rt->u.dst.neighbour, NULL); - ip_rt_put(rt); + if (ip_route_output_key(&init_net, &rt, &fl)) + return; + + neigh_event_send(rt->u.dst.neighbour, NULL); + ip_rt_put(rt); + + } else { + fl.nl_u.ip6_u.daddr = + ((struct sockaddr_in6 *)dst_in)->sin6_addr; + + dst = ip6_route_output(&init_net, NULL, &fl); + if (!dst) + return; + + neigh_event_send(dst->neighbour, NULL); + dst_release(dst); + } } static int addr_resolve_remote(struct sockaddr *src_in, @@ -371,7 +387,7 @@ int rdma_resolve_ip(struct rdma_addr_client *client, case -ENODATA: req->timeout = msecs_to_jiffies(timeout_ms) + jiffies; queue_req(req); - addr_send_arp((struct sockaddr_in *)dst_in); + addr_send_arp(dst_in); break; default: ret = req->status; -- 1.5.6.dirty From alekseys at voltaire.com Wed Aug 13 01:19:21 2008 From: alekseys at voltaire.com (Aleksey Senin) Date: Wed, 13 Aug 2008 11:19:21 +0300 
Subject: [ofa-general] [RDMA CM IPv6 support. PATCHv3 9/9] Remote IPv6 resolution
In-Reply-To: <1218615372.19941.19.camel@linux-zn6t.site>
References: <1218614650.19941.4.camel@linux-zn6t.site> <1218614781.19941.6.camel@linux-zn6t.site> <1218614979.19941.9.camel@linux-zn6t.site> <1218615072.19941.11.camel@linux-zn6t.site> <1218615140.19941.13.camel@linux-zn6t.site> <1218615246.19941.15.camel@linux-zn6t.site> <1218615322.19941.17.camel@linux-zn6t.site> <1218615372.19941.19.camel@linux-zn6t.site>
Message-ID: <1218615561.19941.23.camel@linux-zn6t.site>

>From ff5fb35f1e78bdfefab4702ab33127ca28de20ae Mon Sep 17 00:00:00 2001
From: Aleksey Senin
Date: Wed, 13 Aug 2008 10:53:24 +0300
Subject: [RDMA CM IPv6 support. PATCHv3 9/9] Remote IPv6 resolution

Added remote address resolution for RDMA CM.

Function addr_resolve_remote is now used as a wrapper for two other functions:
	addr4_resolve_remote (the original addr_resolve_remote)
	addr6_resolve_remote (new function)

It may look as if the arguments that one of the previous patches changed to
sockaddr have now been switched back, but that is not the case. It only
appears so because the function names have been changed.

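For readers following the wrapper idea described above, here is a minimal user-space sketch of dispatching on the address family. It is not the patch's code: resolve4(), resolve6() and resolve_remote() are hypothetical names standing in for addr4_resolve_remote(), addr6_resolve_remote() and addr_resolve_remote(), and the stand-ins only return a marker value instead of doing any route or neighbour lookup. Dispatching on the destination family and rejecting unknown families with -EAFNOSUPPORT are choices made for this sketch; the patch itself tests src_in->sa_family and treats everything that is not AF_INET as IPv6.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical stand-in for addr4_resolve_remote(); returns 4 to mark the IPv4 path. */
static int resolve4(const struct sockaddr_in *src, const struct sockaddr_in *dst)
{
	(void)src;
	(void)dst;
	return 4;
}

/* Hypothetical stand-in for addr6_resolve_remote(); returns 6 to mark the IPv6 path. */
static int resolve6(const struct sockaddr_in6 *src, const struct sockaddr_in6 *dst)
{
	(void)src;
	(void)dst;
	return 6;
}

/*
 * Family-agnostic wrapper: callers pass generic sockaddr pointers and the
 * wrapper casts to the family-specific type before calling the helper.
 * Assumption of this sketch: the destination family selects the path.
 */
static int resolve_remote(const struct sockaddr *src, const struct sockaddr *dst)
{
	switch (dst->sa_family) {
	case AF_INET:
		return resolve4((const struct sockaddr_in *)src,
				(const struct sockaddr_in *)dst);
	case AF_INET6:
		return resolve6((const struct sockaddr_in6 *)src,
				(const struct sockaddr_in6 *)dst);
	default:
		return -EAFNOSUPPORT;
	}
}
```

A caller would typically keep both addresses in struct sockaddr_storage, as addr_req does after patch 1/9, and cast them to struct sockaddr * when calling the wrapper.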
Signed-off-by: Aleksey Senin --- drivers/infiniband/core/addr.c | 53 +++++++++++++++++++++++++++++++++++---- 1 files changed, 47 insertions(+), 6 deletions(-) diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c index c949ab0..83987e7 100644 --- a/drivers/infiniband/core/addr.c +++ b/drivers/infiniband/core/addr.c @@ -204,12 +204,12 @@ static void addr_send_arp(struct sockaddr *dst_in) } } -static int addr_resolve_remote(struct sockaddr *src_in, - struct sockaddr *dst_in, +static int addr4_resolve_remote(struct sockaddr_in *src_in, + struct sockaddr_in *dst_in, struct rdma_dev_addr *addr) { - __be32 src_ip = ((struct sockaddr_in *)src_in)->sin_addr.s_addr; - __be32 dst_ip = ((struct sockaddr_in *)dst_in)->sin_addr.s_addr; + __be32 src_ip = src_in->sin_addr.s_addr; + __be32 dst_ip = dst_in->sin_addr.s_addr; struct flowi fl; struct rtable *rt; struct neighbour *neigh; @@ -240,8 +240,8 @@ static int addr_resolve_remote(struct sockaddr *src_in, } if (!src_ip) { - src_in->sa_family = dst_in->sa_family; - ((struct sockaddr_in *)src_in)->sin_addr.s_addr = rt->rt_src; + src_in->sin_family = dst_in->sin_family; + src_in->sin_addr.s_addr = rt->rt_src; } ret = rdma_copy_addr(addr, neigh->dev, neigh->ha); @@ -253,6 +253,47 @@ out: return ret; } +static int addr6_resolve_remote(struct sockaddr_in6 *src_in, + struct sockaddr_in6 *dst_in, + struct rdma_dev_addr *addr) +{ + struct flowi fl; + struct neighbour *neigh; + struct dst_entry *dst; + int ret = -ENODATA; + + memset(&fl, 0, sizeof fl); + fl.nl_u.ip6_u.daddr = dst_in->sin6_addr; + fl.nl_u.ip6_u.saddr = src_in->sin6_addr; + + dst = ip6_route_output(&init_net, NULL, &fl); + if (!dst) + return ret; + + if (dst->dev->flags & IFF_NOARP) { + ret = rdma_copy_addr(addr, dst->dev, NULL); + } else { + neigh = dst->neighbour; + if (neigh && (neigh->nud_state & NUD_VALID)) + ret = rdma_copy_addr(addr, neigh->dev, neigh->ha); + } + + dst_release(dst); + return ret; +} + +static int addr_resolve_remote(struct 
sockaddr *src_in, + struct sockaddr *dst_in, + struct rdma_dev_addr *addr) +{ + if (src_in->sa_family == AF_INET) { + return addr4_resolve_remote((struct sockaddr_in *)src_in, + (struct sockaddr_in *)dst_in, addr); + } else + return addr6_resolve_remote((struct sockaddr_in6 *)src_in, + (struct sockaddr_in6 *)dst_in, addr); +} + static void process_req(struct work_struct *work) { struct addr_req *req, *temp_req; -- 1.5.6.dirty From PHF at zurich.ibm.com Wed Aug 13 01:24:04 2008 From: PHF at zurich.ibm.com (Philip Frey1) Date: Wed, 13 Aug 2008 10:24:04 +0200 Subject: [ofa-general] cxgb3: MW Support In-Reply-To: <48A1B10A.6030605@opengridcomputing.com> References: <48A1B10A.6030605@opengridcomputing.com> Message-ID: > Philip Frey1 wrote: > > > > Are memory windows supported by the Chelsio T3 adapter? If so, what is > > the verb to bind/unbind them? > > > > MWs are supported in ofed-1.4, which is about the beta. The support in > linux/ofed is only for kernel-mode MWs. > > Are you doing user mode or kernel mode? User mode :) > For kernel mode, you allocate a mw with ib_alloc_mw(). And you bind it > with ib_bind_mw(). You can invalidate it by posting a IB_WR_LOCAL_INV > work request to the send queue. Then you can do a subsequent bind. > Deallocate with ib_dealloc_mw(). Ok, thanks. I will keep that in mind for later use (as soon as it is available). > For user mode, I have a patch series that adds support for MWs, but I > hasn't been submitted yet for group review. Its on my (very long) todo > list. :) No worries, it is not urgent. I was just wondering how expensive it is to allocate a memory window as opposed to allocating a memory region - way cheaper I guess. Thanks, Philip -------------- next part -------------- An HTML attachment was scrubbed... URL: From alekseys at voltaire.com Wed Aug 13 01:39:39 2008 From: alekseys at voltaire.com (Aleksey Senin) Date: Wed, 13 Aug 2008 11:39:39 +0300 Subject: [ofa-general] [RDMA CM IPv6 support. 
PATCHv3 ] Comparison to PATCHv2 In-Reply-To: <1218614650.19941.4.camel@linux-zn6t.site> References: <1218614650.19941.4.camel@linux-zn6t.site> Message-ID: <1218616779.4186.4.camel@linux-zn6t.site> Differences from V2: Sean's notes were taken into account. Patch 8 was split into two patches, and the patch for network discovery now comes before the one for remote address resolution. From felix at chelsio.com Wed Aug 13 02:28:52 2008 From: felix at chelsio.com (Felix Marti) Date: Wed, 13 Aug 2008 02:28:52 -0700 Subject: [ofa-general] cxgb3: MW Support References: <48A1B10A.6030605@opengridcomputing.com> Message-ID: <8A71B368A89016469F72CD08050AD334033A53C9@maui.asicdesigners.com> From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Philip Frey1 Sent: Wednesday, August 13, 2008 1:24 AM To: Steve Wise Cc: general at lists.openfabrics.org Subject: Re: [ofa-general] cxgb3: MW Support > Philip Frey1 wrote: > > > > Are memory windows supported by the Chelsio T3 adapter? If so, what is > > the verb to bind/unbind them? > > > > MWs are supported in ofed-1.4, which is about the beta. The support in > linux/ofed is only for kernel-mode MWs. > > Are you doing user mode or kernel mode? User mode :) > For kernel mode, you allocate a mw with ib_alloc_mw(). And you bind it > with ib_bind_mw(). You can invalidate it by posting a IB_WR_LOCAL_INV > work request to the send queue. Then you can do a subsequent bind. > Deallocate with ib_dealloc_mw(). Ok, thanks. I will keep that in mind for later use (as soon as it is available). > For user mode, I have a patch series that adds support for MWs, but I > hasn't been submitted yet for group review. Its on my (very long) todo > list. :) No worries, it is not urgent. I was just wondering how expensive it is to allocate a memory window as opposed to allocating a memory region - way cheaper I guess. [felix] Yes.
Note that QP operations are considered 'fast path' (and are fully pipelined in T3) with respect to memory/STag management they include - Fast Register Non Shared MR (priv QP only) - Bind MW - Local Invalidate STag You can issue e.g. a Bind MW and then immediately use the MW STag in e.g. a following RDMA Write. Ordering guarantees that MW is ready by the time the RDMA Write is executed. Thanks, Philip -------------- next part -------------- An HTML attachment was scrubbed... URL: From vlad at lists.openfabrics.org Wed Aug 13 02:52:54 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Wed, 13 Aug 2008 02:52:54 -0700 (PDT) Subject: [ofa-general] ofa_1_4_kernel 20080813-0200 daily build status Message-ID: <20080813095254.BC1D7E60DF6@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_4/linux-2.6.git git_branch: ofed_kernel Common build parameters: Passed: Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.16 Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.17 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.22.5-31-default Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on ia64 with linux-2.6.16 Passed on ia64 with linux-2.6.17 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.26 Passed on ia64 with linux-2.6.25 Passed on ia64 with linux-2.6.24 Passed on ppc64 with 
linux-2.6.16 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.17 Passed on ppc64 with linux-2.6.24 Passed on ppc64 with linux-2.6.19 Failed: Build failed on x86_64 with linux-2.6.16.43-0.3-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.43-0.3-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.43-0.3-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.43-0.3-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.16.21-0.8-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.21-0.8-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.21-0.8-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.21-0.8-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.16.60-0.21-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.60-0.21-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.60-0.21-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.60-0.21-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-8.el5 Log: from 
/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-8.el5_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-8.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-8.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-53.el5 Log: from /home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-53.el5_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** 
[_module_/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-53.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-53.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-93.el5 Log: from /home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-93.el5_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-93.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-93.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-55.ELsmp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: include/linux/skbuff.h: In function 'skb_add_data': include/linux/skbuff.h:1041: warning: pointer targets in passing argument 2 of 'csum_partial_copy_from_user' differ in signedness make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.9-55.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-55.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-42.ELsmp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: include/linux/skbuff.h: In function 'skb_add_data': include/linux/skbuff.h:1013: warning: pointer targets in passing argument 2 of 'csum_partial_copy_from_user' differ in signedness make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.9-42.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-42.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.16.21-0.8-default Log: from /home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: 
/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.21-0.8-default_ia64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.16.21-0.8-default_ia64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.16.21-0.8-default' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.18-8.el5 Log: from /home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-8.el5_ppc64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080813-0200_linux-2.6.18-8.el5_ppc64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.18-8.el5' make: *** [kernel] Error 2 
---------------------------------------------------------------------------------- From jackm at dev.mellanox.co.il Wed Aug 13 03:09:56 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Wed, 13 Aug 2008 13:09:56 +0300 Subject: [ofa-general] Re: [PATCH 1 of 2 for 2.6.28] core: Fix Raw Ethertype QP support In-Reply-To: References: <200808121720.11878.jackm@dev.mellanox.co.il> Message-ID: <200808131309.57086.jackm@dev.mellanox.co.il> On Tuesday 12 August 2008 22:38, Roland Dreier wrote: > What is going to be the consumer of raw QPs? > > - R. > Raw QP will be used for an IB sniffer for switch packets. - Jack From jackm at dev.mellanox.co.il Wed Aug 13 03:13:53 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Wed, 13 Aug 2008 13:13:53 +0300 Subject: [ofa-general] Re: [PATCH for 2.6.28] core: fix memory leak in XRC userspace cleanup. In-Reply-To: References: <200808121831.03964.jackm@dev.mellanox.co.il> Message-ID: <200808131313.53344.jackm@dev.mellanox.co.il> On Tuesday 12 August 2008 22:42, Roland Dreier wrote: > This is obviously pretty gross. Let me see if I can come up with a more > palatable fix in my tree. > > - R. I know it's gross. I think the proper way to go is with resource counting, but as I mentioned, that can be a "falling dominoes" fix, with the required SRQ changes. I doubt I can do something nice for the beta release of ofed 1.4 -- but hopefully in a couple of weeks we'll have something better.
- Jack From hal.rosenstock at gmail.com Wed Aug 13 06:53:10 2008 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Wed, 13 Aug 2008 09:53:10 -0400 Subject: [ofa-general] OpenSM ran out of LIDs In-Reply-To: <829ded920808122143lc1962f3sbfdd671c3dcdae96@mail.gmail.com> References: <829ded920808110432x151d594cs12f7834aeb66a6c6@mail.gmail.com> <829ded920808112213k429180bay9dce2db0add44112@mail.gmail.com> <829ded920808122143lc1962f3sbfdd671c3dcdae96@mail.gmail.com> Message-ID: On Wed, Aug 13, 2008 at 12:43 AM, Keshetti Mahesh wrote: >> Are you sure it never gets read ? OpenSM will (also) update that file. > > Now OpenSM is accepting the guid2lid with some changes in the format. > (Initially I was not using newline between entries in the guid2lid file). So this was a hand edited file rather than one produced by OpenSM ? > But shouldn't OpenSM be able to find free lids on its own ? Yes. > After the LID assignment, I am getting different errors. > > ============================================== > Aug 13 08:42:47 479488 [41401960] 0x01 -> osm_lft_rcv_process: ERR > 0402: Setting forwarding table block failed (IB_INVALID_PARAMETER) > Switch 0x200000 > Aug 13 08:42:47 480381 [44606960] 0x01 -> osm_lft_rcv_process: ERR > 0402: Setting forwarding table block failed (IB_INVALID_PARAMETER) > Switch 0x200000 > Aug 13 08:42:47 487412 [41E02960] 0x01 -> osm_lft_rcv_process: ERR > 0402: Setting forwarding table block failed (IB_INVALID_PARAMETER) > Switch 0x200100 > Aug 13 08:42:47 487424 [45A08960] 0x01 -> osm_lft_rcv_process: ERR > 0402: Setting forwarding table block failed (IB_INVALID_PARAMETER) > Switch 0x200100 > Aug 13 08:42:47 487455 [42803960] 0x01 -> osm_lft_rcv_process: ERR > 0402: Setting forwarding table block failed (IB_INVALID_PARAMETER) > Switch 0x200100 > . > . > . > . 
> > > (On all the switches) > ================================================================================== > > What could be the reason for failure of setting LIN FWD tables when > the memory required for LIN FWD > tables is already allocated ? Not sure what you mean by that question. I think that error means OpenSM was unable to set the switch LFT. I think this is a simulator configuration issue. In ibsim/sim.h, try changing MAXLINEARCAP to (48*1024). -- Hal > -Mahesh > From kliteyn at mellanox.co.il Wed Aug 13 07:24:22 2008 From: kliteyn at mellanox.co.il (Yevgeny Kliteynik) Date: Wed, 13 Aug 2008 17:24:22 +0300 Subject: [ofa-general] [PATCH] opensm/osm_ucast_mgr: code consolidation and cleanup In-Reply-To: <20080625101734.GB22159@sashak.voltaire.com> References: <1214252698.5369.537.camel@cardanus.llnl.gov> <20080624130950.GL7341@sashak.voltaire.com> <20080624204340.GR7341@sashak.voltaire.com> <20080624204509.GS7341@sashak.voltaire.com> <4861F98F.8080308@mellanox.co.il> <20080625101734.GB22159@sashak.voltaire.com> Message-ID: <48A2EE96.9000408@mellanox.co.il> Hi Sasha, Sasha Khapyorsky wrote: > Hi Yevgeny, > > On 10:53 Wed 25 Jun , Yevgeny Kliteynik wrote: > >> OpenSM crashed in cl_qlist_insert_tail() on the following assert: >> >> CL_ASSERT(p_list_item->p_list != p_list); >> > > Yes, I see why it does. Thanks for finding this. 
> > Something like this: > > diff --git a/opensm/opensm/osm_ucast_mgr.c b/opensm/opensm/osm_ucast_mgr.c > index b9e484e..39ce2bd 100644 > --- a/opensm/opensm/osm_ucast_mgr.c > +++ b/opensm/opensm/osm_ucast_mgr.c > @@ -743,6 +743,9 @@ static void ucast_mgr_build_lfts(osm_ucast_mgr_t *p_mgr) > > cl_qmap_apply_func(&p_mgr->p_subn->sw_guid_tbl, > __osm_ucast_mgr_process_tbl, p_mgr); > + > + while(cl_is_qlist_empty(&p_mgr->port_order_list)) > + cl_qlist_remove_head(&p_mgr->port_order_list); > Perhaps you mean this: + while(!cl_is_qlist_empty(&p_mgr->port_order_list)) + cl_qlist_remove_head(&p_mgr->port_order_list); Or: + cl_qlist_remove_all(&p_mgr->port_order_list)) Patch shortly. -- Yevgeny > Sasha > From kliteyn at dev.mellanox.co.il Wed Aug 13 07:35:04 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Wed, 13 Aug 2008 17:35:04 +0300 Subject: [ofa-general] [PATCH] opensm/osm_ucast_mgr.c: cleaning port_order_list Message-ID: <48A2F118.30505@dev.mellanox.co.il> Hi Sasha, Small bug fix in cleaning the port order list. This bug was causing assertion in list handling. 
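The inverted loop condition discussed in this thread can be illustrated with a small stand-alone list. This is a sketch under stated assumptions, not complib: struct list, struct item and the list_* helpers are hypothetical names invented for the example, and the owner field merely mimics the kind of membership check that CL_ASSERT performs on insertion.

```c
#include <assert.h>
#include <stddef.h>

struct list;

/* Each item remembers which list, if any, it is currently linked on. */
struct item {
	struct item *next;
	struct list *owner;
};

struct list {
	struct item *head;
	size_t count;
};

static int list_is_empty(const struct list *l)
{
	return l->head == NULL;
}

static void list_insert_head(struct list *l, struct item *it)
{
	/* Inserting an item that is still linked somewhere is a bug. */
	assert(it->owner == NULL);
	it->next = l->head;
	it->owner = l;
	l->head = it;
	l->count++;
}

static struct item *list_remove_head(struct list *l)
{
	struct item *it = l->head;

	if (it == NULL)
		return NULL;
	l->head = it->next;
	it->next = NULL;
	it->owner = NULL;
	l->count--;
	return it;
}

/*
 * Drain the list and return how many items were unlinked.
 * The loop must run while the list is NOT empty. With the test
 * inverted, the body never executes for a populated list, every
 * item stays linked, and the next insert of the same item trips
 * the assertion in list_insert_head().
 */
static size_t list_drain(struct list *l)
{
	size_t removed = 0;

	while (!list_is_empty(l)) {
		list_remove_head(l);
		removed++;
	}
	return removed;
}
```

A single "remove all" call, as the final patch uses, avoids the hand-written loop and with it the chance of getting the condition backwards.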
Signed-off-by: Yevgeny Kliteynik --- opensm/opensm/osm_ucast_mgr.c | 3 +-- 1 files changed, 1 insertions(+), 2 deletions(-) diff --git a/opensm/opensm/osm_ucast_mgr.c b/opensm/opensm/osm_ucast_mgr.c index a686dc9..9d0ad13 100644 --- a/opensm/opensm/osm_ucast_mgr.c +++ b/opensm/opensm/osm_ucast_mgr.c @@ -785,8 +785,7 @@ static void ucast_mgr_build_lfts(osm_ucast_mgr_t *p_mgr) cl_qmap_apply_func(&p_mgr->p_subn->sw_guid_tbl, __osm_ucast_mgr_process_tbl, p_mgr); - while(cl_is_qlist_empty(&p_mgr->port_order_list)) - cl_qlist_remove_head(&p_mgr->port_order_list); + cl_qlist_remove_all(&p_mgr->port_order_list); } /********************************************************************** -- 1.5.1.4 From kliteyn at dev.mellanox.co.il Wed Aug 13 07:52:00 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Wed, 13 Aug 2008 17:52:00 +0300 Subject: [ofa-general] [PATCH] opensm/libvendor/Makefile.am: create symbolic link to osmvendor library Message-ID: <48A2F510.4030903@dev.mellanox.co.il> Hi Sasha, Creating a symbolic link to the vendor library that denotes which type of vendor it is. This is needed for ibutils/ibis. 
This install-exec-hook (among many others) was removed by the following patch: http://lists.openfabrics.org/pipermail/general/2008-July/052742.html Signed-off-by: Yevgeny Kliteynik --- opensm/libvendor/Makefile.am | 5 +++++ 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/opensm/libvendor/Makefile.am b/opensm/libvendor/Makefile.am index f72dbbe..a6dd0b9 100644 --- a/opensm/libvendor/Makefile.am +++ b/opensm/libvendor/Makefile.am @@ -88,3 +88,8 @@ libosmvendorinclude_HEADERS = $(HDRS) # headers are distributed as part of the include dir EXTRA_DIST = $(srcdir)/libosmvendor.map $(srcdir)/libosmvendor.ver + +# Create a link to the installed vendor lib to +# mark the type of the vendor library +install-exec-hook: + ln -sf $(DESTDIR)/$(libdir)/libosmvendor.so $(DESTDIR)/$(libdir)/libosmvendor_$(with_osmv).so -- 1.5.1.4 From kliteyn at dev.mellanox.co.il Wed Aug 13 08:13:27 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Wed, 13 Aug 2008 18:13:27 +0300 Subject: [ofa-general] [PATCH v2] opensm/libvendor/Makefile.am: create symbolic link to osmvendor library Message-ID: <48A2FA17.3090802@dev.mellanox.co.il> Hi Sasha, Creating a symbolic link to the vendor library that denotes which type of vendor it is. This is needed for ibutils/ibis. This install-exec-hook (among many others) was removed by the following patch: http://lists.openfabrics.org/pipermail/general/2008-July/052742.html [V2] The previous version created a link to a full-path file name. This one creates a local link to a relative-path file name.
Signed-off-by: Yevgeny Kliteynik --- opensm/libvendor/Makefile.am | 6 ++++++ 1 files changed, 6 insertions(+), 0 deletions(-) diff --git a/opensm/libvendor/Makefile.am b/opensm/libvendor/Makefile.am index f72dbbe..db1b319 100644 --- a/opensm/libvendor/Makefile.am +++ b/opensm/libvendor/Makefile.am @@ -88,3 +88,9 @@ libosmvendorinclude_HEADERS = $(HDRS) # headers are distributed as part of the include dir EXTRA_DIST = $(srcdir)/libosmvendor.map $(srcdir)/libosmvendor.ver + +# Create a link to the installed vendor lib to +# mark the type of the vendor library +install-exec-hook: + lname=`\ls -l $(DESTDIR)/$(libdir)/libosmvendor.so | awk '{print $$NF}'`; \ + ln -sf $$lname $(DESTDIR)/$(libdir)/libosmvendor_$(with_osmv).so; -- 1.5.1.4 From hrosenstock at obsidianresearch.com Wed Aug 13 08:43:04 2008 From: hrosenstock at obsidianresearch.com (Hal Rosenstock) Date: Wed, 13 Aug 2008 09:43:04 -0600 Subject: [ofa-general] [PATCH] ibsim: Add support for vendor ID and system image GUID Message-ID: <48A30108.4010307@obsidianresearch.com> Sasha, Please see attached file. Hopefully this works better. -- Hal -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: patch-ibsim-sysimg2 URL: From chu11 at llnl.gov Wed Aug 13 10:04:46 2008 From: chu11 at llnl.gov (Al Chu) Date: Wed, 13 Aug 2008 10:04:46 -0700 Subject: [ofa-general] [IBSIM][Trivial] initialize netstarted to 0 Message-ID: <1218647086.16508.566.camel@cardanus.llnl.gov> Hey Sasha, Noticed it. Nothing fancy. Al -- Albert Chu chu11 at llnl.gov 925-422-5311 Computer Scientist High Performance Systems Division Lawrence Livermore National Laboratory -------------- next part -------------- A non-text attachment was scrubbed... 
Name: 0001-initialize-netstarted-to-0.patch Type: text/x-patch Size: 613 bytes Desc: not available URL: From chu11 at llnl.gov Wed Aug 13 10:04:48 2008 From: chu11 at llnl.gov (Al Chu) Date: Wed, 13 Aug 2008 10:04:48 -0700 Subject: [ofa-general] [IBSIM][Trivial] document 'start' command Message-ID: <1218647088.16508.567.camel@cardanus.llnl.gov> Hey Sasha, I didn't know what 's' or 'S' did until I looked into the code. This documents it. Nothing fancy. Al -- Albert Chu chu11 at llnl.gov 925-422-5311 Computer Scientist High Performance Systems Division Lawrence Livermore National Laboratory -------------- next part -------------- A non-text attachment was scrubbed... Name: 0002-document-start-command.patch Type: text/x-patch Size: 809 bytes Desc: not available URL:
From ralph.campbell at qlogic.com Wed Aug 13 13:27:49 2008 From: ralph.campbell at qlogic.com (Ralph Campbell) Date: Wed, 13 Aug 2008 13:27:49 -0700 Subject: [ofa-general] [PATCH] IB/ipath - don't allow the QP path MTU to be set higher than port MTU In-Reply-To: References: <20080812171537.3699.62189.stgit@eng-46.mv.qlogic.com> Message-ID: <1218659269.620.458.camel@chromite.mv.qlogic.com> On Tue, 2008-08-12 at 11:40 -0700, Roland Dreier wrote: > > This patch fixes the problem by returning an error when attempting > > to set the QP path MTU greater than the port MTU. > > I don't believe this is correct according to the IB spec -- modify QP is > supposed to return an error only when the requested MTU is bigger than > the MTUcap of the port. Even if we do have this patch, it seems there > could still be trouble if an intermediate hop has a smaller MTU. > > Why is Open MPI trying to set a path MTU bigger than what the fabric > actually supports? > > - R. OK. I will withdraw this patch and make sure any needed changes are in OpenMPI. From chu11 at llnl.gov Wed Aug 13 14:07:06 2008 From: chu11 at llnl.gov (Al Chu) Date: Wed, 13 Aug 2008 14:07:06 -0700 Subject: [ofa-general] [IBSIM] Parse sim cmds by name not first character Message-ID: <1218661626.16508.579.camel@cardanus.llnl.gov> Hey Sasha, I was looking into adding a new command to ibsim, but since the original cmd-parsing function only checks for the first char of the inputted command, it limits the ability to add a reasonable-sounding new command name. The patch changes the function to check the entire command name.
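The idea of matching on the whole command word rather than its first character can be sketched as below. This is not the attached patch, whose contents are not shown in the archive; parse_cmd, the enum values and the command table are hypothetical. Only "start" is a command name mentioned in this thread; "status" and "set" are invented here to show names that would collide under first-character matching. Matching is case-insensitive on the assumption that both 's' and 'S' were accepted before.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <strings.h>	/* strcasecmp (POSIX) */

/* Hypothetical command identifiers for this sketch. */
enum sim_cmd { CMD_UNKNOWN, CMD_START, CMD_STATUS, CMD_SET };

static const struct {
	const char *name;
	enum sim_cmd id;
} cmd_table[] = {
	{ "start",  CMD_START  },	/* named in the thread */
	{ "status", CMD_STATUS },	/* hypothetical */
	{ "set",    CMD_SET    },	/* hypothetical */
};

/*
 * Return the command whose full name equals the first
 * whitespace-delimited word of the line, or CMD_UNKNOWN.
 */
static enum sim_cmd parse_cmd(const char *line)
{
	char word[32];
	size_t n = 0;
	size_t i;

	assert(line != NULL);

	/* Skip leading blanks. */
	while (*line == ' ' || *line == '\t')
		line++;

	/* Copy the first word, leaving room for the terminator. */
	while (line[n] != '\0' && !isspace((unsigned char)line[n]) &&
	       n < sizeof(word) - 1) {
		word[n] = line[n];
		n++;
	}
	word[n] = '\0';

	/* A word too long for the buffer cannot be a known command. */
	if (line[n] != '\0' && !isspace((unsigned char)line[n]))
		return CMD_UNKNOWN;

	for (i = 0; i < sizeof(cmd_table) / sizeof(cmd_table[0]); i++)
		if (strcasecmp(word, cmd_table[i].name) == 0)
			return cmd_table[i].id;

	return CMD_UNKNOWN;
}
```

With a table like this, adding a command is a one-line change, and three names that all begin with the same letter no longer conflict. Whether single-letter abbreviations should remain accepted is a separate design choice that the sketch does not make.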
Al -- Albert Chu chu11 at llnl.gov 925-422-5311 Computer Scientist High Performance Systems Division Lawrence Livermore National Laboratory -------------- next part -------------- A non-text attachment was scrubbed... Name: 0003-parse-sim-cmds-via-full-name.patch Type: text/x-patch Size: 3210 bytes Desc: not available URL:
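For readers following Al Chu's "[IBSIM] Parse sim cmds by name not first character" message, here is a minimal sketch of what matching a command by its whole name, rather than by its first character, could look like. This is not the ibsim code and not the scrubbed patch: the table layout, the `status` command, and every identifier below are hypothetical; only `start` is a command name actually mentioned on the list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Hypothetical command identifiers; not taken from ibsim. */
enum sim_cmd { CMD_UNKNOWN = 0, CMD_START, CMD_STATUS };

struct cmd_entry {
	const char *name;
	enum sim_cmd id;
};

/* Two names sharing a first letter, which first-character matching
 * could not tell apart. */
static const struct cmd_entry cmd_table[] = {
	{ "start",  CMD_START  },
	{ "status", CMD_STATUS },
};

/* Match the first whitespace-delimited token of `line` against the
 * full command names, ignoring case. Returns CMD_UNKNOWN if no name
 * matches exactly. */
static enum sim_cmd parse_cmd(const char *line)
{
	size_t i, len;

	assert(line != NULL);
	line += strspn(line, " \t");          /* skip leading blanks */
	len = strcspn(line, " \t\r\n");       /* length of first token */

	for (i = 0; i < sizeof(cmd_table) / sizeof(cmd_table[0]); i++) {
		if (strlen(cmd_table[i].name) == len &&
		    strncasecmp(line, cmd_table[i].name, len) == 0)
			return cmd_table[i].id;
	}
	return CMD_UNKNOWN;
}
```

Comparing the whole token keeps `start` and `status` distinct and lets abbreviations or unknown input fall through to `CMD_UNKNOWN`, where a caller could print usage instead of guessing.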
From jwong at datallegro.com Wed Aug 13 15:16:14 2008
From: jwong at datallegro.com (Jeffrey Wong)
Date: Wed, 13 Aug 2008 18:16:14 -0400
Subject: [ofa-general] ibv_poll_cq calling kill signal
Message-ID:

I am running OFED 1.2.5.1

I have an application that contains an ibv completion event handler calling the following:

struct ibv_cq *cq;
struct ibv_comp_channel *channel;
struct ibv_context *device_context;
struct ibv_wc wc;

channel = ibv_create_comp_channel(device_context);
ret = ibv_get_cq_event(channel, &cq, &cq_ctx);
ibv_ack_cq_events(cq, 1);
ret = ibv_poll_cq(cq, 1, &wc);

Periodically I get the following error which kills my application.

[Switching to thread 251 (process 22110)]
#0 0x00002b3645d2ee47 in kill () from /lib64/libc.so.6
(gdb) bt
#0 0x00002b3645d2ee47 in kill () from /lib64/libc.so.6
#1 0x00002b3644cd9076 in EXsignal () from
#2 0x00002b3644cd9225 in i_EXcatch ()
#3
#4 0x00002aaaed423040 in ?? ()
#5 0x00002b364412385b in ibv_poll_cq (cq=0x2aaaec014f20, num_entries=1, wc=0x43001000) at /usr/include/infiniband/verbs.h:883

Any suggestions.

Jeff
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ralph.campbell at qlogic.com Wed Aug 13 17:08:00 2008
From: ralph.campbell at qlogic.com (Ralph Campbell)
Date: Wed, 13 Aug 2008 17:08:00 -0700
Subject: [ofa-general] [PATCH 0/2] IB/ipath -- minor fixes for 2.6.27
Message-ID: <20080814000800.7874.6686.stgit@eng-46.mv.qlogic.com>

The following patches fix a couple of minor bugs.
IB/ipath - fix lost UD send work request
IB/ipath - Fixed incorrect check for max physical address in TID

From ralph.campbell at qlogic.com Wed Aug 13 17:08:06 2008
From: ralph.campbell at qlogic.com (Ralph Campbell)
Date: Wed, 13 Aug 2008 17:08:06 -0700
Subject: [ofa-general] [PATCH 1/2] IB/ipath - fix lost UD send work request
In-Reply-To: <20080814000800.7874.6686.stgit@eng-46.mv.qlogic.com>
References: <20080814000800.7874.6686.stgit@eng-46.mv.qlogic.com>
Message-ID: <20080814000805.7874.15471.stgit@eng-46.mv.qlogic.com>

If a UD QP has some work requests queued to be sent by the DMA engine followed by a local loopback work request, we have to wait for the previous work requests to finish or the completion for the local loopback work request would be generated out of order. The problem was that the work request queue pointer was already updated so that the request would not be processed when the DMA queue drained.

Signed-off-by: Ralph Campbell
---
 drivers/infiniband/hw/ipath/ipath_ud.c | 8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/ipath/ipath_ud.c b/drivers/infiniband/hw/ipath/ipath_ud.c
index 36aa242..729446f 100644
--- a/drivers/infiniband/hw/ipath/ipath_ud.c
+++ b/drivers/infiniband/hw/ipath/ipath_ud.c
@@ -267,6 +267,7 @@ int ipath_make_ud_req(struct ipath_qp *qp)
 	u16 lrh0;
 	u16 lid;
 	int ret = 0;
+	int next_cur;
 
 	spin_lock_irqsave(&qp->s_lock, flags);
 
@@ -290,8 +291,9 @@ int ipath_make_ud_req(struct ipath_qp *qp)
 		goto bail;
 
 	wqe = get_swqe_ptr(qp, qp->s_cur);
-	if (++qp->s_cur >= qp->s_size)
-		qp->s_cur = 0;
+	next_cur = qp->s_cur + 1;
+	if (next_cur >= qp->s_size)
+		next_cur = 0;
 
 	/* Construct the header. */
 	ah_attr = &to_iah(wqe->wr.wr.ud.ah)->attr;
@@ -315,6 +317,7 @@ int ipath_make_ud_req(struct ipath_qp *qp)
 				qp->s_flags |= IPATH_S_WAIT_DMA;
 				goto bail;
 			}
+			qp->s_cur = next_cur;
 			spin_unlock_irqrestore(&qp->s_lock, flags);
 			ipath_ud_loopback(qp, wqe);
 			spin_lock_irqsave(&qp->s_lock, flags);
@@ -323,6 +326,7 @@ int ipath_make_ud_req(struct ipath_qp *qp)
 		}
 	}
 
+	qp->s_cur = next_cur;
 	extra_bytes = -wqe->length & 3;
 	nwords = (wqe->length + extra_bytes) >> 2;
 

From ralph.campbell at qlogic.com Wed Aug 13 17:08:11 2008
From: ralph.campbell at qlogic.com (Ralph Campbell)
Date: Wed, 13 Aug 2008 17:08:11 -0700
Subject: [ofa-general] [PATCH 2/2] IB/ipath - Fixed incorrect check for max physical address in TID
In-Reply-To: <20080814000800.7874.6686.stgit@eng-46.mv.qlogic.com>
References: <20080814000800.7874.6686.stgit@eng-46.mv.qlogic.com>
Message-ID: <20080814000811.7874.52077.stgit@eng-46.mv.qlogic.com>

From: Dave Olson

The check for max physical address was incorrect, thus limiting the range of allowed physical addresses.
Signed-off-by: Dave Olson
---
 drivers/infiniband/hw/ipath/ipath_iba7220.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/hw/ipath/ipath_iba7220.c b/drivers/infiniband/hw/ipath/ipath_iba7220.c
index d90f5e9..9839e20 100644
--- a/drivers/infiniband/hw/ipath/ipath_iba7220.c
+++ b/drivers/infiniband/hw/ipath/ipath_iba7220.c
@@ -1720,7 +1720,7 @@ static void ipath_7220_put_tid(struct ipath_devdata *dd, u64 __iomem *tidptr,
 				      "not 2KB aligned!\n", pa);
 			return;
 		}
-		if (pa >= (1UL << IBA7220_TID_SZ_SHIFT)) {
+		if (chippa >= (1UL << IBA7220_TID_SZ_SHIFT)) {
 			ipath_dev_err(dd,
 				      "BUG: Physical page address 0x%lx "
 				      "larger than supported\n", pa);

From alekseys at voltaire.com Thu Aug 14 02:00:22 2008
From: alekseys at voltaire.com (Aleksey Senin)
Date: Thu, 14 Aug 2008 12:00:22 +0300
Subject: [ofa-general] RFC PATCHv2 IPv6 support in rping
Message-ID: <1218704422.19372.9.camel@linux-zn6t.site>

Differences from version 1:
Using new code from Sean
Less complicated logic due to using getaddrinfo function
Using sockaddr_storage in rping_cb instead of sockaddr_in in order to support IPv6
Return code of get_addr is return code of getaddrinfo function
Removed label out from get_addr function
Extended help to explain how to bind ANY address when using IPv6

Signed-off-by: Aleksey Senin
---
 examples/rping.c | 29 ++++++++++++++++-------------
 1 files changed, 16 insertions(+), 13 deletions(-)

diff --git a/examples/rping.c b/examples/rping.c
index f5dd701..17782b2 100644
--- a/examples/rping.c
+++ b/examples/rping.c
@@ -143,7 +143,7 @@ struct rping_cb {
 	enum test_state state;		/* used for cond/signalling */
 	sem_t sem;
 
-	struct sockaddr_in sin;
+	struct sockaddr_storage sin;
 	uint16_t port;			/* dst port in NBO */
 	int verbose;			/* verbose logging */
 	int count;			/* ping count */
@@ -728,8 +728,11 @@ static int rping_test_server(struct rping_cb *cb)
 static int rping_bind_server(struct rping_cb *cb)
 {
 	int ret;
+	if (cb->sin.ss_family == AF_INET)
+		((struct sockaddr_in *)&cb->sin)->sin_port = cb->port;
+	else
+		((struct sockaddr_in6 *)&cb->sin)->sin6_port = cb->port;
 
-	cb->sin.sin_port = cb->port;
 	ret = rdma_bind_addr(cb->cm_id, (struct sockaddr *) &cb->sin);
 	if (ret) {
 		fprintf(stderr, "rdma_bind_addr error %d\n", ret);
@@ -991,8 +994,11 @@ static int rping_connect_client(struct rping_cb *cb)
 static int rping_bind_client(struct rping_cb *cb)
 {
 	int ret;
+	if (cb->sin.ss_family == AF_INET )
+		((struct sockaddr_in *)&cb->sin)->sin_port = cb->port;
+	else
+		((struct sockaddr_in6 *)&cb->sin)->sin6_port = cb->port;
 
-	cb->sin.sin_port = cb->port;
 	ret = rdma_resolve_addr(cb->cm_id, NULL, (struct sockaddr *) &cb->sin, 2000);
 	if (ret) {
 		fprintf(stderr, "rdma_resolve_addr error %d\n", ret);
@@ -1055,7 +1061,7 @@ err1:
 	return ret;
 }
 
-static int get_addr(char *dst, struct sockaddr_in *addr)
+static int get_addr(char *dst, struct sockaddr *addr)
 {
 	struct addrinfo *res;
 	int ret;
@@ -1066,13 +1072,10 @@ static int get_addr(char *dst, struct sockaddr_in *addr)
 		return ret;
 	}
 
-	if (res->ai_family != PF_INET) {
-		ret = -1;
-		goto out;
+	if (res->ai_family == PF_INET || res->ai_family == PF_INET6 ) {
+		*(struct sockaddr_storage *)addr =
+			*(struct sockaddr_storage *)res->ai_addr;
 	}
-
-	*addr = *(struct sockaddr_in *) res->ai_addr;
-out:
 	freeaddrinfo(res);
 	return ret;
 }
@@ -1084,7 +1087,7 @@ static void usage(char *name)
 	printf("%s -c [-vVd] [-S size] [-C count] -a addr [-p port]\n",
 	       basename(name));
 	printf("\t-c\t\tclient side\n");
-	printf("\t-s\t\tserver side\n");
+	printf("\t-s\t\tserver side. To bind any address with IPv6 use -a ::0 argument\n");
 	printf("\t-v\t\tdisplay ping data to stdout\n");
 	printf("\t-V\t\tvalidate ping data\n");
 	printf("\t-d\t\tdebug printfs\n");
@@ -1110,7 +1113,7 @@ int main(int argc, char *argv[])
 	cb->server = -1;
 	cb->state = IDLE;
 	cb->size = 64;
-	cb->sin.sin_family = PF_INET;
+	cb->sin.ss_family = PF_INET;
 	cb->port = htons(7174);
 	sem_init(&cb->sem, 0, 0);
 
@@ -1118,7 +1121,7 @@ int main(int argc, char *argv[])
 	while ((op=getopt(argc, argv, "a:Pp:C:S:t:scvVd")) != -1) {
 		switch (op) {
 		case 'a':
-			ret = get_addr(optarg, &cb->sin);
+			ret = get_addr(optarg, (struct sockaddr *) &cb->sin);
 			break;
 		case 'P':
 			persistent_server = 1;
-- 
1.5.6.dirty

From vlad at lists.openfabrics.org Thu Aug 14 02:52:43 2008
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Thu, 14 Aug 2008 02:52:43 -0700 (PDT)
Subject: [ofa-general] ofa_1_4_kernel 20080814-0200 daily build status
Message-ID: <20080814095243.D24D8E60E27@openfabrics.org>

This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_4/linux-2.6.git git_branch: ofed_kernel Common build parameters: Passed: Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.22 Passed on x86_64 with linux-2.6.16 Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.17 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.22.5-31-default Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on ia64 with linux-2.6.16 Passed on ia64 with linux-2.6.17 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.18 Passed on ia64 with
linux-2.6.21.1 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.26 Passed on ia64 with linux-2.6.25 Passed on ia64 with linux-2.6.24 Passed on ppc64 with linux-2.6.16 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.17 Passed on ppc64 with linux-2.6.24 Passed on ppc64 with linux-2.6.19 Failed: Build failed on x86_64 with linux-2.6.16.21-0.8-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.21-0.8-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.21-0.8-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.21-0.8-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.16.43-0.3-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.43-0.3-smp_x86_64_check/include/scsi/iscsi_proto.h:160: 
error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.43-0.3-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.43-0.3-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.16.60-0.21-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.60-0.21-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.60-0.21-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.60-0.21-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build 
failed on x86_64 with linux-2.6.18-8.el5 Log: from /home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-8.el5_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-8.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-8.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-53.el5 Log: from /home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-53.el5_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-53.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-53.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-93.el5 Log: from /home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-93.el5_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-93.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-93.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-55.ELsmp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: include/linux/skbuff.h: In function 'skb_add_data': include/linux/skbuff.h:1041: warning: pointer targets in passing argument 2 of 'csum_partial_copy_from_user' differ in signedness make[4]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.9-55.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-55.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-42.ELsmp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: include/linux/skbuff.h: In function 'skb_add_data': include/linux/skbuff.h:1013: warning: pointer targets in passing argument 2 of 'csum_partial_copy_from_user' differ in signedness make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.9-42.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-42.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.16.21-0.8-default Log: from /home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from 
/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.21-0.8-default_ia64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.16.21-0.8-default_ia64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.16.21-0.8-default' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.18-8.el5 Log: from /home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-8.el5_ppc64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080814-0200_linux-2.6.18-8.el5_ppc64_check] Error 2 
make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.18-8.el5'
make: *** [kernel] Error 2
----------------------------------------------------------------------------------

From ogerlitz at voltaire.com Thu Aug 14 04:31:38 2008
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Thu, 14 Aug 2008 14:31:38 +0300
Subject: [ofa-general] [PATCH v2] ipiob: fix rtnl deadlock
In-Reply-To:
References: <4899CF0A.1060509@Voltaire.COM> <32cb786f0808081155o19f8fb9dm217cd6996dffa3e5@mail.gmail.com> <32cb786f0808090538j272842b1r5117547cccde0d06@mail.gmail.com>
Message-ID: <48A4179A.8030206@voltaire.com>

Roland Dreier wrote:
> > > I don't think moving to a different workqueue helps, does it? Because
> > > we just have to flush *that* workqueue somewhere too.
>
> > Yes, but it won't have to be from ipoib_stop, it can be from a place
> > where rtnl_lock is not held.
>
> That's kind of the direction I've been looking, except I don't think we
> need to invent a new workqueue to do this. It seems that ipoib_stop is
> the wrong place to flush our workqueue in general.

I get this with 2.6.27-rc3 which I assume is what this thread talks about... can you fix that guys... rebooting after each time the device goes down is hard to work with. I'll use the current version of the patch for now, but would be happy to see something goes upstream

Or.

> ipoib D ffffffff805ea3c0 0 1905 2
> ffff880037c1ddc0 0000000000000046 0000000000000000 ffff88007e4b9230
> ffff88007f842f80 0000000100000003 ffff88007e827910 ffff88007e8276c0
> 0000000000000001 0000000000000000 0000000000000000 00000000000000ff
> Call Trace:
> [] __mutex_lock_slowpath+0x69/0xa6
> [] mutex_lock+0x24/0x28
> [] mthca_query_port+0x1c4/0x1d6 [ib_mthca]
> [] mthca_query_port+0x1c4/0x1d6 [ib_mthca]
> [] ipoib_mcast_join_task+0x244/0x290 [ib_ipoib]
> [] ipoib_mcast_join_task+0x0/0x290 [ib_ipoib]
> [] run_workqueue+0x8f/0x114
> [] worker_thread+0x0/0xec
> [] worker_thread+0xe2/0xec
> [] autoremove_wake_function+0x0/0x2e
> [] autoremove_wake_function+0x0/0x2e
> [] kthread+0x3d/0x63
> [] child_rip+0xa/0x11
> [] kthread+0x0/0x63
> [] child_rip+0x0/0x11
>
>
> ifconfig D 0000000000000002 0 19218 5470
> ffff880058d4dc18 0000000000000082 0000000000000000 ffffffff80555a40
> ffffffff80550020 0000000000000001 ffff88007a30a000 ffff88007a309db0
> 0000000058d4dbe8 0000000000000000 00000000ffffffff 00000000000000ff
> Call Trace:
> [] schedule_timeout+0x1e/0xad
> [] schedule_timeout+0x1e/0xad
> [] wait_for_common+0xfb/0x178
> [] default_wake_function+0x0/0xe
> [] default_wake_function+0x0/0xe
> [] flush_cpu_workqueue+0x62/0x6b
> [] wq_barrier_func+0x0/0x9
> [] flush_workqueue+0x38/0x4e
> [] ipoib_stop+0x75/0x10c [ib_ipoib]
> [] dev_close+0x6f/0x87
> [] dev_change_flags+0xa3/0x15b
> [] devinet_ioctl+0x293/0x5d3
> [] inet_ioctl+0x8f/0xa7
> [] sock_ioctl+0x0/0x1f6
> [] sock_ioctl+0x1d2/0x1f6
> [] vfs_ioctl+0x29/0x6f
> [] do_vfs_ioctl+0x256/0x265
> [] sys_ioctl+0x51/0x74
> [] system_call_fastpath+0x16/0x1b
>

From sean.hefty at intel.com Thu Aug 14 08:39:16 2008
From: sean.hefty at intel.com (Hefty, Sean)
Date: Thu, 14 Aug 2008 08:39:16 -0700
Subject: [ofa-general] RE: RFC PATCHv2 IPv6 support in rping
In-Reply-To: <1218704422.19372.9.camel@linux-zn6t.site>
References: <1218704422.19372.9.camel@linux-zn6t.site>
Message-ID:

thanks - I'll add this to my git tree

From sean.hefty at intel.com Thu Aug 14 09:43:05 2008
From: sean.hefty at intel.com (Hefty, Sean)
Date: Thu, 14 Aug 2008 09:43:05 -0700
Subject: [ofa-general] RE: RFC PATCHv2 IPv6 support in rping
In-Reply-To: <1218704422.19372.9.camel@linux-zn6t.site>
References: <1218704422.19372.9.camel@linux-zn6t.site>
Message-ID:

>+	if (res->ai_family == PF_INET || res->ai_family == PF_INET6 ) {
>+		*(struct sockaddr_storage *)addr =
>+			*(struct sockaddr_storage *)res->ai_addr;

I changed this part around to this:

	if (res->ai_family == PF_INET)
		memcpy(addr, res->ai_addr, sizeof(struct sockaddr_in));
	else if (res->ai_family == PF_INET6)
		memcpy(addr, res->ai_addr, sizeof(struct sockaddr_in6));
	else
		ret = -1;

I don't think res->ai_addr always points to memory the size of sockaddr_storage.

- Sean

From aj.guillon at gmail.com Thu Aug 14 14:22:45 2008
From: aj.guillon at gmail.com (Adrien Guillon)
Date: Thu, 14 Aug 2008 17:22:45 -0400
Subject: [ofa-general] Are Memory Locations Accessed via RDMA Cached?
Message-ID: <9870a2060808141422h223404abvc18e37c3f83be466@mail.gmail.com>

Does RDMA have support for memory caching? So if one node accesses the memory of another, and the contents haven't changed, does it have to go out onto the network to get all the data again?

Thanks,
AJ
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From aj.guillon at gmail.com Thu Aug 14 14:24:44 2008
From: aj.guillon at gmail.com (Adrien Guillon)
Date: Thu, 14 Aug 2008 17:24:44 -0400
Subject: [ofa-general] ***SPAM*** Accessing RDMA Memory Locally
Message-ID: <9870a2060808141424s3f2da738y372741dae79187e1@mail.gmail.com>

If I allocate memory to be accessible by others using RDMA operations, my understanding is that I use RDMA operations myself to access that memory locally. Is that correct? Can I access that memory directly with pointers in the case of a node accessing its own memory location? Are local access RDMA operations efficient?
Thanks,
AJ
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rdreier at cisco.com Thu Aug 14 15:44:46 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Thu, 14 Aug 2008 15:44:46 -0700
Subject: [ofa-general] Are Memory Locations Accessed via RDMA Cached?
In-Reply-To: <9870a2060808141422h223404abvc18e37c3f83be466@mail.gmail.com> (Adrien Guillon's message of "Thu, 14 Aug 2008 17:22:45 -0400")
References: <9870a2060808141422h223404abvc18e37c3f83be466@mail.gmail.com>
Message-ID:

> Does RDMA have support for memory caching? So if one node accesses the
> memory of another, and the contents haven't changed, does it have to go out
> onto the network to get all the data again?

There is no cache coherence protocol, and all RDMA operations are explicitly requested by the application -- so, in short, if you request an RDMA operation it will always be performed, even if data has not changed.

- R.

From arlin.r.davis at intel.com Thu Aug 14 16:19:16 2008
From: arlin.r.davis at intel.com (Arlin Davis)
Date: Thu, 14 Aug 2008 16:19:16 -0700
Subject: [ofa-general] [PATCH 1/6][uDAPL v1] dapl scm: change connect and accept to non-blocking to avoid blocking user thread.
Message-ID: <000001c8fe64$2e302130$8963fe0a@amr.corp.intel.com>

Patch set for uDAPL v1 that includes socket cm provider improvements. Similar patch set coming for uDAPL v2.

The connect socket that is used to exchange QP information is now non-blocking and the data exchange is done via the cr thread. New state RTU_PENDING added. On the passive side there is a new state ACCEPT_DATA used to avoid read blocking on the user accept call.
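
The active-side flow described above (make the socket non-blocking, start the connect, let a worker thread finish the exchange) follows a common POSIX pattern. Below is a minimal sketch of that pattern, assuming plain BSD sockets. The helper names `set_nonblocking`, `start_connect`, and `finish_connect` are hypothetical and are not functions from the uDAPL source.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Put fd into non-blocking mode. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Start a connect on a non-blocking socket without waiting for it.
 * Returns 1 if the connection completed immediately, 0 if it is still
 * in progress (errno == EINPROGRESS), -1 on any other error. */
int start_connect(int fd, const struct sockaddr *addr, socklen_t len)
{
	assert(addr != NULL);
	if (connect(fd, addr, len) == 0)
		return 1;
	return errno == EINPROGRESS ? 0 : -1;
}

/* Meant to run on a worker thread: wait up to timeout_ms for the
 * pending connect to resolve, then read its outcome via SO_ERROR.
 * Returns 0 if the socket is connected, otherwise an errno value. */
int finish_connect(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLOUT };
	int err = 0;
	socklen_t len = sizeof(err);
	int n = poll(&pfd, 1, timeout_ms);

	if (n == 0)
		return ETIMEDOUT;
	if (n < 0)
		return errno;
	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
		return errno;
	return err;
}
```

A caller would typically treat a 0 return from `start_connect` as "pending" and hand the descriptor to whichever thread already polls for connection events, so the user thread never blocks in `connect()`.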
Signed-off by: Arlin Davis ardavis at ichips.intel.com
---
 dapl/openib_scm/dapl_ib_cm.c   | 214 +++++++++++++++++++++++++++++-----------
 dapl/openib_scm/dapl_ib_util.h | 2 +
 2 files changed, 156 insertions(+), 60 deletions(-)

diff --git a/dapl/openib_scm/dapl_ib_cm.c b/dapl/openib_scm/dapl_ib_cm.c
index f78ebe6..03b0f12 100644
--- a/dapl/openib_scm/dapl_ib_cm.c
+++ b/dapl/openib_scm/dapl_ib_cm.c
@@ -197,6 +197,63 @@ dapli_socket_disconnect(ib_cm_handle_t cm_ptr)
 	return DAT_SUCCESS;
 }
 
+/*
+ * ACTIVE: socket connected, send QP information to peer
+ */
+void
+dapli_socket_connected(ib_cm_handle_t cm_ptr, int err)
+{
+	int len, opt = 1;
+	struct iovec iovec[2];
+	struct dapl_ep *ep_ptr = cm_ptr->ep;
+
+	if (err) {
+		dapl_log(DAPL_DBG_TYPE_ERR, " connect: socket ERR %s\n",
+			 strerror(err));
+		goto bail;
+	}
+	dapl_dbg_log(DAPL_DBG_TYPE_EP,
+		     " socket connected, write QP and private data\n");
+
+	/* no delay for small packets */
+	setsockopt(cm_ptr->socket,IPPROTO_TCP,TCP_NODELAY,&opt,sizeof(opt));
+
+	/* send qp info and pdata to remote peer */
+	iovec[0].iov_base = &cm_ptr->dst;
+	iovec[0].iov_len = sizeof(ib_qp_cm_t);
+	if (cm_ptr->dst.p_size) {
+		iovec[1].iov_base = cm_ptr->p_data;
+		iovec[1].iov_len = ntohl(cm_ptr->dst.p_size);
+	}
+
+	len = writev(cm_ptr->socket, iovec, (cm_ptr->dst.p_size ? 2:1));
+	if (len != (ntohl(cm_ptr->dst.p_size) + sizeof(ib_qp_cm_t))) {
+		dapl_log(DAPL_DBG_TYPE_ERR,
+			 " connect write: ERR %s, wcnt=%d\n",
+			 strerror(errno), len);
+		goto bail;
+	}
+	dapl_dbg_log(DAPL_DBG_TYPE_CM,
+		     " connected: sending SRC port=0x%x lid=0x%x,"
+		     " qpn=0x%x, psize=%d\n",
+		     ntohs(cm_ptr->dst.port), ntohs(cm_ptr->dst.lid),
+		     ntohl(cm_ptr->dst.qpn), ntohl(cm_ptr->dst.p_size));
+	dapl_dbg_log(DAPL_DBG_TYPE_CM,
+		     " connected: sending SRC GID subnet %016llx id %016llx\n",
+		     (unsigned long long)
+		     cpu_to_be64(cm_ptr->dst.gid.global.subnet_prefix),
+		     (unsigned long long)
+		     cpu_to_be64(cm_ptr->dst.gid.global.interface_id));
+
+	/* queue up to work thread to avoid blocking consumer */
+	cm_ptr->state = SCM_RTU_PENDING;
+	return;
+bail:
+	/* close socket, free cm structure and post error event */
+	dapli_cm_destroy(cm_ptr);
+	dapl_evd_connection_callback(NULL, IB_CME_LOCAL_FAILURE, NULL, ep_ptr);
+}
+
 /*
  * ACTIVE: Create socket, connect, defer exchange QP information to CR thread
@@ -210,8 +267,7 @@ dapli_socket_connect(DAPL_EP *ep_ptr,
 		     DAT_PVOID p_data)
 {
 	ib_cm_handle_t cm_ptr;
-	int len, opt = 1;
-	struct iovec iovec[2];
+	int ret;
 	DAPL_IA *ia_ptr = ep_ptr->header.owner_ia;
 
 	dapl_dbg_log(DAPL_DBG_TYPE_EP, " connect: r_qual %d p_size=%d\n",
@@ -227,19 +283,28 @@ dapli_socket_connect(DAPL_EP *ep_ptr,
 		return DAT_INSUFFICIENT_RESOURCES;
 	}
 
-	((struct sockaddr_in*)r_addr)->sin_port = htons(r_qual);
+	/* non-blocking */
+	ret = fcntl(cm_ptr->socket, F_GETFL);
+	if (ret < 0 || fcntl(cm_ptr->socket,
+			     F_SETFL, ret | O_NONBLOCK) < 0) {
+		dapl_log(DAPL_DBG_TYPE_ERR,
+			 " connect: fcntl on socket %d ERR %d %s\n",
+			 cm_ptr->socket, ret,
+			 strerror(errno));
+		goto bail;
+	}
 
-	if (connect(cm_ptr->socket, r_addr, sizeof(*r_addr)) < 0) {
-		dapl_dbg_log(DAPL_DBG_TYPE_ERR,
-			     " connect: %s on r_qual %d\n",
-			     strerror(errno), (unsigned int)r_qual);
+	((struct sockaddr_in*)r_addr)->sin_port = htons(r_qual);
+	ret = connect(cm_ptr->socket, r_addr, sizeof(*r_addr));
+	if (ret && errno != EINPROGRESS) {
+		dapl_log(DAPL_DBG_TYPE_ERR,
+			 " connect ERROR: %s on %s r_qual %d\n",
+			 strerror(errno),
+			 inet_ntoa(((struct sockaddr_in *)r_addr)->sin_addr),
+			 (unsigned int)r_qual);
 		dapli_cm_destroy(cm_ptr);
 		return DAT_INVALID_ADDRESS;
-	}
-	setsockopt(cm_ptr->socket,IPPROTO_TCP,TCP_NODELAY,&opt,sizeof(opt));
-
-	dapl_dbg_log(DAPL_DBG_TYPE_EP, " socket connected!\n");
-
+	}
 
 	/* Send QP info, IA address, and private data */
 	cm_ptr->dst.qpn = htonl(ep_ptr->qp_handle->qp_num);
@@ -257,41 +322,36 @@ dapli_socket_connect(DAPL_EP *ep_ptr,
 			  &cm_ptr->dst.gid))
 		goto bail;
 
+	/* save references */
+	cm_ptr->hca = ia_ptr->hca_ptr;
+	cm_ptr->ep = ep_ptr;
 	cm_ptr->dst.ia_address = ia_ptr->hca_ptr->hca_address;
-	cm_ptr->dst.p_size = htonl(p_size);
-	iovec[0].iov_base = &cm_ptr->dst;
-	iovec[0].iov_len = sizeof(ib_qp_cm_t);
 	if (p_size) {
-		iovec[1].iov_base = p_data;
-		iovec[1].iov_len = p_size;
+		cm_ptr->dst.p_size = htonl(p_size);
+		dapl_os_memcpy(cm_ptr->p_data, p_data, p_size);
 	}
 
-	dapl_dbg_log(DAPL_DBG_TYPE_EP," socket connected, write QP and private data\n");
-	len = writev(cm_ptr->socket, iovec, (p_size ?
2:1)); - if (len != (p_size + sizeof(ib_qp_cm_t))) { - dapl_dbg_log(DAPL_DBG_TYPE_ERR, - " connect write: ERR %s, wcnt=%d\n", - strerror(errno), len); - goto bail; - } - dapl_dbg_log(DAPL_DBG_TYPE_CM, - " connect: SRC port=0x%x lid=0x%x, qpn=0x%x, psize=%d\n", - ntohs(cm_ptr->dst.port), ntohs(cm_ptr->dst.lid), - ntohl(cm_ptr->dst.qpn), ntohl(cm_ptr->dst.p_size)); - dapl_dbg_log(DAPL_DBG_TYPE_CM, - " connect SRC GID subnet %016llx id %016llx\n", - (unsigned long long) - cpu_to_be64(cm_ptr->dst.gid.global.subnet_prefix), - (unsigned long long) - cpu_to_be64(cm_ptr->dst.gid.global.interface_id)); - - /* queue up to work thread to avoid blocking consumer */ - cm_ptr->state = SCM_CONN_PENDING; - cm_ptr->hca = ia_ptr->hca_ptr; - cm_ptr->ep = ep_ptr; + /* connected or pending, either way results via async event */ + if (ret == 0) + dapli_socket_connected(cm_ptr,0); + else + cm_ptr->state = SCM_CONN_PENDING; + + dapl_dbg_log(DAPL_DBG_TYPE_EP, + " connect: socket %d to %s r_qual %d pending\n", + cm_ptr->socket, + inet_ntoa(((struct sockaddr_in *)r_addr)->sin_addr), + (unsigned int)r_qual); + dapli_cm_queue(cm_ptr); return DAT_SUCCESS; bail: + dapl_log(DAPL_DBG_TYPE_ERR, + " connect ERROR: %s query lid(0x%x)/gid on %s r_qual %d\n", + strerror(errno),ntohs(cm_ptr->dst.lid), + inet_ntoa(((struct sockaddr_in *)r_addr)->sin_addr), + (unsigned int)r_qual); + /* close socket, free cm structure */ dapli_cm_destroy(cm_ptr); return DAT_INTERNAL_ERROR; @@ -470,25 +530,22 @@ bail: return dat_status; } - /* - * PASSIVE: accept socket, receive peer QP information, private data, post cr_event + * PASSIVE: accept socket */ -DAT_RETURN +void dapli_socket_accept(ib_cm_srvc_handle_t cm_ptr) { ib_cm_handle_t acm_ptr; - void *p_data = NULL; int len; - DAT_RETURN dat_status = DAT_SUCCESS; dapl_dbg_log(DAPL_DBG_TYPE_EP," socket_accept\n"); /* Allocate accept CM and initialize */ if ((acm_ptr = dapl_os_alloc(sizeof(*acm_ptr))) == NULL) - return DAT_INSUFFICIENT_RESOURCES; + goto bail; - (void) 
dapl_os_memzero( acm_ptr, sizeof( *acm_ptr ) ); + (void) dapl_os_memzero(acm_ptr, sizeof(*acm_ptr)); acm_ptr->socket = -1; acm_ptr->sp = cm_ptr->sp; @@ -498,15 +555,34 @@ dapli_socket_accept(ib_cm_srvc_handle_t cm_ptr) acm_ptr->socket = accept(cm_ptr->socket, (struct sockaddr*)&acm_ptr->dst.ia_address, (socklen_t*)&len); - if (acm_ptr->socket < 0) { dapl_dbg_log(DAPL_DBG_TYPE_ERR, " accept: ERR %s on FD %d l_cr %p\n", strerror(errno),cm_ptr->socket,cm_ptr); - dat_status = DAT_INTERNAL_ERROR; goto bail; } + dapl_dbg_log(DAPL_DBG_TYPE_EP, + " socket accepted, queue new cm %p\n",acm_ptr); + + acm_ptr->state = SCM_ACCEPTING; + dapli_cm_queue(acm_ptr); + return; +bail: + /* close socket, free cm structure, active will see socket close as reject */ + if (acm_ptr) + dapli_cm_destroy(acm_ptr); +} + +/* + * PASSIVE: receive peer QP information, private data, post cr_event + */ +void +dapli_socket_accept_data(ib_cm_srvc_handle_t acm_ptr) +{ + int len; + void *p_data = NULL; + dapl_dbg_log(DAPL_DBG_TYPE_EP," socket accepted, read QP data\n"); /* read in DST QP info, IA address. 
check for private data */ @@ -516,7 +592,6 @@ dapli_socket_accept(ib_cm_srvc_handle_t cm_ptr) dapl_dbg_log(DAPL_DBG_TYPE_ERR, " accept read: ERR %s, rcnt=%d, ver=%d\n", strerror(errno), len, acm_ptr->dst.ver); - dat_status = DAT_INTERNAL_ERROR; goto bail; } @@ -537,7 +612,6 @@ dapli_socket_accept(ib_cm_srvc_handle_t cm_ptr) dapl_dbg_log(DAPL_DBG_TYPE_ERR, " accept read: psize (%d) wrong\n", acm_ptr->dst.p_size); - dat_status = DAT_INTERNAL_ERROR; goto bail; } @@ -551,24 +625,24 @@ dapli_socket_accept(ib_cm_srvc_handle_t cm_ptr) dapl_dbg_log(DAPL_DBG_TYPE_ERR, " accept read pdata: ERR %s, rcnt=%d\n", strerror(errno), len); - dat_status = DAT_INTERNAL_ERROR; goto bail; } dapl_dbg_log(DAPL_DBG_TYPE_EP," accept: psize=%d read\n",len); p_data = acm_ptr->p_data; } - acm_ptr->state = SCM_ACCEPTING; + acm_ptr->state = SCM_ACCEPTING_DATA; /* trigger CR event and return SUCCESS */ dapls_cr_callback(acm_ptr, IB_CME_CONNECTION_REQUEST_PENDING, p_data, acm_ptr->sp ); - return DAT_SUCCESS; + return; bail: + /* close socket, free cm structure, active will see socket close as reject */ dapli_cm_destroy(acm_ptr); - return DAT_INTERNAL_ERROR; + return; } /* @@ -669,7 +743,6 @@ dapli_socket_accept_usr(DAPL_EP *ep_ptr, sizeof(cm_ptr->dst.ia_address)); dapl_dbg_log( DAPL_DBG_TYPE_EP," PASSIVE: accepted!\n" ); - dapli_cm_queue(cm_ptr); return DAT_SUCCESS; bail: dapl_dbg_log(DAPL_DBG_TYPE_ERR," accept_rtu: ERR !QP_RTR_RTS \n"); @@ -1202,7 +1275,8 @@ void cr_thread(void *arg) { struct dapl_hca *hca_ptr = arg; ib_cm_handle_t cr, next_cr; - int ret,idx; + int opt,ret,idx; + socklen_t opt_len; char rbuf[2]; struct pollfd ufds[SCM_MAX_CONN]; @@ -1242,14 +1316,19 @@ void cr_thread(void *arg) /* Add to ufds for poll, check for immediate work */ ufds[++idx].fd = cr->socket; /* add listen or cr */ - ufds[idx].events = POLLIN; + if (cr->state == SCM_CONN_PENDING) + ufds[idx].events = POLLOUT; + else + ufds[idx].events = POLLIN; /* check socket for event, accept in or connect out */ 
 			dapl_dbg_log(DAPL_DBG_TYPE_CM," poll cr=%p, fd=%d,%d\n",
 				cr, cr->socket, ufds[idx].fd);
 			dapl_os_unlock(&hca_ptr->ib_trans.lock);
 			ret = poll(&ufds[idx],1,1);
-			dapl_dbg_log(DAPL_DBG_TYPE_CM," poll wakeup ret=%d cr->st=%d ev=%d fd=%d\n",
+			dapl_dbg_log(DAPL_DBG_TYPE_CM,
+				" poll wakeup ret=%d cr->st=%d"
+				" ev=0x%x fd=%d\n",
 				ret,cr->state,ufds[idx].revents,ufds[idx].fd);

 			/* data on listen, qp exchange, and on disconnect request */
@@ -1257,13 +1336,28 @@ void cr_thread(void *arg)
 				if (cr->socket > 0) {
 					if (cr->state == SCM_LISTEN)
 						dapli_socket_accept(cr);
+					else if (cr->state == SCM_ACCEPTING)
+						dapli_socket_accept_data(cr);
 					else if (cr->state == SCM_ACCEPTED)
 						dapli_socket_accept_rtu(cr);
-					else if (cr->state == SCM_CONN_PENDING)
+					else if (cr->state == SCM_RTU_PENDING)
 						dapli_socket_connect_rtu(cr);
 					else if (cr->state == SCM_CONNECTED)
 						dapli_socket_disconnect(cr);
 				}
+			/* connect socket is writable, check status */
+			} else if ((ret == 1) &&
+				(ufds[idx].revents & POLLOUT ||
+				 ufds[idx].revents & POLLERR)) {
+				if (cr->state == SCM_CONN_PENDING) {
+					opt = 0; opt_len = sizeof(opt); /* getsockopt length is value-result */
+					ret = getsockopt(cr->socket, SOL_SOCKET,
+							 SO_ERROR, &opt, &opt_len);
+					if (!ret)
+						dapli_socket_connected(cr,opt);
+					else
+						dapli_socket_connected(cr,EFAULT);
+				}
 			} else if (ret != 0) {
 				dapl_dbg_log(DAPL_DBG_TYPE_CM,
 					" cr_thread(cr=%p) st=%d poll ERR= %s\n",
diff --git a/dapl/openib_scm/dapl_ib_util.h b/dapl/openib_scm/dapl_ib_util.h
index 37c5dbb..6d7568c 100644
--- a/dapl/openib_scm/dapl_ib_util.h
+++ b/dapl/openib_scm/dapl_ib_util.h
@@ -102,7 +102,9 @@ typedef enum scm_state
 	SCM_INIT,
 	SCM_LISTEN,
 	SCM_CONN_PENDING,
+	SCM_RTU_PENDING,
 	SCM_ACCEPTING,
+	SCM_ACCEPTING_DATA,
 	SCM_ACCEPTED,
 	SCM_REJECTED,
 	SCM_CONNECTED,
--
1.5.2.5

From arlin.r.davis at intel.com Thu Aug 14 16:19:28 2008
From: arlin.r.davis at intel.com (Arlin Davis)
Date: Thu, 14 Aug 2008 16:19:28 -0700
Subject: [ofa-general] [PATCH 2/6][uDAPL v1] dapl scm: add mtu adjustments via environment, default = 1024.
Message-ID: <000101c8fe64$34d8b060$8963fe0a@amr.corp.intel.com>

DAPL_IB_MTU adjusts path mtu setting for RC qp's. Default setting is min of 1024 and active mtu on IB device.

Signed-off by: Arlin Davis ardavis at ichips.intel.com
---
 dapl/openib_scm/dapl_ib_util.c |   22 +++++++++++++++++++---
 dapl/openib_scm/dapl_ib_util.h |    1 +
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/dapl/openib_scm/dapl_ib_util.c b/dapl/openib_scm/dapl_ib_util.c
index 96116c8..fdbd17f 100644
--- a/dapl/openib_scm/dapl_ib_util.c
+++ b/dapl/openib_scm/dapl_ib_util.c
@@ -64,6 +64,18 @@ static const char rcsid[] = "$Id: $";
 int g_dapl_loopback_connection = 0;
 int g_scm_pipe[2];

+enum ibv_mtu dapl_ib_mtu(int mtu)
+{
+	switch (mtu) {
+	case 256: return IBV_MTU_256;
+	case 512: return IBV_MTU_512;
+	case 1024: return IBV_MTU_1024;
+	case 2048: return IBV_MTU_2048;
+	case 4096: return IBV_MTU_4096;
+	default: return IBV_MTU_1024;
+	}
+}
+
 /* just get IP address for hostname */
 DAT_RETURN getipaddr( char *addr, int addr_len)
 {
@@ -223,6 +235,8 @@ found:
 		dapl_os_get_env_val("DAPL_HOP_LIMIT", SCM_HOP_LIMIT);
 	hca_ptr->ib_trans.tclass =
 		dapl_os_get_env_val("DAPL_TCLASS", SCM_TCLASS);
+	hca_ptr->ib_trans.mtu =
+		dapl_ib_mtu(dapl_os_get_env_val("DAPL_IB_MTU", SCM_IB_MTU));

 	/* initialize cq_lock */
 	dat_status = dapl_os_lock_init(&hca_ptr->ib_trans.cq_lock);
@@ -446,13 +460,15 @@ DAT_RETURN dapls_ib_query_hca (
 		ia_attr->vendor_attr = NULL;
 		hca_ptr->ib_trans.ack_timer = DAPL_MAX(dev_attr.local_ca_ack_delay,
 			hca_ptr->ib_trans.ack_timer);
-		hca_ptr->ib_trans.mtu = port_attr.active_mtu;
+		hca_ptr->ib_trans.mtu = DAPL_MIN(port_attr.active_mtu,
+			hca_ptr->ib_trans.mtu);
 		dapl_dbg_log (DAPL_DBG_TYPE_UTIL,
-			" query_hca: (%x.%x) ep %d ep_q %d evd %d evd_q %d\n",
+			" query_hca: (%x.%x) ep %d ep_q %d evd %d evd_q %d mtu %d\n",
 			ia_attr->hardware_version_major,
 			ia_attr->hardware_version_minor,
 			ia_attr->max_eps, ia_attr->max_dto_per_ep,
-			ia_attr->max_evds, ia_attr->max_evd_qlen );
+			ia_attr->max_evds, ia_attr->max_evd_qlen,
+			128 << hca_ptr->ib_trans.mtu);
 		dapl_dbg_log (DAPL_DBG_TYPE_UTIL,
 			" query_hca: msg %llu rdma %llu iov %d lmr %d rmr %d ack_time %d\n",
 			ia_attr->max_mtu_size, ia_attr->max_rdma_size,
diff --git a/dapl/openib_scm/dapl_ib_util.h b/dapl/openib_scm/dapl_ib_util.h
index 6d7568c..dbbe3fa 100644
--- a/dapl/openib_scm/dapl_ib_util.h
+++ b/dapl/openib_scm/dapl_ib_util.h
@@ -187,6 +187,7 @@ typedef struct ibv_comp_channel *ib_wait_obj_handle_t;
 #define SCM_ACK_RETRY 7 /* 3 bits, 7 * 134ms = 940ms */
 #define SCM_RNR_TIMER 28 /* 5 bits, 28 == 163ms, 31 == 491ms */
 #define SCM_RNR_RETRY 7 /* 3 bits, 7 == infinite */
+#define SCM_IB_MTU 1024

 /* Global routing defaults */
 #define SCM_GLOBAL 0 /* global routing is disabled */
--
1.5.2.5

From arlin.r.davis at intel.com Thu Aug 14 16:19:35 2008
From: arlin.r.davis at intel.com (Arlin Davis)
Date: Thu, 14 Aug 2008 16:19:35 -0700
Subject: [ofa-general] [PATCH 3/6][uDAPL v1] dapl scm: use correct device attribute for max_rdma_read_out, max_qp_init_rd_atom
Message-ID: <000201c8fe64$38ecd910$8963fe0a@amr.corp.intel.com>

Signed-off by: Arlin Davis ardavis at ichips.intel.com
---
 dapl/openib_scm/dapl_ib_util.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/dapl/openib_scm/dapl_ib_util.c b/dapl/openib_scm/dapl_ib_util.c
index fdbd17f..bc4ddbc 100644
--- a/dapl/openib_scm/dapl_ib_util.c
+++ b/dapl/openib_scm/dapl_ib_util.c
@@ -436,9 +436,9 @@ DAT_RETURN dapls_ib_query_hca (
 		ia_attr->max_eps = dev_attr.max_qp;
 		ia_attr->max_dto_per_ep = dev_attr.max_qp_wr;
 		ia_attr->max_rdma_read_in = dev_attr.max_qp_rd_atom;
-		ia_attr->max_rdma_read_out = dev_attr.max_qp_rd_atom;
+		ia_attr->max_rdma_read_out = dev_attr.max_qp_init_rd_atom;
 		ia_attr->max_rdma_read_per_ep_in = dev_attr.max_qp_rd_atom;
-		ia_attr->max_rdma_read_per_ep_out = dev_attr.max_qp_rd_atom;
+		ia_attr->max_rdma_read_per_ep_out = dev_attr.max_qp_init_rd_atom;
 		ia_attr->max_rdma_read_per_ep_in_guaranteed = DAT_TRUE;
 		ia_attr->max_rdma_read_per_ep_out_guaranteed = DAT_TRUE;
 		ia_attr->max_evds = dev_attr.max_cq;
@@ -485,7 +485,7 @@ DAT_RETURN dapls_ib_query_hca (
 		ep_attr->max_recv_iov = dev_attr.max_sge;
 		ep_attr->max_request_iov = dev_attr.max_sge;
 		ep_attr->max_rdma_read_in = dev_attr.max_qp_rd_atom;
-		ep_attr->max_rdma_read_out= dev_attr.max_qp_rd_atom;
+		ep_attr->max_rdma_read_out= dev_attr.max_qp_init_rd_atom;
 		dapl_dbg_log (DAPL_DBG_TYPE_UTIL,
 			" query_hca: MAX msg %llu dto %d iov %d rdma i%d,o%d\n",
 			ep_attr->max_mtu_size,
--
1.5.2.5

From arlin.r.davis at intel.com Thu Aug 14 16:19:42 2008
From: arlin.r.davis at intel.com (Arlin Davis)
Date: Thu, 14 Aug 2008 16:19:42 -0700
Subject: [ofa-general] [PATCH 4/6][uDAPL v1] dapl scm: change IB RC qp inline and timer defaults.
Message-ID: <000301c8fe64$3cb72780$8963fe0a@amr.corp.intel.com>

rnr nak can be the result of any operation not just message send receiver not ready. Timer is much too large given this case.

Signed-off by: Arlin Davis ardavis at ichips.intel.com
---
 dapl/openib_scm/dapl_ib_util.h |   10 +++++-----
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/dapl/openib_scm/dapl_ib_util.h b/dapl/openib_scm/dapl_ib_util.h
index dbbe3fa..8ed4fac 100644
--- a/dapl/openib_scm/dapl_ib_util.h
+++ b/dapl/openib_scm/dapl_ib_util.h
@@ -180,13 +180,13 @@ typedef struct ibv_comp_channel *ib_wait_obj_handle_t;
 #define IB_INVALID_HANDLE NULL

 /* inline send rdma threshold */
-#define INLINE_SEND_DEFAULT 128
+#define INLINE_SEND_DEFAULT 200

 /* RC timer - retry count defaults */
-#define SCM_ACK_TIMER 15 /* 5 bits, 4.096us*2^ack_timer. 15 == 134ms */
-#define SCM_ACK_RETRY 7 /* 3 bits, 7 * 134ms = 940ms */
-#define SCM_RNR_TIMER 28 /* 5 bits, 28 == 163ms, 31 == 491ms */
-#define SCM_RNR_RETRY 7 /* 3 bits, 7 == infinite */
+#define SCM_ACK_TIMER 16 /* 5 bits, 4.096us*2^ack_timer. 16== 268ms */
+#define SCM_ACK_RETRY 7 /* 3 bits, 7 * 268ms = 1.8 seconds */
+#define SCM_RNR_TIMER 12 /* 5 bits, 12 =.64ms, 28 =163ms, 31 =491ms */
+#define SCM_RNR_RETRY 7 /* 3 bits, 7 == infinite */
 #define SCM_IB_MTU 1024

 /* Global routing defaults */
--
1.5.2.5

From arlin.r.davis at intel.com Thu Aug 14 16:20:01 2008
From: arlin.r.davis at intel.com (Arlin Davis)
Date: Thu, 14 Aug 2008 16:20:01 -0700
Subject: [ofa-general] [PATCH 6/6][uDAPL v1] dapl scm: better cm debug output in non-debug builds
Message-ID: <000501c8fe64$47f25e30$8963fe0a@amr.corp.intel.com>

Signed-off by: Arlin Davis ardavis at ichips.intel.com
---
 dapl/openib_scm/dapl_ib_cm.c |  161 ++++++++++++++++++++++++++++--------------
 1 files changed, 109 insertions(+), 52 deletions(-)

diff --git a/dapl/openib_scm/dapl_ib_cm.c b/dapl/openib_scm/dapl_ib_cm.c
index 03b0f12..3dcdbad 100644
--- a/dapl/openib_scm/dapl_ib_cm.c
+++ b/dapl/openib_scm/dapl_ib_cm.c
@@ -229,8 +229,10 @@ dapli_socket_connected(ib_cm_handle_t cm_ptr, int err)
 	len = writev(cm_ptr->socket, iovec, (cm_ptr->dst.p_size ?
2:1)); if (len != (ntohl(cm_ptr->dst.p_size) + sizeof(ib_qp_cm_t))) { dapl_log(DAPL_DBG_TYPE_ERR, - " connect write: ERR %s, wcnt=%d\n", - strerror(errno), len); + " CONN_PENDING write: ERR %s, wcnt=%d -> %s\n", + strerror(errno), len, + inet_ntoa(((struct sockaddr_in *) + ep_ptr->param.remote_ia_address_ptr)->sin_addr)); goto bail; } dapl_dbg_log(DAPL_DBG_TYPE_CM, @@ -288,7 +290,7 @@ dapli_socket_connect(DAPL_EP *ep_ptr, if (ret < 0 || fcntl(cm_ptr->socket, F_SETFL, ret | O_NONBLOCK) < 0) { dapl_log(DAPL_DBG_TYPE_ERR, - " connect: fcntl on socket %d ERR %d %s\n", + " socket connect: fcntl on socket %d ERR %d %s\n", cm_ptr->socket, ret, strerror(errno)); goto bail; @@ -298,10 +300,10 @@ dapli_socket_connect(DAPL_EP *ep_ptr, ret = connect(cm_ptr->socket, r_addr, sizeof(*r_addr)); if (ret && errno != EINPROGRESS) { dapl_log(DAPL_DBG_TYPE_ERR, - " connect ERROR: %s on %s r_qual %d\n", - strerror(errno), - inet_ntoa(((struct sockaddr_in *)r_addr)->sin_addr), - (unsigned int)r_qual); + " socket connect ERROR: %s -> %s r_qual %d\n", + strerror(errno), + inet_ntoa(((struct sockaddr_in *)r_addr)->sin_addr), + (unsigned int)r_qual); dapli_cm_destroy(cm_ptr); return DAT_INVALID_ADDRESS; } @@ -312,15 +314,24 @@ dapli_socket_connect(DAPL_EP *ep_ptr, cm_ptr->dst.lid = htons(dapli_get_lid(ia_ptr->hca_ptr->ib_hca_handle, (uint8_t)ia_ptr->hca_ptr->port_num)); - if (cm_ptr->dst.lid == 0xffff) + if (cm_ptr->dst.lid == 0xffff) { + dapl_log(DAPL_DBG_TYPE_ERR, + " CONNECT: query LID ERR %s -> %s\n", + strerror(errno), + inet_ntoa(((struct sockaddr_in *)r_addr)->sin_addr)); goto bail; + } /* in network order */ if (ibv_query_gid(ia_ptr->hca_ptr->ib_hca_handle, (uint8_t)ia_ptr->hca_ptr->port_num, - 0, - &cm_ptr->dst.gid)) + 0, &cm_ptr->dst.gid)) { + dapl_log(DAPL_DBG_TYPE_ERR, + " CONNECT: query GID ERR %s -> %s\n", + strerror(errno), + inet_ntoa(((struct sockaddr_in *)r_addr)->sin_addr)); goto bail; + } /* save references */ cm_ptr->hca = ia_ptr->hca_ptr; @@ -347,8 +358,9 @@ 
dapli_socket_connect(DAPL_EP *ep_ptr, return DAT_SUCCESS; bail: dapl_log(DAPL_DBG_TYPE_ERR, - " connect ERROR: %s query lid(0x%x)/gid on %s r_qual %d\n", - strerror(errno),ntohs(cm_ptr->dst.lid), + " socket connect ERROR: %s query lid(0x%x)/gid" + " -> %s r_qual %d\n", + strerror(errno), ntohs(cm_ptr->dst.lid), inet_ntoa(((struct sockaddr_in *)r_addr)->sin_addr), (unsigned int)r_qual); @@ -377,16 +389,20 @@ dapli_socket_connect_rtu(ib_cm_handle_t cm_ptr) iovec[0].iov_len = sizeof(ib_qp_cm_t); len = readv(cm_ptr->socket, iovec, 1); if (len != sizeof(ib_qp_cm_t) || ntohs(cm_ptr->dst.ver) != DSCM_VER) { - dapl_dbg_log(DAPL_DBG_TYPE_ERR, - " connect_rtu read: ERR %s, rcnt=%d, ver=%d\n", - strerror(errno), len, cm_ptr->dst.ver); + dapl_log(DAPL_DBG_TYPE_ERR, + " CONN_RTU read: ERR %s, rcnt=%d, ver=%d -> %s\n", + strerror(errno), len, cm_ptr->dst.ver, + inet_ntoa(((struct sockaddr_in *) + ep_ptr->param.remote_ia_address_ptr)->sin_addr)); goto bail; } /* check for consumer reject */ if (cm_ptr->dst.rej) { - dapl_dbg_log(DAPL_DBG_TYPE_CM, - " connect_rtu read: PEER REJ reason=0x%x\n", - ntohs(cm_ptr->dst.rej)); + dapl_log(DAPL_DBG_TYPE_CM, + " CONN_RTU read: PEER REJ reason=0x%x -> %s\n", + ntohs(cm_ptr->dst.rej), + inet_ntoa(((struct sockaddr_in *) + ep_ptr->param.remote_ia_address_ptr)->sin_addr)); event = IB_CME_DESTINATION_REJECT_PRIVATE_DATA; goto bail; } @@ -403,16 +419,18 @@ dapli_socket_connect_rtu(ib_cm_handle_t cm_ptr) sizeof(ep_ptr->remote_ia_address)); dapl_dbg_log(DAPL_DBG_TYPE_EP, - " connect_rtu: DST %s port=0x%x lid=0x%x, qpn=0x%x, psize=%d\n", + " CONN_RTU: DST %s port=0x%x lid=0x%x, qpn=0x%x, psize=%d\n", inet_ntoa(((struct sockaddr_in *)&cm_ptr->dst.ia_address)->sin_addr), cm_ptr->dst.port, cm_ptr->dst.lid, cm_ptr->dst.qpn, cm_ptr->dst.p_size); /* validate private data size before reading */ if (cm_ptr->dst.p_size > IB_MAX_REP_PDATA_SIZE) { - dapl_dbg_log(DAPL_DBG_TYPE_ERR, - " connect_rtu read: psize (%d) wrong\n", - cm_ptr->dst.p_size ); + 
dapl_log(DAPL_DBG_TYPE_ERR, + " CONN_RTU read: psize (%d) wrong -> %s\n", + cm_ptr->dst.p_size, + inet_ntoa(((struct sockaddr_in *) + ep_ptr->param.remote_ia_address_ptr)->sin_addr)); goto bail; } @@ -423,21 +441,35 @@ dapli_socket_connect_rtu(ib_cm_handle_t cm_ptr) iovec[0].iov_len = cm_ptr->dst.p_size; len = readv(cm_ptr->socket, iovec, 1); if (len != cm_ptr->dst.p_size) { - dapl_dbg_log(DAPL_DBG_TYPE_ERR, - " connect_rtu read pdata: ERR %s, rcnt=%d\n", - strerror(errno), len); + dapl_log(DAPL_DBG_TYPE_ERR, + " CONN_RTU read pdata: ERR %s, rcnt=%d -> %s\n", + strerror(errno), len, + inet_ntoa(((struct sockaddr_in *) + ep_ptr->param.remote_ia_address_ptr)->sin_addr)); goto bail; } } /* modify QP to RTR and then to RTS with remote info */ if (dapls_modify_qp_state(ep_ptr->qp_handle, - IBV_QPS_RTR, &cm_ptr->dst) != DAT_SUCCESS) + IBV_QPS_RTR, &cm_ptr->dst) != DAT_SUCCESS) { + dapl_log(DAPL_DBG_TYPE_ERR, + " CONN_RTU: QPS_RTR ERR %s -> %s\n", + strerror(errno), + inet_ntoa(((struct sockaddr_in *) + ep_ptr->param.remote_ia_address_ptr)->sin_addr)); goto bail; + } if (dapls_modify_qp_state(ep_ptr->qp_handle, - IBV_QPS_RTS, &cm_ptr->dst) != DAT_SUCCESS) + IBV_QPS_RTS, &cm_ptr->dst) != DAT_SUCCESS) { + dapl_log(DAPL_DBG_TYPE_ERR, + " CONN_RTU: QPS_RTS ERR %s -> %s\n", + strerror(errno), + inet_ntoa(((struct sockaddr_in *) + ep_ptr->param.remote_ia_address_ptr)->sin_addr)); goto bail; + } ep_ptr->qp_state = IB_QP_STATE_RTS; @@ -488,8 +520,9 @@ dapli_socket_listen(DAPL_IA *ia_ptr, /* bind, listen, set sockopt, accept, exchange data */ if ((cm_ptr->socket = socket(AF_INET, SOCK_STREAM, 0)) < 0) { - dapl_dbg_log (DAPL_DBG_TYPE_ERR, - "socket for listen returned %d\n", errno); + dapl_log(DAPL_DBG_TYPE_ERR, + " ERR: listen socket create: %s\n", + strerror(errno)); dat_status = DAT_INSUFFICIENT_RESOURCES; goto bail; } @@ -501,9 +534,9 @@ dapli_socket_listen(DAPL_IA *ia_ptr, if ((bind(cm_ptr->socket,(struct sockaddr*)&addr, sizeof(addr)) < 0) || (listen(cm_ptr->socket, 128) < 
0)) { - dapl_dbg_log( DAPL_DBG_TYPE_CM, - " listen: ERROR %s on conn_qual 0x%x\n", - strerror(errno),serviceID); + dapl_dbg_log(DAPL_DBG_TYPE_CM, + " listen: ERROR %s on conn_qual 0x%x\n", + strerror(errno),serviceID); if (errno == EADDRINUSE) dat_status = DAT_CONN_QUAL_IN_USE; else @@ -556,7 +589,7 @@ dapli_socket_accept(ib_cm_srvc_handle_t cm_ptr) (struct sockaddr*)&acm_ptr->dst.ia_address, (socklen_t*)&len); if (acm_ptr->socket < 0) { - dapl_dbg_log(DAPL_DBG_TYPE_ERR, + dapl_log(DAPL_DBG_TYPE_ERR, " accept: ERR %s on FD %d l_cr %p\n", strerror(errno),cm_ptr->socket,cm_ptr); goto bail; @@ -589,7 +622,7 @@ dapli_socket_accept_data(ib_cm_srvc_handle_t acm_ptr) len = read(acm_ptr->socket, &acm_ptr->dst, sizeof(ib_qp_cm_t)); if (len != sizeof(ib_qp_cm_t) || ntohs(acm_ptr->dst.ver) != DSCM_VER) { - dapl_dbg_log(DAPL_DBG_TYPE_ERR, + dapl_log(DAPL_DBG_TYPE_ERR, " accept read: ERR %s, rcnt=%d, ver=%d\n", strerror(errno), len, acm_ptr->dst.ver); goto bail; @@ -622,7 +655,7 @@ dapli_socket_accept_data(ib_cm_srvc_handle_t acm_ptr) len = read( acm_ptr->socket, acm_ptr->p_data, acm_ptr->dst.p_size); if (len != acm_ptr->dst.p_size) { - dapl_dbg_log(DAPL_DBG_TYPE_ERR, + dapl_log(DAPL_DBG_TYPE_ERR, " accept read pdata: ERR %s, rcnt=%d\n", strerror(errno), len); goto bail; @@ -669,20 +702,30 @@ dapli_socket_accept_usr(DAPL_EP *ep_ptr, return DAT_INTERNAL_ERROR; dapl_dbg_log(DAPL_DBG_TYPE_EP, - " accept_usr: remote port=0x%x lid=0x%x" + " ACCEPT_USR: remote port=0x%x lid=0x%x" " qpn=0x%x psize=%d\n", cm_ptr->dst.port, cm_ptr->dst.lid, cm_ptr->dst.qpn, cm_ptr->dst.p_size); /* modify QP to RTR and then to RTS with remote info already read */ if (dapls_modify_qp_state(ep_ptr->qp_handle, - IBV_QPS_RTR, &cm_ptr->dst) != DAT_SUCCESS) + IBV_QPS_RTR, &cm_ptr->dst) != DAT_SUCCESS) { + dapl_log(DAPL_DBG_TYPE_ERR, + " ACCEPT_USR: QPS_RTR ERR %s -> %s\n", + strerror(errno), + inet_ntoa(((struct sockaddr_in *) + &cm_ptr->dst.ia_address)->sin_addr)); goto bail; - + } if 
(dapls_modify_qp_state(ep_ptr->qp_handle, - IBV_QPS_RTS, &cm_ptr->dst) != DAT_SUCCESS) + IBV_QPS_RTS, &cm_ptr->dst) != DAT_SUCCESS) { + dapl_log(DAPL_DBG_TYPE_ERR, + " ACCEPT_USR: QPS_RTS ERR %s -> %s\n", + strerror(errno), + inet_ntoa(((struct sockaddr_in *) + &cm_ptr->dst.ia_address)->sin_addr)); goto bail; - + } ep_ptr->qp_state = IB_QP_STATE_RTS; /* save remote address information */ @@ -695,15 +738,26 @@ dapli_socket_accept_usr(DAPL_EP *ep_ptr, cm_ptr->dst.port = htons(ia_ptr->hca_ptr->port_num); cm_ptr->dst.lid = htons(dapli_get_lid(ia_ptr->hca_ptr->ib_hca_handle, (uint8_t)ia_ptr->hca_ptr->port_num)); - if (cm_ptr->dst.lid == 0xffff) + if (cm_ptr->dst.lid == 0xffff) { + dapl_log(DAPL_DBG_TYPE_ERR, + " ACCEPT_USR: query LID ERR %s -> %s\n", + strerror(errno), + inet_ntoa(((struct sockaddr_in *) + &cm_ptr->dst.ia_address)->sin_addr)); goto bail; + } /* in network order */ if (ibv_query_gid(ia_ptr->hca_ptr->ib_hca_handle, (uint8_t)ia_ptr->hca_ptr->port_num, - 0, - &cm_ptr->dst.gid)) + 0, &cm_ptr->dst.gid)) { + dapl_log(DAPL_DBG_TYPE_ERR, + " ACCEPT_USR: query GID ERR %s -> %s\n", + strerror(errno), + inet_ntoa(((struct sockaddr_in *) + &cm_ptr->dst.ia_address)->sin_addr)); goto bail; + } cm_ptr->dst.ia_address = ia_ptr->hca_ptr->hca_address; cm_ptr->dst.p_size = htonl(p_size); @@ -715,18 +769,20 @@ dapli_socket_accept_usr(DAPL_EP *ep_ptr, } len = writev(cm_ptr->socket, iovec, (p_size ? 
2:1)); if (len != (p_size + sizeof(ib_qp_cm_t))) { - dapl_dbg_log(DAPL_DBG_TYPE_ERR, - " accept_rtu: ERR %s, wcnt=%d\n", - strerror(errno), len); + dapl_log(DAPL_DBG_TYPE_ERR, + " ACCEPT_USR: ERR %s, wcnt=%d -> %s\n", + strerror(errno), len, + inet_ntoa(((struct sockaddr_in *) + &cm_ptr->dst.ia_address)->sin_addr)); goto bail; } dapl_dbg_log(DAPL_DBG_TYPE_CM, - " accept_usr: local port=0x%x lid=0x%x" + " ACCEPT_USR: local port=0x%x lid=0x%x" " qpn=0x%x psize=%d\n", ntohs(cm_ptr->dst.port), ntohs(cm_ptr->dst.lid), ntohl(cm_ptr->dst.qpn), ntohl(cm_ptr->dst.p_size)); dapl_dbg_log(DAPL_DBG_TYPE_CM, - " accept_usr SRC GID subnet %016llx id %016llx\n", + " ACCEPT_USR SRC GID subnet %016llx id %016llx\n", (unsigned long long) cpu_to_be64(cm_ptr->dst.gid.global.subnet_prefix), (unsigned long long) @@ -745,7 +801,6 @@ dapli_socket_accept_usr(DAPL_EP *ep_ptr, dapl_dbg_log( DAPL_DBG_TYPE_EP," PASSIVE: accepted!\n" ); return DAT_SUCCESS; bail: - dapl_dbg_log(DAPL_DBG_TYPE_ERR," accept_rtu: ERR !QP_RTR_RTS \n"); dapli_cm_destroy(cm_ptr); dapls_ib_reinit_ep(ep_ptr); /* reset QP state */ return DAT_INTERNAL_ERROR; @@ -763,9 +818,11 @@ dapli_socket_accept_rtu(ib_cm_handle_t cm_ptr) /* complete handshake after final QP state change */ len = read(cm_ptr->socket, &rtu_data, sizeof(rtu_data)); if (len != sizeof(rtu_data) || ntohs(rtu_data) != 0x0e0f) { - dapl_dbg_log(DAPL_DBG_TYPE_ERR, - " accept_rtu: ERR %s, rcnt=%d rdata=%x\n", - strerror(errno), len, ntohs(rtu_data)); + dapl_log(DAPL_DBG_TYPE_ERR, + " ACCEPT_RTU: ERR %s, rcnt=%d rdata=%x\n", + strerror(errno), len, ntohs(rtu_data), + inet_ntoa(((struct sockaddr_in *) + &cm_ptr->dst.ia_address)->sin_addr)); goto bail; } -- 1.5.2.5 From arlin.r.davis at intel.com Thu Aug 14 16:19:49 2008 From: arlin.r.davis at intel.com (Arlin Davis) Date: Thu, 14 Aug 2008 16:19:49 -0700 Subject: [ofa-general] [PATCH 5/6][uDAPL v1] dapl scm: update max_rdma_read_iov, max_rdma_write_iov EP attributes during query Message-ID: 
<000401c8fe64$415a70d0$8963fe0a@amr.corp.intel.com>

Signed-off by: Arlin Davis ardavis at ichips.intel.com
---
 dapl/openib_scm/dapl_ib_util.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/dapl/openib_scm/dapl_ib_util.c b/dapl/openib_scm/dapl_ib_util.c
index bc4ddbc..76bde89 100644
--- a/dapl/openib_scm/dapl_ib_util.c
+++ b/dapl/openib_scm/dapl_ib_util.c
@@ -486,6 +486,8 @@ DAT_RETURN dapls_ib_query_hca (
 		ep_attr->max_request_iov = dev_attr.max_sge;
 		ep_attr->max_rdma_read_in = dev_attr.max_qp_rd_atom;
 		ep_attr->max_rdma_read_out= dev_attr.max_qp_init_rd_atom;
+		ep_attr->max_rdma_read_iov= dev_attr.max_sge;
+		ep_attr->max_rdma_write_iov= dev_attr.max_sge;
 		dapl_dbg_log (DAPL_DBG_TYPE_UTIL,
 			" query_hca: MAX msg %llu dto %d iov %d rdma i%d,o%d\n",
 			ep_attr->max_mtu_size,
--
1.5.2.5

From arlin.r.davis at intel.com Thu Aug 14 17:06:39 2008
From: arlin.r.davis at intel.com (Arlin Davis)
Date: Thu, 14 Aug 2008 17:06:39 -0700
Subject: [ofa-general] [PATCH 1/5] [uDAPL v2] dapl scm: update max_rdma_read_iov, max_rdma_write_iov EP attributes during query
Message-ID: <000601c8fe6a$cc1b9f90$8963fe0a@amr.corp.intel.com>

>From 7e25c0f21d755cce3aa7aff993fb0baddaafc0e8 Mon Sep 17 00:00:00 2001
From: Arlin Davis

Patch set for uDAPL v2 socket cm provider improvements.
Signed-off by: Arlin Davis ardavis at ichips.intel.com
---
 dapl/openib_scm/dapl_ib_util.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/dapl/openib_scm/dapl_ib_util.c b/dapl/openib_scm/dapl_ib_util.c
index 43f85ac..f3874ca 100644
--- a/dapl/openib_scm/dapl_ib_util.c
+++ b/dapl/openib_scm/dapl_ib_util.c
@@ -478,6 +478,8 @@ DAT_RETURN dapls_ib_query_hca (
 		ep_attr->max_request_iov = dev_attr.max_sge;
 		ep_attr->max_rdma_read_in = dev_attr.max_qp_rd_atom;
 		ep_attr->max_rdma_read_out= dev_attr.max_qp_rd_atom;
+		ep_attr->max_rdma_read_iov= dev_attr.max_sge;
+		ep_attr->max_rdma_write_iov= dev_attr.max_sge;
 		dapl_dbg_log (DAPL_DBG_TYPE_UTIL,
 			" query_hca: MAX msg %llu mtu %d dto %d iov %d"
 			" rdma i%d,o%d\n",
--
1.5.2.5

From arlin.r.davis at intel.com Thu Aug 14 17:06:43 2008
From: arlin.r.davis at intel.com (Arlin Davis)
Date: Thu, 14 Aug 2008 17:06:43 -0700
Subject: [ofa-general] [PATCH 2/5] [uDAPL v2] dapl scm: change connect and accept to non-blocking to avoid blocking user thread.
Message-ID: <000701c8fe6a$cde56310$8963fe0a@amr.corp.intel.com>

The connect socket that is used to exchange QP information is now non-blocking and the data exchange is done via the cr thread. A new state, SCM_RTU_PENDING, is added. On the passive side a new state, SCM_ACCEPTING_DATA, is used to avoid a blocking read in the user accept call.
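
The state handling described above can be pictured as a small dispatch function that the worker thread consults after each poll. The sketch below is an illustration only and not the provider's code: the SK_* states loosely mirror the scm_state values touched by this patch, while the sk_action enum and both function names are hypothetical. The one behaviour it is meant to show is that a pending connect waits for POLLOUT, whereas every other state waits for POLLIN.

```c
#include <poll.h>

/* Hypothetical connection states, loosely modelled on scm_state. */
enum sk_state {
	SK_LISTEN,
	SK_CONN_PENDING,	/* non-blocking connect() still in progress */
	SK_RTU_PENDING,		/* QP info sent, waiting for the peer's reply */
	SK_ACCEPTING,		/* socket accepted, peer QP info not yet read */
	SK_ACCEPTED,
	SK_CONNECTED
};

/* Hypothetical work items the polling thread could run next. */
enum sk_action {
	SK_ACT_NONE,
	SK_ACT_ACCEPT,
	SK_ACT_READ_PEER_DATA,
	SK_ACT_ACCEPT_RTU,
	SK_ACT_CHECK_CONNECT,
	SK_ACT_CONNECT_RTU,
	SK_ACT_DISCONNECT
};

/* Which poll events to request for a connection in the given state. */
short sk_poll_events(enum sk_state st)
{
	return (st == SK_CONN_PENDING) ? POLLOUT : POLLIN;
}

/* Map (state, returned events) to the next piece of work. */
enum sk_action sk_next_action(enum sk_state st, short revents)
{
	if (st == SK_CONN_PENDING)
		return (revents & (POLLOUT | POLLERR)) ?
			SK_ACT_CHECK_CONNECT : SK_ACT_NONE;

	if (!(revents & POLLIN))
		return SK_ACT_NONE;

	switch (st) {
	case SK_LISTEN:		return SK_ACT_ACCEPT;
	case SK_ACCEPTING:	return SK_ACT_READ_PEER_DATA;
	case SK_ACCEPTED:	return SK_ACT_ACCEPT_RTU;
	case SK_RTU_PENDING:	return SK_ACT_CONNECT_RTU;
	case SK_CONNECTED:	return SK_ACT_DISCONNECT;
	default:		return SK_ACT_NONE;
	}
}
```

Keeping the decision in a pure function like this is a design choice of the sketch, made so that the mapping can be checked without sockets. The patch itself makes the same decisions inline in cr_thread.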
Signed-off by: Arlin Davis ardavis at ichips.intel.com --- dapl/openib_scm/dapl_ib_cm.c | 415 +++++++++++++++++++++++----------- dapl/openib_scm/dapl_ib_extensions.c | 2 +- dapl/openib_scm/dapl_ib_util.c | 2 - dapl/openib_scm/dapl_ib_util.h | 2 + 4 files changed, 288 insertions(+), 133 deletions(-) diff --git a/dapl/openib_scm/dapl_ib_cm.c b/dapl/openib_scm/dapl_ib_cm.c index e712f9d..5ba5ddc 100644 --- a/dapl/openib_scm/dapl_ib_cm.c +++ b/dapl/openib_scm/dapl_ib_cm.c @@ -150,7 +150,7 @@ static uint16_t dapli_get_lid(IN struct ibv_context *ctx, IN uint8_t port) * ACTIVE/PASSIVE: called from CR thread or consumer via ep_disconnect */ static DAT_RETURN -dapli_socket_disconnect(dp_ib_cm_handle_t cm_ptr) +dapli_socket_disconnect(dp_ib_cm_handle_t cm_ptr) { DAPL_EP *ep_ptr = cm_ptr->ep; DAT_UINT32 disc_data = htonl(0xdead); @@ -197,6 +197,65 @@ dapli_socket_disconnect(dp_ib_cm_handle_t cm_ptr) return DAT_SUCCESS; } +/* + * ACTIVE: socket connected, send QP information to peer + */ +void +dapli_socket_connected(dp_ib_cm_handle_t cm_ptr, int err) +{ + int len, opt = 1; + struct iovec iovec[2]; + struct dapl_ep *ep_ptr = cm_ptr->ep; + + if (err) { + dapl_log(DAPL_DBG_TYPE_ERR, " connect: socket ERR %s\n", + strerror(err)); + goto bail; + } + dapl_dbg_log(DAPL_DBG_TYPE_EP, + " socket connected, write QP and private data\n"); + + /* no delay for small packets */ + setsockopt(cm_ptr->socket,IPPROTO_TCP,TCP_NODELAY,&opt,sizeof(opt)); + + /* send qp info and pdata to remote peer */ + iovec[0].iov_base = &cm_ptr->dst; + iovec[0].iov_len = sizeof(ib_qp_cm_t); + if (cm_ptr->dst.p_size) { + iovec[1].iov_base = cm_ptr->p_data; + iovec[1].iov_len = ntohl(cm_ptr->dst.p_size); + } + + len = writev(cm_ptr->socket, iovec, (cm_ptr->dst.p_size ? 
2:1)); + if (len != (ntohl(cm_ptr->dst.p_size) + sizeof(ib_qp_cm_t))) { + dapl_log(DAPL_DBG_TYPE_ERR, + " CONN_PENDING write: ERR %s, wcnt=%d -> %s\n", + strerror(errno), len, + inet_ntoa(((struct sockaddr_in *) + ep_ptr->param.remote_ia_address_ptr)->sin_addr)); + goto bail; + } + dapl_dbg_log(DAPL_DBG_TYPE_CM, + " connected: sending SRC port=0x%x lid=0x%x," + " qpn=0x%x, psize=%d\n", + ntohs(cm_ptr->dst.port), ntohs(cm_ptr->dst.lid), + ntohl(cm_ptr->dst.qpn), ntohl(cm_ptr->dst.p_size)); + dapl_dbg_log(DAPL_DBG_TYPE_CM, + " connected: sending SRC GID subnet %016llx id %016llx\n", + (unsigned long long) + cpu_to_be64(cm_ptr->dst.gid.global.subnet_prefix), + (unsigned long long) + cpu_to_be64(cm_ptr->dst.gid.global.interface_id)); + + /* queue up to work thread to avoid blocking consumer */ + cm_ptr->state = SCM_RTU_PENDING; + return; +bail: + /* close socket, free cm structure and post error event */ + dapli_cm_destroy(cm_ptr); + dapl_evd_connection_callback(NULL, IB_CME_LOCAL_FAILURE, NULL, ep_ptr); +} + /* * ACTIVE: Create socket, connect, defer exchange QP information to CR thread @@ -210,8 +269,7 @@ dapli_socket_connect(DAPL_EP *ep_ptr, DAT_PVOID p_data) { dp_ib_cm_handle_t cm_ptr; - int len, opt = 1; - struct iovec iovec[2]; + int ret; DAPL_IA *ia_ptr = ep_ptr->header.owner_ia; dapl_dbg_log(DAPL_DBG_TYPE_EP, " connect: r_qual %d p_size=%d\n", @@ -227,75 +285,88 @@ dapli_socket_connect(DAPL_EP *ep_ptr, return DAT_INSUFFICIENT_RESOURCES; } - ((struct sockaddr_in*)r_addr)->sin_port = htons(r_qual); + /* non-blocking */ + ret = fcntl(cm_ptr->socket, F_GETFL); + if (ret < 0 || fcntl(cm_ptr->socket, + F_SETFL, ret | O_NONBLOCK) < 0) { + dapl_log(DAPL_DBG_TYPE_ERR, + " socket connect: fcntl on socket %d ERR %d %s\n", + cm_ptr->socket, ret, + strerror(errno)); + goto bail; + } - if (connect(cm_ptr->socket, r_addr, sizeof(*r_addr)) < 0) { - dapl_dbg_log(DAPL_DBG_TYPE_ERR, - " connect: %s on r_qual %d\n", - strerror(errno), (unsigned int)r_qual); + ((struct 
sockaddr_in*)r_addr)->sin_port = htons(r_qual); + ret = connect(cm_ptr->socket, r_addr, sizeof(*r_addr)); + if (ret && errno != EINPROGRESS) { + dapl_log(DAPL_DBG_TYPE_ERR, + " socket connect ERROR: %s -> %s r_qual %d\n", + strerror(errno), + inet_ntoa(((struct sockaddr_in *)r_addr)->sin_addr), + (unsigned int)r_qual); dapli_cm_destroy(cm_ptr); return DAT_INVALID_ADDRESS; - } - setsockopt(cm_ptr->socket,IPPROTO_TCP,TCP_NODELAY,&opt,sizeof(opt)); - - dapl_dbg_log(DAPL_DBG_TYPE_EP, " socket connected!\n"); - + } /* Send QP info, IA address, and private data */ cm_ptr->dst.qpn = htonl(ep_ptr->qp_handle->qp_num); +#ifdef DAT_EXTENSIONS cm_ptr->dst.qp_type = htons(ep_ptr->qp_handle->qp_type); +#endif cm_ptr->dst.port = htons(ia_ptr->hca_ptr->port_num); cm_ptr->dst.lid = htons(dapli_get_lid(ia_ptr->hca_ptr->ib_hca_handle, (uint8_t)ia_ptr->hca_ptr->port_num)); - if (cm_ptr->dst.lid == 0xffff) + if (cm_ptr->dst.lid == 0xffff) { + dapl_log(DAPL_DBG_TYPE_ERR, + " CONNECT: query LID ERR %s -> %s\n", + strerror(errno), + inet_ntoa(((struct sockaddr_in *)r_addr)->sin_addr)); goto bail; + } /* in network order */ if (ibv_query_gid(ia_ptr->hca_ptr->ib_hca_handle, (uint8_t)ia_ptr->hca_ptr->port_num, - 0, - &cm_ptr->dst.gid)) + 0, &cm_ptr->dst.gid)) { + dapl_log(DAPL_DBG_TYPE_ERR, + " CONNECT: query GID ERR %s -> %s\n", + strerror(errno), + inet_ntoa(((struct sockaddr_in *)r_addr)->sin_addr)); goto bail; + } + /* save references */ + cm_ptr->hca = ia_ptr->hca_ptr; + cm_ptr->ep = ep_ptr; cm_ptr->dst.ia_address = ia_ptr->hca_ptr->hca_address; - cm_ptr->dst.p_size = htonl(p_size); - iovec[0].iov_base = &cm_ptr->dst; - iovec[0].iov_len = sizeof(ib_qp_cm_t); if (p_size) { - iovec[1].iov_base = p_data; - iovec[1].iov_len = p_size; + cm_ptr->dst.p_size = htonl(p_size); + dapl_os_memcpy(cm_ptr->p_data, p_data, p_size); } - dapl_dbg_log(DAPL_DBG_TYPE_EP, - " socket connected, write QP (%d), private data (%d)\n", - sizeof(ib_qp_cm_t),p_size); + /* connected or pending, either way results via 
async event */ + if (ret == 0) + dapli_socket_connected(cm_ptr,0); + else + cm_ptr->state = SCM_CONN_PENDING; - len = writev(cm_ptr->socket, iovec, (p_size ? 2:1)); - if (len != (p_size + sizeof(ib_qp_cm_t))) { - dapl_dbg_log(DAPL_DBG_TYPE_ERR, - " connect write: ERR %s, wcnt=%d\n", - strerror(errno), len); - goto bail; - } - dapl_dbg_log(DAPL_DBG_TYPE_CM, - " connect: SRC port=0x%x lid=0x%x, qpn=0x%x, psize=%d\n", - ntohs(cm_ptr->dst.port), ntohs(cm_ptr->dst.lid), - ntohl(cm_ptr->dst.qpn), ntohl(cm_ptr->dst.p_size)); - dapl_dbg_log(DAPL_DBG_TYPE_CM, - " connect SRC GID subnet %016llx id %016llx\n", - (unsigned long long) - cpu_to_be64(cm_ptr->dst.gid.global.subnet_prefix), - (unsigned long long) - cpu_to_be64(cm_ptr->dst.gid.global.interface_id)); - - /* queue up to work thread to avoid blocking consumer */ - cm_ptr->state = SCM_CONN_PENDING; - cm_ptr->hca = ia_ptr->hca_ptr; - cm_ptr->ep = ep_ptr; + dapl_dbg_log(DAPL_DBG_TYPE_EP, + " connect: socket %d to %s r_qual %d pending\n", + cm_ptr->socket, + inet_ntoa(((struct sockaddr_in *)r_addr)->sin_addr), + (unsigned int)r_qual); + dapli_cm_queue(cm_ptr); return DAT_SUCCESS; bail: + dapl_log(DAPL_DBG_TYPE_ERR, + " socket connect ERROR: %s query lid(0x%x)/gid" + " -> %s r_qual %d\n", + strerror(errno), ntohs(cm_ptr->dst.lid), + inet_ntoa(((struct sockaddr_in *)r_addr)->sin_addr), + (unsigned int)r_qual); + /* close socket, free cm structure */ dapli_cm_destroy(cm_ptr); return DAT_INTERNAL_ERROR; @@ -306,7 +377,7 @@ bail: * ACTIVE: exchange QP information, called from CR thread */ void -dapli_socket_connect_rtu(dp_ib_cm_handle_t cm_ptr) +dapli_socket_connect_rtu(dp_ib_cm_handle_t cm_ptr) { DAPL_EP *ep_ptr = cm_ptr->ep; int len; @@ -321,16 +392,20 @@ dapli_socket_connect_rtu(dp_ib_cm_handle_t cm_ptr) iovec[0].iov_len = sizeof(ib_qp_cm_t); len = readv(cm_ptr->socket, iovec, 1); if (len != sizeof(ib_qp_cm_t) || ntohs(cm_ptr->dst.ver) != DSCM_VER) { - dapl_dbg_log(DAPL_DBG_TYPE_ERR, - " connect_rtu read: ERR %s, rcnt=%d, 
ver=%d\n", - strerror(errno), len, ntohs(cm_ptr->dst.ver)); + dapl_log(DAPL_DBG_TYPE_ERR, + " CONN_RTU read: ERR %s, rcnt=%d, ver=%d -> %s\n", + strerror(errno), len, cm_ptr->dst.ver, + inet_ntoa(((struct sockaddr_in *) + ep_ptr->param.remote_ia_address_ptr)->sin_addr)); goto bail; } /* check for consumer reject */ if (cm_ptr->dst.rej) { - dapl_dbg_log(DAPL_DBG_TYPE_CM, - " connect_rtu read: PEER REJ reason=0x%x\n", - ntohs(cm_ptr->dst.rej)); + dapl_log(DAPL_DBG_TYPE_CM, + " CONN_RTU read: PEER REJ reason=0x%x -> %s\n", + ntohs(cm_ptr->dst.rej), + inet_ntoa(((struct sockaddr_in *) + ep_ptr->param.remote_ia_address_ptr)->sin_addr)); event = IB_CME_DESTINATION_REJECT_PRIVATE_DATA; goto bail; } @@ -339,7 +414,9 @@ dapli_socket_connect_rtu(dp_ib_cm_handle_t cm_ptr) cm_ptr->dst.port = ntohs(cm_ptr->dst.port); cm_ptr->dst.lid = ntohs(cm_ptr->dst.lid); cm_ptr->dst.qpn = ntohl(cm_ptr->dst.qpn); +#ifdef DAT_EXTENSIONS cm_ptr->dst.qp_type = ntohs(cm_ptr->dst.qp_type); +#endif cm_ptr->dst.p_size = ntohl(cm_ptr->dst.p_size); /* save remote address information */ @@ -348,7 +425,7 @@ dapli_socket_connect_rtu(dp_ib_cm_handle_t cm_ptr) sizeof(ep_ptr->remote_ia_address)); dapl_dbg_log(DAPL_DBG_TYPE_EP, - " connect_rtu: DST %s port=0x%x lid=0x%x," + " CONN_RTU: DST %s port=0x%x lid=0x%x," " qpn=0x%x, qp_type=%d, psize=%d\n", inet_ntoa(((struct sockaddr_in *) &cm_ptr->dst.ia_address)->sin_addr), @@ -358,35 +435,50 @@ dapli_socket_connect_rtu(dp_ib_cm_handle_t cm_ptr) /* validate private data size before reading */ if (cm_ptr->dst.p_size > IB_MAX_REP_PDATA_SIZE) { - dapl_dbg_log(DAPL_DBG_TYPE_ERR, - " connect_rtu read: psize (%d) wrong\n", - cm_ptr->dst.p_size ); + dapl_log(DAPL_DBG_TYPE_ERR, + " CONN_RTU read: psize (%d) wrong -> %s\n", + cm_ptr->dst.p_size, + inet_ntoa(((struct sockaddr_in *) + ep_ptr->param.remote_ia_address_ptr)->sin_addr)); goto bail; } /* read private data into cm_handle if any present */ - dapl_dbg_log(DAPL_DBG_TYPE_EP," socket connected, read pdata\n"); - + 
dapl_dbg_log(DAPL_DBG_TYPE_EP," socket connected, read private data\n"); if (cm_ptr->dst.p_size) { iovec[0].iov_base = cm_ptr->p_data; iovec[0].iov_len = cm_ptr->dst.p_size; len = readv(cm_ptr->socket, iovec, 1); if (len != cm_ptr->dst.p_size) { - dapl_dbg_log(DAPL_DBG_TYPE_ERR, - " connect_rtu read pdata: ERR %s, rcnt=%d\n", - strerror(errno), len); + dapl_log(DAPL_DBG_TYPE_ERR, + " CONN_RTU read pdata: ERR %s, rcnt=%d -> %s\n", + strerror(errno), len, + inet_ntoa(((struct sockaddr_in *) + ep_ptr->param.remote_ia_address_ptr)->sin_addr)); goto bail; } } /* modify QP to RTR and then to RTS with remote info */ if (dapls_modify_qp_state(ep_ptr->qp_handle, - IBV_QPS_RTR, cm_ptr) != DAT_SUCCESS) + IBV_QPS_RTR, cm_ptr) != DAT_SUCCESS) { + dapl_log(DAPL_DBG_TYPE_ERR, + " CONN_RTU: QPS_RTR ERR %s -> %s\n", + strerror(errno), + inet_ntoa(((struct sockaddr_in *) + ep_ptr->param.remote_ia_address_ptr)->sin_addr)); goto bail; + } if (dapls_modify_qp_state(ep_ptr->qp_handle, - IBV_QPS_RTS, cm_ptr) != DAT_SUCCESS) + IBV_QPS_RTS, cm_ptr) != DAT_SUCCESS) { + dapl_log(DAPL_DBG_TYPE_ERR, + " CONN_RTU: QPS_RTS ERR %s -> %s\n", + strerror(errno), + inet_ntoa(((struct sockaddr_in *) + ep_ptr->param.remote_ia_address_ptr)->sin_addr)); goto bail; + } ep_ptr->qp_state = IB_QP_STATE_RTS; @@ -426,7 +518,6 @@ dapli_socket_connect_rtu(dp_ib_cm_handle_t cm_ptr) IB_CME_CONNECTED, cm_ptr->p_data, ep_ptr); - return; bail: /* close socket, free cm structure and post error event */ @@ -461,8 +552,9 @@ dapli_socket_listen(DAPL_IA *ia_ptr, /* bind, listen, set sockopt, accept, exchange data */ if ((cm_ptr->socket = socket(AF_INET, SOCK_STREAM, 0)) < 0) { - dapl_dbg_log (DAPL_DBG_TYPE_ERR, - "socket for listen returned %d\n", errno); + dapl_log(DAPL_DBG_TYPE_ERR, + " ERR: listen socket create: %s\n", + strerror(errno)); dat_status = DAT_INSUFFICIENT_RESOURCES; goto bail; } @@ -474,9 +566,9 @@ dapli_socket_listen(DAPL_IA *ia_ptr, if ((bind(cm_ptr->socket,(struct sockaddr*)&addr, sizeof(addr)) < 0) || 
(listen(cm_ptr->socket, 128) < 0)) { - dapl_dbg_log( DAPL_DBG_TYPE_CM, - " listen: ERROR %s on conn_qual 0x%x\n", - strerror(errno),serviceID); + dapl_dbg_log(DAPL_DBG_TYPE_CM, + " listen: ERROR %s on conn_qual 0x%x\n", + strerror(errno),serviceID); if (errno == EADDRINUSE) dat_status = DAT_CONN_QUAL_IN_USE; else @@ -503,25 +595,22 @@ bail: return dat_status; } - /* - * PASSIVE: accept socket, receive peer QP information, private data, post cr_event + * PASSIVE: accept socket */ -DAT_RETURN +void dapli_socket_accept(ib_cm_srvc_handle_t cm_ptr) { - dp_ib_cm_handle_t acm_ptr; - void *p_data = NULL; + dp_ib_cm_handle_t acm_ptr; int len; - DAT_RETURN dat_status = DAT_SUCCESS; dapl_dbg_log(DAPL_DBG_TYPE_EP," socket_accept\n"); /* Allocate accept CM and initialize */ if ((acm_ptr = dapl_os_alloc(sizeof(*acm_ptr))) == NULL) - return DAT_INSUFFICIENT_RESOURCES; + goto bail; - (void) dapl_os_memzero( acm_ptr, sizeof( *acm_ptr ) ); + (void) dapl_os_memzero(acm_ptr, sizeof(*acm_ptr)); acm_ptr->socket = -1; acm_ptr->sp = cm_ptr->sp; @@ -531,25 +620,43 @@ dapli_socket_accept(ib_cm_srvc_handle_t cm_ptr) acm_ptr->socket = accept(cm_ptr->socket, (struct sockaddr*)&acm_ptr->dst.ia_address, (socklen_t*)&len); - if (acm_ptr->socket < 0) { - dapl_dbg_log(DAPL_DBG_TYPE_ERR, + dapl_log(DAPL_DBG_TYPE_ERR, " accept: ERR %s on FD %d l_cr %p\n", strerror(errno),cm_ptr->socket,cm_ptr); - dat_status = DAT_INTERNAL_ERROR; goto bail; } + dapl_dbg_log(DAPL_DBG_TYPE_EP, + " socket accepted, queue new cm %p\n",acm_ptr); + + acm_ptr->state = SCM_ACCEPTING; + dapli_cm_queue(acm_ptr); + return; +bail: + /* close socket, free cm structure, active will see socket close as reject */ + if (acm_ptr) + dapli_cm_destroy(acm_ptr); +} + +/* + * PASSIVE: receive peer QP information, private data, post cr_event + */ +void +dapli_socket_accept_data(ib_cm_srvc_handle_t acm_ptr) +{ + int len; + void *p_data = NULL; + dapl_dbg_log(DAPL_DBG_TYPE_EP," socket accepted, read QP data\n"); /* read in DST QP info, IA 
address. check for private data */ len = read(acm_ptr->socket, &acm_ptr->dst, sizeof(ib_qp_cm_t)); if (len != sizeof(ib_qp_cm_t) || ntohs(acm_ptr->dst.ver) != DSCM_VER) { - dapl_dbg_log(DAPL_DBG_TYPE_ERR, + dapl_log(DAPL_DBG_TYPE_ERR, " accept read: ERR %s, rcnt=%d, ver=%d\n", - strerror(errno), len, ntohs(acm_ptr->dst.ver)); - dat_status = DAT_INTERNAL_ERROR; + strerror(errno), len, acm_ptr->dst.ver); goto bail; } @@ -557,13 +664,14 @@ dapli_socket_accept(ib_cm_srvc_handle_t cm_ptr) acm_ptr->dst.port = ntohs(acm_ptr->dst.port); acm_ptr->dst.lid = ntohs(acm_ptr->dst.lid); acm_ptr->dst.qpn = ntohl(acm_ptr->dst.qpn); +#ifdef DAT_EXTENSIONS acm_ptr->dst.qp_type = ntohs(acm_ptr->dst.qp_type); +#endif acm_ptr->dst.p_size = ntohl(acm_ptr->dst.p_size); dapl_dbg_log(DAPL_DBG_TYPE_EP, - " accept: DST %s port=0x%x lid=0x%x, qpn=0x%x, psz=%d\n", - inet_ntoa(((struct sockaddr_in *) - &acm_ptr->dst.ia_address)->sin_addr), + " accept: DST %s port=0x%x lid=0x%x, qpn=0x%x, psize=%d\n", + inet_ntoa(((struct sockaddr_in *)&acm_ptr->dst.ia_address)->sin_addr), acm_ptr->dst.port, acm_ptr->dst.lid, acm_ptr->dst.qpn, acm_ptr->dst.p_size); @@ -572,7 +680,6 @@ dapli_socket_accept(ib_cm_srvc_handle_t cm_ptr) dapl_dbg_log(DAPL_DBG_TYPE_ERR, " accept read: psize (%d) wrong\n", acm_ptr->dst.p_size); - dat_status = DAT_INTERNAL_ERROR; goto bail; } @@ -583,18 +690,17 @@ dapli_socket_accept(ib_cm_srvc_handle_t cm_ptr) len = read( acm_ptr->socket, acm_ptr->p_data, acm_ptr->dst.p_size); if (len != acm_ptr->dst.p_size) { - dapl_dbg_log(DAPL_DBG_TYPE_ERR, + dapl_log(DAPL_DBG_TYPE_ERR, " accept read pdata: ERR %s, rcnt=%d\n", strerror(errno), len); - dat_status = DAT_INTERNAL_ERROR; goto bail; } dapl_dbg_log(DAPL_DBG_TYPE_EP," accept: psize=%d read\n",len); p_data = acm_ptr->p_data; } - acm_ptr->state = SCM_ACCEPTING; - + acm_ptr->state = SCM_ACCEPTING_DATA; + #ifdef DAT_EXTENSIONS if (acm_ptr->dst.qp_type == IBV_QPT_UD) { DAT_IB_EXTENSION_EVENT_DATA xevent; @@ -617,10 +723,11 @@ 
dapli_socket_accept(ib_cm_srvc_handle_t cm_ptr) IB_CME_CONNECTION_REQUEST_PENDING, p_data, acm_ptr->sp ); - return DAT_SUCCESS; + return; bail: + /* close socket, free cm structure, active will see socket close as reject */ dapli_cm_destroy(acm_ptr); - return DAT_INTERNAL_ERROR; + return; } /* @@ -635,7 +742,7 @@ dapli_socket_accept_usr(DAPL_EP *ep_ptr, DAT_PVOID p_data) { DAPL_IA *ia_ptr = ep_ptr->header.owner_ia; - dp_ib_cm_handle_t cm_ptr = cr_ptr->ib_cm_handle; + dp_ib_cm_handle_t cm_ptr = cr_ptr->ib_cm_handle; ib_qp_cm_t local; struct iovec iovec[2]; int len; @@ -648,7 +755,7 @@ dapli_socket_accept_usr(DAPL_EP *ep_ptr, return DAT_INTERNAL_ERROR; dapl_dbg_log(DAPL_DBG_TYPE_EP, - " accept_usr: remote port=0x%x lid=0x%x" + " ACCEPT_USR: remote port=0x%x lid=0x%x" " qpn=0x%x qp_type %d, psize=%d\n", cm_ptr->dst.port, cm_ptr->dst.lid, cm_ptr->dst.qpn, cm_ptr->dst.qp_type, @@ -658,25 +765,34 @@ dapli_socket_accept_usr(DAPL_EP *ep_ptr, if (cm_ptr->dst.qp_type == IBV_QPT_UD && ep_ptr->qp_handle->qp_type != IBV_QPT_UD) { dapl_dbg_log(DAPL_DBG_TYPE_ERR, - " accept_rtu: ERR remote QP is UD," + " ACCEPT_USR: ERR remote QP is UD," ", but local QP is not\n"); return (DAT_INVALID_HANDLE | DAT_INVALID_HANDLE_EP); - } #endif /* modify QP to RTR and then to RTS with remote info already read */ if (dapls_modify_qp_state(ep_ptr->qp_handle, - IBV_QPS_RTR, cm_ptr) != DAT_SUCCESS) + IBV_QPS_RTR, cm_ptr) != DAT_SUCCESS) { + dapl_log(DAPL_DBG_TYPE_ERR, + " ACCEPT_USR: QPS_RTR ERR %s -> %s\n", + strerror(errno), + inet_ntoa(((struct sockaddr_in *) + &cm_ptr->dst.ia_address)->sin_addr)); goto bail; - + } if (dapls_modify_qp_state(ep_ptr->qp_handle, - IBV_QPS_RTS, cm_ptr) != DAT_SUCCESS) + IBV_QPS_RTS, cm_ptr) != DAT_SUCCESS) { + dapl_log(DAPL_DBG_TYPE_ERR, + " ACCEPT_USR: QPS_RTS ERR %s -> %s\n", + strerror(errno), + inet_ntoa(((struct sockaddr_in *) + &cm_ptr->dst.ia_address)->sin_addr)); goto bail; - + } ep_ptr->qp_state = IB_QP_STATE_RTS; - /* save remote address information, for 
qp_query */ + /* save remote address information */ dapl_os_memcpy( &ep_ptr->remote_ia_address, &cm_ptr->dst.ia_address, sizeof(ep_ptr->remote_ia_address)); @@ -689,15 +805,26 @@ dapli_socket_accept_usr(DAPL_EP *ep_ptr, local.port = htons(ia_ptr->hca_ptr->port_num); local.lid = htons(dapli_get_lid(ia_ptr->hca_ptr->ib_hca_handle, (uint8_t)ia_ptr->hca_ptr->port_num)); - if (local.lid == 0xffff) + if (local.lid == 0xffff) { + dapl_log(DAPL_DBG_TYPE_ERR, + " ACCEPT_USR: query LID ERR %s -> %s\n", + strerror(errno), + inet_ntoa(((struct sockaddr_in *) + &cm_ptr->dst.ia_address)->sin_addr)); goto bail; + } /* in network order */ if (ibv_query_gid(ia_ptr->hca_ptr->ib_hca_handle, (uint8_t)ia_ptr->hca_ptr->port_num, - 0, - &local.gid)) + 0, &local.gid)) { + dapl_log(DAPL_DBG_TYPE_ERR, + " ACCEPT_USR: query GID ERR %s -> %s\n", + strerror(errno), + inet_ntoa(((struct sockaddr_in *) + &cm_ptr->dst.ia_address)->sin_addr)); goto bail; + } local.ia_address = ia_ptr->hca_ptr->hca_address; local.p_size = htonl(p_size); @@ -709,19 +836,20 @@ dapli_socket_accept_usr(DAPL_EP *ep_ptr, } len = writev(cm_ptr->socket, iovec, (p_size ? 
2:1)); if (len != (p_size + sizeof(ib_qp_cm_t))) { - dapl_dbg_log(DAPL_DBG_TYPE_ERR, - " accept_rtu: ERR %s, wcnt=%d\n", - strerror(errno), len); + dapl_log(DAPL_DBG_TYPE_ERR, + " ACCEPT_USR: ERR %s, wcnt=%d -> %s\n", + strerror(errno), len, + inet_ntoa(((struct sockaddr_in *) + &cm_ptr->dst.ia_address)->sin_addr)); goto bail; } dapl_dbg_log(DAPL_DBG_TYPE_CM, - " accept_usr: local port=0x%x lid=0x%x" - " qpn=0x%x qp_type=%d psize=%d\n", + " ACCEPT_USR: local port=0x%x lid=0x%x" + " qpn=0x%x psize=%d\n", ntohs(local.port), ntohs(local.lid), - ntohl(local.qpn), ntohs(local.qp_type), - ntohl(local.p_size)); + ntohl(local.qpn), ntohl(local.p_size)); dapl_dbg_log(DAPL_DBG_TYPE_CM, - " accept_usr SRC GID subnet %016llx id %016llx\n", + " ACCEPT_USR SRC GID subnet %016llx id %016llx\n", (unsigned long long) cpu_to_be64(local.gid.global.subnet_prefix), (unsigned long long) @@ -733,10 +861,8 @@ dapli_socket_accept_usr(DAPL_EP *ep_ptr, cm_ptr->state = SCM_ACCEPTED; dapl_dbg_log( DAPL_DBG_TYPE_EP," PASSIVE: accepted!\n" ); - dapli_cm_queue(cm_ptr); return DAT_SUCCESS; bail: - dapl_dbg_log(DAPL_DBG_TYPE_ERR," accept_rtu: ERR !QP_RTR_RTS \n"); dapli_cm_destroy(cm_ptr); dapls_ib_reinit_ep(ep_ptr); /* reset QP state */ return DAT_INTERNAL_ERROR; @@ -746,7 +872,7 @@ bail: * PASSIVE: read RTU from active peer, post CONN event */ void -dapli_socket_accept_rtu(dp_ib_cm_handle_t cm_ptr) +dapli_socket_accept_rtu(dp_ib_cm_handle_t cm_ptr) { int len; short rtu_data = 0; @@ -754,9 +880,11 @@ dapli_socket_accept_rtu(dp_ib_cm_handle_t cm_ptr) /* complete handshake after final QP state change */ len = read(cm_ptr->socket, &rtu_data, sizeof(rtu_data)); if (len != sizeof(rtu_data) || ntohs(rtu_data) != 0x0e0f) { - dapl_dbg_log(DAPL_DBG_TYPE_ERR, - " accept_rtu: ERR %s, rcnt=%d rdata=%x\n", - strerror(errno), len, ntohs(rtu_data)); + dapl_log(DAPL_DBG_TYPE_ERR, + " ACCEPT_RTU: ERR %s, rcnt=%d rdata=%x\n", + strerror(errno), len, ntohs(rtu_data), + inet_ntoa(((struct sockaddr_in *) + 
&cm_ptr->dst.ia_address)->sin_addr)); goto bail; } @@ -860,7 +988,7 @@ dapls_ib_disconnect( IN DAPL_EP *ep_ptr, IN DAT_CLOSE_FLAGS close_flags) { - dp_ib_cm_handle_t cm_ptr = ep_ptr->cm_handle; + dp_ib_cm_handle_t cm_ptr = ep_ptr->cm_handle; dapl_dbg_log (DAPL_DBG_TYPE_EP, "dapls_ib_disconnect(ep_handle %p ....)\n", @@ -1045,7 +1173,7 @@ dapls_ib_reject_connection( IN DAT_COUNT psize, IN const DAT_PVOID pdata) { - struct iovec iovec; + struct iovec iovec[2]; dapl_dbg_log (DAPL_DBG_TYPE_EP, " reject(cm %p reason %x, pdata %p, psize %d)\n", @@ -1055,9 +1183,15 @@ dapls_ib_reject_connection( if (cm_ptr->socket >= 0) { cm_ptr->dst.rej = (uint16_t)reason; cm_ptr->dst.rej = htons(cm_ptr->dst.rej); - iovec.iov_base = &cm_ptr->dst; - iovec.iov_len = sizeof(ib_qp_cm_t); - writev(cm_ptr->socket, &iovec, 1); + iovec[0].iov_base = &cm_ptr->dst; + iovec[0].iov_len = sizeof(ib_qp_cm_t); + if (psize) { + iovec[1].iov_base = pdata; + iovec[1].iov_len = psize; + writev(cm_ptr->socket, &iovec[0], 2); + } else + writev(cm_ptr->socket, &iovec[0], 1); + close(cm_ptr->socket); cm_ptr->socket = -1; } @@ -1090,7 +1224,7 @@ dapls_ib_cm_remote_addr ( OUT DAT_SOCK_ADDR6 *remote_ia_address ) { DAPL_HEADER *header; - dp_ib_cm_handle_t ib_cm_handle; + dp_ib_cm_handle_t ib_cm_handle; dapl_dbg_log (DAPL_DBG_TYPE_EP, "dapls_ib_cm_remote_addr(dat_handle %p, ....)\n", @@ -1290,7 +1424,8 @@ void cr_thread(void *arg) { struct dapl_hca *hca_ptr = arg; dp_ib_cm_handle_t cr, next_cr; - int ret,idx; + int opt,ret,idx; + socklen_t opt_len; char rbuf[2]; struct pollfd ufds[SCM_MAX_CONN]; @@ -1330,14 +1465,19 @@ void cr_thread(void *arg) /* Add to ufds for poll, check for immediate work */ ufds[++idx].fd = cr->socket; /* add listen or cr */ - ufds[idx].events = POLLIN; + if (cr->state == SCM_CONN_PENDING) + ufds[idx].events = POLLOUT; + else + ufds[idx].events = POLLIN; /* check socket for event, accept in or connect out */ dapl_dbg_log(DAPL_DBG_TYPE_CM," poll cr=%p, fd=%d,%d\n", cr, cr->socket, 
ufds[idx].fd); dapl_os_unlock(&hca_ptr->ib_trans.lock); ret = poll(&ufds[idx],1,1); - dapl_dbg_log(DAPL_DBG_TYPE_CM," poll wakeup ret=%d cr->st=%d ev=%d fd=%d\n", + dapl_dbg_log(DAPL_DBG_TYPE_CM, + " poll wakeup ret=%d cr->st=%d" + " ev=0x%x fd=%d\n", ret,cr->state,ufds[idx].revents,ufds[idx].fd); /* data on listen, qp exchange, and on disconnect request */ @@ -1345,13 +1485,28 @@ void cr_thread(void *arg) if (cr->socket > 0) { if (cr->state == SCM_LISTEN) dapli_socket_accept(cr); + else if (cr->state == SCM_ACCEPTING) + dapli_socket_accept_data(cr); else if (cr->state == SCM_ACCEPTED) dapli_socket_accept_rtu(cr); - else if (cr->state == SCM_CONN_PENDING) + else if (cr->state == SCM_RTU_PENDING) dapli_socket_connect_rtu(cr); else if (cr->state == SCM_CONNECTED) dapli_socket_disconnect(cr); } + /* connect socket is writable, check status */ + } else if ((ret == 1) && + (ufds[idx].revents & POLLOUT || + ufds[idx].revents & POLLERR)) { + if (cr->state == SCM_CONN_PENDING) { + opt = 0; + ret = getsockopt(cr->socket, SOL_SOCKET, + SO_ERROR, &opt, &opt_len); + if (!ret) + dapli_socket_connected(cr,opt); + else + dapli_socket_connected(cr,EFAULT); + } } else if (ret != 0) { dapl_dbg_log(DAPL_DBG_TYPE_CM, " cr_thread(cr=%p) st=%d poll ERR= %s\n", diff --git a/dapl/openib_scm/dapl_ib_extensions.c b/dapl/openib_scm/dapl_ib_extensions.c index b88e853..692c8a8 100755 --- a/dapl/openib_scm/dapl_ib_extensions.c +++ b/dapl/openib_scm/dapl_ib_extensions.c @@ -82,7 +82,7 @@ dapl_extensions(IN DAT_HANDLE dat_handle, IN va_list args) { DAT_EP_HANDLE ep; - DAT_IB_ADDR_HANDLE *ah; + DAT_IB_ADDR_HANDLE *ah = NULL; DAT_LMR_TRIPLET *lmr_p; DAT_DTO_COOKIE cookie; const DAT_RMR_TRIPLET *rmr_p; diff --git a/dapl/openib_scm/dapl_ib_util.c b/dapl/openib_scm/dapl_ib_util.c index f3874ca..0f24737 100644 --- a/dapl/openib_scm/dapl_ib_util.c +++ b/dapl/openib_scm/dapl_ib_util.c @@ -444,8 +444,6 @@ DAT_RETURN dapls_ib_query_hca ( ia_attr->transport_attr = NULL; ia_attr->num_vendor_attr = 0; 
ia_attr->vendor_attr = NULL; - ia_attr->max_iov_segments_per_rdma_read = dev_attr.max_sge; - ia_attr->max_iov_segments_per_rdma_write = dev_attr.max_sge; #ifdef DAT_EXTENSIONS ia_attr->extension_supported = DAT_EXTENSION_IB; ia_attr->extension_version = DAT_IB_EXTENSION_VERSION; diff --git a/dapl/openib_scm/dapl_ib_util.h b/dapl/openib_scm/dapl_ib_util.h index bd3ea83..deb6be3 100644 --- a/dapl/openib_scm/dapl_ib_util.h +++ b/dapl/openib_scm/dapl_ib_util.h @@ -107,7 +107,9 @@ typedef enum scm_state SCM_INIT, SCM_LISTEN, SCM_CONN_PENDING, + SCM_RTU_PENDING, SCM_ACCEPTING, + SCM_ACCEPTING_DATA, SCM_ACCEPTED, SCM_REJECTED, SCM_CONNECTED, -- 1.5.2.5 From arlin.r.davis at intel.com Thu Aug 14 17:06:49 2008 From: arlin.r.davis at intel.com (Arlin Davis) Date: Thu, 14 Aug 2008 17:06:49 -0700 Subject: [ofa-general] [PATCH 3/5] [uDAPL v2] dapl scm: add mtu adjustments via environment, default = 1024 Message-ID: <000801c8fe6a$d1682130$8963fe0a@amr.corp.intel.com> DAPL_IB_MTU adjusts path mtu setting for RC qp's. Default setting is min of 1024 and active mtu on IB device. 
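The MTU selection described above can be sketched as follows. This is an illustrative stand-alone version, not the library code: the enum is a local stand-in that mirrors the numeric values of enum ibv_mtu from libibverbs (256 = 1 through 4096 = 5), the function names are hypothetical, and the environment value is passed in as a string so the logic can be exercised without an HCA. Parsing with atoi() is an assumption; the patch reads the variable through dapl_os_get_env_val().

```c
#include <stdlib.h>

/* Local stand-in mirroring enum ibv_mtu so the sketch builds without
 * <infiniband/verbs.h>. */
enum sketch_mtu {
    SKETCH_MTU_256  = 1,
    SKETCH_MTU_512  = 2,
    SKETCH_MTU_1024 = 3,
    SKETCH_MTU_2048 = 4,
    SKETCH_MTU_4096 = 5
};

/* Map a byte count to the enum; anything unrecognized falls back to 1024,
 * matching the default case in the patch's dapl_ib_mtu(). */
static enum sketch_mtu sketch_mtu_from_bytes(int bytes)
{
    switch (bytes) {
    case 256:  return SKETCH_MTU_256;
    case 512:  return SKETCH_MTU_512;
    case 1024: return SKETCH_MTU_1024;
    case 2048: return SKETCH_MTU_2048;
    case 4096: return SKETCH_MTU_4096;
    default:   return SKETCH_MTU_1024;
    }
}

/* Convert the enum back to bytes, as the patch's debug log does (128 << mtu). */
static int sketch_mtu_to_bytes(enum sketch_mtu mtu)
{
    return 128 << mtu;
}

/* Effective path MTU: the smaller of the requested value (default 1024 when
 * the variable is unset) and the port's active MTU. */
static enum sketch_mtu sketch_effective_mtu(const char *env_value,
                                            enum sketch_mtu active)
{
    int requested = env_value ? atoi(env_value) : 1024;
    enum sketch_mtu want = sketch_mtu_from_bytes(requested);

    return (want < active) ? want : active;
}
```

With an active MTU of 2048, an unset variable yields 1024, a request for 4096 is capped at 2048, and an unsupported value such as 300 falls back to 1024.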
Signed-off by: Arlin Davis ardavis at ichips.intel.com --- dapl/openib_scm/dapl_ib_util.c | 27 +++++++++++++++++++++++---- dapl/openib_scm/dapl_ib_util.h | 1 + 2 files changed, 24 insertions(+), 4 deletions(-) diff --git a/dapl/openib_scm/dapl_ib_util.c b/dapl/openib_scm/dapl_ib_util.c index 0f24737..f7ee58b 100644 --- a/dapl/openib_scm/dapl_ib_util.c +++ b/dapl/openib_scm/dapl_ib_util.c @@ -64,6 +64,18 @@ static const char rcsid[] = "$Id: $"; int g_dapl_loopback_connection = 0; int g_scm_pipe[2]; +enum ibv_mtu dapl_ib_mtu(int mtu) +{ + switch (mtu) { + case 256: return IBV_MTU_256; + case 512: return IBV_MTU_512; + case 1024: return IBV_MTU_1024; + case 2048: return IBV_MTU_2048; + case 4096: return IBV_MTU_4096; + default: return IBV_MTU_1024; + } +} + /* just get IP address for hostname */ DAT_RETURN getipaddr( char *addr, int addr_len) { @@ -223,6 +235,8 @@ found: dapl_os_get_env_val("DAPL_HOP_LIMIT", SCM_HOP_LIMIT); hca_ptr->ib_trans.tclass = dapl_os_get_env_val("DAPL_TCLASS", SCM_TCLASS); + hca_ptr->ib_trans.mtu = + dapl_ib_mtu(dapl_os_get_env_val("DAPL_IB_MTU", SCM_IB_MTU)); /* initialize cq_lock */ dat_status = dapl_os_lock_init(&hca_ptr->ib_trans.cq_lock); @@ -448,19 +462,24 @@ DAT_RETURN dapls_ib_query_hca ( ia_attr->extension_supported = DAT_EXTENSION_IB; ia_attr->extension_version = DAT_IB_EXTENSION_VERSION; #endif - hca_ptr->ib_trans.mtu = port_attr.active_mtu; + hca_ptr->ib_trans.mtu = DAPL_MIN(port_attr.active_mtu, + hca_ptr->ib_trans.mtu); hca_ptr->ib_trans.ack_timer = DAPL_MAX(dev_attr.local_ca_ack_delay, hca_ptr->ib_trans.ack_timer); dapl_dbg_log (DAPL_DBG_TYPE_UTIL, - " query_hca: (%x.%x) ep %d ep_q %d evd %d evd_q %d\n", + " query_hca: (%x.%x) ep %d ep_q %d evd %d" + " evd_q %d mtu %d\n", ia_attr->hardware_version_major, ia_attr->hardware_version_minor, ia_attr->max_eps, ia_attr->max_dto_per_ep, - ia_attr->max_evds, ia_attr->max_evd_qlen ); + ia_attr->max_evds, ia_attr->max_evd_qlen, + 128 << hca_ptr->ib_trans.mtu); + dapl_dbg_log 
(DAPL_DBG_TYPE_UTIL, - " query_hca: msg %llu rdma %llu iov %d lmr %d rmr %d ack_time %d\n", + " query_hca: msg %llu rdma %llu iov %d lmr %d rmr %d" + " ack_time %d\n", ia_attr->max_message_size, ia_attr->max_rdma_size, ia_attr->max_iov_segments_per_dto, ia_attr->max_lmrs, ia_attr->max_rmrs,hca_ptr->ib_trans.ack_timer ); diff --git a/dapl/openib_scm/dapl_ib_util.h b/dapl/openib_scm/dapl_ib_util.h index deb6be3..bd702a9 100644 --- a/dapl/openib_scm/dapl_ib_util.h +++ b/dapl/openib_scm/dapl_ib_util.h @@ -196,6 +196,7 @@ typedef struct ibv_comp_channel *ib_wait_obj_handle_t; #define SCM_ACK_RETRY 7 /* 3 bits, 7 * 134ms = 940ms */ #define SCM_RNR_TIMER 28 /* 5 bits, 28 == 163ms, 31 == 491ms */ #define SCM_RNR_RETRY 7 /* 3 bits, 7 == infinite */ +#define SCM_IB_MTU 1024 /* Global routing defaults */ #define SCM_GLOBAL 0 /* global routing is disabled */ -- 1.5.2.5 From arlin.r.davis at intel.com Thu Aug 14 17:06:56 2008 From: arlin.r.davis at intel.com (Arlin Davis) Date: Thu, 14 Aug 2008 17:06:56 -0700 Subject: [ofa-general] [PATCH 4/5] [uDAPL v2] dapl scm: change IB RC qp inline and timer defaults Message-ID: <000901c8fe6a$d5d23210$8963fe0a@amr.corp.intel.com> An RNR NAK can result from any operation, not just a message send to a receiver that is not ready. The RNR timer default is much too large for that case. Also adjust the inline send default. 
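The ACK timer figures quoted in this patch's header comments follow from the formula given there, 4.096us * 2^ack_timer. A small sketch of that arithmetic is below; the function names are hypothetical, and integer nanoseconds are used (4.096 us = 4096 ns) to avoid floating-point rounding. Multiplying by the retry count simply mirrors the "7 * 268ms" arithmetic in the comment and is not a statement about how a particular HCA schedules retries. The RNR timer is a separate encoded field, not this formula, so it is not modeled here.

```c
#include <stdint.h>

/* ACK timeout in nanoseconds for a 5-bit timer field:
 * 4.096 us * 2^timer == 4096 ns << timer. */
static uint64_t sketch_ack_timeout_ns(unsigned int timer)
{
    return (uint64_t)4096 << (timer & 0x1f);   /* field is 5 bits wide */
}

/* Timeout multiplied by the retry count, as in the header comment. */
static uint64_t sketch_ack_retry_window_ns(unsigned int timer,
                                           unsigned int retries)
{
    return sketch_ack_timeout_ns(timer) * retries;
}
```

For example, a timer value of 15 gives 134,217,728 ns (about 134 ms), 16 gives 268,435,456 ns (about 268 ms), and 16 with 7 retries gives roughly 1.88 seconds.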
Signed-off by: Arlin Davis ardavis at ichips.intel.com --- dapl/openib_scm/dapl_ib_util.h | 10 +++++----- 1 files changed, 5 insertions(+), 5 deletions(-) diff --git a/dapl/openib_scm/dapl_ib_util.h b/dapl/openib_scm/dapl_ib_util.h index bd702a9..4e75d2c 100644 --- a/dapl/openib_scm/dapl_ib_util.h +++ b/dapl/openib_scm/dapl_ib_util.h @@ -186,16 +186,16 @@ typedef struct ibv_comp_channel *ib_wait_obj_handle_t; #define IB_INVALID_HANDLE NULL /* inline send rdma threshold */ -#define INLINE_SEND_DEFAULT 128 +#define INLINE_SEND_DEFAULT 200 /* qkey for UD QP's */ #define SCM_UD_QKEY 0x78654321 /* RC timer - retry count defaults */ -#define SCM_ACK_TIMER 15 /* 5 bits, 4.096us*2^ack_timer. 15 == 134ms */ -#define SCM_ACK_RETRY 7 /* 3 bits, 7 * 134ms = 940ms */ -#define SCM_RNR_TIMER 28 /* 5 bits, 28 == 163ms, 31 == 491ms */ -#define SCM_RNR_RETRY 7 /* 3 bits, 7 == infinite */ +#define SCM_ACK_TIMER 16 /* 5 bits, 4.096us*2^ack_timer. 16== 268ms */ +#define SCM_ACK_RETRY 7 /* 3 bits, 7 * 268ms = 1.8 seconds */ +#define SCM_RNR_TIMER 12 /* 5 bits, 12 =.64ms, 28 =163ms, 31 =491ms */ +#define SCM_RNR_RETRY 7 /* 3 bits, 7 == infinite */ #define SCM_IB_MTU 1024 /* Global routing defaults */ -- 1.5.2.5 From arlin.r.davis at intel.com Thu Aug 14 17:07:04 2008 From: arlin.r.davis at intel.com (Arlin Davis) Date: Thu, 14 Aug 2008 17:07:04 -0700 Subject: [ofa-general] [PATCH 5/5] [uDAPL v2] dapl scm: use correct device attribute for max_rdma_read_out, max_qp_init_rd_atom Message-ID: <000a01c8fe6a$dae33150$8963fe0a@amr.corp.intel.com> Signed-off by: Arlin Davis ardavis at ichips.intel.com --- dapl/openib_scm/dapl_ib_util.c | 6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/dapl/openib_scm/dapl_ib_util.c b/dapl/openib_scm/dapl_ib_util.c index f7ee58b..11294fa 100644 --- a/dapl/openib_scm/dapl_ib_util.c +++ b/dapl/openib_scm/dapl_ib_util.c @@ -436,9 +436,9 @@ DAT_RETURN dapls_ib_query_hca ( ia_attr->max_eps = dev_attr.max_qp; ia_attr->max_dto_per_ep = 
dev_attr.max_qp_wr; ia_attr->max_rdma_read_in = dev_attr.max_qp_rd_atom; - ia_attr->max_rdma_read_out = dev_attr.max_qp_rd_atom; + ia_attr->max_rdma_read_out = dev_attr.max_qp_init_rd_atom; ia_attr->max_rdma_read_per_ep_in = dev_attr.max_qp_rd_atom; - ia_attr->max_rdma_read_per_ep_out = dev_attr.max_qp_rd_atom; + ia_attr->max_rdma_read_per_ep_out = dev_attr.max_qp_init_rd_atom; ia_attr->max_rdma_read_per_ep_in_guaranteed = DAT_TRUE; ia_attr->max_rdma_read_per_ep_out_guaranteed = DAT_TRUE; ia_attr->max_evds = dev_attr.max_cq; @@ -494,7 +494,7 @@ DAT_RETURN dapls_ib_query_hca ( ep_attr->max_recv_iov = dev_attr.max_sge; ep_attr->max_request_iov = dev_attr.max_sge; ep_attr->max_rdma_read_in = dev_attr.max_qp_rd_atom; - ep_attr->max_rdma_read_out= dev_attr.max_qp_rd_atom; + ep_attr->max_rdma_read_out= dev_attr.max_qp_init_rd_atom; ep_attr->max_rdma_read_iov= dev_attr.max_sge; ep_attr->max_rdma_write_iov= dev_attr.max_sge; dapl_dbg_log (DAPL_DBG_TYPE_UTIL, -- 1.5.2.5 From rajib.majumder at credit-suisse.com Thu Aug 14 21:38:06 2008 From: rajib.majumder at credit-suisse.com (Majumder, Rajib) Date: Fri, 15 Aug 2008 12:38:06 +0800 Subject: [ofa-general] IB Setup Message-ID: <0175FAC12977B047809C1BACA25881AD02A8C9A1@ESNG17P32002A.csfb.cs-group.com> Hi, I am running OFED 1.3 in SLERT 10 SP2 on ConnectX device. I am getting a Destination Host Unreachable error when I ping the remote HCA port. Pinging the local ib0 interface works fine. Any idea, what's wrong with the config? 
Thanks Rajib ============================================================================== Please access the attached hyperlink for an important electronic communications disclaimer: http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html ============================================================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From rcummins at sgi.com Thu Aug 14 21:42:28 2008 From: rcummins at sgi.com (Robert Cummins) Date: Thu, 14 Aug 2008 21:42:28 -0700 Subject: [ofa-general] IB Setup In-Reply-To: <0175FAC12977B047809C1BACA25881AD02A8C9A1@ESNG17P32002A.csfb.cs-group.com> References: <0175FAC12977B047809C1BACA25881AD02A8C9A1@ESNG17P32002A.csfb.cs-group.com> Message-ID: <66DED8287791CE4A973228C522229811037E0812@mtv-amer001e--3.americas.sgi.com> Have you verified that your sm is up and running? Do you have a valid lid for the interface? Can you lidtrace between two devices on the fabric? Does the remote hca have a lid? ________________________________ From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Majumder, Rajib Sent: Thursday, August 14, 2008 10:38 PM To: general at lists.openfabrics.org Subject: [ofa-general] IB Setup Hi, I am running OFED 1.3 in SLERT 10 SP2 on ConnectX device. I am getting a Destination Host Unreachable error when I ping the remote HCA port. Pinging the local ib0 interface works fine. Any idea, what's wrong with the config? Thanks Rajib ======================================================================== ====== Please access the attached hyperlink for an important electronic communications disclaimer: http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html ======================================================================== ====== -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rajib.majumder at credit-suisse.com Thu Aug 14 21:46:39 2008 From: rajib.majumder at credit-suisse.com (Majumder, Rajib) Date: Fri, 15 Aug 2008 12:46:39 +0800 Subject: [ofa-general] IB Setup In-Reply-To: <66DED8287791CE4A973228C522229811037E0812@mtv-amer001e--3.americas.sgi.com> References: <0175FAC12977B047809C1BACA25881AD02A8C9A1@ESNG17P32002A.csfb.cs-group.com> <66DED8287791CE4A973228C522229811037E0812@mtv-amer001e--3.americas.sgi.com> Message-ID: <0175FAC12977B047809C1BACA25881AD02A8C9A2@ESNG17P32002A.csfb.cs-group.com> #ibv_devinfo hca_id: mlx4_0 fw_ver: 2.3.000 node_guid: 001e:0bff:ff84:58fc sys_image_guid: 001e:0bff:ff84:58ff vendor_id: 0x02c9 vendor_part_id: 25418 hw_ver: 0xA0 board_id: HP_08B0000001 phys_port_cnt: 2 port: 1 state: PORT_DOWN (1) max_mtu: 2048 (4) active_mtu: 2048 (4) sm_lid: 0 port_lid: 0 port_lmc: 0x00 port: 2 state: PORT_ACTIVE (4) max_mtu: 2048 (4) active_mtu: 2048 (4) sm_lid: 1 port_lid: 3 port_lmc: 0x00 This is the info for the remote HCA ports. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rcummins at sgi.com Thu Aug 14 21:52:51 2008 From: rcummins at sgi.com (Robert Cummins) Date: Thu, 14 Aug 2008 21:52:51 -0700 Subject: [ofa-general] IB Setup In-Reply-To: <0175FAC12977B047809C1BACA25881AD02A8C9A2@ESNG17P32002A.csfb.cs-group.com> References: <0175FAC12977B047809C1BACA25881AD02A8C9A1@ESNG17P32002A.csfb.cs-group.com> <66DED8287791CE4A973228C522229811037E0812@mtv-amer001e--3.americas.sgi.com> <0175FAC12977B047809C1BACA25881AD02A8C9A2@ESNG17P32002A.csfb.cs-group.com> Message-ID: <66DED8287791CE4A973228C522229811037E0813@mtv-amer001e--3.americas.sgi.com> What does the local hca look like? What are the ip addresses of the local ib interface and remote ib interface? 
-------------- next part -------------- An HTML attachment was scrubbed... URL: From rajib.majumder at credit-suisse.com Thu Aug 14 21:59:05 2008 From: rajib.majumder at credit-suisse.com (Majumder, Rajib) Date: Fri, 15 Aug 2008 12:59:05 +0800 Subject: [ofa-general] IB Setup In-Reply-To: <66DED8287791CE4A973228C522229811037E0813@mtv-amer001e--3.americas.sgi.com> References: <0175FAC12977B047809C1BACA25881AD02A8C9A1@ESNG17P32002A.csfb.cs-group.com> <66DED8287791CE4A973228C522229811037E0812@mtv-amer001e--3.americas.sgi.com> <0175FAC12977B047809C1BACA25881AD02A8C9A2@ESNG17P32002A.csfb.cs-group.com> <66DED8287791CE4A973228C522229811037E0813@mtv-amer001e--3.americas.sgi.com> Message-ID: <0175FAC12977B047809C1BACA25881AD02A8C9A3@ESNG17P32002A.csfb.cs-group.com> local interface -------------------- ib0 Link encap:UNSPEC HWaddr 80-00-00-48-FE-80-00-00-00-00-00-00-00-00-00-00 inet addr:11.4.8.35 Bcast:11.4.255.255 Mask:255.255.0.0 UP BROADCAST MULTICAST MTU:65520 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:256 RX bytes:0 (0.0 b) TX bytes:0 (0.0 b) ib1 Link encap:UNSPEC HWaddr 80-00-00-49-FE-80-00-00-00-00-00-00-00-00-00-00 inet6 addr: fe80::21b:78ff:ff34:4282/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:65520 Metric:1 RX packets:0 errors:0 
dropped:0 overruns:0 frame:0 TX packets:1 errors:0 dropped:5 overruns:0 carrier:0 collisions:0 txqueuelen:256 RX bytes:0 (0.0 b) TX bytes:68 (68.0 b) remote interface ----------------------- ib0 Link encap:UNSPEC HWaddr 80-00-00-48-FE-80-00-00-00-00-00-00-00-00-00-00 inet addr:11.4.8.36 Bcast:11.4.255.255 Mask:255.255.0.0 UP BROADCAST MULTICAST MTU:65520 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:256 RX bytes:0 (0.0 b) TX bytes:0 (0.0 b) ib1 Link encap:UNSPEC HWaddr 80-00-00-49-FE-80-00-00-00-00-00-00-00-00-00-00 inet6 addr: fe80::21e:bff:ff84:58fe/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:65520 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:1 errors:0 dropped:5 overruns:0 carrier:0 collisions:0 txqueuelen:256 RX bytes:0 (0.0 b) TX bytes:68 (68.0 b) local interface -------------------- #ibv_devinfo hca_id: mlx4_0 fw_ver: 2.2.000 node_guid: 001b:78ff:ff34:4280 sys_image_guid: 001b:78ff:ff34:4283 vendor_id: 0x02c9 vendor_part_id: 25418 hw_ver: 0xA0 board_id: HP_08B0000001 phys_port_cnt: 2 port: 1 state: PORT_DOWN (1) max_mtu: 2048 (4) active_mtu: 2048 (4) sm_lid: 0 port_lid: 0 port_lmc: 0x00 port: 2 state: PORT_ACTIVE (4) max_mtu: 2048 (4) active_mtu: 2048 (4) sm_lid: 1 port_lid: 1 port_lmc: 0x00 # ping 11.4.8.36 PING 11.4.8.36 (11.4.8.36) 56(84) bytes of data. From 11.4.8.35: icmp_seq=2 Destination Host Unreachable Do you know why port 1 is down? -------------- next part -------------- An HTML attachment was scrubbed... URL: From rcummins at sgi.com Thu Aug 14 22:01:34 2008 From: rcummins at sgi.com (Robert Cummins) Date: Thu, 14 Aug 2008 22:01:34 -0700 Subject: [ofa-general] IB Setup In-Reply-To: <0175FAC12977B047809C1BACA25881AD02A8C9A3@ESNG17P32002A.csfb.cs-group.com> References: <0175FAC12977B047809C1BACA25881AD02A8C9A1@ESNG17P32002A.csfb.cs-group.com> <66DED8287791CE4A973228C522229811037E0812@mtv-amer001e--3.americas.sgi.com> <0175FAC12977B047809C1BACA25881AD02A8C9A2@ESNG17P32002A.csfb.cs-group.com> <66DED8287791CE4A973228C522229811037E0813@mtv-amer001e--3.americas.sgi.com> <0175FAC12977B047809C1BACA25881AD02A8C9A3@ESNG17P32002A.csfb.cs-group.com> Message-ID: <66DED8287791CE4A973228C522229811037E0814@mtv-amer001e--3.americas.sgi.com> Okay, from what you sent before you have ib1 cabled on the remote system but ib0 is assigned the ip address. 
Either move the cable to the other interface on the remote system or ifconfig ib1 to 11.4.8.36/16 -------------- next part -------------- An HTML attachment was scrubbed... 
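The diagnosis in the reply above (an IPv4 address configured on an interface whose port is down, while the cabled port has no address) can be written as a small check. This is only an illustrative sketch: `struct demo_ipoib_if` and `demo_find_misplaced_addr` are hypothetical names, the fields are filled in by hand from the `ibv_devinfo` and `ifconfig` output, and the pairing of ib0 with port 1 and ib1 with port 2 is an assumption taken from the thread, not something the code discovers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical hand-filled summary of one IPoIB interface.
 * This is not an OFED data structure. */
struct demo_ipoib_if {
    const char *name;   /* interface name, e.g. "ib0" */
    bool port_active;   /* ibv_devinfo reports PORT_ACTIVE for its port */
    bool has_ipv4;      /* ifconfig shows an "inet addr" on it */
};

/* Return the first interface that carries an IPv4 address although its
 * port is not active, or NULL when no such interface exists. */
const struct demo_ipoib_if *
demo_find_misplaced_addr(const struct demo_ipoib_if *ifs, size_t n)
{
    assert(ifs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (ifs[i].has_ipv4 && !ifs[i].port_active)
            return &ifs[i];
    }
    return NULL;
}
```

Fed with the remote host's values from the thread (ib0: port down, address 11.4.8.36; ib1: port active, no IPv4 address), the check points at ib0. The local ifconfig output shows the same pattern for 11.4.8.35, so the same check may be worth applying on both hosts.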
URL: From dotanba at gmail.com Fri Aug 15 07:11:32 2008 From: dotanba at gmail.com (Dotan Barak) Date: Fri, 15 Aug 2008 16:11:32 +0200 Subject: [ofa-general] ***SPAM*** Accessing RDMA Memory Locally In-Reply-To: <9870a2060808141424s3f2da738y372741dae79187e1@mail.gmail.com> References: <9870a2060808141424s3f2da738y372741dae79187e1@mail.gmail.com> Message-ID: <48A58E94.1020902@gmail.com> Adrien Guillon wrote: > If I allocate memory to be accessible by others using RDMA operations, > my understanding is that I use RDMA operations myself to access that > memory locally. Is that correct? No, the memory is ordinary local memory of your process: you allocate it with malloc (or use static memory) and then register it. > Can I access that memory directly with pointers in the case of a node > accessing its own memory location? Are local access RDMA operations > efficient? You can read the buffers (that remote nodes RDMA Write to) using memcpy/pointer access or any other way that you can use memory buffers in your process. After the RDMA Write ends and the data is placed in the remote buffer, one can treat the local buffer as if IB wasn't used. 
Dotan > > Thanks, > > AJ > ------------------------------------------------------------------------ > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From vlad at lists.openfabrics.org Fri Aug 15 02:54:09 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Fri, 15 Aug 2008 02:54:09 -0700 (PDT) Subject: [ofa-general] ofa_1_4_kernel 20080815-0200 daily build status Message-ID: <20080815095410.089E8E60E52@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_4/linux-2.6.git git_branch: ofed_kernel Common build parameters: Passed: Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.16 Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.17 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.22.5-31-default Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on ia64 with linux-2.6.16 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.17 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.26 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.25 Passed on ppc64 with linux-2.6.16 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.17 Passed on ppc64 with linux-2.6.24 Passed on 
ppc64 with linux-2.6.19 Failed: Build failed on x86_64 with linux-2.6.16.43-0.3-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.43-0.3-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.43-0.3-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.43-0.3-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.16.21-0.8-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.21-0.8-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.21-0.8-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.21-0.8-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.16.60-0.21-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.60-0.21-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.60-0.21-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.60-0.21-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-8.el5 Log: from /home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from 
/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-8.el5_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-8.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-8.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-53.el5 Log: from /home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-53.el5_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-53.el5_x86_64_check] Error 2 make[1]: Leaving directory 
`/home/vlad/kernel.org/x86_64/linux-2.6.18-53.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-93.el5 Log: from /home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-93.el5_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-93.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-93.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-55.ELsmp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: include/linux/skbuff.h: In function 'skb_add_data': include/linux/skbuff.h:1041: warning: pointer targets in passing argument 2 of 'csum_partial_copy_from_user' differ in signedness make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.9-55.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-55.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-42.ELsmp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: include/linux/skbuff.h: In function 'skb_add_data': include/linux/skbuff.h:1013: warning: pointer targets in passing argument 2 of 'csum_partial_copy_from_user' differ in signedness make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.9-42.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-42.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.16.21-0.8-default Log: from /home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.21-0.8-default_ia64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.16.21-0.8-default_ia64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.16.21-0.8-default' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.18-8.el5 Log: from /home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-8.el5_ppc64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080815-0200_linux-2.6.18-8.el5_ppc64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.18-8.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- From swise at opengridcomputing.com Fri Aug 15 04:13:14 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Fri, 15 Aug 2008 06:13:14 
-0500
Subject: [ofa-general] [PATCH 1/5] [uDAPL v2] dapl scm: update max_rdma_read_iov, max_rdma_write_iov EP attributes during query
In-Reply-To: <000601c8fe6a$cc1b9f90$8963fe0a@amr.corp.intel.com>
References: <000601c8fe6a$cc1b9f90$8963fe0a@amr.corp.intel.com>
Message-ID: <48A564CA.4080603@opengridcomputing.com>

Arlin Davis wrote:
> >From 7e25c0f21d755cce3aa7aff993fb0baddaafc0e8 Mon Sep 17 00:00:00 2001
> From: Arlin Davis
> Patch set for uDAPL v2 socket cm provider improvements.
>
> Signed-off by: Arlin Davis ardavis at ichips.intel.com
> ---
>  dapl/openib_scm/dapl_ib_util.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/dapl/openib_scm/dapl_ib_util.c b/dapl/openib_scm/dapl_ib_util.c
> index 43f85ac..f3874ca 100644
> --- a/dapl/openib_scm/dapl_ib_util.c
> +++ b/dapl/openib_scm/dapl_ib_util.c
> @@ -478,6 +478,8 @@ DAT_RETURN dapls_ib_query_hca (
>  		ep_attr->max_request_iov = dev_attr.max_sge;
>  		ep_attr->max_rdma_read_in = dev_attr.max_qp_rd_atom;
>  		ep_attr->max_rdma_read_out= dev_attr.max_qp_rd_atom;
> +		ep_attr->max_rdma_read_iov= dev_attr.max_sge;
> +		ep_attr->max_rdma_write_iov= dev_attr.max_sge;
>  		dapl_dbg_log (DAPL_DBG_TYPE_UTIL,
>  			" query_hca: MAX msg %llu mtu %d dto %d iov %d"
>  			" rdma i%d,o%d\n",
>

This breaks iwarp which only allows 1 SGE for the local SGL in an rdma read. I thought dev.max_qp_rd_atom indicated this value, not max_sge.

So either this patch is wrong, or the rdma device attrs need to be enhanced to separate max_read_sge as a stand-alone attr.

Steve.
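The constraint described in the message above can be sketched as a transport check made when the endpoint attributes are filled in. This is only a minimal, self-contained model: the enum, the struct and both function names are hypothetical and are not taken from uDAPL or libibverbs. It assumes, as stated above, that iWARP permits a single local SGE for an RDMA read, while other work requests may use up to max_sge.

```c
/* Hypothetical stand-ins for a provider's transport type and device limits. */
enum transport { TRANSPORT_IB, TRANSPORT_IWARP };

struct dev_limits {
    enum transport transport; /* which fabric the device speaks */
    int max_sge;              /* scatter/gather entries per work request */
};

/* Local SGL length usable for an RDMA read.
 * Assumption from the thread: iWARP allows only one SGE here. */
static int max_rdma_read_iov(const struct dev_limits *dev)
{
    if (dev->transport == TRANSPORT_IWARP)
        return 1;
    return dev->max_sge;
}

/* RDMA writes are not restricted in the same way, so max_sge applies. */
static int max_rdma_write_iov(const struct dev_limits *dev)
{
    return dev->max_sge;
}
```

With a device reporting max_sge = 4, this model returns 4 for reads on IB and 1 for reads on iWARP, and 4 for writes in both cases.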
From parhizj at ufl.edu Fri Aug 15 10:08:08 2008
From: parhizj at ufl.edu (John Parhizgari)
Date: Fri, 15 Aug 2008 13:08:08 -0400
Subject: [ofa-general] ibdiagnet -pm Patch
Message-ID: <48A5B7F8.3030405@ufl.edu>

This is a patch for a problem that has been in the ibdiagnet tools for 1.3.1. The problem is seen when running "ibdiagnet -pm": the output pm file, ibdiagnet.pm, does not show all the error counters that were actually retrieved, specifically port_rcv_remote_physical_errors and port_rcv_switch_relay_errors.

This problem has been noticed in earlier ofed versions as well.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ibdiagnet-ofed131.patch
URL:

From caitlin.bestler at gmail.com Fri Aug 15 10:58:52 2008
From: caitlin.bestler at gmail.com (Caitlin Bestler)
Date: Fri, 15 Aug 2008 10:58:52 -0700
Subject: [ofa-general] ***SPAM*** Accessing RDMA Memory Locally
In-Reply-To: <48A58E94.1020902@gmail.com>
References: <9870a2060808141424s3f2da738y372741dae79187e1@mail.gmail.com> <48A58E94.1020902@gmail.com>
Message-ID: <469958e00808151058r476429f9i318c26bd59d36596@mail.gmail.com>

On Fri, Aug 15, 2008 at 7:11 AM, Dotan Barak wrote:
>>
>> Can I access that memory directly with pointers in the case of a node
>> accessing its own memory location? Are local access RDMA operations
>> efficient?
>
> You can read the buffers (that remote nodes RDMA Write to) using
> memcpy/pointer access or any other way that you can use memory buffers in
> your process.
>
> After the RDMA Write ends and the data is placed in the remote buffer, one
> can treat the local buffer as if IB wasn't used ....
>

After the appropriate completion is received the application can treat the local buffer as though RDMA was not used. But the application generally should not attempt to infer when the RDMA Write ends, but rather rely on completions.

Completions are simple. Figuring out possible re-orderings, IB vs iWARP differences and device-specific cache interactions is tricky.

It's ultimately like having a co-processor. You can use the memory as normal, but you should rely on the agreed upon handshake to determine when it is safe to do so. For RDMA, those are the completions.

From rdreier at cisco.com Fri Aug 15 11:31:39 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Fri, 15 Aug 2008 11:31:39 -0700
Subject: [ofa-general] Re: [PATCH 2/2] IB/ipath - Fixed incorrect check for max physical address in TID
In-Reply-To: <20080814000811.7874.52077.stgit@eng-46.mv.qlogic.com> (Ralph Campbell's message of "Wed, 13 Aug 2008 17:08:11 -0700")
References: <20080814000800.7874.6686.stgit@eng-46.mv.qlogic.com> <20080814000811.7874.52077.stgit@eng-46.mv.qlogic.com>
Message-ID:

thanks, applied 1 and 2.

 - R.

From arlin.r.davis at intel.com Fri Aug 15 12:39:12 2008
From: arlin.r.davis at intel.com (Davis, Arlin R)
Date: Fri, 15 Aug 2008 12:39:12 -0700
Subject: [ofa-general] [PATCH 1/5] [uDAPL v2] dapl scm:update max_rdma_read_iov, max_rdma_write_iov EP attributes during query
In-Reply-To: <48A564CA.4080603@opengridcomputing.com>
References: <000601c8fe6a$cc1b9f90$8963fe0a@amr.corp.intel.com> <48A564CA.4080603@opengridcomputing.com>
Message-ID:

>> @@ -478,6 +478,8 @@ DAT_RETURN dapls_ib_query_hca (
>>  		ep_attr->max_request_iov = dev_attr.max_sge;
>>  		ep_attr->max_rdma_read_in = dev_attr.max_qp_rd_atom;
>>  		ep_attr->max_rdma_read_out= dev_attr.max_qp_rd_atom;
>> +		ep_attr->max_rdma_read_iov= dev_attr.max_sge;
>> +		ep_attr->max_rdma_write_iov= dev_attr.max_sge;
>>  		dapl_dbg_log (DAPL_DBG_TYPE_UTIL,
>>  			" query_hca: MAX msg %llu mtu %d dto %d iov %d"
>>  			" rdma i%d,o%d\n",
>>
>This breaks iwarp which only allows 1 SGE for the local SGL in an rdma
>read. I thought dev.max_qp_rd_atom indicated this value, not
>max_sge.
>So either this patch is wrong, or the rdma device attrs need to be
>enhanced to separate max_read_sge as a stand-alone attr.

max_qp_rd_atom - max number of rdma reads/atomics outstanding per QP as a target.
max_qp_init_rd_atom - max number of rdma reads/atomics outstanding per QP as initiator.
max_sge - max number of scatter/gather entries per work request for all work requests other than reliable datagram.

At the verbs level, there is no separate SGE value provided for rdma_reads over other work request types.

The uDAPL socket cm providers only support IB so this is not an issue with this patch. However, with uDAPL rdma_cm providers, I guess we need to add a device check on this query and return 1 if iWARP.

Is this an iWARP specification or implementation issue?

-arlin

From swise at opengridcomputing.com Fri Aug 15 13:39:28 2008
From: swise at opengridcomputing.com (Steve Wise)
Date: Fri, 15 Aug 2008 15:39:28 -0500
Subject: [ofa-general] [PATCH 1/5] [uDAPL v2] dapl scm:update max_rdma_read_iov, max_rdma_write_iov EP attributes during query
In-Reply-To:
References: <000601c8fe6a$cc1b9f90$8963fe0a@amr.corp.intel.com> <48A564CA.4080603@opengridcomputing.com>
Message-ID: <48A5E980.8000701@opengridcomputing.com>

Davis, Arlin R wrote:
>
>
>
>>> @@ -478,6 +478,8 @@ DAT_RETURN dapls_ib_query_hca (
>>>  		ep_attr->max_request_iov = dev_attr.max_sge;
>>>  		ep_attr->max_rdma_read_in = dev_attr.max_qp_rd_atom;
>>>  		ep_attr->max_rdma_read_out= dev_attr.max_qp_rd_atom;
>>> +		ep_attr->max_rdma_read_iov= dev_attr.max_sge;
>>> +		ep_attr->max_rdma_write_iov= dev_attr.max_sge;
>>>  		dapl_dbg_log (DAPL_DBG_TYPE_UTIL,
>>>  			" query_hca: MAX msg %llu mtu %d dto %d iov %d"
>>>  			" rdma i%d,o%d\n",
>>>
>>>
>> This breaks iwarp which only allows 1 SGE for the local SGL in an rdma
>> read. I thought dev.max_qp_rd_atom indicated this value, not
>> max_sge.
>> So either this patch is wrong, or the rdma device attrs need to be
>> enhanced to separate max_read_sge as a stand-alone attr.
>>
>
> max_qp_rd_atom - max number of rdma reads/atomics outstanding
> per QP as a target.
> max_qp_init_rd_atom - max number of rdma reads/atomics outstanding
> per QP as initiator.
> max_sge - max number of scatter/gather entries per work request
> for all work requests other than reliable datagram.
>
> At the verbs level, there is no separate SGE value provided for
> rdma_reads over other work request types.
>
> The uDAPL socket cm providers only support IB so this is not an issue
> with this patch. However, with uDAPL rdma_cm providers, I guess we
> need to add a device check on this query and return 1 if iWARP.
>
> Is this an iWARP specification or implementation issue?
>
>
iWARP spec.

From rdreier at cisco.com Fri Aug 15 15:19:53 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Fri, 15 Aug 2008 15:19:53 -0700
Subject: [ofa-general] [PATCH v2] ipiob: fix rtnl deadlock
In-Reply-To: (Roland Dreier's message of "Mon, 11 Aug 2008 13:55:04 -0700")
References: <4899CF0A.1060509@Voltaire.COM> <32cb786f0808081155o19f8fb9dm217cd6996dffa3e5@mail.gmail.com> <32cb786f0808090538j272842b1r5117547cccde0d06@mail.gmail.com>
Message-ID:

OK, how about the patch below? The idea is that we don't really care about tasks that happen when running ipoib_stop(), except we don't want netif_carrier_on() to race with bringing down the interface -- but I think adding rtnl_lock() around that is sufficient.

Oh and the calls to flush_scheduled_work() were bogus, since we never use schedule_work() in the first place. It only makes sense to flush the real ipoib workqueue.

This seems to work fine on my test system and produces no lockdep warnings... any testing and/or review would be good, but I'm pretty happy with this approach.

 - R.

 drivers/infiniband/ulp/ipoib/ipoib_main.c      |   19 +++++++++----------
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |   10 +++++++++-
 2 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index f51201b..7e9e218 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -156,14 +156,8 @@ static int ipoib_stop(struct net_device *dev)
 
 	netif_stop_queue(dev);
 
-	/*
-	 * Now flush workqueue to make sure a scheduled task doesn't
-	 * bring our internal state back up.
-	 */
-	flush_workqueue(ipoib_workqueue);
-
-	ipoib_ib_dev_down(dev, 1);
-	ipoib_ib_dev_stop(dev, 1);
+	ipoib_ib_dev_down(dev, 0);
+	ipoib_ib_dev_stop(dev, 0);
 
 	if (!test_bit(IPOIB_FLAG_SUBINTERFACE, &priv->flags)) {
 		struct ipoib_dev_priv *cpriv;
@@ -1314,7 +1308,7 @@ sysfs_failed:
 
 register_failed:
 	ib_unregister_event_handler(&priv->event_handler);
-	flush_scheduled_work();
+	flush_workqueue(ipoib_workqueue);
 
 event_failed:
 	ipoib_dev_cleanup(priv->dev);
@@ -1373,7 +1367,12 @@ static void ipoib_remove_one(struct ib_device *device)
 
 	list_for_each_entry_safe(priv, tmp, dev_list, list) {
 		ib_unregister_event_handler(&priv->event_handler);
-		flush_scheduled_work();
+
+		rtnl_lock();
+		dev_change_flags(priv->dev, priv->dev->flags & ~IFF_UP);
+		rtnl_unlock();
+
+		flush_workqueue(ipoib_workqueue);
 
 		unregister_netdev(priv->dev);
 		ipoib_dev_cleanup(priv->dev);
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
index 8950e95..ac33c8f 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
@@ -392,8 +392,16 @@ static int ipoib_mcast_join_complete(int status,
 						   &priv->mcast_task, 0);
 		mutex_unlock(&mcast_mutex);
 
-		if (mcast == priv->broadcast)
+		if (mcast == priv->broadcast) {
+			/*
+			 * Take RTNL lock here to avoid racing with
+			 * ipoib_stop() and turning the carrier back
+			 * on while a device is being removed.
+			 */
+			rtnl_lock();
 			netif_carrier_on(dev);
+			rtnl_unlock();
+		}
 
 		return 0;
 	}

From vlad at lists.openfabrics.org Sat Aug 16 02:54:44 2008
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Sat, 16 Aug 2008 02:54:44 -0700 (PDT)
Subject: [ofa-general] ofa_1_4_kernel 20080816-0200 daily build status
Message-ID: <20080816095444.7F172E607F8@openfabrics.org>

This email was generated automatically, please do not reply

git_url: git://git.openfabrics.org/ofed_1_4/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:

Passed:
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.16
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.17
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on x86_64 with linux-2.6.26
Passed on x86_64 with linux-2.6.24
Passed on x86_64 with linux-2.6.25
Passed
on x86_64 with linux-2.6.9-67.ELsmp Passed on ia64 with linux-2.6.16 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.17 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.26 Passed on ia64 with linux-2.6.25 Passed on ia64 with linux-2.6.24 Passed on ppc64 with linux-2.6.16 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.17 Passed on ppc64 with linux-2.6.24 Passed on ppc64 with linux-2.6.19 Failed: Build failed on x86_64 with linux-2.6.16.43-0.3-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.43-0.3-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.43-0.3-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.43-0.3-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.16.21-0.8-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from 
/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.21-0.8-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.21-0.8-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.21-0.8-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.16.60-0.21-smp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.60-0.21-smp_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.60-0.21-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** 
[_module_/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.60-0.21-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.60-0.21-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-8.el5 Log: from /home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-8.el5_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-8.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-8.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-8.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-53.el5 Log: from /home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-53.el5_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-53.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-53.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-53.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-93.el5 Log: from /home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-93.el5_x86_64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-93.el5_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-93.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-93.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-55.ELsmp Log: from 
/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: include/linux/skbuff.h: In function 'skb_add_data': include/linux/skbuff.h:1041: warning: pointer targets in passing argument 2 of 'csum_partial_copy_from_user' differ in signedness make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.9-55.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-55.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-42.ELsmp Log: from /home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: include/linux/skbuff.h: In function 'skb_add_data': include/linux/skbuff.h:1013: warning: pointer targets in passing argument 2 of 'csum_partial_copy_from_user' differ in signedness make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.9-42.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-42.ELsmp' make: *** [kernel] Error 2 
---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.16.21-0.8-default Log: from /home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.21-0.8-default_ia64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/iser] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.16.21-0.8-default_ia64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.16.21-0.8-default' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.18-8.el5 Log: from /home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser/iscsi_iser.h:45, from /home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser/iser_verbs.c:38: /home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-8.el5_ppc64_check/include/scsi/iscsi_proto.h:160: error: 'SCSI_MAX_VARLEN_CDB_SIZE' undeclared here (not in a function) make[4]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser/iser_verbs.o] Error 1 make[3]: *** 
[/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/ulp/iser] Error 2
make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband] Error 2
make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080816-0200_linux-2.6.18-8.el5_ppc64_check] Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.18-8.el5'
make: *** [kernel] Error 2
----------------------------------------------------------------------------------

From changquing.tang at hp.com Sat Aug 16 08:14:08 2008
From: changquing.tang at hp.com (Tang, Changqing)
Date: Sat, 16 Aug 2008 15:14:08 +0000
Subject: [ofa-general] How many processes on a node can open IB device ?
Message-ID: <58C6777539C300489D145B0F8E29C32815EA8DD024@GVW0673EXC.americas.hpqcorp.net>

Hi, driver engineers:

I have a system with 8 cores and 16G memory; the node is idle. It has a Mellanox ConnectX card:

hca_id: mlx4_0
        fw_ver:             2.3.000
        node_guid:          001e:0bff:ff83:9f1c
        sys_image_guid:     001e:0bff:ff83:9f1f
        vendor_id:          0x02c9
        vendor_part_id:     25418
        hw_ver:             0xA0
        board_id:           HP_08B0000001
        phys_port_cnt:      2

I have simple IBV code (attached below) which only opens the device and creates a PD, then sleeps. When I start as many processes as I can, it fails at 895 copies with the error:

ibv_open_device() failed

So how many IB processes can I run on a node? Is there any driver limit?

Thanks for help.

--CQ Tang, HP-MPI

compile:
gcc -o ibv.x ibv.c -libverbs

run:
#!/bin/sh
count=0
while [ $? -eq 0 ]
do
	count=`expr $count + 1`
	echo "%%%%%%%%%%%%%loop: $count;"
	./ibv.x &
done

ibv.c:

/* The header names were stripped by the list archive; these are the
 * headers the code below requires. */
#include <stdio.h>
#include <unistd.h>
#include <infiniband/verbs.h>

/*
 * Main program code.
 */
int main(int argc, char *argv[])
{
	int i;
	int nif;
	int err;
	struct ibv_device **interface_list;
	struct ibv_device_attr device_attr;
	struct ibv_port_attr port_attr;
	struct ibv_qp_attr qp_attr;
	struct ibv_qp_init_attr qp_init_attr;
	struct ibv_context *hca_hndl;
	struct ibv_pd *pd_hndl;
	struct ibv_mr *mr_hndl;
	uint8_t port_num;
	uint16_t port_lid;
	char *buf;
	int size;
	int step;

	/* Enumerate the RDMA devices; the list pointer is NULL on failure. */
	interface_list = ibv_get_device_list(&nif);
	if (!interface_list || nif <= 0) {
		fprintf(stderr, "NO ibv interface found\n");
		return(-1);
	}

	/* Open the first device. */
	hca_hndl = ibv_open_device(interface_list[0]);
	if (!hca_hndl) {
		fprintf(stderr, "ibv_open_device() failed\n");
		return(-1);
	}

	err = ibv_query_device(hca_hndl, &device_attr);
	if (err != 0) {
		fprintf(stderr, "ibv_query_device() failed\n");
		return(-1);
	}

	/* Find the first active port (port numbers start at 1). */
	for (i = 0; i < device_attr.phys_port_cnt; i++) {
		port_num = (uint8_t)(i + 1);
		err = ibv_query_port(hca_hndl, port_num, &port_attr);
		if (err != 0) {
			fprintf(stderr, "ibv_query_port() failed\n");
			return(-1);
		}
		if (port_attr.state != IBV_PORT_ACTIVE) {
			continue;
		}
		port_lid = port_attr.lid;
		break;
	}
	if (i == device_attr.phys_port_cnt) {
		fprintf(stderr, "No active port\n");
		return(-1);
	}

	ibv_free_device_list(interface_list);

	pd_hndl = ibv_alloc_pd(hca_hndl);
	if (!pd_hndl) {
		fprintf(stderr, "ibv_alloc_pd() failed\n");
		return(-1);
	}

	/* Hold the device context and PD open while more copies start. */
	sleep(600);

	fprintf(stderr, "No IBV error\n");
	return (0);
}

From yosefe at voltaire.com Sat Aug 16 12:18:50 2008
From: yosefe at voltaire.com (Yosef Etigin)
Date: Sat, 16 Aug 2008 22:18:50 +0300
Subject: [ofa-general] [PATCH v2] ipiob: fix rtnl deadlock
In-Reply-To:
References: <4899CF0A.1060509@Voltaire.COM> <32cb786f0808081155o19f8fb9dm217cd6996dffa3e5@mail.gmail.com> <32cb786f0808090538j272842b1r5117547cccde0d06@mail.gmail.com>
Message-ID: <32cb786f0808161218o417553b5w1738a517f0eb468a@mail.gmail.com>

> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> index 8950e95..ac33c8f 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> +++
b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> @@ -392,8 +392,16 @@ static int ipoib_mcast_join_complete(int status,
>  						   &priv->mcast_task, 0);
>  		mutex_unlock(&mcast_mutex);
>
> -		if (mcast == priv->broadcast)
> +		if (mcast == priv->broadcast) {
> +			/*
> +			 * Take RTNL lock here to avoid racing with
> +			 * ipoib_stop() and turning the carrier back
> +			 * on while a device is being removed.
> +			 */
> +			rtnl_lock();
>  			netif_carrier_on(dev);
> +			rtnl_unlock();
> +		}
>
>  		return 0;
>  	}

What if you bring the device down while a join completion event arrives? ipoib_stop() can run in parallel with ipoib_mcast_join_complete(), and you will just wait for ipoib_stop() to finish and then do netif_carrier_on() afterwards.

--Yossi

Unfortunately, you can't tell whether the device was brought down or has not been brought up yet.

From sashak at voltaire.com Sat Aug 16 16:08:41 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sun, 17 Aug 2008 02:08:41 +0300
Subject: [ofa-general] [PATCH] opensm/osm_ucast_mgr: code consolidation and cleanup
In-Reply-To: <48A2EE96.9000408@mellanox.co.il>
References: <1214252698.5369.537.camel@cardanus.llnl.gov> <20080624130950.GL7341@sashak.voltaire.com> <20080624204340.GR7341@sashak.voltaire.com> <20080624204509.GS7341@sashak.voltaire.com> <4861F98F.8080308@mellanox.co.il> <20080625101734.GB22159@sashak.voltaire.com> <48A2EE96.9000408@mellanox.co.il>
Message-ID: <20080816230841.GI2339@sashak.voltaire.com>

On 17:24 Wed 13 Aug , Yevgeny Kliteynik wrote:
>
> Perhaps you mean this:
>
> +	while(!cl_is_qlist_empty(&p_mgr->port_order_list))
> +		cl_qlist_remove_head(&p_mgr->port_order_list);

Yes, meant this.

Sasha

From sashak at voltaire.com Sat Aug 16 16:09:32 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sun, 17 Aug 2008 02:09:32 +0300
Subject: [ofa-general] Re: [PATCH] opensm/osm_ucast_mgr.c: cleaning port_order_list
In-Reply-To: <48A2F118.30505@dev.mellanox.co.il>
References: <48A2F118.30505@dev.mellanox.co.il>
Message-ID: <20080816230932.GJ2339@sashak.voltaire.com>

On 17:35 Wed 13 Aug , Yevgeny Kliteynik wrote:
> Hi Sasha,
>
> Small bug fix in cleaning the port order list.
> This bug was causing assertion in list handling.
>
> Signed-off-by: Yevgeny Kliteynik

Applied. Thanks.

Sasha

From iuzzolin at nmia.com Sun Aug 17 01:03:34 2008
From: iuzzolin at nmia.com (Harold/Carlyn Iuzzolino)
Date: Sun, 17 Aug 2008 02:03:34 -0600
Subject: [ofa-general] Compile error: NULL isn't defined in dat/common/dat_strerror.c
Message-ID: <200808170803.m7H83YE23011@gandalf.iuzzolino.com>

Dear Openfabrics and Arlin Davis
general at lists.openfabrics.org
arlin.r.davis at intel.com

On August 6th I reported a bug, Number 1120, that NULL isn't defined in dat_strerror.c. Which of the .h files, dat_dr.h, dat_sr.h, , do you need to include in dat_strerror.c so that NULL gets defined?

I am using a Fedora 9, Athlon 64X4 bit machine.

[root at treebeard OFED-1.4-20080816-0600]# cat /etc/issue
Fedora release 9 (Sulphur)
Kernel \r on an \m (\l)

[root at treebeard OFED-1.4-20080816-0600]# uname -a
Linux treebeard 2.6.25-14.fc9.x86_64 #1 SMP Thu May 1 06:06:21 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux

As far as I can tell, this bug hasn't been fixed in the last 4 versions, and because of the way the OFED software is packaged, i.e. a tar file of rpms which have tar.gz files inside, I can't try to fix it myself and then run install.pl again.
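
On the question of which header defines NULL: in standard C the macro comes from <stddef.h>, and also from <stdio.h>, <stdlib.h> and <string.h>, so a source file that uses NULL without pulling in any of these fails with the "undeclared" error reported here. The sketch below only illustrates that point. The function and the table are hypothetical and are not the actual contents of dat_strerror.c, and whether adding this include is the right fix for compat-dapl is an assumption.

```c
#include <stddef.h> /* defines NULL; <stdio.h>, <stdlib.h>, <string.h> do too */

/* Hypothetical lookup in the style of a strerror helper:
 * returns a name for known codes and NULL for anything else. */
static const char *status_name(int code)
{
    static const char *const names[] = { "SUCCESS", "ABORT", "TIMEOUT" };
    const int count = (int)(sizeof(names) / sizeof(names[0]));

    if (code < 0 || code >= count)
        return NULL; /* without one of the headers above, this line fails to compile */
    return names[code];
}
```

Removing the include line from this sketch reproduces the same class of diagnostic, "'NULL' undeclared (first use in this function)".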
[root at treebeard root]# md5sum `ff compat-dapl-1.2.8.tar.gz` 39a825669913ff4622ee2b120a9a7409 ./OFED-1.4-20080816-0600.var.tmp/OFED_topdir/SOURCES/compat-dapl-1.2.8.tar.gz 39a825669913ff4622ee2b120a9a7409 ./OFED-1.4-20080813-0600.var.tmp/OFED_topdir/SOURCES/compat-dapl-1.2.8.tar.gz 39a825669913ff4622ee2b120a9a7409 ./OFED-1.4-20080803-0600.var.tmp/OFED_topdir/SOURCES/compat-dapl-1.2.8.tar.gz 39a825669913ff4622ee2b120a9a7409 ./OFED-1.4-20080811-0819.var.tmp/OFED_topdir/SOURCES/compat-dapl-1.2.8.tar.gz I downloaded version OFED-1.4-20080813-0600.tar.gz and then later, OFED-1.4-20080816-0600.tar.gz and tried installing it/them. No luck. Error in the same spot. Build compat-dapl RPM Running rpmbuild --rebuild --define '_topdir /var/tmp/OFED_topdir' --define 'dist ' --target x86_64 --define '_prefix /usr' --define '_exec_prefix /usr' --define '_sysconfdir /etc' --define '_defaultdocdir /usr/share/doc/compat-dapl-1.2.8' --define '_usr /usr' /root/OFED-1.4-20080813-0600/SRPMS/compat-dapl-1.2.8-1.src.rpm Failed to build compat-dapl RPM See /tmp/OFED.32320.logs/compat-dapl.rpmbuild.log gcc -DHAVE_CONFIG_H -I. -I. -I. -Wall -g -D_GNU_SOURCE -DOS_RELEASE=131078 -I./dat/include/ -I./dat/udat/ -I./dat/udat/linux -I./dat/common/ -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -MT dat_udat_libdat_la-dat_dictionary.lo -MD -MP -MF .deps/dat_udat_libdat_la-dat_dictionary.Tpo -c dat/common/dat_dictionary.c -o dat_udat_libdat_la-dat_dictionary.o >/dev/null 2>&1 dat/common/dat_strerror.c: In function 'dat_strerror': dat/common/dat_strerror.c:621: error: 'NULL' undeclared (first use in this function) dat/common/dat_strerror.c:621: error: (Each undeclared identifier is reported only once dat/common/dat_strerror.c:621: error: for each function it appears in.) 
make[2]: *** [dat_udat_libdat_la-dat_strerror.lo] Error 1 Because of the way the install.pl script works, I have to wait for someone to fix it, and make a new OFED-1.4-200808xx-0600.tar.gz. I am puzzled that this problem hasn't occurred for everyone who tries to compile. But, please could someone fix this. I can't continue compiling until it is fixed. Carlyn Iuzzolino From iuzzolin at nmia.com Sun Aug 17 01:04:23 2008 From: iuzzolin at nmia.com (Harold/Carlyn Iuzzolino) Date: Sun, 17 Aug 2008 02:04:23 -0600 Subject: [ofa-general] How do you repackage individual source rpm files? Message-ID: <200808170804.m7H84NF23017@gandalf.iuzzolino.com> Dear Openfabrics, general at lists.openfabrics.org On August 5th, I reported a bug in the compat-dapl-1.2.8.tar.gz package, in particular: dat/common/dat_strerror.c: In function 'dat_strerror': dat/common/dat_strerror.c:621: error: 'NULL' undeclared (first use in this function) It is now August 17th and the bug is STILL not fixed. And I'm stuck. Because of the way the OFED-1.4-20080816-0600.tgz file is created and the way install.pl works, I can't just make a fix to the dat_strerror.c file and try again. I have to wait until one of you fixes it and makes a new OFED-1.4-20080816-0600.tgz file. Is there any way for someone official to show us how to repackage the OFED-1.4-20080816-0600.tgz file? ----------------- Inside OFED-1.4-20080816-0600.tgz, is a directory SRPMS/ with lots of *.src.rpm's. Inside the offending compat-dapl-1.2.8-1.src.rpm is a tar file that contains the sources, in particular the dat/common/dat_strerror.c file. 
[root at treebeard root]# rpm -qilp OFED-1.4-20080816-0600/SRPMS/compat-dapl-1.2.8-1.src.rpm Name : compat-dapl Relocations: (not relocatable) Version : 1.2.8 Vendor: (none) Release : 1 Build Date: Sat 16 Aug 2008 07:17:58 AM MDT Install Date: (not installed) Build Host: hosting.openfabrics.org Group : System Environment/Libraries Source RPM: (none) Size : 675045 License: Dual GPL/BSD/CPL Signature : (none) URL : http://openfabrics.org/ Summary : A Library for userspace access to RDMA devices using OS Agnostic DAT API v1.2. Description : Along with the OpenFabrics kernel drivers, libdat and libdapl provides a userspace RDMA API that supports DAT 1.2 specification and IB transport extensions for atomic operations and rdma write with immediate data. compat-dapl-1.2.8.tar.gz <----------------------- dapl.spec <------------ the rpm spec file??? If I want to try to fix the subroutine dat/common/dat_strerror.c I would have to 1. Extract the compat-dapl-1.2.8-1.src.rpm somewhere 2. Change the file compat-dapl-1.2.8/dat/common/dat_strerror.c 3. tar together the directory compat-dapl-1.2.8 4. Recreate the rpm file. I know how to 'tar cvfz compat-dapl-1.2.8.tgz compat-dapl-1.2.8' But I don't know how to create the rpm file. I assume the dapl.spec file is the rpm spec file. How do you use it in creating the rpm file? That way I could at least make an attempt to fix compile errors and continue compiling while I wait for the OFED team fix the error in the correct way and give us the new OFED-1.4-20080816-0600.tgz. 
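On the compile error itself: in standard C the NULL macro comes from <stddef.h> (and also from <stdio.h>, <stdlib.h> and <string.h>), so a source file that uses NULL without pulling in any of those, directly or indirectly, fails exactly as shown. Which header dat_strerror.c ought to include is for the maintainers to decide; the following is only a minimal sketch with a hypothetical lookup table (err_table and lookup_error are made-up names, not taken from the DAT sources) showing that <stddef.h> alone is sufficient.

```c
#include <stddef.h>	/* defines NULL and size_t */

/* Hypothetical table in the spirit of a strerror-style helper.
 * The codes and strings are illustrative, not DAT's real values. */
struct err_entry {
	int code;
	const char *text;
};

static const struct err_entry err_table[] = {
	{ 0, "success" },
	{ 1, "invalid handle" },
	{ 2, "insufficient resources" },
};

/* Return the message for code, or NULL when the code is unknown. */
static const char *lookup_error(int code)
{
	size_t i;

	for (i = 0; i < sizeof(err_table) / sizeof(err_table[0]); i++)
		if (err_table[i].code == code)
			return err_table[i].text;
	return NULL;
}
```

Removing the #include line from this sketch should reproduce the same "'NULL' undeclared" diagnostic from gcc.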
Carlyn Iuzzolino From vlad at lists.openfabrics.org Sun Aug 17 03:03:58 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Sun, 17 Aug 2008 03:03:58 -0700 (PDT) Subject: [ofa-general] ofa_1_4_kernel 20080817-0200 daily build status Message-ID: <20080817100358.A805FE60CF5@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_4/linux-2.6.git git_branch: ofed_kernel Common build parameters: Passed: Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.22 Passed on x86_64 with linux-2.6.16 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.17 Passed on x86_64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.18-53.el5 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.22.5-31-default Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on ia64 with linux-2.6.16 Passed on ia64 with linux-2.6.16.21-0.8-default Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.17 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.26 Passed on ia64 with linux-2.6.25 Passed on ia64 with linux-2.6.24 Passed on ppc64 with linux-2.6.16 Passed on ppc64 with linux-2.6.17 Passed on ppc64 with 
linux-2.6.18 Passed on ppc64 with linux-2.6.18-8.el5 Passed on ppc64 with linux-2.6.24 Passed on ppc64 with linux-2.6.19 Failed: From eli at dev.mellanox.co.il Sun Aug 17 03:45:22 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Sun, 17 Aug 2008 13:45:22 +0300 Subject: [ofa-general] [PATCH v2] ipiob: fix rtnl deadlock In-Reply-To: References: <4899CF0A.1060509@Voltaire.COM> <32cb786f0808081155o19f8fb9dm217cd6996dffa3e5@mail.gmail.com> <32cb786f0808090538j272842b1r5117547cccde0d06@mail.gmail.com> Message-ID: <20080817104522.GA23302@mtls03> On Fri, Aug 15, 2008 at 03:19:53PM -0700, Roland Dreier wrote: > @@ -1314,7 +1308,7 @@ sysfs_failed: > > register_failed: > ib_unregister_event_handler(&priv->event_handler); > - flush_scheduled_work(); > + flush_workqueue(ipoib_workqueue); > I don't find any flaw in this approach, but I don't understand why is the flush_workqueue(ipoib_workqueue) above needed. From jsquyres at cisco.com Sun Aug 17 05:21:16 2008 From: jsquyres at cisco.com (Jeff Squyres) Date: Sun, 17 Aug 2008 08:21:16 -0400 Subject: [ofa-general] STOP the onslaught of EWG spam Message-ID: The EWG list has gotten spam bombed over the last few hours. I lost count at 500+ spams in my inbox. I therefore logged into openfabrics.org and changed the site-wide password for Mailman (I have notified Jeff Becker of the new password). I then changed the EWG list to silently discard all non-member posts. Since I didn't know if other OF lists were being spam-bombed, I did the same for all OF lists as well. The spam onslaught has now stopped. I also notice that our Mailman installation is hopelessly out of date; it's v2.1.5 and the current version (including several important security fixes since v2.1.5) is v2.1.11. Someone needs to fix this ASAP. 
-- Jeff Squyres Cisco Systems From john.russo at qlogic.com Sun Aug 17 06:32:29 2008 From: john.russo at qlogic.com (John Russo) Date: Sun, 17 Aug 2008 08:32:29 -0500 Subject: [ofa-general] RE: [ewg] STOP the onslaught of EWG spam References: Message-ID: <99863D2ED484D449811D97A4C44C9CBD82BB8B@EPEXCH2.qlogic.org> Thank you. I was getting hammered during that same time. I had well over 800 emails. ________________________________ From: ewg-bounces at lists.openfabrics.org on behalf of Jeff Squyres Sent: Sun 8/17/2008 8:21 AM To: OpenFabrics EWG; OpenFabrics General Subject: [ewg] STOP the onslaught of EWG spam The EWG list has gotten spam bombed over the last few hours. I lost count at 500+ spams in my inbox. I therefore logged into openfabrics.org and changed the site-wide password for Mailman (I have notified Jeff Becker of the new password). I then changed the EWG list to silently discard all non-member posts. Since I didn't know if other OF lists were being spam-bombed, I did the same for all OF lists as well. The spam onslaught has now stopped. I also notice that our Mailman installation is hopelessly out of date; it's v2.1.5 and the current version (including several important security fixes since v2.1.5) is v2.1.11. Someone needs to fix this ASAP. -- Jeff Squyres Cisco Systems _______________________________________________ ewg mailing list ewg at lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From kliteyn at dev.mellanox.co.il Sun Aug 17 08:12:18 2008
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Sun, 17 Aug 2008 18:12:18 +0300
Subject: [ofa-general] [PATCH] opensm/libvendor/osm_vendor_mlx_sa.c: handling attribute offset of 0
Message-ID: <48A83FD2.4090604@dev.mellanox.co.il>

Following patch f752a876edecfdac9e1d07bc0747a2c296eed230,
"set SA attribute offset to 0 when no records are returned"

Attribute offset of 0 caused the vendor to crash.

Signed-off-by: Yevgeny Kliteynik
---
 opensm/libvendor/osm_vendor_mlx_sa.c | 30 ++++++++++++++++++------------
 1 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/opensm/libvendor/osm_vendor_mlx_sa.c b/opensm/libvendor/osm_vendor_mlx_sa.c
index efd04bd..d0da219 100644
--- a/opensm/libvendor/osm_vendor_mlx_sa.c
+++ b/opensm/libvendor/osm_vendor_mlx_sa.c
@@ -140,18 +140,24 @@ __osmv_sa_mad_rcv_cb(IN osm_madw_t * p_madw,
 #else
 			/* we used the offset value to calculate the number of records in here */
-			query_res.result_cnt = (uintn_t)
-			    ((p_madw->mad_size - IB_SA_MAD_HDR_SIZE) /
-			     ib_get_attr_size(p_sa_mad->attr_offset));
-			osm_log(p_bind->p_log, OSM_LOG_DEBUG,
-				"__osmv_sa_mad_rcv_cb: Count = %u = %u / %u (%u)\n",
-				query_res.result_cnt,
-				p_madw->mad_size - IB_SA_MAD_HDR_SIZE,
-				ib_get_attr_size(p_sa_mad->attr_offset),
-				(p_madw->mad_size -
-				 IB_SA_MAD_HDR_SIZE) %
-				ib_get_attr_size(p_sa_mad->attr_offset)
-			    );
+			if (ib_get_attr_size(p_sa_mad->attr_offset) == 0) {
+				query_res.result_cnt = 0;
+				osm_log(p_bind->p_log, OSM_LOG_DEBUG,
+					"__osmv_sa_mad_rcv_cb: Count = 0\n");
+			}
+			else {
+				query_res.result_cnt = (uintn_t)
+				    ((p_madw->mad_size - IB_SA_MAD_HDR_SIZE) /
+				     ib_get_attr_size(p_sa_mad->attr_offset));
+				osm_log(p_bind->p_log, OSM_LOG_DEBUG,
+					"__osmv_sa_mad_rcv_cb: "
+					"Count = %u = %zu / %u (%zu)\n",
+					query_res.result_cnt,
+					p_madw->mad_size - IB_SA_MAD_HDR_SIZE,
+					ib_get_attr_size(p_sa_mad->attr_offset),
+					(p_madw->mad_size - IB_SA_MAD_HDR_SIZE) %
+					ib_get_attr_size(p_sa_mad->attr_offset));
+			}
 #endif
 		}
 	}
--
1.5.1.4

From sashak at voltaire.com Sun Aug 17 11:46:03 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sun, 17 Aug 2008 21:46:03 +0300
Subject: [ofa-general] Re: [PATCH] opensm/libvendor/osm_vendor_mlx_sa.c: handling attribute offset of 0
In-Reply-To: <48A83FD2.4090604@dev.mellanox.co.il>
References: <48A83FD2.4090604@dev.mellanox.co.il>
Message-ID: <20080817184603.GR2339@sashak.voltaire.com>

On 18:12 Sun 17 Aug , Yevgeny Kliteynik wrote:
> Following patch f752a876edecfdac9e1d07bc0747a2c296eed230,
> "set SA attribute offset to 0 when no records are returned"
>
> Attribute offset of 0 caused the vendor to crash.
>
> Signed-off-by: Yevgeny Kliteynik

Applied. Thanks.

Sasha

From sashak at voltaire.com Sun Aug 17 13:54:12 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sun, 17 Aug 2008 23:54:12 +0300
Subject: [ofa-general] [PATCH v3] Add a Node Description check on light sweep to ensure that the ND has been found for each node.
In-Reply-To: <20080808090815.0a9b923d.weiny2@llnl.gov>
References: <20080807145326.1d91604c.weiny2@llnl.gov> <20080808090815.0a9b923d.weiny2@llnl.gov>
Message-ID: <20080817205411.GU2339@sashak.voltaire.com>

Hi Ira,

On 09:08 Fri 08 Aug , Ira Weiny wrote:
> On Fri, 8 Aug 2008 08:58:15 -0400
> "Hal Rosenstock" wrote:
>
> > On Thu, Aug 7, 2008 at 5:53 PM, Ira Weiny wrote:
> > > >From 123a950a8bf0fc43331a1e715f0cdd756529437c Mon Sep 17 00:00:00 2001
> > > From: Ira K. Weiny
> > >
> > > + if (status != IB_SUCCESS)
> > > + OSM_LOG(sm->p_log, OSM_LOG_ERROR,
> > > + "__osm_state_mgr_get_node_desc: ERR 3315: "
> > > + "Failure initiating NodeDescription request (%s)\n",
> > > + ib_get_err_str(status));
> >
> >
> > Aren't error codes 3314 and 3315 already taken ?
> >
> > -- Hal
> >
>
> Yes, I forgot you mentioned that. v3 attached.
> Ira
>
>
> >From 6470536504e0bb6c6ff86619f3801235e022a99d Mon Sep 17 00:00:00 2001
> From: Ira K.
Weiny > Date: Wed, 30 Jul 2008 17:28:30 -0700 > Subject: [PATCH] Add a Node Description check on light sweep to ensure that the ND has been > found for each node. This case covers the condition where a ND message is > dropped/lost for some reason and OpenSM is left with a valid configured node > which is not named correctly. Please use one line patch summary as subject and patch description in email body. ((15) of /usr/src/linux/Documentation/SubmittingPatches). > > This is not the same as a node which has changed it's Node Descriptioin. In > this case the node needs to send a trap. > > Signed-off-by: Ira Weiny > --- > opensm/include/opensm/osm_base.h | 11 ++++++++ > opensm/opensm/osm_node.c | 2 +- > opensm/opensm/osm_state_mgr.c | 53 ++++++++++++++++++++++++++++++++++++++ > 3 files changed, 65 insertions(+), 1 deletions(-) > > diff --git a/opensm/include/opensm/osm_base.h b/opensm/include/opensm/osm_base.h > index 3793804..2e8def7 100644 > --- a/opensm/include/opensm/osm_base.h > +++ b/opensm/include/opensm/osm_base.h > @@ -640,6 +640,17 @@ BEGIN_C_DECLS > */ > #define OSM_NO_PATH 0xFF > /**********/ > +/****d* OpenSM: Base/OSM_NODE_DESC_UNKNOWN > +* NAME > +* OSM_NODE_DESC_UNKNOWN > +* > +* DESCRIPTION > +* Value indicating the Node Description is not set and is "unknown" > +* > +* SYNOPSIS > +*/ > +#define OSM_NODE_DESC_UNKNOWN "" > +/**********/ > /****d* OpenSM: Base/osm_thread_state_t > * NAME > * osm_thread_state_t > diff --git a/opensm/opensm/osm_node.c b/opensm/opensm/osm_node.c > index d99c656..123feb8 100644 > --- a/opensm/opensm/osm_node.c > +++ b/opensm/opensm/osm_node.c > @@ -136,7 +136,7 @@ osm_node_t *osm_node_new(IN const osm_madw_t * const p_madw) > osm_node_init_physp(p_node, p_madw); > if (p_ni->node_type == IB_NODE_TYPE_SWITCH) > node_init_physp0(p_node, p_madw); > - p_node->print_desc = strdup(""); > + p_node->print_desc = strdup(OSM_NODE_DESC_UNKNOWN); > > return (p_node); > } > diff --git a/opensm/opensm/osm_state_mgr.c 
b/opensm/opensm/osm_state_mgr.c > index 3cdb2cf..7502287 100644 > --- a/opensm/opensm/osm_state_mgr.c > +++ b/opensm/opensm/osm_state_mgr.c > @@ -516,6 +516,53 @@ static void query_sm_info(cl_map_item_t *item, void *cxt) > } > > /********************************************************************** > + During a light sweep check each node to see if the node descriptor is valid > + if not issue a ND query. > +**********************************************************************/ > +static void __osm_state_mgr_get_node_desc(IN cl_map_item_t * const p_object, > + IN void *context) > +{ > + osm_physp_t *p_physp = NULL; > + osm_node_t *const p_node = (osm_node_t *) p_object; > + ib_api_status_t status = IB_SUCCESS; > + osm_madw_context_t mad_context; > + osm_sm_t *sm = (osm_sm_t *)context; > + > + OSM_LOG_ENTER(sm->p_log); > + > + CL_ASSERT(p_node); > + > + if (p_node->print_desc && strcmp(p_node->print_desc, OSM_NODE_DESC_UNKNOWN)) > + /* if ND is valid, do nothing */ > + goto exit; > + > + OSM_LOG(sm->p_log, OSM_LOG_ERROR, > + "ERR 3319: Unknown node description \"%s\" for node " > + "0x%016" PRIx64 ". Reissuing ND query\n", > + p_node->print_desc ? p_node->print_desc : OSM_NODE_DESC_UNKNOWN, Actually this is not needed due to condition above. Just OSM_NODE_DESC_UNKNOWN (or just "Unknown node description for node....") could be printer instead. > + cl_ntoh64(osm_node_get_node_guid (p_node))); > + > + /* get a physp to request from. */ > + p_physp = osm_node_get_any_physp_ptr(p_node); > + > + mad_context.nd_context.node_guid = osm_node_get_node_guid(p_node); > + > + status = osm_req_get(sm, > + osm_physp_get_dr_path_ptr(p_physp), > + IB_MAD_ATTR_NODE_DESC, > + 0, CL_DISP_MSGID_NONE, &mad_context); > + if (status != IB_SUCCESS) > + OSM_LOG(sm->p_log, OSM_LOG_ERROR, > + "__osm_state_mgr_get_node_desc: ERR 331B: " OSM_LOG() macro includes function name, so you don't need to specify it again in format string. 
> + "Failure initiating NodeDescription request (%s)\n", > + ib_get_err_str(status)); > + > +exit: > + OSM_LOG_EXIT(sm->p_log); > +} > + > + > +/********************************************************************** > Initiates a lightweight sweep of the subnet. > Used during normal sweeps after the subnet is up. > **********************************************************************/ > @@ -524,6 +571,7 @@ static ib_api_status_t __osm_state_mgr_light_sweep_start(IN osm_sm_t * sm) > ib_api_status_t status = IB_SUCCESS; > osm_bind_handle_t h_bind; > cl_qmap_t *p_sw_tbl; > + cl_qmap_t *p_node_tbl; > cl_map_item_t *p_next; > osm_node_t *p_node; > osm_physp_t *p_physp; > @@ -532,6 +580,7 @@ static ib_api_status_t __osm_state_mgr_light_sweep_start(IN osm_sm_t * sm) > OSM_LOG_ENTER(sm->p_log); > > p_sw_tbl = &sm->p_subn->sw_guid_tbl; > + p_node_tbl = &sm->p_subn->node_guid_tbl; Seems like unneeded variable - it is used only once below: > > /* > * First, get the bind handle. > @@ -550,6 +599,10 @@ static ib_api_status_t __osm_state_mgr_light_sweep_start(IN osm_sm_t * sm) > cl_qmap_apply_func(p_sw_tbl, __osm_state_mgr_get_sw_info, sm); > CL_PLOCK_RELEASE(sm->p_lock); > > + CL_PLOCK_ACQUIRE(sm->p_lock); > + cl_qmap_apply_func(p_node_tbl, __osm_state_mgr_get_node_desc, sm); cl_qmap_apply_func(&sm->p_subn->node_guid_tbl, __osm_state_mgr_get_node_desc, sm) > + CL_PLOCK_RELEASE(sm->p_lock); > + > /* now scan the list of physical ports that were not down but have no remote port */ > CL_PLOCK_ACQUIRE(sm->p_lock); > p_next = cl_qmap_head(&sm->p_subn->node_guid_tbl); > -- > 1.5.4.5 Sasha From iuzzolin at nmia.com Sun Aug 17 14:45:49 2008 From: iuzzolin at nmia.com (Harold/Carlyn Iuzzolino) Date: Sun, 17 Aug 2008 15:45:49 -0600 Subject: [ofa-general] In subroutine dat_strerror.c NULL isn't defined. Message-ID: <200808172145.m7HLjnH07740@gandalf.iuzzolino.com> Dear Openfabrics Newsgroup, I sent this last night right before all the spam emails started to arrive. 
If you did what I did to get rid of them, hold down the D key, my message probably got deleted too. So here it is again. Carlyn ----------------------- Dear Openfabrics and Arlin Davis general at lists.openfabrics.org arlin.r.davis at intel.com On August 6th I reported a bug, Number 1120, that In subroutine dat_strerror.c NULL isn't defined. Which of the .h files, dat_dr.h, dat_sr.h, , do you need to include in dat_strerror.c so that NULL gets defined? I am using a Fedora 9, Athlon 64X4 bit machine. [root at treebeard OFED-1.4-20080816-0600]# cat /etc/issue Fedora release 9 (Sulphur) Kernel \r on an \m (\l) [root at treebeard OFED-1.4-20080816-0600]# uname -a Linux treebeard 2.6.25-14.fc9.x86_64 #1 SMP Thu May 1 06:06:21 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux As far as I can tell, this bug hasn't been fixed in the last 4 versions and because of the way the OFED software is packaged, ie a tar file of rpms which have tar.gz files inside, I can't try to fix it myself and then run install.pl again. [root at treebeard root]# md5sum `ff compat-dapl-1.2.8.tar.gz` 39a825669913ff4622ee2b120a9a7409 ./OFED-1.4-20080816-0600.var.tmp/OFED_topdir/SOURCES/compat-dapl-1.2.8.tar.gz 39a825669913ff4622ee2b120a9a7409 ./OFED-1.4-20080813-0600.var.tmp/OFED_topdir/SOURCES/compat-dapl-1.2.8.tar.gz 39a825669913ff4622ee2b120a9a7409 ./OFED-1.4-20080803-0600.var.tmp/OFED_topdir/SOURCES/compat-dapl-1.2.8.tar.gz 39a825669913ff4622ee2b120a9a7409 ./OFED-1.4-20080811-0819.var.tmp/OFED_topdir/SOURCES/compat-dapl-1.2.8.tar.gz I downloaded version OFED-1.4-20080813-0600.tar.gz and then later, OFED-1.4-20080816-0600.tar.gz and tried installing it/them. No luck. Error in the same spot. 
Build compat-dapl RPM Running rpmbuild --rebuild --define '_topdir /var/tmp/OFED_topdir' --define 'dist ' --target x86_64 --define '_prefix /usr' --define '_exec_prefix /usr' --define '_sysconfdir /etc' --define '_defaultdocdir /usr/share/doc/compat-dapl-1.2.8' --define '_usr /usr' /root/OFED-1.4-20080813-0600/SRPMS/compat-dapl-1.2.8-1.src.rpm Failed to build compat-dapl RPM See /tmp/OFED.32320.logs/compat-dapl.rpmbuild.log gcc -DHAVE_CONFIG_H -I. -I. -I. -Wall -g -D_GNU_SOURCE -DOS_RELEASE=131078 -I./dat/include/ -I./dat/udat/ -I./dat/udat/linux -I./dat/common/ -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -MT dat_udat_libdat_la-dat_dictionary.lo -MD -MP -MF .deps/dat_udat_libdat_la-dat_dictionary.Tpo -c dat/common/dat_dictionary.c -o dat_udat_libdat_la-dat_dictionary.o >/dev/null 2>&1 dat/common/dat_strerror.c: In function 'dat_strerror': dat/common/dat_strerror.c:621: error: 'NULL' undeclared (first use in this function) dat/common/dat_strerror.c:621: error: (Each undeclared identifier is reported only once dat/common/dat_strerror.c:621: error: for each function it appears in.) make[2]: *** [dat_udat_libdat_la-dat_strerror.lo] Error 1 Because of the way the install.pl script works, I have to wait for someone to fix it, and make a new OFED-1.4-200808xx-0600.tar.gz. I am puzzled that this problem hasn't occurred for everyone who tries to compile. But, please could someone fix this. I can't continue compiling until it is fixed. Carlyn Iuzzolino From iuzzolin at nmia.com Sun Aug 17 14:46:57 2008 From: iuzzolin at nmia.com (Harold/Carlyn Iuzzolino) Date: Sun, 17 Aug 2008 15:46:57 -0600 Subject: [ofa-general] How do you repackage individual source rpm files? Message-ID: <200808172146.m7HLkve07754@gandalf.iuzzolino.com> Dear Openfabrics Newsgroup, I sent this last night right before all the spam emails started to arrive. 
If you did what I did to get rid of them, hold down the D key, my message probably got deleted too. So here it is again. Carlyn ----------------------- Dear Openfabrics, general at lists.openfabrics.org On August 5th, I reported a bug in the compat-dapl-1.2.8.tar.gz package, in particular: dat/common/dat_strerror.c: In function 'dat_strerror': dat/common/dat_strerror.c:621: error: 'NULL' undeclared (first use in this function) It is now August 17th and the bug is STILL not fixed. And I'm stuck. Because of the way the OFED-1.4-20080816-0600.tgz file is created and the way install.pl works, I can't just make a fix to the dat_strerror.c file and try again. I have to wait until one of you fixes it and makes a new OFED-1.4-20080816-0600.tgz file. Is there any way for someone official to show us how to repackage the OFED-1.4-20080816-0600.tgz file? ----------------- Inside OFED-1.4-20080816-0600.tgz, is a directory SRPMS/ with lots of *.src.rpm's. Inside the offending compat-dapl-1.2.8-1.src.rpm is a tar file that contains the sources, in particular the dat/common/dat_strerror.c file. [root at treebeard root]# rpm -qilp OFED-1.4-20080816-0600/SRPMS/compat-dapl-1.2.8-1.src.rpm Name : compat-dapl Relocations: (not relocatable) Version : 1.2.8 Vendor: (none) Release : 1 Build Date: Sat 16 Aug 2008 07:17:58 AM MDT Install Date: (not installed) Build Host: hosting.openfabrics.org Group : System Environment/Libraries Source RPM: (none) Size : 675045 License: Dual GPL/BSD/CPL Signature : (none) URL : http://openfabrics.org/ Summary : A Library for userspace access to RDMA devices using OS Agnostic DAT API v1.2. Description : Along with the OpenFabrics kernel drivers, libdat and libdapl provides a userspace RDMA API that supports DAT 1.2 specification and IB transport extensions for atomic operations and rdma write with immediate data. compat-dapl-1.2.8.tar.gz <----------------------- dapl.spec <------------ the rpm spec file??? 
If I want to try to fix the subroutine dat/common/dat_strerror.c I would have to 1. Extract the compat-dapl-1.2.8-1.src.rpm somewhere 2. Change the file compat-dapl-1.2.8/dat/common/dat_strerror.c 3. tar together the directory compat-dapl-1.2.8 4. Recreate the rpm file. I know how to 'tar cvfz compat-dapl-1.2.8.tgz compat-dapl-1.2.8' But I don't know how to create the rpm file. I assume the dapl.spec file is the rpm spec file. How do you use it in creating the rpm file? That way I could at least make an attempt to fix compile errors and continue compiling while I wait for the OFED team fix the error in the correct way and give us the new OFED-1.4-20080816-0600.tgz. Carlyn Iuzzolino From bboas at systemfabricworks.com Sun Aug 17 16:18:35 2008 From: bboas at systemfabricworks.com (Bill Boas) Date: Sun, 17 Aug 2008 16:18:35 -0700 Subject: [ofa-general] RE: [ewg] STOP the onslaught of EWG spam In-Reply-To: References: Message-ID: <000c01c900bf$93b6bcf0$f592fea9@BillGWAYLAPTOP> Thank you very much Jeff, and thank you for taking this action on the other mail lists which were probably just as vulnerable - I has over 3400 "spam" messages - did others have as many, or was I targeted? :-)!!! Bill Boas VP, Business Development System Fabric Works 510-375-8840 bboas at systemfabricworks.com www.systemfabricworks.com -----Original Message----- From: ewg-bounces at lists.openfabrics.org [mailto:ewg-bounces at lists.openfabrics.org] On Behalf Of Jeff Squyres Sent: Sunday, August 17, 2008 5:21 AM To: OpenFabrics EWG; OpenFabrics General Subject: [ewg] STOP the onslaught of EWG spam The EWG list has gotten spam bombed over the last few hours. I lost count at 500+ spams in my inbox. I therefore logged into openfabrics.org and changed the site-wide password for Mailman (I have notified Jeff Becker of the new password). I then changed the EWG list to silently discard all non- member posts. 
Since I didn't know if other OF lists were being spam-bombed, I did the same for all OF lists as well. The spam onslaught has now stopped. I also notice that our Mailman installation is hopelessly out of date; it's v2.1.5 and the current version (including several important security fixes since v2.1.5) is v2.1.11. Someone needs to fix this ASAP. -- Jeff Squyres Cisco Systems _______________________________________________ ewg mailing list ewg at lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg From rdreier at cisco.com Sun Aug 17 17:31:05 2008 From: rdreier at cisco.com (Roland Dreier) Date: Sun, 17 Aug 2008 17:31:05 -0700 Subject: [ofa-general] [PATCH v2] ipiob: fix rtnl deadlock In-Reply-To: <32cb786f0808161218o417553b5w1738a517f0eb468a@mail.gmail.com> (Yosef Etigin's message of "Sat, 16 Aug 2008 22:18:50 +0300") References: <4899CF0A.1060509@Voltaire.COM> <32cb786f0808081155o19f8fb9dm217cd6996dffa3e5@mail.gmail.com> <32cb786f0808090538j272842b1r5117547cccde0d06@mail.gmail.com> <32cb786f0808161218o417553b5w1738a517f0eb468a@mail.gmail.com> Message-ID: > What if you bring the device down, while you get a join completion event? > ipoib_stop() can run in parallel with ipoib_mcast_join_complete(), and you > will just wait for ipoib_stop() to finish to do netif_carrier_on() afterwards. Yes, but after ipoib_stop() finishes, netif_carrier_on() doesn't do anything that could cause a problem, since the netdev is down. - R. 
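The ordering argument in this exchange can be modeled in userspace. The following is only a sketch under stated assumptions, not kernel code: struct fake_netdev and the fake_* functions are hypothetical, a pthread mutex plays the role of the RTNL lock, and the model assumes that stopping the device clears both the administrative state and the carrier flag. It does not reproduce what the real netif_carrier_on() does internally. It only illustrates why a join completion that has to wait for the stop path, and then sets carrier on afterwards, leaves a device that is still not usable.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace model of a network device; not kernel code. */
struct fake_netdev {
	pthread_mutex_t cfg_lock;	/* stands in for the RTNL lock */
	bool admin_up;			/* set by open, cleared by stop */
	bool carrier;			/* set by the join completion */
};

static void fake_init(struct fake_netdev *dev)
{
	pthread_mutex_init(&dev->cfg_lock, NULL);
	dev->admin_up = false;
	dev->carrier = false;
}

static void fake_open(struct fake_netdev *dev)
{
	pthread_mutex_lock(&dev->cfg_lock);
	dev->admin_up = true;
	pthread_mutex_unlock(&dev->cfg_lock);
}

/* Assumption of this model: stop clears both admin state and carrier. */
static void fake_stop(struct fake_netdev *dev)
{
	pthread_mutex_lock(&dev->cfg_lock);
	dev->admin_up = false;
	dev->carrier = false;
	pthread_mutex_unlock(&dev->cfg_lock);
}

/* Models the completion handler taking the lock before carrier on,
 * so it is serialized against fake_stop(). */
static void fake_join_complete(struct fake_netdev *dev)
{
	pthread_mutex_lock(&dev->cfg_lock);
	dev->carrier = true;
	pthread_mutex_unlock(&dev->cfg_lock);
}

/* The link is usable only when the device is both up and has carrier. */
static bool fake_link_usable(struct fake_netdev *dev)
{
	bool usable;

	pthread_mutex_lock(&dev->cfg_lock);
	usable = dev->admin_up && dev->carrier;
	pthread_mutex_unlock(&dev->cfg_lock);
	return usable;
}
```

Calling fake_stop() followed by a late fake_join_complete() leaves carrier set but fake_link_usable() false, which is the situation described above: the late carrier-on happens, but the device is down, so nothing can use it.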
From rdreier at cisco.com Sun Aug 17 17:32:46 2008 From: rdreier at cisco.com (Roland Dreier) Date: Sun, 17 Aug 2008 17:32:46 -0700 Subject: [ofa-general] [PATCH v2] ipiob: fix rtnl deadlock In-Reply-To: <20080817104522.GA23302@mtls03> (Eli Cohen's message of "Sun, 17 Aug 2008 13:45:22 +0300") References: <4899CF0A.1060509@Voltaire.COM> <32cb786f0808081155o19f8fb9dm217cd6996dffa3e5@mail.gmail.com> <32cb786f0808090538j272842b1r5117547cccde0d06@mail.gmail.com> <20080817104522.GA23302@mtls03> Message-ID: > > register_failed: > > ib_unregister_event_handler(&priv->event_handler); > > - flush_scheduled_work(); > > + flush_workqueue(ipoib_workqueue); > > > > I don't find any flaw in this approach, but I don't understand why is > the flush_workqueue(ipoib_workqueue) above needed. It's mostly fixing the old code... but it makes sense to me, since this is the error path unwinding things after we registered an event handler. So an IB async event could have occurred and caused us to schedule work for this netdevice, and we should wait for that scheduled work before freeing the netdevice. It may be worth auditing whether we shouldn't register the event handler until later, since it might cause problems to handle an async event before registering a netdev... - R. From sashak at voltaire.com Mon Aug 18 01:14:45 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 18 Aug 2008 11:14:45 +0300 Subject: [ofa-general] Re: [ewg] STOP the onslaught of EWG spam In-Reply-To: <9910A595EF5F194ABCC11F517D3A76A704CC5B21@azsmsx415.amr.corp.intel.com> References: <9910A595EF5F194ABCC11F517D3A76A704CC5B21@azsmsx415.amr.corp.intel.com> Message-ID: <20080818081445.GX2339@sashak.voltaire.com> On 22:18 Sun 17 Aug , Bhuiyan, Lutfor wrote: > I got 4000+ spam emails with "ewg" tag. BTW, all those messages were addressed an old mailing list - ewg at openib.org. Probably it is enough to just drop this address at all? (Personally I added a new rule in my .procmailrc already :)). 
Sasha From anuj01 at gmail.com Mon Aug 18 05:17:11 2008 From: anuj01 at gmail.com (अनुज) Date: Mon, 18 Aug 2008 17:47:11 +0530 Subject: [ofa-general] ibv_post_send implementation without doorbell Message-ID: Hi, I have found the fast path for the implementation of ibv_post_send(), which uses mthca_tavor_post_send() or mthca_arbel_post_send(), defined in libmthca/src/qp.c. libibverbs also defines ibv_cmd_post_send(), which can be used for the slow path. For other user verbs, the corresponding ibv_cmd_* APIs are used. But it seems that libmthca provides no slow path support. Please elaborate on this point and explain how the slow path can be used for ibv_post_send(). With Regards, Anuj Aggarwal From rdreier at cisco.com Mon Aug 18 08:10:00 2008 From: rdreier at cisco.com (Roland Dreier) Date: Mon, 18 Aug 2008 08:10:00 -0700 Subject: [ofa-general] ibv_post_send implementation without doorbell In-Reply-To: ("अनुज"'s message of "Mon, 18 Aug 2008 17:47:11 +0530") References: Message-ID: > But it seems that libmthca provides no slow path support. Right, because the pure userspace fast path implementation is all that is required for Mellanox HCAs. > Please elaborate on this point and explain how the slow path can be used for > ibv_post_send(). Look at libipathverbs -- I believe that still uses it. - R. From gopalakk at cse.ohio-state.edu Mon Aug 18 08:48:48 2008 From: gopalakk at cse.ohio-state.edu (Karthik Gopalakrishnan) Date: Mon, 18 Aug 2008 11:48:48 -0400 Subject: [ofa-general] ***SPAM*** Error returned by ibv_poll_cq() Message-ID: <92eddfb50808180848w2964eb54x143572119bec805c@mail.gmail.com> Hello. ibv_poll_cq() returns IBV_WC_REM_ACCESS_ERR or IBV_WC_WR_FLUSH_ERR in wc.status and IBV_WC_SEND in wc.opcode. The async event handler simultaneously reports IBV_EVENT_QP_ACCESS_ERR. I could not find a description of these errors either in the man pages or in verbs.h. 
I would be grateful if someone can tell me under what circumstances ibv_poll_cq() would return those errors. I am using the following adapter. =========================== CA 'mthca0' CA type: MT25208 Number of ports: 2 Firmware version: 5.1.400 Hardware version: a0 Node GUID: 0x0002c9020023c078 System image GUID: 0x0002c9020023c07b Port 1: State: Active Physical state: LinkUp Rate: 20 Base lid: 67 LMC: 0 SM lid: 156 Capability mask: 0x02510a68 Port GUID: 0x0002c9020023c079 Port 2: State: Down Physical state: Polling Rate: 10 Base lid: 0 LMC: 0 SM lid: 0 Capability mask: 0x02510a68 Port GUID: 0x0002c9020023c07a =========================== Thanks & Regards, Karthik From dotanba at gmail.com Mon Aug 18 11:22:15 2008 From: dotanba at gmail.com (Dotan Barak) Date: Mon, 18 Aug 2008 21:22:15 +0300 Subject: [ofa-general] ***SPAM*** Error returned by ibv_poll_cq() In-Reply-To: <92eddfb50808180848w2964eb54x143572119bec805c@mail.gmail.com> References: <92eddfb50808180848w2964eb54x143572119bec805c@mail.gmail.com> Message-ID: <2f3bf9a60808181122w351835cdxef71792f75d6d354@mail.gmail.com> Hi. Did you try to perform (unsuccessful) RDMA operation to this side? Dotan On Mon, Aug 18, 2008 at 6:48 PM, Karthik Gopalakrishnan wrote: > Hello. > > ibv_poll_cq() returns IBV_WC_REM_ACCESS_ERR or IBV_WC_WR_FLUSH_ERR in > wc.status and IBV_WC_SEND in wc.opcode. The async event handler > simultaneously reports IBV_EVENT_QP_ACCESS_ERR. I could not find a > description of these errors either in the man pages or in verbs.h. I > would be grateful if someone can tell me under what circumstances > ibv_poll_cq() would return those errors. > > I am using the following adapter. 
> =========================== > CA 'mthca0' > CA type: MT25208 > Number of ports: 2 > Firmware version: 5.1.400 > Hardware version: a0 > Node GUID: 0x0002c9020023c078 > System image GUID: 0x0002c9020023c07b > Port 1: > State: Active > Physical state: LinkUp > Rate: 20 > Base lid: 67 > LMC: 0 > SM lid: 156 > Capability mask: 0x02510a68 > Port GUID: 0x0002c9020023c079 > Port 2: > State: Down > Physical state: Polling > Rate: 10 > Base lid: 0 > LMC: 0 > SM lid: 0 > Capability mask: 0x02510a68 > Port GUID: 0x0002c9020023c07a > =========================== > > Thanks & Regards, > Karthik > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > From gopalakk at cse.ohio-state.edu Mon Aug 18 12:08:57 2008 From: gopalakk at cse.ohio-state.edu (Karthik Gopalakrishnan) Date: Mon, 18 Aug 2008 15:08:57 -0400 Subject: [ofa-general] ***SPAM*** Error returned by ibv_poll_cq() In-Reply-To: <2f3bf9a60808181122w351835cdxef71792f75d6d354@mail.gmail.com> References: <92eddfb50808180848w2964eb54x143572119bec805c@mail.gmail.com> <2f3bf9a60808181122w351835cdxef71792f75d6d354@mail.gmail.com> Message-ID: <92eddfb50808181208t1a7ccec9h81569ffebaab9a0a@mail.gmail.com> Yes. There was an unsuccessful RDMA operation to this side. On Mon, Aug 18, 2008 at 2:22 PM, Dotan Barak wrote: > Hi. > > Did you try to perform (unsuccessful) RDMA operation to this side? > > Dotan > > On Mon, Aug 18, 2008 at 6:48 PM, Karthik Gopalakrishnan > wrote: >> Hello. >> >> ibv_poll_cq() returns IBV_WC_REM_ACCESS_ERR or IBV_WC_WR_FLUSH_ERR in >> wc.status and IBV_WC_SEND in wc.opcode. The async event handler >> simultaneously reports IBV_EVENT_QP_ACCESS_ERR. I could not find a >> description of these errors either in the man pages or in verbs.h. 
I >> would be grateful if someone can tell me under what circumstances >> ibv_poll_cq() would return those errors. >> >> I am using the following adapter. >> =========================== >> CA 'mthca0' >> CA type: MT25208 >> Number of ports: 2 >> Firmware version: 5.1.400 >> Hardware version: a0 >> Node GUID: 0x0002c9020023c078 >> System image GUID: 0x0002c9020023c07b >> Port 1: >> State: Active >> Physical state: LinkUp >> Rate: 20 >> Base lid: 67 >> LMC: 0 >> SM lid: 156 >> Capability mask: 0x02510a68 >> Port GUID: 0x0002c9020023c079 >> Port 2: >> State: Down >> Physical state: Polling >> Rate: 10 >> Base lid: 0 >> LMC: 0 >> SM lid: 0 >> Capability mask: 0x02510a68 >> Port GUID: 0x0002c9020023c07a >> =========================== >> >> Thanks & Regards, >> Karthik >> _______________________________________________ >> general mailing list >> general at lists.openfabrics.org >> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general >> >> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general >> > From sashak at voltaire.com Mon Aug 18 13:03:56 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 18 Aug 2008 23:03:56 +0300 Subject: [ofa-general] opensm error when running ibsim In-Reply-To: <48A141A9.4090707@bull.net> References: <48A141A9.4090707@bull.net> Message-ID: <20080818200356.GH27204@sashak.voltaire.com> Hi Vincent, On 09:54 Tue 12 Aug , Vincent Ficet wrote: > > However, no other instance of opensm was running at that time. Digging > further, I realised that commenting out the 'guid' entry in opensm.conf > fixed this issue. You can put 'guid 0' there as well. > ==> Shouldn't opensm have some mechanism that excludes this parameter when > a connection to the sim:ctl socket is made ? No. OpenSM and lower layers (libibumad) cannot know that it runs under simulation mode, that is the whole point of libumad2sim preloading. 
Sasha From sashak at voltaire.com Mon Aug 18 13:08:47 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 18 Aug 2008 23:08:47 +0300 Subject: [ofa-general] Re: opensm error when running ibsim In-Reply-To: <48A17C2E.7000504@bull.net> References: <829ded920808120455n40a8434fje0813501e996870b@mail.gmail.com> <48A17C2E.7000504@bull.net> Message-ID: <20080818200847.GI27204@sashak.voltaire.com> On 14:03 Tue 12 Aug , Vincent Ficet wrote: >> How does "opensm.conf " came into picture when you are running >> 'opensm' from command line ? >> > The default config files related defines come from > management/opensm/include/config.h, which is generated by autoconf. BTW, you can also use '-F ' option. Also as far as I remember the config file itself is not generated by build and not installed. So likely you created it by hand (or with -C option), no? Sasha From sashak at voltaire.com Mon Aug 18 13:17:18 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 18 Aug 2008 23:17:18 +0300 Subject: [ofa-general] Re: [PATCH] ibsim: Add support for vendor ID and system image GUID In-Reply-To: <48A30108.4010307@obsidianresearch.com> References: <48A30108.4010307@obsidianresearch.com> Message-ID: <20080818201718.GJ27204@sashak.voltaire.com> Hi Hal, On 09:43 Wed 13 Aug , Hal Rosenstock wrote: > ibsim: Add support for vendor ID and system image GUID > > Signed-off-by: Hal Rosenstock > --- > > diff --git a/ibsim/sim_cmd.c b/ibsim/sim_cmd.c > index 6bf1e29..ef99db8 100644 > --- a/ibsim/sim_cmd.c > +++ b/ibsim/sim_cmd.c > @@ -483,7 +483,8 @@ static int dump_net(FILE * f, char *line) > fprintf(f, "\n%s %d \"%s\"", > node_type_name(node->type), > node->numports, node->nodeid); > - fprintf(f, "\tnodeguid %" PRIx64 "\n", node->nodeguid); > + fprintf(f, "\tnodeguid %" PRIx64 "\tsysimgguid %" PRIx64 "\n", > + node->nodeguid, node->sysguid); > > nports = node->numports; > if (node->type == SWITCH_NODE) { > diff --git a/ibsim/sim_net.c b/ibsim/sim_net.c > index 6e3c0e9..146bcde 
100644 > --- a/ibsim/sim_net.c > +++ b/ibsim/sim_net.c > @@ -190,7 +190,9 @@ char (*aliases)[NODEIDLEN + NODEPREFIX + 1]; // aliases map format: "%s@%s" > > int netnodes, netswitches, netports, netaliases; > char netprefix[NODEPREFIX + 1]; > +int netvendid; > int netdevid; > +uint64_t netsysimgguid; > int netwidth = DEFAULT_LINKWIDTH; > int netspeed = DEFAULT_LINKSPEED; > > @@ -324,11 +326,12 @@ static Node *new_node(int type, char *nodename, char *nodedesc, int nodeports) > } > > mad_set_field(nd->nodeinfo, 0, IB_NODE_NPORTS_F, nd->numports); > + mad_set_field(nd->nodeinfo, 0, IB_NODE_VENDORID_F, netvendid); > mad_set_field(nd->nodeinfo, 0, IB_NODE_DEVID_F, netdevid); > > mad_encode_field(nd->nodeinfo, IB_NODE_GUID_F, &nd->nodeguid); > mad_encode_field(nd->nodeinfo, IB_NODE_PORT_GUID_F, &nd->nodeguid); > - mad_encode_field(nd->nodeinfo, IB_NODE_SYSTEM_GUID_F, &nd->nodeguid); > + mad_encode_field(nd->nodeinfo, IB_NODE_SYSTEM_GUID_F, &netsysimgguid); And when netsysimgguid was not parsed for this node, it will put previous value there (or "0" if it was never parsed)? 
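One possible way to avoid carrying a stale value over, as a rough sketch only: fall back to the node GUID when no sysimgguid was parsed for this node, and clear the parsed value once it has been used. The helper name pick_sysimgguid() is hypothetical and not part of ibsim, and the sketch assumes that 0 is never a valid system image GUID.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of ibsim): choose the system image GUID
 * to encode for a newly created node.
 *
 * *parsed holds the value of the most recent "sysimgguid=" line, or 0 if
 * none was seen since the previous node (assumption: 0 means "unset").
 */
static uint64_t pick_sysimgguid(uint64_t *parsed, uint64_t nodeguid)
{
	/* Use the parsed value if present, otherwise the node's own GUID. */
	uint64_t guid = *parsed ? *parsed : nodeguid;

	/* Consume the value so the next node does not inherit it. */
	*parsed = 0;
	return guid;
}
```

In new_node() the result would be stored somewhere per node and passed to mad_encode_field() in place of the global, so that a node without its own sysimgguid keeps the old behaviour (system image GUID equal to node GUID).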
Sasha > > if ((nd->portsbase = new_ports(nd, nodeports, firstport)) < 0) { > IBWARN("can't alloc %d ports for node %s", nodeports, > @@ -805,6 +808,20 @@ static int parse_guidbase(int fd, char *line, int type) > return 1; > } > > +static int parse_vendid(int fd, char *line) > +{ > + char *s; > + > + if (!(s = strchr(line, '='))) { > + IBWARN("bad assignment: missing '=' sign"); > + return -1; > + } > + > + netvendid = strtol(s + 1, 0, 0); > + > + return 1; > +} > + > static int parse_devid(int fd, char *line) > { > char *s; > @@ -819,6 +836,20 @@ static int parse_devid(int fd, char *line) > return 1; > } > > +static uint64_t parse_sysimgguid(int fd, char *line) > +{ > + char *s; > + > + if (!(s = strchr(line, '='))) { > + IBWARN("bad assignment: missing '=' sign"); > + return -1; > + } > + > + netsysimgguid = strtoull(s + 1, 0, 0); > + > + return 1; > +} > + > static int parse_width(int fd, char *line) > { > char *s; > @@ -935,8 +966,12 @@ static int parse_netconf(int fd, FILE * out) > r = parse_guidbase(fd, line, HCA_NODE); > else if (!strncmp(line, "rtguid", 6)) > r = parse_guidbase(fd, line, ROUTER_NODE); > + else if (!strncmp(line, "vendid", 6)) > + r = parse_vendid(fd, line); > else if (!strncmp(line, "devid", 5)) > r = parse_devid(fd, line); > + else if (!strncmp(line, "sysimgguid", 10)) > + r = parse_sysimgguid(fd, line); > else if (!strncmp(line, "width", 5)) > r = parse_width(fd, line); > else if (!strncmp(line, "speed", 5)) From sashak at voltaire.com Mon Aug 18 13:20:11 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 18 Aug 2008 23:20:11 +0300 Subject: [ofa-general] Re: [IBSIM][Trivial] initialize netstarted to 0 In-Reply-To: <1218647086.16508.566.camel@cardanus.llnl.gov> References: <1218647086.16508.566.camel@cardanus.llnl.gov> Message-ID: <20080818202011.GL27204@sashak.voltaire.com> Hi Al, On 10:04 Wed 13 Aug , Al Chu wrote: > From c94d427feb3f8ca9f7b339dab618e61b04c41914 Mon Sep 17 00:00:00 2001 > From: Albert Chu > Date: Wed, 13 Aug 
2008 09:54:43 -0700 > Subject: [PATCH] initialize netstarted to 0 > > > Signed-off-by: Albert Chu > --- > ibsim/sim_cmd.c | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/ibsim/sim_cmd.c b/ibsim/sim_cmd.c > index 6bf1e29..e4591d1 100644 > --- a/ibsim/sim_cmd.c > +++ b/ibsim/sim_cmd.c > @@ -752,7 +752,7 @@ static int do_disconnect_client(FILE * out, int id) > return 0; > } > > -int netstarted; > +int netstarted = 0; It is global variable, should be initialized by default. Sasha From sashak at voltaire.com Mon Aug 18 13:22:55 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 18 Aug 2008 23:22:55 +0300 Subject: [ofa-general] Re: [IBSIM][Trivial] document 'start' command In-Reply-To: <1218647088.16508.567.camel@cardanus.llnl.gov> References: <1218647088.16508.567.camel@cardanus.llnl.gov> Message-ID: <20080818202255.GM27204@sashak.voltaire.com> On 10:04 Wed 13 Aug , Al Chu wrote: > From c83408de7be662b08fa0ac10d759ca46515f9e21 Mon Sep 17 00:00:00 2001 > From: Albert Chu > Date: Wed, 13 Aug 2008 09:57:16 -0700 > Subject: [PATCH] document start command > > > Signed-off-by: Albert Chu > --- > ibsim/sim_cmd.c | 1 + > 1 files changed, 1 insertions(+), 0 deletions(-) > > diff --git a/ibsim/sim_cmd.c b/ibsim/sim_cmd.c > index e4591d1..1f6ba88 100644 > --- a/ibsim/sim_cmd.c > +++ b/ibsim/sim_cmd.c > @@ -731,6 +731,7 @@ static int dump_help(FILE * f) > ); > fprintf(f, > "\tBaselid \"nodeid\"[port] [lmc] : change port's lid (lmc)\n"); > + fprintf(f, "\tStart - start the simulator if currently inactive\n"); It is already there - second line of this help message. 
Sasha From dotanba at gmail.com Mon Aug 18 14:36:32 2008 From: dotanba at gmail.com (Dotan Barak) Date: Mon, 18 Aug 2008 23:36:32 +0200 Subject: [ofa-general] ***SPAM*** Error returned by ibv_poll_cq() In-Reply-To: <92eddfb50808181208t1a7ccec9h81569ffebaab9a0a@mail.gmail.com> References: <92eddfb50808180848w2964eb54x143572119bec805c@mail.gmail.com> <2f3bf9a60808181122w351835cdxef71792f75d6d354@mail.gmail.com> <92eddfb50808181208t1a7ccec9h81569ffebaab9a0a@mail.gmail.com> Message-ID: <48A9EB60.9060507@gmail.com> Karthik Gopalakrishnan wrote: > Yes. There was an unsuccessful RDMA operation to this side. > You should try to check that the r_key + address + size of the send request that you posted is matching a valid MR with the right permissions and size ... Dotan > On Mon, Aug 18, 2008 at 2:22 PM, Dotan Barak wrote: > >> Hi. >> >> Did you try to perform (unsuccessful) RDMA operation to this side? >> >> Dotan >> >> On Mon, Aug 18, 2008 at 6:48 PM, Karthik Gopalakrishnan >> wrote: >> >>> Hello. >>> >>> ibv_poll_cq() returns IBV_WC_REM_ACCESS_ERR or IBV_WC_WR_FLUSH_ERR in >>> wc.status and IBV_WC_SEND in wc.opcode. The async event handler >>> simultaneously reports IBV_EVENT_QP_ACCESS_ERR. I could not find a >>> description of these errors either in the man pages or in verbs.h. I >>> would be grateful if someone can tell me under what circumstances >>> ibv_poll_cq() would return those errors. >>> >>> I am using the following adapter. 
>>> =========================== >>> CA 'mthca0' >>> CA type: MT25208 >>> Number of ports: 2 >>> Firmware version: 5.1.400 >>> Hardware version: a0 >>> Node GUID: 0x0002c9020023c078 >>> System image GUID: 0x0002c9020023c07b >>> Port 1: >>> State: Active >>> Physical state: LinkUp >>> Rate: 20 >>> Base lid: 67 >>> LMC: 0 >>> SM lid: 156 >>> Capability mask: 0x02510a68 >>> Port GUID: 0x0002c9020023c079 >>> Port 2: >>> State: Down >>> Physical state: Polling >>> Rate: 10 >>> Base lid: 0 >>> LMC: 0 >>> SM lid: 0 >>> Capability mask: 0x02510a68 >>> Port GUID: 0x0002c9020023c07a >>> =========================== >>> >>> Thanks & Regards, >>> Karthik >>> _______________________________________________ >>> general mailing list >>> general at lists.openfabrics.org >>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general >>> >>> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general >>> >>> From sashak at voltaire.com Mon Aug 18 13:39:46 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 18 Aug 2008 23:39:46 +0300 Subject: [ofa-general] Re: [IBSIM] Parse sim cmds by name not first character In-Reply-To: <1218661626.16508.579.camel@cardanus.llnl.gov> References: <1218661626.16508.579.camel@cardanus.llnl.gov> Message-ID: <20080818203946.GN27204@sashak.voltaire.com> Hi Al, On 14:07 Wed 13 Aug , Al Chu wrote: > > I was looking into adding a new command to ibsim, Cool :) > but since the original > cmd-parsing function only checks for the first char of the inputted > command, it limits the ability to add a reasonable-sounding new command > name. The patch changes the function to check the entire command name. This would break a lot of existing test scripts, so basically I disagree with such radical change. 
However we could do something less destructive and achieve your goal - lets compare string partially as provided: strncasecmp(line, "Dump", strlen(line)) (or 'strncasecmp(line, "Dump", cmd_len)' if you don't want to put '\0' into line buffer) Looks fine? Also some comment is below. [snip...] > From 5871b81d1ebdf86f9a9fcf79c8d8a558fd2600b1 Mon Sep 17 00:00:00 2001 > From: Albert Chu > Date: Wed, 13 Aug 2008 13:53:14 -0700 > Subject: [PATCH] parse sim cmds via full name > > > Signed-off-by: Albert Chu > --- > ibsim/sim_cmd.c | 105 +++++++++++++++++++++---------------------------------- > 1 files changed, 40 insertions(+), 65 deletions(-) > > diff --git a/ibsim/sim_cmd.c b/ibsim/sim_cmd.c > index 1f6ba88..a35d0f4 100644 > --- a/ibsim/sim_cmd.c > +++ b/ibsim/sim_cmd.c > @@ -757,91 +757,66 @@ int netstarted = 0; > > int do_cmd(char *buf, FILE *f) > { > + char cmdbuf[4096]; Why this huge buffer? Actually we don't need to copy there just find appropriate length for comparison, no? Sasha From chu11 at llnl.gov Mon Aug 18 13:49:35 2008 From: chu11 at llnl.gov (Al Chu) Date: Mon, 18 Aug 2008 13:49:35 -0700 Subject: [ofa-general] Re: [IBSIM][Trivial] document 'start' command In-Reply-To: <20080818202255.GM27204@sashak.voltaire.com> References: <1218647088.16508.567.camel@cardanus.llnl.gov> <20080818202255.GM27204@sashak.voltaire.com> Message-ID: <1219092575.29252.82.camel@cardanus.llnl.gov> On Mon, 2008-08-18 at 23:22 +0300, Sasha Khapyorsky wrote: > On 10:04 Wed 13 Aug , Al Chu wrote: > > From c83408de7be662b08fa0ac10d759ca46515f9e21 Mon Sep 17 00:00:00 2001 > > From: Albert Chu > > Date: Wed, 13 Aug 2008 09:57:16 -0700 > > Subject: [PATCH] document start command > > > > > > Signed-off-by: Albert Chu > > --- > > ibsim/sim_cmd.c | 1 + > > 1 files changed, 1 insertions(+), 0 deletions(-) > > > > diff --git a/ibsim/sim_cmd.c b/ibsim/sim_cmd.c > > index e4591d1..1f6ba88 100644 > > --- a/ibsim/sim_cmd.c > > +++ b/ibsim/sim_cmd.c > > @@ -731,6 +731,7 @@ static int dump_help(FILE * 
f) > > ); > > fprintf(f, > > "\tBaselid \"nodeid\"[port] [lmc] : change port's lid (lmc)\n"); > > + fprintf(f, "\tStart - start the simulator if currently inactive\n"); > > It is already there - second line of this help message. Ahhh. I see it now. B/c there isn't a '-' or ':' to separate the command from the description, I guess I mentally skipped it. Al > Sasha -- Albert Chu chu11 at llnl.gov 925-422-5311 Computer Scientist High Performance Systems Division Lawrence Livermore National Laboratory From sashak at voltaire.com Mon Aug 18 13:49:27 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 18 Aug 2008 23:49:27 +0300 Subject: [ofa-general] Re: [PATCH] opensm/libvendor/Makefile.am: create symbolic link to osmvendor library In-Reply-To: <48A2F510.4030903@dev.mellanox.co.il> References: <48A2F510.4030903@dev.mellanox.co.il> Message-ID: <20080818204927.GO27204@sashak.voltaire.com> On 17:52 Wed 13 Aug , Yevgeny Kliteynik wrote: > Hi Sasha, > > Creating a symbolic link to the vendor library that denotes > which type of vendor it is. This is needed for ibutils/ibis. > This instal-exec-hook (among many others) was removed > by the following patch: > http://lists.openfabrics.org/pipermail/general/2008-July/052742.html > > Signed-off-by: Yevgeny Kliteynik > --- > opensm/libvendor/Makefile.am | 5 +++++ > 1 files changed, 5 insertions(+), 0 deletions(-) > > diff --git a/opensm/libvendor/Makefile.am b/opensm/libvendor/Makefile.am > index f72dbbe..a6dd0b9 100644 > --- a/opensm/libvendor/Makefile.am > +++ b/opensm/libvendor/Makefile.am > @@ -88,3 +88,8 @@ libosmvendorinclude_HEADERS = $(HDRS) > > # headers are distributed as part of the include dir > EXTRA_DIST = $(srcdir)/libosmvendor.map $(srcdir)/libosmvendor.ver > + > +# Create a link to the installed vendor lib to > +# mark the type of the vendor library > +install-exec-hook: > + ln -sf $(DESTDIR)/$(libdir)/libosmvendor.so $(DESTDIR)/$(libdir)/libosmvendor_$(with_osmv).so It creates some mess in $(libdir). 
Those links are never cleaned and since only one libosmvendor.so can be installed there I don't see how this could be helpful. Also as far as I understand this link was used by ibis ./configure script only in case of ibmgtsim. Assuming so there are a couple of options to solve it without making such junk links: to add --with-sim option to ibis ./configure script, or just to create this link *only* in simulation environment from ibmgtsim build script. Sasha From sashak at voltaire.com Mon Aug 18 13:52:32 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 18 Aug 2008 23:52:32 +0300 Subject: [ofa-general] Re: [IBSIM][Trivial] document 'start' command In-Reply-To: <1219092575.29252.82.camel@cardanus.llnl.gov> References: <1218647088.16508.567.camel@cardanus.llnl.gov> <20080818202255.GM27204@sashak.voltaire.com> <1219092575.29252.82.camel@cardanus.llnl.gov> Message-ID: <20080818205232.GP27204@sashak.voltaire.com> On 13:49 Mon 18 Aug , Al Chu wrote: > > Ahhh. I see it now. B/c there isn't a '-' or ':' to separate the > command from the description, I guess I mentally skipped it. Yes, it is probably not 100% clear although consistent with other help lines, anyway feel free to improve this. Sasha From sashak at voltaire.com Mon Aug 18 13:57:22 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 18 Aug 2008 23:57:22 +0300 Subject: [ofa-general] [PATCH] opensm: remove USEGPPLINK hack Message-ID: <20080818205722.GQ27204@sashak.voltaire.com> For unknown reasons opensm, osmtest and libraries were linked with a hardcoded command using the g++ linker when vendor sim (ibmgtsim) was selected. We cannot find any reasonable explanation why it was done this way, OTOH with newer libtools it creates some issues on ibmgtsim side. - Remove this completely.
Signed-off-by: Sasha Khapyorsky --- opensm/opensm/Makefile.am | 10 ---------- opensm/osmtest/Makefile.am | 8 -------- 2 files changed, 0 insertions(+), 18 deletions(-) diff --git a/opensm/opensm/Makefile.am b/opensm/opensm/Makefile.am index 06c27cc..7ca4c2a 100644 --- a/opensm/opensm/Makefile.am +++ b/opensm/opensm/Makefile.am @@ -69,16 +69,6 @@ else opensm_CFLAGS = -Wall $(OSMV_CFLAGS) -fno-strict-aliasing -DVENDOR_RMPP_SUPPORT $(DBGFLAGS) -D_XOPEN_SOURCE=600 -D_BSD_SOURCE=1 endif -# for linking with the simulator client library we have to use g++: -if OSMV_SIM -USEGPPLINK = $(LIBTOOL) --mode=link g++ $(AM_CXXFLAGS) $(CXXFLAGS) $(AM_LDFLAGS) $(LDFLAGS) -o $@ -libopensm.la: $(libopensm_la_OBJECTS) $(libopensm_la_DEPENDENCIES) - $(USEGPPLINK) $(libopensm_la_LDFLAGS) $(libopensm_la_OBJECTS) $(libopensm_la_LIBADD) $(LIBS) -opensm$(EXEEXT): $(opensm_OBJECTS) $(opensm_DEPENDENCIES) - @rm -f opensm$(EXEEXT) - $(USEGPPLINK) $(opensm_OBJECTS) $(opensm_LDADD) $(LIBS) -endif - # we need to be able to load libraries from local build subtree before make install # we always give precedence to local tree libs and then use the pre-installed ones. opensm_LDADD = -L../complib -losmcomp -L../libvendor -losmvendor -L. 
-lopensm $(OSMV_LDADD) diff --git a/opensm/osmtest/Makefile.am b/opensm/osmtest/Makefile.am index 198c360..236cdcf 100644 --- a/opensm/osmtest/Makefile.am +++ b/opensm/osmtest/Makefile.am @@ -20,14 +20,6 @@ osmtest_CFLAGS = -Wall $(OSMV_CFLAGS) -DVENDOR_RMPP_SUPPORT $(DBGFLAGS) endif osmtest_LDADD = -L../complib -losmcomp -L../libvendor -losmvendor -L../opensm -lopensm $(OSMV_LDADD) -# for linking with the simulator client library we have to use g++: -if OSMV_SIM -USEGPPLINK = $(LIBTOOL) --mode=link g++ $(AM_CXXFLAGS) $(CXXFLAGS) $(AM_LDFLAGS) $(LDFLAGS) -o $@ -osmtest$(EXEEXT): $(osmtest_OBJECTS) $(osmtest_DEPENDENCIES) - @rm -f osmtest$(EXEEXT) - $(USEGPPLINK) $(osmtest_OBJECTS) $(osmtest_LDADD) $(LIBS) -endif - EXTRA_DIST = $(srcdir)/include/osmt_inform.h \ $(srcdir)/include/osmtest_subnet.h \ $(srcdir)/include/osmtest.h \ -- 1.5.5.1.178.g1f811 From chu11 at llnl.gov Mon Aug 18 14:27:23 2008 From: chu11 at llnl.gov (Al Chu) Date: Mon, 18 Aug 2008 14:27:23 -0700 Subject: [ofa-general] Re: [IBSIM] Parse sim cmds by name not first character In-Reply-To: <20080818203946.GN27204@sashak.voltaire.com> References: <1218661626.16508.579.camel@cardanus.llnl.gov> <20080818203946.GN27204@sashak.voltaire.com> Message-ID: <1219094843.29252.104.camel@cardanus.llnl.gov> Hey Sasha, On Mon, 2008-08-18 at 23:39 +0300, Sasha Khapyorsky wrote: > Hi Al, > > On 14:07 Wed 13 Aug , Al Chu wrote: > > > > I was looking into adding a new command to ibsim, > > Cool :) > > > but since the original > > cmd-parsing function only checks for the first char of the inputted > > command, it limits the ability to add a reasonable-sounding new command > > name. The patch changes the function to check the entire command name. > > This would break a lot of existing test scripts, so basically I disagree > with such radical change. 
However we could do something less destructive > and achieve your goal - lets compare string partially as provided: > > strncasecmp(line, "Dump", strlen(line)) > > (or 'strncasecmp(line, "Dump", cmd_len)' if you don't want to put '\0' > into line buffer) I don't quite understand. How would my patch break existing scripts? Do the existing test scripts not have whitespace between the command and the options? If that's the case, then we could do as you suggested and program in: if (!strncasecmp(line, "Dump", 4)) r = dump_net(f, line); else if (!strncasecmp(line, "Route", 5)) r = dump_route(f, line); as you suggested. Or do the existing test scripts only use the first character of the command? If that's the case, then I guess we could program in single character commands to be legacy-special cases. But we would require whitespace between the command and options for that to work. > Looks fine? > > Also some comment is below. > > [snip...] > > > From 5871b81d1ebdf86f9a9fcf79c8d8a558fd2600b1 Mon Sep 17 00:00:00 2001 > > From: Albert Chu > > Date: Wed, 13 Aug 2008 13:53:14 -0700 > > Subject: [PATCH] parse sim cmds via full name > > > > > > Signed-off-by: Albert Chu > > --- > > ibsim/sim_cmd.c | 105 +++++++++++++++++++++---------------------------------- > > 1 files changed, 40 insertions(+), 65 deletions(-) > > > > diff --git a/ibsim/sim_cmd.c b/ibsim/sim_cmd.c > > index 1f6ba88..a35d0f4 100644 > > --- a/ibsim/sim_cmd.c > > +++ b/ibsim/sim_cmd.c > > @@ -757,91 +757,66 @@ int netstarted = 0; > > > > int do_cmd(char *buf, FILE *f) > > { > > + char cmdbuf[4096]; > > Why this huge buffer? I just copied the bufsize from the caller of do_cmd(), so I wanted to be consistent for a potentially uber-long command string. Naturally, we don't need to do it this way. > Actually we don't need to copy there just find > appropriate length for comparison, no? I did it primarily for the output of the bad command output message at the end. 
I guess we could stick a '\0' into the original linebuf. But I didn't want to edit the buffer and change whatever parsing expectations the other functions had. Al > Sasha -- Albert Chu chu11 at llnl.gov 925-422-5311 Computer Scientist High Performance Systems Division Lawrence Livermore National Laboratory From chu11 at llnl.gov Mon Aug 18 14:27:23 2008 From: chu11 at llnl.gov (Al Chu) Date: Mon, 18 Aug 2008 14:27:23 -0700 Subject: [ofa-general] Re: [IBSIM] Parse sim cmds by name not first character In-Reply-To: <20080818203946.GN27204@sashak.voltaire.com> References: <1218661626.16508.579.camel@cardanus.llnl.gov> <20080818203946.GN27204@sashak.voltaire.com> Message-ID: <1219094843.29252.104.camel@cardanus.llnl.gov> Hey Sasha, On Mon, 2008-08-18 at 23:39 +0300, Sasha Khapyorsky wrote: > Hi Al, > > On 14:07 Wed 13 Aug , Al Chu wrote: > > > > I was looking into adding a new command to ibsim, > > Cool :) > > > but since the original > > cmd-parsing function only checks for the first char of the inputted > > command, it limits the ability to add a reasonable-sounding new command > > name. The patch changes the function to check the entire command name. > > This would break a lot of existing test scripts, so basically I disagree > with such radical change. However we could do something less destructive > and achieve your goal - lets compare string partially as provided: > > strncasecmp(line, "Dump", strlen(line)) > > (or 'strncasecmp(line, "Dump", cmd_len)' if you don't want to put '\0' > into line buffer) I don't quite understand. How would my patch break existing scripts? Do the existing test scripts not have whitespace between the command and the options? If that's the case, then we could do as you suggested and program in: if (!strncasecmp(line, "Dump", 4)) r = dump_net(f, line); else if (!strncasecmp(line, "Route", 5)) r = dump_route(f, line); as you suggested. Or do the existing test scripts only use the first character of the command? 
If that's the case, then I guess we could program in single character commands to be legacy-special cases. But we would require whitespace between the command and options for that to work. > Looks fine? > > Also some comment is below. > > [snip...] > > > From 5871b81d1ebdf86f9a9fcf79c8d8a558fd2600b1 Mon Sep 17 00:00:00 2001 > > From: Albert Chu > > Date: Wed, 13 Aug 2008 13:53:14 -0700 > > Subject: [PATCH] parse sim cmds via full name > > > > > > Signed-off-by: Albert Chu > > --- > > ibsim/sim_cmd.c | 105 +++++++++++++++++++++---------------------------------- > > 1 files changed, 40 insertions(+), 65 deletions(-) > > > > diff --git a/ibsim/sim_cmd.c b/ibsim/sim_cmd.c > > index 1f6ba88..a35d0f4 100644 > > --- a/ibsim/sim_cmd.c > > +++ b/ibsim/sim_cmd.c > > @@ -757,91 +757,66 @@ int netstarted = 0; > > > > int do_cmd(char *buf, FILE *f) > > { > > + char cmdbuf[4096]; > > Why this huge buffer? I just copied the bufsize from the caller of do_cmd(), so I wanted to be consistent for a potentially uber-long command string. Naturally, we don't need to do it this way. > Actually we don't need to copy there just find > appropriate length for comparison, no? I did it primarily for the output of the bad command output message at the end. I guess we could stick a '\0' into the original linebuf. But I didn't want to edit the buffer and change whatever parsing expectations the other functions had. Al > Sasha -- Albert Chu chu11 at llnl.gov 925-422-5311 Computer Scientist High Performance Systems Division Lawrence Livermore National Laboratory From weiny2 at llnl.gov Mon Aug 18 14:43:06 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Mon, 18 Aug 2008 14:43:06 -0700 Subject: [ofa-general] [PATCH v4] Add a Node Description check on light sweep. 
In-Reply-To: <20080817205411.GU2339@sashak.voltaire.com> References: <20080807145326.1d91604c.weiny2@llnl.gov> <20080808090815.0a9b923d.weiny2@llnl.gov> <20080817205411.GU2339@sashak.voltaire.com> Message-ID: <20080818144306.437b8dba.weiny2@llnl.gov> Revised with Sasha's comments. Ira From 098718557b18b6e47cceec15a65b1706722999aa Mon Sep 17 00:00:00 2001 From: Ira K. Weiny Date: Wed, 30 Jul 2008 17:28:30 -0700 Subject: [PATCH] Add a Node Description check on light sweep. A Node Description check on light sweep will ensure that the ND has been found for each node. This case covers the condition where a ND message is dropped/lost for some reason and OpenSM is left with a valid configured node which is not named correctly. This is not the same as a node which has changed its Node Description. In this case the node needs to send a trap. Signed-off-by: Ira Weiny --- opensm/include/opensm/osm_base.h | 11 ++++++++ opensm/opensm/osm_node.c | 2 +- opensm/opensm/osm_state_mgr.c | 49 ++++++++++++++++++++++++++++++++++++++ 3 files changed, 61 insertions(+), 1 deletions(-) diff --git a/opensm/include/opensm/osm_base.h b/opensm/include/opensm/osm_base.h index 3793804..2e8def7 100644 --- a/opensm/include/opensm/osm_base.h +++ b/opensm/include/opensm/osm_base.h @@ -640,6 +640,17 @@ BEGIN_C_DECLS */ #define OSM_NO_PATH 0xFF /**********/ +/****d* OpenSM: Base/OSM_NODE_DESC_UNKNOWN +* NAME +* OSM_NODE_DESC_UNKNOWN +* +* DESCRIPTION +* Value indicating the Node Description is not set and is "unknown" +* +* SYNOPSIS +*/ +#define OSM_NODE_DESC_UNKNOWN "" +/**********/ /****d* OpenSM: Base/osm_thread_state_t * NAME * osm_thread_state_t diff --git a/opensm/opensm/osm_node.c b/opensm/opensm/osm_node.c index d99c656..123feb8 100644 --- a/opensm/opensm/osm_node.c +++ b/opensm/opensm/osm_node.c @@ -136,7 +136,7 @@ osm_node_t *osm_node_new(IN const osm_madw_t * const p_madw) osm_node_init_physp(p_node, p_madw); if (p_ni->node_type == IB_NODE_TYPE_SWITCH) node_init_physp0(p_node, p_madw); -
p_node->print_desc = strdup(""); + p_node->print_desc = strdup(OSM_NODE_DESC_UNKNOWN); return (p_node); } diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c index 3cdb2cf..86a728c 100644 --- a/opensm/opensm/osm_state_mgr.c +++ b/opensm/opensm/osm_state_mgr.c @@ -516,6 +516,51 @@ static void query_sm_info(cl_map_item_t *item, void *cxt) } /********************************************************************** + During a light sweep check each node to see if the node descriptor is valid + if not issue a ND query. +**********************************************************************/ +static void __osm_state_mgr_get_node_desc(IN cl_map_item_t * const p_object, + IN void *context) +{ + osm_physp_t *p_physp = NULL; + osm_node_t *const p_node = (osm_node_t *) p_object; + ib_api_status_t status = IB_SUCCESS; + osm_madw_context_t mad_context; + osm_sm_t *sm = (osm_sm_t *)context; + + OSM_LOG_ENTER(sm->p_log); + + CL_ASSERT(p_node); + + if (p_node->print_desc && strcmp(p_node->print_desc, OSM_NODE_DESC_UNKNOWN)) + /* if ND is valid, do nothing */ + goto exit; + + OSM_LOG(sm->p_log, OSM_LOG_ERROR, + "ERR 3319: Unknown node description for node GUID " + "0x%016" PRIx64 ". Reissuing ND query\n", + cl_ntoh64(osm_node_get_node_guid (p_node))); + + /* get a physp to request from. */ + p_physp = osm_node_get_any_physp_ptr(p_node); + + mad_context.nd_context.node_guid = osm_node_get_node_guid(p_node); + + status = osm_req_get(sm, + osm_physp_get_dr_path_ptr(p_physp), + IB_MAD_ATTR_NODE_DESC, + 0, CL_DISP_MSGID_NONE, &mad_context); + if (status != IB_SUCCESS) + OSM_LOG(sm->p_log, OSM_LOG_ERROR, + "ERR 331B: Failure initiating NodeDescription request " + "(%s)\n", ib_get_err_str(status)); + +exit: + OSM_LOG_EXIT(sm->p_log); +} + + +/********************************************************************** Initiates a lightweight sweep of the subnet. Used during normal sweeps after the subnet is up. 
**********************************************************************/ @@ -550,6 +595,10 @@ static ib_api_status_t __osm_state_mgr_light_sweep_start(IN osm_sm_t * sm) cl_qmap_apply_func(p_sw_tbl, __osm_state_mgr_get_sw_info, sm); CL_PLOCK_RELEASE(sm->p_lock); + CL_PLOCK_ACQUIRE(sm->p_lock); + cl_qmap_apply_func(&sm->p_subn->node_guid_tbl, __osm_state_mgr_get_node_desc, sm); + CL_PLOCK_RELEASE(sm->p_lock); + /* now scan the list of physical ports that were not down but have no remote port */ CL_PLOCK_ACQUIRE(sm->p_lock); p_next = cl_qmap_head(&sm->p_subn->node_guid_tbl); -- 1.5.4.5 From gopalakk at cse.ohio-state.edu Mon Aug 18 14:48:19 2008 From: gopalakk at cse.ohio-state.edu (Karthik Gopalakrishnan) Date: Mon, 18 Aug 2008 17:48:19 -0400 Subject: [ofa-general] ***SPAM*** Error returned by ibv_poll_cq() In-Reply-To: <48A9EB60.9060507@gmail.com> References: <92eddfb50808180848w2964eb54x143572119bec805c@mail.gmail.com> <2f3bf9a60808181122w351835cdxef71792f75d6d354@mail.gmail.com> <92eddfb50808181208t1a7ccec9h81569ffebaab9a0a@mail.gmail.com> <48A9EB60.9060507@gmail.com> Message-ID: <92eddfb50808181448j334bbc86pb186cf7d6593f94e@mail.gmail.com> Will do. Thanks for the tip. Regards, Karthik On Mon, Aug 18, 2008 at 5:36 PM, Dotan Barak wrote: > Karthik Gopalakrishnan wrote: >> >> Yes. There was an unsuccessful RDMA operation to this side. >> > > You should try to check that the r_key + address + size of the send request > that you posted > is matching a valid MR with the right permissions and size ... > > Dotan >> >> On Mon, Aug 18, 2008 at 2:22 PM, Dotan Barak wrote: >> >>> >>> Hi. >>> >>> Did you try to perform (unsuccessful) RDMA operation to this side? >>> >>> Dotan >>> >>> On Mon, Aug 18, 2008 at 6:48 PM, Karthik Gopalakrishnan >>> wrote: >>> >>>> >>>> Hello. >>>> >>>> ibv_poll_cq() returns IBV_WC_REM_ACCESS_ERR or IBV_WC_WR_FLUSH_ERR in >>>> wc.status and IBV_WC_SEND in wc.opcode. The async event handler >>>> simultaneously reports IBV_EVENT_QP_ACCESS_ERR. 
I could not find a >>>> description of these errors either in the man pages or in verbs.h. I >>>> would be grateful if someone can tell me under what circumstances >>>> ibv_poll_cq() would return those errors. >>>> >>>> I am using the following adapter. >>>> =========================== >>>> CA 'mthca0' >>>> CA type: MT25208 >>>> Number of ports: 2 >>>> Firmware version: 5.1.400 >>>> Hardware version: a0 >>>> Node GUID: 0x0002c9020023c078 >>>> System image GUID: 0x0002c9020023c07b >>>> Port 1: >>>> State: Active >>>> Physical state: LinkUp >>>> Rate: 20 >>>> Base lid: 67 >>>> LMC: 0 >>>> SM lid: 156 >>>> Capability mask: 0x02510a68 >>>> Port GUID: 0x0002c9020023c079 >>>> Port 2: >>>> State: Down >>>> Physical state: Polling >>>> Rate: 10 >>>> Base lid: 0 >>>> LMC: 0 >>>> SM lid: 0 >>>> Capability mask: 0x02510a68 >>>> Port GUID: 0x0002c9020023c07a >>>> =========================== >>>> >>>> Thanks & Regards, >>>> Karthik >>>> _______________________________________________ >>>> general mailing list >>>> general at lists.openfabrics.org >>>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general >>>> >>>> To unsubscribe, please visit >>>> http://openib.org/mailman/listinfo/openib-general >>>> >>>> > > From sashak at voltaire.com Mon Aug 18 15:11:33 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 19 Aug 2008 01:11:33 +0300 Subject: [ofa-general] [PATCH v4] Add a Node Description check on light sweep. In-Reply-To: <20080818144306.437b8dba.weiny2@llnl.gov> References: <20080807145326.1d91604c.weiny2@llnl.gov> <20080808090815.0a9b923d.weiny2@llnl.gov> <20080817205411.GU2339@sashak.voltaire.com> <20080818144306.437b8dba.weiny2@llnl.gov> Message-ID: <20080818221133.GS27204@sashak.voltaire.com> On 14:43 Mon 18 Aug , Ira Weiny wrote: > Revised with Sasha's comments. > > Ira > > > From 098718557b18b6e47cceec15a65b1706722999aa Mon Sep 17 00:00:00 2001 > From: Ira K. 
Weiny > Date: Wed, 30 Jul 2008 17:28:30 -0700 > Subject: [PATCH] Add a Node Description check on light sweep. > > A Node Description check on light sweep will ensure that the ND has been found > for each node. This case covers the condition where a ND message is > dropped/lost for some reason and OpenSM is left with a valid configured node > which is not named correctly. > > This is not the same as a node which has changed its Node Description. In > this case the node needs to send a trap. > > Signed-off-by: Ira Weiny Applied. Thanks. Sasha From sashak at voltaire.com Mon Aug 18 15:22:06 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 19 Aug 2008 01:22:06 +0300 Subject: [ofa-general] Re: [IBSIM] Parse sim cmds by name not first character In-Reply-To: <1219094843.29252.104.camel@cardanus.llnl.gov> References: <1218661626.16508.579.camel@cardanus.llnl.gov> <20080818203946.GN27204@sashak.voltaire.com> <1219094843.29252.104.camel@cardanus.llnl.gov> Message-ID: <20080818222206.GT27204@sashak.voltaire.com> On 14:27 Mon 18 Aug , Al Chu wrote: > > Or do the existing test scripts only use the first character of the > command? Yes. > If that's the case, then I guess we could program in single > character commands to be legacy-special cases. Why should it be a special case? What is wrong with using partial command names when resolving is simple? > But we would require > whitespace between the command and options for that to work. This is fine. What I meant is the following: unsigned cmd_len = 0; while (isalpha(line[cmd_len])) cmd_len++; if (!strncasecmp(line, "Dump", cmd_len)) ... In this case strings "D", "du", etc. will be resolved as the "Dump" command. 
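The prefix resolution described above can be sketched as a small standalone function. This is only an illustration under stated assumptions, not the ibsim code: the table `cmd_table` and the function `resolve_cmd()` are hypothetical names, and apart from "Dump" the command names are placeholders. It measures the leading alphabetic word, then compares only that many characters against each known command.

```c
#include <ctype.h>
#include <stddef.h>
#include <strings.h>	/* strncasecmp (POSIX) */

/* Hypothetical command table, for illustration only. Order matters:
 * an ambiguous prefix resolves to the first entry it abbreviates. */
static const char *const cmd_table[] = {
	"Dump", "Help", "Start", "DoSomething"
};
#define CMD_COUNT (sizeof(cmd_table) / sizeof(cmd_table[0]))

/* Return the index of the first command that the leading alphabetic
 * word of `line` abbreviates (case-insensitively), or -1 if the line
 * has no leading word or no command matches. */
static int resolve_cmd(const char *line)
{
	size_t cmd_len = 0;
	size_t i;

	/* isalpha() needs a value representable as unsigned char */
	while (isalpha((unsigned char)line[cmd_len]))
		cmd_len++;
	if (cmd_len == 0)
		return -1;

	for (i = 0; i < CMD_COUNT; i++)
		if (!strncasecmp(line, cmd_table[i], cmd_len))
			return (int)i;
	return -1;
}
```

With this table, `resolve_cmd("D")` and `resolve_cmd("du 1")` both select "Dump" because it is listed first, while `resolve_cmd("do")` selects "DoSomething"; a word longer than the command name, such as "Dumpster", matches nothing.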
Sasha From weiny2 at llnl.gov Mon Aug 18 15:46:51 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Mon, 18 Aug 2008 15:46:51 -0700 Subject: [ofa-general] [PATCH] Fix some missing node name map substitutions Message-ID: <20080818154651.302d6053.weiny2@llnl.gov> >From 1b7ed57320796720a0e8c25f04a2544ac9374aa5 Mon Sep 17 00:00:00 2001 From: Ira Weiny Date: Mon, 18 Aug 2008 15:42:01 -0700 Subject: [PATCH] Fix some missing node name map substitutions These are 2 cases where the node name map substitution was missed. Signed-off-by: Ira Weiny --- infiniband-diags/src/ibnetdiscover.c | 21 +++++++++++++++++---- 1 files changed, 17 insertions(+), 4 deletions(-) diff --git a/infiniband-diags/src/ibnetdiscover.c b/infiniband-diags/src/ibnetdiscover.c index 20da1ea..803c300 100644 --- a/infiniband-diags/src/ibnetdiscover.c +++ b/infiniband-diags/src/ibnetdiscover.c @@ -510,7 +510,7 @@ out_ids(Node *node, int group, char *chname) && node->chrecord && node->chrecord->chassisnum) { fprintf(f, "\t\t# Chassis %d", node->chrecord->chassisnum); if (chname) - fprintf(f, " (%s)", clean_nodedesc(chname)); + fprintf(f, " (%s)", chname); if (is_xsigo_tca(node->nodeguid) && node->ports->remoteport) fprintf(f, " slot %d", node->ports->remoteport->portnum); } @@ -569,6 +569,8 @@ out_ca(Node *node, int group, char *chname) { char *node_type; char *node_type2; + char *nodename = remap_node_name(node_name_map, node->nodeguid, + node->nodedesc); out_ids(node, group, chname); switch(node->type) { @@ -589,10 +591,12 @@ out_ca(Node *node, int group, char *chname) fprintf(f, "%sguid=0x%" PRIx64 "\n", node_type, node->nodeguid); fprintf(f, "%s\t%d %s\t\t# \"%s\"", node_type2, node->numports, node_name(node), - clean_nodedesc(node->nodedesc)); + nodename); if (group && is_xsigo_hca(node->nodeguid)) fprintf(f, " (scp)"); fprintf(f, "\n"); + + free(nodename); } static char * @@ -705,6 +709,8 @@ dump_topology(int listtype, int group) if (!ch->chassisnum) continue; chguid = out_chassis(ch->chassisnum); + if 
(chname) + free(chname); chname = NULL; if (is_xsigo_guid(chguid)) { for (node = nodesdist[MAXHOPS]; node; node = node->dnext) { @@ -716,8 +722,10 @@ dump_topology(int listtype, int group) continue; if (is_xsigo_hca(node->nodeguid)) { - chname = node->nodedesc; - fprintf(f, "Hostname: %s\n", clean_nodedesc(node->nodedesc)); + chname = remap_node_name(node_name_map, + node->nodeguid, + node->nodedesc); + fprintf(f, "Hostname: %s\n", chname); } } } @@ -804,6 +812,8 @@ dump_topology(int listtype, int group) } } + if (chname) + free(chname); chname = NULL; if (group && !listtype) { @@ -851,6 +861,9 @@ dump_topology(int listtype, int group) out_ca_port(port, group); } + if (chname) + free(chname); + return i; } -- 1.5.4.5 From chu11 at llnl.gov Mon Aug 18 15:50:03 2008 From: chu11 at llnl.gov (Al Chu) Date: Mon, 18 Aug 2008 15:50:03 -0700 Subject: [ofa-general] Re: [IBSIM] Parse sim cmds by name not first character In-Reply-To: <20080818222206.GT27204@sashak.voltaire.com> References: <1218661626.16508.579.camel@cardanus.llnl.gov> <20080818203946.GN27204@sashak.voltaire.com> <1219094843.29252.104.camel@cardanus.llnl.gov> <20080818222206.GT27204@sashak.voltaire.com> Message-ID: <1219099803.29252.118.camel@cardanus.llnl.gov> Hey Sasha, On Tue, 2008-08-19 at 01:22 +0300, Sasha Khapyorsky wrote: > On 14:27 Mon 18 Aug , Al Chu wrote: > > > > Or do the existing test scripts only use the first character of the > > command? > > Yes. > > > If that's the case, then I guess we could program in single > > character commands to be legacy-special cases. > > Why it should be special case? What is wrong with using partial command > names when resolving is simple? The reason is that the ordering of the if statements now matters. If I add a new command called "DoSomething", it must come after the "Dump" comparison, otherwise the command "D" could take the "DoSomething" branch. We can add a comments or something to document this. It's obviously just a style difference. 
> > > > But we would require > > whitespace between the command and options for that to work. > > This is fine. > > What I meant is follow: > > unsigned cmd_len = 0; > > while (isalpha(line[cmd_len])) > cmd_len++; > > if (!strncasecmp(line, "Dump", cmdlen)) > ... > > In this case strings "D", "du", etc. will be resolved as "Dump" command. Ok. I see what you were thinking now. Al > Sasha -- Albert Chu chu11 at llnl.gov 925-422-5311 Computer Scientist High Performance Systems Division Lawrence Livermore National Laboratory From sashak at voltaire.com Mon Aug 18 16:01:13 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 19 Aug 2008 02:01:13 +0300 Subject: [ofa-general] Re: [IBSIM] Parse sim cmds by name not first character In-Reply-To: <1219099803.29252.118.camel@cardanus.llnl.gov> References: <1218661626.16508.579.camel@cardanus.llnl.gov> <20080818203946.GN27204@sashak.voltaire.com> <1219094843.29252.104.camel@cardanus.llnl.gov> <20080818222206.GT27204@sashak.voltaire.com> <1219099803.29252.118.camel@cardanus.llnl.gov> Message-ID: <20080818230113.GU27204@sashak.voltaire.com> On 15:50 Mon 18 Aug , Al Chu wrote: > > The reason is that the ordering of the if statements now matters. If I > add a new command called "DoSomething", it must come after the "Dump" > comparison, otherwise the command "D" could take the "DoSomething" > branch. Sure, order will matter - new commands will be at end. Sasha > We can add a comments or something to document this. It's > obviously just a style difference. > > > > > > > But we would require > > > whitespace between the command and options for that to work. > > > > This is fine. > > > > What I meant is follow: > > > > unsigned cmd_len = 0; > > > > while (isalpha(line[cmd_len])) > > cmd_len++; > > > > if (!strncasecmp(line, "Dump", cmdlen)) > > ... > > > > In this case strings "D", "du", etc. will be resolved as "Dump" command. > > Ok. I see what you were thinking now. 
> > Al > > > Sasha > -- > Albert Chu > chu11 at llnl.gov > 925-422-5311 > Computer Scientist > High Performance Systems Division > Lawrence Livermore National Laboratory > From sashak at voltaire.com Mon Aug 18 16:16:28 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 19 Aug 2008 02:16:28 +0300 Subject: [ofa-general] Re: [PATCH] Fix some missing node name map substitutions In-Reply-To: <20080818154651.302d6053.weiny2@llnl.gov> References: <20080818154651.302d6053.weiny2@llnl.gov> Message-ID: <20080818231628.GV27204@sashak.voltaire.com> On 15:46 Mon 18 Aug , Ira Weiny wrote: > From 1b7ed57320796720a0e8c25f04a2544ac9374aa5 Mon Sep 17 00:00:00 2001 > From: Ira Weiny > Date: Mon, 18 Aug 2008 15:42:01 -0700 > Subject: [PATCH] Fix some missing node name map substitutions > > These are 2 cases where the node name map substitution was missed. > > Signed-off-by: Ira Weiny Applied. Thanks. Sasha From chu11 at llnl.gov Mon Aug 18 17:39:50 2008 From: chu11 at llnl.gov (Al Chu) Date: Mon, 18 Aug 2008 17:39:50 -0700 Subject: [ofa-general] Re: [IBSIM] Parse sim cmds by name not first character In-Reply-To: <20080818230113.GU27204@sashak.voltaire.com> References: <1218661626.16508.579.camel@cardanus.llnl.gov> <20080818203946.GN27204@sashak.voltaire.com> <1219094843.29252.104.camel@cardanus.llnl.gov> <20080818222206.GT27204@sashak.voltaire.com> <1219099803.29252.118.camel@cardanus.llnl.gov> <20080818230113.GU27204@sashak.voltaire.com> Message-ID: <1219106390.29252.125.camel@cardanus.llnl.gov> Hey Sasha, New patch is attached. Al On Tue, 2008-08-19 at 02:01 +0300, Sasha Khapyorsky wrote: > On 15:50 Mon 18 Aug , Al Chu wrote: > > > > The reason is that the ordering of the if statements now matters. If I > > add a new command called "DoSomething", it must come after the "Dump" > > comparison, otherwise the command "D" could take the "DoSomething" > > branch. > > Sure, order will matter - new commands will be at end. 
> > Sasha > > > We can add a comments or something to document this. It's > > obviously just a style difference. > > > > > > > > > > But we would require > > > > whitespace between the command and options for that to work. > > > > > > This is fine. > > > > > > What I meant is follow: > > > > > > unsigned cmd_len = 0; > > > > > > while (isalpha(line[cmd_len])) > > > cmd_len++; > > > > > > if (!strncasecmp(line, "Dump", cmdlen)) > > > ... > > > > > > In this case strings "D", "du", etc. will be resolved as "Dump" command. > > > > Ok. I see what you were thinking now. > > > > Al > > > > > Sasha > > -- > > Albert Chu > > chu11 at llnl.gov > > 925-422-5311 > > Computer Scientist > > High Performance Systems Division > > Lawrence Livermore National Laboratory > > -- Albert Chu chu11 at llnl.gov 925-422-5311 Computer Scientist High Performance Systems Division Lawrence Livermore National Laboratory -------------- next part -------------- A non-text attachment was scrubbed... Name: 0003-parse-sim-cmds-via-full-name.patch Type: text/x-patch Size: 3543 bytes Desc: not available URL: From devesh28 at gmail.com Mon Aug 18 21:44:43 2008 From: devesh28 at gmail.com (Devesh Sharma) Date: Tue, 19 Aug 2008 10:14:43 +0530 Subject: [ofa-general] ***SPAM*** SDP : How to find out current mem-usage of SDP socket Message-ID: <309a667c0808182144v5906d9f3o492b529cd1e9129b@mail.gmail.com> Hello all, Anybody please tell me how to find out current memory usage of SDP socket? -Devesh -------------- next part -------------- An HTML attachment was scrubbed... URL: From ogerlitz at voltaire.com Mon Aug 18 22:45:04 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Tue, 19 Aug 2008 08:45:04 +0300 Subject: [ofa-general] Re: [ewg] STOP the onslaught of EWG spam - disallowing non member posts In-Reply-To: References: Message-ID: <48AA5DE0.9010808@voltaire.com> Jeff Squyres wrote: > The EWG list has gotten spam bombed over the last few hours. I lost > count at 500+ spams in my inbox. 
I therefore logged into > openfabrics.org and changed the site-wide password for Mailman (I have > notified Jeff Becker of the new password). I then changed the EWG > list to silently discard all non-member posts. Since I didn't know if > other OF lists were being spam-bombed, I did the same for all OF lists > as well. > Hi Jeff, Disallowing non members posts to the general list is problematic, since as of the below MAINTAINERS entry this is where people from the kernel community post issues they have with the RDMA stack, and you don't expect everyone to subscribe the list... white-listing member posts sounds fine, but this is a bit too much. As Sasha said and as was mentioned few times in the past, it seems like the old @openib.org aliases cause us lots of troubles, maybe we should just remove them. Or. > INFINIBAND SUBSYSTEM > P: Roland Dreier > M: rolandd at cisco.com > P: Sean Hefty > M: sean.hefty at intel.com > P: Hal Rosenstock > M: hal.rosenstock at gmail.com > L: general at lists.openfabrics.org > W: http://www.openib.org/ > T: git kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git > S: Supported > > From amirv.mellanox at gmail.com Tue Aug 19 00:14:06 2008 From: amirv.mellanox at gmail.com (Amir Vadai) Date: Tue, 19 Aug 2008 10:14:06 +0300 Subject: ***SPAM*** RE: [ofa-general] SDP : How to find out current mem-usage ofSDP socket In-Reply-To: <309a667c0808182144v5906d9f3o492b529cd1e9129b@mail.gmail.com> References: <309a667c0808182144v5906d9f3o492b529cd1e9129b@mail.gmail.com> Message-ID: <5E96F603830D43D0A832E57B7AAE1C8D@mtl.com> Devesh Hi, You could use some strategies: 1. check the size column of ib_sdp module when issueing lsmod 2. check in /proc/slabinfo values of slab named 'SDP' 3. 
check values in /proc/meminfo before and after loading and using ib_sdp module - Amir _____ From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Devesh Sharma Sent: Tuesday, August 19, 2008 7:45 AM To: general at lists.openfabrics.org Subject: [ofa-general] ***SPAM*** SDP : How to find out current mem-usage ofSDP socket Hello all, Anybody please tell me how to find out current memory usage of SDP socket? -Devesh -------------- next part -------------- An HTML attachment was scrubbed... URL: From kliteyn at dev.mellanox.co.il Tue Aug 19 02:23:10 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Tue, 19 Aug 2008 12:23:10 +0300 Subject: [ofa-general] opensm/osm_port_t struct definition Message-ID: <48AA90FE.3080109@dev.mellanox.co.il> Hi Sasha, I have a general question/concern about osm_port_t: typedef struct osm_port { cl_map_item_t map_item; cl_list_item_t list_item; ... } osm_port_t; Here and there in the code I see some comments that map_item and list_item should be first members of the struct, which, I guess, means that same object can't be member of both map and list. Do we have a problem here? -- Yevgeny From sashak at voltaire.com Tue Aug 19 02:29:02 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 19 Aug 2008 12:29:02 +0300 Subject: [ofa-general] Re: opensm/osm_port_t struct definition In-Reply-To: <48AA90FE.3080109@dev.mellanox.co.il> References: <48AA90FE.3080109@dev.mellanox.co.il> Message-ID: <20080819092902.GW27204@sashak.voltaire.com> Hi Yevgeny, On 12:23 Tue 19 Aug , Yevgeny Kliteynik wrote: > > I have a general question/concern about osm_port_t: > > typedef struct osm_port { > cl_map_item_t map_item; > cl_list_item_t list_item; > ... > } osm_port_t; > > Here and there in the code I see some comments that > map_item and list_item should be first members of the > struct, I cannot find such comment about list_item. 
It should not be a first member, to access the structure we are using cl_item_obj() macro (cl_qlist.h). > which, I guess, means that same object can't be > member of both map and list. > Do we have a problem here? No, both can be used. I don't see any problem here. Sasha From kliteyn at dev.mellanox.co.il Tue Aug 19 04:30:15 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Tue, 19 Aug 2008 14:30:15 +0300 Subject: [ofa-general] Re: opensm/osm_port_t struct definition In-Reply-To: <20080819092902.GW27204@sashak.voltaire.com> References: <48AA90FE.3080109@dev.mellanox.co.il> <20080819092902.GW27204@sashak.voltaire.com> Message-ID: <48AAAEC7.7090106@dev.mellanox.co.il> Sasha Khapyorsky wrote: > Hi Yevgeny, > > On 12:23 Tue 19 Aug , Yevgeny Kliteynik wrote: >> I have a general question/concern about osm_port_t: >> >> typedef struct osm_port { >> cl_map_item_t map_item; >> cl_list_item_t list_item; >> ... >> } osm_port_t; >> >> Here and there in the code I see some comments that >> map_item and list_item should be first members of the >> struct, > > I cannot find such comment about list_item. Here are some examples (there are more): opensm/include/opensm/osm_prefix_route.h ---------------------------------------- typedef struct { cl_list_item_t list_item; /* must be first */ ... } osm_prefix_route_t; opensm/include/opensm/osm_service.h ----------------------------------- typedef struct osm_svcr { cl_list_item_t list_item; ... } osm_svcr_t; /* * FIELDS * map_item * Map Item for qmap linkage. Must be first element!! ... opensm/include/opensm/osm_mcm_info.h ------------------------------------ typedef struct osm_mcm_info { cl_list_item_t list_item; ... } osm_mcm_info_t; /* * FIELDS * list_item * Linkage structure for cl_qlist. MUST BE FIRST MEMBER! ... opensm/include/opensm/osm_madw.h -------------------------------- typedef struct osm_madw { cl_list_item_t list_item; ... } osm_madw_t; /* * FIELDS * list_item * List linkage for lists. MUST BE FIRST MEMBER! 
... opensm/include/opensm/osm_inform.h ---------------------------------- typedef struct osm_infr { cl_list_item_t list_item; ... } osm_infr_t; /* * FIELDS * list_item * List Item for qlist linkage. Must be first element!! ... > It should not be a first > member, to access the structure we are using cl_item_obj() macro > (cl_qlist.h). I couldn't find any problem with having list_item not only in the beginning of the struct, but I was confused by all these comments in the code. So I guess that only cl_map_item_t has to be first in the struct. -- Yevgeny >> which, I guess, means that same object can't be >> member of both map and list. >> Do we have a problem here? > > No, both can be used. I don't see any problem here. > > Sasha > From ronli.voltaire at gmail.com Tue Aug 19 04:44:26 2008 From: ronli.voltaire at gmail.com (Ron Livne) Date: Tue, 19 Aug 2008 14:44:26 +0300 Subject: [ofa-general] Re: [ewg] [PATCH 1/2 v2]libibvers: add create_qp_expanded In-Reply-To: References: Message-ID: <3b5e77ad0808190444u732afbadnae40c74a73ab45f2@mail.gmail.com> OK, but doesn't it contradict the approach you agreed on? > What do you think of the following approach? > Instead of adding creation flags to the qp_init_attr, I can add a new verb: > ibv_qp *create_qp_extended(struct ibv_pd *pd, struct ibv_qp_init_attr, > *init_attr, enum ibv_qp_create_flags create_flags) > > I'm aware that adding a new verb isn't optimal, but at least we can > avoid incrementing the libibverbs version. I think this new verb seems like a better approach right now. On Tue, Aug 12, 2008 at 10:37 PM, Roland Dreier wrote: > Sorry for jumping in so late in the process, but a few big concerns: > > > struct ibv_qp *ibv_create_qp_expanded(struct ibv_pd *pd, > > struct ibv_qp_init_attr *qp_init_attr, > > uint32_t create_flags); > > I don't like the name "_expanded" when all we are doing is adding a > flags parameter. 
The next time we need to tweak this API, then we end > up with _extra_super_expanded or something like that. > > I see two better options: keep the same prototype but call it something > like ibv_create_qp_with_flags (or maybe ibv_create_qp_flags), or keep > the name ibv_create_qp_expanded but instead of create_flags, have the > new parameter be ext_mask, have one bit in ext_mask indicate create > flags, and add create_flags to struct ibv_qp_init_attr -- then we can > add more extra stuff by using more bits in ext_mask. > > Also, I wonder if it's worth a new verb in the kernel ABI for this. > Maybe we should add a new command in the ABI where libibverbs can pass > in a bitmask of supported extensions, and the kernel can respond with > which extensions it supports. And then we can just continue to use the > reserved field in the existing create_qp command if both kernel and > userspace agree that they support create flags there. > > - R. > _______________________________________________ > ewg mailing list > ewg at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg > From devesh28 at gmail.com Tue Aug 19 04:52:03 2008 From: devesh28 at gmail.com (Devesh Sharma) Date: Tue, 19 Aug 2008 17:22:03 +0530 Subject: ***SPAM*** Re: [ofa-general] SDP : How to find out current mem-usage ofSDP socket In-Reply-To: <5E96F603830D43D0A832E57B7AAE1C8D@mtl.com> References: <309a667c0808182144v5906d9f3o492b529cd1e9129b@mail.gmail.com> <5E96F603830D43D0A832E57B7AAE1C8D@mtl.com> Message-ID: <309a667c0808190452h12d07ab0r279315919347e2fe@mail.gmail.com> Thanks Amir I will try this out. can you suggest some benchmarking tools for SDP. On Tue, Aug 19, 2008 at 12:44 PM, Amir Vadai wrote: > Devesh Hi, > > You could use some strategies: > 1. check the size column of ib_sdp module when issueing lsmod > 2. check in /proc/slabinfo values of slab named 'SDP' > 3. 
check values in /proc/meminfo before and after loading and using ib_sdp > module > > - Amir > > ------------------------------ > *From:* general-bounces at lists.openfabrics.org [mailto: > general-bounces at lists.openfabrics.org] *On Behalf Of *Devesh Sharma > *Sent:* Tuesday, August 19, 2008 7:45 AM > *To:* general at lists.openfabrics.org > *Subject:* [ofa-general] ***SPAM*** SDP : How to find out current > mem-usage ofSDP socket > > Hello all, > > Anybody please tell me how to find out current memory usage of SDP socket? > > -Devesh > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sashak at voltaire.com Tue Aug 19 06:39:01 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 19 Aug 2008 16:39:01 +0300 Subject: [ofa-general] Re: opensm/osm_port_t struct definition In-Reply-To: <48AAAEC7.7090106@dev.mellanox.co.il> References: <48AA90FE.3080109@dev.mellanox.co.il> <20080819092902.GW27204@sashak.voltaire.com> <48AAAEC7.7090106@dev.mellanox.co.il> Message-ID: <20080819133901.GA27535@sashak.voltaire.com> On 14:30 Tue 19 Aug , Yevgeny Kliteynik wrote: > Sasha Khapyorsky wrote: >> Hi Yevgeny, >> On 12:23 Tue 19 Aug , Yevgeny Kliteynik wrote: >>> I have a general question/concern about osm_port_t: >>> >>> typedef struct osm_port { >>> cl_map_item_t map_item; >>> cl_list_item_t list_item; >>> ... >>> } osm_port_t; >>> >>> Here and there in the code I see some comments that >>> map_item and list_item should be first members of the >>> struct, >> I cannot find such comment about list_item. > > Here are some examples (there are more): > > opensm/include/opensm/osm_prefix_route.h > ---------------------------------------- > > typedef struct { > cl_list_item_t list_item; /* must be first */ > ... > } osm_prefix_route_t; [snip...] All examples are not about 'struct osm_port' but about some other structures. How is this related? 
> I couldn't find any problem with having list_item not only in the > beginning of the struct, but I was confused by all these comments > in the code. Maybe I'm starting to understand confusion (maybe)... Example: struct obj { cl_any_item_t item1; .... cl_any_item_t item2; ... }; Now you can access this object: (1) via pointer to item1: obj_ptr = (struct obj *)any_item1_ptr; (2) via pointer to item2: obj_ptr = cl_item_obj(any_item2_ptr, obj_ptr, item2); [ cl_item_obj() is: #define cl_item_obj(item_ptr, obj_ptr, item_field) (typeof(obj_ptr)) \ ((void *)item_ptr - (unsigned long)&((typeof(obj_ptr))0)->item_field) ] Obviously for case (1) 'item1' must be first field in a structure. > So I guess that only cl_map_item_t has to be first in the struct. No, it is not related to item type, see above. Sasha From chu11 at llnl.gov Tue Aug 19 08:51:05 2008 From: chu11 at llnl.gov (Al Chu) Date: Tue, 19 Aug 2008 08:51:05 -0700 Subject: [ofa-general] Re: [IBSIM] Parse sim cmds by name not first character In-Reply-To: <20080818230113.GU27204@sashak.voltaire.com> References: <1218661626.16508.579.camel@cardanus.llnl.gov> <20080818203946.GN27204@sashak.voltaire.com> <1219094843.29252.104.camel@cardanus.llnl.gov> <20080818222206.GT27204@sashak.voltaire.com> <1219099803.29252.118.camel@cardanus.llnl.gov> <20080818230113.GU27204@sashak.voltaire.com> Message-ID: <1219161065.29252.129.camel@cardanus.llnl.gov> Hey Sasha, On Tue, 2008-08-19 at 02:01 +0300, Sasha Khapyorsky wrote: > On 15:50 Mon 18 Aug , Al Chu wrote: > > > > The reason is that the ordering of the if statements now matters. If I > > add a new command called "DoSomething", it must come after the "Dump" > > comparison, otherwise the command "D" could take the "DoSomething" > > branch. > > Sure, order will matter - new commands will be at end. I also realized that this approach (which is in the patch I posted last night), would allow typos like "star" and "hel" to work. Is that ok with you? 
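If abbreviations such as "star" and "hel" are not wanted, one possible reading of the "legacy single-character" idea floated earlier in this thread is sketched below. It is an assumption-laden illustration, not what the posted patch does: `cmd_names` and `resolve_cmd_strict()` are hypothetical names and the table contents are placeholders. A command word is accepted only if it is the full name or a single letter.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>	/* strncasecmp (POSIX) */

/* Hypothetical command names, for illustration only. */
static const char *const cmd_names[] = { "Dump", "Help", "Start" };
#define NUM_CMDS (sizeof(cmd_names) / sizeof(cmd_names[0]))

/* Strict resolution: accept the full command name or its single first
 * letter (the legacy form), case-insensitively. Any other abbreviation
 * is rejected. Returns the table index, or -1 if nothing matches. */
static int resolve_cmd_strict(const char *line)
{
	size_t cmd_len = 0;
	size_t i;

	while (isalpha((unsigned char)line[cmd_len]))
		cmd_len++;
	if (cmd_len == 0)
		return -1;

	for (i = 0; i < NUM_CMDS; i++) {
		size_t name_len = strlen(cmd_names[i]);

		/* neither a legacy single letter nor the full name */
		if (cmd_len != 1 && cmd_len != name_len)
			continue;
		if (!strncasecmp(line, cmd_names[i], cmd_len))
			return (int)i;
	}
	return -1;
}
```

Here "s" and "START 5" still resolve to "Start", but "star" and "hel" are rejected. The cost is that convenient partial names such as "du" stop working, which is the trade-off being discussed.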
Al > > Sasha > > > We can add a comments or something to document this. It's > > obviously just a style difference. > > > > > > > > > > But we would require > > > > whitespace between the command and options for that to work. > > > > > > This is fine. > > > > > > What I meant is follow: > > > > > > unsigned cmd_len = 0; > > > > > > while (isalpha(line[cmd_len])) > > > cmd_len++; > > > > > > if (!strncasecmp(line, "Dump", cmdlen)) > > > ... > > > > > > In this case strings "D", "du", etc. will be resolved as "Dump" command. > > > > Ok. I see what you were thinking now. > > > > Al > > > > > Sasha > > -- > > Albert Chu > > chu11 at llnl.gov > > 925-422-5311 > > Computer Scientist > > High Performance Systems Division > > Lawrence Livermore National Laboratory > > -- Albert Chu chu11 at llnl.gov 925-422-5311 Computer Scientist High Performance Systems Division Lawrence Livermore National Laboratory From jsquyres at cisco.com Tue Aug 19 09:11:06 2008 From: jsquyres at cisco.com (Jeff Squyres) Date: Tue, 19 Aug 2008 09:11:06 -0700 Subject: [ofa-general] Re: [ewg] STOP the onslaught of EWG spam In-Reply-To: <3F6F638B8D880340AB536D29CD4C1E19224EEF76@orsmsx501.amr.corp.intel.com> References: <3F6F638B8D880340AB536D29CD4C1E19224EEF76@orsmsx501.amr.corp.intel.com> Message-ID: <4B943B0A-AAAB-48F7-8DB0-2ACC36947230@cisco.com> I didn't change any of the list passwords, just the "master" mailman password. The master password should not be shared outside of the sysadmins (Jeff Becker). I am not the normal sysadmin; Jeff B. is. I only took action to save our inboxes over the weekend. I did send Jeff B. the new master password. If Jeff B. is indisposed, I'd be happy to reset any individual list passwords that you might need. On Aug 19, 2008, at 9:07 AM, Ryan, Jim wrote: > I sent the note below to the wrong Jeff, sorry. We still aren't > closed on this. I don't have access to provide approval I typically > need to do. Is anyone working this? 
> > Jim > > -----Original Message----- > From: ewg-bounces at lists.openfabrics.org [mailto:ewg-bounces at lists.openfabrics.org > ] On Behalf Of Ryan, Jim > Sent: Monday, August 18, 2008 4:03 PM > To: Bill Boas; 'Jeff Squyres'; 'OpenFabrics EWG'; 'OpenFabrics > General' > Subject: RE: [ewg] STOP the onslaught of EWG spam > > I got 'em too. Jeff, when you can, plz send the new pswd to me. I > can't do my admin duties at the moment > > Tx, j > > -----Original Message----- > From: ewg-bounces at lists.openfabrics.org [mailto:ewg-bounces at lists.openfabrics.org > ] On Behalf Of Bill Boas > Sent: Sunday, August 17, 2008 4:19 PM > To: 'Jeff Squyres'; 'OpenFabrics EWG'; 'OpenFabrics General' > Subject: RE: [ewg] STOP the onslaught of EWG spam > > Thank you very much Jeff, and thank you for taking this action on > the other > mail lists which were probably just as vulnerable - I has over 3400 > "spam" > messages - did others have as many, or was I targeted? :-)!!! > Bill Boas > VP, Business Development > System Fabric Works > 510-375-8840 > bboas at systemfabricworks.com > www.systemfabricworks.com > > > -----Original Message----- > From: ewg-bounces at lists.openfabrics.org > [mailto:ewg-bounces at lists.openfabrics.org] On Behalf Of Jeff Squyres > Sent: Sunday, August 17, 2008 5:21 AM > To: OpenFabrics EWG; OpenFabrics General > Subject: [ewg] STOP the onslaught of EWG spam > > The EWG list has gotten spam bombed over the last few hours. I lost > count at 500+ spams in my inbox. > > I therefore logged into openfabrics.org and changed the site-wide > password for Mailman (I have notified Jeff Becker of the new > password). I then changed the EWG list to silently discard all non- > member posts. Since I didn't know if other OF lists were being spam- > bombed, I did the same for all OF lists as well. > > The spam onslaught has now stopped. 
>
> I also notice that our mailman installation is hopelessly out of
> date; it's v2.1.5 and the current version (including several important
> security fixes since v2.1.5) is v2.1.11. Someone needs to fix this
> ASAP.
>
> --
> Jeff Squyres
> Cisco Systems
>
> _______________________________________________
> ewg mailing list
> ewg at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg

--
Jeff Squyres
Cisco Systems

From rdreier at cisco.com Tue Aug 19 09:21:33 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Tue, 19 Aug 2008 09:21:33 -0700
Subject: [ofa-general] Re: [ewg] STOP the onslaught of EWG spam - disallowing non member posts
In-Reply-To: <48AA5DE0.9010808@voltaire.com> (Or Gerlitz's message of "Tue, 19 Aug 2008 08:45:04 +0300")
References: <48AA5DE0.9010808@voltaire.com>
Message-ID:

> Disallowing non members posts to the general list is problematic,
> since as of the below MAINTAINERS entry this is where people from the
> kernel community post issues they have with the RDMA stack, and you
> don't expect everyone to subscribe the list...

I agree... closed mailing lists are a pain for lists like general@ where we want to encourage people, even non-subscribers, to drop by and report bugs, and forcing them to subscribe raises an unnecessary barrier.

It's probably acceptable to leave ewg closed, since the discussion on that list generally involves only subscribers.
In any case it would be interesting to understand exactly why ewg started getting such a huge flood of traffic anyway. It seemed that the majority of mails were not actually spam but rather backscatter bounce messages caused by spam with a forged from address of ewg at openib.org. It seems disabling the ewg at openib.org address would fix things for now, and if there were some way to stop mailman from forwarding bounce messages, that would deal with the backscatter issue more permanently. - R. From jsquyres at cisco.com Tue Aug 19 09:28:18 2008 From: jsquyres at cisco.com (Jeff Squyres) Date: Tue, 19 Aug 2008 09:28:18 -0700 Subject: [ofa-general] Re: [ewg] STOP the onslaught of EWG spam - disallowing non member posts In-Reply-To: References: <48AA5DE0.9010808@voltaire.com> Message-ID: FWIW: I think we all know each other's positions on open vs. closed lists. :-) I only did what I did this weekend to stop the inbox tragedy. I leave future actions (such as re-enabling anonymous posting) up to Jeff Becker, the real sysadmin. I'd personally be in favor of removing *all* aliases for openib.org. On Aug 19, 2008, at 9:21 AM, Roland Dreier wrote: >> Disallowing non members posts to the general list is problematic, >> since as of the below MAINTAINERS entry this is where people from the >> kernel community post issues they have with the RDMA stack, and you >> don't expect everyone to subscribe the list... > > I agree... closed mailing lists are a pain for lists like general@ > where > we want to encourage people, even non-subscribers, to drop by and > report > bugs, and forcing them to subscribe raises an unnecessary barrier. > > It's probably acceptable to leave ewg closed, since the discussion on > that list generally involves only subscribers. > > In any case it would be interesting to understand exactly why ewg > started getting such a huge flood of traffic anyway. 
It seemed that > the > majority of mails were not actually spam but rather backscatter bounce > messages caused by spam with a forged from address of ewg at openib.org. > It seems disabling the ewg at openib.org address would fix things for > now, > and if there were some way to stop mailman from forwarding bounce > messages, that would deal with the backscatter issue more permanently. > > - R. -- Jeff Squyres Cisco Systems From jeff at splitrockpr.com Tue Aug 19 11:10:42 2008 From: jeff at splitrockpr.com (Jeffrey Scott) Date: Tue, 19 Aug 2008 11:10:42 -0700 Subject: [ofa-general] Don't miss the IBTA Technical Forum '08! Message-ID: <657B0C1542D14430A878C2419D9FA0EE@Gaucho> Hello OFA Members. We are rapidly approaching this year's IBTA Technical Forum; it's just four weeks away! The theme this year is "InfiniBand and the Enterprise Data Center" and the IBTA has put together a compelling agenda with end-user presentations from General Motors, France Telecom and others, as well as an analyst presentation from Gartner and an interactive panel discussion on the future of InfiniBand. Please see below for more information; register now to receive the early bird discount. Date: Monday, September 15, 2008 Time: 8am - 5pm with networking reception immediately following Location: Harrah's Las Vegas Register: www.regonline.com/IBTATechForum08 Rate: Early bird rate is $249; after September 1 the rate increases to $299 Agenda: http://www.infinibandta.org/events/IBTATechForum08_ The IBTA needs your help spreading the word! The OFA is one of the sponsors for the networking reception taking place immediately following the technical forum. We would like to see the OFA well represented. The IBTA's Marketing Working Group has created a formal invitation (please see the attached) for you to forward to colleagues/vendors/partners/customers. Please assist the IBTA in spreading the word to the entire InfiniBand community, and plan on joining us in Las Vegas! 
If you have any questions, please contact Samantha Spears at 206-322-1167 x115 or samanthas at owenmedia.com.

-----------------------------------
Jeffrey Scott
Split Rock Communications
408-884-4017
408-348-3651 Mobile
408-884-3900 Fax
www.SplitRockPR.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: IBTA TechForum 08 Invite.pdf
Type: application/pdf
Size: 2029867 bytes
Desc: not available
URL:

From sashak at voltaire.com Tue Aug 19 12:01:04 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 19 Aug 2008 22:01:04 +0300
Subject: [ofa-general] Re: [IBSIM] Parse sim cmds by name not first character
In-Reply-To: <1219161065.29252.129.camel@cardanus.llnl.gov>
References: <1218661626.16508.579.camel@cardanus.llnl.gov> <20080818203946.GN27204@sashak.voltaire.com> <1219094843.29252.104.camel@cardanus.llnl.gov> <20080818222206.GT27204@sashak.voltaire.com> <1219099803.29252.118.camel@cardanus.llnl.gov> <20080818230113.GU27204@sashak.voltaire.com> <1219161065.29252.129.camel@cardanus.llnl.gov>
Message-ID: <20080819190104.GD27535@sashak.voltaire.com>

Hi Al,

On 08:51 Tue 19 Aug , Al Chu wrote:
>
> I also realized that this approach (which is in the patch I posted last
> night), would allow typos like "star" and "hel" to work.
>
> Is that ok with you?

Yes, absolutely. I see this not as typos but rather as "partial" printing :).
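
For reference, the prefix matching discussed in this thread can be sketched roughly as below. This is only an illustration, not the ibsim code: the command table and match_cmd() are hypothetical names, and unlike the snippet quoted earlier in the thread it rejects an empty word, since strncasecmp() with a length of 0 compares equal to anything.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <strings.h>

/* Hypothetical table: the command names appear in the thread, the table does not. */
const char *const sim_cmds[] = { "Dump", "Route", "Start", "Help" };

/* Return the index of the first command that the leading word of
 * `line` abbreviates (case-insensitively), or -1 if there is none. */
int match_cmd(const char *line)
{
	size_t cmd_len = 0;
	size_t i;

	assert(line != NULL);

	/* Length of the leading alphabetic word; the loop also stops at NUL. */
	while (isalpha((unsigned char)line[cmd_len]))
		cmd_len++;

	/* n == 0 would make strncasecmp() report a match for every command. */
	if (cmd_len == 0)
		return -1;

	for (i = 0; i < sizeof(sim_cmds) / sizeof(sim_cmds[0]); i++)
		if (!strncasecmp(line, sim_cmds[i], cmd_len))
			return (int)i;

	return -1;
}
```

With this table, "D", "du" and "Dump" all resolve to Dump, "star" to Start and "hel" to Help, while a longer word such as "Dumpster" matches nothing. When two commands share a prefix, the first table entry wins.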
Sasha From sashak at voltaire.com Tue Aug 19 12:13:33 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 19 Aug 2008 22:13:33 +0300 Subject: [ofa-general] Re: [IBSIM] Parse sim cmds by name not first character In-Reply-To: <1219106390.29252.125.camel@cardanus.llnl.gov> References: <1218661626.16508.579.camel@cardanus.llnl.gov> <20080818203946.GN27204@sashak.voltaire.com> <1219094843.29252.104.camel@cardanus.llnl.gov> <20080818222206.GT27204@sashak.voltaire.com> <1219099803.29252.118.camel@cardanus.llnl.gov> <20080818230113.GU27204@sashak.voltaire.com> <1219106390.29252.125.camel@cardanus.llnl.gov> Message-ID: <20080819191333.GE27535@sashak.voltaire.com> On 17:39 Mon 18 Aug , Al Chu wrote: > From 8a1f05b353469054564da75663e3a95744a1f8ec Mon Sep 17 00:00:00 2001 > From: Albert Chu > Date: Wed, 13 Aug 2008 13:53:14 -0700 > Subject: [PATCH] parse sim cmds via full name > > > Signed-off-by: Albert Chu Applied. Thanks. When testing this, I paid attention that originally "#" command was used for printing line. Frankly I have no idea why it was needed, but put it back anyway. Also some flow simplification. Looks fine for you? Sasha >From aa5ee3a71a4b3bd39e0d258614be7db6f1d640a6 Mon Sep 17 00:00:00 2001 From: Sasha Khapyorsky Date: Tue, 19 Aug 2008 22:06:50 +0300 Subject: [PATCH] ibsim/sim_cmd: consolidate flows Consolidate and simplify flow. Return back line printing on '#' command. 
Signed-off-by: Sasha Khapyorsky --- ibsim/sim_cmd.c | 22 +++++++--------------- 1 files changed, 7 insertions(+), 15 deletions(-) diff --git a/ibsim/sim_cmd.c b/ibsim/sim_cmd.c index 4d70517..d55fb4c 100644 --- a/ibsim/sim_cmd.c +++ b/ibsim/sim_cmd.c @@ -762,16 +762,14 @@ int do_cmd(char *buf, FILE *f) for (line = buf; *line && isspace(*line); line++) ; - /* special cases */ - if (*line == '!') - r = sim_cmd_file(f, line); - else if (*line == '#' || *line == '\n' || *line == '\0') - goto out; - while (!isspace(line[cmd_len])) cmd_len++; - if (!strncasecmp(line, "Dump", cmd_len)) + if (*line == '#') + fprintf(f, line); + else if (*line == '!') + r = sim_cmd_file(f, line); + else if (!strncasecmp(line, "Dump", cmd_len)) r = dump_net(f, line); else if (!strncasecmp(line, "Route", cmd_len)) r = dump_route(f, line); @@ -816,14 +814,8 @@ int do_cmd(char *buf, FILE *f) * * please specify new command support below this comment. */ - else { - char cmdbuf[cmd_len+1]; - - memset(cmdbuf, '\0', cmd_len+1); - strncpy(cmdbuf, line, cmd_len); + else if (*line != '\n' && *line != '\0') + fprintf(f, "command \'%s\' unknown - skipped\n", line); - fprintf(f, "command %s unknown - skipped\n", cmdbuf); - } -out: return r; } -- 1.5.4.rc2.60.gb2e62 From rdreier at cisco.com Tue Aug 19 12:52:44 2008 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 19 Aug 2008 12:52:44 -0700 Subject: [ofa-general] How many processes on a node can open IB device ? In-Reply-To: <58C6777539C300489D145B0F8E29C32815EA8DD024@GVW0673EXC.americas.hpqcorp.net> (Changqing Tang's message of "Sat, 16 Aug 2008 15:14:08 +0000") References: <58C6777539C300489D145B0F8E29C32815EA8DD024@GVW0673EXC.americas.hpqcorp.net> Message-ID: > I have simple IBV code, which only open the device and create PD. > (attached below), then the code sleep there. > > When I start as many processes as I could, it fails at 895 copies, it fails with error: That sounds right for mlx4 with default firmware on a 4KB page size (ie x86) system. 
There are 1024 pages of user access registers available, but 128 + 1 = 129 are reserved for internal driver use. So that would leave 895 available for userspace use, exactly as you found. You should be able to build firmware that supports more processes, but I believe there may be some performance/stability tradeoffs related to that -- Mellanox could tell you more. - R. From changquing.tang at hp.com Tue Aug 19 13:08:31 2008 From: changquing.tang at hp.com (Tang, Changqing) Date: Tue, 19 Aug 2008 20:08:31 +0000 Subject: [ofa-general] How many processes on a node can open IB device ? In-Reply-To: References: <58C6777539C300489D145B0F8E29C32815EA8DD024@GVW0673EXC.americas.hpqcorp.net> Message-ID: <58C6777539C300489D145B0F8E29C32815F875FBD0@GVW0673EXC.americas.hpqcorp.net> Roland: Thank you very much for the info. I hope Mellanox can tell me what to do next, Our project needs to run 2048 ranks on a node, every rank has IB communication(most of them are sleeping, only a few are active). --CQ > -----Original Message----- > From: Roland Dreier [mailto:rdreier at cisco.com] > Sent: Tuesday, August 19, 2008 2:53 PM > To: Tang, Changqing > Cc: general at lists.openfabrics.org > Subject: Re: [ofa-general] How many processes on a node can > open IB device ? > > > I have simple IBV code, which only open the device and create PD. > > (attached below), then the code sleep there. > > > > When I start as many processes as I could, it fails at > 895 copies, it fails with error: > > That sounds right for mlx4 with default firmware on a 4KB > page size (ie > x86) system. There are 1024 pages of user access registers > available, but 128 + 1 = 129 are reserved for internal driver > use. So that would leave 895 available for userspace use, > exactly as you found. > > You should be able to build firmware that supports more > processes, but I believe there may be some > performance/stability tradeoffs related to that -- Mellanox > could tell you more. > > - R. 
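
As a rough check of the arithmetic quoted above, here is a small sketch. The names are made up for illustration and are not mlx4 driver identifiers, and it assumes one user access register page per process that opens the device, which the thread implies but does not spell out.

```c
#include <assert.h>

/* Figures quoted in the thread for mlx4 with default firmware and 4KB pages. */
enum {
	UAR_PAGES_TOTAL  = 1024,	/* user access register pages available */
	UAR_PAGES_DRIVER = 128 + 1	/* pages reserved for internal driver use */
};

/* Pages left for userspace, i.e. how many processes can open the device
 * under the one-page-per-process assumption stated above. */
int uar_pages_for_userspace(int total, int reserved)
{
	assert(total >= reserved);
	return total - reserved;
}
```

Calling uar_pages_for_userspace(UAR_PAGES_TOTAL, UAR_PAGES_DRIVER) gives 895, the count at which the test program started failing.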
> From chu11 at llnl.gov Tue Aug 19 14:22:43 2008 From: chu11 at llnl.gov (Al Chu) Date: Tue, 19 Aug 2008 14:22:43 -0700 Subject: [ofa-general] Re: [IBSIM] Parse sim cmds by name not first character In-Reply-To: <20080819191333.GE27535@sashak.voltaire.com> References: <1218661626.16508.579.camel@cardanus.llnl.gov> <20080818203946.GN27204@sashak.voltaire.com> <1219094843.29252.104.camel@cardanus.llnl.gov> <20080818222206.GT27204@sashak.voltaire.com> <1219099803.29252.118.camel@cardanus.llnl.gov> <20080818230113.GU27204@sashak.voltaire.com> <1219106390.29252.125.camel@cardanus.llnl.gov> <20080819191333.GE27535@sashak.voltaire.com> Message-ID: <1219180963.29252.131.camel@cardanus.llnl.gov> Hey Sasha, On Tue, 2008-08-19 at 22:13 +0300, Sasha Khapyorsky wrote: > On 17:39 Mon 18 Aug , Al Chu wrote: > > From 8a1f05b353469054564da75663e3a95744a1f8ec Mon Sep 17 00:00:00 2001 > > From: Albert Chu > > Date: Wed, 13 Aug 2008 13:53:14 -0700 > > Subject: [PATCH] parse sim cmds via full name > > > > > > Signed-off-by: Albert Chu > > Applied. Thanks. > > When testing this, I paid attention that originally "#" command was > used for printing line. Frankly I have no idea why it was needed, but put > it back anyway. Also some flow simplification. Looks fine for you? Looks fine to me. Thanks. Al > Sasha > > > From aa5ee3a71a4b3bd39e0d258614be7db6f1d640a6 Mon Sep 17 00:00:00 2001 > From: Sasha Khapyorsky > Date: Tue, 19 Aug 2008 22:06:50 +0300 > Subject: [PATCH] ibsim/sim_cmd: consolidate flows > > Consolidate and simplify flow. Return back line printing on '#' command. 
> > Signed-off-by: Sasha Khapyorsky > --- > ibsim/sim_cmd.c | 22 +++++++--------------- > 1 files changed, 7 insertions(+), 15 deletions(-) > > diff --git a/ibsim/sim_cmd.c b/ibsim/sim_cmd.c > index 4d70517..d55fb4c 100644 > --- a/ibsim/sim_cmd.c > +++ b/ibsim/sim_cmd.c > @@ -762,16 +762,14 @@ int do_cmd(char *buf, FILE *f) > > for (line = buf; *line && isspace(*line); line++) ; > > - /* special cases */ > - if (*line == '!') > - r = sim_cmd_file(f, line); > - else if (*line == '#' || *line == '\n' || *line == '\0') > - goto out; > - > while (!isspace(line[cmd_len])) > cmd_len++; > > - if (!strncasecmp(line, "Dump", cmd_len)) > + if (*line == '#') > + fprintf(f, line); > + else if (*line == '!') > + r = sim_cmd_file(f, line); > + else if (!strncasecmp(line, "Dump", cmd_len)) > r = dump_net(f, line); > else if (!strncasecmp(line, "Route", cmd_len)) > r = dump_route(f, line); > @@ -816,14 +814,8 @@ int do_cmd(char *buf, FILE *f) > * > * please specify new command support below this comment. 
> */ > - else { > - char cmdbuf[cmd_len+1]; > - > - memset(cmdbuf, '\0', cmd_len+1); > - strncpy(cmdbuf, line, cmd_len); > + else if (*line != '\n' && *line != '\0') > + fprintf(f, "command \'%s\' unknown - skipped\n", line); > > - fprintf(f, "command %s unknown - skipped\n", cmdbuf); > - } > -out: > return r; > } -- Albert Chu chu11 at llnl.gov 925-422-5311 Computer Scientist High Performance Systems Division Lawrence Livermore National Laboratory From rdreier at cisco.com Tue Aug 19 15:03:13 2008 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 19 Aug 2008 15:03:13 -0700 Subject: [ofa-general] [GIT PULL] please pull infiniband.git Message-ID: Linus, please pull from master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git for-linus This tree is also available from kernel.org mirrors at: git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git for-linus Dave Olson (1): IB/ipath: Fix incorrect check for max physical address in TID Ralph Campbell (1): IB/ipath: Fix lost UD send work request Roland Dreier (2): IPoIB: Fix deadlock on RTNL in ipoib_stop() Merge branches 'ipath' and 'ipoib' into for-linus drivers/infiniband/hw/ipath/ipath_iba7220.c | 2 +- drivers/infiniband/hw/ipath/ipath_ud.c | 8 ++++++-- drivers/infiniband/ulp/ipoib/ipoib_main.c | 19 +++++++++---------- drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 10 +++++++++- 4 files changed, 25 insertions(+), 14 deletions(-) diff --git a/drivers/infiniband/hw/ipath/ipath_iba7220.c b/drivers/infiniband/hw/ipath/ipath_iba7220.c index d90f5e9..9839e20 100644 --- a/drivers/infiniband/hw/ipath/ipath_iba7220.c +++ b/drivers/infiniband/hw/ipath/ipath_iba7220.c @@ -1720,7 +1720,7 @@ static void ipath_7220_put_tid(struct ipath_devdata *dd, u64 __iomem *tidptr, "not 2KB aligned!\n", pa); return; } - if (pa >= (1UL << IBA7220_TID_SZ_SHIFT)) { + if (chippa >= (1UL << IBA7220_TID_SZ_SHIFT)) { ipath_dev_err(dd, "BUG: Physical page address 0x%lx " "larger than supported\n", pa); diff --git 
a/drivers/infiniband/hw/ipath/ipath_ud.c b/drivers/infiniband/hw/ipath/ipath_ud.c index 36aa242..729446f 100644 --- a/drivers/infiniband/hw/ipath/ipath_ud.c +++ b/drivers/infiniband/hw/ipath/ipath_ud.c @@ -267,6 +267,7 @@ int ipath_make_ud_req(struct ipath_qp *qp) u16 lrh0; u16 lid; int ret = 0; + int next_cur; spin_lock_irqsave(&qp->s_lock, flags); @@ -290,8 +291,9 @@ int ipath_make_ud_req(struct ipath_qp *qp) goto bail; wqe = get_swqe_ptr(qp, qp->s_cur); - if (++qp->s_cur >= qp->s_size) - qp->s_cur = 0; + next_cur = qp->s_cur + 1; + if (next_cur >= qp->s_size) + next_cur = 0; /* Construct the header. */ ah_attr = &to_iah(wqe->wr.wr.ud.ah)->attr; @@ -315,6 +317,7 @@ int ipath_make_ud_req(struct ipath_qp *qp) qp->s_flags |= IPATH_S_WAIT_DMA; goto bail; } + qp->s_cur = next_cur; spin_unlock_irqrestore(&qp->s_lock, flags); ipath_ud_loopback(qp, wqe); spin_lock_irqsave(&qp->s_lock, flags); @@ -323,6 +326,7 @@ int ipath_make_ud_req(struct ipath_qp *qp) } } + qp->s_cur = next_cur; extra_bytes = -wqe->length & 3; nwords = (wqe->length + extra_bytes) >> 2; diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index f51201b..7e9e218 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -156,14 +156,8 @@ static int ipoib_stop(struct net_device *dev) netif_stop_queue(dev); - /* - * Now flush workqueue to make sure a scheduled task doesn't - * bring our internal state back up. 
- */ - flush_workqueue(ipoib_workqueue); - - ipoib_ib_dev_down(dev, 1); - ipoib_ib_dev_stop(dev, 1); + ipoib_ib_dev_down(dev, 0); + ipoib_ib_dev_stop(dev, 0); if (!test_bit(IPOIB_FLAG_SUBINTERFACE, &priv->flags)) { struct ipoib_dev_priv *cpriv; @@ -1314,7 +1308,7 @@ sysfs_failed: register_failed: ib_unregister_event_handler(&priv->event_handler); - flush_scheduled_work(); + flush_workqueue(ipoib_workqueue); event_failed: ipoib_dev_cleanup(priv->dev); @@ -1373,7 +1367,12 @@ static void ipoib_remove_one(struct ib_device *device) list_for_each_entry_safe(priv, tmp, dev_list, list) { ib_unregister_event_handler(&priv->event_handler); - flush_scheduled_work(); + + rtnl_lock(); + dev_change_flags(priv->dev, priv->dev->flags & ~IFF_UP); + rtnl_unlock(); + + flush_workqueue(ipoib_workqueue); unregister_netdev(priv->dev); ipoib_dev_cleanup(priv->dev); diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c index 8950e95..ac33c8f 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c @@ -392,8 +392,16 @@ static int ipoib_mcast_join_complete(int status, &priv->mcast_task, 0); mutex_unlock(&mcast_mutex); - if (mcast == priv->broadcast) + if (mcast == priv->broadcast) { + /* + * Take RTNL lock here to avoid racing with + * ipoib_stop() and turning the carrier back + * on while a device is being removed. 
+ */ + rtnl_lock(); netif_carrier_on(dev); + rtnl_unlock(); + } return 0; }

From rdreier at cisco.com Tue Aug 19 15:20:44 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Tue, 19 Aug 2008 15:20:44 -0700
Subject: [ofa-general] [PATCH v2] ipiob: fix rtnl deadlock
In-Reply-To: (Roland Dreier's message of "Sun, 17 Aug 2008 17:31:05 -0700")
References: <4899CF0A.1060509@Voltaire.COM> <32cb786f0808081155o19f8fb9dm217cd6996dffa3e5@mail.gmail.com> <32cb786f0808090538j272842b1r5117547cccde0d06@mail.gmail.com> <32cb786f0808161218o417553b5w1738a517f0eb468a@mail.gmail.com>
Message-ID:

OK, I went ahead and merged my patch and sent it on to Linus.

From cameron at harr.org Tue Aug 19 16:35:45 2008
From: cameron at harr.org (Cameron Harr)
Date: Tue, 19 Aug 2008 17:35:45 -0600
Subject: [ofa-general] SRP f/s data corruption
Message-ID: <48AB58D1.7090904@harr.org>

Hello,

I'm seeing data corruption on an SRP-exported device and I'm fishing for any suggestions. I've seen the corruption in several ways, but here's a really simple way to reproduce it: I format /dev/md0 with ext3 on a host (medusa) and export md0 via SRP. I mount it on the initiator (harpie), copy over a large file and verify that it's md5sum is the same as the original. Then I unmount/remount and see that the md5sum is different.
[root at harpie ~]# mount /mnt/medusa/
[root at harpie ~]# cp /usr/src/OFED-1.3.1.tgz /mnt/medusa/
[root at harpie ~]# md5sum /usr/src/OFED-1.3.1.tgz
69fe510fc78a39b627713cfb49ad4ca3 /usr/src/OFED-1.3.1.tgz
[root at harpie ~]# md5sum /mnt/medusa/OFED-1.3.1.tgz
69fe510fc78a39b627713cfb49ad4ca3 /mnt/medusa/OFED-1.3.1.tgz
[root at harpie ~]# umount /mnt/medusa/
[root at harpie ~]# mount /mnt/medusa/
[root at harpie ~]# md5sum /mnt/medusa/OFED-1.3.1.tgz
5b761a931bf8fa7273cccc505ff13121 /mnt/medusa/OFED-1.3.1.tgz

As a side note, right after I copy over the file and see that it has the correct md5sum, I can mount the same device read only on the target server and see the file, but it has a different md5sum.

In searching, I saw this problem here and tried dropping scst_threads to 1, to no avail:
http://osdir.com/ml/windows.devel.drivers.openib/2007-12/msg00050.html

Ideas?

Thanks,
Cameron

From yevgenyp at mellanox.co.il Wed Aug 20 06:15:30 2008
From: yevgenyp at mellanox.co.il (Yevgeny Petrilin)
Date: Wed, 20 Aug 2008 16:15:30 +0300
Subject: [ofa-general][PATCH 01/11 v3] mlx4: Qp range reservation
Message-ID: <48AC18F2.80600@mellanox.co.il>

From: Yevgeny Petrilin
Date: Wed, 20 Aug 2008 09:59:07 +0300
Subject: [PATCH] mlx4: Qp range reservation

Prior to allocating a qp, one needs to reserve an aligned range of qps. The change is made to enable allocation of consecutive qps.

Diff from previous version:
-mlx4_bitmap_free() uses mlx4_bitmap_free_range() with range=1.
-Free qp range if failed to allocate qp.

Signed-off-by: Yevgeny Petrilin --- drivers/infiniband/hw/mlx4/qp.c | 13 ++++++- drivers/net/mlx4/alloc.c | 73 +++++++++++++++++++++++++++++++++++++- drivers/net/mlx4/mlx4.h | 2 + drivers/net/mlx4/qp.c | 44 ++++++++++++++++------- include/linux/mlx4/device.h | 5 ++- 5 files changed, 120 insertions(+), 17 deletions(-) diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c index f29dbb7..472bd60 100644 --- a/drivers/infiniband/hw/mlx4/qp.c +++ b/drivers/infiniband/hw/mlx4/qp.c @@ -545,10 +545,17 @@ static int create_qp_common(struct mlx4_ib_dev *dev, struct ib_pd *pd, } } - err = mlx4_qp_alloc(dev->dev, sqpn, &qp->mqp); + if (!sqpn) + err = mlx4_qp_reserve_range(dev->dev, 1, 1, &sqpn); if (err) goto err_wrid; + err = mlx4_qp_alloc(dev->dev, sqpn, &qp->mqp); + if (err) { + mlx4_qp_release_range(dev->dev, sqpn, 1); + goto err_wrid; + } + /* * Hardware wants QPN written in big-endian order (after * shifting) for send doorbell. Precompute this value to save @@ -655,6 +662,10 @@ static void destroy_qp_common(struct mlx4_ib_dev *dev, struct mlx4_ib_qp *qp, mlx4_ib_unlock_cqs(send_cq, recv_cq); mlx4_qp_free(dev->dev, &qp->mqp); + + if (!is_sqp(dev, qp)) + mlx4_qp_release_range(dev->dev, qp->mqp.qpn, 1); + mlx4_mtt_cleanup(dev->dev, &qp->mtt); if (is_user) { diff --git a/drivers/net/mlx4/alloc.c b/drivers/net/mlx4/alloc.c index 096bca5..8cdf26a 100644 --- a/drivers/net/mlx4/alloc.c +++ b/drivers/net/mlx4/alloc.c @@ -65,16 +65,85 @@ u32 mlx4_bitmap_alloc(struct mlx4_bitmap *bitmap) void mlx4_bitmap_free(struct mlx4_bitmap *bitmap, u32 obj) { + mlx4_bitmap_free_range(bitmap, obj, 1); +} + +static unsigned long find_aligned_range(unsigned long *bitmap, + u32 start, u32 nbits, + int len, int align) +{ + unsigned long end, i; + +again: + start = ALIGN(start, align); + while ((start < nbits) && test_bit(start, bitmap)) + start += align; + if (start >= nbits) + return -1; + + end = start+len; + if (end > nbits) + return -1; + for (i = start+1; i < end; i++) 
{ + if (test_bit(i, bitmap)) { + start = i+1; + goto again; + } + } + return start; +} + +u32 mlx4_bitmap_alloc_range(struct mlx4_bitmap *bitmap, int cnt, int align) +{ + u32 obj, i; + + if (likely(cnt == 1 && align == 1)) + return mlx4_bitmap_alloc(bitmap); + + spin_lock(&bitmap->lock); + + obj = find_aligned_range(bitmap->table, bitmap->last, + bitmap->max, cnt, align); + if (obj >= bitmap->max) { + bitmap->top = (bitmap->top + bitmap->max) & bitmap->mask; + obj = find_aligned_range(bitmap->table, 0, + bitmap->max, + cnt, align); + } + + if (obj < bitmap->max) { + for (i = 0; i < cnt; i++) + set_bit(obj+i, bitmap->table); + if (obj == bitmap->last) { + bitmap->last = (obj + cnt); + if (bitmap->last >= bitmap->max) + bitmap->last = 0; + } + obj |= bitmap->top; + } else + obj = -1; + + spin_unlock(&bitmap->lock); + + return obj; +} + +void mlx4_bitmap_free_range(struct mlx4_bitmap *bitmap, u32 obj, int cnt) +{ + u32 i; + obj &= bitmap->max - 1; spin_lock(&bitmap->lock); - clear_bit(obj, bitmap->table); + for (i = 0; i < cnt; i++) + clear_bit(obj+i, bitmap->table); bitmap->last = min(bitmap->last, obj); bitmap->top = (bitmap->top + bitmap->max) & bitmap->mask; spin_unlock(&bitmap->lock); } -int mlx4_bitmap_init(struct mlx4_bitmap *bitmap, u32 num, u32 mask, u32 reserved) +int mlx4_bitmap_init(struct mlx4_bitmap *bitmap, + u32 num, u32 mask, u32 reserved) { int i; diff --git a/drivers/net/mlx4/mlx4.h b/drivers/net/mlx4/mlx4.h index 5337e3a..b55ddab 100644 --- a/drivers/net/mlx4/mlx4.h +++ b/drivers/net/mlx4/mlx4.h @@ -288,6 +288,8 @@ static inline struct mlx4_priv *mlx4_priv(struct mlx4_dev *dev) u32 mlx4_bitmap_alloc(struct mlx4_bitmap *bitmap); void mlx4_bitmap_free(struct mlx4_bitmap *bitmap, u32 obj); +u32 mlx4_bitmap_alloc_range(struct mlx4_bitmap *bitmap, int cnt, int align); +void mlx4_bitmap_free_range(struct mlx4_bitmap *bitmap, u32 obj, int cnt); int mlx4_bitmap_init(struct mlx4_bitmap *bitmap, u32 num, u32 mask, u32 reserved); void 
mlx4_bitmap_cleanup(struct mlx4_bitmap *bitmap); diff --git a/drivers/net/mlx4/qp.c b/drivers/net/mlx4/qp.c index c49a860..18ba72a 100644 --- a/drivers/net/mlx4/qp.c +++ b/drivers/net/mlx4/qp.c @@ -147,19 +147,42 @@ int mlx4_qp_modify(struct mlx4_dev *dev, struct mlx4_mtt *mtt, } EXPORT_SYMBOL_GPL(mlx4_qp_modify); -int mlx4_qp_alloc(struct mlx4_dev *dev, int sqpn, struct mlx4_qp *qp) +int mlx4_qp_reserve_range(struct mlx4_dev *dev, int cnt, int align, int *base) +{ + struct mlx4_priv *priv = mlx4_priv(dev); + struct mlx4_qp_table *qp_table = &priv->qp_table; + int qpn; + + qpn = mlx4_bitmap_alloc_range(&qp_table->bitmap, cnt, align); + if (qpn == -1) + return -ENOMEM; + + *base = qpn; + return 0; +} +EXPORT_SYMBOL_GPL(mlx4_qp_reserve_range); + +void mlx4_qp_release_range(struct mlx4_dev *dev, int base_qpn, int cnt) +{ + struct mlx4_priv *priv = mlx4_priv(dev); + struct mlx4_qp_table *qp_table = &priv->qp_table; + if (base_qpn < dev->caps.sqp_start + 8) + return; + + mlx4_bitmap_free_range(&qp_table->bitmap, base_qpn, cnt); +} +EXPORT_SYMBOL_GPL(mlx4_qp_release_range); + +int mlx4_qp_alloc(struct mlx4_dev *dev, int qpn, struct mlx4_qp *qp) { struct mlx4_priv *priv = mlx4_priv(dev); struct mlx4_qp_table *qp_table = &priv->qp_table; int err; - if (sqpn) - qp->qpn = sqpn; - else { - qp->qpn = mlx4_bitmap_alloc(&qp_table->bitmap); - if (qp->qpn == -1) - return -ENOMEM; - } + if (!qpn) + return -EINVAL; + + qp->qpn = qpn; err = mlx4_table_get(dev, &qp_table->qp_table, qp->qpn); if (err) @@ -208,9 +231,6 @@ err_put_qp: mlx4_table_put(dev, &qp_table->qp_table, qp->qpn); err_out: - if (!sqpn) - mlx4_bitmap_free(&qp_table->bitmap, qp->qpn); - return err; } EXPORT_SYMBOL_GPL(mlx4_qp_alloc); @@ -240,8 +260,6 @@ void mlx4_qp_free(struct mlx4_dev *dev, struct mlx4_qp *qp) mlx4_table_put(dev, &qp_table->auxc_table, qp->qpn); mlx4_table_put(dev, &qp_table->qp_table, qp->qpn); - if (qp->qpn >= dev->caps.sqp_start + 8) - mlx4_bitmap_free(&qp_table->bitmap, qp->qpn); } 
EXPORT_SYMBOL_GPL(mlx4_qp_free); diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h index 655ea0d..1483c09 100644 --- a/include/linux/mlx4/device.h +++ b/include/linux/mlx4/device.h @@ -396,7 +396,10 @@ int mlx4_cq_alloc(struct mlx4_dev *dev, int nent, struct mlx4_mtt *mtt, int collapsed); void mlx4_cq_free(struct mlx4_dev *dev, struct mlx4_cq *cq); -int mlx4_qp_alloc(struct mlx4_dev *dev, int sqpn, struct mlx4_qp *qp); +int mlx4_qp_reserve_range(struct mlx4_dev *dev, int cnt, int align, int *base); +void mlx4_qp_release_range(struct mlx4_dev *dev, int base_qpn, int cnt); + +int mlx4_qp_alloc(struct mlx4_dev *dev, int qpn, struct mlx4_qp *qp); void mlx4_qp_free(struct mlx4_dev *dev, struct mlx4_qp *qp); int mlx4_srq_alloc(struct mlx4_dev *dev, u32 pdn, struct mlx4_mtt *mtt, -- 1.5.4 From yevgenyp at mellanox.co.il Wed Aug 20 06:16:52 2008 From: yevgenyp at mellanox.co.il (Yevgeny Petrilin) Date: Wed, 20 Aug 2008 16:16:52 +0300 Subject: [ofa-general][PATCH 02/11 v3] mlx4: Pre reserved Qp regions Message-ID: <48AC1944.9020100@mellanox.co.il> >From 1bd27a5c77823a2c7cce79fe662b54fa13eb4479 Mon Sep 17 00:00:00 2001 From: Yevgeny Petrilin Date: Wed, 20 Aug 2008 10:01:52 +0300 Subject: [PATCH] mlx4: Pre reserved Qp regions. We reserve Qp ranges to be used by other modules in case the ports come up as Ethernet ports. The qps are reserved at the end of the QP table. (This way we assure that they are alligned to their size) We need to consider theese reserved ranges in bitmap creation : The reserved_top parameter. 
Diff from previous version: Using log base 2 for max MAC and VLAN numbers. Allocating only the required size in bitmap allocation, without the bits reserved from the top. Signed-off-by: Yevgeny Petrilin --- drivers/net/mlx4/alloc.c | 30 ++++++++++++-------- drivers/net/mlx4/cq.c | 2 +- drivers/net/mlx4/eq.c | 2 +- drivers/net/mlx4/fw.c | 5 +++ drivers/net/mlx4/fw.h | 2 + drivers/net/mlx4/main.c | 63 ++++++++++++++++++++++++++++++++++++++---- drivers/net/mlx4/mcg.c | 4 +- drivers/net/mlx4/mlx4.h | 4 ++- drivers/net/mlx4/mr.c | 2 +- drivers/net/mlx4/pd.c | 4 +- drivers/net/mlx4/qp.c | 50 ++++++++++++++++++++++++++++++++- drivers/net/mlx4/srq.c | 2 +- include/linux/mlx4/device.h | 19 ++++++++++++- include/linux/mlx4/qp.h | 4 +++ 14 files changed, 163 insertions(+), 30 deletions(-) diff --git a/drivers/net/mlx4/alloc.c b/drivers/net/mlx4/alloc.c index 8cdf26a..5538db1 100644 --- a/drivers/net/mlx4/alloc.c +++ b/drivers/net/mlx4/alloc.c @@ -47,13 +47,16 @@ u32 mlx4_bitmap_alloc(struct mlx4_bitmap *bitmap) obj = find_next_zero_bit(bitmap->table, bitmap->max, bitmap->last); if (obj >= bitmap->max) { - bitmap->top = (bitmap->top + bitmap->max) & bitmap->mask; + bitmap->top = (bitmap->top + bitmap->max + bitmap->reserved_top) + & bitmap->mask; obj = find_first_zero_bit(bitmap->table, bitmap->max); } if (obj < bitmap->max) { set_bit(obj, bitmap->table); - bitmap->last = (obj + 1) & (bitmap->max - 1); + bitmap->last = (obj + 1); + if (bitmap->last == bitmap->max) + bitmap->last = 0; obj |= bitmap->top; } else obj = -1; @@ -105,9 +108,9 @@ u32 mlx4_bitmap_alloc_range(struct mlx4_bitmap *bitmap, int cnt, int align) obj = find_aligned_range(bitmap->table, bitmap->last, bitmap->max, cnt, align); if (obj >= bitmap->max) { - bitmap->top = (bitmap->top + bitmap->max) & bitmap->mask; - obj = find_aligned_range(bitmap->table, 0, - bitmap->max, + bitmap->top = (bitmap->top + bitmap->max + bitmap->reserved_top) + & bitmap->mask; + obj = find_aligned_range(bitmap->table, 0, bitmap->max, cnt, 
align); } @@ -132,18 +135,19 @@ void mlx4_bitmap_free_range(struct mlx4_bitmap *bitmap, u32 obj, int cnt) { u32 i; - obj &= bitmap->max - 1; + obj &= bitmap->max + bitmap->reserved_top - 1; spin_lock(&bitmap->lock); for (i = 0; i < cnt; i++) clear_bit(obj+i, bitmap->table); bitmap->last = min(bitmap->last, obj); - bitmap->top = (bitmap->top + bitmap->max) & bitmap->mask; + bitmap->top = (bitmap->top + bitmap->max + bitmap->reserved_top) + & bitmap->mask; spin_unlock(&bitmap->lock); } -int mlx4_bitmap_init(struct mlx4_bitmap *bitmap, - u32 num, u32 mask, u32 reserved) +int mlx4_bitmap_init(struct mlx4_bitmap *bitmap, u32 num, u32 mask, + u32 reserved_bot, u32 reserved_top) { int i; @@ -153,14 +157,16 @@ int mlx4_bitmap_init(struct mlx4_bitmap *bitmap, bitmap->last = 0; bitmap->top = 0; - bitmap->max = num; + bitmap->max = num - reserved_top; bitmap->mask = mask; + bitmap->reserved_top = reserved_top; spin_lock_init(&bitmap->lock); - bitmap->table = kzalloc(BITS_TO_LONGS(num) * sizeof (long), GFP_KERNEL); + bitmap->table = kzalloc(BITS_TO_LONGS(bitmap->max) * + sizeof (long), GFP_KERNEL); if (!bitmap->table) return -ENOMEM; - for (i = 0; i < reserved; ++i) + for (i = 0; i < reserved_bot; ++i) set_bit(i, bitmap->table); return 0; diff --git a/drivers/net/mlx4/cq.c b/drivers/net/mlx4/cq.c index 9bb50e3..b7ad282 100644 --- a/drivers/net/mlx4/cq.c +++ b/drivers/net/mlx4/cq.c @@ -300,7 +300,7 @@ int mlx4_init_cq_table(struct mlx4_dev *dev) INIT_RADIX_TREE(&cq_table->tree, GFP_ATOMIC); err = mlx4_bitmap_init(&cq_table->bitmap, dev->caps.num_cqs, - dev->caps.num_cqs - 1, dev->caps.reserved_cqs); + dev->caps.num_cqs - 1, dev->caps.reserved_cqs, 0); if (err) return err; diff --git a/drivers/net/mlx4/eq.c b/drivers/net/mlx4/eq.c index 8a8b561..de16933 100644 --- a/drivers/net/mlx4/eq.c +++ b/drivers/net/mlx4/eq.c @@ -558,7 +558,7 @@ int mlx4_init_eq_table(struct mlx4_dev *dev) int i; err = mlx4_bitmap_init(&priv->eq_table.bitmap, dev->caps.num_eqs, - dev->caps.num_eqs - 1, 
dev->caps.reserved_eqs); + dev->caps.num_eqs - 1, dev->caps.reserved_eqs, 0); if (err) return err; diff --git a/drivers/net/mlx4/fw.c b/drivers/net/mlx4/fw.c index 7e32955..40d8142 100644 --- a/drivers/net/mlx4/fw.c +++ b/drivers/net/mlx4/fw.c @@ -357,6 +357,7 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) #define QUERY_PORT_MTU_OFFSET 0x01 #define QUERY_PORT_WIDTH_OFFSET 0x06 #define QUERY_PORT_MAX_GID_PKEY_OFFSET 0x07 +#define QUERY_PORT_MAX_MACVLAN_OFFSET 0x0a #define QUERY_PORT_MAX_VL_OFFSET 0x0b for (i = 1; i <= dev_cap->num_ports; ++i) { @@ -374,6 +375,10 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) dev_cap->max_pkeys[i] = 1 << (field & 0xf); MLX4_GET(field, outbox, QUERY_PORT_MAX_VL_OFFSET); dev_cap->max_vl[i] = field & 0xf; + MLX4_GET(field, outbox, QUERY_PORT_MAX_MACVLAN_OFFSET); + dev_cap->log_max_macs[i] = field & 0xf; + dev_cap->log_max_vlans[i] = field >> 4; + } } diff --git a/drivers/net/mlx4/fw.h b/drivers/net/mlx4/fw.h index decbb5c..c34e726 100644 --- a/drivers/net/mlx4/fw.h +++ b/drivers/net/mlx4/fw.h @@ -102,6 +102,8 @@ struct mlx4_dev_cap { u32 reserved_lkey; u64 max_icm_sz; int max_gso_sz; + u8 log_max_macs[MLX4_MAX_PORTS + 1]; + u8 log_max_vlans[MLX4_MAX_PORTS + 1]; }; struct mlx4_adapter { diff --git a/drivers/net/mlx4/main.c b/drivers/net/mlx4/main.c index 1252a91..f172bb3 100644 --- a/drivers/net/mlx4/main.c +++ b/drivers/net/mlx4/main.c @@ -85,6 +85,20 @@ static struct mlx4_profile default_profile = { .num_mtt = 1 << 20, }; +static int log_num_mac = 2; +module_param_named(log_num_mac, log_num_mac, int, 0444); +MODULE_PARM_DESC(log_num_mac, "Log 2 Max number of MACs per ETH port (1-7)"); + +static int log_num_vlan; +module_param_named(log_num_vlan, log_num_vlan, int, 0444); +MODULE_PARM_DESC(log_num_vlan, "Log 2 Max number of VLANs per ETH port (0-7)"); + +static int use_prio; +module_param_named(use_prio, use_prio, bool, 0444); +MODULE_PARM_DESC(use_prio, "Enable steering by 
VLAN priority on ETH ports " + "(0/1, default 0)"); + + static int mlx4_dev_cap(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) { int err; @@ -134,7 +148,6 @@ static int mlx4_dev_cap(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) dev->caps.max_rq_sg = dev_cap->max_rq_sg; dev->caps.max_wqes = dev_cap->max_qp_sz; dev->caps.max_qp_init_rdma = dev_cap->max_requester_per_qp; - dev->caps.reserved_qps = dev_cap->reserved_qps; dev->caps.max_srq_wqes = dev_cap->max_srq_sz; dev->caps.max_srq_sge = dev_cap->max_rq_sg - 1; dev->caps.reserved_srqs = dev_cap->reserved_srqs; @@ -163,6 +176,39 @@ static int mlx4_dev_cap(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) dev->caps.stat_rate_support = dev_cap->stat_rate_support; dev->caps.max_gso_sz = dev_cap->max_gso_sz; + dev->caps.log_num_macs = log_num_mac; + dev->caps.log_num_vlans = log_num_vlan; + dev->caps.log_num_prios = use_prio ? 3 : 0; + + for (i = 1; i <= dev->caps.num_ports; ++i) { + if (dev->caps.log_num_macs > dev_cap->log_max_macs[i]) { + dev->caps.log_num_macs = dev_cap->log_max_macs[i]; + mlx4_warn(dev, "Requested number of MACs is too much " + "for port %d, reducing to %d.\n", + i, 1 << dev->caps.log_num_macs); + } + if (dev->caps.log_num_vlans > dev_cap->log_max_vlans[i]) { + dev->caps.log_num_vlans = dev_cap->log_max_vlans[i]; + mlx4_warn(dev, "Requested number of VLANs is too much " + "for port %d, reducing to %d.\n", + i, 1 << dev->caps.log_num_vlans); + } + } + + dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW] = dev_cap->reserved_qps; + dev->caps.reserved_qps_cnt[MLX4_QP_REGION_ETH_ADDR] = + dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FC_ADDR] = + (1 << dev->caps.log_num_macs)* + (1 << dev->caps.log_num_vlans)* + (1 << dev->caps.log_num_prios)* + dev->caps.num_ports; + dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FC_EXCH] = MLX4_NUM_FEXCH; + + dev->caps.reserved_qps = dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW] + + dev->caps.reserved_qps_cnt[MLX4_QP_REGION_ETH_ADDR] + + 
dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FC_EXCH] + + dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FC_EXCH]; + return 0; } @@ -211,7 +257,8 @@ static int mlx4_init_cmpt_table(struct mlx4_dev *dev, u64 cmpt_base, ((u64) (MLX4_CMPT_TYPE_QP * cmpt_entry_sz) << MLX4_CMPT_SHIFT), cmpt_entry_sz, dev->caps.num_qps, - dev->caps.reserved_qps, 0, 0); + dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW], + 0, 0); if (err) goto err; @@ -336,7 +383,8 @@ static int mlx4_init_icm(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap, init_hca->qpc_base, dev_cap->qpc_entry_sz, dev->caps.num_qps, - dev->caps.reserved_qps, 0, 0); + dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW], + 0, 0); if (err) { mlx4_err(dev, "Failed to map QP context memory, aborting.\n"); goto err_unmap_dmpt; @@ -346,7 +394,8 @@ static int mlx4_init_icm(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap, init_hca->auxc_base, dev_cap->aux_entry_sz, dev->caps.num_qps, - dev->caps.reserved_qps, 0, 0); + dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW], + 0, 0); if (err) { mlx4_err(dev, "Failed to map AUXC context memory, aborting.\n"); goto err_unmap_qp; @@ -356,7 +405,8 @@ static int mlx4_init_icm(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap, init_hca->altc_base, dev_cap->altc_entry_sz, dev->caps.num_qps, - dev->caps.reserved_qps, 0, 0); + dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW], + 0, 0); if (err) { mlx4_err(dev, "Failed to map ALTC context memory, aborting.\n"); goto err_unmap_auxc; @@ -366,7 +416,8 @@ static int mlx4_init_icm(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap, init_hca->rdmarc_base, dev_cap->rdmarc_entry_sz << priv->qp_table.rdmarc_shift, dev->caps.num_qps, - dev->caps.reserved_qps, 0, 0); + dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW], + 0, 0); if (err) { mlx4_err(dev, "Failed to map RDMARC context memory, aborting\n"); goto err_unmap_altc; diff --git a/drivers/net/mlx4/mcg.c b/drivers/net/mlx4/mcg.c index c83f88c..592c01a 100644 --- a/drivers/net/mlx4/mcg.c +++ b/drivers/net/mlx4/mcg.c @@ 
-368,8 +368,8 @@ int mlx4_init_mcg_table(struct mlx4_dev *dev) struct mlx4_priv *priv = mlx4_priv(dev); int err; - err = mlx4_bitmap_init(&priv->mcg_table.bitmap, - dev->caps.num_amgms, dev->caps.num_amgms - 1, 0); + err = mlx4_bitmap_init(&priv->mcg_table.bitmap, dev->caps.num_amgms, + dev->caps.num_amgms - 1, 0, 0); if (err) return err; diff --git a/drivers/net/mlx4/mlx4.h b/drivers/net/mlx4/mlx4.h index b55ddab..9e2f44c 100644 --- a/drivers/net/mlx4/mlx4.h +++ b/drivers/net/mlx4/mlx4.h @@ -111,6 +111,7 @@ struct mlx4_bitmap { u32 last; u32 top; u32 max; + u32 reserved_top; u32 mask; spinlock_t lock; unsigned long *table; @@ -290,7 +291,8 @@ u32 mlx4_bitmap_alloc(struct mlx4_bitmap *bitmap); void mlx4_bitmap_free(struct mlx4_bitmap *bitmap, u32 obj); u32 mlx4_bitmap_alloc_range(struct mlx4_bitmap *bitmap, int cnt, int align); void mlx4_bitmap_free_range(struct mlx4_bitmap *bitmap, u32 obj, int cnt); -int mlx4_bitmap_init(struct mlx4_bitmap *bitmap, u32 num, u32 mask, u32 reserved); +int mlx4_bitmap_init(struct mlx4_bitmap *bitmap, u32 num, u32 mask, + u32 reserved_bot, u32 resetrved_top); void mlx4_bitmap_cleanup(struct mlx4_bitmap *bitmap); int mlx4_reset(struct mlx4_dev *dev); diff --git a/drivers/net/mlx4/mr.c b/drivers/net/mlx4/mr.c index 62071d9..d7c6ea5 100644 --- a/drivers/net/mlx4/mr.c +++ b/drivers/net/mlx4/mr.c @@ -459,7 +459,7 @@ int mlx4_init_mr_table(struct mlx4_dev *dev) int err; err = mlx4_bitmap_init(&mr_table->mpt_bitmap, dev->caps.num_mpts, - ~0, dev->caps.reserved_mrws); + ~0, dev->caps.reserved_mrws, 0); if (err) return err; diff --git a/drivers/net/mlx4/pd.c b/drivers/net/mlx4/pd.c index aa61689..26d1a7a 100644 --- a/drivers/net/mlx4/pd.c +++ b/drivers/net/mlx4/pd.c @@ -62,7 +62,7 @@ int mlx4_init_pd_table(struct mlx4_dev *dev) struct mlx4_priv *priv = mlx4_priv(dev); return mlx4_bitmap_init(&priv->pd_bitmap, dev->caps.num_pds, - (1 << 24) - 1, dev->caps.reserved_pds); + (1 << 24) - 1, dev->caps.reserved_pds, 0); } void 
mlx4_cleanup_pd_table(struct mlx4_dev *dev) @@ -100,7 +100,7 @@ int mlx4_init_uar_table(struct mlx4_dev *dev) return mlx4_bitmap_init(&mlx4_priv(dev)->uar_table.bitmap, dev->caps.num_uars, dev->caps.num_uars - 1, - max(128, dev->caps.reserved_uars)); + max(128, dev->caps.reserved_uars), 0); } void mlx4_cleanup_uar_table(struct mlx4_dev *dev) diff --git a/drivers/net/mlx4/qp.c b/drivers/net/mlx4/qp.c index 18ba72a..51e1481 100644 --- a/drivers/net/mlx4/qp.c +++ b/drivers/net/mlx4/qp.c @@ -273,6 +273,7 @@ int mlx4_init_qp_table(struct mlx4_dev *dev) { struct mlx4_qp_table *qp_table = &mlx4_priv(dev)->qp_table; int err; + int reserved_from_top = 0; spin_lock_init(&qp_table->lock); INIT_RADIX_TREE(&dev->qp_table_tree, GFP_ATOMIC); @@ -282,9 +283,40 @@ int mlx4_init_qp_table(struct mlx4_dev *dev) * block of special QPs must be aligned to a multiple of 8, so * round up. */ - dev->caps.sqp_start = ALIGN(dev->caps.reserved_qps, 8); + dev->caps.sqp_start = + ALIGN(dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW], 8); + + { + int sort[MLX4_QP_REGION_COUNT]; + int i, j, tmp; + int last_base = dev->caps.num_qps; + + for (i = 1; i < MLX4_QP_REGION_COUNT; ++i) + sort[i] = i; + + for (i = MLX4_QP_REGION_COUNT; i > 0; --i) { + for (j = 2; j < i; ++j) { + if (dev->caps.reserved_qps_cnt[sort[j]] > + dev->caps.reserved_qps_cnt[sort[j - 1]]) { + tmp = sort[j]; + sort[j] = sort[j - 1]; + sort[j - 1] = tmp; + } + } + } + + for (i = 1; i < MLX4_QP_REGION_COUNT; ++i) { + last_base -= dev->caps.reserved_qps_cnt[sort[i]]; + dev->caps.reserved_qps_base[sort[i]] = last_base; + reserved_from_top += + dev->caps.reserved_qps_cnt[sort[i]]; + } + + } + err = mlx4_bitmap_init(&qp_table->bitmap, dev->caps.num_qps, - (1 << 24) - 1, dev->caps.sqp_start + 8); + (1 << 23) - 1, dev->caps.sqp_start + 8, + reserved_from_top); if (err) return err; @@ -297,6 +329,20 @@ void mlx4_cleanup_qp_table(struct mlx4_dev *dev) mlx4_bitmap_cleanup(&mlx4_priv(dev)->qp_table.bitmap); } +int mlx4_qp_get_region(struct 
mlx4_dev *dev, + enum qp_region region, + int *base_qpn, int *cnt) +{ + if ((region < 0) || (region >= MLX4_QP_REGION_COUNT)) + return -EINVAL; + + *base_qpn = dev->caps.reserved_qps_base[region]; + *cnt = dev->caps.reserved_qps_cnt[region]; + + return 0; +} +EXPORT_SYMBOL_GPL(mlx4_qp_get_region); + int mlx4_qp_query(struct mlx4_dev *dev, struct mlx4_qp *qp, struct mlx4_qp_context *context) { diff --git a/drivers/net/mlx4/srq.c b/drivers/net/mlx4/srq.c index 533eb6d..fe9f218 100644 --- a/drivers/net/mlx4/srq.c +++ b/drivers/net/mlx4/srq.c @@ -245,7 +245,7 @@ int mlx4_init_srq_table(struct mlx4_dev *dev) INIT_RADIX_TREE(&srq_table->tree, GFP_ATOMIC); err = mlx4_bitmap_init(&srq_table->bitmap, dev->caps.num_srqs, - dev->caps.num_srqs - 1, dev->caps.reserved_srqs); + dev->caps.num_srqs - 1, dev->caps.reserved_srqs, 0); if (err) return err; diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h index 1483c09..0a43891 100644 --- a/include/linux/mlx4/device.h +++ b/include/linux/mlx4/device.h @@ -141,6 +141,18 @@ enum { MLX4_STAT_RATE_OFFSET = 5 }; +enum qp_region { + MLX4_QP_REGION_FW = 0, + MLX4_QP_REGION_ETH_ADDR, + MLX4_QP_REGION_FC_ADDR, + MLX4_QP_REGION_FC_EXCH, + MLX4_QP_REGION_COUNT +}; + +enum { + MLX4_NUM_FEXCH = 64 * 1024, +}; + static inline u64 mlx4_fw_ver(u64 major, u64 minor, u64 subminor) { return (major << 32) | (minor << 16) | subminor; @@ -165,7 +177,6 @@ struct mlx4_caps { int max_rq_desc_sz; int max_qp_init_rdma; int max_qp_dest_rdma; - int reserved_qps; int sqp_start; int num_srqs; int max_srq_wqes; @@ -197,6 +208,12 @@ struct mlx4_caps { u16 stat_rate_support; u8 port_width_cap[MLX4_MAX_PORTS + 1]; int max_gso_sz; + int reserved_qps_cnt[MLX4_QP_REGION_COUNT]; + int reserved_qps; + int reserved_qps_base[MLX4_QP_REGION_COUNT]; + int log_num_macs; + int log_num_vlans; + int log_num_prios; }; struct mlx4_buf_list { diff --git a/include/linux/mlx4/qp.h b/include/linux/mlx4/qp.h index bf8f119..03802fc 100644 --- a/include/linux/mlx4/qp.h 
+++ b/include/linux/mlx4/qp.h @@ -317,4 +317,8 @@ static inline struct mlx4_qp *__mlx4_qp_lookup(struct mlx4_dev *dev, u32 qpn) void mlx4_qp_remove(struct mlx4_dev *dev, struct mlx4_qp *qp); +int mlx4_qp_get_region(struct mlx4_dev *dev, + enum qp_region region, + int *base_qpn, int *cnt); + #endif /* MLX4_QP_H */ -- 1.5.4 From yevgenyp at mellanox.co.il Wed Aug 20 06:17:37 2008 From: yevgenyp at mellanox.co.il (Yevgeny Petrilin) Date: Wed, 20 Aug 2008 16:17:37 +0300 Subject: [ofa-general][PATCH 03/11 v3] mlx4: Different port type support Message-ID: <48AC1971.2000803@mellanox.co.il> From: Yevgeny Petrilin Date: Wed, 20 Aug 2008 10:03:23 +0300 Subject: [PATCH] mlx4: Different port type support Multiprotocol supports different port types. The port types are requested through module parameters and checked against firmware capabilities. Each consumer of mlx4_core should query for supported port types; mlx4_ib can no longer assume that all physical ports belong to it. Signed-off-by: Yevgeny Petrilin --- drivers/infiniband/hw/mlx4/mad.c | 6 +- drivers/infiniband/hw/mlx4/main.c | 12 ++++- drivers/infiniband/hw/mlx4/mlx4_ib.h | 2 + drivers/net/mlx4/fw.c | 4 ++ drivers/net/mlx4/fw.h | 1 + drivers/net/mlx4/main.c | 82 ++++++++++++++++++++++++++++++++++ include/linux/mlx4/device.h | 32 +++++++++++++ 7 files changed, 134 insertions(+), 5 deletions(-) diff --git a/drivers/infiniband/hw/mlx4/mad.c b/drivers/infiniband/hw/mlx4/mad.c index cdca3a5..606f1e2 100644 --- a/drivers/infiniband/hw/mlx4/mad.c +++ b/drivers/infiniband/hw/mlx4/mad.c @@ -298,7 +298,7 @@ int mlx4_ib_mad_init(struct mlx4_ib_dev *dev) int p, q; int ret; - for (p = 0; p < dev->dev->caps.num_ports; ++p) + for (p = 0; p < dev->num_ports; ++p) for (q = 0; q <= 1; ++q) { agent = ib_register_mad_agent(&dev->ib_dev, p + 1, q ? 
IB_QPT_GSI : IB_QPT_SMI, @@ -314,7 +314,7 @@ int mlx4_ib_mad_init(struct mlx4_ib_dev *dev) return 0; err: - for (p = 0; p < dev->dev->caps.num_ports; ++p) + for (p = 0; p < dev->num_ports; ++p) for (q = 0; q <= 1; ++q) if (dev->send_agent[p][q]) ib_unregister_mad_agent(dev->send_agent[p][q]); @@ -327,7 +327,7 @@ void mlx4_ib_mad_cleanup(struct mlx4_ib_dev *dev) struct ib_mad_agent *agent; int p, q; - for (p = 0; p < dev->dev->caps.num_ports; ++p) { + for (p = 0; p < dev->num_ports; ++p) { for (q = 0; q <= 1; ++q) { agent = dev->send_agent[p][q]; dev->send_agent[p][q] = NULL; diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c index a3c2851..ff60f77 100644 --- a/drivers/infiniband/hw/mlx4/main.c +++ b/drivers/infiniband/hw/mlx4/main.c @@ -569,12 +569,16 @@ static void *mlx4_ib_add(struct mlx4_dev *dev) MLX4_INIT_DOORBELL_LOCK(&ibdev->uar_lock); ibdev->dev = dev; + ibdev->ports_map = mlx4_get_ports_of_type(dev, MLX4_PORT_TYPE_IB); strlcpy(ibdev->ib_dev.name, "mlx4_%d", IB_DEVICE_NAME_MAX); ibdev->ib_dev.owner = THIS_MODULE; ibdev->ib_dev.node_type = RDMA_NODE_IB_CA; ibdev->ib_dev.local_dma_lkey = dev->caps.reserved_lkey; - ibdev->ib_dev.phys_port_cnt = dev->caps.num_ports; + ibdev->num_ports = 0; + mlx4_foreach_port(i, ibdev->ports_map) + ibdev->num_ports++; + ibdev->ib_dev.phys_port_cnt = ibdev->num_ports; ibdev->ib_dev.num_comp_vectors = 1; ibdev->ib_dev.dma_device = &dev->pdev->dev; @@ -691,7 +695,7 @@ static void mlx4_ib_remove(struct mlx4_dev *dev, void *ibdev_ptr) struct mlx4_ib_dev *ibdev = ibdev_ptr; int p; - for (p = 1; p <= dev->caps.num_ports; ++p) + for (p = 1; p <= ibdev->num_ports; ++p) mlx4_CLOSE_PORT(dev, p); mlx4_ib_mad_cleanup(ibdev); @@ -706,6 +710,10 @@ static void mlx4_ib_event(struct mlx4_dev *dev, void *ibdev_ptr, enum mlx4_dev_event event, int port) { struct ib_event ibev; + struct mlx4_ib_dev *ibdev = to_mdev((struct ib_device *) ibdev_ptr); + + if (port > ibdev->num_ports) + return; switch (event) { case 
MLX4_DEV_EVENT_PORT_UP: diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h index 6e2b0dc..2b7a8d4 100644 --- a/drivers/infiniband/hw/mlx4/mlx4_ib.h +++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h @@ -162,6 +162,8 @@ struct mlx4_ib_ah { struct mlx4_ib_dev { struct ib_device ib_dev; struct mlx4_dev *dev; + u32 ports_map; + int num_ports; void __iomem *uar_map; struct mlx4_uar priv_uar; diff --git a/drivers/net/mlx4/fw.c b/drivers/net/mlx4/fw.c index 40d8142..5136953 100644 --- a/drivers/net/mlx4/fw.c +++ b/drivers/net/mlx4/fw.c @@ -354,6 +354,7 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) dev_cap->max_pkeys[i] = 1 << (field & 0xf); } } else { +#define QUERY_PORT_SUPPORTED_TYPE_OFFSET 0x00 #define QUERY_PORT_MTU_OFFSET 0x01 #define QUERY_PORT_WIDTH_OFFSET 0x06 #define QUERY_PORT_MAX_GID_PKEY_OFFSET 0x07 @@ -366,6 +367,9 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) if (err) goto out; + MLX4_GET(field, outbox, + QUERY_PORT_SUPPORTED_TYPE_OFFSET); + dev_cap->supported_port_types[i] = field & 3; MLX4_GET(field, outbox, QUERY_PORT_MTU_OFFSET); dev_cap->max_mtu[i] = field & 0xf; MLX4_GET(field, outbox, QUERY_PORT_WIDTH_OFFSET); diff --git a/drivers/net/mlx4/fw.h b/drivers/net/mlx4/fw.h index c34e726..acf6375 100644 --- a/drivers/net/mlx4/fw.h +++ b/drivers/net/mlx4/fw.h @@ -102,6 +102,7 @@ struct mlx4_dev_cap { u32 reserved_lkey; u64 max_icm_sz; int max_gso_sz; + u8 supported_port_types[MLX4_MAX_PORTS + 1]; u8 log_max_macs[MLX4_MAX_PORTS + 1]; u8 log_max_vlans[MLX4_MAX_PORTS + 1]; }; diff --git a/drivers/net/mlx4/main.c b/drivers/net/mlx4/main.c index f172bb3..ba327ee 100644 --- a/drivers/net/mlx4/main.c +++ b/drivers/net/mlx4/main.c @@ -98,11 +98,48 @@ module_param_named(use_prio, use_prio, bool, 0444); MODULE_PARM_DESC(use_prio, "Enable steering by VLAN priority on ETH ports " "(0/1, default 0)"); +static char *port_type_arr[MLX4_MAX_PORTS] = {[0 ... 
(MLX4_MAX_PORTS-1)] = "ib"}; +module_param_array_named(port_type, port_type_arr, charp, NULL, 0444); +MODULE_PARM_DESC(port_type, "Ports L2 type (ib/eth/auto, entry per port, " + "comma seperated, default ib for all)"); + +static int mlx4_check_port_params(struct mlx4_dev *dev, + enum mlx4_port_type *port_type) +{ + if (port_type[0] != port_type[1] && + !(dev->caps.flags & MLX4_DEV_CAP_FLAG_DPDP)) { + mlx4_err(dev, "Only same port types supported " + "on this HCA, aborting.\n"); + return -EINVAL; + } + if ((port_type[0] == MLX4_PORT_TYPE_ETH) && + (port_type[1] == MLX4_PORT_TYPE_IB)) { + mlx4_err(dev, "eth-ib configuration is not supported.\n"); + return -EINVAL; + } + return 0; +} + +static void mlx4_str2port_type(char **port_str, + enum mlx4_port_type *port_type) +{ + int i; + + for (i = 0; i < MLX4_MAX_PORTS; i++) { + if (!strcmp(port_str[i], "eth")) + port_type[i] = MLX4_PORT_TYPE_ETH; + else + port_type[i] = MLX4_PORT_TYPE_IB; + } +} static int mlx4_dev_cap(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) { int err; int i; + enum mlx4_port_type port_type[MLX4_MAX_PORTS]; + + mlx4_str2port_type(port_type_arr, port_type); err = mlx4_QUERY_DEV_CAP(dev, dev_cap); if (err) { @@ -180,7 +217,24 @@ static int mlx4_dev_cap(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) dev->caps.log_num_vlans = log_num_vlan; dev->caps.log_num_prios = use_prio ? 
3 : 0; + err = mlx4_check_port_params(dev, port_type); + if (err) + return err; + for (i = 1; i <= dev->caps.num_ports; ++i) { + if (!dev_cap->supported_port_types[i]) { + mlx4_warn(dev, "FW doesn't support Multi Protocol, " + "loading IB only\n"); + dev->caps.port_type[i] = MLX4_PORT_TYPE_IB; + continue; + } + if (port_type[i-1] & dev_cap->supported_port_types[i]) + dev->caps.port_type[i] = port_type[i-1]; + else { + mlx4_err(dev, "Requested port type for port %d " + "not supported by HW\n", i); + return -ENODEV; + } if (dev->caps.log_num_macs > dev_cap->log_max_macs[i]) { dev->caps.log_num_macs = dev_cap->log_max_macs[i]; mlx4_warn(dev, "Requested number of MACs is too much " @@ -1011,10 +1065,38 @@ static struct pci_driver mlx4_driver = { .remove = __devexit_p(mlx4_remove_one) }; +static int __init mlx4_verify_params(void) +{ + int i; + + for (i = 0; i < MLX4_MAX_PORTS; ++i) { + if (strcmp(port_type_arr[i], "eth") && + strcmp(port_type_arr[i], "ib")) { + printk(KERN_WARNING "mlx4_core: bad port_type for " + "port %d: %s\n", i, port_type_arr[i]); + return -1; + } + } + if ((log_num_mac < 0) || (log_num_mac > 7)) { + printk(KERN_WARNING "mlx4_core: bad num_mac: %d\n", log_num_mac); + return -1; + } + + if ((log_num_vlan < 0) || (log_num_vlan > 7)) { + printk(KERN_WARNING "mlx4_core: bad num_vlan: %d\n", log_num_vlan); + return -1; + } + + return 0; +} + static int __init mlx4_init(void) { int ret; + if (mlx4_verify_params()) + return -EINVAL; + ret = mlx4_catas_init(); if (ret) return ret; diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h index 0a43891..758a50e 100644 --- a/include/linux/mlx4/device.h +++ b/include/linux/mlx4/device.h @@ -60,6 +60,7 @@ enum { MLX4_DEV_CAP_FLAG_IPOIB_CSUM = 1 << 7, MLX4_DEV_CAP_FLAG_BAD_PKEY_CNTR = 1 << 8, MLX4_DEV_CAP_FLAG_BAD_QKEY_CNTR = 1 << 9, + MLX4_DEV_CAP_FLAG_DPDP = 1 << 12, MLX4_DEV_CAP_FLAG_MEM_WINDOW = 1 << 16, MLX4_DEV_CAP_FLAG_APM = 1 << 17, MLX4_DEV_CAP_FLAG_ATOMIC = 1 << 18, @@ -149,6 +150,11 @@ 
enum qp_region { MLX4_QP_REGION_COUNT }; +enum mlx4_port_type { + MLX4_PORT_TYPE_IB = 1 << 0, + MLX4_PORT_TYPE_ETH = 1 << 1, +}; + enum { MLX4_NUM_FEXCH = 64 * 1024, }; @@ -214,6 +220,7 @@ struct mlx4_caps { int log_num_macs; int log_num_vlans; int log_num_prios; + enum mlx4_port_type port_type[MLX4_MAX_PORTS + 1]; }; struct mlx4_buf_list { @@ -368,6 +375,31 @@ struct mlx4_init_port_param { u64 si_guid; }; +static inline void mlx4_query_steer_cap(struct mlx4_dev *dev, int *log_mac, + int *log_vlan, int *log_prio) +{ + *log_mac = dev->caps.log_num_macs; + *log_vlan = dev->caps.log_num_vlans; + *log_prio = dev->caps.log_num_prios; +} + +static inline u32 mlx4_get_ports_of_type(struct mlx4_dev *dev, + enum mlx4_port_type ptype) +{ + u32 ret = 0; + int i; + + for (i = 1; i <= dev->caps.num_ports; ++i) { + if (dev->caps.port_type[i] == ptype) + ret |= 1 << (i-1); + } + return ret; +} + +#define mlx4_foreach_port(port, bitmap) \ + for ((port) = 1; (port) <= MLX4_MAX_PORTS; (port)++) \ + if (bitmap & 1 << ((port)-1)) + int mlx4_buf_alloc(struct mlx4_dev *dev, int size, int max_direct, struct mlx4_buf *buf); void mlx4_buf_free(struct mlx4_dev *dev, int size, struct mlx4_buf *buf); -- 1.5.4 From yevgenyp at mellanox.co.il Wed Aug 20 06:19:10 2008 From: yevgenyp at mellanox.co.il (Yevgeny Petrilin) Date: Wed, 20 Aug 2008 16:19:10 +0300 Subject: [ofa-general][PATCH 04/11 v3] mlx4: Port Ethernet mtu capabilities handle Message-ID: <48AC19CE.9080901@mellanox.co.il> From: Yevgeny Petrilin Date: Wed, 20 Aug 2008 10:07:30 +0300 Subject: [PATCH] mlx4: Port Ethernet mtu capabilities handle Ethernet max mtu and default Mac address are revealed through QUERY_DEV_CAP command. 
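As an illustration of the statement above, the following standalone sketch decodes two big-endian fields from a command output buffer. It is not code from the patch: `read_be`, `struct port_eth_caps` and `parse_port_eth_caps` are hypothetical names, and the byte offsets 2 and 16 are inferred from the patch's indexing of the mailbox as `((u16 *) outbox)[1]` and `((u64 *) outbox)[2]`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Read an nbytes-wide big-endian unsigned integer starting at
 * buf[offset].  Works on any host byte order and needs no alignment.
 */
static uint64_t read_be(const uint8_t *buf, size_t offset, size_t nbytes)
{
	uint64_t value = 0;
	size_t i;

	assert(nbytes <= 8);
	for (i = 0; i < nbytes; ++i)
		value = (value << 8) | buf[offset + i];

	return value;
}

/* Hypothetical container for the two per-port Ethernet fields. */
struct port_eth_caps {
	uint16_t eth_mtu;
	uint64_t def_mac;
};

/*
 * Assumed layout, inferred from the patch's array indexing:
 *   bytes  2..3   Ethernet MTU        (16-bit big-endian)
 *   bytes 16..23  default MAC address (64-bit big-endian)
 */
void parse_port_eth_caps(const uint8_t *outbox, struct port_eth_caps *caps)
{
	caps->eth_mtu = (uint16_t)read_be(outbox, 2, 2);
	caps->def_mac = read_be(outbox, 16, 8);
}
```

Reading byte by byte avoids the pointer casts used in the kernel code, at the cost of a short loop; inside the kernel the be16_to_cpu() and be64_to_cpu() helpers do the same job.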
Signed-off-by: Yevgeny Petrilin --- drivers/net/mlx4/fw.c | 11 ++++++----- drivers/net/mlx4/fw.h | 4 +++- drivers/net/mlx4/main.c | 4 +++- include/linux/mlx4/device.h | 4 +++- 4 files changed, 15 insertions(+), 8 deletions(-) diff --git a/drivers/net/mlx4/fw.c b/drivers/net/mlx4/fw.c index 5136953..6643cfb 100644 --- a/drivers/net/mlx4/fw.c +++ b/drivers/net/mlx4/fw.c @@ -346,7 +346,7 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) MLX4_GET(field, outbox, QUERY_DEV_CAP_VL_PORT_OFFSET); dev_cap->max_vl[i] = field >> 4; MLX4_GET(field, outbox, QUERY_DEV_CAP_MTU_WIDTH_OFFSET); - dev_cap->max_mtu[i] = field >> 4; + dev_cap->ib_mtu[i] = field >> 4; dev_cap->max_port_width[i] = field & 0xf; MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_GID_OFFSET); dev_cap->max_gids[i] = 1 << (field & 0xf); @@ -371,7 +371,7 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) QUERY_PORT_SUPPORTED_TYPE_OFFSET); dev_cap->supported_port_types[i] = field & 3; MLX4_GET(field, outbox, QUERY_PORT_MTU_OFFSET); - dev_cap->max_mtu[i] = field & 0xf; + dev_cap->ib_mtu[i] = field & 0xf; MLX4_GET(field, outbox, QUERY_PORT_WIDTH_OFFSET); dev_cap->max_port_width[i] = field & 0xf; MLX4_GET(field, outbox, QUERY_PORT_MAX_GID_PKEY_OFFSET); @@ -382,7 +382,8 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) MLX4_GET(field, outbox, QUERY_PORT_MAX_MACVLAN_OFFSET); dev_cap->log_max_macs[i] = field & 0xf; dev_cap->log_max_vlans[i] = field >> 4; - + dev_cap->eth_mtu[i] = be16_to_cpu(((u16 *) outbox)[1]); + dev_cap->def_mac[i] = be64_to_cpu(((u64 *) outbox)[2]); } } @@ -416,7 +417,7 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) mlx4_dbg(dev, "Max CQEs: %d, max WQEs: %d, max SRQ WQEs: %d\n", dev_cap->max_cq_sz, dev_cap->max_qp_sz, dev_cap->max_srq_sz); mlx4_dbg(dev, "Local CA ACK delay: %d, max MTU: %d, port width cap: %d\n", - dev_cap->local_ca_ack_delay, 128 << dev_cap->max_mtu[1], + 
dev_cap->local_ca_ack_delay, 128 << dev_cap->ib_mtu[1], dev_cap->max_port_width[1]); mlx4_dbg(dev, "Max SQ desc size: %d, max SQ S/G: %d\n", dev_cap->max_sq_desc_sz, dev_cap->max_sq_sg); @@ -828,7 +829,7 @@ int mlx4_INIT_PORT(struct mlx4_dev *dev, int port) flags |= (dev->caps.port_width_cap[port] & 0xf) << INIT_PORT_PORT_WIDTH_SHIFT; MLX4_PUT(inbox, flags, INIT_PORT_FLAGS_OFFSET); - field = 128 << dev->caps.mtu_cap[port]; + field = 128 << dev->caps.ib_mtu_cap[port]; MLX4_PUT(inbox, field, INIT_PORT_MTU_OFFSET); field = dev->caps.gid_table_len[port]; MLX4_PUT(inbox, field, INIT_PORT_MAX_GID_OFFSET); diff --git a/drivers/net/mlx4/fw.h b/drivers/net/mlx4/fw.h index acf6375..5ca3ad8 100644 --- a/drivers/net/mlx4/fw.h +++ b/drivers/net/mlx4/fw.h @@ -66,11 +66,13 @@ struct mlx4_dev_cap { int local_ca_ack_delay; int num_ports; u32 max_msg_sz; - int max_mtu[MLX4_MAX_PORTS + 1]; + int ib_mtu[MLX4_MAX_PORTS + 1]; int max_port_width[MLX4_MAX_PORTS + 1]; int max_vl[MLX4_MAX_PORTS + 1]; int max_gids[MLX4_MAX_PORTS + 1]; int max_pkeys[MLX4_MAX_PORTS + 1]; + u64 def_mac[MLX4_MAX_PORTS + 1]; + int eth_mtu[MLX4_MAX_PORTS + 1]; u16 stat_rate_support; u32 flags; int reserved_uars; diff --git a/drivers/net/mlx4/main.c b/drivers/net/mlx4/main.c index ba327ee..abaa7b9 100644 --- a/drivers/net/mlx4/main.c +++ b/drivers/net/mlx4/main.c @@ -171,10 +171,12 @@ static int mlx4_dev_cap(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) dev->caps.num_ports = dev_cap->num_ports; for (i = 1; i <= dev->caps.num_ports; ++i) { dev->caps.vl_cap[i] = dev_cap->max_vl[i]; - dev->caps.mtu_cap[i] = dev_cap->max_mtu[i]; + dev->caps.ib_mtu_cap[i] = dev_cap->ib_mtu[i]; dev->caps.gid_table_len[i] = dev_cap->max_gids[i]; dev->caps.pkey_table_len[i] = dev_cap->max_pkeys[i]; dev->caps.port_width_cap[i] = dev_cap->max_port_width[i]; + dev->caps.eth_mtu_cap[i] = dev_cap->eth_mtu[i]; + dev->caps.def_mac[i] = dev_cap->def_mac[i]; } dev->caps.num_uars = dev_cap->uar_size / PAGE_SIZE; diff --git 
a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h index 758a50e..5be49e9 100644 --- a/include/linux/mlx4/device.h +++ b/include/linux/mlx4/device.h @@ -168,7 +168,9 @@ struct mlx4_caps { u64 fw_ver; int num_ports; int vl_cap[MLX4_MAX_PORTS + 1]; - int mtu_cap[MLX4_MAX_PORTS + 1]; + int ib_mtu_cap[MLX4_MAX_PORTS + 1]; + u64 def_mac[MLX4_MAX_PORTS + 1]; + int eth_mtu_cap[MLX4_MAX_PORTS + 1]; int gid_table_len[MLX4_MAX_PORTS + 1]; int pkey_table_len[MLX4_MAX_PORTS + 1]; int local_ca_ack_delay; -- 1.5.4 From yevgenyp at mellanox.co.il Wed Aug 20 06:19:59 2008 From: yevgenyp at mellanox.co.il (Yevgeny Petrilin) Date: Wed, 20 Aug 2008 16:19:59 +0300 Subject: [ofa-general][PATCH 05/11 v3] mlx4: Mac Vlan Management Message-ID: <48AC19FF.4040404@mellanox.co.il> From: Yevgeny Petrilin Date: Wed, 20 Aug 2008 10:10:25 +0300 Subject: [PATCH] mlx4: Mac Vlan Management mlx4_core is now responsible for managing MAC and VLAN filters for each port. It also notifies the FW which port type will be loaded, using the SET_PORT command. Signed-off-by: Yevgeny Petrilin Diff from previous version: - port_info structures are indexed from 1 - MAC and VLAN tables are initialized through port_info initialization Signed-off-by: Yevgeny Petrilin --- drivers/net/mlx4/Makefile | 2 +- drivers/net/mlx4/main.c | 27 +++++ drivers/net/mlx4/mlx4.h | 37 ++++++ drivers/net/mlx4/port.c | 273 +++++++++++++++++++++++++++++++++++++++++++ include/linux/mlx4/cmd.h | 9 ++ include/linux/mlx4/device.h | 6 + 6 files changed, 353 insertions(+), 1 deletions(-) create mode 100644 drivers/net/mlx4/port.c diff --git a/drivers/net/mlx4/Makefile b/drivers/net/mlx4/Makefile index 0952a65..f4932d8 100644 --- a/drivers/net/mlx4/Makefile +++ b/drivers/net/mlx4/Makefile @@ -1,4 +1,4 @@ obj-$(CONFIG_MLX4_CORE) += mlx4_core.o mlx4_core-y := alloc.o catas.o cmd.o cq.o eq.o fw.o icm.o intf.o main.o mcg.o \ - mr.o pd.o profile.o qp.o reset.o srq.o + mr.o pd.o profile.o qp.o reset.o srq.o port.o diff --git 
a/drivers/net/mlx4/main.c b/drivers/net/mlx4/main.c index abaa7b9..f5f4560 100644 --- a/drivers/net/mlx4/main.c +++ b/drivers/net/mlx4/main.c @@ -672,6 +672,7 @@ static int mlx4_setup_hca(struct mlx4_dev *dev) { struct mlx4_priv *priv = mlx4_priv(dev); int err; + int port; err = mlx4_init_uar_table(dev); if (err) { @@ -770,8 +771,20 @@ static int mlx4_setup_hca(struct mlx4_dev *dev) goto err_qp_table_free; } + for (port = 1; port <= dev->caps.num_ports; port++) { + err = mlx4_SET_PORT(dev, port); + if (err) { + mlx4_err(dev, "Failed to set port %d, aborting\n", + port); + goto err_mcg_table_free; + } + } + return 0; +err_mcg_table_free: + mlx4_cleanup_mcg_table(dev); + err_qp_table_free: mlx4_cleanup_qp_table(dev); @@ -835,11 +848,22 @@ no_msi: priv->eq_table.eq[i].irq = dev->pdev->irq; } +static void mlx4_init_port_info(struct mlx4_dev *dev, int port) +{ + struct mlx4_port_info *info = &mlx4_priv(dev)->port[port]; + + info->dev = dev; + info->port = port; + mlx4_init_mac_table(dev, &info->mac_table); + mlx4_init_vlan_table(dev, &info->vlan_table); +} + static int __mlx4_init_one(struct pci_dev *pdev, const struct pci_device_id *id) { struct mlx4_priv *priv; struct mlx4_dev *dev; int err; + int port; printk(KERN_INFO PFX "Initializing %s\n", pci_name(pdev)); @@ -949,6 +973,9 @@ static int __mlx4_init_one(struct pci_dev *pdev, const struct pci_device_id *id) if (err) goto err_close; + for (port = 1; port <= dev->caps.num_ports; port++) + mlx4_init_port_info(dev, port); + err = mlx4_register_device(dev); if (err) goto err_cleanup; diff --git a/drivers/net/mlx4/mlx4.h b/drivers/net/mlx4/mlx4.h index 9e2f44c..0b94823 100644 --- a/drivers/net/mlx4/mlx4.h +++ b/drivers/net/mlx4/mlx4.h @@ -252,6 +252,37 @@ struct mlx4_catas_err { struct list_head list; }; +struct mlx4_mac_table { +#define MLX4_MAX_MAC_NUM 128 +#define MLX4_MAC_MASK 0xffffffffffff +#define MLX4_MAC_VALID_SHIFT 63 +#define MLX4_MAC_TABLE_SIZE (MLX4_MAX_MAC_NUM << 3) + __be64 entries[MLX4_MAX_MAC_NUM]; + int 
refs[MLX4_MAX_MAC_NUM]; + struct semaphore mac_sem; + int total; + int max; +}; + +struct mlx4_vlan_table { +#define MLX4_MAX_VLAN_NUM 126 +#define MLX4_VLAN_MASK 0xfff +#define MLX4_VLAN_VALID (1 << 31) +#define MLX4_VLAN_TABLE_SIZE (MLX4_MAX_VLAN_NUM << 2) + __be32 entries[MLX4_MAX_VLAN_NUM]; + int refs[MLX4_MAX_VLAN_NUM]; + struct semaphore vlan_sem; + int total; + int max; +}; + +struct mlx4_port_info { + struct mlx4_dev *dev; + int port; + struct mlx4_mac_table mac_table; + struct mlx4_vlan_table vlan_table; +}; + struct mlx4_priv { struct mlx4_dev dev; @@ -280,6 +311,7 @@ struct mlx4_priv { struct mlx4_uar driver_uar; void __iomem *kar; + struct mlx4_port_info port[MLX4_MAX_PORTS + 1]; }; static inline struct mlx4_priv *mlx4_priv(struct mlx4_dev *dev) @@ -350,4 +382,9 @@ void mlx4_srq_event(struct mlx4_dev *dev, u32 srqn, int event_type); void mlx4_handle_catas_err(struct mlx4_dev *dev); +void mlx4_init_mac_table(struct mlx4_dev *dev, struct mlx4_mac_table *table); +void mlx4_init_vlan_table(struct mlx4_dev *dev, struct mlx4_vlan_table *table); + +int mlx4_SET_PORT(struct mlx4_dev *dev, u8 port); + #endif /* MLX4_H */ diff --git a/drivers/net/mlx4/port.c b/drivers/net/mlx4/port.c new file mode 100644 index 0000000..321d024 --- /dev/null +++ b/drivers/net/mlx4/port.c @@ -0,0 +1,273 @@ +/* + * Copyright (c) 2007 Mellanox Technologies. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. 
+ * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + */ + +#include +#include + +#include + +#include "mlx4.h" + +void mlx4_init_mac_table(struct mlx4_dev *dev, struct mlx4_mac_table *table) +{ + int i; + + sema_init(&table->mac_sem, 1); + for (i = 0; i < MLX4_MAX_MAC_NUM; i++) { + table->entries[i] = 0; + table->refs[i] = 0; + } + table->max = 1 << dev->caps.log_num_macs; + table->total = 0; +} + +void mlx4_init_vlan_table(struct mlx4_dev *dev, struct mlx4_vlan_table *table) +{ + int i; + + sema_init(&table->vlan_sem, 1); + for (i = 0; i < MLX4_MAX_VLAN_NUM; i++) { + table->entries[i] = 0; + table->refs[i] = 0; + } + table->max = 1 << dev->caps.log_num_vlans; + table->total = 0; +} + +static int mlx4_SET_PORT_mac_table(struct mlx4_dev *dev, u8 port, + __be64 *entries) +{ + struct mlx4_cmd_mailbox *mailbox; + u32 in_mod; + int err; + + mailbox = mlx4_alloc_cmd_mailbox(dev); + if (IS_ERR(mailbox)) + return PTR_ERR(mailbox); + + memcpy(mailbox->buf, entries, MLX4_MAC_TABLE_SIZE); + + in_mod = MLX4_SET_PORT_MAC_TABLE << 8 | port; + err = mlx4_cmd(dev, mailbox->dma, in_mod, 1, MLX4_CMD_SET_PORT, + MLX4_CMD_TIME_CLASS_B); + + mlx4_free_cmd_mailbox(dev, mailbox); + return err; +} + +int mlx4_register_mac(struct mlx4_dev *dev, u8 port, u64 mac, int *index) +{ + struct mlx4_mac_table *table = 
&mlx4_priv(dev)->port[port].mac_table; + int i, err = 0; + int free = -1; + u64 valid = 1; + + mlx4_dbg(dev, "Registering mac : 0x%llx\n", mac); + down(&table->mac_sem); + for (i = 0; i < MLX4_MAX_MAC_NUM - 1; i++) { + if (free < 0 && !table->refs[i]) { + free = i; + continue; + } + + if (mac == (MLX4_MAC_MASK & be64_to_cpu(table->entries[i]))) { + /* Mac already registered, increase reference count */ + *index = i; + ++table->refs[i]; + goto out; + } + } + mlx4_dbg(dev, "Free mac index is %d\n", free); + + if (table->total == table->max) { + /* No free mac entries */ + err = -ENOSPC; + goto out; + } + + /* Register new MAC */ + table->refs[free] = 1; + table->entries[free] = cpu_to_be64(mac | valid << MLX4_MAC_VALID_SHIFT); + + err = mlx4_SET_PORT_mac_table(dev, port, table->entries); + if (unlikely(err)) { + mlx4_err(dev, "Failed adding mac: 0x%llx\n", mac); + table->refs[free] = 0; + table->entries[free] = 0; + goto out; + } + + *index = free; + ++table->total; +out: + up(&table->mac_sem); + return err; +} +EXPORT_SYMBOL_GPL(mlx4_register_mac); + +void mlx4_unregister_mac(struct mlx4_dev *dev, u8 port, int index) +{ + struct mlx4_mac_table *table = &mlx4_priv(dev)->port[port].mac_table; + + down(&table->mac_sem); + if (!table->refs[index]) { + mlx4_warn(dev, "No mac entry for index %d\n", index); + goto out; + } + if (--table->refs[index]) { + mlx4_warn(dev, "Have more references for index %d, " + "no need to modify mac table\n", index); + goto out; + } + table->entries[index] = 0; + mlx4_SET_PORT_mac_table(dev, port, table->entries); + --table->total; +out: + up(&table->mac_sem); +} +EXPORT_SYMBOL_GPL(mlx4_unregister_mac); + +static int mlx4_SET_PORT_vlan_table(struct mlx4_dev *dev, u8 port, + __be32 *entries) +{ + struct mlx4_cmd_mailbox *mailbox; + u32 in_mod; + int err; + + mailbox = mlx4_alloc_cmd_mailbox(dev); + if (IS_ERR(mailbox)) + return PTR_ERR(mailbox); + + memcpy(mailbox->buf, entries, MLX4_VLAN_TABLE_SIZE); + in_mod = MLX4_SET_PORT_VLAN_TABLE << 8 | 
port; + err = mlx4_cmd(dev, mailbox->dma, in_mod, 1, MLX4_CMD_SET_PORT, + MLX4_CMD_TIME_CLASS_B); + + mlx4_free_cmd_mailbox(dev, mailbox); + + return err; +} + +int mlx4_register_vlan(struct mlx4_dev *dev, u8 port, u16 vlan, int *index) +{ + struct mlx4_vlan_table *table = &mlx4_priv(dev)->port[port].vlan_table; + int i, err = 0; + int free = -1; + + down(&table->vlan_sem); + for (i = 0; i < MLX4_MAX_VLAN_NUM; i++) { + if (free < 0 && (table->refs[i] == 0)) { + free = i; + continue; + } + + if (table->refs[i] && + (vlan == (MLX4_VLAN_MASK & + be32_to_cpu(table->entries[i])))) { + /* Vlan already registered, increase reference count */ + *index = i; + ++table->refs[i]; + goto out; + } + } + + if (table->total == table->max) { + /* No free vlan entries */ + err = -ENOSPC; + goto out; + } + + /* Register new VLAN */ + table->refs[free] = 1; + table->entries[free] = cpu_to_be32(vlan | MLX4_VLAN_VALID); + + err = mlx4_SET_PORT_vlan_table(dev, port, table->entries); + if (unlikely(err)) { + mlx4_warn(dev, "Failed adding vlan: %u\n", vlan); + table->refs[free] = 0; + table->entries[free] = 0; + goto out; + } + + *index = free; + ++table->total; +out: + up(&table->vlan_sem); + return err; +} +EXPORT_SYMBOL_GPL(mlx4_register_vlan); + +void mlx4_unregister_vlan(struct mlx4_dev *dev, u8 port, int index) +{ + struct mlx4_vlan_table *table = &mlx4_priv(dev)->port[port].vlan_table; + + down(&table->vlan_sem); + if (!table->refs[index]) { + mlx4_warn(dev, "No vlan entry for index %d\n", index); + goto out; + } + if (--table->refs[index]) { + mlx4_dbg(dev, "Have more references for index %d, " + "no need to modify vlan table\n", index); + goto out; + } + table->entries[index] = 0; + mlx4_SET_PORT_vlan_table(dev, port, table->entries); + --table->total; +out: + up(&table->vlan_sem); +} +EXPORT_SYMBOL_GPL(mlx4_unregister_vlan); + +int mlx4_SET_PORT(struct mlx4_dev *dev, u8 port) +{ + struct mlx4_cmd_mailbox *mailbox; + int err; + u8 is_eth = (dev->caps.port_type[port] == 
MLX4_PORT_TYPE_ETH) ? 1 : 0; + + mailbox = mlx4_alloc_cmd_mailbox(dev); + if (IS_ERR(mailbox)) + return PTR_ERR(mailbox); + + memset(mailbox->buf, 0, 256); + if (is_eth) { + ((u8 *) mailbox->buf)[3] = 6; + ((__be16 *) mailbox->buf)[4] = cpu_to_be16(1 << 15); + ((__be16 *) mailbox->buf)[6] = cpu_to_be16(1 << 15); + } + err = mlx4_cmd(dev, mailbox->dma, port, is_eth, MLX4_CMD_SET_PORT, + MLX4_CMD_TIME_CLASS_B); + + mlx4_free_cmd_mailbox(dev, mailbox); + return err; +} diff --git a/include/linux/mlx4/cmd.h b/include/linux/mlx4/cmd.h index 77323a7..cf9c679 100644 --- a/include/linux/mlx4/cmd.h +++ b/include/linux/mlx4/cmd.h @@ -132,6 +132,15 @@ enum { MLX4_MAILBOX_SIZE = 4096 }; +enum { + /* set port opcode modifiers */ + MLX4_SET_PORT_GENERAL = 0x0, + MLX4_SET_PORT_RQP_CALC = 0x1, + MLX4_SET_PORT_MAC_TABLE = 0x2, + MLX4_SET_PORT_VLAN_TABLE = 0x3, + MLX4_SET_PORT_PRIO_MAP = 0x4, +}; + struct mlx4_dev; struct mlx4_cmd_mailbox { diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h index 5be49e9..06bac35 100644 --- a/include/linux/mlx4/device.h +++ b/include/linux/mlx4/device.h @@ -466,6 +466,12 @@ int mlx4_multicast_attach(struct mlx4_dev *dev, struct mlx4_qp *qp, u8 gid[16], int block_mcast_loopback); int mlx4_multicast_detach(struct mlx4_dev *dev, struct mlx4_qp *qp, u8 gid[16]); +int mlx4_register_mac(struct mlx4_dev *dev, u8 port, u64 mac, int *index); +void mlx4_unregister_mac(struct mlx4_dev *dev, u8 port, int index); + +int mlx4_register_vlan(struct mlx4_dev *dev, u8 port, u16 vlan, int *index); +void mlx4_unregister_vlan(struct mlx4_dev *dev, u8 port, int index); + int mlx4_map_phys_fmr(struct mlx4_dev *dev, struct mlx4_fmr *fmr, u64 *page_list, int npages, u64 iova, u32 *lkey, u32 *rkey); int mlx4_fmr_alloc(struct mlx4_dev *dev, u32 pd, u32 access, int max_pages, -- 1.5.4 From yevgenyp at mellanox.co.il Wed Aug 20 06:20:56 2008 From: yevgenyp at mellanox.co.il (Yevgeny Petrilin) Date: Wed, 20 Aug 2008 16:20:56 +0300 Subject: [ofa-general][PATCH 
06/11 v3] mlx4: Dynamic port configuration Message-ID: <48AC1A38.9000809@mellanox.co.il> From: Yevgeny Petrilin Date: Wed, 20 Aug 2008 10:13:40 +0300 Subject: [PATCH] mlx4: Dynamic port configuration Port type can be set using sysfs interface when the low level driver is up. The low level driver unregisters all its customers and then registers them again with the new port types (which they query for in add_one) Signed-off-by: Yevgeny Petrilin --- drivers/net/mlx4/main.c | 132 ++++++++++++++++++++++++++++++++++++++++++++-- drivers/net/mlx4/mlx4.h | 4 ++ 2 files changed, 130 insertions(+), 6 deletions(-) diff --git a/drivers/net/mlx4/main.c b/drivers/net/mlx4/main.c index f5f4560..65ab668 100644 --- a/drivers/net/mlx4/main.c +++ b/drivers/net/mlx4/main.c @@ -268,6 +268,92 @@ static int mlx4_dev_cap(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) return 0; } +/* Changes the port configuration of the device. + * Every user of this function must hold the port lock */ +static int mlx4_change_port_types(struct mlx4_dev *dev, + enum mlx4_port_type *port_types) +{ + int err = 0; + int change = 0; + int port; + + for (port = 0; port < MLX4_MAX_PORTS; port++) { + if (port_types[port] != dev->caps.port_type[port + 1]) { + change = 1; + dev->caps.port_type[port + 1] = port_types[port]; + } + } + if (change) { + mlx4_unregister_device(dev); + for (port = 1; port <= dev->caps.num_ports; port++) { + mlx4_CLOSE_PORT(dev, port); + err = mlx4_SET_PORT(dev, port); + if (err) { + mlx4_err(dev, "Failed to set port %d, " + "aborting\n", port); + goto out; + } + } + err = mlx4_register_device(dev); + } + +out: + return err; +} + +static ssize_t show_port_type(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct mlx4_port_info *info = container_of(attr, struct mlx4_port_info, + port_attr); + struct mlx4_dev *mdev = info->dev; + + sprintf(buf, "%s\n", (mdev->caps.port_type[info->port] == MLX4_PORT_TYPE_IB) ? 
+ "ib" : "eth"); + return strlen(buf); +} + +static ssize_t set_port_type(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + struct mlx4_port_info *info = container_of(attr, struct mlx4_port_info, + port_attr); + struct mlx4_dev *mdev = info->dev; + struct mlx4_priv *priv = mlx4_priv(mdev); + enum mlx4_port_type types[MLX4_MAX_PORTS]; + int i; + int err = 0; + + if (!strcmp(buf, "ib\n")) + info->tmp_type = MLX4_PORT_TYPE_IB; + else if (!strcmp(buf, "eth\n")) + info->tmp_type = MLX4_PORT_TYPE_ETH; + else { + mlx4_err(mdev, "%s is not supported port type\n", buf); + return -EINVAL; + } + + spin_lock(&priv->port_lock); + for (i = 0; i < mdev->caps.num_ports; i++) + types[i] = priv->port[i+1].tmp_type ? priv->port[i+1].tmp_type : + mdev->caps.port_type[i+1]; + + err = mlx4_check_port_params(mdev, types); + if (err) + goto out; + + for (i = 1; i <= mdev->caps.num_ports; i++) + priv->port[i].tmp_type = 0; + + err = mlx4_change_port_types(mdev, types); + +out: + spin_unlock(&priv->port_lock); + return err ? 
err : count; +} + static int mlx4_load_fw(struct mlx4_dev *dev) { struct mlx4_priv *priv = mlx4_priv(dev); @@ -848,14 +934,38 @@ no_msi: priv->eq_table.eq[i].irq = dev->pdev->irq; } -static void mlx4_init_port_info(struct mlx4_dev *dev, int port) +static int mlx4_init_port_info(struct mlx4_dev *dev, int port) { struct mlx4_port_info *info = &mlx4_priv(dev)->port[port]; + struct attribute attr = {.name = info->dev_name, + .mode = S_IWUGO | S_IRUGO}; + int err = 0; info->dev = dev; info->port = port; mlx4_init_mac_table(dev, &info->mac_table); mlx4_init_vlan_table(dev, &info->vlan_table); + + sprintf(info->dev_name, "mlx4_port%d", port); + memcpy(&info->port_attr.attr, &attr, sizeof(attr)); + info->port_attr.show = show_port_type; + info->port_attr.store = set_port_type; + + err = device_create_file(&dev->pdev->dev, &info->port_attr); + if (err) { + mlx4_err(dev, "Failed to create file for port %d\n", port); + info->port = -1; + } + + return err; +} + +static void mlx4_cleanup_port_info(struct mlx4_port_info *info) +{ + if (info->port < 0) + return; + + device_remove_file(&info->dev->pdev->dev, &info->port_attr); } static int __mlx4_init_one(struct pci_dev *pdev, const struct pci_device_id *id) @@ -938,6 +1048,8 @@ static int __mlx4_init_one(struct pci_dev *pdev, const struct pci_device_id *id) INIT_LIST_HEAD(&priv->ctx_list); spin_lock_init(&priv->ctx_lock); + spin_lock_init(&priv->port_lock); + INIT_LIST_HEAD(&priv->pgdir_list); mutex_init(&priv->pgdir_mutex); @@ -973,18 +1085,24 @@ static int __mlx4_init_one(struct pci_dev *pdev, const struct pci_device_id *id) if (err) goto err_close; - for (port = 1; port <= dev->caps.num_ports; port++) - mlx4_init_port_info(dev, port); + for (port = 1; port <= dev->caps.num_ports; port++) { + err = mlx4_init_port_info(dev, port); + if (err) + goto err_port; + } err = mlx4_register_device(dev); if (err) - goto err_cleanup; + goto err_port; pci_set_drvdata(pdev, dev); return 0; -err_cleanup: +err_port: + for (port = 1; port <= 
dev->caps.num_ports; port++) + mlx4_cleanup_port_info(&priv->port[port]); + mlx4_cleanup_mcg_table(dev); mlx4_cleanup_qp_table(dev); mlx4_cleanup_srq_table(dev); @@ -1041,8 +1159,10 @@ static void mlx4_remove_one(struct pci_dev *pdev) if (dev) { mlx4_unregister_device(dev); - for (p = 1; p <= dev->caps.num_ports; ++p) + for (p = 1; p <= dev->caps.num_ports; p++) { + mlx4_cleanup_port_info(&priv->port[p]); mlx4_CLOSE_PORT(dev, p); + } mlx4_cleanup_mcg_table(dev); mlx4_cleanup_qp_table(dev); diff --git a/drivers/net/mlx4/mlx4.h b/drivers/net/mlx4/mlx4.h index 0b94823..68e60c5 100644 --- a/drivers/net/mlx4/mlx4.h +++ b/drivers/net/mlx4/mlx4.h @@ -279,6 +279,9 @@ struct mlx4_vlan_table { struct mlx4_port_info { struct mlx4_dev *dev; int port; + char dev_name[16]; + struct device_attribute port_attr; + enum mlx4_port_type tmp_type; struct mlx4_mac_table mac_table; struct mlx4_vlan_table vlan_table; }; @@ -312,6 +315,7 @@ struct mlx4_priv { struct mlx4_uar driver_uar; void __iomem *kar; struct mlx4_port_info port[MLX4_MAX_PORTS + 1]; + spinlock_t port_lock; }; static inline struct mlx4_priv *mlx4_priv(struct mlx4_dev *dev) -- 1.5.4 From yevgenyp at mellanox.co.il Wed Aug 20 06:21:36 2008 From: yevgenyp at mellanox.co.il (Yevgeny Petrilin) Date: Wed, 20 Aug 2008 16:21:36 +0300 Subject: [ofa-general][PATCH 07/11 v3] mlx4: Multiple completion vectors support Message-ID: <48AC1A60.3000806@mellanox.co.il> From: Yevgeny Petrilin Date: Wed, 20 Aug 2008 10:16:15 +0300 Subject: [PATCH] mlx4: Multiple completion vectors support The driver now creates a completion EQ for every cpu. While allocating CQ a ULP asks a completion vector number it wants the CQ to be attached to. 
The number of completion vectors is populated via ib_device.num_comp_vectors Signed-off-by: Yevgeny Petrilin --- drivers/infiniband/hw/mlx4/cq.c | 2 +- drivers/infiniband/hw/mlx4/main.c | 2 +- drivers/net/mlx4/cq.c | 14 ++++++++-- drivers/net/mlx4/eq.c | 47 ++++++++++++++++++++++++------------ drivers/net/mlx4/main.c | 14 ++++++---- drivers/net/mlx4/mlx4.h | 4 +- include/linux/mlx4/device.h | 4 ++- 7 files changed, 57 insertions(+), 30 deletions(-) diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c index d0866a3..5de41bd 100644 --- a/drivers/infiniband/hw/mlx4/cq.c +++ b/drivers/infiniband/hw/mlx4/cq.c @@ -222,7 +222,7 @@ struct ib_cq *mlx4_ib_create_cq(struct ib_device *ibdev, int entries, int vector } err = mlx4_cq_alloc(dev->dev, entries, &cq->buf.mtt, uar, - cq->db.dma, &cq->mcq, 0); + cq->db.dma, &cq->mcq, vector, 0); if (err) goto err_dbmap; diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c index ff60f77..e30d81a 100644 --- a/drivers/infiniband/hw/mlx4/main.c +++ b/drivers/infiniband/hw/mlx4/main.c @@ -579,7 +579,7 @@ static void *mlx4_ib_add(struct mlx4_dev *dev) mlx4_foreach_port(i, ibdev->ports_map) ibdev->num_ports++; ibdev->ib_dev.phys_port_cnt = ibdev->num_ports; - ibdev->ib_dev.num_comp_vectors = 1; + ibdev->ib_dev.num_comp_vectors = dev->caps.num_comp_vectors; ibdev->ib_dev.dma_device = &dev->pdev->dev; ibdev->ib_dev.uverbs_abi_ver = MLX4_IB_UVERBS_ABI_VERSION; diff --git a/drivers/net/mlx4/cq.c b/drivers/net/mlx4/cq.c index b7ad282..a675e85 100644 --- a/drivers/net/mlx4/cq.c +++ b/drivers/net/mlx4/cq.c @@ -189,7 +189,7 @@ EXPORT_SYMBOL_GPL(mlx4_cq_resize); int mlx4_cq_alloc(struct mlx4_dev *dev, int nent, struct mlx4_mtt *mtt, struct mlx4_uar *uar, u64 db_rec, struct mlx4_cq *cq, - int collapsed) + unsigned vector, int collapsed) { struct mlx4_priv *priv = mlx4_priv(dev); struct mlx4_cq_table *cq_table = &priv->cq_table; @@ -227,7 +227,15 @@ int mlx4_cq_alloc(struct mlx4_dev *dev, int nent, 
struct mlx4_mtt *mtt, cq_context->flags = cpu_to_be32(!!collapsed << 18); cq_context->logsize_usrpage = cpu_to_be32((ilog2(nent) << 24) | uar->index); - cq_context->comp_eqn = priv->eq_table.eq[MLX4_EQ_COMP].eqn; + + if (vector >= dev->caps.num_comp_vectors) { + err = -EINVAL; + goto err_radix; + } + + cq->comp_eq_idx = MLX4_EQ_COMP_CPU0 + vector; + cq_context->comp_eqn = priv->eq_table.eq[MLX4_EQ_COMP_CPU0 + + vector].eqn; cq_context->log_page_size = mtt->page_shift - MLX4_ICM_PAGE_SHIFT; mtt_addr = mlx4_mtt_addr(dev, mtt); @@ -276,7 +284,7 @@ void mlx4_cq_free(struct mlx4_dev *dev, struct mlx4_cq *cq) if (err) mlx4_warn(dev, "HW2SW_CQ failed (%d) for CQN %06x\n", err, cq->cqn); - synchronize_irq(priv->eq_table.eq[MLX4_EQ_COMP].irq); + synchronize_irq(priv->eq_table.eq[cq->comp_eq_idx].irq); spin_lock_irq(&cq_table->lock); radix_tree_delete(&cq_table->tree, cq->cqn); diff --git a/drivers/net/mlx4/eq.c b/drivers/net/mlx4/eq.c index de16933..b436234 100644 --- a/drivers/net/mlx4/eq.c +++ b/drivers/net/mlx4/eq.c @@ -266,7 +266,7 @@ static irqreturn_t mlx4_interrupt(int irq, void *dev_ptr) writel(priv->eq_table.clr_mask, priv->eq_table.clr_int); - for (i = 0; i < MLX4_NUM_EQ; ++i) + for (i = 0; i < MLX4_EQ_COMP_CPU0 + dev->caps.num_comp_vectors; ++i) work |= mlx4_eq_int(dev, &priv->eq_table.eq[i]); return IRQ_RETVAL(work); @@ -483,7 +483,7 @@ static void mlx4_free_irqs(struct mlx4_dev *dev) if (eq_table->have_irq) free_irq(dev->pdev->irq, dev); - for (i = 0; i < MLX4_NUM_EQ; ++i) + for (i = 0; i < MLX4_EQ_COMP_CPU0 + dev->caps.num_comp_vectors; ++i) if (eq_table->eq[i].have_irq) free_irq(eq_table->eq[i].irq, eq_table->eq + i); } @@ -554,6 +554,7 @@ void mlx4_unmap_eq_icm(struct mlx4_dev *dev) int mlx4_init_eq_table(struct mlx4_dev *dev) { struct mlx4_priv *priv = mlx4_priv(dev); + int req_eqs; int err; int i; @@ -574,11 +575,21 @@ int mlx4_init_eq_table(struct mlx4_dev *dev) priv->eq_table.clr_int = priv->clr_base + (priv->eq_table.inta_pin < 32 ? 
4 : 0); - err = mlx4_create_eq(dev, dev->caps.num_cqs + MLX4_NUM_SPARE_EQE, - (dev->flags & MLX4_FLAG_MSI_X) ? MLX4_EQ_COMP : 0, - &priv->eq_table.eq[MLX4_EQ_COMP]); - if (err) - goto err_out_unmap; + dev->caps.num_comp_vectors = 0; + req_eqs = (dev->flags & MLX4_FLAG_MSI_X) ? num_online_cpus() : 1; + while (req_eqs) { + err = mlx4_create_eq( + dev, dev->caps.num_cqs + MLX4_NUM_SPARE_EQE, + (dev->flags & MLX4_FLAG_MSI_X) ? + (MLX4_EQ_COMP_CPU0 + dev->caps.num_comp_vectors) : 0, + &priv->eq_table.eq[MLX4_EQ_COMP_CPU0 + + dev->caps.num_comp_vectors]); + if (err) + goto err_out_comp; + + dev->caps.num_comp_vectors++; + req_eqs--; + } err = mlx4_create_eq(dev, MLX4_NUM_ASYNC_EQE + MLX4_NUM_SPARE_EQE, (dev->flags & MLX4_FLAG_MSI_X) ? MLX4_EQ_ASYNC : 0, @@ -587,12 +598,16 @@ int mlx4_init_eq_table(struct mlx4_dev *dev) goto err_out_comp; if (dev->flags & MLX4_FLAG_MSI_X) { - static const char *eq_name[] = { - [MLX4_EQ_COMP] = DRV_NAME " (comp)", - [MLX4_EQ_ASYNC] = DRV_NAME " (async)" - }; + static char eq_name[MLX4_NUM_EQ][20]; + + for (i = 0; i < MLX4_EQ_COMP_CPU0 + + dev->caps.num_comp_vectors; ++i) { + if (i == 0) + snprintf(eq_name[0], 20, DRV_NAME "(async)"); + else + snprintf(eq_name[i], 20, "comp_" DRV_NAME "%d", + i - 1); - for (i = 0; i < MLX4_NUM_EQ; ++i) { err = request_irq(priv->eq_table.eq[i].irq, mlx4_msi_x_interrupt, 0, eq_name[i], priv->eq_table.eq + i); @@ -617,7 +632,7 @@ int mlx4_init_eq_table(struct mlx4_dev *dev) mlx4_warn(dev, "MAP_EQ for async EQ %d failed (%d)\n", priv->eq_table.eq[MLX4_EQ_ASYNC].eqn, err); - for (i = 0; i < MLX4_NUM_EQ; ++i) + for (i = 0; i < MLX4_EQ_COMP_CPU0 + dev->caps.num_comp_vectors; ++i) eq_set_ci(&priv->eq_table.eq[i], 1); return 0; @@ -626,9 +641,9 @@ err_out_async: mlx4_free_eq(dev, &priv->eq_table.eq[MLX4_EQ_ASYNC]); err_out_comp: - mlx4_free_eq(dev, &priv->eq_table.eq[MLX4_EQ_COMP]); + for (i = 0; i < dev->caps.num_comp_vectors; ++i) + mlx4_free_eq(dev, &priv->eq_table.eq[MLX4_EQ_COMP_CPU0 + i]); -err_out_unmap: 
mlx4_unmap_clr_int(dev); mlx4_free_irqs(dev); @@ -647,7 +662,7 @@ void mlx4_cleanup_eq_table(struct mlx4_dev *dev) mlx4_free_irqs(dev); - for (i = 0; i < MLX4_NUM_EQ; ++i) + for (i = 0; i < MLX4_EQ_COMP_CPU0 + dev->caps.num_comp_vectors; ++i) mlx4_free_eq(dev, &priv->eq_table.eq[i]); mlx4_unmap_clr_int(dev); diff --git a/drivers/net/mlx4/main.c b/drivers/net/mlx4/main.c index 65ab668..b8213ca 100644 --- a/drivers/net/mlx4/main.c +++ b/drivers/net/mlx4/main.c @@ -907,22 +907,24 @@ static void mlx4_enable_msi_x(struct mlx4_dev *dev) { struct mlx4_priv *priv = mlx4_priv(dev); struct msix_entry entries[MLX4_NUM_EQ]; + int needed_vectors = MLX4_EQ_COMP_CPU0 + num_online_cpus(); int err; int i; if (msi_x) { - for (i = 0; i < MLX4_NUM_EQ; ++i) + for (i = 0; i < needed_vectors; ++i) entries[i].entry = i; - err = pci_enable_msix(dev->pdev, entries, ARRAY_SIZE(entries)); + err = pci_enable_msix(dev->pdev, entries, needed_vectors); if (err) { if (err > 0) - mlx4_info(dev, "Only %d MSI-X vectors available, " - "not using MSI-X\n", err); + mlx4_info(dev, "Only %d MSI-X vectors " + "available, need %d. 
Not using MSI-X\n", + err, needed_vectors); goto no_msi; } - for (i = 0; i < MLX4_NUM_EQ; ++i) + for (i = 0; i < needed_vectors; ++i) priv->eq_table.eq[i].irq = entries[i].vector; dev->flags |= MLX4_FLAG_MSI_X; @@ -930,7 +932,7 @@ static void mlx4_enable_msi_x(struct mlx4_dev *dev) } no_msi: - for (i = 0; i < MLX4_NUM_EQ; ++i) + for (i = 0; i < needed_vectors; ++i) priv->eq_table.eq[i].irq = dev->pdev->irq; } diff --git a/drivers/net/mlx4/mlx4.h b/drivers/net/mlx4/mlx4.h index 68e60c5..4197cd0 100644 --- a/drivers/net/mlx4/mlx4.h +++ b/drivers/net/mlx4/mlx4.h @@ -64,8 +64,8 @@ enum { enum { MLX4_EQ_ASYNC, - MLX4_EQ_COMP, - MLX4_NUM_EQ + MLX4_EQ_COMP_CPU0, + MLX4_NUM_EQ = MLX4_EQ_COMP_CPU0 + NR_CPUS }; enum { diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h index 06bac35..67ade67 100644 --- a/include/linux/mlx4/device.h +++ b/include/linux/mlx4/device.h @@ -195,6 +195,7 @@ struct mlx4_caps { int reserved_cqs; int num_eqs; int reserved_eqs; + int num_comp_vectors; int num_mpts; int num_mtt_segs; int fmr_reserved_mtts; @@ -315,6 +316,7 @@ struct mlx4_cq { int arm_sn; int cqn; + int comp_eq_idx; atomic_t refcount; struct completion free; @@ -444,7 +446,7 @@ void mlx4_free_hwq_res(struct mlx4_dev *mdev, struct mlx4_hwq_resources *wqres, int mlx4_cq_alloc(struct mlx4_dev *dev, int nent, struct mlx4_mtt *mtt, struct mlx4_uar *uar, u64 db_rec, struct mlx4_cq *cq, - int collapsed); + unsigned vector, int collapsed); void mlx4_cq_free(struct mlx4_dev *dev, struct mlx4_cq *cq); int mlx4_qp_reserve_range(struct mlx4_dev *dev, int cnt, int align, int *base); -- 1.5.4 From yevgenyp at mellanox.co.il Wed Aug 20 06:21:56 2008 From: yevgenyp at mellanox.co.il (Yevgeny Petrilin) Date: Wed, 20 Aug 2008 16:21:56 +0300 Subject: [ofa-general][PATCH 08/11 v4] mlx4: Default value for automatic completion vector selection Message-ID: <48AC1A74.30305@mellanox.co.il> From: Yevgeny Petrilin Date: Wed, 20 Aug 2008 10:18:58 +0300 Subject: [PATCH] mlx4: Default value for 
automatic completion vector selection When the vector number passed to mlx4_cq_alloc is MLX4_LEAST_ATTACHED_VECTOR (0xffffffff), the driver selects the completion vector that has the fewest CQs attached to it and attaches the CQ to that vector. IB_CQ_VECTOR_LEAST_ATTACHED is defined in rdma/ib_verbs.h; when the mlx4_ib driver receives this CQ vector number, it passes MLX4_LEAST_ATTACHED_VECTOR at CQ creation. Signed-off-by: Yevgeny Petrilin --- drivers/infiniband/hw/mlx4/cq.c | 4 +++- drivers/net/mlx4/cq.c | 22 +++++++++++++++++++++- drivers/net/mlx4/mlx4.h | 1 + include/linux/mlx4/device.h | 2 ++ include/rdma/ib_verbs.h | 10 +++++++++- 5 files changed, 36 insertions(+), 3 deletions(-) diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c index 5de41bd..384e616 100644 --- a/drivers/infiniband/hw/mlx4/cq.c +++ b/drivers/infiniband/hw/mlx4/cq.c @@ -222,7 +222,9 @@ struct ib_cq *mlx4_ib_create_cq(struct ib_device *ibdev, int entries, int vector } err = mlx4_cq_alloc(dev->dev, entries, &cq->buf.mtt, uar, - cq->db.dma, &cq->mcq, vector, 0); + cq->db.dma, &cq->mcq, + vector == IB_CQ_VECTOR_LEAST_ATTACHED ? 
+ MLX4_LEAST_ATTACHED_VECTOR : vector, 0); if (err) goto err_dbmap; diff --git a/drivers/net/mlx4/cq.c b/drivers/net/mlx4/cq.c index a675e85..31a5190 100644 --- a/drivers/net/mlx4/cq.c +++ b/drivers/net/mlx4/cq.c @@ -187,6 +187,22 @@ int mlx4_cq_resize(struct mlx4_dev *dev, struct mlx4_cq *cq, } EXPORT_SYMBOL_GPL(mlx4_cq_resize); +static int mlx4_find_least_loaded_vector(struct mlx4_priv *priv) +{ + int i; + int index = 0; + int min = priv->eq_table.eq[MLX4_EQ_COMP_CPU0].load; + + for (i = 1; i < priv->dev.caps.num_comp_vectors; i++) { + if (priv->eq_table.eq[MLX4_EQ_COMP_CPU0 + i].load < min) { + index = i; + min = priv->eq_table.eq[MLX4_EQ_COMP_CPU0 + i].load; + } + } + + return index; +} + int mlx4_cq_alloc(struct mlx4_dev *dev, int nent, struct mlx4_mtt *mtt, struct mlx4_uar *uar, u64 db_rec, struct mlx4_cq *cq, unsigned vector, int collapsed) @@ -228,7 +244,9 @@ int mlx4_cq_alloc(struct mlx4_dev *dev, int nent, struct mlx4_mtt *mtt, cq_context->flags = cpu_to_be32(!!collapsed << 18); cq_context->logsize_usrpage = cpu_to_be32((ilog2(nent) << 24) | uar->index); - if (vector >= dev->caps.num_comp_vectors) { + if (vector == MLX4_LEAST_ATTACHED_VECTOR) + vector = mlx4_find_least_loaded_vector(priv); + else if (vector >= dev->caps.num_comp_vectors) { err = -EINVAL; goto err_radix; } @@ -248,6 +266,7 @@ int mlx4_cq_alloc(struct mlx4_dev *dev, int nent, struct mlx4_mtt *mtt, if (err) goto err_radix; + priv->eq_table.eq[cq->comp_eq_idx].load++; cq->cons_index = 0; cq->arm_sn = 1; cq->uar = uar; @@ -285,6 +304,7 @@ void mlx4_cq_free(struct mlx4_dev *dev, struct mlx4_cq *cq) mlx4_warn(dev, "HW2SW_CQ failed (%d) for CQN %06x\n", err, cq->cqn); synchronize_irq(priv->eq_table.eq[cq->comp_eq_idx].irq); + priv->eq_table.eq[cq->comp_eq_idx].load--; spin_lock_irq(&cq_table->lock); radix_tree_delete(&cq_table->tree, cq->cqn); diff --git a/drivers/net/mlx4/mlx4.h b/drivers/net/mlx4/mlx4.h index 4197cd0..4c18e36 100644 --- a/drivers/net/mlx4/mlx4.h +++ b/drivers/net/mlx4/mlx4.h @@ 
-145,6 +145,7 @@ struct mlx4_eq { u16 irq; u16 have_irq; int nent; + int load; struct mlx4_buf_list *page_list; struct mlx4_mtt mtt; }; diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h index 67ade67..727e7f1 100644 --- a/include/linux/mlx4/device.h +++ b/include/linux/mlx4/device.h @@ -159,6 +159,8 @@ enum { MLX4_NUM_FEXCH = 64 * 1024, }; +#define MLX4_LEAST_ATTACHED_VECTOR 0xffffffff + static inline u64 mlx4_fw_ver(u64 major, u64 minor, u64 subminor) { return (major << 32) | (minor << 16) | subminor; diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index 936e333..038997e 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -1448,6 +1448,13 @@ static inline int ib_post_recv(struct ib_qp *qp, return qp->device->post_recv(qp, recv_wr, bad_recv_wr); } +/* + * IB_CQ_VECTOR_LEAST_ATTACHED: The constant specifies that + * the CQ will be attached to the least attached + * completion vector + */ +#define IB_CQ_VECTOR_LEAST_ATTACHED 0xffffffff + /** * ib_create_cq - Creates a CQ on the specified device. * @device: The device on which to create the CQ. @@ -1459,7 +1466,8 @@ static inline int ib_post_recv(struct ib_qp *qp, * the associated completion and event handlers. * @cqe: The minimum size of the CQ. * @comp_vector - Completion vector used to signal completion events. - * Must be >= 0 and < context->num_comp_vectors. + * Must be >= 0 and < context->num_comp_vectors + * or IB_CQ_VECTOR_LEAST_ATTACHED. * * Users can examine the cq structure to determine the actual CQ size. 
*/ -- 1.5.4 From yevgenyp at mellanox.co.il Wed Aug 20 06:22:31 2008 From: yevgenyp at mellanox.co.il (Yevgeny Petrilin) Date: Wed, 20 Aug 2008 16:22:31 +0300 Subject: [ofa-general][PATCH 09/11 v3] mlx4: Fiber Channel support Message-ID: <48AC1A97.2090609@mellanox.co.il> From: Yevgeny Petrilin Date: Wed, 20 Aug 2008 10:22:17 +0300 Subject: [PATCH] mlx4: Fiber Channel support

As we did with QPs, some of the MPTs are pre-reserved (the MPTs that are mapped for FEXCHs, 2*64K of them), so the operation of allocating an MPT had to be split in two:
 - The allocation of a bit from the bitmap
 - The actual creation of the entry (and its MTT).

So mr_alloc_reserved() is the second part, used when you already know which MPT number was allocated, and mr_alloc() is the one that allocates a number from the bitmap. Normal users keep using the original mr_alloc(). For FEXCH, where we know the pre-reserved MPT entry, we call mr_alloc_reserved() directly.

The same applies to mr_free() and the corresponding mr_free_reserved(): mr_free() also puts the bit back, while mr_free_reserved() actually destroys the entry but leaves the bit set.

map_phys_fmr_fbo() is very much like the original map_phys_fmr(), except that it:
 - allows setting an FBO (First Byte Offset) for the MPT
 - allows setting the data length for the MPT
 - optionally does not increase the higher bits of the key after every map (the same_key argument).
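For readers who want the allocation split described above in isolation, here is a minimal userspace sketch. It is only a toy model, not driver code: the toy_* names, the 16-entry table and the 4 reserved entries at the top are all assumptions made for illustration, and the real functions also set up the MTT.

```c
#include <stdint.h>

#define NUM_MPTS      16u   /* toy table size (assumption) */
#define RESERVED_TOP   4u   /* toy count of pre-reserved entries at the top (assumption) */

struct toy_mr {
	uint32_t index;     /* which table entry this region uses */
	int      in_use;    /* 1 once the entry has been set up */
};

static uint8_t bitmap[NUM_MPTS];

/* Part 1: take a free index from the bitmap, never touching the reserved range. */
static int toy_bitmap_alloc(void)
{
	for (uint32_t i = 0; i < NUM_MPTS - RESERVED_TOP; i++) {
		if (!bitmap[i]) {
			bitmap[i] = 1;
			return (int)i;
		}
	}
	return -1;
}

static void toy_bitmap_free(uint32_t i)
{
	bitmap[i] = 0;
}

/* Part 2: set up the entry for an index the caller already knows. */
static int toy_mr_alloc_reserved(uint32_t index, struct toy_mr *mr)
{
	if (index >= NUM_MPTS)
		return -1;
	mr->index = index;
	mr->in_use = 1;
	return 0;
}

/* Normal path: part 1 followed by part 2, undoing part 1 on failure. */
static int toy_mr_alloc(struct toy_mr *mr)
{
	int index = toy_bitmap_alloc();

	if (index < 0)
		return -1;
	if (toy_mr_alloc_reserved((uint32_t)index, mr)) {
		toy_bitmap_free((uint32_t)index);
		return -1;
	}
	return 0;
}

/* Tear down the entry only; the bitmap is left untouched. */
static void toy_mr_free_reserved(struct toy_mr *mr)
{
	mr->in_use = 0;
}

/* Normal path: tear down the entry, then give the index back. */
static void toy_mr_free(struct toy_mr *mr)
{
	toy_mr_free_reserved(mr);
	toy_bitmap_free(mr->index);
}
```

A caller that owns a pre-reserved index would use only the *_reserved pair, while everyone else keeps using toy_mr_alloc() and toy_mr_free().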
Signed-off-by: Yevgeny Petrilin --- drivers/infiniband/hw/mlx4/main.c | 4 +- drivers/net/mlx4/main.c | 2 +- drivers/net/mlx4/mr.c | 128 ++++++++++++++++++++++++++++++++----- include/linux/mlx4/device.h | 19 ++++++ include/linux/mlx4/qp.h | 11 +++- 5 files changed, 145 insertions(+), 19 deletions(-) diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c index e30d81a..b4d786f 100644 --- a/drivers/infiniband/hw/mlx4/main.c +++ b/drivers/infiniband/hw/mlx4/main.c @@ -126,7 +126,9 @@ static int mlx4_ib_query_device(struct ib_device *ibdev, dev->dev->caps.max_rq_sg); props->max_cq = dev->dev->caps.num_cqs - dev->dev->caps.reserved_cqs; props->max_cqe = dev->dev->caps.max_cqes; - props->max_mr = dev->dev->caps.num_mpts - dev->dev->caps.reserved_mrws; + props->max_mr = dev->dev->caps.num_mpts - + dev->dev->caps.reserved_mrws - + dev->dev->caps.num_fexch_mpts; props->max_pd = dev->dev->caps.num_pds - dev->dev->caps.reserved_pds; props->max_qp_rd_atom = dev->dev->caps.max_qp_dest_rdma; props->max_qp_init_rd_atom = dev->dev->caps.max_qp_init_rdma; diff --git a/drivers/net/mlx4/main.c b/drivers/net/mlx4/main.c index b8213ca..4f6ec90 100644 --- a/drivers/net/mlx4/main.c +++ b/drivers/net/mlx4/main.c @@ -81,7 +81,7 @@ static struct mlx4_profile default_profile = { .rdmarc_per_qp = 1 << 4, .num_cq = 1 << 16, .num_mcg = 1 << 13, - .num_mpt = 1 << 17, + .num_mpt = 1 << 18, .num_mtt = 1 << 20, }; diff --git a/drivers/net/mlx4/mr.c b/drivers/net/mlx4/mr.c index d7c6ea5..65561ca 100644 --- a/drivers/net/mlx4/mr.c +++ b/drivers/net/mlx4/mr.c @@ -52,7 +52,9 @@ struct mlx4_mpt_entry { __be64 length; __be32 lkey; __be32 win_cnt; - u8 reserved1[3]; + u8 reserved1; + u8 flags2; + u8 reserved2; u8 mtt_rep; __be64 mtt_seg; __be32 mtt_sz; @@ -72,6 +74,8 @@ struct mlx4_mpt_entry { #define MLX4_MTT_FLAG_PRESENT 1 +#define MLX4_MPT_FLAG2_FBO_EN (1 << 7) + #define MLX4_MPT_STATUS_SW 0xF0 #define MLX4_MPT_STATUS_HW 0x00 @@ -264,6 +268,21 @@ static int 
mlx4_HW2SW_MPT(struct mlx4_dev *dev, struct mlx4_cmd_mailbox *mailbox !mailbox, MLX4_CMD_HW2SW_MPT, MLX4_CMD_TIME_CLASS_B); } +int mlx4_mr_alloc_reserved(struct mlx4_dev *dev, u32 mridx, u32 pd, + u64 iova, u64 size, u32 access, int npages, + int page_shift, struct mlx4_mr *mr) +{ + mr->iova = iova; + mr->size = size; + mr->pd = pd; + mr->access = access; + mr->enabled = 0; + mr->key = hw_index_to_key(mridx); + + return mlx4_mtt_init(dev, npages, page_shift, &mr->mtt); +} +EXPORT_SYMBOL_GPL(mlx4_mr_alloc_reserved); + int mlx4_mr_alloc(struct mlx4_dev *dev, u32 pd, u64 iova, u64 size, u32 access, int npages, int page_shift, struct mlx4_mr *mr) { @@ -275,14 +294,8 @@ int mlx4_mr_alloc(struct mlx4_dev *dev, u32 pd, u64 iova, u64 size, u32 access, if (index == -1) return -ENOMEM; - mr->iova = iova; - mr->size = size; - mr->pd = pd; - mr->access = access; - mr->enabled = 0; - mr->key = hw_index_to_key(index); - - err = mlx4_mtt_init(dev, npages, page_shift, &mr->mtt); + err = mlx4_mr_alloc_reserved(dev, index, pd, iova, size, + access, npages, page_shift, mr); if (err) mlx4_bitmap_free(&priv->mr_table.mpt_bitmap, index); @@ -290,9 +303,8 @@ int mlx4_mr_alloc(struct mlx4_dev *dev, u32 pd, u64 iova, u64 size, u32 access, } EXPORT_SYMBOL_GPL(mlx4_mr_alloc); -void mlx4_mr_free(struct mlx4_dev *dev, struct mlx4_mr *mr) +void mlx4_mr_free_reserved(struct mlx4_dev *dev, struct mlx4_mr *mr) { - struct mlx4_priv *priv = mlx4_priv(dev); int err; if (mr->enabled) { @@ -304,6 +316,13 @@ void mlx4_mr_free(struct mlx4_dev *dev, struct mlx4_mr *mr) } mlx4_mtt_cleanup(dev, &mr->mtt); +} +EXPORT_SYMBOL_GPL(mlx4_mr_free_reserved); + +void mlx4_mr_free(struct mlx4_dev *dev, struct mlx4_mr *mr) +{ + struct mlx4_priv *priv = mlx4_priv(dev); + mlx4_mr_free_reserved(dev, mr); mlx4_bitmap_free(&priv->mr_table.mpt_bitmap, key_to_hw_index(mr->key)); } EXPORT_SYMBOL_GPL(mlx4_mr_free); @@ -458,8 +477,16 @@ int mlx4_init_mr_table(struct mlx4_dev *dev) struct mlx4_mr_table *mr_table = 
&mlx4_priv(dev)->mr_table; int err; + if (!is_power_of_2(dev->caps.num_mpts)) + return -EINVAL; + + dev->caps.num_fexch_mpts = + 2 * dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FC_EXCH]; + dev->caps.reserved_fexch_mpts_base = dev->caps.num_mpts - + dev->caps.num_fexch_mpts; err = mlx4_bitmap_init(&mr_table->mpt_bitmap, dev->caps.num_mpts, - ~0, dev->caps.reserved_mrws, 0); + ~0, dev->caps.reserved_mrws, + dev->caps.reserved_fexch_mpts_base); if (err) return err; @@ -523,8 +550,9 @@ static inline int mlx4_check_fmr(struct mlx4_fmr *fmr, u64 *page_list, return 0; } -int mlx4_map_phys_fmr(struct mlx4_dev *dev, struct mlx4_fmr *fmr, u64 *page_list, - int npages, u64 iova, u32 *lkey, u32 *rkey) +int mlx4_map_phys_fmr_fbo(struct mlx4_dev *dev, struct mlx4_fmr *fmr, + u64 *page_list, int npages, u64 iova, u32 fbo, + u32 len, u32 *lkey, u32 *rkey, int same_key) { u32 key; int i, err; @@ -536,7 +564,8 @@ int mlx4_map_phys_fmr(struct mlx4_dev *dev, struct mlx4_fmr *fmr, u64 *page_list ++fmr->maps; key = key_to_hw_index(fmr->mr.key); - key += dev->caps.num_mpts; + if (!same_key) + key += dev->caps.num_mpts; *lkey = *rkey = fmr->mr.key = hw_index_to_key(key); *(u8 *) fmr->mpt = MLX4_MPT_STATUS_SW; @@ -552,8 +581,10 @@ int mlx4_map_phys_fmr(struct mlx4_dev *dev, struct mlx4_fmr *fmr, u64 *page_list fmr->mpt->key = cpu_to_be32(key); fmr->mpt->lkey = cpu_to_be32(key); - fmr->mpt->length = cpu_to_be64(npages * (1ull << fmr->page_shift)); + fmr->mpt->length = cpu_to_be64(len); fmr->mpt->start = cpu_to_be64(iova); + fmr->mpt->first_byte_offset = cpu_to_be32(fbo & 0x001fffff); + fmr->mpt->flags2 = (fbo ? 
MLX4_MPT_FLAG2_FBO_EN : 0); /* Make MTT entries are visible before setting MPT status */ wmb(); @@ -565,6 +596,16 @@ int mlx4_map_phys_fmr(struct mlx4_dev *dev, struct mlx4_fmr *fmr, u64 *page_list return 0; } +EXPORT_SYMBOL_GPL(mlx4_map_phys_fmr_fbo); + +int mlx4_map_phys_fmr(struct mlx4_dev *dev, struct mlx4_fmr *fmr, u64 *page_list, + int npages, u64 iova, u32 *lkey, u32 *rkey) +{ + u32 len = npages * (1ull << fmr->page_shift); + + return mlx4_map_phys_fmr_fbo(dev, fmr, page_list, npages, iova, 0, + len, lkey, rkey, 0); +} EXPORT_SYMBOL_GPL(mlx4_map_phys_fmr); int mlx4_fmr_alloc(struct mlx4_dev *dev, u32 pd, u32 access, int max_pages, @@ -609,6 +650,49 @@ err_free: } EXPORT_SYMBOL_GPL(mlx4_fmr_alloc); +int mlx4_fmr_alloc_reserved(struct mlx4_dev *dev, u32 mridx, + u32 pd, u32 access, int max_pages, + int max_maps, u8 page_shift, struct mlx4_fmr *fmr) +{ + struct mlx4_priv *priv = mlx4_priv(dev); + u64 mtt_seg; + int err = -ENOMEM; + + if (page_shift < (ffs(dev->caps.page_size_cap) - 1) || page_shift >= 32) + return -EINVAL; + + /* All MTTs must fit in the same page */ + if (max_pages * sizeof *fmr->mtts > PAGE_SIZE) + return -EINVAL; + + fmr->page_shift = page_shift; + fmr->max_pages = max_pages; + fmr->max_maps = max_maps; + fmr->maps = 0; + + err = mlx4_mr_alloc_reserved(dev, mridx, pd, 0, 0, access, max_pages, + page_shift, &fmr->mr); + if (err) + return err; + + mtt_seg = fmr->mr.mtt.first_seg * dev->caps.mtt_entry_sz; + + fmr->mtts = mlx4_table_find(&priv->mr_table.mtt_table, + fmr->mr.mtt.first_seg, + &fmr->dma_handle); + if (!fmr->mtts) { + err = -ENOMEM; + goto err_free; + } + + return 0; + +err_free: + mlx4_mr_free_reserved(dev, &fmr->mr); + return err; +} +EXPORT_SYMBOL_GPL(mlx4_fmr_alloc_reserved); + int mlx4_fmr_enable(struct mlx4_dev *dev, struct mlx4_fmr *fmr) { struct mlx4_priv *priv = mlx4_priv(dev); @@ -651,6 +735,18 @@ int mlx4_fmr_free(struct mlx4_dev *dev, struct mlx4_fmr *fmr) } EXPORT_SYMBOL_GPL(mlx4_fmr_free); +int 
mlx4_fmr_free_reserved(struct mlx4_dev *dev, struct mlx4_fmr *fmr) +{ + if (fmr->maps) + return -EBUSY; + + fmr->mr.enabled = 0; + mlx4_mr_free_reserved(dev, &fmr->mr); + + return 0; +} +EXPORT_SYMBOL_GPL(mlx4_fmr_free_reserved); + int mlx4_SYNC_TPT(struct mlx4_dev *dev) { return mlx4_cmd(dev, 0, 0, 0, MLX4_CMD_SYNC_TPT, 1000); diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h index 727e7f1..3496a27 100644 --- a/include/linux/mlx4/device.h +++ b/include/linux/mlx4/device.h @@ -226,6 +226,8 @@ struct mlx4_caps { int log_num_vlans; int log_num_prios; enum mlx4_port_type port_type[MLX4_MAX_PORTS + 1]; + int reserved_fexch_mpts_base; + int num_fexch_mpts; }; struct mlx4_buf_list { @@ -406,6 +408,12 @@ static inline u32 mlx4_get_ports_of_type(struct mlx4_dev *dev, for ((port) = 1; (port) <= MLX4_MAX_PORTS; (port)++) \ if (bitmap & 1 << ((port)-1)) + +static inline int mlx4_get_fexch_mpts_base(struct mlx4_dev *dev) +{ + return dev->caps.reserved_fexch_mpts_base; +} + int mlx4_buf_alloc(struct mlx4_dev *dev, int size, int max_direct, struct mlx4_buf *buf); void mlx4_buf_free(struct mlx4_dev *dev, int size, struct mlx4_buf *buf); @@ -429,8 +437,12 @@ int mlx4_mtt_init(struct mlx4_dev *dev, int npages, int page_shift, void mlx4_mtt_cleanup(struct mlx4_dev *dev, struct mlx4_mtt *mtt); u64 mlx4_mtt_addr(struct mlx4_dev *dev, struct mlx4_mtt *mtt); +int mlx4_mr_alloc_reserved(struct mlx4_dev *dev, u32 mridx, u32 pd, + u64 iova, u64 size, u32 access, int npages, + int page_shift, struct mlx4_mr *mr); int mlx4_mr_alloc(struct mlx4_dev *dev, u32 pd, u64 iova, u64 size, u32 access, int npages, int page_shift, struct mlx4_mr *mr); +void mlx4_mr_free_reserved(struct mlx4_dev *dev, struct mlx4_mr *mr); void mlx4_mr_free(struct mlx4_dev *dev, struct mlx4_mr *mr); int mlx4_mr_enable(struct mlx4_dev *dev, struct mlx4_mr *mr); int mlx4_write_mtt(struct mlx4_dev *dev, struct mlx4_mtt *mtt, @@ -476,13 +488,20 @@ void mlx4_unregister_mac(struct mlx4_dev *dev, u8 port, 
int index); int mlx4_register_vlan(struct mlx4_dev *dev, u8 port, u16 vlan, int *index); void mlx4_unregister_vlan(struct mlx4_dev *dev, u8 port, int index); +int mlx4_map_phys_fmr_fbo(struct mlx4_dev *dev, struct mlx4_fmr *fmr, + u64 *page_list, int npages, u64 iova, u32 fbo, + u32 len, u32 *lkey, u32 *rkey, int same_key); int mlx4_map_phys_fmr(struct mlx4_dev *dev, struct mlx4_fmr *fmr, u64 *page_list, int npages, u64 iova, u32 *lkey, u32 *rkey); +int mlx4_fmr_alloc_reserved(struct mlx4_dev *dev, u32 mridx, u32 pd, + u32 access, int max_pages, int max_maps, + u8 page_shift, struct mlx4_fmr *fmr); int mlx4_fmr_alloc(struct mlx4_dev *dev, u32 pd, u32 access, int max_pages, int max_maps, u8 page_shift, struct mlx4_fmr *fmr); int mlx4_fmr_enable(struct mlx4_dev *dev, struct mlx4_fmr *fmr); void mlx4_fmr_unmap(struct mlx4_dev *dev, struct mlx4_fmr *fmr, u32 *lkey, u32 *rkey); +int mlx4_fmr_free_reserved(struct mlx4_dev *dev, struct mlx4_fmr *fmr); int mlx4_fmr_free(struct mlx4_dev *dev, struct mlx4_fmr *fmr); int mlx4_SYNC_TPT(struct mlx4_dev *dev); diff --git a/include/linux/mlx4/qp.h b/include/linux/mlx4/qp.h index 03802fc..f8f49d8 100644 --- a/include/linux/mlx4/qp.h +++ b/include/linux/mlx4/qp.h @@ -151,7 +151,16 @@ struct mlx4_qp_context { u8 reserved4[2]; u8 mtt_base_addr_h; __be32 mtt_base_addr_l; - u32 reserved5[10]; + u8 VE; + u8 reserved5; + __be16 VFT_id_prio; + u8 reserved6; + u8 exch_size; + __be16 exch_base; + u8 VFT_hop_cnt; + u8 my_fc_id_idx; + __be16 reserved7; + u32 reserved8[7]; }; /* Which firmware version adds support for NEC (NoErrorCompletion) bit */ -- 1.5.4 From yevgenyp at mellanox.co.il Wed Aug 20 06:22:52 2008 From: yevgenyp at mellanox.co.il (Yevgeny Petrilin) Date: Wed, 20 Aug 2008 16:22:52 +0300 Subject: [ofa-general][PATCH 11/11 v3] mlx4_core: Added support to Ethernet device id's Message-ID: <48AC1AAC.8030303@mellanox.co.il> From: Yevgeny Petrilin Date: Wed, 20 Aug 2008 10:27:35 +0300 Subject: [PATCH] mlx4_core: Added support to 
Ethernet device id's Signed-off-by: Yevgeny Petrilin --- drivers/net/mlx4/main.c | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/drivers/net/mlx4/main.c b/drivers/net/mlx4/main.c index 92784ad..b5c530a 100644 --- a/drivers/net/mlx4/main.c +++ b/drivers/net/mlx4/main.c @@ -1264,6 +1264,8 @@ static struct pci_device_id mlx4_pci_table[] = { { PCI_VDEVICE(MELLANOX, 0x6354) }, /* MT25408 "Hermon" QDR */ { PCI_VDEVICE(MELLANOX, 0x6732) }, /* MT25408 "Hermon" DDR PCIe gen2 */ { PCI_VDEVICE(MELLANOX, 0x673c) }, /* MT25408 "Hermon" QDR PCIe gen2 */ + { PCI_VDEVICE(MELLANOX, 0x6368) }, /* MT25408 "Hermon"EN 10GigE */ + { PCI_VDEVICE(MELLANOX, 0x6750) }, /* MT25408 "Hermon"EN 10GigE + Gen2 */ { 0, } }; -- 1.5.4 From yevgenyp at mellanox.co.il Wed Aug 20 06:22:45 2008 From: yevgenyp at mellanox.co.il (Yevgeny Petrilin) Date: Wed, 20 Aug 2008 16:22:45 +0300 Subject: [ofa-general][PATCH 10/11 v3] mlx4_core: Auto negotiation support Message-ID: <48AC1AA5.2060808@mellanox.co.il> From: Yevgeny Petrilin Date: Wed, 20 Aug 2008 10:25:17 +0300 Subject: [PATCH] mlx4_core: Auto negotiation support

Whenever a port's link is down (other than during driver restart) and the port is configured for auto sensing, we try to sense the port's configuration in order to determine how to initialize the port. If the port type needs to be changed, all ULPs are unregistered and then registered again with the new port types. Sensing is repeated at random intervals of between 1 and 3 seconds.
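The 1-3 second sensing interval can be illustrated with a small userspace sketch. It mirrors the shape of mlx4_gen_sense_tout() in the patch, but TOY_HZ, the toy_* names and the use of rand() in place of the kernel's random32() are assumptions made so that it compiles outside the kernel.

```c
#include <stdlib.h>

/* Toy tick rate: 1000 ticks per second (assumption; the kernel uses HZ). */
#define TOY_HZ              1000UL
#define TOY_MIN_SENSE_RANGE TOY_HZ          /* 1 second */
#define TOY_MAX_SENSE_RANGE (TOY_HZ * 3)    /* 3 seconds */

/*
 * Pick a delay, in ticks, in the half-open range
 * [TOY_MIN_SENSE_RANGE, TOY_MAX_SENSE_RANGE).
 */
static unsigned long toy_gen_sense_timeout(void)
{
	return TOY_MIN_SENSE_RANGE +
	       (unsigned long)rand() % (TOY_MAX_SENSE_RANGE - TOY_MIN_SENSE_RANGE);
}
```

Randomizing the delay presumably keeps the polls of several ports or devices from lining up; the commit message itself does not state the reason.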
Signed-off-by: Yevgeny Petrilin --- drivers/net/mlx4/Makefile | 2 +- drivers/net/mlx4/eq.c | 16 +++-- drivers/net/mlx4/intf.c | 4 + drivers/net/mlx4/main.c | 97 +++++++++++++++++++++++----- drivers/net/mlx4/mlx4.h | 24 +++++++ drivers/net/mlx4/sense.c | 148 +++++++++++++++++++++++++++++++++++++++++++ include/linux/mlx4/cmd.h | 1 + include/linux/mlx4/device.h | 6 +- 8 files changed, 272 insertions(+), 26 deletions(-) create mode 100644 drivers/net/mlx4/sense.c diff --git a/drivers/net/mlx4/Makefile b/drivers/net/mlx4/Makefile index f4932d8..3f71687 100644 --- a/drivers/net/mlx4/Makefile +++ b/drivers/net/mlx4/Makefile @@ -1,4 +1,4 @@ obj-$(CONFIG_MLX4_CORE) += mlx4_core.o mlx4_core-y := alloc.o catas.o cmd.o cq.o eq.o fw.o icm.o intf.o main.o mcg.o \ - mr.o pd.o profile.o qp.o reset.o srq.o port.o + mr.o pd.o profile.o qp.o reset.o srq.o port.o sense.o diff --git a/drivers/net/mlx4/eq.c b/drivers/net/mlx4/eq.c index b436234..bd3ce60 100644 --- a/drivers/net/mlx4/eq.c +++ b/drivers/net/mlx4/eq.c @@ -163,6 +163,7 @@ static int mlx4_eq_int(struct mlx4_dev *dev, struct mlx4_eq *eq) int cqn; int eqes_found = 0; int set_ci = 0; + int port; while ((eqe = next_eqe_sw(eq))) { /* @@ -203,11 +204,16 @@ static int mlx4_eq_int(struct mlx4_dev *dev, struct mlx4_eq *eq) break; case MLX4_EVENT_TYPE_PORT_CHANGE: - mlx4_dispatch_event(dev, - eqe->subtype == MLX4_PORT_CHANGE_SUBTYPE_ACTIVE ? 
- MLX4_DEV_EVENT_PORT_UP : - MLX4_DEV_EVENT_PORT_DOWN, - be32_to_cpu(eqe->event.port_change.port) >> 28); + port = be32_to_cpu(eqe->event.port_change.port) >> 28; + if (eqe->subtype == MLX4_PORT_CHANGE_SUBTYPE_DOWN) { + mlx4_dispatch_event(dev, MLX4_DEV_EVENT_PORT_DOWN, + port); + mlx4_priv(dev)->sense.do_sense_port[port] = 1; + } else { + mlx4_dispatch_event(dev, MLX4_DEV_EVENT_PORT_UP, + port); + mlx4_priv(dev)->sense.do_sense_port[port] = 0; + } break; case MLX4_EVENT_TYPE_CQ_ERROR: diff --git a/drivers/net/mlx4/intf.c b/drivers/net/mlx4/intf.c index 0e7eb10..30ef000 100644 --- a/drivers/net/mlx4/intf.c +++ b/drivers/net/mlx4/intf.c @@ -141,6 +141,8 @@ int mlx4_register_device(struct mlx4_dev *dev) mutex_unlock(&intf_mutex); mlx4_start_catas_poll(dev); + mlx4_start_sense(dev); + return 0; } @@ -149,6 +151,8 @@ void mlx4_unregister_device(struct mlx4_dev *dev) struct mlx4_priv *priv = mlx4_priv(dev); struct mlx4_interface *intf; + mlx4_stop_sense(dev); + mlx4_stop_catas_poll(dev); mutex_lock(&intf_mutex); diff --git a/drivers/net/mlx4/main.c b/drivers/net/mlx4/main.c index 4f6ec90..92784ad 100644 --- a/drivers/net/mlx4/main.c +++ b/drivers/net/mlx4/main.c @@ -103,18 +103,21 @@ module_param_array_named(port_type, port_type_arr, charp, NULL, 0444); MODULE_PARM_DESC(port_type, "Ports L2 type (ib/eth/auto, entry per port, " "comma seperated, default ib for all)"); -static int mlx4_check_port_params(struct mlx4_dev *dev, - enum mlx4_port_type *port_type) +int mlx4_check_port_params(struct mlx4_dev *dev, + enum mlx4_port_type *port_type) { - if (port_type[0] != port_type[1] && + if ((port_type[0] != port_type[1] || + port_type[0] == MLX4_PORT_TYPE_AUTO) && !(dev->caps.flags & MLX4_DEV_CAP_FLAG_DPDP)) { mlx4_err(dev, "Only same port types supported " "on this HCA, aborting.\n"); return -EINVAL; } - if ((port_type[0] == MLX4_PORT_TYPE_ETH) && - (port_type[1] == MLX4_PORT_TYPE_IB)) { - mlx4_err(dev, "eth-ib configuration is not supported.\n"); + if ((port_type[0] == 
MLX4_PORT_TYPE_ETH && + port_type[1] != MLX4_PORT_TYPE_ETH) || + (port_type[1] == MLX4_PORT_TYPE_IB && + port_type[0] != MLX4_PORT_TYPE_IB)) { + mlx4_err(dev, "Given port configuration is not supported.\n"); return -EINVAL; } return 0; @@ -128,8 +131,10 @@ static void mlx4_str2port_type(char **port_str, for (i = 0; i < MLX4_MAX_PORTS; i++) { if (!strcmp(port_str[i], "eth")) port_type[i] = MLX4_PORT_TYPE_ETH; - else + else if (!strcmp(port_str[i], "ib")) port_type[i] = MLX4_PORT_TYPE_IB; + else + port_type[i] = MLX4_PORT_TYPE_AUTO; } } @@ -228,11 +233,17 @@ static int mlx4_dev_cap(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) mlx4_warn(dev, "FW doesn't support Multi Protocol, " "loading IB only\n"); dev->caps.port_type[i] = MLX4_PORT_TYPE_IB; + dev->caps.possible_type[i] = MLX4_PORT_TYPE_IB; continue; } - if (port_type[i-1] & dev_cap->supported_port_types[i]) - dev->caps.port_type[i] = port_type[i-1]; - else { + mlx4_priv(dev)->sense.sense_allowed[i] = + (dev_cap->supported_port_types[i] >> 1) & 1; + if (port_type[i-1] & dev_cap->supported_port_types[i]) { + dev->caps.possible_type[i] = port_type[i-1]; + dev->caps.port_type[i] = + (port_type[i-1] == MLX4_PORT_TYPE_ETH) ? + port_type[i-1] : MLX4_PORT_TYPE_IB; + } else { mlx4_err(dev, "Requested port type for port %d " "not supported by HW\n", i); return -ENODEV; @@ -270,7 +281,7 @@ static int mlx4_dev_cap(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) /* Changes the port configuration of the device. 
* Every user of this function must hold the port lock */ -static int mlx4_change_port_types(struct mlx4_dev *dev, +int mlx4_change_port_types(struct mlx4_dev *dev, enum mlx4_port_type *port_types) { int err = 0; @@ -281,10 +292,13 @@ static int mlx4_change_port_types(struct mlx4_dev *dev, if (port_types[port] != dev->caps.port_type[port + 1]) { change = 1; dev->caps.port_type[port + 1] = port_types[port]; + if (dev->caps.possible_type[port + 1] != MLX4_PORT_TYPE_AUTO) + dev->caps.possible_type[port + 1] = port_types[port]; } } if (change) { mlx4_unregister_device(dev); + flush_workqueue(mlx4_priv(dev)->sense.sense_wq); for (port = 1; port <= dev->caps.num_ports; port++) { mlx4_CLOSE_PORT(dev, port); err = mlx4_SET_PORT(dev, port); @@ -308,9 +322,15 @@ static ssize_t show_port_type(struct device *dev, struct mlx4_port_info *info = container_of(attr, struct mlx4_port_info, port_attr); struct mlx4_dev *mdev = info->dev; + char type[8]; - sprintf(buf, "%s\n", (mdev->caps.port_type[info->port] == MLX4_PORT_TYPE_IB) ? + sprintf(type, "%s", (mdev->caps.port_type[info->port] == MLX4_PORT_TYPE_IB) ? 
"ib" : "eth"); + if (mdev->caps.possible_type[info->port] == MLX4_PORT_TYPE_AUTO) + sprintf(buf, "auto (%s)\n", type); + else + sprintf(buf, "%s\n", type); + return strlen(buf); } @@ -330,6 +350,8 @@ static ssize_t set_port_type(struct device *dev, info->tmp_type = MLX4_PORT_TYPE_IB; else if (!strcmp(buf, "eth\n")) info->tmp_type = MLX4_PORT_TYPE_ETH; + else if (!strcmp(buf, "auto\n")) + info->tmp_type = MLX4_PORT_TYPE_AUTO; else { mlx4_err(mdev, "%s is not supported port type\n", buf); return -EINVAL; @@ -344,8 +366,13 @@ static ssize_t set_port_type(struct device *dev, if (err) goto out; - for (i = 1; i <= mdev->caps.num_ports; i++) - priv->port[i].tmp_type = 0; + for (i = 0; i < mdev->caps.num_ports; i++) { + mdev->caps.possible_type[i + 1] = types[i]; + if (types[i] == MLX4_PORT_TYPE_AUTO) + types[i] = mdev->caps.port_type[i + 1]; + + priv->port[i + 1].tmp_type = 0; + } err = mlx4_change_port_types(mdev, types); @@ -970,6 +997,31 @@ static void mlx4_cleanup_port_info(struct mlx4_port_info *info) device_remove_file(&info->dev->pdev->dev, &info->port_attr); } +static void mlx4_set_actual_type(struct mlx4_dev *dev) +{ + enum mlx4_port_type stype[dev->caps.num_ports]; + int i; + + if (!(dev->caps.flags & MLX4_DEV_CAP_FLAG_DPDP)) + return; + + for (i = 1; i <= dev->caps.num_ports; i++) { + stype[i-1] = 0; + if (mlx4_priv(dev)->sense.sense_allowed[i] && + dev->caps.possible_type[i] == MLX4_PORT_TYPE_AUTO) { + if (mlx4_SENSE_PORT(dev, i, &stype[i-1])) + return; + } + if (!stype[i-1]) + stype[i-1] = dev->caps.port_type[i]; + } + + if (!mlx4_check_port_params(dev, stype)) { + for (i = 1; i <= dev->caps.num_ports; i++) + dev->caps.port_type[i] = stype[i-1]; + } +} + static int __mlx4_init_one(struct pci_dev *pdev, const struct pci_device_id *id) { struct mlx4_priv *priv; @@ -1093,14 +1145,23 @@ static int __mlx4_init_one(struct pci_dev *pdev, const struct pci_device_id *id) goto err_port; } - err = mlx4_register_device(dev); + mlx4_set_actual_type(dev); + + err = 
mlx4_sense_init(dev); if (err) goto err_port; + err = mlx4_register_device(dev); + if (err) + goto err_sense; + pci_set_drvdata(pdev, dev); return 0; +err_sense: + mlx4_sense_cleanup(dev); + err_port: for (port = 1; port <= dev->caps.num_ports; port++) mlx4_cleanup_port_info(&priv->port[port]); @@ -1160,12 +1221,11 @@ static void mlx4_remove_one(struct pci_dev *pdev) if (dev) { mlx4_unregister_device(dev); - + mlx4_sense_cleanup(dev); for (p = 1; p <= dev->caps.num_ports; p++) { mlx4_cleanup_port_info(&priv->port[p]); mlx4_CLOSE_PORT(dev, p); } - mlx4_cleanup_mcg_table(dev); mlx4_cleanup_qp_table(dev); mlx4_cleanup_srq_table(dev); @@ -1222,7 +1282,8 @@ static int __init mlx4_verify_params(void) for (i = 0; i < MLX4_MAX_PORTS; ++i) { if (strcmp(port_type_arr[i], "eth") && - strcmp(port_type_arr[i], "ib")) { + strcmp(port_type_arr[i], "ib") && + strcmp(port_type_arr[i], "auto")) { printk(KERN_WARNING "mlx4_core: bad port_type for " "port %d: %s\n", i, port_type_arr[i]); return -1; diff --git a/drivers/net/mlx4/mlx4.h b/drivers/net/mlx4/mlx4.h index 4c18e36..be3496b 100644 --- a/drivers/net/mlx4/mlx4.h +++ b/drivers/net/mlx4/mlx4.h @@ -40,6 +40,8 @@ #include #include #include +#include +#include #include #include @@ -287,6 +289,14 @@ struct mlx4_port_info { struct mlx4_vlan_table vlan_table; }; +struct mlx4_sense { + struct mlx4_dev *dev; + u8 do_sense_port[MLX4_MAX_PORTS + 1]; + u8 sense_allowed[MLX4_MAX_PORTS + 1]; + struct delayed_work sense_poll; + struct workqueue_struct *sense_wq; +}; + struct mlx4_priv { struct mlx4_dev dev; @@ -316,6 +326,7 @@ struct mlx4_priv { struct mlx4_uar driver_uar; void __iomem *kar; struct mlx4_port_info port[MLX4_MAX_PORTS + 1]; + struct mlx4_sense sense; spinlock_t port_lock; }; @@ -324,6 +335,9 @@ static inline struct mlx4_priv *mlx4_priv(struct mlx4_dev *dev) return container_of(dev, struct mlx4_priv, dev); } +#define MIN_SENCE_RANGE HZ +#define MAX_SENCE_RANGE (HZ * 3) + u32 mlx4_bitmap_alloc(struct mlx4_bitmap *bitmap); void 
mlx4_bitmap_free(struct mlx4_bitmap *bitmap, u32 obj); u32 mlx4_bitmap_alloc_range(struct mlx4_bitmap *bitmap, int cnt, int align); @@ -387,6 +401,16 @@ void mlx4_srq_event(struct mlx4_dev *dev, u32 srqn, int event_type); void mlx4_handle_catas_err(struct mlx4_dev *dev); +void mlx4_start_sense(struct mlx4_dev *dev); +void mlx4_stop_sense(struct mlx4_dev *dev); +int mlx4_sense_init(struct mlx4_dev *dev); +void mlx4_sense_cleanup(struct mlx4_dev *dev); +int mlx4_SENSE_PORT(struct mlx4_dev *dev, int port, enum mlx4_port_type *type); +int mlx4_check_port_params(struct mlx4_dev *dev, + enum mlx4_port_type *port_type); +int mlx4_change_port_types(struct mlx4_dev *dev, + enum mlx4_port_type *port_types); + void mlx4_init_mac_table(struct mlx4_dev *dev, struct mlx4_mac_table *table); void mlx4_init_vlan_table(struct mlx4_dev *dev, struct mlx4_vlan_table *table); diff --git a/drivers/net/mlx4/sense.c b/drivers/net/mlx4/sense.c new file mode 100644 index 0000000..999dcce --- /dev/null +++ b/drivers/net/mlx4/sense.c @@ -0,0 +1,148 @@ +/* + * Copyright (c) 2007 Mellanox Technologies. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. 
+ * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + */ + +#include +#include + +#include + +#include "mlx4.h" + +static inline unsigned long mlx4_gen_sense_tout(void) +{ + return MIN_SENCE_RANGE + random32() % + (MAX_SENCE_RANGE - MIN_SENCE_RANGE); +} + +int mlx4_SENSE_PORT(struct mlx4_dev *dev, int port, enum mlx4_port_type *type) +{ + u64 out_param; + int err = 0; + + err = mlx4_cmd_imm(dev, 0, &out_param, port, 0, + MLX4_CMD_SENSE_PORT, MLX4_CMD_TIME_CLASS_B); + if (err) { + mlx4_err(dev, "Sense command failed for port: %d\n", port); + return err; + } + + if (out_param > 2) { + mlx4_err(dev, "Sense returned illegal value: 0x%llx\n", out_param); + return EINVAL; + } + + *type = out_param; + return 0; +} + +static void mlx4_sense_port(struct work_struct *work) +{ + struct delayed_work *delay = container_of(work, struct delayed_work, work); + struct mlx4_sense *sense = container_of(delay, struct mlx4_sense, + sense_poll); + struct mlx4_dev *dev = sense->dev; + struct mlx4_priv *priv = mlx4_priv(dev); + enum mlx4_port_type stype[dev->caps.num_ports]; + int err = 0; + int i; + + spin_lock(&priv->port_lock); + for (i = 1; i <= dev->caps.num_ports; i++) { + stype[i-1] = 0; + if (sense->do_sense_port[i] && sense->sense_allowed[i] && + dev->caps.possible_type[i] == MLX4_PORT_TYPE_AUTO) { + err = mlx4_SENSE_PORT(dev, i, &stype[i-1]); + if (err) + goto sense_again; + } + if (!stype[i-1]) + stype[i-1] = dev->caps.port_type[i]; + } + + if (mlx4_check_port_params(dev, stype)) + goto sense_again; + + if (mlx4_change_port_types(dev, 
stype)) + mlx4_err(dev, "Failed to change port_types\n"); + +sense_again: + spin_unlock(&priv->port_lock); + queue_delayed_work(sense->sense_wq , &sense->sense_poll, + mlx4_gen_sense_tout()); +} + + +void mlx4_start_sense(struct mlx4_dev *dev) +{ + struct mlx4_priv *priv = mlx4_priv(dev); + struct mlx4_sense *sense = &priv->sense; + + if (!(dev->caps.flags & MLX4_DEV_CAP_FLAG_DPDP)) + return; + + queue_delayed_work(sense->sense_wq , &sense->sense_poll, + mlx4_gen_sense_tout()); +} + + +void mlx4_stop_sense(struct mlx4_dev *dev) +{ + cancel_delayed_work(&mlx4_priv(dev)->sense.sense_poll); +} + +int mlx4_sense_init(struct mlx4_dev *dev) +{ + struct mlx4_priv *priv = mlx4_priv(dev); + struct mlx4_sense *sense = &priv->sense; + int port; + + sense->dev = dev; + sense->sense_wq = create_singlethread_workqueue("mlx4_sense"); + if (!sense->sense_wq) + return -ENOMEM; + + for (port = 1; port <= dev->caps.num_ports; port++) + sense->do_sense_port[port] = 1; + + INIT_DELAYED_WORK(&sense->sense_poll, mlx4_sense_port); + + return 0; +} + +void mlx4_sense_cleanup(struct mlx4_dev *dev) +{ + mlx4_stop_sense(dev); + flush_workqueue(mlx4_priv(dev)->sense.sense_wq); + destroy_workqueue(mlx4_priv(dev)->sense.sense_wq); +} + diff --git a/include/linux/mlx4/cmd.h b/include/linux/mlx4/cmd.h index cf9c679..0f82293 100644 --- a/include/linux/mlx4/cmd.h +++ b/include/linux/mlx4/cmd.h @@ -55,6 +55,7 @@ enum { MLX4_CMD_CLOSE_PORT = 0xa, MLX4_CMD_QUERY_HCA = 0xb, MLX4_CMD_QUERY_PORT = 0x43, + MLX4_CMD_SENSE_PORT = 0x4d, MLX4_CMD_SET_PORT = 0xc, MLX4_CMD_ACCESS_DDR = 0x2e, MLX4_CMD_MAP_ICM = 0xffa, diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h index 3496a27..9e2fd41 100644 --- a/include/linux/mlx4/device.h +++ b/include/linux/mlx4/device.h @@ -151,8 +151,9 @@ enum qp_region { }; enum mlx4_port_type { - MLX4_PORT_TYPE_IB = 1 << 0, - MLX4_PORT_TYPE_ETH = 1 << 1, + MLX4_PORT_TYPE_IB = 1, + MLX4_PORT_TYPE_ETH = 2, + MLX4_PORT_TYPE_AUTO = 3 }; enum { @@ -226,6 +227,7 @@ 
struct mlx4_caps { int log_num_vlans; int log_num_prios; enum mlx4_port_type port_type[MLX4_MAX_PORTS + 1]; + enum mlx4_port_type possible_type[MLX4_MAX_PORTS + 1]; int reserved_fexch_mpts_base; int num_fexch_mpts; }; -- 1.5.4 From mashirle at us.ibm.com Wed Aug 20 07:09:46 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Wed, 20 Aug 2008 07:09:46 -0700 Subject: [ofa-general] [PATCH] libibverbs: Replace eieio with sync for PPC wmb() In-Reply-To: <1216764410.31058.11.camel@IBM-29AB850785D.beaverton.ibm.com> References: <1216764410.31058.11.camel@IBM-29AB850785D.beaverton.ibm.com> Message-ID: <1219241386.27003.0.camel@IBM-29AB850785D.beaverton.ibm.com>

Roland,

Have you had time to review the patch below for libibverbs yet?

Thanks
Shirley

On Tue, 2008-07-22 at 15:06 -0700, Shirley Ma wrote:
> Hello Roland,
>
> We have found that wmb() for PPC was incorrectly defined as the eieio
> instruction in libibverbs. The eieio instruction applies only within pure
> I/O memory or within pure system memory. In the situation where the device
> drivers use the d_map kernel services to share a portion of system
> memory with an I/O adapter, we need to use sync() instead. See the link
> below for reference.
> > http://www.ibm.com/developerworks/eserver/articles/powerpc.html > > Signed-off-by: Shirley Ma > > ------- > include/infiniband/arch.h | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/include/infiniband/arch.h b/include/infiniband/arch.h > index 6931bfc..d3e356f 100644 > --- a/include/infiniband/arch.h > +++ b/include/infiniband/arch.h > @@ -98,7 +98,7 @@ static inline uint64_t ntohll(uint64_t x) { return x; } > > #define mb() asm volatile("sync" ::: "memory") > #define rmb() mb() > -#define wmb() asm volatile("eieio" ::: "memory") > +#define wmb() mb() > #define wc_wmb() wmb() > > #elif defined(__sparc_v9__) From kliteyn at dev.mellanox.co.il Wed Aug 20 07:58:42 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Wed, 20 Aug 2008 17:58:42 +0300 Subject: [ofa-general] [PATCH] opensm/osm_state_mgr.c: fixing some typos Message-ID: <48AC3122.3090803@dev.mellanox.co.il> Cosmetics - fixing some typos in comments Signed-off-by: Yevgeny Kliteynik --- opensm/opensm/osm_state_mgr.c | 6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c index 287f015..b4eb87b 100644 --- a/opensm/opensm/osm_state_mgr.c +++ b/opensm/opensm/osm_state_mgr.c @@ -1026,7 +1026,7 @@ static void do_sweep(osm_sm_t * sm) /* * Need to force re-write of sm_base_lid to all ports * to do that we want all the ports to be considered - * foriegn + * foreign */ __osm_state_mgr_clean_known_lids(sm); @@ -1219,7 +1219,7 @@ _repeat_discovery: /* At this point we need to check the consistency of * the port_lid_tbl under the subnet.
There might be - * errors in it if PortInfo Set reqeusts didn't reach + * errors in it if PortInfo Set requests didn't reach * their destination. */ __osm_state_mgr_check_tbl_consistency(sm); @@ -1250,7 +1250,7 @@ _repeat_discovery: } /* - * The LINK_PORTS state is required since we can not count on + * The LINK_PORTS state is required since we cannot count on * the port state change MADs to succeed. This is an artifact * of the spec defining state change from state X to state X * as an error. The hardware then is not required to process -- 1.5.1.4 From yossi.openib at gmail.com Wed Aug 20 09:20:45 2008 From: yossi.openib at gmail.com (Yossi Etigin) Date: Wed, 20 Aug 2008 19:20:45 +0300 Subject: [ofa-general] ***SPAM*** [PATCH] libsdp: enable fallback to TCP for nonblocking sockets Message-ID: <48AC445D.2050704@gmail.com> Enable falling back to tcp even if the socket is nonblocking by doing the sdp connect() in blocking mode. This way, we know if sdp connection fails and can retry with tcp. 
Signed-off-by: Yossi Etigin --- src/port.c | 38 +++++++++++++------------------------- 1 file changed, 13 insertions(+), 25 deletions(-) Index: b/src/port.c =================================================================== --- a/src/port.c 2008-08-18 22:32:33.000000000 +0300 +++ b/src/port.c 2008-08-20 19:09:43.000000000 +0300 @@ -1423,34 +1423,10 @@ connect( program_invocation_short_name, fd, shadow_fd, serv_sin->sin_family, buf, ntohs( serv_sin->sin_port ) ); - fopts = _socket_funcs.fcntl(fd, F_GETFL); - __sdp_log( 1, "CONNECT: fd <%d> opts are <0x%x>\n", - fd, fopts); /* obtain the target address family */ target_family = __sdp_match_connect( serv_addr, addrlen ); - if ( ( fopts & O_NONBLOCK ) && - ( target_family == USE_BOTH ) && - ( shadow_fd != -1 )) { - static int print_once = 1; - - if ( print_once ) { - print_once = 0; - __sdp_log( 9, "CONNECT: libsdp does not support async connect in BOTH, moving to SDP only\n"); - } - - target_family = USE_SDP; - dup_ret = replace_fd_with_its_shadow( fd ); - if ( dup_ret < 0 ) { - __sdp_log( 9, "Error connect: " - "failed to dup2 shadow into orig fd:%d\n", fd ); - ret = dup_ret; - goto done; - } - shadow_fd = -1; - } - /* if we do not have a shadow - just do the work */ if ( shadow_fd == -1 ) { if ( get_is_sdp_socket( fd ) ) { @@ -1498,6 +1474,15 @@ connect( #endif __sdp_log( 1, "CONNECT: connecting SDP fd:%d\n", shadow_fd ); + + /* make the socket blocking on shadow SDP */ + fopts = _socket_funcs.fcntl(shadow_fd, F_GETFL); + if ( ( target_family == USE_BOTH ) && ( fopts & O_NONBLOCK ) ) { + __sdp_log( 1, "CONNECT: shadow_fd <%d> will be blocking during connect\n", + shadow_fd); + _socket_funcs.fcntl(shadow_fd, F_SETFL, fopts & (~O_NONBLOCK)); + } + ret = _socket_funcs.connect( shadow_fd, ( struct sockaddr * )&sdp_addr, sizeof sdp_addr ); @@ -1509,6 +1494,9 @@ connect( __sdp_log( 7, "CONNECT: connected SDP fd:%d to:%s port %d\n", fd, buf, ntohs( serv_sin->sin_port ) ); } + + /* restore socket options */ + 
_socket_funcs.fcntl(shadow_fd, F_SETFL, fopts); } /* if target is SDP or we succeeded we need to dup SDP fd into TCP fd */ @@ -1537,7 +1525,7 @@ connect( __sdp_log( 7, "CONNECT: connected TCP fd:%d to:%s port %d\n", fd, buf, ntohs( serv_sin->sin_port ) ); - if ( ( target_family == USE_TCP ) || ( ret >= 0 ) ) { + if ( ( target_family == USE_TCP ) || ( ret >= 0 ) || (errno == EINPROGRESS) ) { if ( cleanup_shadow( fd ) < 0 ) __sdp_log( 9, "Error connect: failed to cleanup shadow for fd:%d\n", fd ); From rdreier at cisco.com Wed Aug 20 09:36:36 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 20 Aug 2008 09:36:36 -0700 Subject: [ofa-general] Re: [PATCH] libibverbs: Replace eieio with sync for PPC wmb() In-Reply-To: <1216764410.31058.11.camel@IBM-29AB850785D.beaverton.ibm.com> (Shirley Ma's message of "Tue, 22 Jul 2008 15:06:50 -0700") References: <1216764410.31058.11.camel@IBM-29AB850785D.beaverton.ibm.com> Message-ID: Sorry, I lost this initially. Anyway, applied. From yossi.openib at gmail.com Wed Aug 20 09:51:24 2008 From: yossi.openib at gmail.com (Yossi Etigin) Date: Wed, 20 Aug 2008 19:51:24 +0300 Subject: [ofa-general] ***SPAM*** [PATCH] libsdp: write fcntl argument in debug prints Message-ID: <48AC4B8C.2020909@gmail.com> Log the actual argument, instead of 0. 
Signed-off-by: Yossi Etigin --- src/port.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) Index: b/src/port.c =================================================================== --- a/src/port.c 2008-08-18 22:31:24.000000000 +0300 +++ b/src/port.c 2008-08-18 22:31:55.000000000 +0300 @@ -745,16 +745,16 @@ fcntl( shadow_fd = get_shadow_fd_by_fd( fd ); - __sdp_log( 2, "FCNTL: <%s:%d:%d> command <%d> argument <%d>\n", - program_invocation_short_name, fd, shadow_fd, cmd, 0 ); + __sdp_log( 2, "FCNTL: <%s:%d:%d> command <%d> argument <%p>\n", + program_invocation_short_name, fd, shadow_fd, cmd, arg ); ret = _socket_funcs.fcntl( fd, cmd, arg ); if ( ( ret >= 0 ) && ( -1 != shadow_fd ) ) { sret = _socket_funcs.fcntl( shadow_fd, cmd, arg ); if ( sret < 0 ) { __sdp_log( 9, "Error fcntl:" - " <%d> calling fcntl(%d, %d, %x) for SDP socket. Closing it.\n", - shadow_fd, cmd, 0, errno ); + " <%d> calling fcntl(%d, %d, %p) for SDP socket. Closing it.\n", + shadow_fd, cmd, arg, errno ); cleanup_shadow( fd ); } } From changquing.tang at hp.com Wed Aug 20 10:42:15 2008 From: changquing.tang at hp.com (Tang, Changqing) Date: Wed, 20 Aug 2008 17:42:15 +0000 Subject: [ofa-general] uDAPL private data size issue Message-ID: <58C6777539C300489D145B0F8E29C32815F87601C2@GVW0673EXC.americas.hpqcorp.net> Hi, uDAPL Developers: I have a system running uDAPL 2.0, the dat_ia_query() return the provider attributes, the 'max_private_data_size' is only 48. The standard 2.0 says that private data size is at least 64 bytes: Is there any way to tune private data size to 128 bytes, or at least 64 bytes ? Thanks. 
CQ Tang, HP-MPI team From sashak at voltaire.com Wed Aug 20 10:43:25 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Wed, 20 Aug 2008 20:43:25 +0300 Subject: [ofa-general] Re: [PATCH] opensm/osm_state_mgr.c: fixing some typos In-Reply-To: <48AC3122.3090803@dev.mellanox.co.il> References: <48AC3122.3090803@dev.mellanox.co.il> Message-ID: <20080820174325.GD29440@sashak.voltaire.com> On 17:58 Wed 20 Aug , Yevgeny Kliteynik wrote: > Cosmetics - fixing some typos in comments > > Signed-off-by: Yevgeny Kliteynik Applied. Thanks. Sasha From truelove at array.ca Wed Aug 20 11:38:13 2008 From: truelove at array.ca (Steven Truelove) Date: Wed, 20 Aug 2008 14:38:13 -0400 Subject: [ofa-general] ConnectX IB HCA with Ubuntu 8.04 Message-ID: <48AC6495.1040807@array.ca> Hi, I am trying to get Infiniband up and running on a Ubuntu 8.04 system. I can load the modules and see plenty of infiniband content under /sys/class, but when I try to run ibv_devices, I get this error: libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs0 Here is a listing from lspci: 07:00.0 InfiniBand: Mellanox Technologies MT25418 [ConnectX IB DDR, PCIe 2.0 2.5GT/s] (rev a0) And from dmesg: [ 107.496097] mlx4_core: Mellanox ConnectX core driver v0.01 (May 1, 2007) [ 107.496101] mlx4_core: Initializing 0000:07:00.0 ls -al from /sys/class/infini* : /sys/class/infiniband: total 0 drwxr-xr-x 3 root root 0 2008-08-20 14:25 . drwxr-xr-x 31 root root 0 2008-08-20 14:26 .. drwxr-xr-x 3 root root 0 2008-08-20 14:25 mlx4_0 /sys/class/infiniband_cm: total 0 drwxr-xr-x 3 root root 0 2008-08-20 14:26 . drwxr-xr-x 31 root root 0 2008-08-20 14:26 .. -r--r--r-- 1 root root 4096 2008-08-20 14:35 abi_version drwxr-xr-x 2 root root 0 2008-08-20 14:26 ucm0 /sys/class/infiniband_mad: total 0 drwxr-xr-x 6 root root 0 2008-08-20 14:26 . drwxr-xr-x 31 root root 0 2008-08-20 14:26 .. 
-r--r--r-- 1 root root 4096 2008-08-20 14:31 abi_version drwxr-xr-x 2 root root 0 2008-08-20 14:26 issm0 drwxr-xr-x 2 root root 0 2008-08-20 14:26 issm1 drwxr-xr-x 2 root root 0 2008-08-20 14:26 umad0 drwxr-xr-x 2 root root 0 2008-08-20 14:26 umad1 /sys/class/infiniband_verbs: total 0 drwxr-xr-x 3 root root 0 2008-08-20 14:26 . drwxr-xr-x 31 root root 0 2008-08-20 14:26 .. -r--r--r-- 1 root root 4096 2008-08-20 14:26 abi_version drwxr-xr-x 2 root root 0 2008-08-20 14:26 uverbs0 root at msh-new:/sys/class/infiniband_verbs/uverbs0# cat /sys/class/infiniband/mlx4_0/board_id SM_1021000001 root at msh-new:/sys/class/infiniband_verbs/uverbs0# cat /sys/class/infiniband/mlx4_0/hca_type MT25418 root at msh-new:/sys/class/infiniband_verbs/uverbs0# cat /sys/class/infiniband/mlx4_0/hw_rev 0 root at msh-new:/sys/class/infiniband_verbs/uverbs0# cat /sys/class/infiniband/mlx4_0/fw_ver 2.3.0 Assistance would be very much appreciated. Let me know if there is more information I should provide. Thanks, Steven Truelove From changquing.tang at hp.com Wed Aug 20 11:52:42 2008 From: changquing.tang at hp.com (Tang, Changqing) Date: Wed, 20 Aug 2008 18:52:42 +0000 Subject: [ofa-general] uDAPL private data size issue In-Reply-To: References: <58C6777539C300489D145B0F8E29C32815F87601C2@GVW0673EXC.americas.hpqcorp.net> Message-ID: <58C6777539C300489D145B0F8E29C32815F876028A@GVW0673EXC.americas.hpqcorp.net> How to switch to scm ? 
Here is my /etc/dat.conf and libdapl*: mpixbl09:/usr/lib64:cat /etc/dat.conf # # DAT 1.2 and 2.0 configuration file # # Each entry should have the following fields: # # \ # # # For the uDAPL cma provder, specify as one of the following: # network address, network hostname, or netdev name and 0 for port # # Simple (OpenIB-cma) default with netdev name provided first on list # to enable use of same dat.conf version on all nodes # # 1.2 and 2.0 examples for multiple interfaces, IPoIB HA failover, bonding: # OpenIB-cma u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "ib0 0" "" OpenIB-cma-1 u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "ib1 0" "" OpenIB-cma-2 u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "ib2 0" "" OpenIB-cma-3 u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "ib3 0" "" OpenIB-bond u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "bond0 0" "" ofa-v2-ib0 u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "ib0 0" "" ofa-v2-ib1 u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "ib1 0" "" ofa-v2-ib2 u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "ib2 0" "" ofa-v2-ib3 u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "ib3 0" "" ofa-v2-bond u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "bond0 0" "" mpixbl09:/usr/lib64: mpixbl09:/usr/lib64:ls libdapl* libdaplcma.so libdaplcma.so.1.0.2 libdaplofa.so libdaplofa.so.2.0.0 libdaplcma.so.1 libdaplofa.a libdaplofa.so.2 mpixbl09:/usr/lib64: > -----Original Message----- > From: Davis, Arlin R [mailto:arlin.r.davis at intel.com] > Sent: Wednesday, August 20, 2008 1:31 PM > To: Tang, Changqing; general at lists.openfabrics.org > Cc: Sean Hefty > Subject: RE: [ofa-general] uDAPL private data size issue > > > >Hi, uDAPL Developers: > > > > I have a system running uDAPL 2.0, the dat_ia_query() return > >the provider attributes, the 'max_private_data_size' is only 48. 
> > > > The standard 2.0 says that private data size is at least 64 > >bytes: > > > > Is there any way to tune private data size to 128 > bytes, or at > >least 64 bytes ? > > It actually should report 56 bytes.. > > IB cm supports 92 bytes on requests but rdma_cm steals 36 bytes as > follow: > > union cma_ip_addr { > struct in6_addr ip6; > struct { > __u32 pad[3]; > __u32 addr; > } ip4; > }; > > struct cma_hdr { > u8 cma_version; > u8 ip_version; /* IP version: 7:4 */ > __u16 port; > union cma_ip_addr src_addr; > union cma_ip_addr dst_addr; > }; > > I guess that makes the uDAPL cma provider non-compliant > unless there is a way for rdma_cm to give back some of IB CM > request private data area. > > Sean, is there anything that can be done here? > > CQ Tang, can you use the uDAPL scm provider instead of cma? > > -arlin > > From rdreier at cisco.com Wed Aug 20 12:21:54 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 20 Aug 2008 12:21:54 -0700 Subject: [ofa-general] ConnectX IB HCA with Ubuntu 8.04 In-Reply-To: <48AC6495.1040807@array.ca> (Steven Truelove's message of "Wed, 20 Aug 2008 14:38:13 -0400") References: <48AC6495.1040807@array.ca> Message-ID: > I am trying to get Infiniband up and running on a Ubuntu 8.04 > system. I can load the modules and see plenty of infiniband content > under /sys/class, but when I try to run ibv_devices, I get this error: > > libibverbs: Warning: no userspace device-specific driver found for > /sys/class/infiniband_verbs/uverbs0 That's because you need to install the device-specific userspace driver ;) Add my PPA to your software sources: deb http://ppa.launchpad.net/roland.dreier/ubuntu hardy main deb-src http://ppa.launchpad.net/roland.dreier/ubuntu hardy main and do "aptitude install libmlx4-1" and you should be all set. (the libmlx4 packages are also in the 8.10/Intrepid archive already). Let me know if you have any issues. - R. 
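
A note on the arithmetic in the uDAPL private data thread above (92 bytes of IB CM request private data, of which rdma_cm takes 36): the following is a minimal C sketch, not code from any of the libraries discussed. The struct layouts are copied from Arlin's message, with fixed-width userspace types standing in for the kernel ones. The 92-byte figure is simply the number quoted in that message, and REQ_PRIVATE_DATA_BYTES and cma_private_data_left() are hypothetical names made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Figure quoted in the thread: private data bytes in an IB CM request.
 * (Hypothetical macro name, used only in this sketch.) */
#define REQ_PRIVATE_DATA_BYTES 92

/* Userspace stand-in for the kernel's union cma_ip_addr. A 16-byte array
 * replaces struct in6_addr (assumption: an IPv6 address is 16 bytes). */
union cma_ip_addr {
    uint8_t ip6[16];
    struct {
        uint32_t pad[3];
        uint32_t addr;
    } ip4;
};

/* Header layout as quoted in the thread, with u8/__u16 mapped to stdint. */
struct cma_hdr {
    uint8_t  cma_version;
    uint8_t  ip_version;        /* IP version: 7:4 */
    uint16_t port;
    union cma_ip_addr src_addr;
    union cma_ip_addr dst_addr;
};

/* 1 + 1 + 2 + 16 + 16 = 36 bytes; no padding is expected on common ABIs. */
_Static_assert(sizeof(struct cma_hdr) == 36, "cma_hdr should be 36 bytes");

/* Hypothetical helper: bytes left for the consumer after the header. */
static inline size_t cma_private_data_left(void)
{
    return REQ_PRIVATE_DATA_BYTES - sizeof(struct cma_hdr);
}
```

Under these assumptions cma_private_data_left() evaluates to 56, which matches the "should report 56 bytes" remark. It does not explain the value of 48 that dat_ia_query() actually returned on CQ Tang's system.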
From cameron at harr.org Wed Aug 20 12:21:57 2008
From: cameron at harr.org (Cameron Harr)
Date: Wed, 20 Aug 2008 13:21:57 -0600
Subject: [ofa-general] Re: SRP f/s data corruption
In-Reply-To: <48AB58D1.7090904@harr.org>
References: <48AB58D1.7090904@harr.org>
Message-ID: <48AC6ED5.4020603@harr.org>

Well, after more testing, it turns out this only happens on my Fedora 8
boxes (not on CentOS) and on the flash drive I'm testing - but not on a
standard hard drive. Very interesting, but I'll move to CentOS on my
Fedora boxes.

Cameron

Cameron Harr wrote:
> Hello,
> I'm seeing data corruption on an SRP-exported device and I'm fishing
> for any suggestions. I've seen the corruption in several ways, but
> here's a really simple way to reproduce it:
>
> I format /dev/md0 with ext3 on a host (medusa) and export md0 via SRP.
> I mount it on the initiator (harpie), copy over a large file and
> verify that it's md5sum is the same as the original. Then I
> unmount/remount and see that the md5sum is different.
>
> [root at harpie ~]# mount /mnt/medusa/
> [root at harpie ~]# cp /usr/src/OFED-1.3.1.tgz /mnt/medusa/
> [root at harpie ~]# md5sum /usr/src/OFED-1.3.1.tgz
> 69fe510fc78a39b627713cfb49ad4ca3 /usr/src/OFED-1.3.1.tgz
> [root at harpie ~]# md5sum /mnt/medusa/OFED-1.3.1.tgz
> 69fe510fc78a39b627713cfb49ad4ca3 /mnt/medusa/OFED-1.3.1.tgz
> [root at harpie ~]# umount /mnt/medusa/
> [root at harpie ~]# mount /mnt/medusa/
> [root at harpie ~]# md5sum /mnt/medusa/OFED-1.3.1.tgz
> 5b761a931bf8fa7273cccc505ff13121 /mnt/medusa/OFED-1.3.1.tgz
>
> As a side note, right after I copy over the file and see that it has
> the correct md5sum, I can mount the same device read only on the
> target server and see the file, but it has a different md5sum.
>
> In searching, I saw this problem here and tried dropping scst_threads
> to 1, to no avail:
> http://osdir.com/ml/windows.devel.drivers.openib/2007-12/msg00050.html
>
> Ideas?
> Thanks,
> Cameron
>

From jeff at splitrockpr.com Wed Aug 20 12:53:31 2008
From: jeff at splitrockpr.com (Jeffrey Scott)
Date: Wed, 20 Aug 2008 12:53:31 -0700
Subject: [ofa-general] Don't miss the IBTA Technical Forum '08!
Message-ID:

Hello OFA Members.

We are rapidly approaching this year's IBTA Technical Forum; it's just
four weeks away! The theme this year is "InfiniBand and the Enterprise
Data Center" and the IBTA has put together a compelling agenda with
end-user presentations from General Motors, France Telecom and others,
as well as an analyst presentation from Gartner and an interactive panel
discussion on the future of InfiniBand. Please see below for more
information; register now to receive the early bird discount.

Date: Monday, September 15, 2008
Time: 8am - 5pm with networking reception immediately following
Location: Harrah's Las Vegas
Register: www.regonline.com/IBTATechForum08
Rate: Early bird rate is $249; after September 1 the rate increases to $299
Agenda: http://www.infinibandta.org/events/IBTATechForum08_

The IBTA needs your help spreading the word! The OFA is one of the
sponsors for the networking reception taking place immediately following
the technical forum. We would like to see the OFA well represented, and
we'd like your help in spreading the word about this event to your
colleagues/vendors/partners/customers.

We hope to see all of you in Las Vegas! If you have any questions,
please contact Samantha Spears at 206-322-1167 x115 or
samanthas at owenmedia.com.

-----------------------------------
Jeffrey Scott
Split Rock Communications
408-884-4017
408-348-3651 Mobile
408-884-3900 Fax
www.SplitRockPR.com

From Jeffrey.C.Becker at nasa.gov Wed Aug 20 17:48:20 2008
From: Jeffrey.C.Becker at nasa.gov (Jeff Becker)
Date: Wed, 20 Aug 2008 17:48:20 -0700
Subject: [ofa-general] STOP the onslaught of EWG spam
In-Reply-To:
References:
Message-ID: <48ACBB54.3010808@nasa.gov>

Hi all. I'm very sorry all the SPAM happened. Unfortunately, I left on
vacation just before it started, and was away from e-mail. I just got
back this afternoon. A big thank you to Jeff Squyres for stepping in.

I'll see about upgrading Mailman to the latest version on the current
OFA server. We are in the midst of transitioning to the new server
(which has a much more recent version of Mailman, and will be upgraded
on a timely basis). Unfortunately, I am very busy trying to get NFS-RDMA
backports ready for OFED 1.4, and as I believe this is a higher
priority, I don't really have much time to tend to the server
switchover. If anyone would like to help, I'd greatly appreciate it. In
fact, if someone else wants to replace me as admin (I've been doing it
over a year now), I wouldn't mind (although I'm perfectly happy to
continue - just much busier than when I started). Please let me know.

Thanks, and again, apologies for the SPAM.

-jeff

Jeff Squyres wrote:
> The EWG list has gotten spam bombed over the last few hours. I lost
> count at 500+ spams in my inbox.
>
> I therefore logged into openfabrics.org and changed the site-wide
> password for Mailman (I have notified Jeff Becker of the new
> password). I then changed the EWG list to silently discard all
> non-member posts. Since I didn't know if other OF lists were being
> spam-bombed, I did the same for all OF lists as well.
>
> The spam onslaught has now stopped.
>
> I also notice that our Mailman installation is hopelessly out of
> date; it's v2.1.5 and the current version (including several important
> security fixes since v2.1.5) is v2.1.11. Someone needs to fix this ASAP.
> From alevchuk at gmail.com Wed Aug 20 17:55:14 2008 From: alevchuk at gmail.com (Aleksandr Levchuk) Date: Wed, 20 Aug 2008 17:55:14 -0700 Subject: ***SPAM*** Re: [ofa-general] Problem with ConnectX HBA Message-ID: >> >>>>> "Tziporet" == Tziporet Koren writes: >> >> Tziporet> Roland Fehrenbacher wrote: >> >> Hi, >> >> >> >> when running MPI codes, we have the following error messages >> >> coming from some of our servers running 2.6.22.16 with kernel >> >> modules from ofa_kernel-1.2.5.4: >> >> >> >> mlx4_core 0000:08:00.0: SW2HW_MPT failed (-16) >> >> >> >> The communication on the corresponding machines is completely >> >> blocked, and ibstat is just hanging. >> >> >> >> Any idea what could be wrong? Just for additional info: When >> >> running the kernel with the original 2.6.22 drivers, I had >> >> these kind of error messages at a much higher rate. >> >> >> >> >> >> >> Tziporet> What is the FW version you use? >> >> # ibstat >> CA 'mlx4_0' >> CA type: MT25418 >> Number of ports: 2 >> Firmware version: 2.3.0 >> Hardware version: 0 >> Node GUID: 0x0002c9020025a69c >> System image GUID: 0x0002c9020025a69f >> Port 1: >> State: Active >> Physical state: LinkUp >> Rate: 20 >> Base lid: 199 >> LMC: 0 >> SM lid: 1 >> Capability mask: 0x02510868 >> Port GUID: 0x0002c9020025a69d >> >> >> >> Tziporet> What is the type of machine used? >> >> It is a dual Xeon (Quad core) on a 5000P chipset board. >> >> Tziporet> Can you send us description how to reproduce? >> >> I started a 100 node / 8 core = 800 processes mvapich job >> (linpack). The issue occured after about 1 hour of runtime. A 50 node >> / 8 core = 400 processes mvapich job ran fine several times for more >> than 36 hours (including the node on which this issue occured now). >> >> Roland Hi Roland, I am having the same problem. 
After running an MPI job over mvapich2-1.0.3 on 8 nodes (64 CPU cores total) my application crashes with the following error: [0] Abort: [] Got completion with error 12, vendor code=81, dest rank=61 at line 546 in file ibv_channel_manager.c After this an HCA on one of the nodes goes down and the node behaves just as you described: print a bunch of "mlx4_core ... SW2HW_MPT failed" to /var/log/kern.log I have the 2.6.24-etchnhalf.1-amd64 kernel, libibverbs 1.1.2, and librdmacm 1.0.7. My InfiniBand HCAs are same as your's (the hardware was put together by Verari): CA 'mlx4_0' CA type: MT25418 Number of ports: 2 Firmware version: 2.3.0 Hardware version: 0 Node GUID: 0x0002c9030000a910 System image GUID: 0x0002c9030000a913 Port 1: State: Down Physical state: Polling Rate: 10 Base lid: 0 LMC: 0 SM lid: 0 Capability mask: 0x02510868 Port GUID: 0x0002c9030000a911 Port 2: State: Active Physical state: LinkUp Rate: 20 Base lid: 17 LMC: 0 SM lid: 16 Capability mask: 0x0251086a Port GUID: 0x0002c9030000a912 I am currently working on the approach described here: http://www.ncsa.uiuc.edu/UserInfo/Resources/Hardware/Intel64Cluster/Doc/Compile.html#Compilers But, hacking out all the system(), fork(), or popen() in my application (NAMD2 with Charm++) is very hard. Another approach that I might attempt is to get my parallel application to run with RDMA bypassing MPI. That seems to be also possible, just by looking at the application's compilation options. Were you able to solve this problem? 
Alex

--
--------------------------------------------
Aleksandr Levchuk
University of California, Riverside
1-951-368-0004
--------------------------------------------

From aj.guillon at gmail.com Wed Aug 20 20:43:01 2008
From: aj.guillon at gmail.com (Adrien Guillon)
Date: Wed, 20 Aug 2008 23:43:01 -0400
Subject: [ofa-general] ***SPAM*** RDMA Operations and Endian-ness
Message-ID: <9870a2060808202043k127f66abxe88bd84d47f50283@mail.gmail.com>

Hey all,

When I do RDMA operations between two nodes, I notice that atomic
operations will automatically translate the values for endian-ness
according to the spec. Is the same true with other RDMA operations, or
do I have to translate an object representation once it has been
transported? I found very little mention of endian-ness in the spec that
did not relate to atomic values.

Thanks!

AJ

From dotanba at gmail.com Wed Aug 20 23:10:28 2008
From: dotanba at gmail.com (Dotan Barak)
Date: Thu, 21 Aug 2008 09:10:28 +0300
Subject: [ofa-general] ***SPAM*** RDMA Operations and Endian-ness
In-Reply-To: <9870a2060808202043k127f66abxe88bd84d47f50283@mail.gmail.com>
References: <9870a2060808202043k127f66abxe88bd84d47f50283@mail.gmail.com>
Message-ID: <2f3bf9a60808202310h141334fegb07ea4b17bf4daf0@mail.gmail.com>

On Thu, Aug 21, 2008 at 6:43 AM, Adrien Guillon wrote:
> Hey all,
>
> When I do RDMA operations between two nodes, I notice that atomic operations
> will automatically translate the values for endian-ness according to the
> spec. Is the same true with other RDMA operations, or do I have to
> translate an object representation once it has been transported? I found
> very little mention of endian-ness in the spec that did not relate to atomic
> values.
>

Atomic operations transfer a (64-bit) number; other RDMA operations
transfer (undefined) blocks, so you have to take care of endianness
issues in other RDMA operations.

Dotan

> Thanks!
> > AJ > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit > http://openib.org/mailman/listinfo/openib-general > From kliteyn at dev.mellanox.co.il Thu Aug 21 07:25:15 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Thu, 21 Aug 2008 17:25:15 +0300 Subject: [ofa-general] [PATCH] opensm/osm_qos_policy.c: removing some log messages Message-ID: <48AD7ACB.3050803@dev.mellanox.co.il> Hi Sasha, Removing some log messages - all the info that they provide is printed in the osm_sa_(multi)path_record.c anyway. Signed-off-by: Yevgeny Kliteynik --- opensm/opensm/osm_qos_policy.c | 24 +++--------------------- 1 files changed, 3 insertions(+), 21 deletions(-) diff --git a/opensm/opensm/osm_qos_policy.c b/opensm/opensm/osm_qos_policy.c index ecc38b3..1b01524 100644 --- a/opensm/opensm/osm_qos_policy.c +++ b/opensm/opensm/osm_qos_policy.c @@ -997,36 +997,18 @@ static osm_qos_level_t * __qos_policy_get_qos_level_by_params( IN ib_net64_t comp_mask) { osm_qos_match_rule_t *p_qos_match_rule = NULL; - osm_qos_level_t *p_qos_level = NULL; - - OSM_LOG_ENTER(&p_qos_policy->p_subn->p_osm->log); if (!p_qos_policy) - goto Exit; + return NULL; p_qos_match_rule = __qos_policy_get_match_rule_by_params( p_qos_policy, service_id, qos_class, pkey, p_src_physp, p_dest_physp, comp_mask); if (p_qos_match_rule) - p_qos_level = p_qos_match_rule->p_qos_level; + return p_qos_match_rule->p_qos_level; else - p_qos_level = p_qos_policy->p_default_qos_level; - - OSM_LOG(&p_qos_policy->p_subn->p_osm->log, OSM_LOG_DEBUG, - "PathRecord request:" - "Src port 0x%016" PRIx64 ", " - "Dst port 0x%016" PRIx64 "\n", - cl_ntoh64(osm_physp_get_port_guid(p_src_physp)), - cl_ntoh64(osm_physp_get_port_guid(p_dest_physp))); - OSM_LOG(&p_qos_policy->p_subn->p_osm->log, OSM_LOG_DEBUG, - "Applying QoS Level %s (%s)\n", - p_qos_level->name, - 
(p_qos_level->use) ? p_qos_level->use : "no description"); - -Exit: - OSM_LOG_EXIT(&p_qos_policy->p_subn->p_osm->log); - return p_qos_level; + return p_qos_policy->p_default_qos_level; } /* __qos_policy_get_qos_level_by_params() */ /*************************************************** -- 1.5.1.4 From kliteyn at dev.mellanox.co.il Thu Aug 21 07:29:54 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Thu, 21 Aug 2008 17:29:54 +0300 Subject: [ofa-general] [PATCH v4] opensm/osm_qos_policy.c: log matched qos criteria Message-ID: <48AD7BE2.7050402@dev.mellanox.co.il> Hi Sasha, Adding log message for matched criteria of the QoS policy rule. This patch addresses all the issues that were brought up during the previous versions: one log message for all the criteria, no string manipulation/sprintf. Signed-off-by: Yevgeny Kliteynik --- opensm/opensm/osm_qos_policy.c | 36 +++++++++++++++++++++++++++++++----- 1 files changed, 31 insertions(+), 5 deletions(-) diff --git a/opensm/opensm/osm_qos_policy.c b/opensm/opensm/osm_qos_policy.c index 1b01524..0052182 100644 --- a/opensm/opensm/osm_qos_policy.c +++ b/opensm/opensm/osm_qos_policy.c @@ -596,10 +596,19 @@ static osm_qos_match_rule_t *__qos_policy_get_match_rule_by_params( { osm_qos_match_rule_t *p_qos_match_rule = NULL; cl_list_iterator_t list_iterator; + osm_log_t * p_log = &p_qos_policy->p_subn->p_osm->log; + + boolean_t matched_by_sguid = FALSE, + matched_by_dguid = FALSE, + matched_by_class = FALSE, + matched_by_sid = FALSE, + matched_by_pkey = FALSE; if (!cl_list_count(&p_qos_policy->qos_match_rules)) return NULL; + OSM_LOG_ENTER(p_log); + /* Go over all QoS match rules and find the one that matches the request */ list_iterator = cl_list_head(&p_qos_policy->qos_match_rules); @@ -622,6 +631,7 @@ static osm_qos_match_rule_t *__qos_policy_get_match_rule_by_params( list_iterator = cl_list_next(list_iterator); continue; } + matched_by_sguid = TRUE; } /* If a match rule has Destination groups, PR request dest. 
has to be in this list */ @@ -635,6 +645,7 @@ static osm_qos_match_rule_t *__qos_policy_get_match_rule_by_params( list_iterator = cl_list_next(list_iterator); continue; } + matched_by_dguid = TRUE; } /* If a match rule has QoS classes, PR request HAS @@ -653,7 +664,7 @@ static osm_qos_match_rule_t *__qos_policy_get_match_rule_by_params( list_iterator = cl_list_next(list_iterator); continue; } - + matched_by_class = TRUE; } /* If a match rule has Service IDs, PR request HAS @@ -673,7 +684,7 @@ static osm_qos_match_rule_t *__qos_policy_get_match_rule_by_params( list_iterator = cl_list_next(list_iterator); continue; } - + matched_by_sid = TRUE; } /* If a match rule has PKeys, PR request HAS @@ -692,7 +703,7 @@ static osm_qos_match_rule_t *__qos_policy_get_match_rule_by_params( list_iterator = cl_list_next(list_iterator); continue; } - + matched_by_pkey = TRUE; } /* if we got here, then this match-rule matched this PR request */ @@ -700,10 +711,25 @@ static osm_qos_match_rule_t *__qos_policy_get_match_rule_by_params( } if (list_iterator == cl_list_end(&p_qos_policy->qos_match_rules)) - return NULL; + p_qos_match_rule = NULL; + if (p_qos_match_rule) + OSM_LOG(p_log, OSM_LOG_DEBUG, + "request matched rule (%s) by:%s%s%s%s%s\n", + (p_qos_match_rule->use) ? + p_qos_match_rule->use : "no description", + (matched_by_sguid) ? " SGUID" : "", + (matched_by_dguid) ? " DGUID" : "", + (matched_by_class) ? " QoS_Class" : "", + (matched_by_sid) ? " ServiceID" : "", + (matched_by_pkey) ? 
" PKey" : ""); + else + OSM_LOG(p_log, OSM_LOG_DEBUG, + "request not matched any rule\n"); + + OSM_LOG_EXIT(p_log); return p_qos_match_rule; -} /* __qos_policy_get_match_rule_by_pr() */ +} /* __qos_policy_get_match_rule_by_params() */ /*************************************************** ***************************************************/ -- 1.5.1.4 From weiyi.huang at gmail.com Sat Aug 16 16:51:03 2008 From: weiyi.huang at gmail.com (Huang Weiyi) Date: Sun, 17 Aug 2008 07:51:03 +0800 Subject: [ofa-general] ***SPAM*** [INFINIBAND] removed unused #include Message-ID: <20080817065244.1619.WEIYI.HUANG@gmail.com> The drivers below do not use LINUX_VERSION_CODE nor KERNEL_VERSION. drivers/infiniband/ulp/iser/iser_verbs.c This patch removes the said #include . Signed-off-by: Huang Weiyi diff --git a/drivers/infiniband/ulp/iser/iser_verbs.c b/drivers/infiniband/ulp/iser/iser_verbs.c index 63462ec..26ff621 100644 --- a/drivers/infiniband/ulp/iser/iser_verbs.c +++ b/drivers/infiniband/ulp/iser/iser_verbs.c @@ -33,7 +33,6 @@ #include #include #include -#include #include "iscsi_iser.h" From yossi.openib at gmail.com Thu Aug 21 09:49:04 2008 From: yossi.openib at gmail.com (Yossi Etigin) Date: Thu, 21 Aug 2008 19:49:04 +0300 Subject: [ofa-general] ***SPAM*** Re: [PATCH] libsdp: enable fallback to TCP for nonblocking sockets In-Reply-To: <5D49E7A8952DC44FB38C38FA0D758EAD5865EA@mtlexch01.mtl.com> References: <48AC445D.2050704@gmail.com> <5D49E7A8952DC44FB38C38FA0D758EAD5865EA@mtlexch01.mtl.com> Message-ID: <48AD9C80.8030305@gmail.com> Hi Amir, What you suggesting is to replace almost all socket functions, and I don't think that this is good either. It would be write(), send(), recv(), sendto(), recvfrom(), sendmsg(), recvmsg(), and also need to change select() (to not return when fallback happens if SDP fails), and maybe also poll(). libsdp tries to avoid the fast path. 
Besides, how do we know when to do fallback - can we safely assume that
if some socket operation fails, then it happened because connect()
failed?

Anyway, if I understand correctly, you suggest something like:

int connect(fd, ...)
{
    ...
    set_state(fd, SDP)
    ...
}

int read(int fd, ...)
{
    int res = socket_funcs.read(shadow_fd(fd), ...);
    if (res < 0 && errno != EAGAIN && sock_state(fd) == SDP) {
        set_state(fd, TCP);
        socket_funcs.connect(fd,...);
        close(shadow_fd(fd));
        errno = EAGAIN;
    }
    return res;
}

--Yossi

Amir Vadai wrote:
> Yossi Hi,
>
> I think that breaking the semantic of non blocking socket is a bad idea.
>
> There is a solution that won't break this semantics:
>
> 1. User app calls connect().
>    - libsdp try to connect through sdp.
> 2. User app try another operation on the socket (e.g read/write)
>    - if sdp connection established successfully - great
>    - if sdp still not established - return -EAGAIN. This is the
>      same behaviour as if the tcp connection wasn't connected yet.
>    - if sdp timedout - return -EAGAIN and initiate TCP connect.
>    - if tcp connection established - use it
>    - if tcp connection timedout - return error.
>
> Maybe we could optimize it and initiate a tcp connection in parallel
> with the sdp connection and use it only when the sdp connect is
> timedout.
>
> I will add only the second patch (the debug print fix).
>
> - Amir
>
>

From YJia at tmriusa.com Thu Aug 21 12:16:53 2008
From: YJia at tmriusa.com (Yicheng Jia)
Date: Thu, 21 Aug 2008 14:16:53 -0500
Subject: [ofa-general] minimum sw components requirement for driver/opensm in a single unmanaged switch network
In-Reply-To:
Message-ID:

Hi Hal,

Can opensm just run once? When the subnet is up, it can exit, assuming
that no change will be made in the subnet.

Thanks!
Yicheng "Hal Rosenstock" 07/10/2008 09:15 PM To "Yicheng Jia" cc "Jim Mott" , general at lists.openfabrics.org Subject Re: [ofa-general] minimum sw components requirement for driver/opensm in a single unmanaged switch network On Thu, Jul 10, 2008 at 7:39 PM, Yicheng Jia wrote: > >> If you want to avoid all the SM stuff, and are willing to program the >> switches directly (a few mads) > > Is it done by opensm? Yes. > What information should be set up in the switch by > opensm? Things like the PortInfos and LFT. See IBA spec vol 1 14.2.5 >> Then to figure out QP connections, you just use a function of 3 >> parameters: >> my_qp_num = fn_sqp(my_node, target_node, qp_num) >> target_qp_num = fn_tqp(my_node, target_node, qp_num) >> Where qp_num is a small number between 0 and the maximum number of QPs you >> need active between any 2 endpoints. > > Can the qp_num be manually assigned? > Does it need opensm be involved? SM has nothing to do with QP numbers. >> If it works, you are done. If not, reset, up, wait for him to connect and >> send something to you. > > Is it reliable? I mean the QPs connection will keep alive during the QPs > lifecycle? For one thing, SM needs to try to keep ports at active. -- Hal > Best, > Yicheng > > > > "Jim Mott" > > 07/10/2008 04:17 PM > > To > "Yicheng Jia" , > cc > Subject > RE: [ofa-general] minimum sw components requirement for driver/opensm in a > single unmanaged switch network > > > > > If you want to avoid all the SM stuff, and are willing to program the > switches directly (a few mads), then I've used schemes like: > > Node LID=base + (switch port * constant) (base=0, constant = 1 works) > > Then to figure out QP connections, you just use a function of 3 parameters: > my_qp_num = fn_sqp(my_node, target_node, qp_num) > target_qp_num = fn_tqp(my_node, target_node, qp_num) > Where qp_num is a small number between 0 and the maximum number of QPs you > need active between any 2 endpoints. 
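The addressing scheme quoted above can be sketched in C as follows. This is only an illustration: the thread never defines fn_sqp() or fn_tqp(), so the formulas below, the constants QPN_BASE and QPS_PER_PAIR, and the helper names node_lid() and scheme_is_symmetric() are all assumptions made for the example. It also assumes that the HCA and its driver let you obtain QPs at predictable numbers, which the thread does not establish.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants; only base=0 and constant=1 come from the thread. */
#define LID_BASE      0u      /* "base=0, constant = 1 works" */
#define LID_CONSTANT  1u
#define QPS_PER_PAIR  4u      /* assumed max QPs between any two endpoints */
#define QPN_BASE      0x100u  /* assumed first QP number used by the scheme;
                                 QP0 and QP1 are reserved for management */

/* Node LID = base + (switch port * constant). */
uint16_t node_lid(unsigned switch_port)
{
    unsigned lid = LID_BASE + switch_port * LID_CONSTANT;

    /* Unicast LIDs are 0x0001..0xBFFF; LID 0 is reserved, so with
     * base=0 the port numbering has to start at 1. */
    assert(lid != 0 && lid <= 0xBFFF);
    return (uint16_t)lid;
}

/* One possible fn_sqp: the QP number my_node uses to talk to target_node. */
uint32_t fn_sqp(unsigned my_node, unsigned target_node, unsigned qp_num)
{
    (void)my_node;  /* the local QP number depends only on the peer here */
    assert(qp_num < QPS_PER_PAIR);
    return QPN_BASE + target_node * QPS_PER_PAIR + qp_num;
}

/* One possible fn_tqp: the QP number on target_node that is paired with
 * the QP returned by fn_sqp() for the same arguments. */
uint32_t fn_tqp(unsigned my_node, unsigned target_node, unsigned qp_num)
{
    (void)target_node;
    assert(qp_num < QPS_PER_PAIR);
    return QPN_BASE + my_node * QPS_PER_PAIR + qp_num;
}

/* Returns 1 if both ends compute the same pairing independently. */
int scheme_is_symmetric(unsigned a, unsigned b, unsigned qp_num)
{
    return fn_sqp(a, b, qp_num) == fn_tqp(b, a, qp_num) &&
           fn_tqp(a, b, qp_num) == fn_sqp(b, a, qp_num);
}
```

Whatever formulas are chosen, the property that matters is symmetry: the number node A computes as its local QP for node B must equal what B computes as A's remote QP, so that both ends can bring their QPs up without exchanging any messages first.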
> > With the above scheme, you know your node_id (switch port number), your lid, > the lid of the target node, and the QPs on both sides. From there on, it > is clear sailing. You don't even need to send MADs; just transition the QP > up and try and use it. If it works, you are done. If not, reset, up, wait > for him to connect and send something to you. A little timer to make sure > everybody retries once in awhile and what can go wrong? > > Jim > From: general-bounces at lists.openfabrics.org > [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Yicheng Jia > Sent: Thursday, July 10, 2008 2:59 PM > To: general at lists.openfabrics.org > Subject: [ofa-general] minimum sw components requirement for driver/opensm > in a single unmanaged switch network > > > Hi Folks, > > I have a IB network which consists of only a single unmanaged switch, all > end nodes connecting with the switch only need to do RDMA read/write > operation with each other. My question is, what are the indispensable > modules in driver's core and opensm that make the network up and run? > > I've been using only ib_mad module in driver's core with a managed switch > before, and the network works fine. So I assume that only the ib_mad module > in driver's core and SM in opensm are mandatory in my network. The LIDs are > assigned by them. The SA and CM modules are not useful in my case. Am I > right? > > I need to minimize driver and opensm to fit them in my network, the HCA > driver is mthca. > > Best, > Yicheng > _____________________________________________________________________________ > Scanned by IBM Email Security Management Services powered by MessageLabs. > For more information please visit http://www.ers.ibm.com > _____________________________________________________________________________ > > _____________________________________________________________________________ > Scanned by IBM Email Security Management Services powered by MessageLabs. 
> For more information please visit http://www.ers.ibm.com
> _____________________________________________________________________________
>
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> To unsubscribe, please visit
> http://openib.org/mailman/listinfo/openib-general
>

From dotanba at gmail.com Thu Aug 21 12:34:14 2008
From: dotanba at gmail.com (Dotan Barak)
Date: Thu, 21 Aug 2008 22:34:14 +0300
Subject: ***SPAM*** Re: [ofa-general] minimum sw components requirement for driver/opensm in a single unmanaged switch network
In-Reply-To:
References:
Message-ID: <2f3bf9a60808211234w28ac093bh8c4d8182077f6964@mail.gmail.com>

On Thu, Aug 21, 2008 at 10:16 PM, Yicheng Jia wrote:
>
> Hi Hal,
>
> Can opensm just run once? When the subnet is up, it can exit, assuming
> that no change will be made in the subnet.
>

Yes, depending on the services that you will need/use.
For example: if you use operations that require an SA query, you must have a
live SM.

If you connect the QPs in the subnet by yourself (for example, using
sockets), you can manage without a live SM in the subnet ...

Dotan

From YJia at tmriusa.com Thu Aug 21 12:45:46 2008
From: YJia at tmriusa.com (Yicheng Jia)
Date: Thu, 21 Aug 2008 14:45:46 -0500
Subject: [ofa-general] minimum sw components requirement for driver/opensm in a single unmanaged switch network
In-Reply-To: <2f3bf9a60808211234w28ac093bh8c4d8182077f6964@mail.gmail.com>
Message-ID:

My operation is quite simple: connect QPs and do RDMA read/write. In this
case, opensm is not needed once the subnet is up, correct?

Thanks!
Yicheng

From dotanba at gmail.com Thu Aug 21 13:53:15 2008
From: dotanba at gmail.com (Dotan Barak)
Date: Thu, 21 Aug 2008 22:53:15 +0200
Subject: ***SPAM*** Re: [ofa-general] minimum sw components requirement for driver/opensm in a single unmanaged switch network
In-Reply-To:
References:
Message-ID: <48ADD5BB.60901@gmail.com>

Yicheng Jia wrote:
>
> My operation is quite simple: connect QPs and do RDMA read/write. In
> this case, opensm is not needed once the subnet is up, correct?

Basically yes, but it depends on how you connect the QPs ...

In the past I wrote even more complicated flows than this when the SM was
down...
If you connect the QPs using sockets and you don't depend on any other ULP
(such as IPoIB, SDP or any other), you will be fine ..

Dotan

From vlad at dev.mellanox.co.il Thu Aug 21 13:01:22 2008
From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky)
Date: Thu, 21 Aug 2008 23:01:22 +0300
Subject: [ofa-general] OFED 1.4 beta release is ready
Message-ID: <48ADC992.3080000@dev.mellanox.co.il>

Hi,

OFED 1.4 beta release is available on
http://www.openfabrics.org/downloads/OFED/ofed-1.4/OFED-1.4-beta1.tgz
md5sum: 40fb1d0d72943203f9c112c05143f2da

To get BUILD_ID run ofed_info

Please report any issues in bugzilla https://bugs.openfabrics.org/ for OFED 1.4

(You may get this email twice, I didn't see the first email on the list).

- Vladimir

========================================================================

Release information:
--------------------
Linux Operating Systems:
- RedHat EL4 up5:   2.6.9-55.ELsmp
- RedHat EL4 up6:   2.6.9-67.ELsmp
- RedHat EL4 up7:   2.6.9-78.ELsmp
- RedHat EL5:       2.6.18-8.el5
- RedHat EL5 up1:   2.6.18-53.el5
- RedHat EL5 up2:   2.6.18-92.el5
- CentOS 5.2:       2.6.18-92.el5
- Fedora C9:        2.6.25-14.fc9 *
- SLES10:           2.6.16.21-0.8-smp
- SLES10 SP1:       2.6.16.46-0.12-smp
- SLES10 SP1 up1:   2.6.16.53-0.16-smp
- SLES10 SP2:       2.6.16.60-0.21-smp
- OpenSuSE 10.3:    2.6.22.5-31 *
- kernel.org:       2.6.26

* OSes that are partially tested

Systems:
* x86_64
* x86
* ia64
* ppc64

Main Changes from OFED 1.3
==========================
1. General changes
   o Kernel code based on 2.6.27-rc3
     - New verbs to support BMME (Fast memory thru send queue, Local
       invalidate send work requests, Read with invalidate).
   o Added iSER target package
   o Added NFS-RDMA support (for 2.6.26 only for now)
2. IPoIB
   o LRO support
3. SDP
   o Bug fixes in the state machine and close flow
4. qlgc_vnic
   o Support for hotswap of EVIC and dynamic update of existing
     connections with the addition of QLogic dynamic update daemon.
   o Performance improvements in handling of Ethernet
     broadcast/multicast traffic.
5. RDS
   o GA of RDMA API (using FMRs) - RDS API version 3
   o iWARP support
6. uDAPL
   o Added socket based CM - for both scalability and interop with Windows
   o Added UD extensions - for version 2.0 only
   o v1 library package has been renamed to compat-dapl-1.2.8-1
7. Management
   o OpenSM
     - APM - disjoint paths
     - Path balancing for LMC + console diagnostics
     - OpenSM configuration unification
     - MGID to MLID mapping for IPv6 SNM
     - Routing engines chain
     - IBA 1.2.1 additions
     - Failover/Handover improvements
   o ibutils:
     - Congestion Control
     - Report created in CSV format
   o Diagnostic tools:
     - ibnetdiscover library - to accelerate other tools
8. Low level drivers:
   o mlx4: Enable Virtual Protocol Interconnect - Eth and IB on the same
     device
   o mlx4_en: Mellanox ConnectX HCA Ethernet driver
9. New MPI:
   o MVAPICH 1.1

Tasks that should be completed for the rc:
============================================
1. NFS-RDMA to work on distro OSes
2. iSER backports
3. New MPI versions: OpenMPI 1.3, MVAPICH2 1.2
4. OSM: Cached routing

From YJia at tmriusa.com Thu Aug 21 13:02:43 2008
From: YJia at tmriusa.com (Yicheng Jia)
Date: Thu, 21 Aug 2008 15:02:43 -0500
Subject: [ofa-general] minimum sw components requirement for driver/opensm in a single unmanaged switch network
In-Reply-To: <48ADD5BB.60901@gmail.com>
Message-ID:

> In the past I wrote even more complicated flows than this when the SM
> was down...

Can you point me to where it is?

Thanks!
Yicheng
> >> Does it need opensm be involved? > > > > SM has nothing to do with QP numbers. > > > >>> If it works, you are done. If not, reset, up, wait for him to connect > >>> and > >>> send something to you. > >> > >> Is it reliable? I mean the QPs connection will keep alive during > the QPs > >> lifecycle? > > > > For one thing, SM needs to try to keep ports at active. > > > > -- Hal > > > >> Best, > >> Yicheng > >> > >> > >> > >> "Jim Mott" > >> > >> 07/10/2008 04:17 PM > >> > >> To > >> "Yicheng Jia" , > >> cc > >> Subject > >> RE: [ofa-general] minimum sw components requirement for > driver/opensm in a > >> single unmanaged switch network > >> > >> > >> > >> > >> If you want to avoid all the SM stuff, and are willing to program the > >> switches directly (a few mads), then I've used schemes like: > >> > >> Node LID=base + (switch port * constant) (base=0, constant = 1 works) > >> > >> Then to figure out QP connections, you just use a function of 3 > >> parameters: > >> my_qp_num = fn_sqp(my_node, target_node, qp_num) > >> target_qp_num = fn_tqp(my_node, target_node, qp_num) > >> Where qp_num is a small number between 0 and the maximum number of > QPs you > >> need active between any 2 endpoints. > >> > >> With the above scheme, you know your node_id (switch port number), your > >> lid, > >> the lid of the target node, and the QPs on both sides. From there > on, it > >> is clear sailing. You don't even need to send MADs; just > transition the > >> QP > >> up and try and use it. If it works, you are done. If not, reset, up, > >> wait > >> for him to connect and send something to you. A little timer to > make sure > >> everybody retries once in awhile and what can go wrong? 
> >> > >> Jim > >> From: general-bounces at lists.openfabrics.org > >> [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Yicheng Jia > >> Sent: Thursday, July 10, 2008 2:59 PM > >> To: general at lists.openfabrics.org > >> Subject: [ofa-general] minimum sw components requirement for > driver/opensm > >> in a single unmanaged switch network > >> > >> > >> Hi Folks, > >> > >> I have a IB network which consists of only a single unmanaged > switch, all > >> end nodes connecting with the switch only need to do RDMA read/write > >> operation with each other. My question is, what are the indispensable > >> modules in driver's core and opensm that make the network up and run? > >> > >> I've been using only ib_mad module in driver's core with a managed > switch > >> before, and the network works fine. So I assume that only the ib_mad > >> module > >> in driver's core and SM in opensm are mandatory in my network. The LIDs > >> are > >> assigned by them. The SA and CM modules are not useful in my case. Am I > >> right? > >> > >> I need to minimize driver and opensm to fit them in my network, the HCA > >> driver is mthca. > >> > >> Best, > >> Yicheng > >> > >> > _____________________________________________________________________________ > >> Scanned by IBM Email Security Management Services powered by > MessageLabs. > >> For more information please visit http://www.ers.ibm.com > >> > >> > _____________________________________________________________________________ > >> > >> > >> > _____________________________________________________________________________ > >> Scanned by IBM Email Security Management Services powered by > MessageLabs. > >> For more information please visit > http://www.ers.ibm.com > >> > >> > _____________________________________________________________________________ > >> > >> > >> > _____________________________________________________________________________ > >> Scanned by IBM Email Security Management Services powered by > MessageLabs. 
For more information please visit http://www.ers.ibm.com
_____________________________________________________________________________
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From totunel at yahoo.com  Fri Aug 22 02:12:24 2008
From: totunel at yahoo.com (Podar Valentin)
Date: Fri, 22 Aug 2008 02:12:24 -0700 (PDT)
Subject: [ofa-general] ***SPAM*** new with IB
Message-ID: <230765.25942.qm@web56504.mail.re3.yahoo.com>

Dear all,

I am new to this field and I have some questions.
I run a cluster with IB Mellanox. I have two subnets each with its own opensm (running on different port P1 or P2).
mixed hardware MT25204 and MT23108. all are at 4x rate.
mixed drivers IBGold1.8.2 and MLNX_OFED_LINUX-1.3.1-rhel4

when I issue ibdiagnet -p 1 -lw 4x I get

-I- Stages Status Report:
    STAGE                                    Errors Warnings
    Bad GUIDs/LIDs Check                     0      0
    Link State Active Check                  0      0
    Performance Counters Report              0      0
    Specific Link Width Check                0      0
    Partitions Check                         0      0
    IPoIB Subnets Check                      0      16

BUT

-I---------------------------------------------------
-I- IPoIB Subnets Check
-I---------------------------------------------------
-I- Subnet: IPv4 PKey:0x7fff QKey:0x00000b1b MTU:4096Byte rate:120Gbps SL:0x00
-W- Port h1/P1 lid=0x0011 guid=0x dev=23108 can not join due to rate:10Gbps < group:120Gbps
-W- Port h2/P1 lid=0x0321 guid=0x dev=23108 can not join due to rate:10Gbps < group:120Gbps
-W- Port h3/P1 lid=0x0069 guid=0x dev=23108 can not join due to rate:10Gbps < group:120Gbps
-W- Port h4/P1 lid=0x0010 guid=0x dev=23108 can not join due to rate:10Gbps < group:120Gbps

and so on with all the nodes. switch type MT47396.

I think the problem is this line

Subnet: IPv4 PKey:0x7fff QKey:0x00000b1b MTU:4096Byte rate:120Gbps SL:0x00

but I don't know how to set the Mtu to 2048 and rate to 10G.
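
[Editorial aside] The MTU and rate that ibdiagnet prints for the group (4096 bytes, 120Gbps) are stored in SA multicast records as small enumerated codes, not as byte counts or Gbps figures. The following is a minimal Python sketch of that decoding and of the join check behind the "-W- ... can not join" warnings. It assumes the usual InfiniBand encodings (MTU codes 1..5 for 256..4096 bytes; rate codes 2, 3, 4, 5, 6, 7, 8, 9, 10 for 2.5, 10, 30, 5, 20, 40, 60, 80, 120 Gbps) and assumes the value sits in the low 6 bits of the field. The function names are hypothetical and are not part of ibdiagnet, saquery or any other OFED tool.

```python
# Assumed IBA encodings for the 6-bit MTU and rate values in SA records.
IB_MTU_BYTES = {1: 256, 2: 512, 3: 1024, 4: 2048, 5: 4096}
IB_RATE_GBPS = {2: 2.5, 3: 10, 4: 30, 5: 5, 6: 20, 7: 40, 8: 60, 9: 80, 10: 120}


def decode_mtu(code):
    """Return the MTU in bytes for an encoded value (low 6 bits only)."""
    return IB_MTU_BYTES[code & 0x3F]


def decode_rate(code):
    """Return the rate in Gbps for an encoded value (low 6 bits only)."""
    return IB_RATE_GBPS[code & 0x3F]


def can_join(port_rate_gbps, group_rate_code):
    """A port slower than the group rate is reported as unable to join."""
    return port_rate_gbps >= decode_rate(group_rate_code)


if __name__ == "__main__":
    # A group at MTU code 0x5 and rate code 0xA, seen from a 10 Gbps port.
    print(decode_mtu(0x5), decode_rate(0xA), can_join(10, 0xA))
```

Under these assumptions, a group created with rate code 0x3 and MTU code 0x4 would be 10 Gbps and 2048 bytes, which is what a 4x SDR port can join.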
more  /usr/sbin/saquery -d -g

MCMemberRecord group dump:
                MGID....................0xff12401bffff0000 : 0x00000000ffffffff
                Mlid....................0xC000
                Mtu.....................0x5
                pkey....................0xFFFF
                Rate....................0xA
MCMemberRecord group dump:
                MGID....................0xff12401bffff0000 : 0x0000000000000001
                Mlid....................0xC001
                Mtu.....................0x5
                pkey....................0xFFFF
                Rate....................0xA
MCMemberRecord group dump:
                MGID....................0xff12401bffff0000 : 0x0000000000656565
                Mlid....................0xC002
                Mtu.....................0x4
                pkey....................0xFFFF
                Rate....................0x3
MCMemberRecord group dump:
                MGID....................0xff12401bffff0000 : 0x0000000000a847ff
                Mlid....................0xC003
                Mtu.....................0x4
                pkey....................0xFFFF
                Rate....................0x3
MCMemberRecord group dump:
                MGID....................0xff12401bffff0000 : 0x0000000000000000
                Mlid....................0xC007
                Mtu.....................0x4
                pkey....................0xFFFF
                Rate....................0x2
MCMemberRecord group dump:
                MGID....................0xff12401bffff0000 : 0x000000000202c902
                Mlid....................0xC008
                Mtu.....................0x4
                pkey....................0xFFFF
                Rate....................0x2

the question is how can I set the group rate to 10G and not 120G?
and the group MTU to 2048, as on some nodes I get
" failed to join multicast or setting  MTU>4096 will ...generate...some errors"

thank you very much!
Vali
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From kovlensky at interia.pl  Fri Aug 22 05:25:34 2008
From: kovlensky at interia.pl (kovlensky at interia.pl)
Date: 22 Aug 2008 14:25:34 +0200
Subject: [ofa-general] ***SPAM*** mixing ofed releases
Message-ID: <20080822122534.7A2431E303F@f03.poczta.interia.pl>

Hi,

I'm wondering what restrictions apply to mixing different ofed releases in one network. Would every mixture of machines with ofed 1.1, 1.2.5, 1.3, Infinipath 2.1 and Infinipath 2.2 work correctly at the ib network layer? Are there any subnet manager restrictions? We're talking about the ib network layer only; everything above it (software etc.) is the user's responsibility to configure correctly.

From PHF at zurich.ibm.com  Fri Aug 22 08:28:22 2008
From: PHF at zurich.ibm.com (Philip Frey1)
Date: Fri, 22 Aug 2008 17:28:22 +0200
Subject: [ofa-general] Symbols for iw_cxgb3 (OFED-1.3.1)
Message-ID:

Hello,

I am trying to profile what is expensive when registering an MR (assuming that pages are already resident in main memory). For that purpose I am running oprofile and opreport, but when it comes to 'iw_cxgb3' the report says 'no symbols' (same thing for 'ib_core').

Can you point me to the place where I need to change the CFLAGS (or the like) to get those symbols?

Many thanks for your advice,
Philip
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From dave.olson at qlogic.com  Fri Aug 22 14:00:47 2008
From: dave.olson at qlogic.com (Dave Olson)
Date: Fri, 22 Aug 2008 14:00:47 -0700 (PDT)
Subject: [ofa-general] mixing ofed releases
In-Reply-To: <20080822122534.7A2431E303F@f03.poczta.interia.pl>
References: <20080822122534.7A2431E303F@f03.poczta.interia.pl>
Message-ID:

On Fri, 22 Aug 2008, kovlensky at interia.pl wrote:
| I'm wondering what restrictions apply to mixing different ofed releases in one network. Would every mixture of machines with ofed 1.1, 1.2.5, 1.3, Infinipath 2.1 and Infinipath 2.2 work correctly in ib network layer? Any subnet manager restrictions? We're talking about ib network layer only, everything above (like software etc.) is user responsibility to have correctly configured.

Infinipath 2.1 and 2.2 should work together, both for PSM and ipoib UD (not CM, obviously). We didn't do a lot of OFED interoperability testing between the two, but we found SDP not to be compatible between the two, and I recall other issues as well, but not the details.

Infinipath 2.1 is OFED 1.2-based, not 1.2.5. We know there are issues in some areas between 2.1 and 1.2.5.

I'm pretty sure that for most things, ofed 1.1 will not be compatible with 1.2 or later. 1.2.5 had some pretty large changes in it for connectX.

If you can avoid mixing verbs-based tasks between the releases, then they should mostly coexist on the same fabric, but you'd want the SM to be from the newest of the releases, if you use opensm.

Dave Olson
dave.olson at qlogic.com

From weiyi.huang at gmail.com  Fri Aug 22 22:56:16 2008
From: weiyi.huang at gmail.com (Huang Weiyi)
Date: Sat, 23 Aug 2008 13:56:16 +0800
Subject: [ofa-general] ***SPAM*** IB/ipath: remove unused #include
Message-ID: <20080823131721.1277.WEIYI.HUANG@gmail.com>

The driver(s) below do not use LINUX_VERSION_CODE nor KERNEL_VERSION.
  drivers/infiniband/hw/ipath/ipath_fs.c

This patch removes the said #include .
Signed-off-by: Huang Weiyi

diff --git a/drivers/infiniband/hw/ipath/ipath_fs.c b/drivers/infiniband/hw/ipath/ipath_fs.c
index 23faba9..8bb5170 100644
--- a/drivers/infiniband/hw/ipath/ipath_fs.c
+++ b/drivers/infiniband/hw/ipath/ipath_fs.c
@@ -31,7 +31,6 @@
  * SOFTWARE.
  */
 
-#include
 #include
 #include
 #include

From ogerlitz at voltaire.com  Sun Aug 24 22:32:30 2008
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Mon, 25 Aug 2008 08:32:30 +0300
Subject: [ofa-general][PATCH 01/11 v3] mlx4: Qp range reservation
In-Reply-To: <48AC18F2.80600@mellanox.co.il>
References: <48AC18F2.80600@mellanox.co.il>
Message-ID: <48B243EE.3080503@voltaire.com>

Yevgeny Petrilin wrote:
> Diff from previous version:
> -mlx4_bitmap_free() uses mlx4_bitmap_free_range() with range=1.
> -Free qp range if failed to allocate qp.

I wasn't sure if this series is a repost of what was sent to the openfabrics mailing list on July 9th or a new version of the patches. Can you clarify that?

Or.

From yevgenyp at mellanox.co.il  Sun Aug 24 23:26:36 2008
From: yevgenyp at mellanox.co.il (Yevgeny Petrilin)
Date: Mon, 25 Aug 2008 09:26:36 +0300
Subject: [ofa-general][PATCH 01/11 v3] mlx4: Qp range reservation
In-Reply-To: <48B243EE.3080503@voltaire.com>
References: <48AC18F2.80600@mellanox.co.il> <48B243EE.3080503@voltaire.com>
Message-ID: <48B2509C.1070307@mellanox.co.il>

Or Gerlitz wrote:
> I wasn't sure if this series is a repost of what was sent to the
> openfabrics mailing list on July 9th or a new version of the patches,
> can you clarify that?
>
> Or.
>

This is a repost of the same patches.

Yevgeny

From tziporet at dev.mellanox.co.il  Mon Aug 25 01:45:47 2008
From: tziporet at dev.mellanox.co.il (Tziporet Koren)
Date: Mon, 25 Aug 2008 11:45:47 +0300
Subject: [ofa-general] How many processes on a node can open IB device ?
In-Reply-To: <58C6777539C300489D145B0F8E29C32815F875FBD0@GVW0673EXC.americas.hpqcorp.net> References: <58C6777539C300489D145B0F8E29C32815EA8DD024@GVW0673EXC.americas.hpqcorp.net> <58C6777539C300489D145B0F8E29C32815F875FBD0@GVW0673EXC.americas.hpqcorp.net> Message-ID: <48B2713B.5000708@mellanox.co.il> Tang, Changqing wrote: > Roland: > Thank you very much for the info. > > I hope Mellanox can tell me what to do next, Our project needs to run 2048 ranks > on a node, every rank has IB communication(most of them are sleeping, only a few are active). > > I think you already got the answer. Please reply if this is still open > Roland Dreier wrote: > >> You should be able to build firmware that supports more >> processes, but I believe there may be some >> performance/stability tradeoffs related to that -- Mellanox >> could tell you more. >> >> There is no performance tradeoffs. The only problem we once saw with larger UAR page is that on some x86 systems the BIOS had a problem with it. Tziporet From tziporet at dev.mellanox.co.il Mon Aug 25 01:49:37 2008 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Mon, 25 Aug 2008 11:49:37 +0300 Subject: [ofa-general] Re: [ewg] [PATCH 1/2 v2]libibvers: add create_qp_expanded In-Reply-To: <3b5e77ad0808190444u732afbadnae40c74a73ab45f2@mail.gmail.com> References: <3b5e77ad0808190444u732afbadnae40c74a73ab45f2@mail.gmail.com> Message-ID: <48B27221.6050609@mellanox.co.il> Ron Livne wrote: > OK, but doesn't it contradict the approach you agreed on? > > > What do you think of the following approach? > > Instead of adding creation flags to the qp_init_attr, I can add a new verb: > > ibv_qp *create_qp_extended(struct ibv_pd *pd, struct ibv_qp_init_attr, > > *init_attr, enum ibv_qp_create_flags create_flags) > > > > I'm aware that adding a new verb isn't optimal, but at least we can > > avoid incrementing the libibverbs version. > > I think this new verb seems like a better approach right now. 
> > When will we have all patches ready? It must be this week if you still want it in OFED 1.4 Tziporet > > From tziporet at mellanox.co.il Mon Aug 25 04:40:42 2008 From: tziporet at mellanox.co.il (Tziporet Koren) Date: Mon, 25 Aug 2008 14:40:42 +0300 Subject: [ofa-general] OFED meeting agenda for today (Aug 25, 2008) Message-ID: <5D49E7A8952DC44FB38C38FA0D758EAD5D827F@mtlexch01.mtl.com> This is the agenda for OFED meeting today (Aug 25, 2008): 1. OFED 1.4 status: - Beta was done on Aug 21 - Based on kernel 2.6.27-rc4 - RHEL 4.7 is supported Still missing: - iSER (disabled from OFED now) - Voltaire - NFS/RDMA - no backport for distros yet - Jeff B. - MVAPICH2 1.1 - under work now - Open MPI 1.3 - Jeff S. - extended QP verb - Voltaire 2. OFED 1.4 schedule - Alpha Release - July 24, 2008 - done - Beta Release - Aug 21, 2008 - done Suggestion for the RCs plan: - RC1 - Sept 3, 2008 - RC2 - Sept 17, 2008 - RC3 - Sept 25, 2008 - more RCs - as needed - GA - Discussion on the expected date 3. OFA BOF in SC08 - Woody & Betsy 4. OFA server upgrade update - Jeff Becker 5. Open discussion Tziporet From tziporet at dev.mellanox.co.il Mon Aug 25 06:29:20 2008 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Mon, 25 Aug 2008 16:29:20 +0300 Subject: [ofa-general] Re: [ewg] [PATCH 1/2 v2]libibvers: add create_qp_expanded In-Reply-To: <39C75744D164D948A170E9792AF8E7CA016EAB87@exil.voltaire.com> References: <3b5e77ad0808190444u732afbadnae40c74a73ab45f2@mail.gmail.com> <48B27221.6050609@mellanox.co.il> <39C75744D164D948A170E9792AF8E7CA016EAB87@exil.voltaire.com> Message-ID: <48B2B3B0.1050408@mellanox.co.il> Olga Shern wrote: > Tziporet, > > Ron cannot work with Roland's tree, > because not all XRC patches are there. > > But he should prepare us patches for OFED, or maybe we skip this? 
Tziporet From yossi.openib at gmail.com Mon Aug 25 08:18:18 2008 From: yossi.openib at gmail.com (Yossi Etigin) Date: Mon, 25 Aug 2008 18:18:18 +0300 Subject: [ofa-general] ***SPAM*** Re: [PATCH] libsdp: enable fallback to TCP for nonblocking sockets In-Reply-To: <1219590681.1564.10.camel@amirv-laptop> References: <48AC445D.2050704@gmail.com> <5D49E7A8952DC44FB38C38FA0D758EAD5865EA@mtlexch01.mtl.com> <48AD9C80.8030305@gmail.com> <1219590681.1564.10.camel@amirv-laptop> Message-ID: <48B2CD3A.5020509@gmail.com> Hi Amir, The single case in which we block connect() here (and only on SDP, which is rather fast) is the case that is currenlty not supported anyway. It can also be configurable. Anyway, we have a client which uses non-blocking sockets and really needs that feature. How about putting this to OFED now and writing something better later on? --Yossi Amir Vadai wrote: > See below > > On Thu, 2008-08-21 at 19:49 +0300, Yossi Etigin wrote: >> Hi Amir, >> >> What you suggesting is to replace almost all socket functions, and I >> don't think that this is good either. > I agree - but to break the non-blocking semantics is worse. > >> It would be write(), send(), recv(), sendto(), recvfrom(), sendmsg(), >> recvmsg(), and also need to change select() (to not return when >> fallback >> happens if SDP fails), and maybe also poll(). libsdp tries to avoid >> the fast path. > I don't see another option. We could have a #ifdef to enable the user > to choose - non blocking support or cleaner fast-path. >> Besides, how do we know when to do fallback - can we safely assume >> that if some socket operation fails, then it happened because >> connect() failed? >>From a brief look at connect man page, they say we should use select for > writing on the socket. after select indicates writability, use > getsockopt to determine whether connect() completed successfully or not. >> Anyway, if I understand correctly, you suggest something like: >> >> int connect(fd, ...) >> { >> ... 
>> set_state(fd, SDP) >> ... >> } >> >> >> int read(int fd, ...) >> { >> int res = socket_funcs.read(shadow_fd(fd), ...); >> if (res < 0 && errno != EAGAIN && sock_state(fd) == SDP) { >> sock_state = TCP; >> sockt_funs.connect(fd,...); >> close(shadow_fd(fd)); >> errno = EAGAIN; >> } >> return res; >> } >> >> > ... again, I don't like it too - but I don't think we should block > connect when the user asks not to. > - Amir. >> --Yossi >> >> Amir Vadai wrote: >>> Yossi Hi, >>> >>> I think that breaking the semantic of non blocking socket is a bad >> idea. >>> There is a solution that won't break this semantics: >>> >>> 1. User app calls connect(). >>> - libsdp try to connect through sdp. >>> 2. User app try another operation on the socket (e.g read/write) >>> - if sdp connection established successfully - great >>> - if sdp still not established - return -EAGAIN. This is the >>> same behaviour as if the tcp connection wasn't connected yet. >>> - if sdp timedout - return -EAGAIN and initiate TCP connect. >>> - if tcp connection established - use it >>> - if tcp connection timedout - return error. >>> >>> Maybe we could optimize it and initiate a tcp connection in parallel >>> with the sdp connection and use it only when the sdp connect is >>> timedout. >>> >>> I will add only the second patch (the debug print fix). >>> >>> - Amir >>> >>> >> >> > From sean.hefty at intel.com Mon Aug 25 10:16:51 2008 From: sean.hefty at intel.com (Hefty, Sean) Date: Mon, 25 Aug 2008 10:16:51 -0700 Subject: [ofa-general] uDAPL private data size issue In-Reply-To: References: <58C6777539C300489D145B0F8E29C32815F87601C2@GVW0673EXC.americas.hpqcorp.net> Message-ID: >I guess that makes the uDAPL cma provider non-compliant unless >there is a way for rdma_cm to give back some of IB CM request >private data area. > >Sean, is there anything that can be done here? The cma header is defined as part of the spec, so it won't be changing. 
If more private data is needed than what DAPL provides, the IB CM interface can be used directly, or simply exchange the data after connecting.

- Sean

From sean.hefty at intel.com  Mon Aug 25 10:35:19 2008
From: sean.hefty at intel.com (Hefty, Sean)
Date: Mon, 25 Aug 2008 10:35:19 -0700
Subject: [ofa-general] [RDMA CM IPv6 support. PATCHv3 ] Comparision to PATCHv2
In-Reply-To: <1218616779.4186.4.camel@linux-zn6t.site>
References: <1218614650.19941.4.camel@linux-zn6t.site> <1218616779.4186.4.camel@linux-zn6t.site>
Message-ID:

Roland,

Taken collectively, I don't see any major issues with these changes for 2.6.28. I hope to test these by the end of the week, but whenever you're happy with them, feel free to merge them.

Sean

From sean.hefty at intel.com  Mon Aug 25 12:13:15 2008
From: sean.hefty at intel.com (Hefty, Sean)
Date: Mon, 25 Aug 2008 12:13:15 -0700
Subject: [ofa-general] [PATCH 2.6.27] ib/cm: free cm_device structure
Message-ID:

commit 110cf374a809817d5c080c0ac82d65d029820a66 introduced a memory leak. Free the leaked structure.
Signed-off-by: Sean Hefty
---
 drivers/infiniband/core/cm.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index 922d35f..3cab0ce 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -3748,6 +3748,7 @@ error1:
 		cm_remove_port_fs(port);
 	}
 	device_unregister(cm_dev->device);
+	kfree(cm_dev);
 }
 
 static void cm_remove_one(struct ib_device *ib_device)
@@ -3776,6 +3777,7 @@ static void cm_remove_one(struct ib_device *ib_device)
 		cm_remove_port_fs(port);
 	}
 	device_unregister(cm_dev->device);
+	kfree(cm_dev);
 }
 
 static int __init ib_cm_init(void)

From rdreier at cisco.com  Mon Aug 25 13:33:01 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Mon, 25 Aug 2008 13:33:01 -0700
Subject: [ofa-general] [PATCH 2.6.27] ib/cm: free cm_device structure
In-Reply-To: (Sean Hefty's message of "Mon, 25 Aug 2008 12:13:15 -0700")
References:
Message-ID:

 > 	device_unregister(cm_dev->device);
 > +	kfree(cm_dev);

Is this really correct? Couldn't something (eg an open file from sysfs) still have a reference to cm_dev->device even after device_unregister() returns, in which case this ends up freeing an object that's still live?

From matthias at sgi.com  Mon Aug 25 14:09:55 2008
From: matthias at sgi.com (Matthias Blankenhaus)
Date: Mon, 25 Aug 2008 14:09:55 -0700 (PDT)
Subject: [ofa-general] ibdiagnet FM master / standby error report
Message-ID:

Howdy !

I have noticed that ibdiagnet reports an error when using a master / standby FM configuration. I am using OFED-1.3.1.

Here it goes:

# ibdiagnet
....
-I---------------------------------------------------
-I- Bad Fabric SM Info
-I---------------------------------------------------
-E- Found more then one master SM in the discover fabric
    r1lead/P1 priority:15
    r2lead/P1 priority:0
....
-I- Stages Status Report:
    STAGE                                    Errors Warnings
    Bad GUIDs/LIDs Check                     0      0
    Link State Active Check                  0      0
    SM Info Check                            1      0
    Performance Counters Report              0      6
    Partitions Check                         0      0
    IPoIB Subnets Check                      0      0

This is incorrect as we have only one master, namely r1lead. r2lead is a standby only.

The culprit for this problem seems to be this file:

/usr/lib64/ibdiagnet1.2/ibdebug.tcl

Here is the if stmt that creates the problem:

...
2988 proc CheckSM {} {
2989     global SM G
2990     set master 3
2991     if {![info exists SM($master)]} {
2992         inform "-I-ibdiagnet:bad.sm.header"
2993         inform "-E-ibdiagnet:no.SM"
2994     } else {
2995         if {[llength $SM($master)] != 1} {
==>          ^^^^
2996             inform "-I-ibdiagnet:bad.sm.header"
2997             inform "-E-ibdiagnet:many.SM.master"
2998             foreach element $SM($master) {
2999                 set tmpDirectPath [lindex $element 0]
3000                 set nodeName [DrPath2Name $tmpDirectPath -port [GetEntryPort $tmpDirectPath]]
3001                 if { $tmpDirectPath == "" } {
....

It appears that this code does not factor in the priority of an individual FM. It simply counts the FM instances, and if the resulting number does not equal 1, the tool reports an error.

From studying the OFED code (osm_state_mgr.h::osm_sm_is_greater_than()) it is clear that, even if two FM instances for the same fabric have an identical priority, there is always only one winner, because the tie is resolved via the GUIDs. Here is the relevant OFED code:

static inline boolean_t osm_sm_is_greater_than(IN const uint8_t l_priority,
                                               IN const ib_net64_t l_guid,
                                               IN const uint8_t r_priority,
                                               IN const ib_net64_t r_guid)
{
        if (l_priority > r_priority) {
                return (TRUE);
        } else {
                if (l_priority == r_priority) {
                        if (cl_ntoh64(l_guid) < cl_ntoh64(r_guid)) {
                                return (TRUE);
                        }
                }
        }
        return (FALSE);
}

Thus, in my opinion the check against the number of FM instances in ibdebug.tcl is superfluous. And indeed, removing the check resolves the issue.
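
[Editorial aside] The tie-break described above (higher priority wins; on equal priority the numerically lower GUID wins) can be sketched in a few lines of Python. This is only an illustration of the comparison in osm_sm_is_greater_than(): elect_master() and the GUID values are hypothetical, the GUIDs are assumed to be already in host byte order, and none of this is code from OpenSM or ibdiagnet.

```python
def sm_is_greater_than(l_priority, l_guid, r_priority, r_guid):
    """Mirror of the quoted C comparison: priority first, then lower GUID."""
    if l_priority > r_priority:
        return True
    return l_priority == r_priority and l_guid < r_guid


def elect_master(sms):
    """Pick the single winner from a non-empty list of (name, priority, guid)."""
    winner = sms[0]
    for candidate in sms[1:]:
        if sm_is_greater_than(candidate[1], candidate[2], winner[1], winner[2]):
            winner = candidate
    return winner


if __name__ == "__main__":
    # Priorities taken from the report above; the GUIDs are made up.
    fabric = [("r2lead", 0, 0x0002C90200000001), ("r1lead", 15, 0x0002C90200000002)]
    print(elect_master(fabric)[0])
```

With distinct GUIDs the comparison is a strict ordering, so exactly one instance comes out on top however many instances are present.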
The new version of CheckSM looks like this:

proc CheckSM {} {
    global SM G
    set master 3
    if {![info exists SM($master)]} {
        inform "-I-ibdiagnet:bad.sm.header"
        inform "-E-ibdiagnet:no.SM"
    }
    return 0
}

This simply checks whether there is an FM instance at all. If there is none, then that constitutes an error. However, multiple FM instances should not create an error.

Thanx,
Matthias

From sean.hefty at intel.com  Mon Aug 25 15:34:44 2008
From: sean.hefty at intel.com (Hefty, Sean)
Date: Mon, 25 Aug 2008 15:34:44 -0700
Subject: [ofa-general] [PATCH 2.6.27] ib/cm: free cm_device structure
In-Reply-To:
References:
Message-ID:

>Is this really correct? Couldn't something (eg an open file from sysfs)
>still have a reference to cm_dev->device even after device_unregister()
>returns, in which case this ends up freeing an object that's still live?

I asked myself the same question, and I think this is okay. I'd appreciate it if anyone else could review the changes though; the most relevant pieces of code are in cm.c immediately before and after cm_add_one().

The CM counter sysfs files are created off the cm_port object, which is allocated separately, and is freed through the kobject .release method. The cm_port object has a pointer to cm_dev, but only references it during typical CM operations (sending messages, creating reply AVs, etc.) The cm_port does make a reference to cm_dev->device->kobj when calling kobject_init_and_add(), but the lifetime for cm_dev->device->kobj should be managed separately from cm_dev.

- Sean

From matthias at sgi.com  Mon Aug 25 16:18:03 2008
From: matthias at sgi.com (Matthias Blankenhaus)
Date: Mon, 25 Aug 2008 16:18:03 -0700 (PDT)
Subject: [ofa-general] osmtest dies with SIGABRT / buffer overflow
Message-ID:

Howdy !

I played around with osmtest and got it to a point where I can consistently crash osmtest.
Please, take a look at the following: OFED-1.3.1 HW: X86_64 OS: SLES10SP2 Here is what I did to crash it: # osmtest -f c // works fine and creates osmtest.dat # osmtest -v // crashes ... STACK TRACE =========== Aug 22 17:33:35 076768 [6FCE12E0] 0x04 -> osmt_get_service_by_name: Expected and found 0 records Aug 22 17:33:35 076781 [6FCE12E0] 0x04 -> osmt_get_service_by_id: Getting service record: id: 0x0000000019494496 Aug 22 17:33:35 076795 [6FCE12E0] 0x04 -> osm_vendor_send: RMPP 0 length 256 Aug 22 17:33:35 076925 [6FCE12E0] 0x04 -> osmt_get_service_by_id: Found service record: name: osmt.srvc.719885380.6244 id: 0x0000000019494496 Aug 22 17:33:35 076939 [6FCE12E0] 0x04 -> osmt_get_service_by_id: Expected and found 1 records Aug 22 17:33:35 076951 [6FCE12E0] 0x04 -> osmt_get_service_by_id: Getting service record: id: 0x00007fff3b7751d0 Aug 22 17:33:35 076964 [6FCE12E0] 0x04 -> osm_vendor_send: RMPP 0 length 256 Aug 22 17:33:35 077052 [41001940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR 5501: Remote error:0x0003 Aug 22 17:33:35 077064 [41001940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_REMOTE_ERROR) Aug 22 17:33:35 077089 [6FCE12E0] 0x01 -> osmt_get_service_by_id: IS EXPECTED ERROR ^^^^ Aug 22 17:33:35 077100 [6FCE12E0] 0x04 -> osmt_get_service_by_id: Found service record: name: id: 0x00007fff3b7751d0 Aug 22 17:33:35 077107 [6FCE12E0] 0x04 -> osmt_get_service_by_id: Expected and found 0 records Aug 22 17:33:35 077117 [6FCE12E0] 0x04 -> osmt_get_service_by_id_and_name: Getting service record: id: 0x000000006b8b2d03 and name: osmt.srvc.1804289383.6244 Aug 22 17:33:35 077132 [6FCE12E0] 0x04 -> osm_vendor_send: RMPP 0 length 256 Aug 22 17:33:35 077235 [6FCE12E0] 0x04 -> osmt_get_service_by_id_and_name: Found service record: name: osmt.srvc.1804289383.6244 id: 0x000000006b8b2d03 Aug 22 17:33:35 077248 [6FCE12E0] 0x04 -> osmt_get_service_by_id_and_name: Expected and found 1 records Aug 22 17:33:35 077261 [6FCE12E0] 0x04 -> osmt_get_service_by_id_and_name: 
Getting service record: id: 0x0000000019494496 and name: osmt.srvc.719885380.6244 Aug 22 17:33:35 077274 [6FCE12E0] 0x04 -> osm_vendor_send: RMPP 0 length 256 Aug 22 17:33:35 077368 [6FCE12E0] 0x04 -> osmt_get_service_by_id_and_name: Found service record: name: osmt.srvc.719885380.6244 id: 0x0000000019494496 Aug 22 17:33:35 077379 [6FCE12E0] 0x04 -> osmt_get_service_by_id_and_name: Expected and found 1 records Aug 22 17:33:35 077391 [6FCE12E0] 0x04 -> osmt_get_service_by_id_and_name: Getting service record: id: 0x000000006b8b2d03 and name: osmt.srvc.1714636912.6244 Aug 22 17:33:35 077404 [6FCE12E0] 0x04 -> osm_vendor_send: RMPP 0 length 256 Aug 22 17:33:35 077495 [41001940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR 5501: Remote error:0x0003 Aug 22 17:33:35 077507 [41001940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_REMOTE_ERROR) Aug 22 17:33:35 077528 [6FCE12E0] 0x01 -> osmt_get_service_by_id_and_name: IS EXPECTED ERROR ^^^^ Aug 22 17:33:35 077536 [6FCE12E0] 0x04 -> osmt_get_service_by_id_and_name: Found service record: name: osmt.srvc.1714636912.6244 id: 0x000000006b8b2d03 Aug 22 17:33:35 077541 [6FCE12E0] 0x04 -> osmt_get_service_by_id_and_name: Expected and found 0 records Aug 22 17:33:35 077555 [6FCE12E0] 0x04 -> osmt_get_service_by_id_and_name: Getting service record: id: 0x000000006633300c and name: osmt.srvc.424238330.6244 Aug 22 17:33:35 077569 [6FCE12E0] 0x04 -> osm_vendor_send: RMPP 0 length 256 Aug 22 17:33:35 077655 [41001940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR 5501: Remote error:0x0003 Aug 22 17:33:35 077664 [41001940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_REMOTE_ERROR) Aug 22 17:33:35 077682 [6FCE12E0] 0x01 -> osmt_get_service_by_id_and_name: IS EXPECTED ERROR ^^^^ Aug 22 17:33:35 077689 [6FCE12E0] 0x04 -> osmt_get_service_by_id_and_name: Found service record: name: osmt.srvc.424238330.6244 id: 0x000000006633300c Aug 22 17:33:35 077694 [6FCE12E0] 0x04 -> osmt_get_service_by_id_and_name: Expected and found 0 records Aug 22 
17:33:35 077705 [6FCE12E0] 0x04 -> osmt_get_service_by_name: Getting service record: name: osmt.srvc.1957747789.6244 Aug 22 17:33:35 077717 [6FCE12E0] 0x04 -> osm_vendor_send: RMPP 0 length 256 Aug 22 17:33:35 077810 [41001940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR 5501: Remote error:0x0003 Aug 22 17:33:35 077819 [41001940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_REMOTE_ERROR) Aug 22 17:33:35 077831 [6FCE12E0] 0x01 -> osmt_get_service_by_name: IS EXPECTED ERROR ^^^^ Aug 22 17:33:35 077839 [6FCE12E0] 0x04 -> osmt_get_service_by_name: Found service record: name: osmt.srvc.1957747789.6244 id: 0x0900000000000000 Aug 22 17:33:35 077846 [6FCE12E0] 0x04 -> osmt_get_service_by_name: Expected and found 0 records Aug 22 17:33:35 077857 [6FCE12E0] 0x04 -> osmt_get_service_by_name: Getting service record: name: osmt.srvc.424238330.6244 Aug 22 17:33:35 077869 [6FCE12E0] 0x04 -> osm_vendor_send: RMPP 0 length 256 Aug 22 17:33:35 077958 [41001940] 0x01 -> __osmv_sa_mad_rcv_cb: ERR 5501: Remote error:0x0003 Aug 22 17:33:35 077970 [41001940] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_REMOTE_ERROR) Aug 22 17:33:35 077983 [6FCE12E0] 0x01 -> osmt_get_service_by_name: IS EXPECTED ERROR ^^^^ Aug 22 17:33:35 077992 [6FCE12E0] 0x04 -> osmt_get_service_by_name: Found service record: name: osmt.srvc.424238330.6244 id: 0x0900000000000000 Aug 22 17:33:35 077997 [6FCE12E0] 0x04 -> osmt_get_service_by_name: Expected and found 0 records Aug 22 17:33:35 078007 [6FCE12E0] 0x04 -> osmt_get_service_by_name: Getting service record: name: osmt.srvc.719885380.6244 Aug 22 17:33:35 078020 [6FCE12E0] 0x04 -> osm_vendor_send: RMPP 0 length 256 Aug 22 17:33:35 078120 [6FCE12E0] 0x04 -> osmt_get_service_by_name: Found service record: name: osmt.srvc.719885380.6244 id: 0x0000000019494496 Aug 22 17:33:35 078132 [6FCE12E0] 0x04 -> osmt_get_service_by_name: Expected and found 1 records *** buffer overflow detected ***: /usr/sbin/osmtest terminated Aug 22 17:33:35 079046 [41001940] 
0x01 -> umad_receiver: ERR 5404: recv error on MAD sized umad (Interrupted system call) Aug 22 17:33:35 080420 [41001940] 0x01 -> umad_receiver: ERR 5404: recv error on MAD sized umad (Interrupted system call) ======= Backtrace: ========= /lib64/libc.so.6(__chk_fail+0x2f)[0x2b366fb7231f] /lib64/libc.so.6[0x2b366fb71859] /lib64/libc.so.6(_IO_default_xsputn+0x8e)[0x2b366fb09d0e] /lib64/libc.so.6(_IO_padn+0x9b)[0x2b366fafe60b] /lib64/libc.so.6(_IO_vfprintf+0x1467)[0x2b366fae2157] /lib64/libc.so.6(__vsprintf_chk+0x9d)[0x2b366fb718fd] /lib64/libc.so.6(__sprintf_chk+0x80)[0x2b366fb71840] /usr/sbin/osmtest[0x40fa51] /usr/sbin/osmtest[0x4110e4] /usr/sbin/osmtest[0x40cf13] /usr/sbin/osmtest[0x402821] /lib64/libc.so.6(__libc_start_main+0xf4)[0x2b366fabd184] /usr/sbin/osmtest[0x401d79] ======= Memory map: ======== 00400000-00428000 r-xp 00000000 08:06 668362 /usr/sbin/osmtest 00528000-00529000 rw-p 00028000 08:06 668362 /usr/sbin/osmtest 00529000-005f0000 rw-p 00529000 00:00 0 [heap] 40000000-40001000 ---p 40000000 00:00 0 40001000-40801000 rw-p 40001000 00:00 0 40801000-40802000 ---p 40801000 00:00 0 40802000-41002000 rw-p 40802000 00:00 0 2aaaaaade000-2aaaaaaeb000 r-xp 00000000 08:06 536874380 /lib64/libgcc_s.so.1 2aaaaaaeb000-2aaaaabea000 ---p 0000d000 08:06 536874380 /lib64/libgcc_s.so.1 2aaaaabea000-2aaaaabeb000 rw-p 0000c000 08:06 536874380 /lib64/libgcc_s.so.1 2b366f330000-2b366f34b000 r-xp 00000000 08:06 536874326 /lib64/ld-2.4.so 2b366f34b000-2b366f34d000 rw-p 2b366f34b000 00:00 0 2b366f44a000-2b366f44c000 rw-p 0001a000 08:06 536874326 /lib64/ld-2.4.so 2b366f44c000-2b366f44f000 r-xp 00000000 08:06 612666 /usr/lib64/libibcommon.so.1.0.0 2b366f44f000-2b366f54e000 ---p 00003000 08:06 612666 /usr/lib64/libibcommon.so.1.0.0 2b366f54e000-2b366f54f000 rw-p 00002000 08:06 612666 /usr/lib64/libibcommon.so.1.0.0 2b366f54f000-2b366f55e000 r-xp 00000000 08:06 642309 /usr/lib64/libopensm.so.1.1.0 2b366f55e000-2b366f65e000 ---p 0000f000 08:06 642309 /usr/lib64/libopensm.so.1.1.0 
2b366f65e000-2b366f660000 rw-p 0000f000 08:06 642309 /usr/lib64/libopensm.so.1.1.0 2b366f660000-2b366f66c000 r-xp 00000000 08:06 642311 /usr/lib64/libosmcomp.so.2.0.4 2b366f66c000-2b366f76c000 ---p 0000c000 08:06 642311 /usr/lib64/libosmcomp.so.2.0.4 2b366f76c000-2b366f76d000 rw-p 0000c000 08:06 642311 /usr/lib64/libosmcomp.so.2.0.4 2b366f76d000-2b366f774000 r-xp 00000000 08:06 642312 /usr/lib64/libosmvendor.so.2.0.0 2b366f774000-2b366f874000 ---p 00007000 08:06 642312 /usr/lib64/libosmvendor.so.2.0.0 2b366f874000-2b366f875000 rw-p 00007000 08:06 642312 /usr/lib64/libosmvendor.so.2.0.0 2b366f875000-2b366f876000 rw-p 2b366f875000 00:00 0 2b366f876000-2b366f87b000 r-xp 00000000 08:06 613219 /usr/lib64/libibumad.so.1.0.3 2b366f87b000-2b366f97a000 ---p 00005000 08:06 613219 /usr/lib64/libibumad.so.1.0.3 2b366f97a000-2b366f97b000 rw-p 00004000 08:06 613219 /usr/lib64/libibumad.so.1.0.3 2b366f97b000-2b366f97c000 rw-p 2b366f97b000 00:00 0 2b366f987000-2b366f99b000 r-xp 00000000 08:06 536874401 /lib64/libpthread-2.4.so 2b366f99b000-2b366fa9a000 ---p 00014000 08:06 536874401 /lib64/libpthread-2.4.so 2b366fa9a000-2b366fa9c000 rw-p 00013000 08:06 536874401 /lib64/libpthread-2.4.so 2b366fa9c000-2b366faa0000 rw-p 2b366fa9c000 00:00 0 2b366faa0000-2b366fbd6000 r-xp 00000000 08:06 536874368 /lib64/libc-2.4.so 2b366fbd6000-2b366fcd6000 ---p 00136000 08:06 536874368 /lib64/libc-2.4.so 2b366fcd6000-2b366fcd9000 r--p 00136000 08:06 536874368 /lib64/libc-2.4.so 2b366fcd9000-2b366fcdb000 rw-p 00139000 08:06 536874368 /lib64/libc-2.4.so 2b366fcdb000-2b366fce2000 rw-p 2b366fcdb000 00:00 0 7fff3b765000-7fff3b77a000 rw-p 7fff3b765000 00:00 0 [stack] ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0 [vdso] Program received signal SIGABRT, Aborted. 
[Switching to Thread 47512804004576 (LWP 6244)] 0x00002b366facfbb5 in raise () from /lib64/libc.so.6 (gdb) where #0 0x00002b366facfbb5 in raise () from /lib64/libc.so.6 #1 0x00002b366fad0fb0 in abort () from /lib64/libc.so.6 #2 0x00002b366fb0632b in __libc_message () from /lib64/libc.so.6 #3 0x00002b366fb7231f in __chk_fail () from /lib64/libc.so.6 #4 0x00002b366fb71859 in _IO_str_chk_overflow () from /lib64/libc.so.6 #5 0x00002b366fb09d0e in _IO_default_xsputn_internal () from /lib64/libc.so.6 #6 0x00002b366fafe60b in _IO_padn_internal () from /lib64/libc.so.6 #7 0x00002b366fae2157 in vfprintf () from /lib64/libc.so.6 #8 0x00002b366fb718fd in __vsprintf_chk () from /lib64/libc.so.6 #9 0x00002b366fb71840 in __sprintf_chk () from /lib64/libc.so.6 #10 0x000000000040fa51 in osmt_get_service_by_name_and_key (p_osmt=0x528680, sr_name=0x7fff3b774f40 "osmt.srvc.424238330.6244", rec_num=0, skey=0x7fff3b7751a0 "", p_out_rec=0x7fff3b775080) at osmt_service.c:755 #11 0x00000000004110e4 in osmt_run_service_records_flow (p_osmt=0x528680) at osmt_service.c:1571 #12 0x000000000040cf13 in osmtest_run (p_osmt=0x1864) at osmtest.c:7877 #13 0x0000000000402821 in main (argc=, argv=0x7fff3b778a38) at main.c:615 Further investigation show: (gdb) where #0 0x00002b366facfbb5 in raise () from /lib64/libc.so.6 #1 0x00002b366fad0fb0 in abort () from /lib64/libc.so.6 #2 0x00002b366fb0632b in __libc_message () from /lib64/libc.so.6 #3 0x00002b366fb7231f in __chk_fail () from /lib64/libc.so.6 #4 0x00002b366fb71859 in _IO_str_chk_overflow () from /lib64/libc.so.6 #5 0x00002b366fb09d0e in _IO_default_xsputn_internal () from /lib64/libc.so.6 #6 0x00002b366fafe60b in _IO_padn_internal () from /lib64/libc.so.6 #7 0x00002b366fae2157 in vfprintf () from /lib64/libc.so.6 #8 0x00002b366fb718fd in __vsprintf_chk () from /lib64/libc.so.6 #9 0x00002b366fb71840 in __sprintf_chk () from /lib64/libc.so.6 #10 0x000000000040fa51 in osmt_get_service_by_name_and_key (p_osmt=0x528680, sr_name=0x7fff3b774f40 
"osmt.srvc.424238330.6244", rec_num=0, skey=0x7fff3b7751a0 "", p_out_rec=0x7fff3b775080) at osmt_service.c:755 #11 0x00000000004110e4 in osmt_run_service_records_flow (p_osmt=0x528680) at osmt_service.c:1571 #12 0x000000000040cf13 in osmtest_run (p_osmt=0x1864) at osmtest.c:7877 #13 0x0000000000402821 in main (argc=, argv=0x7fff3b778a38) at main.c:615 (gdb) up #1 0x00002b366fad0fb0 in abort () from /lib64/libc.so.6(gdb) up #2 0x00002b366fb0632b in __libc_message () from /lib64/libc.so.6(gdb) up #3 0x00002b366fb7231f in __chk_fail () from /lib64/libc.so.6(gdb) up #4 0x00002b366fb71859 in _IO_str_chk_overflow () from /lib64/libc.so.6(gdb) up #5 0x00002b366fb09d0e in _IO_default_xsputn_internal () from /lib64/libc.so.6(gdb) up #6 0x00002b366fafe60b in _IO_padn_internal () from /lib64/libc.so.6(gdb) up #7 0x00002b366fae2157 in vfprintf () from /lib64/libc.so.6(gdb) up #8 0x00002b366fb718fd in __vsprintf_chk () from /lib64/libc.so.6(gdb) up #9 0x00002b366fb71840 in __sprintf_chk () from /lib64/libc.so.6(gdb) up #10 0x000000000040fa51 in osmt_get_service_by_name_and_key (p_osmt=0x528680, sr_name=0x7fff3b774f40 "osmt.srvc.424238330.6244", rec_num=0, skey=0x7fff3b7751a0 "", p_out_rec=0x7fff3b775080) at osmt_service.c:755 Finally, looking at the code it looks like we have a buffer length problem: ofed/opensm/opensm-3.1.10.sgi/osmtest/osmt_service.c: 736 osmt_get_service_by_name_and_key(IN osmtest_t * const p_osmt, 737 IN char *sr_name, 738 IN uint32_t rec_num, 739 IN uint8_t * skey, 740 OUT ib_service_record_t * p_out_rec) 741 { 742 743 ib_api_status_t status = IB_SUCCESS; 744 osmtest_req_context_t context; 745 osmv_query_req_t req; 746 ib_service_record_t svc_rec, *p_rec; 747 uint32_t num_recs = 0, i; 748 osmv_user_query_t user; 749 750 OSM_LOG_ENTER(&p_osmt->log, osmt_get_service_by_name_and_key); 751 752 if (osm_log_is_active(&p_osmt->log, OSM_LOG_VERBOSE)) { 753 char buf_service_key[33]; 754 755 sprintf(buf_service_key, 756 
"0x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x", 757 skey[0], skey[1], skey[2], skey[3], skey[4], skey[5], 758 skey[6], skey[7], skey[8], skey[9], skey[10], skey[11], 759 skey[12], skey[13], skey[14], skey[15]); ... The local variable 'buf_service_key' is 33 bytes long: 0..32. However, the format string from sprintf() is 2*16+2=34 bytes long. Thus we arrive at a buffer overflow. Not knowing much about this code the fix seems obvious: crank up the size of buf_service_key to 34. Cheers, Matthias From jsquyres at cisco.com Mon Aug 25 17:05:01 2008 From: jsquyres at cisco.com (Jeff Squyres) Date: Mon, 25 Aug 2008 20:05:01 -0400 Subject: [ofa-general] New version of mpi-selector Message-ID: Vlad -- Michele Martone reported a minor typo to me in mpi-selector which I have just fixed. I have created an ofed_1_4 git branch in http://www.openfabrics.org/git/?p= ~jsquyres/mpi-selector.git;a=summary, and I have uploaded a v1.0.2 mpi- selector tarball to ~jsquyres/mpi-selector-1.0.2.tar.[gz|bz2]. Can these be added to the OFED 1.4 release? (pulling from git or using the tarball is fine -- whatever is most convenient) Thanks. -- Jeff Squyres Cisco Systems From vlad at mellanox.co.il Tue Aug 26 00:42:01 2008 From: vlad at mellanox.co.il (Vladimir Sokolovsky) Date: Tue, 26 Aug 2008 10:42:01 +0300 Subject: [ofa-general] New version of mpi-selector In-Reply-To: References: Message-ID: <48B3B3C9.7070005@mellanox.co.il> Jeff Squyres wrote: > Vlad -- > > Michele Martone reported a minor typo to me in mpi-selector which I have > just fixed. I have created an ofed_1_4 git branch in > http://www.openfabrics.org/git/?p=~jsquyres/mpi-selector.git;a=summary, > and I have uploaded a v1.0.2 mpi-selector tarball to > ~jsquyres/mpi-selector-1.0.2.tar.[gz|bz2]. > > Can these be added to the OFED 1.4 release? (pulling from git or using > the tarball is fine -- whatever is most convenient) > > Thanks. 
> Done, Regards, Vladimir From vlad at lists.openfabrics.org Tue Aug 26 03:00:36 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Tue, 26 Aug 2008 03:00:36 -0700 (PDT) Subject: [ofa-general] ofa_1_4_kernel 20080826-0200 daily build status Message-ID: <20080826100037.05E01E6094C@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_4/linux-2.6.git git_branch: ofed_kernel Common build parameters: Passed: Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.16 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.17 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.18-53.el5 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.22.5-31-default Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on ia64 with linux-2.6.16 Passed on ia64 with linux-2.6.16.21-0.8-default Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.17 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.25 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.26 Passed on ppc64 with linux-2.6.16 Passed on ppc64 with linux-2.6.18 Passed on 
ppc64 with linux-2.6.17 Passed on ppc64 with linux-2.6.18-8.el5 Passed on ppc64 with linux-2.6.24 Passed on ppc64 with linux-2.6.19 Failed:

From tziporet at mellanox.co.il Tue Aug 26 07:10:13 2008
From: tziporet at mellanox.co.il (Tziporet Koren)
Date: Tue, 26 Aug 2008 17:10:13 +0300
Subject: [ofa-general] OFED meeting agenda for today (Aug 25, 2008)
Message-ID: <5D49E7A8952DC44FB38C38FA0D758EAD5D89FB@mtlexch01.mtl.com>

OFED meeting summary for Aug 25, 2008 on 1.4 status and schedule
================================================================

Summary:
========
- 1.4 beta was done on Aug 21
- Agreed on release schedule: GA is expected in mid-to-late October
- We will have weekly meetings starting Sep 8
- Decided on SC08 OFA BOF subjects

Details:
========
1. OFED 1.4 status:
   - Beta was done on Aug 21
   - Based on kernel 2.6.27-rc4
   - RHEL 4.7 is supported

   Still missing:
   - iSER (disabled from OFED now) - Will be ready by end of this week/beginning of next week
   - NFS/RDMA - SLES 10 almost done and should be merged soon; next will work on RHEL5.1
   - MVAPICH2 1.1 - Should be ready this week
   - Open MPI 1.3 - Decided to integrate 1.2.7 now since 1.3 beta is not ready yet. If 1.3 becomes stable soon we may switch later
   - Extended QP verb - Voltaire and Mellanox to close this week

   Bugs: Tziporet will assign all bugs without an assignee. In our next meeting we will review all bugs and decide which should be fixed

2. OFED 1.4 schedule:
   The RC plan:
   - RC1 - Sept 3, 2008
   - RC2 - Sept 17, 2008
   - RC3 - Sept 25, 2008
   - more RCs - as needed
   - GA - mid-to-late October (based on the bugs we decide to fix)

   This schedule meets the Interop event needs: RC on Sep 22, and GA by October 29.

3.
OFA BOF in SC08:
   Subjects we will cover:
   - Overview of OFED 1.4
   - OFED status in Linux distros
   - Overview of WinOF 2.0
   - Update of things we decided in Sonoma
   - Some time for feedback from the audience: Q&A and ideas/features for the future

   Woody & Betsy will coordinate this in September (need to work with Gilad to get the Windows data)

4. OFA server upgrade update:
   Server is ready but Jeff B. does not have time due to the NFS/RDMA backports. Need to see if he can get help with administration.

Tziporet

From christopher.tanner at gatech.edu Tue Aug 26 09:08:43 2008
From: christopher.tanner at gatech.edu (Christopher Tanner)
Date: Tue, 26 Aug 2008 12:08:43 -0400
Subject: [ofa-general] Re: Infiniband packages for Hardy
In-Reply-To:
References:
Message-ID:

Thanks for the info Roland. I just subscribed to the openfabrics list, so I'll start emailing there from now on.

I created your script and the mlx4_ib module is loading, so that's a good step. However, OpenMPI is still not finding the HCAs. I'm getting another error (on each node) that I didn't post in the Ubuntu forum:

libibverbs: Fatal: couldn't read uverbs ABI version.

I uninstalled and reinstalled the libibverbs1 package (apt-get purge, apt-get install), but the error persists. Attached is the output from lsmod (just to confirm the loaded modules).

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: lsmod_out.txt
URL:
-------------- next part --------------

-------------------------------------------
Chris Tanner
Space Systems Design Lab
Georgia Institute of Technology
christopher.tanner at gatech.edu
-------------------------------------------

On Aug 26, 2008, at 11:11 AM, Roland Dreier wrote:

>> Apologies for contacting you directly - are you available to provide
>> some support for the Infiniband packages you built for Ubuntu 8.04?
I >> posted my issue on the Ubuntu forums a while ago >> (http://ubuntuforums.org/showthread.php?t=896924 ), but haven't >> received any response yet. > > No problem, I don't really read web forums. The best way to get help > with IB in general is to email the list general at lists.openfabrics.org. > > In your particular case I would guess the problem is that you don't > have > the mlx4_ib module loaded; by default only the mlx4_core module will > be > auto-loaded. To test this, you can do "sudo modprobe mlx4_ib" by hand > and try it. (This is all based on the fact that you installed the > libmlx4 packages, so I'm assuming you have ConnectX cards). > > A better solution would be to create a file named > /etc/modprobe.d/mlx4_core with the line > > install mlx4_core /sbin/modprobe --ignore-install mlx4_core && /sbin/ > modprobe mlx4_ib > > in it, which should make mlx4_ib load by default on boot. > >> Also, if you are willing to provide support, I would like your >> permission to post some of your suggestions and the ultimate >> resolution on the Ubuntu forum so that others can benefit from our >> dialogue. > > Go ahead and add it to the thread. > > - R. From rdreier at cisco.com Tue Aug 26 09:12:51 2008 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 26 Aug 2008 09:12:51 -0700 Subject: [ofa-general] Re: Infiniband packages for Hardy In-Reply-To: (Christopher Tanner's message of "Tue, 26 Aug 2008 12:08:43 -0400") References: Message-ID: > libibverbs: Fatal: couldn't read uverbs ABI version. OK, probably another missing module. Try "sudo modprobe rdma_ucm" (and you can add an "rdma_ucm" line to /etc/modules to get it loaded automatically). - R. 
From rdreier at cisco.com Tue Aug 26 11:17:36 2008 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 26 Aug 2008 11:17:36 -0700 Subject: [ofa-general] Re: IB/ipath: remove unused #include In-Reply-To: <20080823131721.1277.WEIYI.HUANG@gmail.com> (Huang Weiyi's message of "Sat, 23 Aug 2008 13:56:16 +0800") References: <20080823131721.1277.WEIYI.HUANG@gmail.com> Message-ID: > The driver(s) below do not use LINUX_VERSION_CODE nor KERNEL_VERSION. > drivers/infiniband/hw/ipath/ipath_fs.c > This patch removes the said #include . Looks like all includes of linux/version.h were already removed from drivers/infiniband by the commit 7a8fc9b2 ("removed unused #include 's"). Thanks, Roland From kliteyn at dev.mellanox.co.il Tue Aug 26 13:11:59 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Tue, 26 Aug 2008 23:11:59 +0300 Subject: [ofa-general] osmtest dies with SIGABRT / buffer overflow In-Reply-To: References: Message-ID: <48B4638F.7050500@dev.mellanox.co.il> Hi Matthias, Matthias Blankenhaus wrote: > Howdy ! > > I played around with osmtest and got it to a point where I can consistenly > crash osmtest. Please, take a look at the following: > > OFED-1.3.1 > HW: X86_64 > OS: SLES10SP2 > > > Here is what I did to crash it: > > # osmtest -f c // works fine and creates osmtest.dat > # osmtest -v // crashes ... > > ... > > Finally, looking at the code it looks like we have a buffer length > problem: > > ... > > 755 sprintf(buf_service_key, > 756 "0x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x", > 757 skey[0], skey[1], skey[2], skey[3], skey[4], skey[5], > 758 skey[6], skey[7], skey[8], skey[9], skey[10], skey[11], > 759 skey[12], skey[13], skey[14], skey[15]); > ... > > > The local variable 'buf_service_key' is 33 bytes long: 0..32. However, > the format string from sprintf() is 2*16+2=34 bytes long. Thus we arrive > at a buffer overflow. Not knowing much about this code the fix seems > obvious: crank up the size of buf_service_key to 34. 
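A minimal sketch of the buffer-size arithmetic from the analysis quoted above, not the osmtest code itself. The formatted key is "0x" plus two hex digits for each of the 16 key bytes, which is 34 characters, so the buffer needs 35 bytes once the terminating NUL is counted (the size used by the OFED 1.3 patch posted later in this thread). The helper name format_service_key and the macros SKEY_LEN and SKEY_STR_SIZE are hypothetical, introduced only for illustration; snprintf() is used here so that an undersized buffer is reported instead of overrun.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical names, for illustration only (not part of osmtest). */
#define SKEY_LEN 16
/* "0x" prefix + 2 hex digits per key byte + terminating NUL = 35 bytes. */
#define SKEY_STR_SIZE (2 + 2 * SKEY_LEN + 1)

/*
 * Format a 16-byte service key as "0x" followed by 32 hex digits.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if the destination buffer is too small.
 */
int format_service_key(char *buf, size_t size, const uint8_t *skey)
{
	size_t off;
	int i, n;

	n = snprintf(buf, size, "0x");
	if (n < 0 || (size_t)n >= size)
		return -1;
	off = (size_t)n;

	for (i = 0; i < SKEY_LEN; i++) {
		/* snprintf never writes more than size - off bytes here. */
		n = snprintf(buf + off, size - off, "%02x", (unsigned)skey[i]);
		if (n < 0 || (size_t)n >= size - off)
			return -1;
		off += (size_t)n;
	}
	return (int)off;
}
```

Called with a buffer of SKEY_STR_SIZE bytes, the helper returns 34; called with the original 33-byte buffer, it returns -1 rather than writing past the end of the array.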
> You're right, it looks like the leading "0x" was forgotten when the buffer length was calculated. Thanks for the detailed analysis! You're also right about the fix - the short fix would be just increasing the buffer size. The longer fix, however, would be getting rid of the unnecessary sprintf usage completely. In OFED 1.4 this code is already fixed (as well as all the other places in opensm/osmtest where sprintf was used). Anyway, I'll send an OFED 1.3.1 patch to Sasha - let's go with the short fix :) -- Yevgeny > > Cheers, > Matthias > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > From kliteyn at dev.mellanox.co.il Tue Aug 26 13:26:30 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Tue, 26 Aug 2008 23:26:30 +0300 Subject: [ofa-general] ibdiagnet FM master / standby error report In-Reply-To: References: Message-ID: <48B466F6.6090602@dev.mellanox.co.il> Hi Matthias, Matthias Blankenhaus wrote: > Howdy ! > > I have noticed that ibdiagnet reports an error when using a > master / standby FM configuration. I am using OFED-1.3.1. > > Here it goes: > > # ibdiagnet > .... > -I--------------------------------------------------- > -I- Bad Fabric SM Info > -I--------------------------------------------------- > -E- Found more then one master SM in the discover fabric > r1lead/P1 priority:15 > r2lead/P1 priority:0 > .... > -I- Stages Status Report: > STAGE Errors Warnings > Bad GUIDs/LIDs Check 0 0 > Link State Active Check 0 0 > SM Info Check 1 0 > Performance Counters Report 0 6 > Partitions Check 0 0 > IPoIB Subnets Check 0 0 > > > This is incorrect as we have only one master namely r1lead. r2lead is a > standby only. 
> > > The culprit for this problem seems to be this file: > > /usr/lib64/ibdiagnet1.2/ibdebug.tcl > > Here is the if stmt that creates the problem: > ... > 2988 proc CheckSM {} { > 2989 global SM G > 2990 set master 3 > 2991 if {![info exists SM($master)]} { > 2992 inform "-I-ibdiagnet:bad.sm.header" > 2993 inform "-E-ibdiagnet:no.SM" > 2994 } else { > 2995 if {[llength $SM($master)] != 1} { > ==> ^^^^ > > 2996 inform "-I-ibdiagnet:bad.sm.header" > 2997 inform "-E-ibdiagnet:many.SM.master" > 2998 foreach element $SM($master) { > 2999 set tmpDirectPath [lindex $element 0] > 3000 set nodeName [DrPath2Name $tmpDirectPath -port [GetEntryPort $tmpDirectPath]] > 3001 if { $tmpDirectPath == "" } { > .... > > It appears that this code does not factor in the priority of an individual > FM. It simply counts the FM instances and if the resulting number not > equals 1, then this tools indicates an error. > > From studying the OFED code (osm_state_mgr.h::osm_sm_is_greater_than()) it > is clear that, even if two FM instances for the same fabric have an > identical priority, there is always only one winner by resolving the tie via guids. > > Here is the relevant OFED code: > > static inline boolean_t > osm_sm_is_greater_than(IN const uint8_t l_priority, > IN const ib_net64_t l_guid, > IN const uint8_t r_priority, IN const ib_net64_t r_guid) > { > if (l_priority > r_priority) { > return (TRUE); > } else { > if (l_priority == r_priority) { > if (cl_ntoh64(l_guid) < cl_ntoh64(r_guid)) { > return (TRUE); > } > } > } > return (FALSE); > } > > > Thus, in my opinion the check against number of FM instances in > ibdebug.tcl is superfluous. And indeed, removing the check resolves the > issue. The new version of the above func looks like this: > > > proc CheckSM {} { > global SM G > set master 3 > if {![info exists SM($master)]} { > inform "-I-ibdiagnet:bad.sm.header" > inform "-E-ibdiagnet:no.SM" > } > return 0 > } > > > This simply checks whether there is a FM instance at all. 
If there is > none, then that constitutes an error. However, multiple FM instances > should not create an error. This check in ibdiagnet is supposed to report an error when there is more than one *master* SM in the subnet. That can happen in some cases, so the check is valid. However, I think that only a master SM can get into that SM list, so either you really have a problem with two master SMs in the subnet, or there is a bug in ibdiagnet and it somehow included a non-master SM in that list. -- Yevgeny > > Thanx, > Matthias From rdreier at cisco.com Tue Aug 26 15:31:59 2008 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 26 Aug 2008 15:31:59 -0700 Subject: [ofa-general] Re: [PATCH] mlx4_ib: set lkey and rkey for the fast memory region. In-Reply-To: <20080825074315.GA10923@mellanox.co.il> (Vladimir Sokolovsky's message of "Mon, 25 Aug 2008 10:43:15 +0300") References: <20080825074315.GA10923@mellanox.co.il> Message-ID: thanks, applied for 2.6.27 From kliteyn at dev.mellanox.co.il Wed Aug 27 00:04:20 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Wed, 27 Aug 2008 10:04:20 +0300 Subject: [ofa-general] [PATCH] osmtest: fixing core dump (ofed_1_3) Message-ID: <48B4FC74.2070903@dev.mellanox.co.il> Hi Sasha, As Matthias points out, the buffer that is used by sprintf is too small. It looks like the leading '0x' was overlooked when the buffer length was calculated. Please apply to the OFED_1_3 branch only - you have already cleaned up all the sprintf usage in the trunk. 
Signed-off-by: Yevgeny Kliteynik --- opensm/osmtest/osmt_service.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/opensm/osmtest/osmt_service.c b/opensm/osmtest/osmt_service.c index 6d377af..b5fea32 100644 --- a/opensm/osmtest/osmt_service.c +++ b/opensm/osmtest/osmt_service.c @@ -750,7 +750,7 @@ osmt_get_service_by_name_and_key(IN osmtest_t * const p_osmt, OSM_LOG_ENTER(&p_osmt->log, osmt_get_service_by_name_and_key); if (osm_log_is_active(&p_osmt->log, OSM_LOG_VERBOSE)) { - char buf_service_key[33]; + char buf_service_key[35]; sprintf(buf_service_key, "0x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x", -- 1.5.1.4 From vlad at lists.openfabrics.org Wed Aug 27 02:59:15 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Wed, 27 Aug 2008 02:59:15 -0700 (PDT) Subject: [ofa-general] ofa_1_4_kernel 20080827-0200 daily build status Message-ID: <20080827095915.E6C9CE60385@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_4/linux-2.6.git git_branch: ofed_kernel Common build parameters: Passed: Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.22 Passed on x86_64 with linux-2.6.16 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.17 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.18-53.el5 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.22.5-31-default Passed on x86_64 with linux-2.6.24 Passed on x86_64 with 
linux-2.6.25 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on ia64 with linux-2.6.16 Passed on ia64 with linux-2.6.16.21-0.8-default Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.17 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.26 Passed on ia64 with linux-2.6.25 Passed on ia64 with linux-2.6.24 Passed on ppc64 with linux-2.6.16 Passed on ppc64 with linux-2.6.17 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.18-8.el5 Passed on ppc64 with linux-2.6.24 Passed on ppc64 with linux-2.6.19 Failed: From michael.brooks at qlogic.com Wed Aug 27 10:44:28 2008 From: michael.brooks at qlogic.com (Michael Brooks) Date: Wed, 27 Aug 2008 12:44:28 -0500 Subject: [ofa-general] [PATCH] Bug 988: BMA responses are discarded in kernel Message-ID: Notice of fix for bug 988 (https://bugs.openfabrics.org/show_bug.cgi?id=988). The following patch resolves an issue where incoming BMA responses are dropped due to a bad "is response" check. Fixed to use the ib_response_mad() predicate, which correctly handles BMA MADs. Signed-off-by: Michael Brooks --- diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c index 6f42877..19d9468 100644 --- a/drivers/infiniband/core/mad.c +++ b/drivers/infiniband/core/mad.c @@ -1693,9 +1693,8 @@ static inline int rcv_has_same_gid(struct ib_mad_agent_private *mad_agent_priv, u8 port_num = mad_agent_priv->agent.port_num; u8 lmc; - send_resp = ((struct ib_mad *)(wr->send_buf.mad))-> - mad_hdr.method & IB_MGMT_METHOD_RESP; - rcv_resp = rwc->recv_buf.mad->mad_hdr.method & IB_MGMT_METHOD_RESP; + send_resp = ib_response_mad((struct ib_mad *)wr->send_buf.mad); + rcv_resp = ib_response_mad(rwc->recv_buf.mad); if (send_resp == rcv_resp) /* both requests, or both responses. 
GIDs different */ From sean.hefty at intel.com Wed Aug 27 11:00:46 2008 From: sean.hefty at intel.com (Hefty, Sean) Date: Wed, 27 Aug 2008 11:00:46 -0700 Subject: [ofa-general] RE: [PATCH] ib/mad: BMA responses are discarded in kernel In-Reply-To: References: Message-ID: >Notice of fix for bug 988 >(https://bugs.openfabrics.org/show_bug.cgi?id=988). > >The following patch resolves an issue where incoming BMA responses are >dropped due to a bad "is response" check. Fixed to use the >ib_response_mad() predicate, which correctly handles BMA MADs. > >Signed-off-by: Michael Brooks Acked-by: Sean Hefty >--- Given how long this bug has been around, I'm guessing this is okay to wait for 2.6.28. Does that seem okay with you, Michael? >diff --git a/drivers/infiniband/core/mad.c >b/drivers/infiniband/core/mad.c >index 6f42877..19d9468 100644 >--- a/drivers/infiniband/core/mad.c >+++ b/drivers/infiniband/core/mad.c >@@ -1693,9 +1693,8 @@ static inline int rcv_has_same_gid(struct >ib_mad_agent_private *mad_agent_priv, > u8 port_num = mad_agent_priv->agent.port_num; > u8 lmc; > >- send_resp = ((struct ib_mad *)(wr->send_buf.mad))-> >- mad_hdr.method & IB_MGMT_METHOD_RESP; >- rcv_resp = rwc->recv_buf.mad->mad_hdr.method & >IB_MGMT_METHOD_RESP; >+ send_resp = ib_response_mad((struct ib_mad *)wr->send_buf.mad); >+ rcv_resp = ib_response_mad(rwc->recv_buf.mad); > > if (send_resp == rcv_resp) > /* both requests, or both responses. 
GIDs different */ From michael.brooks at qlogic.com Wed Aug 27 11:24:55 2008 From: michael.brooks at qlogic.com (Michael Brooks) Date: Wed, 27 Aug 2008 13:24:55 -0500 Subject: [ofa-general] RE: [PATCH] ib/mad: BMA responses are discarded in kernel In-Reply-To: References: Message-ID: > Given how long this bug has been around, I'm guessing this is okay to wait for 2.6.28. Does that seem okay with you, Michael? We've managed to work around it in past releases, but it's a thorn, and it would be nice to see it resolved sooner rather than later. From amirv at mellanox.co.il Wed Aug 27 12:36:55 2008 From: amirv at mellanox.co.il (Amir Vadai) Date: Wed, 27 Aug 2008 22:36:55 +0300 Subject: [ofa-general] RE: [PATCH] libsdp: enable fallback to TCP for nonblocking sockets References: <48AC445D.2050704@gmail.com> <5D49E7A8952DC44FB38C38FA0D758EAD5865EA@mtlexch01.mtl.com> <48AD9C80.8030305@gmail.com> <1219590681.1564.10.camel@amirv-laptop> <48B2CD3A.5020509@gmail.com> Message-ID: <5D49E7A8952DC44FB38C38FA0D758EAD61E699@mtlexch01.mtl.com> Yossi Hi, I'm on vacation till Monday. I'll check when we can have the full fix, and if it is not in the near future we'll put your patch in until the full fix is prepared. - Amir -----Original Message----- From: Yossi Etigin [mailto:yossi.openib at gmail.com] Sent: Mon 8/25/2008 6:18 PM To: Amir Vadai Cc: general list; Oren Duer; Olga Shern Subject: Re: [PATCH] libsdp: enable fallback to TCP for nonblocking sockets Hi Amir, The single case in which we block connect() here (and only on SDP, which is rather fast) is the case that is currently not supported anyway. It can also be configurable. Anyway, we have a client which uses non-blocking sockets and really needs that feature. 
How about putting this to OFED now and writing something better later on?

--Yossi

Amir Vadai wrote:
> See below
>
> On Thu, 2008-08-21 at 19:49 +0300, Yossi Etigin wrote:
>> Hi Amir,
>>
>> What you are suggesting is to replace almost all socket functions, and I
>> don't think that this is good either.
> I agree - but breaking the non-blocking semantics is worse.
>
>> It would be write(), send(), recv(), sendto(), recvfrom(), sendmsg(),
>> recvmsg(), and we would also need to change select() (to not return when
>> fallback happens if SDP fails), and maybe also poll(). libsdp tries to
>> avoid the fast path.
> I don't see another option. We could have an #ifdef to let the user
> choose: non-blocking support or a cleaner fast path.
>> Besides, how do we know when to do fallback - can we safely assume
>> that if some socket operation fails, then it happened because
>> connect() failed?
> From a brief look at the connect man page, it says we should use select for
> writing on the socket. After select indicates writability, use
> getsockopt to determine whether connect() completed successfully or not.
>> Anyway, if I understand correctly, you suggest something like:
>>
>> int connect(fd, ...)
>> {
>>     ...
>>     set_state(fd, SDP);
>>     ...
>> }
>>
>> int read(int fd, ...)
>> {
>>     int res = socket_funcs.read(shadow_fd(fd), ...);
>>     if (res < 0 && errno != EAGAIN && sock_state(fd) == SDP) {
>>         set_state(fd, TCP);
>>         socket_funcs.connect(fd, ...);
>>         close(shadow_fd(fd));
>>         errno = EAGAIN;
>>     }
>>     return res;
>> }
>>
> ... again, I don't like it either - but I don't think we should block
> connect when the user asks not to.
> - Amir.
>> --Yossi
>>
>> Amir Vadai wrote:
>>> Yossi Hi,
>>>
>>> I think that breaking the semantics of a non-blocking socket is a bad idea.
>>> There is a solution that won't break these semantics:
>>>
>>> 1. User app calls connect().
>>>    - libsdp tries to connect through SDP.
>>> 2. User app tries another operation on the socket (e.g. read/write).
>>>    - if the SDP connection was established successfully - great
>>>    - if the SDP connection is not established yet - return -EAGAIN. This is
>>>      the same behaviour as if the TCP connection wasn't connected yet.
>>>    - if SDP timed out - return -EAGAIN and initiate a TCP connect.
>>>    - if the TCP connection is established - use it
>>>    - if the TCP connection timed out - return an error.
>>>
>>> Maybe we could optimize it and initiate a TCP connection in parallel
>>> with the SDP connection and use it only when the SDP connect times out.
>>>
>>> I will add only the second patch (the debug print fix).
>>>
>>> - Amir

From rdreier at cisco.com Wed Aug 27 14:29:57 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Wed, 27 Aug 2008 14:29:57 -0700
Subject: [ofa-general] [PATCH] IB/mlx4: Actually return L_Key and R_Key for fast register MRs
Message-ID:

From: Vladimir Sokolovsky

Initialize the L_Key and R_Key for memory regions returned from mlx4_ib_alloc_fast_reg_mr(). Otherwise callers just get garbage for the memory keys and can't do anything useful with these MRs.

Signed-off-by: Vladimir Sokolovsky
Signed-off-by: Roland Dreier
---

Hi Linus,

Please apply this for 2.6.27. This fixes a new feature we merged during the window but weren't able to test fully because device firmware wasn't ready at the time.

I'm just sending this as a patch rather than a git pull request since I think applying one patch from email is, if anything, easier than pulling a git tree. If you'd rather get singleton patches via git in the future just let me know.

Thanks,
  Roland

 drivers/infiniband/hw/mlx4/mr.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/mr.c b/drivers/infiniband/hw/mlx4/mr.c
index a4cdb46..87f5c5a 100644
--- a/drivers/infiniband/hw/mlx4/mr.c
+++ b/drivers/infiniband/hw/mlx4/mr.c
@@ -204,6 +204,8 @@ struct ib_mr *mlx4_ib_alloc_fast_reg_mr(struct ib_pd *pd,
 	if (err)
 		goto err_mr;
 
+	mr->ibmr.rkey = mr->ibmr.lkey = mr->mmr.key;
+
 	return &mr->ibmr;
 
 err_mr:
-- 
1.5.6.3

From pmorreale at novell.com Wed Aug 27 15:16:04 2008
From: pmorreale at novell.com (Peter W. Morreale)
Date: Wed, 27 Aug 2008 22:16:04 +0000
Subject: [ofa-general] ibv_reg_mr() failing to an mmap'ed iomemory region.
Message-ID: <1219875364.23236.268.camel@hermosa.site>

Hi all,

I have an RDMA application that is failing in ibv_reg_mr() and I'm unsure why.

The application consists of two parts, a userspace application that is performing the RDMA transfers via verbs, and a kernel module that maintains the memory space used in the transfers. This is on a SUSE 2.6.22 kernel.

The kernel module manages a contiguous region of physical memory and provides a mmap interface to map the regions to userspace. The application dirties the memory and periodically transfers the contents to a remote node.

The transfer algorithm supports multiple transports including Ethernet, SDP, and RDMA. Or rather, is designed to support RDMA. :-)

Things work fine for Ethernet sockets and SDP, however I consistently fail attempting to ibv_reg_mr(). Since this is a boolean operation (returns a pointer or NULL) I'm not sure what is wrong.

Note that I can bypass the "allocation" portion of the app and substitute a malloc() instead of asking for the memory region in the kernel module. In this case, RDMA (e.g: ibv_reg_mr()) works flawlessly.

Since the other transports (and application) can reference the kernel memory region without issue, I'm a little lost as to what is preventing ibv_reg_mr() from accessing this space.
Note that this memory is marked IORESOURCE_BUSY | IORESOURCE_IO. Does the kernel-side of ibv_reg_mr() (assuming there is one) do a __request_region()? Thanks for any and all comments... -PWM From halr at obsidianresearch.com Wed Aug 27 15:56:52 2008 From: halr at obsidianresearch.com (Hal Rosenstock) Date: Wed, 27 Aug 2008 16:56:52 -0600 Subject: [ofa-general] minimum sw components requirement for driver/opensm in a single unmanaged switch network In-Reply-To: References: Message-ID: <48B5DBB4.70108@obsidianresearch.com> Yicheng Jia wrote: > My operation is quite simple: connect QPs and do RDMA read/write. In this > case, the opensm is not in need when the subnet is up, correct? > Is this a production subnet ? Do you need to deal with any failures ? -- Hal > Thanks! > Yicheng > > > > > "Dotan Barak" > 08/21/2008 02:33 PM > > To > "Yicheng Jia" > cc > "Hal Rosenstock" , general at lists.openfabrics.org > Subject > Re: [ofa-general] minimum sw components requirement for driver/opensm in a > single unmanaged switch network > > > > > > > On Thu, Aug 21, 2008 at 10:16 PM, Yicheng Jia wrote: > >> Hi Hal, >> >> Can opensm just run once? When the subnet is up, it can exit assume that >> > no > >> change will be made in the subnet. >> >> > Yes, depend on the serives that you will need/use. > > For example: if you use operations that requires SA query, you must > have a live SM. > > If you will connect the QPs in the subnet by yourself (for example, > using socket) you can manage without a live SM in the subnet ... > > Dotan > >> Thanks! 
>> Yicheng >> >> >> >> "Hal Rosenstock" >> >> 07/10/2008 09:15 PM >> >> To >> "Yicheng Jia" >> cc >> "Jim Mott" , general at lists.openfabrics.org >> Subject >> Re: [ofa-general] minimum sw components requirement for driver/opensm in >> > a > >> single unmanaged switch network >> >> >> >> >> On Thu, Jul 10, 2008 at 7:39 PM, Yicheng Jia wrote: >> >>>> If you want to avoid all the SM stuff, and are willing to program the >>>> switches directly (a few mads) >>>> >>> Is it done by opensm? >>> >> Yes. >> >> >>> What information should be set up in the switch by >>> opensm? >>> >> Things like the PortInfos and LFT. See IBA spec vol 1 14.2.5 >> >> >>>> Then to figure out QP connections, you just use a function of 3 >>>> parameters: >>>> my_qp_num = fn_sqp(my_node, target_node, qp_num) >>>> target_qp_num = fn_tqp(my_node, target_node, qp_num) >>>> Where qp_num is a small number between 0 and the maximum number of QPs >>>> you >>>> need active between any 2 endpoints. >>>> >>> Can the qp_num be manually assigned? >>> Does it need opensm be involved? >>> >> SM has nothing to do with QP numbers. >> >> >>>> If it works, you are done. If not, reset, up, wait for him to connect >>>> and >>>> send something to you. >>>> >>> Is it reliable? I mean the QPs connection will keep alive during the >>> > QPs > >>> lifecycle? >>> >> For one thing, SM needs to try to keep ports at active. 
>> >> -- Hal >> >> >>> Best, >>> Yicheng >>> >>> >>> >>> "Jim Mott" >>> >>> 07/10/2008 04:17 PM >>> >>> To >>> "Yicheng Jia" , >>> cc >>> Subject >>> RE: [ofa-general] minimum sw components requirement for driver/opensm >>> > in a > >>> single unmanaged switch network >>> >>> >>> >>> >>> If you want to avoid all the SM stuff, and are willing to program the >>> switches directly (a few mads), then I've used schemes like: >>> >>> Node LID=base + (switch port * constant) (base=0, constant = 1 works) >>> >>> Then to figure out QP connections, you just use a function of 3 >>> parameters: >>> my_qp_num = fn_sqp(my_node, target_node, qp_num) >>> target_qp_num = fn_tqp(my_node, target_node, qp_num) >>> Where qp_num is a small number between 0 and the maximum number of QPs >>> > you > >>> need active between any 2 endpoints. >>> >>> With the above scheme, you know your node_id (switch port number), your >>> lid, >>> the lid of the target node, and the QPs on both sides. From there on, >>> > it > >>> is clear sailing. You don't even need to send MADs; just transition >>> > the > >>> QP >>> up and try and use it. If it works, you are done. If not, reset, up, >>> wait >>> for him to connect and send something to you. A little timer to make >>> > sure > >>> everybody retries once in awhile and what can go wrong? >>> >>> Jim >>> From: general-bounces at lists.openfabrics.org >>> [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Yicheng Jia >>> Sent: Thursday, July 10, 2008 2:59 PM >>> To: general at lists.openfabrics.org >>> Subject: [ofa-general] minimum sw components requirement for >>> > driver/opensm > >>> in a single unmanaged switch network >>> >>> >>> Hi Folks, >>> >>> I have a IB network which consists of only a single unmanaged switch, >>> > all > >>> end nodes connecting with the switch only need to do RDMA read/write >>> operation with each other. 
My question is, what are the indispensable >>> modules in driver's core and opensm that make the network up and run? >>> >>> I've been using only ib_mad module in driver's core with a managed >>> > switch > >>> before, and the network works fine. So I assume that only the ib_mad >>> module >>> in driver's core and SM in opensm are mandatory in my network. The LIDs >>> are >>> assigned by them. The SA and CM modules are not useful in my case. Am I >>> right? >>> >>> I need to minimize driver and opensm to fit them in my network, the HCA >>> driver is mthca. >>> >>> Best, >>> Yicheng >>> >>> >>> > _____________________________________________________________________________ > >>> Scanned by IBM Email Security Management Services powered by >>> > MessageLabs. > >>> For more information please visit http://www.ers.ibm.com >>> >>> >>> > _____________________________________________________________________________ > >>> >>> > _____________________________________________________________________________ > >>> Scanned by IBM Email Security Management Services powered by >>> > MessageLabs. > >>> For more information please visit http://www.ers.ibm.com >>> >>> >>> > _____________________________________________________________________________ > >>> >>> > _____________________________________________________________________________ > >>> Scanned by IBM Email Security Management Services powered by >>> > MessageLabs. 
For more information please visit http://www.ers.ibm.com > _____________________________________________________________________________ > > ------------------------------------------------------------------------ > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From rdreier at cisco.com Wed Aug 27 16:01:37 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 27 Aug 2008 16:01:37 -0700 Subject: [ofa-general] ibv_reg_mr() failing to an mmap'ed iomemory region. In-Reply-To: <1219875364.23236.268.camel@hermosa.site> (Peter W. Morreale's message of "Wed, 27 Aug 2008 22:16:04 +0000") References: <1219875364.23236.268.camel@hermosa.site> Message-ID: > Note that this memory is marked IORESOURCE_BUSY | IORESOURCE_IO. Does > the kernel-side of ibv_reg_mr() (assuming there is one) do a > __request_region()? No, but it does get_user_pages(), which might not work for memory like this. That's probably where your issue is. - R. From halr at obsidianresearch.com Wed Aug 27 16:07:53 2008 From: halr at obsidianresearch.com (Hal Rosenstock) Date: Wed, 27 Aug 2008 17:07:53 -0600 Subject: [ofa-general] ***SPAM*** new with IB In-Reply-To: <230765.25942.qm@web56504.mail.re3.yahoo.com> References: <230765.25942.qm@web56504.mail.re3.yahoo.com> Message-ID: <48B5DE49.3070102@obsidianresearch.com> Podar Valentin wrote: > Dear all, > I am new to this field and I have some questions. > I run a cluster with IB Mellanox. I have two subnets each with its own opensm (running on different port P1 or P2). Are these opensms on the same machine ? Are the two subnets physically separate (no connections between the two in terms of switches) ? > mixed hardware MT25204 and MT23108. all are at 4x rate. 
> mixed drivers IBGold1.8.2 and MLNX_OFED_LINUX-1.3.1-rhel4
> when I issue
> ibdiagnet -p 1 -lw 4x I get
> -I- Stages Status Report:
>     STAGE                          Errors  Warnings
>     Bad GUIDs/LIDs Check           0       0
>     Link State Active Check        0       0
>     Performance Counters Report    0       0
>     Specific Link Width Check      0       0
>     Partitions Check               0       0
>     IPoIB Subnets Check            0       16
> BUT
> -I---------------------------------------------------
> -I- IPoIB Subnets Check
> -I---------------------------------------------------
> -I- Subnet: IPv4 PKey:0x7fff QKey:0x00000b1b MTU:4096Byte rate:120Gbps SL:0x00
> -W- Port h1/P1 lid=0x0011 guid=0x dev=23108 can not join due to rate:10Gbps < group:120Gbps
> -W- Port h2/P1 lid=0x0321 guid=0x dev=23108 can not join due to rate:10Gbps < group:120Gbps
> -W- Port h3/P1 lid=0x0069 guid=0x dev=23108 can not join due to rate:10Gbps < group:120Gbps
> -W- Port h4/P1 lid=0x0010 guid=0x dev=23108 can not join due to rate:10Gbps < group:120Gbps
> and so on with all the nodes.
> switch type MT47396.
>
> I think the problem is this line
> Subnet: IPv4 PKey:0x7fff QKey:0x00000b1b MTU:4096Byte rate:120Gbps SL:0x00
> but I don't know how to set the Mtu to 2048 and rate to 10G.
>

You need to set mtu to 4 and rate to 3 to do that (these are the IBA codes for 2048 bytes and 10 Gbps). There's a partitions file which defines this with examples. It depends on what version of OpenSM you are running and how it's configured.

On another note, I don't understand how you got MC groups with these parameters. Are you running IPoIB-CM ? What switches do you have ?
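
For reference, the mtu and rate values above, and the Mtu and Rate fields in the saquery dump from the original question, are IBA enumeration codes rather than byte counts or Gbps figures. A minimal sketch of the decoding, assuming the standard IBA encodings; the helper names `ib_mtu_bytes` and `ib_rate_dgbps` are made up for illustration and are not part of any OFED tool:

```c
/*
 * Hypothetical helper: translate an IBA MTU code into bytes.
 * Returns 0 for codes outside the range covered here.
 */
static int ib_mtu_bytes(int code)
{
    switch (code) {
    case 1: return 256;
    case 2: return 512;
    case 3: return 1024;
    case 4: return 2048;
    case 5: return 4096;
    default: return 0;
    }
}

/*
 * Hypothetical helper: translate an IBA static rate code into units of
 * 0.1 Gbps (so 2.5 Gbps is returned as 25).
 * Returns 0 for codes outside the range covered here.
 */
static int ib_rate_dgbps(int code)
{
    switch (code) {
    case 2:  return 25;     /* 2.5 Gbps */
    case 3:  return 100;    /* 10 Gbps  */
    case 4:  return 300;    /* 30 Gbps  */
    case 5:  return 50;     /* 5 Gbps   */
    case 6:  return 200;    /* 20 Gbps  */
    case 7:  return 400;    /* 40 Gbps  */
    case 8:  return 600;    /* 60 Gbps  */
    case 9:  return 800;    /* 80 Gbps  */
    case 10: return 1200;   /* 120 Gbps */
    default: return 0;
    }
}
```

Read this way, a group with Mtu 0x5 and Rate 0xA is a 4096-byte, 120 Gbps group, which matches the ibdiagnet warning, while Mtu 0x4 and Rate 0x3 correspond to the 2048-byte, 10 Gbps settings asked about.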

-- Hal

> more
> /usr/sbin/saquery -d -g
>
> MCMemberRecord group dump:
>     MGID....................0xff12401bffff0000 : 0x00000000ffffffff
>     Mlid....................0xC000
>     Mtu.....................0x5
>     pkey....................0xFFFF
>     Rate....................0xA
> MCMemberRecord group dump:
>     MGID....................0xff12401bffff0000 : 0x0000000000000001
>     Mlid....................0xC001
>     Mtu.....................0x5
>     pkey....................0xFFFF
>     Rate....................0xA
> MCMemberRecord group dump:
>     MGID....................0xff12401bffff0000 : 0x0000000000656565
>     Mlid....................0xC002
>     Mtu.....................0x4
>     pkey....................0xFFFF
>     Rate....................0x3
> MCMemberRecord group dump:
>     MGID....................0xff12401bffff0000 : 0x0000000000a847ff
>     Mlid....................0xC003
>     Mtu.....................0x4
>     pkey....................0xFFFF
>     Rate....................0x3
> MCMemberRecord group dump:
>     MGID....................0xff12401bffff0000 : 0x0000000000000000
>     Mlid....................0xC007
>     Mtu.....................0x4
>     pkey....................0xFFFF
>     Rate....................0x2
> MCMemberRecord group dump:
>     MGID....................0xff12401bffff0000 : 0x000000000202c902
>     Mlid....................0xC008
>     Mtu.....................0x4
>     pkey....................0xFFFF
>     Rate....................0x2
>
> the question is how can I set the group rate to 10G and not 120G? and the group MTU to 2048 as on some nodes I get " failed to join multicast or setting MTU>4096 will ...generate...some errors"
>
> thank you very much!
> Vali > > > > > > ------------------------------------------------------------------------ > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From halr at obsidianresearch.com Wed Aug 27 16:45:21 2008 From: halr at obsidianresearch.com (Hal Rosenstock) Date: Wed, 27 Aug 2008 17:45:21 -0600 Subject: [ofa-general] Re: [PATCH] ibsim: Add support for vendor ID and system image GUID In-Reply-To: <20080818201718.GJ27204@sashak.voltaire.com> References: <48A30108.4010307@obsidianresearch.com> <20080818201718.GJ27204@sashak.voltaire.com> Message-ID: <48B5E711.7030503@obsidianresearch.com> Sasha, Sasha Khapyorsky wrote: > Hi Hal, > > On 09:43 Wed 13 Aug , Hal Rosenstock wrote: > >> ibsim: Add support for vendor ID and system image GUID >> >> Signed-off-by: Hal Rosenstock >> --- >> >> diff --git a/ibsim/sim_cmd.c b/ibsim/sim_cmd.c >> index 6bf1e29..ef99db8 100644 >> --- a/ibsim/sim_cmd.c >> +++ b/ibsim/sim_cmd.c >> @@ -483,7 +483,8 @@ static int dump_net(FILE * f, char *line) >> fprintf(f, "\n%s %d \"%s\"", >> node_type_name(node->type), >> node->numports, node->nodeid); >> - fprintf(f, "\tnodeguid %" PRIx64 "\n", node->nodeguid); >> + fprintf(f, "\tnodeguid %" PRIx64 "\tsysimgguid %" PRIx64 "\n", >> + node->nodeguid, node->sysguid); >> >> nports = node->numports; >> if (node->type == SWITCH_NODE) { >> diff --git a/ibsim/sim_net.c b/ibsim/sim_net.c >> index 6e3c0e9..146bcde 100644 >> --- a/ibsim/sim_net.c >> +++ b/ibsim/sim_net.c >> @@ -190,7 +190,9 @@ char (*aliases)[NODEIDLEN + NODEPREFIX + 1]; // aliases map format: "%s@%s" >> >> int netnodes, netswitches, netports, netaliases; >> char netprefix[NODEPREFIX + 1]; >> +int netvendid; >> int netdevid; >> +uint64_t netsysimgguid; >> int netwidth = DEFAULT_LINKWIDTH; >> int netspeed = DEFAULT_LINKSPEED; >> >> @@ -324,11 
+326,12 @@ static Node *new_node(int type, char *nodename, char *nodedesc, int nodeports) >> } >> >> mad_set_field(nd->nodeinfo, 0, IB_NODE_NPORTS_F, nd->numports); >> + mad_set_field(nd->nodeinfo, 0, IB_NODE_VENDORID_F, netvendid); >> mad_set_field(nd->nodeinfo, 0, IB_NODE_DEVID_F, netdevid); >> >> mad_encode_field(nd->nodeinfo, IB_NODE_GUID_F, &nd->nodeguid); >> mad_encode_field(nd->nodeinfo, IB_NODE_PORT_GUID_F, &nd->nodeguid); >> - mad_encode_field(nd->nodeinfo, IB_NODE_SYSTEM_GUID_F, &nd->nodeguid); >> + mad_encode_field(nd->nodeinfo, IB_NODE_SYSTEM_GUID_F, &netsysimgguid); >> > > And when netsysimgguid was not parsed for this node, it will put previous > value there (or "0" if it was never parsed)? > Is "state" for a node in the topology file needed to deal with this ? Something like the following: When the vendor ID line is seen, reset netsysimgguid and if 0 when new_node is invoked, then use the node GUID as currently done. Does that make sense ? Do you see a better way ? -- Hal > Sasha > > >> >> if ((nd->portsbase = new_ports(nd, nodeports, firstport)) < 0) { >> IBWARN("can't alloc %d ports for node %s", nodeports, >> @@ -805,6 +808,20 @@ static int parse_guidbase(int fd, char *line, int type) >> return 1; >> } >> >> +static int parse_vendid(int fd, char *line) >> +{ >> + char *s; >> + >> + if (!(s = strchr(line, '='))) { >> + IBWARN("bad assignment: missing '=' sign"); >> + return -1; >> + } >> + >> + netvendid = strtol(s + 1, 0, 0); >> + >> + return 1; >> +} >> + >> static int parse_devid(int fd, char *line) >> { >> char *s; >> @@ -819,6 +836,20 @@ static int parse_devid(int fd, char *line) >> return 1; >> } >> >> +static uint64_t parse_sysimgguid(int fd, char *line) >> +{ >> + char *s; >> + >> + if (!(s = strchr(line, '='))) { >> + IBWARN("bad assignment: missing '=' sign"); >> + return -1; >> + } >> + >> + netsysimgguid = strtoull(s + 1, 0, 0); >> + >> + return 1; >> +} >> + >> static int parse_width(int fd, char *line) >> { >> char *s; >> @@ -935,8 
+966,12 @@ static int parse_netconf(int fd, FILE * out) >> r = parse_guidbase(fd, line, HCA_NODE); >> else if (!strncmp(line, "rtguid", 6)) >> r = parse_guidbase(fd, line, ROUTER_NODE); >> + else if (!strncmp(line, "vendid", 6)) >> + r = parse_vendid(fd, line); >> else if (!strncmp(line, "devid", 5)) >> r = parse_devid(fd, line); >> + else if (!strncmp(line, "sysimgguid", 10)) >> + r = parse_sysimgguid(fd, line); >> else if (!strncmp(line, "width", 5)) >> r = parse_width(fd, line); >> else if (!strncmp(line, "speed", 5)) >> > > From YJia at tmriusa.com Wed Aug 27 22:22:00 2008 From: YJia at tmriusa.com (Yicheng Jia) Date: Thu, 28 Aug 2008 00:22:00 -0500 Subject: [ofa-general] minimum sw components requirement for driver/opensm in a single unmanaged switch network In-Reply-To: <48B5DBB4.70108@obsidianresearch.com> Message-ID: Yes. My basic idea is, the opensm set up the subnet during initialization, it will report errors during this process. After the subnet is up, the environment is fixed and stable. If some failure happens, opensm could be used again to diagnose the failure. From my understanding, in this case, the only work that opensm does after subnet is up is to log the status. Thanks! Yicheng Hal Rosenstock 08/27/2008 05:55 PM To Yicheng Jia cc Dotan Barak , general at lists.openfabrics.org Subject Re: [ofa-general] minimum sw components requirement for driver/opensm in a single unmanaged switch network Yicheng Jia wrote: > My operation is quite simple: connect QPs and do RDMA read/write. In this > case, the opensm is not in need when the subnet is up, correct? > Is this a production subnet ? Do you need to deal with any failures ? -- Hal > Thanks! 
For more information please visit http://www.ers.ibm.com > _____________________________________________________________________________ > > ------------------------------------------------------------------------ > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general _____________________________________________________________________________ Scanned by IBM Email Security Management Services powered by MessageLabs. For more information please visit http://www.ers.ibm.com _____________________________________________________________________________ _____________________________________________________________________________ Scanned by IBM Email Security Management Services powered by MessageLabs. For more information please visit http://www.ers.ibm.com _____________________________________________________________________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: From tziporet at dev.mellanox.co.il Wed Aug 27 23:39:30 2008 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Thu, 28 Aug 2008 09:39:30 +0300 Subject: [ofa-general] RE: [PATCH] ib/mad: BMA responses are discarded in kernel In-Reply-To: References: Message-ID: <48B64822.70609@mellanox.co.il> Michael Brooks wrote: >> Given how long this bug has been around, I'm guessing this is okay to >> > wait for 2.6.28. Does that seem okay with you, Michael? > > We've managed to work around it in past releases, but it's a thorn and > would be nice to see resolved sooner than later. 
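
For readers following this thread: the reason a plain test of the response bit misses BMA responses can be shown in a small user-space sketch. It mirrors the kernel's ib_response_mad() predicate as I understand it, and is not the kernel code itself. The struct `mad_hdr_sketch`, both function names and the macro names are hypothetical; the constant values are assumed to match the kernel's IB management definitions; `attr_mod` is kept in host byte order here for simplicity, whereas the real MAD header is big-endian.

```c
#include <stdint.h>

/* Assumed to match the IB management constants used by the kernel. */
#define MGMT_METHOD_SEND          0x03
#define MGMT_METHOD_TRAP_REPRESS  0x07
#define MGMT_METHOD_RESP          0x80   /* response bit in the method field */
#define MGMT_CLASS_BM             0x05   /* baseboard management class */
#define BM_ATTR_MOD_RESP          0x1    /* response bit in attr_mod, BM only */

/* Hypothetical, stripped-down MAD header holding only the fields needed. */
struct mad_hdr_sketch {
    uint8_t  mgmt_class;
    uint8_t  method;
    uint32_t attr_mod;   /* host byte order in this sketch */
};

/* The check the patch removes: only looks at the method's response bit. */
static int old_is_response(const struct mad_hdr_sketch *h)
{
    return (h->method & MGMT_METHOD_RESP) != 0;
}

/*
 * Sketch of the broader predicate: a MAD also counts as a response if it
 * is a TrapRepress, or a BM-class MAD whose attr_mod response bit is set.
 */
static int response_mad_sketch(const struct mad_hdr_sketch *h)
{
    return (h->method & MGMT_METHOD_RESP) ||
           h->method == MGMT_METHOD_TRAP_REPRESS ||
           (h->mgmt_class == MGMT_CLASS_BM &&
            (h->attr_mod & BM_ATTR_MOD_RESP));
}
```

With these definitions, a BM-class Send carrying the attr_mod response bit is classified as a request by the old check and as a response by the broader predicate, which is the mismatch that led to the responses being dropped.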
>

I suggest we take it for OFED 1.4.

Tziporet

From vlad at lists.openfabrics.org Thu Aug 28 03:00:22 2008
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Thu, 28 Aug 2008 03:00:22 -0700 (PDT)
Subject: [ofa-general] ofa_1_4_kernel 20080828-0200 daily build status
Message-ID: <20080828100022.1147AE60D01@openfabrics.org>

This email was generated automatically, please do not reply

git_url: git://git.openfabrics.org/ofed_1_4/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:

Passed:
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.26
Passed on x86_64 with linux-2.6.16
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.16.60-0.21-smp
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.17
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on x86_64 with linux-2.6.18-53.el5
Passed on x86_64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.18-93.el5
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.24
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on x86_64 with linux-2.6.25
Passed on x86_64 with linux-2.6.26
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on x86_64 with linux-2.6.9-67.ELsmp
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on ia64 with linux-2.6.24
Passed on ia64 with linux-2.6.26
Passed on ia64 with linux-2.6.25
Passed on ppc64 with linux-2.6.24

Failed:
Build failed on i686 with linux-2.6.24
Build failed on ia64 with linux-2.6.16
Log:
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.16_ia64_check/drivers/net/mlx4/en_resources.c:81: warning: assignment makes pointer from integer without a cast
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.16_ia64_check/drivers/net/mlx4/en_resources.c:84: warning: statement with no effect
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.16_ia64_check/drivers/net/mlx4/en_resources.c: In function 'mlx4_en_unmap_buffer':
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.16_ia64_check/drivers/net/mlx4/en_resources.c:94: error: implicit declaration of function 'vunmap'
make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.16_ia64_check/drivers/net/mlx4/en_resources.o] Error 1
make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.16_ia64_check/drivers/net/mlx4] Error 2
make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.16_ia64_check] Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.16'
make: *** [kernel] Error 2
----------------------------------------------------------------------------------
Build failed on ia64 with linux-2.6.17
Log:
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.17_ia64_check/drivers/net/mlx4/en_resources.c:81: warning: assignment makes pointer from integer without a cast
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.17_ia64_check/drivers/net/mlx4/en_resources.c:84: warning: statement with no effect
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.17_ia64_check/drivers/net/mlx4/en_resources.c: In function 'mlx4_en_unmap_buffer':
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.17_ia64_check/drivers/net/mlx4/en_resources.c:94: error: implicit declaration of function 'vunmap'
make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.17_ia64_check/drivers/net/mlx4/en_resources.o] Error 1
make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.17_ia64_check/drivers/net/mlx4] Error 2
make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.17_ia64_check] Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.17'
make: *** [kernel] Error 2
----------------------------------------------------------------------------------
Build failed on ia64 with linux-2.6.16.21-0.8-default
Log:
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/net/mlx4/en_resources.c:81: warning: assignment makes pointer from integer without a cast
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/net/mlx4/en_resources.c:84: warning: statement with no effect
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/net/mlx4/en_resources.c: In function 'mlx4_en_unmap_buffer':
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/net/mlx4/en_resources.c:94: error: implicit declaration of function 'vunmap'
make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/net/mlx4/en_resources.o] Error 1
make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/net/mlx4] Error 2
make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.16.21-0.8-default_ia64_check] Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.16.21-0.8-default'
make: *** [kernel] Error 2
----------------------------------------------------------------------------------
Build failed on ia64 with linux-2.6.21.1
Log:
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.21.1_ia64_check/drivers/net/mlx4/en_resources.c:81: warning: assignment makes pointer from integer without a cast
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.21.1_ia64_check/drivers/net/mlx4/en_resources.c:84: warning: statement with no effect
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.21.1_ia64_check/drivers/net/mlx4/en_resources.c: In function 'mlx4_en_unmap_buffer':
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.21.1_ia64_check/drivers/net/mlx4/en_resources.c:94: error: implicit declaration of function 'vunmap'
make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.21.1_ia64_check/drivers/net/mlx4/en_resources.o] Error 1
make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.21.1_ia64_check/drivers/net/mlx4] Error 2
make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.21.1_ia64_check] Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.21.1'
make: *** [kernel] Error 2
----------------------------------------------------------------------------------
Build failed on ia64 with linux-2.6.18
Log:
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.18_ia64_check/drivers/net/mlx4/en_resources.c:81: warning: assignment makes pointer from integer without a cast
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.18_ia64_check/drivers/net/mlx4/en_resources.c:84: warning: statement with no effect
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.18_ia64_check/drivers/net/mlx4/en_resources.c: In function 'mlx4_en_unmap_buffer':
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.18_ia64_check/drivers/net/mlx4/en_resources.c:94: error: implicit declaration of function 'vunmap'
make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.18_ia64_check/drivers/net/mlx4/en_resources.o] Error 1
make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.18_ia64_check/drivers/net/mlx4] Error 2
make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.18_ia64_check] Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.18'
make: *** [kernel] Error 2
----------------------------------------------------------------------------------
Build failed on ia64 with linux-2.6.19
Log:
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.19_ia64_check/drivers/net/mlx4/en_resources.c:81: warning: assignment makes pointer from integer without a cast
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.19_ia64_check/drivers/net/mlx4/en_resources.c:84: warning: statement with no effect
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.19_ia64_check/drivers/net/mlx4/en_resources.c: In function 'mlx4_en_unmap_buffer':
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.19_ia64_check/drivers/net/mlx4/en_resources.c:94: error: implicit declaration of function 'vunmap'
make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.19_ia64_check/drivers/net/mlx4/en_resources.o] Error 1
make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.19_ia64_check/drivers/net/mlx4] Error 2
make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.19_ia64_check] Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.19'
make: *** [kernel] Error 2
----------------------------------------------------------------------------------
Build failed on ia64 with linux-2.6.22
Log:
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.22_ia64_check/drivers/net/mlx4/en_resources.c:81: warning: assignment makes pointer from integer without a cast
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.22_ia64_check/drivers/net/mlx4/en_resources.c:84: warning: statement with no effect
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.22_ia64_check/drivers/net/mlx4/en_resources.c: In function 'mlx4_en_unmap_buffer':
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.22_ia64_check/drivers/net/mlx4/en_resources.c:94: error: implicit declaration of function 'vunmap'
make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.22_ia64_check/drivers/net/mlx4/en_resources.o] Error 1
make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.22_ia64_check/drivers/net/mlx4] Error 2
make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.22_ia64_check] Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.22'
make: *** [kernel] Error 2
----------------------------------------------------------------------------------
Build failed on ia64 with linux-2.6.23
Log:
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.23_ia64_check/drivers/net/mlx4/en_resources.c:81: warning: assignment makes pointer from integer without a cast
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.23_ia64_check/drivers/net/mlx4/en_resources.c:84: warning: statement with no effect
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.23_ia64_check/drivers/net/mlx4/en_resources.c: In function 'mlx4_en_unmap_buffer':
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.23_ia64_check/drivers/net/mlx4/en_resources.c:94: error: implicit declaration of function 'vunmap'
make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.23_ia64_check/drivers/net/mlx4/en_resources.o] Error 1
make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.23_ia64_check/drivers/net/mlx4] Error 2
make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.23_ia64_check] Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.23'
make: *** [kernel] Error 2
----------------------------------------------------------------------------------
Build failed on ppc64 with linux-2.6.16
Log:
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.16_ppc64_check/drivers/net/mlx4/en_resources.c:81: warning: assignment makes pointer from integer without a cast
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.16_ppc64_check/drivers/net/mlx4/en_resources.c:84: warning: statement with no effect
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.16_ppc64_check/drivers/net/mlx4/en_resources.c: In function 'mlx4_en_unmap_buffer':
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.16_ppc64_check/drivers/net/mlx4/en_resources.c:94: error: implicit declaration of function 'vunmap'
make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.16_ppc64_check/drivers/net/mlx4/en_resources.o] Error 1
make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.16_ppc64_check/drivers/net/mlx4] Error 2
make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.16_ppc64_check] Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.16'
make: *** [kernel] Error 2
----------------------------------------------------------------------------------
Build failed on ppc64 with linux-2.6.17
Log:
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.17_ppc64_check/drivers/net/mlx4/en_resources.c:81: warning: assignment makes pointer from integer without a cast
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.17_ppc64_check/drivers/net/mlx4/en_resources.c:84: warning: statement with no effect
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.17_ppc64_check/drivers/net/mlx4/en_resources.c: In function 'mlx4_en_unmap_buffer':
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.17_ppc64_check/drivers/net/mlx4/en_resources.c:94: error: implicit declaration of function 'vunmap'
make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.17_ppc64_check/drivers/net/mlx4/en_resources.o] Error 1
make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.17_ppc64_check/drivers/net/mlx4] Error 2
make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.17_ppc64_check] Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.17'
make: *** [kernel] Error 2
----------------------------------------------------------------------------------
Build failed on ppc64 with linux-2.6.19
Log:
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.19_ppc64_check/drivers/net/mlx4/en_resources.c:81: warning: assignment makes pointer from integer without a cast
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.19_ppc64_check/drivers/net/mlx4/en_resources.c:84: warning: statement with no effect
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.19_ppc64_check/drivers/net/mlx4/en_resources.c: In function 'mlx4_en_unmap_buffer':
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.19_ppc64_check/drivers/net/mlx4/en_resources.c:94: error: implicit declaration of function 'vunmap'
make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.19_ppc64_check/drivers/net/mlx4/en_resources.o] Error 1
make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.19_ppc64_check/drivers/net/mlx4] Error 2
make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.19_ppc64_check] Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.19'
make: *** [kernel] Error 2
----------------------------------------------------------------------------------
Build failed on ppc64 with linux-2.6.18
Log:
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4/en_resources.c:81: warning: assignment makes pointer from integer without a cast
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4/en_resources.c:84: warning: statement with no effect
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4/en_resources.c: In function 'mlx4_en_unmap_buffer':
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4/en_resources.c:94: error: implicit declaration of function 'vunmap'
make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4/en_resources.o] Error 1
make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4] Error 2
make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.18_ppc64_check] Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.18'
make: *** [kernel] Error 2
----------------------------------------------------------------------------------
Build failed on ppc64 with linux-2.6.18-8.el5
Log:
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.18-8.el5_ppc64_check/drivers/net/mlx4/en_resources.c:81: warning: assignment makes pointer from integer without a cast
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.18-8.el5_ppc64_check/drivers/net/mlx4/en_resources.c:84: warning: statement with no effect
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.18-8.el5_ppc64_check/drivers/net/mlx4/en_resources.c: In function 'mlx4_en_unmap_buffer':
/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.18-8.el5_ppc64_check/drivers/net/mlx4/en_resources.c:94: error: implicit declaration of function 'vunmap'
make[3]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.18-8.el5_ppc64_check/drivers/net/mlx4/en_resources.o] Error 1
make[2]: *** [/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.18-8.el5_ppc64_check/drivers/net/mlx4] Error 2
make[1]: *** [_module_/home/vlad/tmp/ofa_1_4_kernel-20080828-0200_linux-2.6.18-8.el5_ppc64_check] Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.18-8.el5'
make: *** [kernel] Error 2
----------------------------------------------------------------------------------

From ogerlitz at voltaire.com Thu Aug 28 03:43:25 2008
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Thu, 28 Aug 2008 13:43:25 +0300
Subject: [ofa-general] enabling invalidate operations in libibverbs API
In-Reply-To:
References:
Message-ID: <48B6814D.5040605@voltaire.com>

Roland Dreier wrote:
> Here's the complete (untested) libibverbs patch I am working on now... I think I got everything needed...

Hi Roland,

Where does this patch stand, specifically with respect to the integration of
the kernel part into 2.6.27?

When discussed back on June 08, there was a minor fix you agreed needs to be
done here (remove the LSO bit).

Or.
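For readers following the capability-flag handling in Roland's patch quoted below, here is a minimal standalone sketch of the three steps its comments describe. The two flag values are copied from the patch; the function names and the provider step are hypothetical stand-ins for illustration only, not libibverbs code.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values as they appear in the quoted patch. */
enum {
    IBV_DEVICE_KERN_MEM_MGT_EXTENSIONS = 1 << 16,
    IBV_DEVICE_MEM_MGT_EXTENSIONS      = 1 << 21
};

/* Step 1 (what the patch does in ibv_cmd_query_device): if the kernel
 * reports the public bit, move it to the internal "kernel says yes" bit. */
static uint32_t park_kernel_flag(uint32_t caps)
{
    if (caps & IBV_DEVICE_MEM_MGT_EXTENSIONS) {
        caps &= ~(uint32_t)IBV_DEVICE_MEM_MGT_EXTENSIONS;
        caps |= IBV_DEVICE_KERN_MEM_MGT_EXTENSIONS;
    }
    return caps;
}

/* Step 2 (hypothetical low-level driver library): move the flag back only
 * if the userspace provider also implements the operations. */
static uint32_t provider_fixup(uint32_t caps, int provider_supports)
{
    if (provider_supports && (caps & IBV_DEVICE_KERN_MEM_MGT_EXTENSIONS))
        caps |= IBV_DEVICE_MEM_MGT_EXTENSIONS;
    return caps;
}

/* Step 3 (what the patch does in __ibv_query_device): never expose the
 * internal bit to applications. */
static uint32_t hide_internal_flag(uint32_t caps)
{
    return caps & ~(uint32_t)IBV_DEVICE_KERN_MEM_MGT_EXTENSIONS;
}

/* The capability word an application would end up seeing. */
static uint32_t app_visible_caps(uint32_t kernel_caps, int provider_supports)
{
    return hide_internal_flag(
        provider_fixup(park_kernel_flag(kernel_caps), provider_supports));
}
```

Under these assumptions an application sees IBV_DEVICE_MEM_MGT_EXTENSIONS only when both the kernel and the provider library agree, and never sees the internal bit.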
>
> diff --git a/include/infiniband/kern-abi.h b/include/infiniband/kern-abi.h
> index 0db083a..b503a28 100644
> --- a/include/infiniband/kern-abi.h
> +++ b/include/infiniband/kern-abi.h
> @@ -306,7 +306,10 @@ struct ibv_kern_wc {
>  	__u32 opcode;
>  	__u32 vendor_err;
>  	__u32 byte_len;
> -	__u32 imm_data;
> +	union {
> +		__u32 imm_data;
> +		__u32 invalidate_rkey;
> +	};
>  	__u32 qp_num;
>  	__u32 src_qp;
>  	__u32 wc_flags;
> @@ -572,7 +575,10 @@ struct ibv_kern_send_wr {
>  	__u32 num_sge;
>  	__u32 opcode;
>  	__u32 send_flags;
> -	__u32 imm_data;
> +	union {
> +		__u32 imm_data;
> +		__u32 invalidate_rkey;
> +	};
>  	union {
>  		struct {
>  			__u64 remote_addr;
> diff --git a/include/infiniband/verbs.h b/include/infiniband/verbs.h
> index a04cc62..e6e2b10 100644
> --- a/include/infiniband/verbs.h
> +++ b/include/infiniband/verbs.h
> @@ -92,7 +92,17 @@ enum ibv_device_cap_flags {
>  	IBV_DEVICE_SYS_IMAGE_GUID = 1 << 11,
>  	IBV_DEVICE_RC_RNR_NAK_GEN = 1 << 12,
>  	IBV_DEVICE_SRQ_RESIZE = 1 << 13,
> -	IBV_DEVICE_N_NOTIFY_CQ = 1 << 14
> +	IBV_DEVICE_N_NOTIFY_CQ = 1 << 14,
> +	/*
> +	 * IBV_DEVICE_KERN_MEM_MGT_EXTENSIONS is used by libibverbs to
> +	 * signal to low-level driver libraries that the kernel set
> +	 * the "send with invalidate" capability bit. Applications
> +	 * should only test IBV_DEVICE_MEM_MGT_EXTENSIONS and never
> +	 * look at IBV_DEVICE_KERN_MEM_MGT_EXTENSIONS.
> +	 */
> +	IBV_DEVICE_KERN_MEM_MGT_EXTENSIONS = 1 << 16,
> +	IBV_DEVICE_MEM_WINDOW = 1 << 17,
> +	IBV_DEVICE_MEM_MGT_EXTENSIONS = 1 << 21,
>  };
>  
>  enum ibv_atomic_cap {
> @@ -257,7 +267,8 @@ enum ibv_wc_opcode {
>  
>  enum ibv_wc_flags {
>  	IBV_WC_GRH = 1 << 0,
> -	IBV_WC_WITH_IMM = 1 << 1
> +	IBV_WC_WITH_IMM = 1 << 1,
> +	IBV_WC_WITH_INVALIDATE = 1 << 2,
>  };
>  
>  struct ibv_wc {
> @@ -266,7 +277,10 @@ struct ibv_wc {
>  	enum ibv_wc_opcode opcode;
>  	uint32_t vendor_err;
>  	uint32_t byte_len;
> -	uint32_t imm_data; /* in network byte order */
> +	union {
> +		uint32_t imm_data; /* in network byte order */
> +		uint32_t invalidate_rkey;
> +	};
>  	uint32_t qp_num;
>  	uint32_t src_qp;
>  	enum ibv_wc_flags wc_flags;
> @@ -486,7 +500,11 @@ enum ibv_wr_opcode {
>  	IBV_WR_SEND_WITH_IMM,
>  	IBV_WR_RDMA_READ,
>  	IBV_WR_ATOMIC_CMP_AND_SWP,
> -	IBV_WR_ATOMIC_FETCH_AND_ADD
> +	IBV_WR_ATOMIC_FETCH_AND_ADD,
> +	IBV_WR_LSO,
> +	IBV_WR_SEND_WITH_INV,
> +	IBV_WR_RDMA_READ_WITH_INV,
> +	IBV_WR_LOCAL_INV,
>  };
>  
>  enum ibv_send_flags {
> @@ -509,7 +527,10 @@ struct ibv_send_wr {
>  	int num_sge;
>  	enum ibv_wr_opcode opcode;
>  	enum ibv_send_flags send_flags;
> -	uint32_t imm_data; /* in network byte order */
> +	union {
> +		uint32_t imm_data; /* in network byte order */
> +		uint32_t invalidate_rkey;
> +	};
>  	union {
>  		struct {
>  			uint64_t remote_addr;
> diff --git a/src/cmd.c b/src/cmd.c
> index 66d7134..1945143 100644
> --- a/src/cmd.c
> +++ b/src/cmd.c
> @@ -159,6 +159,18 @@ int ibv_cmd_query_device(struct ibv_context *context,
>  	device_attr->local_ca_ack_delay = resp.local_ca_ack_delay;
>  	device_attr->phys_port_cnt = resp.phys_port_cnt;
>  
> +	/*
> +	 * If the kernel driver says that it supports memory
> +	 * management extensions, then move the flag to
> +	 * IBV_DEVICE_KERN_MEM_MGT_EXTENSIONS so that the low-level
> +	 * driver needs to move the flag back to show it supports the
> +	 * operations as well.
> +	 */
> +	if (device_attr->device_cap_flags & IBV_DEVICE_MEM_MGT_EXTENSIONS) {
> +		device_attr->device_cap_flags &= ~IBV_DEVICE_MEM_MGT_EXTENSIONS;
> +		device_attr->device_cap_flags |= IBV_DEVICE_KERN_MEM_MGT_EXTENSIONS;
> +	}
> +
>  	return 0;
>  }
>  
> diff --git a/src/compat-1_0.c b/src/compat-1_0.c
> index 459ade9..0df8b68 100644
> --- a/src/compat-1_0.c
> +++ b/src/compat-1_0.c
> @@ -535,7 +535,18 @@ symver(__ibv_ack_async_event_1_0, ibv_ack_async_event, IBVERBS_1.0);
>  int __ibv_query_device_1_0(struct ibv_context_1_0 *context,
>  			   struct ibv_device_attr *device_attr)
>  {
> -	return ibv_query_device(context->real_context, device_attr);
> +	int ret;
> +
> +	ret = ibv_query_device(context->real_context, device_attr);
> +
> +	/*
> +	 * ABI 1.0 consumers are never expecting memory management
> +	 * extension support.
> +	 */
> +	device_attr->device_cap_flags &= ~(IBV_DEVICE_MEM_MGT_EXTENSIONS |
> +					   IBV_DEVICE_KERN_MEM_MGT_EXTENSIONS);
> +
> +	return ret;
>  }
>  symver(__ibv_query_device_1_0, ibv_query_device, IBVERBS_1.0);
>  
> diff --git a/src/verbs.c b/src/verbs.c
> index 9e370ce..4ea342f 100644
> --- a/src/verbs.c
> +++ b/src/verbs.c
> @@ -79,7 +79,13 @@ enum ibv_rate mult_to_ibv_rate(int mult)
>  int __ibv_query_device(struct ibv_context *context,
>  		       struct ibv_device_attr *device_attr)
>  {
> -	return context->ops.query_device(context, device_attr);
> +	int ret;
> +
> +	ret = context->ops.query_device(context, device_attr);
> +
> +	device_attr->device_cap_flags &= ~IBV_DEVICE_KERN_MEM_MGT_EXTENSIONS;
> +
> +	return ret;
>  }
>  default_symver(__ibv_query_device, ibv_query_device);
>  

From halr at obsidianresearch.com Thu Aug 28 04:32:48 2008
From: halr at obsidianresearch.com (Hal Rosenstock)
Date: Thu, 28 Aug 2008 05:32:48
-0600
Subject: [ofa-general] minimum sw components requirement for driver/opensm in a single unmanaged switch network
In-Reply-To:
References:
Message-ID: <48B68CE0.1010701@obsidianresearch.com>

Yicheng Jia wrote:
> Yes. My basic idea is, the opensm set up the subnet during initialization,
> it will report errors during this process. After the subnet is up, the
> environment is fixed and stable. If some failure happens, opensm could be
> used again to diagnose the failure. From my understanding, in this case,
> the only work that opensm does after subnet is up is to log the status.

Wouldn't opensm also repair the failure if it could?

-- Hal

> Thanks!
> Yicheng
>
> Hal Rosenstock
> 08/27/2008 05:55 PM
>
> To
> Yicheng Jia
> cc
> Dotan Barak, general at lists.openfabrics.org
> Subject
> Re: [ofa-general] minimum sw components requirement for driver/opensm in a
> single unmanaged switch network
>
> Yicheng Jia wrote:
>> My operation is quite simple: connect QPs and do RDMA read/write. In this
>> case, the opensm is not in need when the subnet is up, correct?
>
> Is this a production subnet? Do you need to deal with any failures?
>
> -- Hal
>
>> Thanks!
>> Yicheng
>>
>> "Dotan Barak"
>> 08/21/2008 02:33 PM
>>
>> To
>> "Yicheng Jia"
>> cc
>> "Hal Rosenstock", general at lists.openfabrics.org
>> Subject
>> Re: [ofa-general] minimum sw components requirement for driver/opensm in a
>> single unmanaged switch network
>>
>> On Thu, Aug 21, 2008 at 10:16 PM, Yicheng Jia wrote:
>>> Hi Hal,
>>>
>>> Can opensm just run once? When the subnet is up, it can exit, assuming that no
>>> change will be made in the subnet.
>>
>> Yes, depends on the services that you will need/use.
>>
>> For example: if you use operations that require an SA query, you must
>> have a live SM.
>>
>> If you will connect the QPs in the subnet by yourself (for example,
>> using socket) you can manage without a live SM in the subnet ...
>>
>> Dotan
>>
>>> Thanks!
>>> Yicheng
>>>
>>> "Hal Rosenstock"
>>> 07/10/2008 09:15 PM
>>>
>>> To
>>> "Yicheng Jia"
>>> cc
>>> "Jim Mott", general at lists.openfabrics.org
>>> Subject
>>> Re: [ofa-general] minimum sw components requirement for driver/opensm in a
>>> single unmanaged switch network
>>>
>>> On Thu, Jul 10, 2008 at 7:39 PM, Yicheng Jia wrote:
>>>>> If you want to avoid all the SM stuff, and are willing to program the
>>>>> switches directly (a few mads)
>>>>
>>>> Is it done by opensm?
>>>
>>> Yes.
>>>
>>>> What information should be set up in the switch by
>>>> opensm?
>>>
>>> Things like the PortInfos and LFT. See IBA spec vol 1 14.2.5
>>>
>>>>> Then to figure out QP connections, you just use a function of 3
>>>>> parameters:
>>>>> my_qp_num = fn_sqp(my_node, target_node, qp_num)
>>>>> target_qp_num = fn_tqp(my_node, target_node, qp_num)
>>>>> Where qp_num is a small number between 0 and the maximum number of QPs
>>>>> you need active between any 2 endpoints.
>>>>
>>>> Can the qp_num be manually assigned?
>>>> Does it need opensm be involved?
>>>
>>> SM has nothing to do with QP numbers.
>>>
>>>>> If it works, you are done. If not, reset, up, wait for him to connect
>>>>> and send something to you.
>>>>
>>>> Is it reliable? I mean the QPs connection will keep alive during the QPs
>>>> lifecycle?
>>>
>>> For one thing, SM needs to try to keep ports at active.
>>>
>>> -- Hal
>>>
>>>> Best,
>>>> Yicheng
>>>>
>>>> "Jim Mott"
>>>> 07/10/2008 04:17 PM
>>>>
>>>> To
>>>> "Yicheng Jia",
>>>> cc
>>>> Subject
>>>> RE: [ofa-general] minimum sw components requirement for driver/opensm in a
>>>> single unmanaged switch network
>>>>
>>>> If you want to avoid all the SM stuff, and are willing to program the
>>>> switches directly (a few mads), then I've used schemes like:
>>>>
>>>> Node LID=base + (switch port * constant) (base=0, constant = 1 works)
>>>>
>>>> Then to figure out QP connections, you just use a function of 3
>>>> parameters:
>>>> my_qp_num = fn_sqp(my_node, target_node, qp_num)
>>>> target_qp_num = fn_tqp(my_node, target_node, qp_num)
>>>> Where qp_num is a small number between 0 and the maximum number of QPs you
>>>> need active between any 2 endpoints.
>>>>
>>>> With the above scheme, you know your node_id (switch port number), your lid,
>>>> the lid of the target node, and the QPs on both sides. From there on, it
>>>> is clear sailing. You don't even need to send MADs; just transition the QP
>>>> up and try and use it. If it works, you are done. If not, reset, up, wait
>>>> for him to connect and send something to you. A little timer to make sure
>>>> everybody retries once in awhile and what can go wrong?
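The static addressing scheme Jim describes above could be sketched roughly as follows. Only the LID formula and the names fn_sqp/fn_tqp come from his message; the function bodies, QP_BASE and MAX_QPS_PER_PAIR are assumptions made up for illustration. The sketch also assumes QP numbers can be made predictable on every node (for instance by creating QPs in a fixed order on identical HCAs), since the verbs API hands out QP numbers rather than letting the caller choose them.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants, not from the thread (except base=0, constant=1). */
enum {
    LID_BASE         = 0,     /* "base=0" from the message */
    LID_STRIDE       = 1,     /* "constant = 1" from the message */
    QP_BASE          = 0x100, /* hypothetical first application QP number */
    MAX_QPS_PER_PAIR = 4      /* hypothetical QPs needed between two endpoints */
};

/* Node LID = base + (switch port * constant). */
static uint16_t node_lid(unsigned switch_port)
{
    return (uint16_t)(LID_BASE + switch_port * LID_STRIDE);
}

/* Hypothetical fn_sqp: the local QP a node uses to reach target_node is
 * indexed by the peer's node id, so each peer gets its own block of QPs. */
static uint32_t fn_sqp(unsigned my_node, unsigned target_node, unsigned qp_num)
{
    (void)my_node;
    return (uint32_t)(QP_BASE + target_node * MAX_QPS_PER_PAIR + qp_num);
}

/* Hypothetical fn_tqp: the QP the peer uses to reach us, i.e. the same
 * formula evaluated from the peer's point of view. */
static uint32_t fn_tqp(unsigned my_node, unsigned target_node, unsigned qp_num)
{
    (void)target_node;
    return (uint32_t)(QP_BASE + my_node * MAX_QPS_PER_PAIR + qp_num);
}
```

The property that matters is symmetry: fn_tqp(a, b, q) must equal fn_sqp(b, a, q), so both ends compute the same pair of numbers without exchanging any messages.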
>>>>
>>>> Jim
>>>>
>>>> From: general-bounces at lists.openfabrics.org
>>>> [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Yicheng Jia
>>>> Sent: Thursday, July 10, 2008 2:59 PM
>>>> To: general at lists.openfabrics.org
>>>> Subject: [ofa-general] minimum sw components requirement for driver/opensm
>>>> in a single unmanaged switch network
>>>>
>>>> Hi Folks,
>>>>
>>>> I have an IB network which consists of only a single unmanaged switch, all
>>>> end nodes connecting with the switch only need to do RDMA read/write
>>>> operation with each other. My question is, what are the indispensable
>>>> modules in driver's core and opensm that make the network up and run?
>>>>
>>>> I've been using only ib_mad module in driver's core with a managed switch
>>>> before, and the network works fine. So I assume that only the ib_mad module
>>>> in driver's core and SM in opensm are mandatory in my network. The LIDs are
>>>> assigned by them. The SA and CM modules are not useful in my case. Am I
>>>> right?
>>>>
>>>> I need to minimize driver and opensm to fit them in my network, the HCA
>>>> driver is mthca.
>>>>
>>>> Best,
>>>> Yicheng

From brunel at diku.dk Thu Aug 28 06:31:07 2008
From: brunel at diku.dk (Julien Brunel)
Date: Thu, 28 Aug 2008 15:31:07 +0200
Subject: [ofa-general] [PATCH] drivers/infiniband/core: Use a NULL test rather than an IS_ERR test
Message-ID: <200808281531.07942.brunel@diku.dk>

From: Julien Brunel

In case of error, the function ucma_alloc_multicast returns a NULL
pointer, but never returns an ERR pointer. So after a call to this
function, an IS_ERR test should be replaced by a NULL test.
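To illustrate why the two tests are not interchangeable, here is a minimal userspace sketch. ERR_PTR, PTR_ERR and IS_ERR below are simplified stand-ins modelled on the kernel helpers of the same names, and the two allocators are hypothetical; neither is the real ucma_alloc_multicast.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace stand-ins for the kernel's error-pointer helpers:
 * a small negative errno is encoded in the top 4095 addresses. */
static inline void *ERR_PTR(intptr_t error) { return (void *)error; }
static inline intptr_t PTR_ERR(const void *ptr) { return (intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static char storage[16]; /* stands in for a successfully allocated object */

/* Hypothetical allocator that reports failure with NULL, which is the
 * convention the patch description attributes to ucma_alloc_multicast. */
static void *alloc_null_on_failure(int fail)
{
    return fail ? NULL : storage;
}

/* Hypothetical allocator that reports failure with an encoded errno. */
static void *alloc_errptr_on_failure(int fail)
{
    return fail ? ERR_PTR(-ENOMEM) : storage;
}
```

Because NULL is not in the error-pointer range, IS_ERR(NULL) is false: an IS_ERR test after a NULL-returning allocator lets the failure through, and the caller goes on to dereference a NULL pointer.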

The semantic match that finds this problem is as follows:
(http://www.emn.fr/x-info/coccinelle/)

//
@match bad_is_err_test@
expression x, E;
@@

x = ucma_alloc_multicast(...)
... when != x = E
IS_ERR(x)
//

Signed-off-by: Julien Brunel
Signed-off-by: Julia Lawall

---
 drivers/infiniband/core/ucma.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff -u -p a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -904,8 +904,8 @@ static ssize_t ucma_join_multicast(struc
 	mutex_lock(&file->mut);
 	mc = ucma_alloc_multicast(ctx);
-	if (IS_ERR(mc)) {
-		ret = PTR_ERR(mc);
+	if (mc == NULL) {
+		ret = -ENOMEM;
 		goto err1;
 	}

From orenk at dev.mellanox.co.il Thu Aug 28 08:36:01 2008
From: orenk at dev.mellanox.co.il (Oren Kladnitsky)
Date: Thu, 28 Aug 2008 18:36:01 +0300
Subject: [ofa-general] ibdiagnet FM master / standby error report
In-Reply-To: <48B466F6.6090602@dev.mellanox.co.il>
References: <48B466F6.6090602@dev.mellanox.co.il>
Message-ID: <48B6C5E1.6040100@dev.mellanox.co.il>

Yevgeny Kliteynik wrote:
> Hi Matthias,
>
> Matthias Blankenhaus wrote:
>> Howdy !
>>
>> I have noticed that ibdiagnet reports an error when using a
>> master / standby FM configuration. I am using OFED-1.3.1.
>>
>> Here it goes:
>>
>> # ibdiagnet ....
>> -I---------------------------------------------------
>> -I- Bad Fabric SM Info
>> -I---------------------------------------------------
>> -E- Found more then one master SM in the discover fabric
>> r1lead/P1 priority:15
>> r2lead/P1 priority:0
>> ....
>> -I- Stages Status Report:
>> STAGE                         Errors  Warnings
>> Bad GUIDs/LIDs Check          0       0
>> Link State Active Check       0       0
>> SM Info Check                 1       0
>> Performance Counters Report   0       6
>> Partitions Check              0       0
>> IPoIB Subnets Check           0       0
>>
>>
>> This is incorrect as we have only one master namely r1lead. r2lead
>> is a standby only.
>>
>>
>> The culprit for this problem seems to be this file:
>>
>> /usr/lib64/ibdiagnet1.2/ibdebug.tcl
>>
>> Here is the if stmt that creates the problem:
>> ...
>> 2988 proc CheckSM {} {
>> 2989     global SM G
>> 2990     set master 3
>> 2991     if {![info exists SM($master)]} {
>> 2992         inform "-I-ibdiagnet:bad.sm.header"
>> 2993         inform "-E-ibdiagnet:no.SM"
>> 2994     } else {
>> 2995         if {[llength $SM($master)] != 1} {
>> ==> ^^^^
>> 2996             inform "-I-ibdiagnet:bad.sm.header"
>> 2997             inform "-E-ibdiagnet:many.SM.master"
>> 2998             foreach element $SM($master) {
>> 2999                 set tmpDirectPath [lindex $element 0]
>> 3000                 set nodeName [DrPath2Name $tmpDirectPath -port
>> [GetEntryPort $tmpDirectPath]]
>> 3001                 if { $tmpDirectPath == "" } {
>> ....
>>
>> It appears that this code does not factor in the priority of an
>> individual FM. It simply counts the FM instances and if the resulting
>> number not equals 1, then this tools indicates an error.
>> From studying the OFED code
>> (osm_state_mgr.h::osm_sm_is_greater_than()) it is clear that, even if
>> two FM instances for the same fabric have an identical priority,
>> there is always only one winner by resolving the tie via guids.
>>
>> Here is the relevant OFED code:
>>
>> static inline boolean_t
>> osm_sm_is_greater_than(IN const uint8_t l_priority,
>>                        IN const ib_net64_t l_guid,
>>                        IN const uint8_t r_priority, IN const ib_net64_t r_guid)
>> {
>>     if (l_priority > r_priority) {
>>         return (TRUE);
>>     } else {
>>         if (l_priority == r_priority) {
>>             if (cl_ntoh64(l_guid) < cl_ntoh64(r_guid)) {
>>                 return (TRUE);
>>             }
>>         }
>>     }
>>     return (FALSE);
>> }
>>
>>
>> Thus, in my opinion the check against number of FM instances in
>> ibdebug.tcl is superfluous. And indeed, removing the check resolves
>> the issue.
>> The new version of the above func looks like this:
>>
>>
>> proc CheckSM {} {
>>     global SM G
>>     set master 3
>>     if {![info exists SM($master)]} {
>>         inform "-I-ibdiagnet:bad.sm.header"
>>         inform "-E-ibdiagnet:no.SM"
>>     }
>>     return 0
>> }
>>
>>
>> This simply checks whether there is a FM instance at all. If there
>> is none, then that constitutes an error. However, multiple FM instances
>> should not create an error.
>
> This check in ibdiagnet is supposed to report an error
> in case of more than one *master* SM in the subnet.
> In some cases it may happen, so the check is valid.
> However, I think that only master SM can get into that
> SM list, so either you really have a problem with
> two master SMs in subnet, or there is a bug in ibdiagnet
> and somehow it included non-master SM in that list.
>
> -- Yevgeny
>
>>
>> Thanx,
>> Matthias
>>
>

Hi,

ibdiagnet is a diagnostic tool. Its goal is to report problems with things that "should work" on a good IB fabric. Therefore, it does some checks that may seem superfluous, such as:
- Multiple SM masters
- Duplicated device GUIDs
- Credit loops in switch connectivity
etc.

A key principle in checking for these issues is to rely as little as possible on the correctness of other tools.

Regarding the SM check:
Only the number of master SMs (state 3) is checked; more than one is reported as an error. See the file /tmp/ibdiagnet.sm for a list of the SMs (and their priority / state) in the fabric.

So I guess there is an issue with the SM in your setup (2 masters in 1 subnet). Please send the relevant info to the opensm maintainer.

Thanks,
ORen.
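
The two positions in this thread can be put side by side in a small C sketch. This is not OpenSM or ibdiagnet code: struct sm_info, sm_is_greater_than(), count_masters() and expected_master() are hypothetical names chosen for illustration, and GUIDs are assumed to be in host byte order already (so there is no cl_ntoh64() call). The comparison follows the rule quoted above (higher priority wins, the lower GUID breaks a tie), while the counting function follows the ibdiagnet check, which looks only at the reported SM state (3 = master) and ignores priority.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* SM state values as used in the thread: 3 means master. */
enum { SM_STATE_STANDBY = 2, SM_STATE_MASTER = 3 };

/* Hypothetical record for one discovered SM (GUID in host byte order). */
struct sm_info {
    uint64_t guid;
    uint8_t priority;
    uint8_t state;
};

/* Election rule from the quoted code: higher priority wins,
 * and on equal priority the lower GUID wins. */
static bool sm_is_greater_than(const struct sm_info *l, const struct sm_info *r)
{
    if (l->priority != r->priority)
        return l->priority > r->priority;
    return l->guid < r->guid;
}

/* What the diagnostic check counts: SMs that report the master state,
 * regardless of their priority. */
static size_t count_masters(const struct sm_info *sms, size_t n)
{
    size_t masters = 0;
    for (size_t i = 0; i < n; i++)
        if (sms[i].state == SM_STATE_MASTER)
            masters++;
    return masters;
}

/* Which SM the election rule would pick; NULL for an empty list. */
static const struct sm_info *expected_master(const struct sm_info *sms, size_t n)
{
    const struct sm_info *best = NULL;
    for (size_t i = 0; i < n; i++)
        if (best == NULL || sm_is_greater_than(&sms[i], best))
            best = &sms[i];
    return best;
}
```

Under these assumptions, a healthy master/standby pair gives count_masters() == 1. A result of 2 means that both SMs report state 3, which is the condition the tool flags, no matter which of them expected_master() would pick.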
From aj.guillon at gmail.com Thu Aug 28 08:45:29 2008 From: aj.guillon at gmail.com (Adrien Guillon) Date: Thu, 28 Aug 2008 11:45:29 -0400 Subject: [ofa-general] ***SPAM*** Thread Safety and Infiniband Verbs and RDMA Message-ID: <9870a2060808280845m29a0391ej270db02f4c76e34d@mail.gmail.com> Hey all, I'm working on my first RDMA application in C++ (hooray). Are functions from librdmacm and infiniband verbs thread safe, i.e. can many threads be posting work on the same connection at the same time, or do I have to wrap access with a mutex? Thanks, AJ From aj.guillon at gmail.com Thu Aug 28 08:48:40 2008 From: aj.guillon at gmail.com (Adrien Guillon) Date: Thu, 28 Aug 2008 11:48:40 -0400 Subject: [ofa-general] ***SPAM*** ibv_qp_cap- sensible settings for the parameters Message-ID: <9870a2060808280848p617223c9na94aaf042d62468@mail.gmail.com> Hi, I don't quite understand the parameters inside struct ibv_qp_init_attr. In particular the ibv_qp_cap structure... what are sensible defaults? How do these parameters affect overall functioning of the application? Thanks, AJ From aj.guillon at gmail.com Thu Aug 28 08:52:08 2008 From: aj.guillon at gmail.com (Adrien Guillon) Date: Thu, 28 Aug 2008 11:52:08 -0400 Subject: [ofa-general] Efficient management of many connections Message-ID: <9870a2060808280852w7d9134b4x14aad3e7155fde4d@mail.gmail.com> Hi, I am writing some code for a cluster, and I'm using RDMA. I would like each node to be able to access memory of every other node, which requires N-1 connections for N nodes on each. What can I do to implement this efficiently? I'm already using an SRQ for all connections. Can SRQs be used for the server application too? Each node will have a client with connections, and also a server instance... can these share an SRQ? Thanks a lot, AJ From aj.guillon at gmail.com Thu Aug 28 08:52:54 2008 From: aj.guillon at gmail.com (Adrien Guillon) Date: Thu, 28 Aug 2008 11:52:54 -0400 Subject: [ofa-general] ***SPAM*** IRC Channel? 
Message-ID: <9870a2060808280852q61b5fe6es4ea3c006d6fdfba2@mail.gmail.com> Is there an IRC channel where people lurk to discuss infiniband and RDMA? It would be nice to have a place for quick questions as opposed to just mailiing lists and docs. AJ From yossi.openib at gmail.com Thu Aug 28 10:54:07 2008 From: yossi.openib at gmail.com (Yossi Etigin) Date: Thu, 28 Aug 2008 20:54:07 +0300 Subject: [ofa-general] ***SPAM*** Re: [PATCH] libsdp: enable fallback to TCP for nonblocking sockets In-Reply-To: <5D49E7A8952DC44FB38C38FA0D758EAD61E699@mtlexch01.mtl.com> References: <48AC445D.2050704@gmail.com> <5D49E7A8952DC44FB38C38FA0D758EAD5865EA@mtlexch01.mtl.com> <48AD9C80.8030305@gmail.com> <1219590681.1564.10.camel@amirv-laptop> <48B2CD3A.5020509@gmail.com> <5D49E7A8952DC44FB38C38FA0D758EAD61E699@mtlexch01.mtl.com> Message-ID: <48B6E63F.6060309@gmail.com> Hi, I'm attempting to do this with IO signals - install a signal handler that will be called when the connect fails, and it will do the fallback. --Yossi Amir Vadai wrote: > > Yossi Hi, > > I'm on vacation till Monday. > I'll check when can we have the full fix - and if it is not in the near > future > we'll put your patch till the full fix be prepared. > > - Amir > > -----Original Message----- > From: Yossi Etigin [mailto:yossi.openib at gmail.com] > Sent: Mon 8/25/2008 6:18 PM > To: Amir Vadai > Cc: general list; Oren Duer; Olga Shern > Subject: Re: [PATCH] libsdp: enable fallback to TCP for nonblocking sockets > > Hi Amir, > > The single case in which we block connect() here (and only on SDP, which > is rather fast) is the case that is currenlty not supported anyway. It can > also be configurable. > Anyway, we have a client which uses non-blocking sockets and really needs > that feature. How about putting this to OFED now and writing something > better > later on? 
> > --Yossi > > > Amir Vadai wrote: > > See below > > > > On Thu, 2008-08-21 at 19:49 +0300, Yossi Etigin wrote: > >> Hi Amir, > >> > >> What you suggesting is to replace almost all socket functions, and I > >> don't think that this is good either. > > I agree - but to break the non-blocking semantics is worse. > > > >> It would be write(), send(), recv(), sendto(), recvfrom(), sendmsg(), > >> recvmsg(), and also need to change select() (to not return when > >> fallback > >> happens if SDP fails), and maybe also poll(). libsdp tries to avoid > >> the fast path. > > I don't see another option. We could have a #ifdef to enable the user > > to choose - non blocking support or cleaner fast-path. > >> Besides, how do we know when to do fallback - can we safely assume > >> that if some socket operation fails, then it happened because > >> connect() failed? > >>From a brief look at connect man page, they say we should use select for > > writing on the socket. after select indicates writability, use > > getsockopt to determine whether connect() completed successfully or not. > >> Anyway, if I understand correctly, you suggest something like: > >> > >> int connect(fd, ...) > >> { > >> ... > >> set_state(fd, SDP) > >> ... > >> } > >> > >> > >> int read(int fd, ...) > >> { > >> int res = socket_funcs.read(shadow_fd(fd), ...); > >> if (res < 0 && errno != EAGAIN && sock_state(fd) == SDP) { > >> sock_state = TCP; > >> sockt_funs.connect(fd,...); > >> close(shadow_fd(fd)); > >> errno = EAGAIN; > >> } > >> return res; > >> } > >> > >> > > ... again, I don't like it too - but I don't think we should block > > connect when the user asks not to. > > - Amir. > >> --Yossi > >> > >> Amir Vadai wrote: > >>> Yossi Hi, > >>> > >>> I think that breaking the semantic of non blocking socket is a bad > >> idea. > >>> There is a solution that won't break this semantics: > >>> > >>> 1. User app calls connect(). > >>> - libsdp try to connect through sdp. > >>> 2. 
User app try another operation on the socket (e.g read/write) > >>> - if sdp connection established successfully - great > >>> - if sdp still not established - return -EAGAIN. This is the > >>> same behaviour as if the tcp connection wasn't connected yet. > >>> - if sdp timedout - return -EAGAIN and initiate TCP connect. > >>> - if tcp connection established - use it > >>> - if tcp connection timedout - return error. > >>> > >>> Maybe we could optimize it and initiate a tcp connection in parallel > >>> with the sdp connection and use it only when the sdp connect is > >>> timedout. > >>> > >>> I will add only the second patch (the debug print fix). > >>> > >>> - Amir > >>> > >>> > >> > >> > > > From dotanba at gmail.com Thu Aug 28 12:02:21 2008 From: dotanba at gmail.com (Dotan Barak) Date: Thu, 28 Aug 2008 22:02:21 +0300 Subject: [ofa-general] Efficient management of many connections In-Reply-To: <9870a2060808280852w7d9134b4x14aad3e7155fde4d@mail.gmail.com> References: <9870a2060808280852w7d9134b4x14aad3e7155fde4d@mail.gmail.com> Message-ID: <2f3bf9a60808281202j65a5d0fevb4c7a9133978922d@mail.gmail.com> On Thu, Aug 28, 2008 at 6:52 PM, Adrien Guillon wrote: > Hi, > > I am writing some code for a cluster, and I'm using RDMA. I would > like each node to be able to access memory of every other node, which > requires N-1 connections for N nodes on each. What can I do to > implement this efficiently? I'm already using an SRQ for all > connections. Can SRQs be used for the server application too? Each > node will have a client with connections, and also a server > instance... can these share an SRQ? If the server will be implemented as one process, the answer is yes. 

Dotan

>
> Thanks a lot,
>
> AJ
>

From dotanba at gmail.com Thu Aug 28 12:04:26 2008
From: dotanba at gmail.com (Dotan Barak)
Date: Thu, 28 Aug 2008 22:04:26 +0300
Subject: [ofa-general] ***SPAM*** ibv_qp_cap- sensible settings for the parameters
In-Reply-To: <9870a2060808280848p617223c9na94aaf042d62468@mail.gmail.com>
References: <9870a2060808280848p617223c9na94aaf042d62468@mail.gmail.com>
Message-ID: <2f3bf9a60808281204y8f2f7c0yc80454dfbff8faab@mail.gmail.com>

On Thu, Aug 28, 2008 at 6:48 PM, Adrien Guillon wrote:
> Hi,
>
> I don't quite understand the parameters inside struct
> ibv_qp_init_attr. In particular the ibv_qp_cap structure... what are
> sensible defaults? How do these parameters affect overall functioning
> of the application?
>

Use the minimum values that your application needs. For example, if you don't use scatter/gather lists in your application, limit the number of scatter/gather entries to 1 in both the RQ and the SQ of the QP.

The higher the values in the QP capabilities structure, the more memory will be allocated (and pinned) for this QP.

Dotan

> Thanks,
>
> AJ
>

From dotanba at gmail.com Thu Aug 28 12:06:26 2008
From: dotanba at gmail.com (Dotan Barak)
Date: Thu, 28 Aug 2008 22:06:26 +0300
Subject: [ofa-general] ***SPAM*** Thread Safety and Infiniband Verbs and RDMA
In-Reply-To: <9870a2060808280845m29a0391ej270db02f4c76e34d@mail.gmail.com>
References: <9870a2060808280845m29a0391ej270db02f4c76e34d@mail.gmail.com>
Message-ID: <2f3bf9a60808281206r5582df74m43187904bd489291@mail.gmail.com>

On Thu, Aug 28, 2008 at 6:45 PM, Adrien Guillon wrote:
> Hey all,
>
> I'm working on my first RDMA application in C++ (hooray).
>
> Are functions from librdmacm and infiniband verbs thread safe, i.e.
> can many threads be posting work on the same connection at the same
> time, or do I have to wrap access with a mutex?
>

The InfiniBand verbs library is thread safe (I think I even mentioned it in the man page of verbs.h).

For librdmacm, you should ask Sean Hefty, but I'm quite sure that this library is thread safe too.

Do you have any reason to think that either of them is not thread safe?
Dotan > Thanks, > > AJ > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > From aj.guillon at gmail.com Thu Aug 28 12:11:05 2008 From: aj.guillon at gmail.com (Adrien Guillon) Date: Thu, 28 Aug 2008 15:11:05 -0400 Subject: ***SPAM*** Re: [ofa-general] ***SPAM*** ibv_qp_cap- sensible settings for the parameters In-Reply-To: <2f3bf9a60808281204y8f2f7c0yc80454dfbff8faab@mail.gmail.com> References: <9870a2060808280848p617223c9na94aaf042d62468@mail.gmail.com> <2f3bf9a60808281204y8f2f7c0yc80454dfbff8faab@mail.gmail.com> Message-ID: <9870a2060808281211p64c5284hcb754b3e51f46f7c@mail.gmail.com> What happens if any of these values are too low? From YJia at tmriusa.com Thu Aug 28 12:09:39 2008 From: YJia at tmriusa.com (Yicheng Jia) Date: Thu, 28 Aug 2008 14:09:39 -0500 Subject: [ofa-general] minimum sw components requirement for driver/opensm in a single unmanaged switch network In-Reply-To: <48B68CE0.1010701@obsidianresearch.com> Message-ID: I missed it. So it's better to keep opensm running all the time? Thanks! Yicheng Hal Rosenstock 08/28/2008 06:31 AM To Yicheng Jia cc Dotan Barak , general at lists.openfabrics.org Subject Re: [ofa-general] minimum sw components requirement for driver/opensm in a single unmanaged switch network Yicheng Jia wrote: > Yes. My basic idea is, the opensm set up the subnet during initialization, > it will report errors during this process. After the subnet is up, the > environment is fixed and stable. If some failure happens, opensm could be > used again to diagnose the failure. From my understanding, in this case, > the only work that opensm does after subnet is up is to log the status. Wouldn't opensm also repair the failure if it could ? -- Hal > > Thanks! 
> Yicheng > > > > > Hal Rosenstock > 08/27/2008 05:55 PM > > To > Yicheng Jia > cc > Dotan Barak , general at lists.openfabrics.org > Subject > Re: [ofa-general] minimum sw components requirement for driver/opensm in a > single unmanaged switch network > > > > > > > Yicheng Jia wrote: > >> My operation is quite simple: connect QPs and do RDMA read/write. In >> > this > >> case, the opensm is not in need when the subnet is up, correct? >> >> > Is this a production subnet ? Do you need to deal with any failures ? > > -- Hal > > >> Thanks! >> Yicheng >> >> >> >> >> "Dotan Barak" >> 08/21/2008 02:33 PM >> >> To >> "Yicheng Jia" >> cc >> "Hal Rosenstock" , >> > general at lists.openfabrics.org > >> Subject >> Re: [ofa-general] minimum sw components requirement for driver/opensm in >> > a > >> single unmanaged switch network >> >> >> >> >> >> >> On Thu, Aug 21, 2008 at 10:16 PM, Yicheng Jia wrote: >> >> >>> Hi Hal, >>> >>> Can opensm just run once? When the subnet is up, it can exit assume >>> > that > >> no >> >> >>> change will be made in the subnet. >>> >>> >>> >> Yes, depend on the serives that you will need/use. >> >> For example: if you use operations that requires SA query, you must >> have a live SM. >> >> If you will connect the QPs in the subnet by yourself (for example, >> using socket) you can manage without a live SM in the subnet ... >> >> Dotan >> >> >>> Thanks! >>> Yicheng >>> >>> >>> >>> "Hal Rosenstock" >>> >>> 07/10/2008 09:15 PM >>> >>> To >>> "Yicheng Jia" >>> cc >>> "Jim Mott" , general at lists.openfabrics.org >>> Subject >>> Re: [ofa-general] minimum sw components requirement for driver/opensm >>> > in > >> a >> >> >>> single unmanaged switch network >>> >>> >>> >>> >>> On Thu, Jul 10, 2008 at 7:39 PM, Yicheng Jia wrote: >>> >>> >>>>> If you want to avoid all the SM stuff, and are willing to program the >>>>> switches directly (a few mads) >>>>> >>>>> >>>> Is it done by opensm? >>>> >>>> >>> Yes. 
>>> >>> >>> >>>> What information should be set up in the switch by >>>> opensm? >>>> >>>> >>> Things like the PortInfos and LFT. See IBA spec vol 1 14.2.5 >>> >>> >>> >>>>> Then to figure out QP connections, you just use a function of 3 >>>>> parameters: >>>>> my_qp_num = fn_sqp(my_node, target_node, qp_num) >>>>> target_qp_num = fn_tqp(my_node, target_node, qp_num) >>>>> Where qp_num is a small number between 0 and the maximum number of >>>>> > QPs > >>>>> you >>>>> need active between any 2 endpoints. >>>>> >>>>> >>>> Can the qp_num be manually assigned? >>>> Does it need opensm be involved? >>>> >>>> >>> SM has nothing to do with QP numbers. >>> >>> >>> >>>>> If it works, you are done. If not, reset, up, wait for him to >>>>> > connect > >>>>> and >>>>> send something to you. >>>>> >>>>> >>>> Is it reliable? I mean the QPs connection will keep alive during the >>>> >>>> >> QPs >> >> >>>> lifecycle? >>>> >>>> >>> For one thing, SM needs to try to keep ports at active. >>> >>> -- Hal >>> >>> >>> >>>> Best, >>>> Yicheng >>>> >>>> >>>> >>>> "Jim Mott" >>>> >>>> 07/10/2008 04:17 PM >>>> >>>> To >>>> "Yicheng Jia" , >>>> cc >>>> Subject >>>> RE: [ofa-general] minimum sw components requirement for driver/opensm >>>> >>>> >> in a >> >> >>>> single unmanaged switch network >>>> >>>> >>>> >>>> >>>> If you want to avoid all the SM stuff, and are willing to program the >>>> switches directly (a few mads), then I've used schemes like: >>>> >>>> Node LID=base + (switch port * constant) (base=0, constant = 1 works) >>>> >>>> Then to figure out QP connections, you just use a function of 3 >>>> parameters: >>>> my_qp_num = fn_sqp(my_node, target_node, qp_num) >>>> target_qp_num = fn_tqp(my_node, target_node, qp_num) >>>> Where qp_num is a small number between 0 and the maximum number of QPs >>>> > > >> you >> >> >>>> need active between any 2 endpoints. 
>>>> >>>> With the above scheme, you know your node_id (switch port number), >>>> > your > >>>> lid, >>>> the lid of the target node, and the QPs on both sides. From there >>>> > on, > >> it >> >> >>>> is clear sailing. You don't even need to send MADs; just transition >>>> >>>> >> the >> >> >>>> QP >>>> up and try and use it. If it works, you are done. If not, reset, up, >>>> wait >>>> for him to connect and send something to you. A little timer to make >>>> >>>> >> sure >> >> >>>> everybody retries once in awhile and what can go wrong? >>>> >>>> Jim >>>> From: general-bounces at lists.openfabrics.org >>>> [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Yicheng >>>> > Jia > >>>> Sent: Thursday, July 10, 2008 2:59 PM >>>> To: general at lists.openfabrics.org >>>> Subject: [ofa-general] minimum sw components requirement for >>>> >>>> >> driver/opensm >> >> >>>> in a single unmanaged switch network >>>> >>>> >>>> Hi Folks, >>>> >>>> I have a IB network which consists of only a single unmanaged switch, >>>> >>>> >> all >> >> >>>> end nodes connecting with the switch only need to do RDMA read/write >>>> operation with each other. My question is, what are the indispensable >>>> modules in driver's core and opensm that make the network up and run? >>>> >>>> I've been using only ib_mad module in driver's core with a managed >>>> >>>> >> switch >> >> >>>> before, and the network works fine. So I assume that only the ib_mad >>>> module >>>> in driver's core and SM in opensm are mandatory in my network. The >>>> > LIDs > >>>> are >>>> assigned by them. The SA and CM modules are not useful in my case. Am >>>> > I > >>>> right? >>>> >>>> I need to minimize driver and opensm to fit them in my network, the >>>> > HCA > >>>> driver is mthca. >>>> >>>> Best, >>>> Yicheng >>>> >>>> >>>> >>>> > _____________________________________________________________________________ > >>>> Scanned by IBM Email Security Management Services powered by >>>> >>>> >> MessageLabs. 

From aj.guillon at gmail.com Thu Aug 28 12:13:23 2008
From: aj.guillon at gmail.com (Adrien Guillon)
Date: Thu, 28 Aug 2008 15:13:23 -0400
Subject: ***SPAM*** Re: [ofa-general] Efficient management of many connections
In-Reply-To: <2f3bf9a60808281202j65a5d0fevb4c7a9133978922d@mail.gmail.com>
References: <9870a2060808280852w7d9134b4x14aad3e7155fde4d@mail.gmail.com> <2f3bf9a60808281202j65a5d0fevb4c7a9133978922d@mail.gmail.com>
Message-ID: <9870a2060808281213x2667d2a1g857859b37ddab968@mail.gmail.com>

I still allocate a separate CQ for each QP right?

Also, I read

http://www.hpcwire.com/features/17886984.html

the author mentions doom and gloom for scalability. Is this just fear mongering, or is this a real problem? If I use SRQ's am I in the clear for scalability?

Thanks.

AJ

From dotanba at gmail.com Thu Aug 28 12:13:39 2008
From: dotanba at gmail.com (Dotan Barak)
Date: Thu, 28 Aug 2008 22:13:39 +0300
Subject: ***SPAM*** Re: [ofa-general] ***SPAM*** ibv_qp_cap- sensible settings for the parameters
In-Reply-To: <9870a2060808281211p64c5284hcb754b3e51f46f7c@mail.gmail.com>
References: <9870a2060808280848p617223c9na94aaf042d62468@mail.gmail.com> <2f3bf9a60808281204y8f2f7c0yc80454dfbff8faab@mail.gmail.com> <9870a2060808281211p64c5284hcb754b3e51f46f7c@mail.gmail.com>
Message-ID: <2f3bf9a60808281213ma42848dw61e52b69920c2f04@mail.gmail.com>

On Thu, Aug 28, 2008 at 10:11 PM, Adrien Guillon wrote:
> What happens if any of these values are too low?
>

If you try to post an SR/RR with more scatter/gather entries than the QP was created with, you'll get an immediate error (the QP state won't change).

If you try to post more SRs/RRs than the number of outstanding WRs that the work queue (SQ/RQ) was created with, you'll also get an immediate error (the QP state won't change).

Dotan

From dotanba at gmail.com Thu Aug 28 12:19:39 2008
From: dotanba at gmail.com (Dotan Barak)
Date: Thu, 28 Aug 2008 22:19:39 +0300
Subject: ***SPAM*** Re: [ofa-general] Efficient management of many connections
In-Reply-To: <9870a2060808281213x2667d2a1g857859b37ddab968@mail.gmail.com>
References: <9870a2060808280852w7d9134b4x14aad3e7155fde4d@mail.gmail.com> <2f3bf9a60808281202j65a5d0fevb4c7a9133978922d@mail.gmail.com> <9870a2060808281213x2667d2a1g857859b37ddab968@mail.gmail.com>
Message-ID: <2f3bf9a60808281219h7ded0c81w2f16684ca49e06c7@mail.gmail.com>

On Thu, Aug 28, 2008 at 10:13 PM, Adrien Guillon wrote:
> I still allocate a separate CQ for each QP right?

You may use a separate CQ for each QP, if this is what you want (or you can use one CQ for all of those QPs).

>
> Also, I read
>
> http://www.hpcwire.com/features/17886984.html
>
> the author mentions doom and gloom for scalability. Is this just fear
> mongering, or is this a real problem?
> If I use SRQ's am I in the
> clear for scalability?

Sorry, but I haven't read this article and I don't have the time to do it now.

But an SRQ is an object which helps you create more scalable SW: if you have N QPs and every QP may get M messages, in the past you had to post N*M WRs (M WRs to every QP). When using an SRQ, you can post far fewer WRs, and an SRQ is easier to manage because you can use the LIMIT event to learn when the number of RRs in the SRQ runs low (QPs don't have this feature).

Dotan

>
> Thanks.
>
> AJ
>

From aj.guillon at gmail.com Thu Aug 28 12:21:33 2008
From: aj.guillon at gmail.com (Adrien Guillon)
Date: Thu, 28 Aug 2008 15:21:33 -0400
Subject: ***SPAM*** Re: [ofa-general] Efficient management of many connections
In-Reply-To: <2f3bf9a60808281219h7ded0c81w2f16684ca49e06c7@mail.gmail.com>
References: <9870a2060808280852w7d9134b4x14aad3e7155fde4d@mail.gmail.com> <2f3bf9a60808281202j65a5d0fevb4c7a9133978922d@mail.gmail.com> <9870a2060808281213x2667d2a1g857859b37ddab968@mail.gmail.com> <2f3bf9a60808281219h7ded0c81w2f16684ca49e06c7@mail.gmail.com>
Message-ID: <9870a2060808281221m4e14ef9dmfb7aa616abc48f62@mail.gmail.com>

I didn't know I could use the same CQ for all QPs. Is there a downside?

Thanks

AJ

From dotanba at gmail.com Thu Aug 28 12:25:59 2008
From: dotanba at gmail.com (Dotan Barak)
Date: Thu, 28 Aug 2008 22:25:59 +0300
Subject: ***SPAM*** Re: [ofa-general] Efficient management of many connections
In-Reply-To: <9870a2060808281221m4e14ef9dmfb7aa616abc48f62@mail.gmail.com>
References: <9870a2060808280852w7d9134b4x14aad3e7155fde4d@mail.gmail.com> <2f3bf9a60808281202j65a5d0fevb4c7a9133978922d@mail.gmail.com> <9870a2060808281213x2667d2a1g857859b37ddab968@mail.gmail.com> <2f3bf9a60808281219h7ded0c81w2f16684ca49e06c7@mail.gmail.com> <9870a2060808281221m4e14ef9dmfb7aa616abc48f62@mail.gmail.com>
Message-ID: <2f3bf9a60808281225t59eb42eeka4923fbd1f3c03b4@mail.gmail.com>

You may get a CQ overrun if you are not careful enough ...
Using one CQ can make your life easier, but I suggest using different
CQs for the RQ and the SQ.
(The SQ is the only queue whose message rate you can control...)

Dotan

On Thu, Aug 28, 2008 at 10:21 PM, Adrien Guillon wrote:
> I didn't know I could use the same CQ for all QPs. Is there a downside?
>
> Thanks
>
> AJ
>

From aj.guillon at gmail.com  Thu Aug 28 12:28:33 2008
From: aj.guillon at gmail.com (Adrien Guillon)
Date: Thu, 28 Aug 2008 15:28:33 -0400
Subject: ***SPAM*** Re: [ofa-general] Efficient management of many connections
In-Reply-To: <2f3bf9a60808281225t59eb42eeka4923fbd1f3c03b4@mail.gmail.com>
References: <9870a2060808280852w7d9134b4x14aad3e7155fde4d@mail.gmail.com>
	<2f3bf9a60808281202j65a5d0fevb4c7a9133978922d@mail.gmail.com>
	<9870a2060808281213x2667d2a1g857859b37ddab968@mail.gmail.com>
	<2f3bf9a60808281219h7ded0c81w2f16684ca49e06c7@mail.gmail.com>
	<9870a2060808281221m4e14ef9dmfb7aa616abc48f62@mail.gmail.com>
	<2f3bf9a60808281225t59eb42eeka4923fbd1f3c03b4@mail.gmail.com>
Message-ID: <9870a2060808281228r22b3fb80we7ff0ca2ac4a149b@mail.gmail.com>

Thanks!

From halr at obsidianresearch.com  Thu Aug 28 12:30:53 2008
From: halr at obsidianresearch.com (Hal Rosenstock)
Date: Thu, 28 Aug 2008 13:30:53 -0600
Subject: [ofa-general] minimum sw components requirement for driver/opensm in a single unmanaged switch network
In-Reply-To:
References:
Message-ID: <48B6FCED.106@obsidianresearch.com>

Yicheng Jia wrote:
> I missed it.

I'm not following what you mean by that.

> So it's better to keep opensm running all the time?
>

Yes, that's what the architecture says and it's for sound reasons.

-- Hal

> Thanks!
> Yicheng
>
> Hal Rosenstock
> 08/28/2008 06:31 AM
>
> To
> Yicheng Jia
> cc
> Dotan Barak, general at lists.openfabrics.org
> Subject
> Re: [ofa-general] minimum sw components requirement for driver/opensm in a
> single unmanaged switch network
>
> Yicheng Jia wrote:
>
>> Yes.
My basic idea is, the opensm set up the subnet during >> > initialization, > >> it will report errors during this process. After the subnet is up, the >> environment is fixed and stable. If some failure happens, opensm could >> > be > >> used again to diagnose the failure. From my understanding, in this case, >> > > >> the only work that opensm does after subnet is up is to log the status. >> > Wouldn't opensm also repair the failure if it could ? > > -- Hal > > > >> Thanks! >> Yicheng >> >> >> >> >> Hal Rosenstock >> 08/27/2008 05:55 PM >> >> To >> Yicheng Jia >> cc >> Dotan Barak , general at lists.openfabrics.org >> Subject >> Re: [ofa-general] minimum sw components requirement for driver/opensm in >> > a > >> single unmanaged switch network >> >> >> >> >> >> >> Yicheng Jia wrote: >> >> >>> My operation is quite simple: connect QPs and do RDMA read/write. In >>> >>> >> this >> >> >>> case, the opensm is not in need when the subnet is up, correct? >>> >>> >>> >> Is this a production subnet ? Do you need to deal with any failures ? >> >> -- Hal >> >> >> >>> Thanks! >>> Yicheng >>> >>> >>> >>> >>> "Dotan Barak" >>> 08/21/2008 02:33 PM >>> >>> To >>> "Yicheng Jia" >>> cc >>> "Hal Rosenstock" , >>> >>> >> general at lists.openfabrics.org >> >> >>> Subject >>> Re: [ofa-general] minimum sw components requirement for driver/opensm >>> > in > >> a >> >> >>> single unmanaged switch network >>> >>> >>> >>> >>> >>> >>> On Thu, Aug 21, 2008 at 10:16 PM, Yicheng Jia wrote: >>> >>> >>> >>>> Hi Hal, >>>> >>>> Can opensm just run once? When the subnet is up, it can exit assume >>>> >>>> >> that >> >> >>> no >>> >>> >>> >>>> change will be made in the subnet. >>>> >>>> >>>> >>>> >>> Yes, depend on the serives that you will need/use. >>> >>> For example: if you use operations that requires SA query, you must >>> have a live SM. >>> >>> If you will connect the QPs in the subnet by yourself (for example, >>> using socket) you can manage without a live SM in the subnet ... 
>>> >>> Dotan >>> >>> >>> >>>> Thanks! >>>> Yicheng >>>> >>>> >>>> >>>> "Hal Rosenstock" >>>> >>>> 07/10/2008 09:15 PM >>>> >>>> To >>>> "Yicheng Jia" >>>> cc >>>> "Jim Mott" , general at lists.openfabrics.org >>>> Subject >>>> Re: [ofa-general] minimum sw components requirement for driver/opensm >>>> >>>> >> in >> >> >>> a >>> >>> >>> >>>> single unmanaged switch network >>>> >>>> >>>> >>>> >>>> On Thu, Jul 10, 2008 at 7:39 PM, Yicheng Jia wrote: >>>> >>>> >>>> >>>>>> If you want to avoid all the SM stuff, and are willing to program >>>>>> > the > >>>>>> switches directly (a few mads) >>>>>> >>>>>> >>>>>> >>>>> Is it done by opensm? >>>>> >>>>> >>>>> >>>> Yes. >>>> >>>> >>>> >>>> >>>>> What information should be set up in the switch by >>>>> opensm? >>>>> >>>>> >>>>> >>>> Things like the PortInfos and LFT. See IBA spec vol 1 14.2.5 >>>> >>>> >>>> >>>> >>>>>> Then to figure out QP connections, you just use a function of 3 >>>>>> parameters: >>>>>> my_qp_num = fn_sqp(my_node, target_node, qp_num) >>>>>> target_qp_num = fn_tqp(my_node, target_node, qp_num) >>>>>> Where qp_num is a small number between 0 and the maximum number of >>>>>> >>>>>> >> QPs >> >> >>>>>> you >>>>>> need active between any 2 endpoints. >>>>>> >>>>>> >>>>>> >>>>> Can the qp_num be manually assigned? >>>>> Does it need opensm be involved? >>>>> >>>>> >>>>> >>>> SM has nothing to do with QP numbers. >>>> >>>> >>>> >>>> >>>>>> If it works, you are done. If not, reset, up, wait for him to >>>>>> >>>>>> >> connect >> >> >>>>>> and >>>>>> send something to you. >>>>>> >>>>>> >>>>>> >>>>> Is it reliable? I mean the QPs connection will keep alive during the >>>>> >>>>> >>>>> >>> QPs >>> >>> >>> >>>>> lifecycle? >>>>> >>>>> >>>>> >>>> For one thing, SM needs to try to keep ports at active. 
>>>> >>>> -- Hal >>>> >>>> >>>> >>>> >>>>> Best, >>>>> Yicheng >>>>> >>>>> >>>>> >>>>> "Jim Mott" >>>>> >>>>> 07/10/2008 04:17 PM >>>>> >>>>> To >>>>> "Yicheng Jia" , >>>>> cc >>>>> Subject >>>>> RE: [ofa-general] minimum sw components requirement for driver/opensm >>>>> > > >>>>> >>> in a >>> >>> >>> >>>>> single unmanaged switch network >>>>> >>>>> >>>>> >>>>> >>>>> If you want to avoid all the SM stuff, and are willing to program the >>>>> switches directly (a few mads), then I've used schemes like: >>>>> >>>>> Node LID=base + (switch port * constant) (base=0, constant = 1 works) >>>>> >>>>> Then to figure out QP connections, you just use a function of 3 >>>>> parameters: >>>>> my_qp_num = fn_sqp(my_node, target_node, qp_num) >>>>> target_qp_num = fn_tqp(my_node, target_node, qp_num) >>>>> Where qp_num is a small number between 0 and the maximum number of >>>>> > QPs > >> >>> you >>> >>> >>> >>>>> need active between any 2 endpoints. >>>>> >>>>> With the above scheme, you know your node_id (switch port number), >>>>> >>>>> >> your >> >> >>>>> lid, >>>>> the lid of the target node, and the QPs on both sides. From there >>>>> >>>>> >> on, >> >> >>> it >>> >>> >>> >>>>> is clear sailing. You don't even need to send MADs; just transition >>>>> >>>>> >>>>> >>> the >>> >>> >>> >>>>> QP >>>>> up and try and use it. If it works, you are done. If not, reset, >>>>> > up, > >>>>> wait >>>>> for him to connect and send something to you. A little timer to make >>>>> > > >>>>> >>> sure >>> >>> >>> >>>>> everybody retries once in awhile and what can go wrong? 
>>>>> >>>>> Jim >>>>> From: general-bounces at lists.openfabrics.org >>>>> [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Yicheng >>>>> >>>>> >> Jia >> >> >>>>> Sent: Thursday, July 10, 2008 2:59 PM >>>>> To: general at lists.openfabrics.org >>>>> Subject: [ofa-general] minimum sw components requirement for >>>>> >>>>> >>>>> >>> driver/opensm >>> >>> >>> >>>>> in a single unmanaged switch network >>>>> >>>>> >>>>> Hi Folks, >>>>> >>>>> I have a IB network which consists of only a single unmanaged switch, >>>>> > > >>>>> >>> all >>> >>> >>> >>>>> end nodes connecting with the switch only need to do RDMA read/write >>>>> operation with each other. My question is, what are the indispensable >>>>> modules in driver's core and opensm that make the network up and run? >>>>> >>>>> I've been using only ib_mad module in driver's core with a managed >>>>> >>>>> >>>>> >>> switch >>> >>> >>> >>>>> before, and the network works fine. So I assume that only the ib_mad >>>>> module >>>>> in driver's core and SM in opensm are mandatory in my network. The >>>>> >>>>> >> LIDs >> >> >>>>> are >>>>> assigned by them. The SA and CM modules are not useful in my case. Am >>>>> > > >> I >> >> >>>>> right? >>>>> >>>>> I need to minimize driver and opensm to fit them in my network, the >>>>> >>>>> >> HCA >> >> >>>>> driver is mthca. >>>>> >>>>> Best, >>>>> Yicheng >>>>> >>>>> >>>>> >>>>> >>>>> > _____________________________________________________________________________ > >>>>> Scanned by IBM Email Security Management Services powered by >>>>> >>>>> >>>>> >>> MessageLabs. >>> >>> >>> >>>>> For more information please visit http://www.ers.ibm.com >>>>> >>>>> >>>>> >>>>> >>>>> > _____________________________________________________________________________ > > _____________________________________________________________________________ > >>>>> Scanned by IBM Email Security Management Services powered by >>>>> >>>>> >>>>> >>> MessageLabs. 
From aj.guillon at gmail.com  Thu Aug 28 12:33:38 2008
From: aj.guillon at gmail.com (Adrien Guillon)
Date: Thu, 28 Aug 2008 15:33:38 -0400
Subject: ***SPAM*** Re: ***SPAM*** Re: [ofa-general] Efficient management of many connections
In-Reply-To: <6.2.0.14.2.20080828122631.023e6420@esmail.cup.hp.com>
References: <9870a2060808280852w7d9134b4x14aad3e7155fde4d@mail.gmail.com>
	<2f3bf9a60808281202j65a5d0fevb4c7a9133978922d@mail.gmail.com>
	<9870a2060808281213x2667d2a1g857859b37ddab968@mail.gmail.com>
	<2f3bf9a60808281219h7ded0c81w2f16684ca49e06c7@mail.gmail.com>
	<6.2.0.14.2.20080828122631.023e6420@esmail.cup.hp.com>
Message-ID: <9870a2060808281233x7159235fi6dd964595c4462f1@mail.gmail.com>

In my case, I have a cluster in which each node will be connected to
every other for RDMA operations. So each node will act as a server
for incoming connections, and have connections to all other nodes.

Any suggestions on best approach?
Thanks,

AJ

From mkrause at hp.com  Thu Aug 28 12:29:56 2008
From: mkrause at hp.com (Michael Krause)
Date: Thu, 28 Aug 2008 12:29:56 -0700
Subject: ***SPAM*** Re: [ofa-general] Efficient management of many connections
In-Reply-To: <2f3bf9a60808281219h7ded0c81w2f16684ca49e06c7@mail.gmail.com>
References: <9870a2060808280852w7d9134b4x14aad3e7155fde4d@mail.gmail.com>
	<2f3bf9a60808281202j65a5d0fevb4c7a9133978922d@mail.gmail.com>
	<9870a2060808281213x2667d2a1g857859b37ddab968@mail.gmail.com>
	<2f3bf9a60808281219h7ded0c81w2f16684ca49e06c7@mail.gmail.com>
Message-ID: <6.2.0.14.2.20080828122631.023e6420@esmail.cup.hp.com>

An HTML attachment was scrubbed...
URL:

From sean.hefty at intel.com  Thu Aug 28 13:36:33 2008
From: sean.hefty at intel.com (Hefty, Sean)
Date: Thu, 28 Aug 2008 13:36:33 -0700
Subject: [ofa-general] ***SPAM*** Thread Safety and Infiniband Verbs and RDMA
In-Reply-To: <2f3bf9a60808281206r5582df74m43187904bd489291@mail.gmail.com>
References: <9870a2060808280845m29a0391ej270db02f4c76e34d@mail.gmail.com>
	<2f3bf9a60808281206r5582df74m43187904bd489291@mail.gmail.com>
Message-ID:

>For the librdmacm, you should ask Sean Hefty, but i'm quite sure that
>this library is thread safe too.

The librdmacm is thread safe, with the exception of rdma_migrate_id, which
has some noted limitations. See its man page for more details if you're
using that call.

- Sean

From andy.grover at oracle.com  Thu Aug 28 14:32:01 2008
From: andy.grover at oracle.com (Andy Grover)
Date: Thu, 28 Aug 2008 14:32:01 -0700
Subject: [ofa-general] IRC Channel?
In-Reply-To: <9870a2060808280852q61b5fe6es4ea3c006d6fdfba2@mail.gmail.com>
References: <9870a2060808280852q61b5fe6es4ea3c006d6fdfba2@mail.gmail.com>
Message-ID: <48B71951.4070603@oracle.com>

Adrien Guillon wrote:
> Is there an IRC channel where people lurk to discuss infiniband and
> RDMA? It would be nice to have a place for quick questions as opposed
> to just mailing lists and docs.

#ofed on irc.oftc.net, but right now there are only 2 people on it. Come
on by! :)

Regards -- Andy

From chu11 at llnl.gov  Thu Aug 28 15:30:24 2008
From: chu11 at llnl.gov (Al Chu)
Date: Thu, 28 Aug 2008 15:30:24 -0700
Subject: [ofa-general] [IBSIM][Trivial] fix error message typo
Message-ID: <1219962624.29252.310.camel@cardanus.llnl.gov>

Hey Sasha,

Probably due to a cut and paste. Nothing fancy.

Al

--
Albert Chu
chu11 at llnl.gov
925-422-5311
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-fix-newline-typo.patch
Type: text/x-patch
Size: 783 bytes
Desc: not available
URL:

From meier3 at llnl.gov  Thu Aug 28 15:36:20 2008
From: meier3 at llnl.gov (Timothy A. Meier)
Date: Thu, 28 Aug 2008 15:36:20 -0700
Subject: [ofa-general] [PATCH] opensm: osm_opensm.c - changed load_plugins() arg to const
Message-ID: <48B72864.60105@llnl.gov>

Sasha,

I am doing some plugin work, and ran into both of the cases this patch
addresses.

From 680195f393a8c3a5036fc1c8afe76ae7bac5d3cb Mon Sep 17 00:00:00 2001
From: Tim Meier
Date: Thu, 28 Aug 2008 15:10:35 -0700
Subject: [PATCH] opensm: osm_opensm.c - changed load_plugins() arg to const

Made a copy of the list of plugin names, so the parser would
not destroy the original copy. Also now supports passing in
pointers to string constants.

Signed-off-by: Tim Meier
---
 opensm/opensm/osm_opensm.c |    8 +++++---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/opensm/opensm/osm_opensm.c b/opensm/opensm/osm_opensm.c
index 6cf4726..65fd632 100644
--- a/opensm/opensm/osm_opensm.c
+++ b/opensm/opensm/osm_opensm.c
@@ -244,12 +244,13 @@ void osm_opensm_destroy(IN osm_opensm_t * const p_osm)
 	osm_log_destroy(&p_osm->log);
 }
 
-static void load_plugins(osm_opensm_t *osm, char *plugin_names)
+static void load_plugins(osm_opensm_t *osm, const char *plugin_names)
 {
 	osm_epi_plugin_t *epi;
-	char *name, *p;
+	char *p_names, *name, *p;
 
-	name = strtok_r(plugin_names, " \t\n", &p);
+	p_names = strdup(plugin_names);
+	name = strtok_r(p_names, " \t\n", &p);
 	while (name && *name) {
 		epi = osm_epi_construct(osm, name);
 		if (!epi)
@@ -259,6 +260,7 @@ static void load_plugins(osm_opensm_t *osm, char *plugin_names)
 		cl_qlist_insert_tail(&osm->plugin_list, &epi->list);
 		name = strtok_r(NULL, " \t\n", &p);
 	}
+	free(p_names);
 }
 
 /**********************************************************************
--
1.5.4.5

--
Timothy A. Meier
Computer Scientist
ICCD/High Performance Computing
925.422.3341
meier3 at llnl.gov
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: 0002-opensm-osm_opensm.c-changed-load_plugins-arg-to.patch
URL:

From sean.hefty at intel.com  Thu Aug 28 15:42:38 2008
From: sean.hefty at intel.com (Sean Hefty)
Date: Thu, 28 Aug 2008 15:42:38 -0700
Subject: [ofa-general] vacation September and October
Message-ID: <000101c9095f$5f7c8930$d0d8180a@amr.corp.intel.com>

I will be on sabbatical the months of September and October. In my absence,
Arlin Davis will cover my Linux development, Jerrie Coffman - IBTA
participation, and Stan Smith - Windows development.

At this point, I plan on resuming my normal Linux and Windows OFA development
and IBTA activities on my return.

- Sean

From chu11 at llnl.gov  Thu Aug 28 16:01:27 2008
From: chu11 at llnl.gov (Al Chu)
Date: Thu, 28 Aug 2008 16:01:27 -0700
Subject: [ofa-general] [IBSIM] add ReLink command
Message-ID: <1219964487.29252.318.camel@cardanus.llnl.gov>

Hey Sasha,

This adds a "ReLink" command to ibsim. If a link was previously
unlinked, you can run "ReLink" to reconnect it to whatever it was
connected to before. It's easier than having to figure out what it was
connected to previously and input both the local and remote ends under
the "Link" command.

The idea for this option came up when I was trying to simulate an entire
cluster going down then going back up. Scripting the cluster to go down
was easy ("Unlink" all CAs), but scripting it to come back up was a
little harder since I had to figure out all the other end ports to input
into "Link".

Al

--
Albert Chu
chu11 at llnl.gov
925-422-5311
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-add-relink-command.patch
Type: text/x-patch
Size: 3623 bytes
Desc: not available
URL:

From iuzzolin at nmia.com  Thu Aug 28 22:38:12 2008
From: iuzzolin at nmia.com (Harold Iuzzolino)
Date: Thu, 28 Aug 2008 23:38:12 -0600 (MDT)
Subject: [ofa-general] Good fix for dat_strerror.c NULL undefined problem
Message-ID: <20080828233657.K58477@plato.nmia.com>

Dear Arlin Davis and Openfabrics,
arlin.r.davis at intel.com
general at lists.openfabrics.org

I've got some good news and some bad news, but we're getting there,
ie compiling the OFED software, VERY SLOWLY.

Has the team ever gotten this to compile correctly on
1. A 64 bit Linux machine,
2. On a 64 bit machine with Fedora 8 or 9 on it?
Concerning the problem in package compat-dapl-1.2.8: dat/common/dat_strerror.c:621: error: 'NULL' undeclared (first use in this function) I believe your correction for Fedora 9, namely adding as line 136 of the file dat/include/dat/dat_platform_specific.h works this line: #include /* Linux begins */ #elif defined(__linux__) /* Linux */ #if defined(__KERNEL__) #include #else #include #include <--------------------- line 136 #endif /* defined(__KERNEL__) */ Could you put that into the official version of compat-dapl-1.2.8? ----------------- I tested it by running the install.pl until it broke. Then I made the change in /var/tmp/OFED_topdir/BUILD/compat-dapl-1.2.8/dat/include/dat/dat_platform_specific.h did 'make clean' and then 'make' The subroutine dat/common/dat_strerror.c compiled this time. --------------------------------------------- Now for the bad news: Some more files further on in that package had compile errors. It is possible that the reason for the format errors are that we are compiling on a 64 bit machine. 
[root at treebeard compat-dapl-1.2.8]# uname -a Linux treebeard 2.6.25-14.fc9.x86_64 #1 SMP Thu May 1 06:06:21 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux File dtest.c dtest.c:193: warning: return type defaults to 'int' dtest.c: In function 'main': dtest.c:362: warning: format '%d' expects type 'int', but argument 4 has type 'DAT_PORT_QUAL' dtest.c:367: warning: format '%d' expects type 'int', but argument 4 has type 'DAT_PORT_QUAL' dtest.c: In function 'send_msg': dtest.c:583: warning: format '%d' expects type 'int', but argument 4 has type 'DAT_VLEN' dtest.c:583: warning: too many arguments for format dtest.c: In function 'connect_ep': dtest.c:613: warning: format '%d' expects type 'int', but argument 4 has type 'long unsigned int' dtest.c:805: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'DAT_VADDR' dtest.c:868: warning: format '%d' expects type 'int', but argument 5 has type 'long unsigned int' dtest.c:868: warning: too many arguments for format dtest.c:876: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'DAT_VADDR' dtest.c:876: warning: format '%x' expects type 'unsigned int', but argument 6 has type 'DAT_VLEN' dtest.c:602: warning: unused variable 'ep_attr' dtest.c: In function 'disconnect_ep': dtest.c:890: warning: unused variable 'flush_cnt' dtest.c:890: warning: unused variable 'i' dtest.c: In function 'do_rdma_write_with_msg': dtest.c:997: warning: format '%d' expects type 'int', but argument 5 has type 'DAT_VLEN' dtest.c:1087: warning: format '%d' expects type 'int', but argument 5 has type 'long unsigned int' dtest.c:1098: warning: format '%x' expects type 'unsigned int', but argument 6 has type 'DAT_VLEN' dtest.c:971: warning: unused variable 'their_context' dtest.c:965: warning: unused variable 'region' dtest.c: In function 'do_rdma_read_with_msg': dtest.c:1197: warning: format '%d' expects type 'int', but argument 4 has type 'DAT_VLEN' dtest.c:1197: warning: too many 
arguments for format dtest.c:1279: warning: format '%d' expects type 'int', but argument 5 has type 'long unsigned int' dtest.c:1279: warning: too many arguments for format dtest.c:1288: warning: format '%x' expects type 'unsigned int', but argument 6 has type 'DAT_VLEN' dtest.c:1121: warning: unused variable 'their_context' dtest.c:1115: warning: unused variable 'region' dtest.c: In function 'do_ping_pong_msg': dtest.c:1431: warning: too many arguments for format dtest.c: In function 'DT_RetToString': dtest.c:1763: warning: unused variable 'sz' dtest.c: In function 'main': dtest.c:495: warning: control reaches end of non-void function dtest.c: At top level: dtest.c:145: warning: 'parent' defined but not used dtest.c:154: warning: 'pin_memory' defined but not used dtest.c:158: warning: 'post_recv_count' defined but not used dtest.c:163: warning: 'child' defined but not used --------------------------------------------------------------- File dapl_netaddr.cc if gcc -DHAVE_CONFIG_H -I. -I. -I../.. -I include -I mdep/linux -I ./../../dat/include -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -MT dapl_transaction_cmd.o -MD -MP -MF ".deps/dapl_transaction_cmd.Tpo" -c -o dapl_transaction_cmd.o `test -f 'cmd/dapl_transaction_cmd.c' || echo './'`cmd/dapl_transaction_cmd.c; \ then mv -f ".deps/dapl_transaction_cmd.Tpo" ".deps/dapl_transaction_cmd.Po"; else rm -f ".deps/dapl_transaction_cmd.Tpo"; exit 1; fi cmd/dapl_netaddr.c: In function 'DT_NetAddrLookupHostAddress': cmd/dapl_netaddr.c:94: error: 'EAI_ADDRFAMILY' undeclared (first use in this function) cmd/dapl_netaddr.c:94: error: (Each undeclared identifier is reported only once cmd/dapl_netaddr.c:94: error: for each function it appears in.) cmd/dapl_netaddr.c:99: error: 'EAI_NODATA' undeclared (first use in this function) make[2]: *** [dapl_netaddr.o] Error 1 Any ideas how to fix these subroutines? 
Carlyn <<< <-----Counting in binary is just like counting in decimal if you're all thumbs / <<< Glaser and Way-----> From iuzzolin at nmia.com Thu Aug 28 22:45:25 2008 From: iuzzolin at nmia.com (Harold Iuzzolino) Date: Thu, 28 Aug 2008 23:45:25 -0600 (MDT) Subject: [ofa-general] dtest.c: mismatch between %d formats and 64 bit (unsigned) integers Message-ID: <20080828234256.J58477@plato.nmia.com> Dear Openfabrics, general at lists.openfabrics.org Bug #1 in dtest.c: mismatch between %d formats and 64 bit (unsigned) integers gcc complains about the formats used (%d, %llx, %x) for 64 bit integers. Should the %d be %ld and the %x be %lx? I'm not sure what the %llx should be changed to. I'm running on a 64 bit machine, but as far as I can tell, DAT_PORT_QUAL, DAT_VLEN and DAT_VADDR are defined as 64 bit integers even on a 32 bit machine. -------------------------------------- gcc -DHAVE_CONFIG_H -I. -I. -I../.. -I ./../../dat/include -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -MT dtest.o -MD -MP -MF ".deps/dtest.Tpo" -c -o dtest.o dtest.c dtest.c:362: warning: format '%d' expects type 'int', but argument 4 has type 'DAT_PORT_QUAL' dtest.c:367: warning: format '%d' expects type 'int', but argument 4 has type 'DAT_PORT_QUAL' dtest.c:583: warning: format '%d' expects type 'int', but argument 4 has type 'DAT_VLEN' dtest.c:613: warning: format '%d' expects type 'int', but argument 4 has type 'long unsigned int' dtest.c:805: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'DAT_VADDR' dtest.c:805: warning: format '%x' expects type 'unsigned int', but argument 6 has type 'DAT_VLEN' dtest.c:868: warning: format '%d' expects type 'int', but argument 5 has type 'long unsigned int' dtest.c:876: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'DAT_VADDR' dtest.c:876: warning: format '%x' expects type 'unsigned int', but argument 6 
has type 'DAT_VLEN' dtest.c:997: warning: format '%d' expects type 'int', but argument 5 has type 'DAT_VLEN' dtest.c:1087: warning: format '%d' expects type 'int', but argument 5 has type 'long unsigned int' dtest.c:1098: warning: format '%x' expects type 'unsigned int', but argument 6 has type 'DAT_VLEN' dtest.c:1197: warning: format '%d' expects type 'int', but argument 4 has type 'DAT_VLEN' dtest.c:1279: warning: format '%d' expects type 'int', but argument 5 has type 'long unsigned int' dtest.c:1288: warning: format '%x' expects type 'unsigned int', but argument 6 has type 'DAT_VLEN' I'm running on a 64 bit machine with Fedora 9. [root at treebeard BUILD]# uname -a Linux treebeard 2.6.25-14.fc9.x86_64 #1 SMP Thu May 1 06:06:21 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux [root at treebeard BUILD]# gcc --version gcc (GCC) 4.3.0 20080428 (Red Hat 4.3.0-8) [root at treebeard BUILD]# pwd /var/tmp/OFED_topdir/BUILD To figure out what DAT_PORT_QUAL, DAT_VLEN, DAT_VADDR are defined as, I ran for each of those: [root at treebeard BUILD]# grep -r DAT_PORT_QUAL *|egrep 'typedef|define' compat-dapl-1.2.8/dat/include/dat/dat.h:typedef DAT_UINT64 DAT_PORT_QUAL; compat-dapl-1.2.8/dat/include/dat/dat.h:typedef DAT_UINT64 DAT_VLEN; compat-dapl-1.2.8/dat/include/dat/dat.h:typedef DAT_UINT64 DAT_VADDR; The important information for DAT_UINT64 is compat-dapl-1.2.8/dat/include/dat/dat_platform_specific.h:typedef uint64_t DAT_UINT64; /* unsigned host order, 64 bits */ compat-dapl-1.2.8/dat/include/dat/dat_platform_specific.h:typedef unsigned __int64 DAT_UINT64; /* unsigned host order, 64 bits */ So instead of %d, should you use %ld, and instead of %x, should you use %lx? And somebody needs to figure out what to use for %llx. 
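
One portable answer to the question above, offered as a sketch rather than as the fix the maintainers chose: since the grep output shows DAT_UINT64 is a uint64_t typedef on Linux, the <inttypes.h> macros PRIu64 and PRIx64 can replace %d, %x and %llx, and they expand to the right length modifier on both 32 bit and 64 bit builds. The helper name format_len_addr is hypothetical and plain uint64_t stands in for DAT_VLEN and DAT_VADDR.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: print a 64-bit length in decimal and a 64-bit
 * address in hex. Note the literal '%' written before each PRI macro;
 * the macro itself only supplies the length modifier and conversion. */
static int format_len_addr(char *buf, size_t size,
                           uint64_t len, uint64_t addr)
{
    return snprintf(buf, size,
                    "len=%" PRIu64 " addr=0x%" PRIx64,
                    len, addr);
}
```

Calling format_len_addr(buf, sizeof buf, 4096, 0xdeadbeef) fills buf with "len=4096 addr=0xdeadbeef" whether the platform treats uint64_t as unsigned long or unsigned long long.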
Carlyn Iuzzolino iuzzolin at nmia.com From iuzzolin at nmia.com Thu Aug 28 22:47:18 2008 From: iuzzolin at nmia.com (Harold Iuzzolino) Date: Thu, 28 Aug 2008 23:47:18 -0600 (MDT) Subject: [ofa-general] dtest.c: various warning complaints Message-ID: <20080828234624.T58477@plato.nmia.com> Dear Openfabrics, general at lists.openfabrics.org Bug #3 in dtest.c: various warning complaints. General warnings in ./test/dtest/dtest.c gcc -DHAVE_CONFIG_H -I. -I. -I../.. -I ./../../dat/include -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -MT dtest.o -MD -MP -MF ".deps/dtest.Tpo" -c -o dtest.o dtest.c dtest.c:193: warning: return type defaults to 'int' dtest.c: In function 'connect_ep': dtest.c:602: warning: unused variable 'ep_attr' dtest.c: In function 'disconnect_ep': dtest.c:890: warning: unused variable 'flush_cnt' dtest.c:890: warning: unused variable 'i' dtest.c: In function 'do_rdma_write_with_msg': dtest.c:971: warning: unused variable 'their_context' dtest.c:965: warning: unused variable 'region' dtest.c: In function 'do_rdma_read_with_msg': dtest.c:1121: warning: unused variable 'their_context' dtest.c:1115: warning: unused variable 'region' dtest.c: In function 'do_ping_pong_msg': dtest.c: In function 'DT_RetToString': dtest.c:1763: warning: unused variable 'sz' dtest.c: In function 'main': dtest.c:495: warning: control reaches end of non-void function dtest.c: At top level: dtest.c:145: warning: 'parent' defined but not used dtest.c:154: warning: 'pin_memory' defined but not used dtest.c:158: warning: 'post_recv_count' defined but not used dtest.c:163: warning: 'child' defined but not used Carlyn Iuzzolino iuzzolin at nmia.com From iuzzolin at nmia.com Thu Aug 28 22:48:40 2008 From: iuzzolin at nmia.com (Harold Iuzzolino) Date: Thu, 28 Aug 2008 23:48:40 -0600 (MDT) Subject: [ofa-general] dtest.c: complaints about "too many arguments for format" Message-ID: 
<20080828234735.F58477@plato.nmia.com>

Dear Openfabrics,
general at lists.openfabrics.org

Bug #2 in dtest.c: complaints about "too many arguments for format".

I believe the problem is caused by 'PRIx64', either its definition or use,
namely a missing '%' before PRIx64.

gcc -DHAVE_CONFIG_H -I. -I. -I../.. -I ./../../dat/include -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -MT dtest.o -MD -MP -MF ".deps/dtest.Tpo" -c -o dtest.o dtest.c
dtest.c:583: warning: too many arguments for format
dtest.c:868: warning: too many arguments for format
dtest.c:1087: warning: too many arguments for format
dtest.c:1197: warning: too many arguments for format
dtest.c:1279: warning: too many arguments for format
dtest.c:1431: warning: too many arguments for format

Line 583 has 3 arguments but 2 formats %d, %d
    fprintf(stderr, "%d: ERROR: DTO len %d or cookie " PRIx64 "\n",
            getpid(),
            event.event_data.dto_completion_event_data.transfered_length,
            event.event_data.dto_completion_event_data.user_cookie.as_64 );

Line 868 has 4 arguments but 3 formats %d, %d, %d
    fprintf(stderr,"ERR recv event: len=%d cookie=" PRIx64 " expected %d/%d\n",
            (int)event.event_data.dto_completion_event_data.transfered_length,
            (int)event.event_data.dto_completion_event_data.user_cookie.as_64,
            sizeof(DAT_RMR_TRIPLET), recv_msg_index );

Line 1087 has 4 arguments but 3 formats %d, %d, %d
    fprintf(stderr,"unexpected event data for receive: len=%d cookie=" PRIx64 " exp %d/%d\n",
            (int)event.event_data.dto_completion_event_data.transfered_length,
            (int)event.event_data.dto_completion_event_data.user_cookie.as_64,
            sizeof(DAT_RMR_TRIPLET), recv_msg_index );

Line 1197 has 3 arguments but 2 formats %d, %d
    fprintf(stderr, "%d: ERROR: DTO len %d or cookie " PRIx64 "\n",
            getpid(),
            event.event_data.dto_completion_event_data.transfered_length,
            event.event_data.dto_completion_event_data.user_cookie.as_64 );

Lines 1279 and 1431 have the same format and I think,
the same problem with the definition or use of PRIx64. I am in the top level directory of the BUILD of compat-dapl-1.2.8 [root at treebeard BUILD]# pwd /var/tmp/OFED_topdir/BUILD [root at treebeard BUILD]# grep -r PRIx64 * PRIx64 is USED only in dtest.c. It is not defined in any .h file in the compat-dapl-1.2.8 package [root at treebeard BUILD]# grep -r PRIx64 /usr/include/* /usr/include/inttypes.h:# define PRIx64 __PRI64_PREFIX "x" In the inttypes.h file you find # define PRIx64 __PRI64_PREFIX "x" # if __WORDSIZE == 64 # define __PRI64_PREFIX "l" # define __PRIPTR_PREFIX "l" # else # define __PRI64_PREFIX "ll" # define __PRIPTR_PREFIX # endif Since I'm using a 64 bit machine, the 'PRIx64' in the format would look like 'lx'. But a format string needs to be %lx. So, for example, in line 583 I think you need to change fprintf(stderr, "%d: ERROR: DTO len %d or cookie " PRIx64 "\n", to fprintf(stderr, "%d: ERROR: DTO len %d or cookie %" PRIx64 "\n", or maybe fprintf(stderr, "%d: ERROR: DTO len %d or cookie " "%" PRIx64 "\n", or maybe #define PRI64_FMT "%" PRIx64 fprintf(stderr, "%d: ERROR: DTO len %d or cookie " PRI64_FMT "\n", The format needs a '%' in front of PRIx64. Carlyn Iuzzolino iuzzolin at nmia.com From iuzzolin at nmia.com Thu Aug 28 22:58:02 2008 From: iuzzolin at nmia.com (Harold Iuzzolino) Date: Thu, 28 Aug 2008 23:58:02 -0600 (MDT) Subject: [ofa-general] Bug #1143 needs to be corrected Message-ID: <20080828235731.I58477@plato.nmia.com> Dear Openfabrics Bug Report Maintainer, Am I able to edit one of my bug reports after I've submitted it? If so, how? If not, could you fix one of my bug reports? I was trying to submit 3 bugs, all about the file dtest.c. Somehow, I put two of the bugs in Bug #1143. Bug #1143 needs to be fixed. The real bug report ./compat-dapl-1.2.8/test/dtest/dtest.c : complaints about "too many arguments for format" is down in the Reply #1 from me. 
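Editorial sketch for the "too many arguments for format" report on dtest.c: the PRIx64 macro from <inttypes.h> expands only to the length modifier and conversion letter ("lx" or "llx"), so the string literal itself has to supply the '%'. The helper name format_dto_error and the sample values are hypothetical and do not appear in dtest.c; this illustrates the corrected format string under those assumptions and is not the patch that was eventually applied.

```c
#include <inttypes.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of dtest.c: formats a transfer length
 * and a 64-bit cookie the way a corrected fprintf() call would.
 * Returns the snprintf() result (characters needed, excluding NUL).
 */
int format_dto_error(char *buf, size_t buflen, int len, uint64_t cookie)
{
    /*
     * "%" PRIx64 concatenates to "%lx" or "%llx" depending on the
     * platform, so the same source is correct on 32- and 64-bit builds.
     */
    return snprintf(buf, buflen,
                    "ERROR: DTO len %d or cookie %" PRIx64 "\n",
                    len, cookie);
}
```

Once the '%' is in place, the argument has to be a genuine 64-bit value: the (int) casts on user_cookie.as_64 quoted for lines 868 and 1087 would then mismatch the conversion and should probably be dropped. The sizeof(DAT_RMR_TRIPLET) arguments are of type size_t, for which C99 provides %zu rather than %d.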
The bug report at the top is really Bug #1145: ./compat-dapl-1.2.8/test/dtest/dtest.c: mismatch between %d formats and 64 bit (unsigned) integers To fix the #1143 listing, please delete the top part, and delete the line ------- Comment #1 From Carlyn Iuzzolino 2008-08-25 12:26:12 [reply] ------- so that the comment becomes Bug #1143. I.e., delete the lines from "I'm running on a 64 bit machine with Fedora 9. I'm not sure which Component dtest.c belongs to." through "So instead of %d, should you use %ld, and instead of %x, should you use %lx? And somebody needs to figure out what to use for %llx." Thanks, Carlyn Iuzzolino iuzzolin at nmia.com From vlad at lists.openfabrics.org Fri Aug 29 03:02:23 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Fri, 29 Aug 2008 03:02:23 -0700 (PDT) Subject: [ofa-general] ofa_1_4_kernel 20080829-0200 daily build status Message-ID: <20080829100223.5214BE60A82@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_4/linux-2.6.git git_branch: ofed_kernel Common build parameters: Passed: Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.26 Passed on x86_64 with linux-2.6.16 Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.17 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.18-53.el5 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.22.5-31-default Passed on
x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on ia64 with linux-2.6.16 Passed on ia64 with linux-2.6.17 Passed on ia64 with linux-2.6.16.21-0.8-default Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.26 Passed on ia64 with linux-2.6.25 Passed on ppc64 with linux-2.6.16 Passed on ppc64 with linux-2.6.17 Passed on ppc64 with linux-2.6.19 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.18-8.el5 Passed on ppc64 with linux-2.6.24 Failed: Build failed on i686 with linux-2.6.16 Build failed on i686 with linux-2.6.18 Build failed on i686 with linux-2.6.17 From arlin.r.davis at intel.com Fri Aug 29 09:31:26 2008 From: arlin.r.davis at intel.com (Davis, Arlin R) Date: Fri, 29 Aug 2008 09:31:26 -0700 Subject: [ofa-general] RE: Good fix for dat_strerror.c NULL undefined problem In-Reply-To: <20080828233657.K58477@plato.nmia.com> References: <20080828233657.K58477@plato.nmia.com> Message-ID: >Concerning the problem in package compat-dapl-1.2.8: >dat/common/dat_strerror.c:621: error: 'NULL' undeclared (first use in this >function) > >I believe your correction for Fedora 9, namely adding as line 136 of the >file >dat/include/dat/dat_platform_specific.h works this line: #include > > >/* Linux begins */ >#elif defined(__linux__) /* Linux */ >#if defined(__KERNEL__) >#include >#else >#include >#include <--------------------- line 136 >#endif /* defined(__KERNEL__) */ > > >Could you put that into the official version of compat-dapl-1.2.8? > The fix is in compat-dapl-1.2.9 > > >Any ideas how to fix these subroutines? I installed fedora 9 on an x86_64 system so I can debug the other issues. 
I will send out patches and release new packages soon. Targeting OFED 1.4 rc1. -arlin From ralph.campbell at qlogic.com Fri Aug 29 10:16:45 2008 From: ralph.campbell at qlogic.com (Ralph Campbell) Date: Fri, 29 Aug 2008 10:16:45 -0700 Subject: [ofa-general] [PATCH] IB/ipath - fix SLID generation for RC/UC QPs Message-ID: <20080829171645.14033.34664.stgit@eng-46.mv.qlogic.com> The code to set the source LID in the LRH was not setting the low bits if LMC != 0 for RC/UC QPs. Signed-off-by: Ralph Campbell --- drivers/infiniband/hw/ipath/ipath_rc.c | 3 ++- drivers/infiniband/hw/ipath/ipath_ruc.c | 3 ++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/hw/ipath/ipath_rc.c b/drivers/infiniband/hw/ipath/ipath_rc.c index 9771052..7b93cda 100644 --- a/drivers/infiniband/hw/ipath/ipath_rc.c +++ b/drivers/infiniband/hw/ipath/ipath_rc.c @@ -675,7 +675,8 @@ static void send_rc_ack(struct ipath_qp *qp) hdr.lrh[0] = cpu_to_be16(lrh0); hdr.lrh[1] = cpu_to_be16(qp->remote_ah_attr.dlid); hdr.lrh[2] = cpu_to_be16(hwords + SIZE_OF_CRC); - hdr.lrh[3] = cpu_to_be16(dd->ipath_lid); + hdr.lrh[3] = cpu_to_be16(dd->ipath_lid | + qp->remote_ah_attr.src_path_bits); ohdr->bth[0] = cpu_to_be32(bth0); ohdr->bth[1] = cpu_to_be32(qp->remote_qpn); ohdr->bth[2] = cpu_to_be32(qp->r_ack_psn & IPATH_PSN_MASK); diff --git a/drivers/infiniband/hw/ipath/ipath_ruc.c b/drivers/infiniband/hw/ipath/ipath_ruc.c index af051f7..fc0f6d9 100644 --- a/drivers/infiniband/hw/ipath/ipath_ruc.c +++ b/drivers/infiniband/hw/ipath/ipath_ruc.c @@ -618,7 +618,8 @@ void ipath_make_ruc_header(struct ipath_ibdev *dev, struct ipath_qp *qp, qp->s_hdr.lrh[0] = cpu_to_be16(lrh0); qp->s_hdr.lrh[1] = cpu_to_be16(qp->remote_ah_attr.dlid); qp->s_hdr.lrh[2] = cpu_to_be16(qp->s_hdrwords + nwords + SIZE_OF_CRC); - qp->s_hdr.lrh[3] = cpu_to_be16(dev->dd->ipath_lid); + qp->s_hdr.lrh[3] = cpu_to_be16(dev->dd->ipath_lid | + qp->remote_ah_attr.src_path_bits); bth0 |= ipath_get_pkey(dev->dd, qp->s_pkey_index); 
bth0 |= extra_bytes << 20; ohdr->bth[0] = cpu_to_be32(bth0 | (1 << 22)); From vlad at lists.openfabrics.org Sat Aug 30 03:03:20 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Sat, 30 Aug 2008 03:03:20 -0700 (PDT) Subject: [ofa-general] ofa_1_4_kernel 20080830-0200 daily build status Message-ID: <20080830100321.0D629E60A8D@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_4/linux-2.6.git git_branch: ofed_kernel Common build parameters: Passed: Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.26 Passed on x86_64 with linux-2.6.16 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.17 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.18-53.el5 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.22.5-31-default Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on ia64 with linux-2.6.17 Passed on ia64 with linux-2.6.16 Passed on ia64 with linux-2.6.16.21-0.8-default Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.26 Passed on ia64 with linux-2.6.25 Passed on ppc64 with linux-2.6.16 Passed on ppc64 with 
linux-2.6.17 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Passed on ppc64 with linux-2.6.18-8.el5 Passed on ppc64 with linux-2.6.24 Failed: Build failed on i686 with linux-2.6.16 Build failed on i686 with linux-2.6.18 Build failed on i686 with linux-2.6.17 From vlad at lists.openfabrics.org Sun Aug 31 03:01:40 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Sun, 31 Aug 2008 03:01:40 -0700 (PDT) Subject: [ofa-general] ofa_1_4_kernel 20080831-0200 daily build status Message-ID: <20080831100140.858B7E60B2E@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_4/linux-2.6.git git_branch: ofed_kernel Common build parameters: Passed: Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.24 Passed on i686 with linux-2.6.26 Passed on x86_64 with linux-2.6.16 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on x86_64 with linux-2.6.16.60-0.21-smp Passed on x86_64 with linux-2.6.17 Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.18-53.el5 Passed on x86_64 with linux-2.6.18-93.el5 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.24 Passed on x86_64 with linux-2.6.22.5-31-default Passed on x86_64 with linux-2.6.25 Passed on x86_64 with linux-2.6.26 Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on x86_64 with linux-2.6.9-67.ELsmp Passed on ia64 with linux-2.6.17 Passed on ia64 with linux-2.6.16 Passed on ia64 with linux-2.6.16.21-0.8-default Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.19 Passed on ia64 with 
linux-2.6.23 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.24 Passed on ia64 with linux-2.6.26 Passed on ia64 with linux-2.6.25 Passed on ppc64 with linux-2.6.16 Passed on ppc64 with linux-2.6.17 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Passed on ppc64 with linux-2.6.18-8.el5 Passed on ppc64 with linux-2.6.24 Failed: Build failed on i686 with linux-2.6.16 Build failed on i686 with linux-2.6.18 Build failed on i686 with linux-2.6.17 From tziporet at dev.mellanox.co.il Sun Aug 31 03:49:01 2008 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Sun, 31 Aug 2008 13:49:01 +0300 Subject: ***SPAM*** Re: ***SPAM*** Re: [ofa-general] Efficient management of many connections In-Reply-To: <9870a2060808281233x7159235fi6dd964595c4462f1@mail.gmail.com> References: <9870a2060808280852w7d9134b4x14aad3e7155fde4d@mail.gmail.com> <2f3bf9a60808281202j65a5d0fevb4c7a9133978922d@mail.gmail.com> <9870a2060808281213x2667d2a1g857859b37ddab968@mail.gmail.com> <2f3bf9a60808281219h7ded0c81w2f16684ca49e06c7@mail.gmail.com> <6.2.0.14.2.20080828122631.023e6420@esmail.cup.hp.com> <9870a2060808281233x7159235fi6dd964595c4462f1@mail.gmail.com> Message-ID: <48BA771D.1000700@mellanox.co.il> Adrien Guillon wrote: > In my case, I have a cluster in which each node will be connected to > every other for RDMA operations. So each node will act as a server > for incoming connections, and have connections to all other nodes. > > Any suggestions on best approach? > > If you have multiple processes per node (and not only one), you can also use XRC.
Tziporet From sashak at voltaire.com Sun Aug 31 05:58:40 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 31 Aug 2008 15:58:40 +0300 Subject: [ofa-general] Re: [PATCH] opensm/osm_qos_policy.c: removing some log messages In-Reply-To: <48AD7ACB.3050803@dev.mellanox.co.il> References: <48AD7ACB.3050803@dev.mellanox.co.il> Message-ID: <20080831125840.GH27535@sashak.voltaire.com> On 17:25 Thu 21 Aug , Yevgeny Kliteynik wrote: > Hi Sasha, > > Removing some log messages - all the info that they provide > is printed in the osm_sa_(multi)path_record.c anyway. > > Signed-off-by: Yevgeny Kliteynik Applied. Thanks. Sasha From sashak at voltaire.com Sun Aug 31 06:05:29 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 31 Aug 2008 16:05:29 +0300 Subject: [ofa-general] Re: [PATCH v4] opensm/osm_qos_policy.c: log matched qos criteria In-Reply-To: <48AD7BE2.7050402@dev.mellanox.co.il> References: <48AD7BE2.7050402@dev.mellanox.co.il> Message-ID: <20080831130529.GI27535@sashak.voltaire.com> On 17:29 Thu 21 Aug , Yevgeny Kliteynik wrote: > Hi Sasha, > > Adding log message for matched criteria of the QoS > policy rule. > > This patch addresses all the issues that were brought > up during the previous versions: one log message for > all the criteria, no string manipulation/sprintf. > > Signed-off-by: Yevgeny Kliteynik Applied. Thanks. Sasha From sashak at voltaire.com Sun Aug 31 06:31:22 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 31 Aug 2008 16:31:22 +0300 Subject: [ofa-general] Re: [PATCH] opensm: osm_opensm.c - changed load_plugins() arg to const In-Reply-To: <48B72864.60105@llnl.gov> References: <48B72864.60105@llnl.gov> Message-ID: <20080831133122.GJ27535@sashak.voltaire.com> On 15:36 Thu 28 Aug , Timothy A. Meier wrote: > Sasha, > I am doing some plugin work, and ran into both of the cases this patch > addresses. 
> > From 680195f393a8c3a5036fc1c8afe76ae7bac5d3cb Mon Sep 17 00:00:00 2001 > From: Tim Meier > Date: Thu, 28 Aug 2008 15:10:35 -0700 > Subject: [PATCH] opensm: osm_opensm.c - changed load_plugins() arg to const > > Made a copy of the list of plugin names, so the parser would > not destroy the original copy. Also now supports passing in > pointers to string constants. > > Signed-off-by: Tim Meier Applied. Thanks. Sasha From sashak at voltaire.com Sun Aug 31 06:34:03 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 31 Aug 2008 16:34:03 +0300 Subject: [ofa-general] Re: [IBSIM][Trivial] fix error message typo In-Reply-To: <1219962624.29252.310.camel@cardanus.llnl.gov> References: <1219962624.29252.310.camel@cardanus.llnl.gov> Message-ID: <20080831133403.GK27535@sashak.voltaire.com> On 15:30 Thu 28 Aug , Al Chu wrote: > Hey Sasha, > > Probably due to a cut and paste. Nothing fancy. > > Al > > -- > Albert Chu > chu11 at llnl.gov > 925-422-5311 > Computer Scientist > High Performance Systems Division > Lawrence Livermore National Laboratory > From 0db2f41ba9c3a12faaf7e0471cc44f6af50de5ef Mon Sep 17 00:00:00 2001 > From: Albert Chu > Date: Thu, 28 Aug 2008 15:26:37 -0700 > Subject: [PATCH] fix newline typo > > > Signed-off-by: Albert Chu Applied. Thanks. Sasha From sashak at voltaire.com Sun Aug 31 06:45:03 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 31 Aug 2008 16:45:03 +0300 Subject: [ofa-general] Re: [IBSIM] add ReLink command In-Reply-To: <1219964487.29252.318.camel@cardanus.llnl.gov> References: <1219964487.29252.318.camel@cardanus.llnl.gov> Message-ID: <20080831134503.GL27535@sashak.voltaire.com> Hi Al, On 16:01 Thu 28 Aug , Al Chu wrote: > Hey Sasha, > > This adds a "ReLink" command to ibsim. If a link was previously > unlinked, you can run "ReLink" to reconnect it to whatever it was > connected to before. 
It's easier than having to figure out what it was > connected to previously and input both the local and remote ends under > the "Link" command. > > The idea for this option came up when I was trying to simulate an entire > cluster going down then going back up. Scripting the cluster to go down > was easy ("Unlink" all CAs), but scripting it to come back up was a > little harder since I had to figure out all the other end ports to input > into "Link". > > Al > > -- > Albert Chu > chu11 at llnl.gov > 925-422-5311 > Computer Scientist > High Performance Systems Division > Lawrence Livermore National Laboratory > From ec9cf72ac3dc5950337aa577f49ada6b8887d579 Mon Sep 17 00:00:00 2001 > From: Albert Chu > Date: Thu, 28 Aug 2008 15:25:14 -0700 > Subject: [PATCH] add relink command > > > Signed-off-by: Albert Chu > --- > ibsim/sim.h | 2 + > ibsim/sim_cmd.c | 68 +++++++++++++++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 70 insertions(+), 0 deletions(-) > > diff --git a/ibsim/sim.h b/ibsim/sim.h > index f989252..32a4e20 100644 > --- a/ibsim/sim.h > +++ b/ibsim/sim.h > @@ -206,6 +206,8 @@ struct Port { > char alias[ALIASLEN + 1]; > Node *remotenode; > int remoteport; > + Node *previous_remotenode; > + int previous_remoteport; > int errrate; > uint16_t errattr; > Node *node; > diff --git a/ibsim/sim_cmd.c b/ibsim/sim_cmd.c > index d55fb4c..39eb316 100644 > --- a/ibsim/sim_cmd.c > +++ b/ibsim/sim_cmd.c > @@ -149,14 +149,79 @@ static int do_link(FILE * f, char *line) > if (link_ports(lport, rport) < 0) > return -fprintf(f, > "# can't link: local/remote port are already connected\n"); > + > + lport->previous_remotenode = NULL; > + rport->previous_remotenode = NULL; > + > + return 0; > +} > + > +static int do_relink(FILE * f, char *line) > +{ > + Port *lport, *rport; > + Node *lnode; > + char *orig = 0; > + char *lnodeid = 0; > + char *s = line, name[NAMELEN], *sp; > + int lportnum = -1; > + > + // parse local > + if (strsep(&s, "\"")) > + orig = strsep(&s, "\""); 
> + > + lnodeid = expand_name(orig, name, &sp); > + if (!sp && s && *s == '[') > + sp = s + 1; > + > + DEBUG("lnodeid %s port [%s", lnodeid, sp); > + if (!(lnode = find_node(lnodeid))) { > + fprintf(f, "# nodeid \"%s\" (%s) not found\n", orig, lnodeid); > + return -1; > + } > + > + if (sp) { > + lportnum = strtoul(sp, &sp, 0); > + if (lportnum < 1 || lportnum > lnode->numports) { > + fprintf(f, "# nodeid \"%s\": bad port %d\n", > + lnodeid, lportnum); > + return -1; > + } > + } else { > + fprintf(f, "# no local port\n"); > + return -1; So what if one asks to ReLink a whole node? I think it should be straightforward: restore the links for all ports where previous_remote* is set. What do you think? > + } > + > + lport = node_get_port(lnode, lportnum); > + > + if (!lport->previous_remotenode) { > + fprintf(f, "# no previous link stored\n"); > + return -1; > + } > + > + rport = node_get_port(lport->previous_remotenode, lport->previous_remoteport); > + > + if (link_ports(lport, rport) < 0) > + return -fprintf(f, > + "# can't link: local/remote port are already connected\n"); > + > + lport->previous_remotenode = NULL; > + rport->previous_remotenode = NULL; > + > return 0; > } > > + There is no need for extra blank lines between functions.
> static void unlink_port(Node * lnode, Port * lport, Node * rnode, int rportnum) > { > Port *rport = node_get_port(rnode, rportnum); > Port *endport; > > + /* save current connection for potential relink later */ > + lport->previous_remotenode = lport->remotenode; > + lport->previous_remoteport = lport->remoteport; > + rport->previous_remotenode = rport->remotenode; > + rport->previous_remoteport = rport->remoteport; > + > lport->remotenode = rport->remotenode = 0; > lport->remoteport = rport->remoteport = 0; > lport->remotenodeid[0] = rport->remotenodeid[0] = 0; > @@ -713,6 +778,7 @@ static int dump_help(FILE * f) > fprintf(f, "\tDump [nodeid] (def all network)\n"); > fprintf(f, "\tRoute \n"); > fprintf(f, "\tLink \"nodeid\"[port] \"remoteid\"[port]\n"); > + fprintf(f, "\tReLink \"nodeid\"[port] : reconnect previously unconnected link\n"); Maybe "restore previously disconnected link(s)" help message? Actually it is almost same :) Sasha > fprintf(f, "\tUnlink \"nodeid\" : remove all links of the node\n"); > fprintf(f, "\tUnlink \"nodeid\"[port]\n"); > fprintf(f, > @@ -814,6 +880,8 @@ int do_cmd(char *buf, FILE *f) > * > * please specify new command support below this comment. 
> */ > + else if (!strncasecmp(line, "ReLink", cmd_len)) > + r = do_relink(f, line); > else if (*line != '\n' && *line != '\0') > fprintf(f, "command \'%s\' unknown - skipped\n", line); > > -- > 1.5.4.5 > From acv at linux.vnet.ibm.com Sun Aug 31 03:32:02 2008 From: acv at linux.vnet.ibm.com (Anoop) Date: Sun, 31 Aug 2008 16:02:02 +0530 Subject: [ofa-general] [PATCH] librdmacm: Add NULL pointer check in ucma_cleanup Message-ID: <48BA7322.3020209@linux.vnet.ibm.com> If ibv_open_device failed, Segfault will occur at libibverbs-1.1.1/src/device.c, since context is NULL int __ibv_close_device(struct ibv_context *context) { int async_fd = context->async_fd; <====== int cmd_fd = context->cmd_fd; int cq_fd = -1; The check is missing at librdmacm-1.0.7/src/cma.c static void ucma_cleanup(void) { if (cma_dev_cnt) { while (cma_dev_cnt) ibv_close_device(cma_dev_array[--cma_dev_cnt].verbs); <====== Signed-off-by: Anoop Vijayan --- librdmacm-1.0.7/src/cma.c.orig 2008-08-31 05:18:49.000000000 -0400 +++ librdmacm-1.0.7/src/cma.c 2008-08-31 05:20:37.000000000 -0400 @@ -163,9 +163,11 @@ static int abi_ver = RDMA_USER_CM_MAX_AB static void ucma_cleanup(void) { if (cma_dev_cnt) { - while (cma_dev_cnt) - ibv_close_device(cma_dev_array[--cma_dev_cnt].verbs); - + while (cma_dev_cnt) { + if (cma_dev_array[--cma_dev_cnt].verbs) + ibv_close_device(cma_dev_array[cma_dev_cnt].verbs); + } + free(cma_dev_array); cma_dev_cnt = 0; } Cheers! - Anoop. From sashak at voltaire.com Sun Aug 31 07:15:29 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 31 Aug 2008 17:15:29 +0300 Subject: [ofa-general] Re: [PATCH] osmtest: fixing core dump (ofed_1_3) In-Reply-To: <48B4FC74.2070903@dev.mellanox.co.il> References: <48B4FC74.2070903@dev.mellanox.co.il> Message-ID: <20080831141529.GM27535@sashak.voltaire.com> On 10:04 Wed 27 Aug , Yevgeny Kliteynik wrote: > Hi Sasha, > > As Matthias points out, the buffer that is used by > sprintf is too small. 
Looks like leading '0x' was > overlooked when the buffer length was calculated. > > Please apply to OFED_1_3 branch only - you have already > cleaned up all the sprintf usage in the trunk. > > Signed-off-by: Yevgeny Kliteynik Applied. Thanks. Sasha From sashak at voltaire.com Sun Aug 31 07:46:45 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 31 Aug 2008 17:46:45 +0300 Subject: [ofa-general] Re: [PATCH] ibsim: Add support for vendor ID and system image GUID In-Reply-To: <48B5E711.7030503@obsidianresearch.com> References: <48A30108.4010307@obsidianresearch.com> <20080818201718.GJ27204@sashak.voltaire.com> <48B5E711.7030503@obsidianresearch.com> Message-ID: <20080831144645.GN27535@sashak.voltaire.com> Hi Hal, On 17:45 Wed 27 Aug , Hal Rosenstock wrote: >>> diff --git a/ibsim/sim_net.c b/ibsim/sim_net.c >>> index 6e3c0e9..146bcde 100644 >>> --- a/ibsim/sim_net.c >>> +++ b/ibsim/sim_net.c >>> @@ -190,7 +190,9 @@ char (*aliases)[NODEIDLEN + NODEPREFIX + 1]; // >>> aliases map format: "%s@%s" >>> int netnodes, netswitches, netports, netaliases; >>> char netprefix[NODEPREFIX + 1]; >>> +int netvendid; >>> int netdevid; >>> +uint64_t netsysimgguid; >>> int netwidth = DEFAULT_LINKWIDTH; >>> int netspeed = DEFAULT_LINKSPEED; >>> @@ -324,11 +326,12 @@ static Node *new_node(int type, char *nodename, >>> char *nodedesc, int nodeports) >>> } >>> mad_set_field(nd->nodeinfo, 0, IB_NODE_NPORTS_F, nd->numports); >>> + mad_set_field(nd->nodeinfo, 0, IB_NODE_VENDORID_F, netvendid); >>> mad_set_field(nd->nodeinfo, 0, IB_NODE_DEVID_F, netdevid); >>> mad_encode_field(nd->nodeinfo, IB_NODE_GUID_F, &nd->nodeguid); >>> mad_encode_field(nd->nodeinfo, IB_NODE_PORT_GUID_F, &nd->nodeguid); >>> - mad_encode_field(nd->nodeinfo, IB_NODE_SYSTEM_GUID_F, &nd->nodeguid); >>> + mad_encode_field(nd->nodeinfo, IB_NODE_SYSTEM_GUID_F, &netsysimgguid); >>> >> >> And when netsysimgguid was not parsed for this node, it will put previous >> value there (or "0" if it was never parsed)? 
>> > Is "state" for a node in the topology file needed to deal with this ? > Something like the following: When the vendor ID line is seen, reset > netsysimgguid and if 0 when new_node is invoked, then use the node GUID as > currently done. Does that make sense ? Why not reset netsysimgguid unconditionally at the end of new_node()? The rest could be as you said: mad_encode_field(nd->nodeinfo, IB_NODE_SYSTEM_GUID_F, netsysimgguid ? &netsysimgguid : &nd->nodeguid); Sasha From aj.guillon at gmail.com Sun Aug 31 11:48:47 2008 From: aj.guillon at gmail.com (Adrien Guillon) Date: Sun, 31 Aug 2008 14:48:47 -0400 Subject: [ofa-general] ***SPAM*** Interrupt RDMA Read Message-ID: <9870a2060808311148h65c7950g735e5d33d4690960@mail.gmail.com> Hey, How can I interrupt an RDMA read cleanly? In my case, I might decide that I don't need to read some memory anymore (because something else happened), so I want to abort. AJ From dotanba at gmail.com Sun Aug 31 23:00:04 2008 From: dotanba at gmail.com (Dotan Barak) Date: Mon, 1 Sep 2008 09:00:04 +0300 Subject: [ofa-general] ***SPAM*** Interrupt RDMA Read In-Reply-To: <9870a2060808311148h65c7950g735e5d33d4690960@mail.gmail.com> References: <9870a2060808311148h65c7950g735e5d33d4690960@mail.gmail.com> Message-ID: <2f3bf9a60808312300q778c7aaen23b2ca70d5f2c1ea@mail.gmail.com> As far as I know, once you have posted a WR, you cannot cancel it. The only thing you can do is flush the whole QP by changing the QP state to ERROR (which flushes the work queues and produces a completion for every WR) or to RESET, which clears the WRs from the queues. Dotan On Sun, Aug 31, 2008 at 9:48 PM, Adrien Guillon wrote: > Hey, > > How can I interrupt an RDMA read cleanly? In my case, I might decide > that I don't need to read some memory anymore (because something else > happened), so I want to abort.
> > AJ